Tern

From Wildsong

History

  • 2020-Jul-29 Tern is standing in for Bellman. RIP Bellman.
  • 2018-May-27 Now living on the workbench in eLab.
  • 2016-Oct-18 Replacing the Seagate 2TB drives with new 3TB ones; one of the drives has been throwing errors for a few weeks now.
  • 2016-Oct-16 Made it a fogg machine. (Ansiblized with fogg-ansible.)
  • 2016-Jan-20 Set up as a Time Machine server; it backs up all 3 of our Macs.

Hardware

Hardware is a MiniITX server that used to be Bellman https://www.asrock.com/mb/AMD/E350M1USB3/

  • ASRock E350M1/USB3 AMD E-350 APU (1.6GHz, Dual-Core, AMD64) 6400.14 BogoMIPS
  • AMD A50M Hudson M1 Mini ITX Motherboard $125
  • Kingston 8GB (2 x 4GB) 240-Pin DDR3 SDRAM DDR3 1333 (PC3 10600) Desktop Memory Model KVR1333D3N9K2/8G $35
  • OCZ Vertex 3 120GB S/N OCZ-9UDI676M56Z4IR8P
  • Case: brand name unknown, need to look it up. Cost about $50, including a 250W power supply.

Video drivers

The ASRock board has an onboard AMD Radeon HD 6310 graphics chip, so I need to follow these instructions: http://wiki.debian.org

% lspci -v | grep VGA
00:01.0 VGA compatible controller: ATI Technologies Inc Device 9802 (prog-if 00 [VGA controller])

Flags: bus master, VGA palette snoop, 66MHz, medium devsel, latency 64

Dang but the aticonfig --initial command fails, unsupported hardware!

Maybe getting the driver from AMD support page will do it but I don't have time right now to deal with this.

See http://www.sensicomm.com/main/linux/acer_5253/index.shtml

Bringing up new 3TB drives as RAID 0 stripe using btrfs

apt-get install btrfs-progs
mkfs.btrfs -m raid0 -d raid0 /dev/sdb1 /dev/sdc1
blkid /dev/sdb1

Add an entry to /etc/fstab using the UUID and fs-type btrfs. In df -h it shows up as

/dev/sdb1       5.5T  1.2M  5.5T   1% /green
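The fstab step above can be sketched as a small helper. This is a minimal sketch, not what was actually typed on Tern: the function name fstab_line is made up for illustration, and the "defaults" mount option and the two trailing zeros (no dump, no boot-time fsck) are assumptions.

```shell
# Print one fstab line for a filesystem identified by UUID.
# Arguments: UUID, mount point, filesystem type, options (default: "defaults").
# fstab_line is a hypothetical helper name, not a standard tool.
fstab_line() {
    printf 'UUID=%s %s %s %s 0 0\n' "$1" "$2" "$3" "${4:-defaults}"
}

# Possible use on the server (needs root, reads the real device):
#   fstab_line "$(blkid -s UUID -o value /dev/sdb1)" /green btrfs >> /etc/fstab
```

Using the UUID rather than /dev/sdb1 keeps the mount working if the kernel renumbers the drives.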

Changing over 2TB drives from mirror to stripe

It ended up on /dev/md126 (he shrugs).

mdadm --detail /dev/md127
mdadm --fail /dev/md127 /dev/sdc --remove /dev/sdc
mdadm --stop /dev/md127
mdadm --remove /dev/md127
fdisk /dev/sdb
fdisk /dev/sdc
mdadm --create /dev/md127 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs.ext4 /dev/md127
blkid /dev/md127 >> /etc/fstab
emacs /etc/fstab
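Since the array did not come up under the name I expected, it helps to check which md device the kernel actually assigned before editing fstab. A minimal sketch, assuming the usual /proc/mdstat layout; md_arrays is a made-up helper name.

```shell
# Read /proc/mdstat-style text on stdin and print "name level"
# for each active array, for example "md127 raid0".
# md_arrays is a hypothetical helper name.
md_arrays() {
    awk '$2 == ":" && $3 == "active" {
        # Skip a state note such as "(auto-read-only)" if one is present.
        level = ($4 ~ /^\(/) ? $5 : $4
        print $1, level
    }'
}

# Possible use:
#   md_arrays < /proc/mdstat
```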

lm-sensors

This was precipitated by the strange hum that comes through the floor from upstairs; the metrorack couples the fan vibration to the floor. I could replace the fan, or move the server, but I started wondering if I could simply turn the fan off most of the time.

The fix was to put the server on a soft pad. No more hum.

  1. Install from package.
  2. Run sensors-detect to add required kernel module(s).
  3. Manually load the module with 'modprobe w83627ehf'
  4. Run sensors to see what I can see:
# sensors
k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +52.0°C  (high = +70.0°C)
                      (crit = +75.0°C, hyst = +72.0°C)

nct6775-isa-0290
Adapter: ISA adapter
Vcore:        +1.00 V  (min =  +0.00 V, max =  +1.74 V)
in1:          +0.93 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
AVCC:         +3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
+3.3V:        +3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:          +1.35 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:          +1.89 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:          +1.73 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
3VSB:         +3.44 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
Vbat:         +3.41 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:           0 RPM  (min =    0 RPM, div = 16)
fan2:           0 RPM  (min =    0 RPM, div = 16)
fan3:           0 RPM  (min =    0 RPM, div = 2)
fan4:           0 RPM  (div = 2)
SYSTIN:       +37.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
CPUTIN:       +36.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN:       -14.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
cpu0_vid:    +0.000 V
intrusion0:  ALARM

AUXTIN -14.5C wow that sensor is COLD. And does not exist... safe to ignore. Hmm, I wish it could read at least one fan RPM. I had to move the wire. It shows up as FAN2 now.
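To pull a single reading out of the sensors output for scripting, something like the following should work. It is a minimal sketch: sensor_temp is a made-up name, and it only handles labels without spaces (such as SYSTIN or CPUTIN).

```shell
# Read `sensors` output on stdin and print the numeric reading
# for one label, for example CPUTIN.
# sensor_temp is a hypothetical helper name.
sensor_temp() {
    awk -v label="$1" '$1 == label ":" {
        v = $2
        gsub(/[^0-9.+-]/, "", v)   # drop the degree sign and unit
        print v + 0
        exit
    }'
}

# Possible use:
#   sensors | sensor_temp CPUTIN
```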

  1. Add temperature tracking to cacti. SYSTIN and CPUTIN look like the ones to watch.

Test snmp output; might need to enable MIBS in /etc/default/snmpd. This should give you interesting output.

# snmpwalk -v 2c -c public localhost 1.3.6.1.4.1.2021.13.16
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsIndex.15 = INTEGER: 15
LM-SENSORS-MIB::lmTempSensorsIndex.16 = INTEGER: 16
LM-SENSORS-MIB::lmTempSensorsIndex.17 = INTEGER: 17
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: temp1
LM-SENSORS-MIB::lmTempSensorsDevice.15 = STRING: SYSTIN
LM-SENSORS-MIB::lmTempSensorsDevice.16 = STRING: CPUTIN
LM-SENSORS-MIB::lmTempSensorsDevice.17 = STRING: AUXTIN
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 52000
LM-SENSORS-MIB::lmTempSensorsValue.15 = Gauge32: 37000
LM-SENSORS-MIB::lmTempSensorsValue.16 = Gauge32: 36500
..ad infinitum

I followed Eric A Hall's instructions to get the graphs for temps and fan2 into Cacti.
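The lmTempSensorsValue readings in the snmpwalk output are in thousandths of a degree C (52000 matches the +52.0 reported by sensors). A minimal sketch of the conversion, assuming the "Gauge32: N" line format shown above; snmp_temp_c is a made-up name.

```shell
# Read lmTempSensorsValue lines on stdin and print each reading
# in degrees C with one decimal place.
# snmp_temp_c is a hypothetical helper name.
snmp_temp_c() {
    awk -F'Gauge32: ' 'NF == 2 { printf "%.1f\n", $2 / 1000 }'
}

# Possible use:
#   snmpget -v 2c -c public localhost LM-SENSORS-MIB::lmTempSensorsValue.16 | snmp_temp_c
```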

Software

I maintain this server using Ansible, so the details are tucked away in the fogg-ansible git project.

Netatalk

Update 2017 -- Don't do this! Use Docker now! See Netatalk 3 on Debian.

Installing Netatalk makes Tern visible as an Apple server.

https://daniel-lange.com/archives/102-Apple-Timemachine-backups-on-Debian-8-Jessie.html

Built deb packages for netatalk from git on bellman and installed them here.

apt-get install mysql-common libcrack2 libmysqlclient18 avahi-daemon
dpkg --install libatalk16_3.1.7-1_amd64.deb netatalk_3.1.7-1_amd64.deb

emacs /etc/netatalk/afp.conf

systemctl enable avahi-daemon
systemctl enable netatalk
systemctl start avahi-daemon
systemctl start netatalk

Contents of afp.conf file

Each "valid users" entry is the name of a user in the local passwd file on Tern.

[Global]
; Global server settings
vol preset = default_for_all
log file = /var/log/netatalk.log
uam list = uams_dhx2.so,uams_clrtxt.so
save password = no

[default_for_all]
file perm = 0664
directory perm = 0774
cnid scheme = dbd
 
[Homes]
basedir regex = /home

[TimeMachine_Swift]
path = /home/timemachine/swift
time machine = yes
vol size limit = 3000000
valid users = julie

[TimeMachine_Stellar]
# both plover and stellar get backed up here
path = /home/timemachine/stellar
time machine = yes
vol size limit = 3000000
valid users = bwilson

Time Machine, on the Mac

You have to allow TimeMachine to write to an "unsupported volume" with this command

defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1

On Stellar only, for some reason Time Machine would not connect to Tern until I manually created a sparse bundle on the server. Once connected, it would not use that bundle, so I deleted it (see "Note on filenames" below). Crazy. Swift and Plover had no problems creating their own bundles. To create the bundle, connect to the server from Finder, then in Terminal,

hdiutil create -size 300G -type SPARSEBUNDLE -verbose -fs HFS+J -volname "Time Machine" /Volumes/Timemachine_Stellar/stellar.sparsebundle

The SPARSEBUNDLE type means the image grows in small band files (8 MB each by default) as data is written; it won't immediately allocate 300GB. 300GB is the limit and you can resize it later if you need to. Stellar has a 256GB drive and right now has 115GB on it. After executing this command, the empty bundle consumed 430M of space. I remember when 430M was a huge amount of space. :-)

Note on filenames: First I mimicked what I saw created by Plover: Plover.sparsebundle. I created Stellar.sparsebundle and ran Time Machine. It smiled and happily accepted that as a destination and immediately created 'stellar.sparsebundle' next to it. I think this is a screw-up waiting to bite me, so I shut the Time Machine backup down, removed both bundles from the server with "rm -rf", revised the above sample command and created 'stellar.sparsebundle'.

NOW it's busily creating "stellar 1.tmp". Sigh, seems very Mac-ish. I wonder if that's good or bad. I will let it play and go do something else. This Mac is backed up to an external drive right now anyway so I can't think of any unbearable worst-case scenarios.

I am thinking about what a great program 'rsync' is...

After the "preparing" stage it renamed the folder to "stellar 1.sparsebundle", apparently ignoring the one I created. When it's done backing up I will nuke the one I created manually and see if it remains happy.

Timemachine and wake-on-lan

I put this machine on a schedule to power off every night and BIOS power on at 6AM (1300 UTC). The intent is to reduce run time and power usage. Running the spinning drives only 12 hours a day means they should last twice as long.

To allow waking up the system remotely when it's off, I enabled wake-on-lan. In BIOS, I enabled wake-on-PCI and in a shell I used ethtool -s eth0 wol g to enable the magic packet (g) in the ethernet interface.
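To confirm the magic-packet setting took, the "Wake-on:" line in the ethtool output can be checked. A minimal sketch, assuming the usual ethtool output layout; wol_magic_enabled is a made-up name. On some drivers the setting is reset at boot, so it may be worth re-checking after a restart.

```shell
# Read `ethtool <iface>` output on stdin; succeed (exit 0) only if
# the current "Wake-on:" setting includes g (magic packet).
# wol_magic_enabled is a hypothetical helper name.
wol_magic_enabled() {
    awk '$1 == "Wake-on:" { found = ($2 ~ /g/) } END { exit !found }'
}

# Possible use:
#   ethtool eth0 | wol_magic_enabled && echo "magic packet wake is on"
```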

On bellman I can use this command to wake up tern: sudo etherwake 00:25:22:c4:0f:b6. I put this in a daily cron job so that even if the BIOS thing does not work then tern will still come online every morning.