Bellman: Difference between revisions
Brian Wilson (talk | contribs) m →todo |
Brian Wilson (talk | contribs) |
Revision as of 14:18, 11 April 2018
Bellman is a Debian Linux server. It is on a UPS and lives in my studio.
"What's the good of Mercator's North Poles and Equators,
Tropics, Zones, and Meridian Lines?"
So the Bellman would cry: and the crew would reply
"They are merely conventional signs!"
--Lewis Carroll, The Hunting of the Snark
todo
owncloud asterisk 15 backups
Packages I installed on Debian 9
curl emacs-nox dnsmasq dnsutils docker-ce mailutils net-tools postfix rsync sudo
Firewall
See https://blog.daknob.net/debian-firewall-docker/ for ideas.
I use my own bash script to load iptables rules.
Notes on printing
The Linux drivers that Brother provides for my HL-L2320D printer don't work, so I set up a raw queue on Bellman and then use the appropriate driver (manually selected) on client computers. It works fine.
Back ups
Disk
bwilson@bellman:/green/BACKUPS$ sudo mkdir bellman
bwilson@bellman:/green/BACKUPS$ cd bellman
bwilson@bellman:/green/BACKUPS/bellman$ sudo rsync -av --exclude /proc --exclude /var/tmp --exclude /sys --exclude /dev --exclude /home --exclude /green / .
Back up mysql - important ones are asterisk and owncloud, everything else can go. Well okay I guess maybe phpmyadmin can stay too.
sudo mkdir bellman_mysql
cd bellman_mysql
for i in asterisk mysql owncloud phpmyadmin yaris ; do
    mysqldump "$i" > "$i.sql"
done
BBR congestion
See https://www.cyberciti.biz/cloud-computing/increase-your-linux-server-internet-speed-with-tcp-bbr-congestion-control/ for example.
Is kernel ready?
uname -a
Linux bellman 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
grep 'CONFIG_TCP_CONG_BBR' /boot/config-$(uname -r)
grep 'CONFIG_NET_SCH_FQ' /boot/config-$(uname -r)
egrep 'CONFIG_TCP_CONG_BBR|CONFIG_NET_SCH_FQ' /boot/config-$(uname -r)
sudo -s
cat > /etc/sysctl.d/10-custom-kernel-bbr.conf <<EOF
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
EOF
sysctl --system
* Applying /etc/sysctl.d/10-custom-kernel-bbr.conf ...
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
* Applying /etc/sysctl.d/30-postgresql-shm.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
net.ipv4.ip_forward = 1
* Applying /etc/sysctl.d/asterisk.conf ...
kernel.core_uses_pid = 1
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2
* Applying /etc/sysctl.conf ...
net.ipv4.ip_forward = 1
That's that.
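The "Is kernel ready?" check can be wrapped in a small function. This is only a sketch: the function name bbr_ready is made up for illustration, and it assumes the kernel config file uses the usual OPTION=y or OPTION=m lines.

```shell
#!/bin/sh
# bbr_ready CONFIG_FILE
# Succeeds only if both the BBR congestion control and the fq qdisc are
# either built in (=y) or available as modules (=m) in the given kernel
# config file. The function name is hypothetical, not part of any package.
bbr_ready() {
    config_file=$1
    grep -Eq '^CONFIG_TCP_CONG_BBR=[ym]$' "$config_file" &&
        grep -Eq '^CONFIG_NET_SCH_FQ=[ym]$' "$config_file"
}

# Typical use on the running kernel:
#   bbr_ready "/boot/config-$(uname -r)" && echo "kernel is ready for BBR"
# After "sysctl --system", the active setting can be read back with:
#   sysctl -n net.ipv4.tcp_congestion_control
```

The trailing "$" in each pattern keeps CONFIG_NET_SCH_FQ from also matching CONFIG_NET_SCH_FQ_CODEL.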
Services that run here
I am in the process of migrating some of these services to run in Docker containers.
- Asterisk to run our phones
- Festival for text to speech in Asterisk
- git (see Running my own git server)
- gpsd
- Nginx as a reverse proxy for dockerized services
- cups to share Brother and Canon printers
- ssh to allow remote access
- fail2ban to cut off break in attempts via ssh
I wasted time trying to get "nut" installed to monitor the Cyberpower 1500AVR and failed. It's not worth the effort. (2017-09-07)
DOCKERIZED!
Install Docker: https://docs.docker.com/install/linux/docker-ce/debian/
Set up /etc/default/docker - use this
DOCKER_OPTS="--dns 192.168.123.2"
I tried to use systemd to start and stop docker containers but ran into a systemd bug that causes a MAC address error. I decided to use the "restart" option instead.
docker update
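For reference, a restart policy can be set on an existing container with the --restart option of docker update. The sketch below only prints the command unless DRY_RUN=0 is set. The policy unless-stopped and the container names are assumptions for illustration, not necessarily what runs on Bellman.

```shell
#!/bin/sh
# run CMD...: print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# set_restart_policy POLICY CONTAINER...
# POLICY is one of docker's restart policies (no, on-failure, always,
# unless-stopped). The container names passed in are hypothetical.
set_restart_policy() {
    policy=$1
    shift
    run docker update --restart="$policy" "$@"
}

# Example (prints the command, changes nothing):
#   set_restart_policy unless-stopped owncloud mysql
```

With a restart policy in place the Docker daemon brings the containers back up itself, so no per-container systemd unit is needed.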
Click through on each link to see details on the particular service.
- Logitech SqueezeBox server. See Streaming media for installation notes.
- Owncloud server, linked to the mysql container.
- MySQL for asterisk and Owncloud
- Nginx web server.
- Apple Time Machine and Netatalk 3 for Mac backups.
- Unifi to manage my Ubiquiti WiFi access point.
- Vault for secure storage of credentials.
Router migration
First I had a Mikrotik router (RB750GL)
Then I ran Bellman as the router for a few years
Currently I use a Ubiquiti EdgeRouter-X
Bellman as a router
I dumped my little Mikrotik router and then used Bellman as a router and firewall. The main reason was that Mikrotik (MT) does not support OpenVPN over UDP, and I needed to set up a connection to a remote site with OpenVPN. I wanted a site-to-site connection (not machine-to-machine).
- DHCP migration was easy.
- DNS migration: same
- Firewall: meh, probably easier to maintain in Bellman, I have scripts there already; replicated the same rules
DHCP migration
I had to open port 67 on Bellman's firewall, after that it was done.
See /etc/dnsmasq.d/wildsong.biz
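Opening port 67 amounts to one iptables rule. This is a sketch, not a copy of the real script: it assumes the rule goes in the INPUT chain and that eth0 is the LAN port, and it only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# run CMD...: print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# allow_dhcp LAN_IF
# Accept DHCP requests (UDP, destination port 67) arriving on the LAN side
# so that dnsmasq can answer them. The function name is hypothetical.
allow_dhcp() {
    lan_if=$1
    run iptables -A INPUT -i "$lan_if" -p udp --dport 67 -j ACCEPT
}

# Example (prints the rule, changes nothing):
#   allow_dhcp eth0
```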
DNS migration
I read out the mappings from the MT and put them into /etc/hosts and it works.
Since dnsmasq is doing DHCP on Bellman, it catches names when devices register and puts them into DNS too. Slick.
Firewall migration
Most of the Mikrotik port forwarding rules simply passed traffic to Bellman, so instead of NAT I have to open outside ports for access to Twilio. I had already done that as part of the Vastra installation.
These services run on Bellman, so they no longer require NAT and port forwarding. This makes life a lot easier.
- SIP over TCP
- SIP over UDP
- RTP for VOIP
- IAX (disabled)
- HTTP
Bellman has to do NAT for LAN traffic outbound.
To configure the firewall, I implemented two bash scripts: one fires on eth0 (the LAN port) and the other fires on eth1 (the WAN port).
I continue to use eth0 as the LAN port. The WAN port is eth1, which gets its address from dhclient.
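The outbound NAT part of such a script usually comes down to three rules. The sketch below is an assumption about the general shape, not the contents of the real scripts. It prints the commands unless DRY_RUN=0 is set, and it relies on net.ipv4.ip_forward = 1 already being set by sysctl.

```shell
#!/bin/sh
# run CMD...: print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# nat_rules LAN_IF WAN_IF
# Masquerade LAN traffic leaving through the WAN port, let the LAN start
# connections outward, and let only reply traffic back in.
# The function name is hypothetical.
nat_rules() {
    lan_if=$1
    wan_if=$2
    run iptables -t nat -A POSTROUTING -o "$wan_if" -j MASQUERADE
    run iptables -A FORWARD -i "$lan_if" -o "$wan_if" -j ACCEPT
    run iptables -A FORWARD -i "$wan_if" -o "$lan_if" \
        -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
}

# Example (prints the rules, changes nothing):
#   nat_rules eth0 eth1
```

MASQUERADE rather than SNAT suits a WAN port whose address comes from DHCP, because the source address is looked up each time instead of being fixed in the rule.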
OpenVPN configuration
Additional software tools installed
- X11 desktop so I can use it from my workbench.
- "autole" script to automate renewal of Let's Encrypt certificates, in /usr/local/sbin. See https://eblog.damia.net/2015/12/03/lets-encrypt-automation-on-debian/
location /.well-known {
    alias le/.well-known;
}
History
2018-03-20 - Installed 8TB Archive drive, for TimeMachine and Owncloud storage. Moved from 120GB SSD to 750GB Samsung Evo 840. Installed clean copy of Stretch on the SSD.
2017-09-06 - Upgrade to 32GB RAM, yay! I need to do something with all that space. I did move /tmp to RAM; see SSD optimizations. I also removed a lot of dead code including lightdm (how'd that get in there?)
bwilson@bellman:~$ free
              total        used        free      shared  buff/cache   available
Mem:       32937080     2287376    27811208       25700     2838496    30153064
2017-08-25 - Migrated mariadb and owncloud to Docker
2017-07-25 - Migrated logitech media server to Docker
2017-07-25 - Upgraded to Debian 9 (Stretch)
2016-10-16 - Seeing disk errors in the WDC. It's 6 years old! REPLACE!!! Installed a new Seagate Barracuda ST2000DM006 2TB ($70) on 2016-10-26. Added a fan in the hard drive section of the case, too.
2016-01-26 - Installed VirtualBox 5.0.14 and Vagrant 1.8.1 (from DEB files, repos are too old) and started migration of services.
2015-12-?? - Moved to hardware formerly used for Vastra2
2015-07-10 - Added lm-sensors and added temperature tracking to Cacti.
2015-07-01 - Replaced APC UPS with Cyberpower. Installed monitoring software.
2015-06-19 - reconnected the MX330 printer and shared it.
2015-06-18 - upgraded to Debian 8 Jessie
2013-12-29 - returned from X-Mas and discovered Bellman won't boot. Snarks about a degraded RAID. Darn.
2013 Mar - Installed Linux Mint 14 so that I could use Makerware with my new Replicator 2
2013 Jan - Seagate Barracuda 2TB Green drive died (ST2000DL003, S/N 5YD77CTE). Replaced with a Barracuda 2TB mirror.
2011 Dec - Been doing PostGIS experiments so I upgraded the hardware.
2010 Jan - I just started this section but I have had this machine online for at least a couple years now.
2015-06-19 back up
Note this includes /home but not /green.
cd /
tar --one-file-system -czvf /mnt/bellman_root.tar.gz .
2013-12-29 Rescue from boot fail
I no longer need a desktop environment on the small server, because I moved my main desktop next to the 3D printer. So I put Debian back on the server again. So I am going to try a Debian rescue image.
Diagnosis
Step 1. Build rescue thumbdrive. Download from http://debian.osuosl.org/ and copy image to thumbdrive
sudo cp debian-live-7.2-amd64-rescue.iso /dev/sdX
sudo sync
sudo eject /dev/sdX
where X is the appropriate drive letter, do NOT use the wrong letter!
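One way to guard against the wrong letter is to refuse to write unless the kernel marks the target as removable. This is a sketch under an assumption: most USB thumb drives report 1 in /sys/block/NAME/removable, but some USB devices report 0, so a refusal is not proof of a wrong letter. Both function names are made up for illustration.

```shell
#!/bin/sh
# is_removable DEVICE [SYSFS_DIR]
# Succeeds if the kernel flags the whole-disk device (e.g. /dev/sdd) as
# removable. SYSFS_DIR defaults to /sys/block and exists only so the
# check can be exercised against a fake directory.
is_removable() {
    dev_name=$(basename "$1")
    sysfs_dir=${2:-/sys/block}
    [ "$(cat "$sysfs_dir/$dev_name/removable" 2>/dev/null)" = "1" ]
}

# write_image IMAGE DEVICE
# Copy the image to the device only when the device looks removable.
write_image() {
    image=$1
    target=$2
    if ! is_removable "$target"; then
        echo "refusing to write: $target is not flagged as removable" >&2
        return 1
    fi
    sudo cp "$image" "$target" && sudo sync
}

# Example:
#   write_image debian-live-7.2-amd64-rescue.iso /dev/sdX
```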
Step 2. Boot Bellman with the thumb drive
Step 3. Look around
Using hdparm -i
- sda Vertex SSD S/N OCZ-9UDI676M56Z4IR8P
- sdb Seagate 2TB ST2000DM001-9YN164 S/N Z240BVP5
- sdc Seagate 2TB ST2000DM001-9YN164 S/N Z240A0H1
- sdd rescue drive
# fdisk -l /dev/sda
Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders, total 234441648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009c7c9

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048   218460159   109229056   83  Linux
/dev/sda2       218462206   234440703     7989249    5  Extended
/dev/sda5       218462208   234440703     7989248   82  Linux swap / Solaris
sdb and sdc don't have partition tables as they are used in a RAID (see 2013 Jan entry)
See LVM page
cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda[1]
      117218240 blocks [2/1] [_U]

md127 : active raid1 sdb[0] sdc[1]
      1953514496 blocks [2/2] [UU]

unused devices: <none>

mdadm --detail /dev/md126
/dev/md126:
        Version : 0.90
  Creation Time : Thu Feb 21 06:23:36 2013
     Raid Level : raid1
     Array Size : 117218240 (111.79 GiB 120.03 GB)
  Used Dev Size : 117218240 (111.79 GiB 120.03 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 126
    Persistence : Superblock is persistent

    Update Time : Thu Feb 21 06:30:49 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 9f48e120:81a0f612:edd8d016:611227ea
         Events : 0.12

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8        0        1      active sync   /dev/sda

mdadm --detail /dev/md127
/dev/md127:
        Version : 0.90
  Creation Time : Mon Jan 7 04:12:45 2013
     Raid Level : raid1
     Array Size : 1953514496 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Mon Dec 30 17:21:21 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 462f6c0c:68770b3a:b268e686:64f77a36
         Events : 0.131

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
Looks like there are 2 RAIDs, and md126 is the broken one. It should be the SSD and something else? Time to open the box and see what's in there.
fdisk /dev/md126
Command (m for help): p

Disk /dev/md126: 120.0 GB, 120031477760 bytes
255 heads, 63 sectors/track, 14592 cylinders, total 234436480 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009c7c9

      Device Boot      Start         End      Blocks   Id  System
/dev/md126p1   *        2048   218460159   109229056   83  Linux
/dev/md126p2       218462206   234440703     7989249    5  Extended
/dev/md126p5       218462208   234440703     7989248   82  Linux swap / Solaris

Command (m for help):
Conclusion - I was planning on doing a RAID mirror and never got the second drive installed. I think I might have used it in Stellar instead. Stellar's drive failed and needed immediate replacement. Something failed on the SSD and now it's not booting, but from what I can tell this has nothing to do with the hardware. It complains about the RAID missing a drive, but that's not new.
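The degraded array can also be picked out of /proc/mdstat mechanically: a healthy mirror shows [UU], and a missing member shows up as an underscore, as in [_U]. A minimal sketch (the function name degraded_arrays is made up for illustration):

```shell
#!/bin/sh
# degraded_arrays [MDSTAT_FILE]
# Print the name of every md array whose member map (for example [_U])
# contains an underscore, which marks a missing or failed member.
# Reads /proc/mdstat unless another file is given.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }    # remember the current array
        /\[[U_]*_[U_]*\]/   { print name }   # member map with a gap
    ' "${1:-/proc/mdstat}"
}

# Example:
#   degraded_arrays
```

Run against the mdstat output above, this prints md126 and skips md127.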
2014 Jan 01 rebuild
Do as in the Linux Mint section below
Also note:
PRESERVE MYSQL!!
/etc/hdparm.conf
2013 Jan data mirror build
apt-get install mdadm lvm2
mdadm --create --metadata=0.90 --level=mirror --raid-devices=2 /dev/md0 /dev/sdb /dev/sdc
cat /proc/mdstat
pvcreate /dev/md0
vgcreate vg_mirror /dev/md0
lvcreate --verbose --extents 100%FREE -n lv_mirror vg_mirror
mkfs.ext4 /dev/vg_mirror/lv_mirror
mount /dev/vg_mirror/lv_mirror /green
dd if=/dev/zero of=/green/swapfile1 bs=1024 count=1048576
2013 Mar Linux Mint rebuild
Had to install mdadm and lvm2, but then it recognized the LVM drives. All I had to do was mount the RAID on /green.
sudo apt-get install synaptic nfs-kernel-server ssh mysql-server phpmyadmin ntp winbind smartmontools postfix
Re-install dropbox
Re-install squeezeboxserver from Logitech. http://bellman:9000/
Set up cups again
Copy over /etc/exports file
Need AFP support for Apple Timemachine. See Netatalk 3 on Debian
December 2011 upgrade
Bellman had an Intel Little Falls Atom 230 mini-ITX main board + 2GB RAM until Dec 2011. Before that, Bellman was an Athlon desktop system; I recycled the name because I like it.
Hardware
Newegg 09/03/2017 Inv 153021116
Newegg 10/16/2016 Inv 143374043
Newegg 11/21/2014 Inv 120335149
- SUPERMICRO SYS-5018A-FTN4 1U Rackmount Server Barebone FCBGA 1283 DDR3 1600/1333
- SUPERMICRO MCP-220-00051-0N Single 2.5" Fixed HDD Mounting Bracket
- 4 x Kingston 8GB 204-Pin DDR3 SO-DIMM ECC Unbuffered DDR3 1600 (PC3 12800) Server Memory Model KVR16LSE11 (3 added 2017-09-07)
- sda = Samsung MZ7WD120HCFV-00003 120GB
- sdb = Seagate Archive 8TB (Installed 3/18/18, purchased 9/03/17)
eth0 00:25:90:F7:37:72
Bellman is configured to bring up a management interface on this ethernet interface too. (Optionally there is a separate management interface. This server has 5 ethernet ports, 4 on the motherboard and 1 on the management card.)
Operating system
- Debian 8
Using BTRFS now on the Seagate drive. Sort of just to be consistent with what is on Tern though this is not RAID 0. Just one drive. I partitioned the Seagate this time, partition 1 could be a 50GB OS install, 2 is 50GB swap, and 3 is data (/green)
fstab
Printing
Canon MX330 "All in one" -- CUPS finds and sets it up if you plug it in and power it on.
This is my current /etc/cups/printers.conf
# Written by cupsd
# DO NOT EDIT THIS FILE WHEN CUPSD IS RUNNING
<Printer Brother_HL-2140_series>
UUID urn:uuid:24067d9a-1b41-370d-5ecf-dbb408aaa659
Info Brother HL-2140 series
Location Electronic Chronometry Laboratory
MakeModel Brother HL-2140 Foomatic/hl1250
DeviceURI usb://Brother/HL-2140%20series?serial=J8J894840
State Idle
StateTime 1388723199
Type 8433668
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy stop-printer
</Printer>
<Printer MX330-series>
UUID urn:uuid:54a86dc0-0994-37af-7d65-f084999a7307
Info Canon MX330 series
Location Electronic Chronometry Laboratory
MakeModel Canon PIXMA MX330 - CUPS+Gutenprint v5.2.9
DeviceURI usb://Canon/MX330%20series?serial=22F601&interface=1
State Idle
StateTime 1388637781
Type 4
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy retry-job
</Printer>
Software
Media server: it hosts my music collection. I keep the files in MP3 format, having transferred them from my CDs using grip. See Music collection.
File server: I keep my home directory here and NFS mount it on the desktop machine Raven. I removed Samba when upgrading to Stretch because I was not using it.
I edit files with emacs23
Spin down the Seagate drive
To reduce wear on the spinning hard drive, I am setting "apm" down to 127 (default is 254) so that the drive can spin down. I use the server mostly for ownCloud and media storage, so the drive can go to sleep at night and during long breaks. This should make it last longer.
smartctl -s apm,127 /dev/sdb
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
APM set to level 127 (intermediate level with standby)
smartctl -A /dev/sdb
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   100   006    Pre-fail  Always       -       92858576
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       32
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       149013
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       619
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       32
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   056   045    Old_age   Always       -       35 (Min/Max 19/35)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       67
194 Temperature_Celsius     0x0022   035   044   000    Old_age   Always       -       35 (0 17 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       619 (178 156 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1169083792
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       156038821
Automatic boot
I know it's possible to get this system to boot every day at a specific time because it's set to do that right now. I cannot find the setting! It's not in BIOS anywhere that I can see, and I can't find it in ipmitool either.
Sometimes I shut Bellman down at night, but it needs to boot in the morning before we get up so that the Logitech radio will work. The way to set it is NOT from BIOS, there is no user interface there. It's not from the IPMI web page either.
Maybe it's in here http://www.accuratesolution.net/asd/resume.htm
IPMI
Tips from Oracle: http://docs.oracle.com/cd/E19464-01/820-6850-11/IPMItool.html
Using ipmitool you can connect remotely so if the system is off you can turn it on. This means I could just script the turn on from another server...
ipmitool -H 192.168.1.3 -U ADMIN -P password chassis status
System Power         : off
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
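Scripting the turn-on from another server could look like the sketch below, which could then be run from cron on a machine that stays up. It uses the same address and ADMIN account as the command above. The function names are made up for illustration, and the password is passed with -P only to match these notes (it is visible in the process list that way).

```shell
#!/bin/sh
# is_powered_off
# Reads "ipmitool ... chassis status" output on stdin and succeeds when
# the "System Power" line says off.
is_powered_off() {
    grep -Eq '^System Power[[:space:]]*:[[:space:]]*off[[:space:]]*$'
}

# wake_server HOST USER PASSWORD
# Ask the BMC for the chassis state and power the machine on only if it
# is currently off.
wake_server() {
    host=$1
    user=$2
    pass=$3
    if ipmitool -I lanplus -H "$host" -U "$user" -P "$pass" chassis status |
        is_powered_off; then
        ipmitool -I lanplus -H "$host" -U "$user" -P "$pass" chassis power on
    fi
}

# Example:
#   wake_server 192.168.1.3 ADMIN password
```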
Read environmental sensors
ipmitool -I lanplus -H 192.168.1.3 -P password -U ADMIN sdr elist full
CPU Temp         | 01h | lnr |  3.1  | 36 degrees C
System Temp      | 0Bh | ok  |  7.1  | 33 degrees C
Peripheral Temp  | 0Ch | ok  |  7.2  | 34 degrees C
DIMMA1 Temp      | B0h | ok  | 32.64 | 29 degrees C
DIMMA2 Temp      | B1h | ns  | 32.65 | No Reading
DIMMB1 Temp      | B4h | ns  | 32.68 | No Reading
DIMMB2 Temp      | B5h | ns  | 32.69 | No Reading
FAN1             | 41h | ok  | 29.1  | 3200 RPM
FAN2             | 42h | ns  | 29.2  | No Reading
FAN3             | 43h | ns  | 29.3  | No Reading
VCCP             | 20h | ok  |  3.2  | 0.82 Volts
VDIMM            | 24h | ok  | 32.1  | 1.33 Volts
12V              | 30h | ok  |  7.17 | 12.32 Volts
5VCC             | 31h | ok  |  7.33 | 4.95 Volts
3.3VCC           | 32h | ok  |  7.32 | 3.30 Volts
VBAT             | 33h | ok  |  7.18 | 2.97 Volts
5V Dual          | 37h | ok  |  7.15 | 4.95 Volts
3.3V AUX         | 38h | ok  |  7.12 | 3.28 Volts
Chassis Intru    | AAh | ok  | 23.1  |
System event log (SEL)
ipmitool -I lanplus -H 192.168.1.3 -P password -U ADMIN sel list last 10
43 | Pre-Init |0004692099| Unknown #0xff | | Asserted
44 | Pre-Init |0004692100| Unknown #0xff | | Asserted
45 | Pre-Init |0004692106| Unknown #0xff | | Asserted
46 | Pre-Init |0004692108| Unknown #0xff | | Asserted
47 | Pre-Init |0004692109| Unknown #0xff | | Asserted
48 | Pre-Init |0004692110| Unknown #0xff | | Asserted
49 | Pre-Init |0004692116| Unknown #0xff | | Asserted
4a | Pre-Init |0004692118| Unknown #0xff | | Asserted
4b | Pre-Init |0004692119| Unknown #0xff | | Asserted
4c | Pre-Init |0004692120| Unknown #0xff | | Asserted
Backups
I am about to try Using Bacula for backups