Bellman: Difference between revisions

* [[Owncloud]] server, linked to the mysql container.
* [[MySQL]] for asterisk and [[Owncloud]]
* [[Nginx]] web server.
* [[Timemachine]] for Mac backups.
* [[Unifi]] to manage my Ubiquiti WiFi access point.
* [[Vault]] for secure storage of credentials.

Revision as of 03:29, 23 March 2018

Bellman is a Debian Linux server. It is on a UPS and normally lives in my electronics lab. Right now it's in a self-storage locker in Cotati. Waaa!!! I hate this part of moving.

todo

email, timemachine, owncloud, asterisk 15, firewall, backups

Stuff I installed: docker-ce net-tools sudo postfix emacs-nox

"What's the good of Mercator's North Poles and Equators,
Tropics, Zones, and Meridian Lines?"
So the Bellman would cry: and the crew would reply
"They are merely conventional signs!"
--Lewis Carroll, The Hunting of the Snark

Notes on printing

The Brother drivers for my HL-L2320D suck, so I set up a raw driver on Bellman and then use the appropriate driver (manually selected) on client computers. It works fine.

Back ups

Disk

sudo mkdir bellman
bwilson@bellman:/green/BACKUPS$ cd bellman
bwilson@bellman:/green/BACKUPS/bellman$ sudo rsync -av --exclude /proc --exclude /var/tmp --exclude /sys --exclude /dev --exclude /home --exclude /green / .
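
With this many excludes, a typo in one flag is easy to miss. A minimal sketch of the same backup using rsync's --exclude-from option, so the list lives in one file. The helper name write_excludes and the /tmp path in the usage comment are hypothetical; the patterns are the paths excluded in the command above.

```shell
# Write the exclude list to a file, one pattern per line.
# write_excludes is a hypothetical helper name; $1 is the file to create.
write_excludes() {
    cat > "$1" <<'EOF'
/proc
/sys
/dev
/var/tmp
/home
/green
EOF
}

# Usage (run from /green/BACKUPS/bellman):
#   write_excludes /tmp/bellman.excludes
#   sudo rsync -avn --exclude-from=/tmp/bellman.excludes / .   # -n = dry run
#   sudo rsync -av  --exclude-from=/tmp/bellman.excludes / .
```

Running once with -n first shows what would be copied without writing anything.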

Back up mysql - important ones are asterisk and owncloud, everything else can go. Well okay I guess maybe phpmyadmin can stay too.

sudo mkdir bellman_mysql
cd bellman_mysql
for i in asterisk mysql owncloud phpmyadmin yaris ; do
  mysqldump "$i" > "$i.sql"
done


BBR congestion control

See https://www.cyberciti.biz/cloud-computing/increase-your-linux-server-internet-speed-with-tcp-bbr-congestion-control/ for example.

Is kernel ready?

uname -a
Linux bellman 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
grep 'CONFIG_TCP_CONG_BBR' /boot/config-$(uname -r)
grep 'CONFIG_NET_SCH_FQ' /boot/config-$(uname -r)
egrep 'CONFIG_TCP_CONG_BBR|CONFIG_NET_SCH_FQ' /boot/config-$(uname -r)
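
A plain grep for CONFIG_NET_SCH_FQ also matches CONFIG_NET_SCH_FQ_CODEL, so the output needs to be read by eye. As a sketch, the check can be made exact by anchoring the pattern. The function name check_bbr_config is hypothetical; it takes the config file path as an optional argument.

```shell
# Report whether a kernel config file enables BBR and the fq qdisc.
# check_bbr_config is a hypothetical helper; $1 defaults to the config
# file of the running kernel.
check_bbr_config() {
    config=${1:-/boot/config-$(uname -r)}
    status=0
    for opt in CONFIG_TCP_CONG_BBR CONFIG_NET_SCH_FQ; do
        # Accept built-in (=y) or module (=m); the anchors keep
        # CONFIG_NET_SCH_FQ from matching CONFIG_NET_SCH_FQ_CODEL.
        if grep -Eq "^${opt}=(y|m)$" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            status=1
        fi
    done
    return $status
}
```

Calling check_bbr_config with no argument checks the running kernel and returns non-zero if either option is missing.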

sudo -s 
cat > /etc/sysctl.d/10-custom-kernel-bbr.conf <<EOF
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
EOF

sysctl --system
* Applying /etc/sysctl.d/10-custom-kernel-bbr.conf ...
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
* Applying /etc/sysctl.d/30-postgresql-shm.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
net.ipv4.ip_forward = 1
* Applying /etc/sysctl.d/asterisk.conf ...
kernel.core_uses_pid = 1
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2
* Applying /etc/sysctl.conf ...
net.ipv4.ip_forward = 1

That's that.
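
To confirm the settings took effect, the live values can be read back. Sysctl names map to files under /proc/sys, so a small sketch can check both at once. The function name bbr_active is hypothetical, and the optional argument exists only so the check can be pointed at a different directory tree.

```shell
# bbr_active is a hypothetical helper. $1 is the sysctl tree to read,
# defaulting to /proc/sys (net.ipv4.x lives at /proc/sys/net/ipv4/x).
bbr_active() {
    root=${1:-/proc/sys}
    cc=$(cat "$root/net/ipv4/tcp_congestion_control")
    qdisc=$(cat "$root/net/core/default_qdisc")
    echo "congestion control: $cc, default qdisc: $qdisc"
    # Succeed only when both values match the config file above.
    [ "$cc" = bbr ] && [ "$qdisc" = fq ]
}
```

Running bbr_active prints the two current values and returns non-zero if either differs from what the config file requests.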

Services that run here

I am in the process of migrating some of these services to run in Docker containers.

  • Asterisk to run our phones
  • Festival for text to speech in Asterisk
  • git (see Running my own git server)
  • gpsd
  • Nginx as a reverse proxy for dockerized services
  • cups to share Brother and Canon printers
  • ssh to allow remote access
  • fail2ban to cut off break-in attempts via ssh

I wasted time trying to get "nut" installed to monitor the Cyberpower 1500AVR and failed. It's not worth the effort. (2017-09-07)

DOCKERIZED!

Install Docker: https://docs.docker.com/install/linux/docker-ce/debian/

Click through on each link to see details on the particular service.

Router migration

First I had a Mikrotik router (RB750GL)

Then I ran Bellman as the router for a few years

Currently I use a Ubiquiti EdgeRouter-X

Bellman as a router

I dumped my little Mikrotik router and then used Bellman as a router and firewall. The main reason was that the Mikrotik does not support OpenVPN over UDP, and I needed to set up an OpenVPN connection to a remote site. I wanted site-to-site (not machine-to-machine).

  1. DHCP migration was easy.
  2. DNS migration: same
  3. Firewall: meh, probably easier to maintain in Bellman, I have scripts there already; replicated the same rules

DHCP migration

I had to open port 67 on Bellman's firewall; after that it was done.

See /etc/dnsmasq.d/wildsong.biz

DNS migration

I read out the mappings from the MT and put them into /etc/hosts and it works.

Since dnsmasq is doing DHCP on Bellman, it catches names when devices register and puts them into DNS too. Slick.
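
The contents of /etc/dnsmasq.d/wildsong.biz are not shown here, so the following is only a sketch of the kind of file that gives this behavior. The option names are real dnsmasq options; every value (interface, address pool, lease time, router address) is an assumption for illustration, and write_dnsmasq_conf is a hypothetical helper.

```shell
# write_dnsmasq_conf is a hypothetical helper; $1 is the file to write.
# Every value below (interface, address range, lease time, router) is an
# assumption for illustration, not the contents of the real file.
write_dnsmasq_conf() {
    cat > "$1" <<'EOF'
# Serve DHCP and DNS on the LAN port only.
interface=eth0
# Hand out addresses from this pool with 12 hour leases.
dhcp-range=192.168.1.50,192.168.1.150,12h
# Tell clients which gateway to use.
dhcp-option=option:router,192.168.1.1
# Append the local domain to plain names from /etc/hosts and DHCP.
domain=wildsong.biz
expand-hosts
EOF
}

# Usage:
#   write_dnsmasq_conf /tmp/wildsong.biz
#   sudo cp /tmp/wildsong.biz /etc/dnsmasq.d/ && sudo systemctl restart dnsmasq
```

The domain and expand-hosts pair is what lets a plain name from /etc/hosts or a DHCP lease resolve with the local domain attached.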

Firewall migration

Most of the Mikrotik port forwarding rules simply passed traffic to Bellman, so instead of NAT I have to open outside ports for access to Twilio. I had already done that as part of the Vastra installation.

These services run on Bellman, so they no longer require NAT and port forwarding. This makes life a lot easier.

  • SIP over TCP
  • SIP over UDP
  • RTP for VOIP
  • IAX (disabled)
  • HTTP

Bellman has to do NAT for LAN traffic outbound.

To configure the firewall, I implemented two bash scripts: one fires on eth0 (LAN port) and the other fires on eth1 (WAN port).

I continue to use eth0 as the LAN port. I run dhclient on eth1, which is the WAN port.
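
The scripts themselves are not reproduced here. As a rough sketch of the core rules such a setup needs (outbound NAT for the LAN, plus the DHCP port mentioned earlier), assuming iptables and the eth0/eth1 roles described above. The run wrapper, DRY_RUN variable, and firewall_up name are hypothetical; by default the commands are only printed, not executed.

```shell
# Sketch of the NAT and DHCP rules; eth0 = LAN, eth1 = WAN as described above.
# DRY_RUN and run() are hypothetical; with DRY_RUN=1 commands are only printed.
DRY_RUN=${DRY_RUN:-1}
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

firewall_up() {
    # Let the kernel forward packets between interfaces.
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source address of LAN traffic leaving via the WAN port.
    run iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
    # Forward LAN traffic out, and allow replies back in.
    run iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT
    run iptables -A FORWARD -i eth1 -o eth0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    # DHCP requests from LAN clients arrive on UDP port 67.
    run iptables -A INPUT -i eth0 -p udp --dport 67 -j ACCEPT
}
```

Calling firewall_up prints the five commands; running it as root with DRY_RUN=0 would apply them.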

OpenVPN configuration

OpenVPN

Additional software tools installed

  • X11 desktop so I can use it from my workbench.

"autole" script to automate renewal of Let's Encrypt certificates, in /usr/local/sbin. See https://eblog.damia.net/2015/12/03/lets-encrypt-automation-on-debian/

location /.well-known {
    alias le/.well-known;
}

History

2018-03-20 - Installed 8TB Archive drive, for TimeMachine and Owncloud storage. Moved from 120GB SSD to 750GB Samsung Evo 840. Installed clean copy of Stretch on the SSD.

2017-09-06 - Upgrade to 32GB RAM, yay! I need to do something with all that space. I did move /tmp to RAM; see SSD optimizations. I also removed a lot of dead code including lightdm (how'd that get in there?)

bwilson@bellman:~$ free
              total        used        free      shared  buff/cache   available
Mem:       32937080     2287376    27811208       25700     2838496    30153064


2017-08-25 - Migrated mariadb and owncloud to Docker

2017-07-25 - Migrated logitech media server to Docker

2017-07-25 - Upgraded to Debian 9 (Stretch)

2016-10-16 - Seeing disk errors in the WDC. It's 6 years old! REPLACE!!! Installed new Seagate Barracuda ST2000DM006 2TB ($70) on 10-26-16. Added a fan in the hard drive section of the case, too.

2016-01-26 - Installed VirtualBox 5.0.14 and Vagrant 1.8.1 (from DEB files, repos are too old) and started migration of services.

2015-12-?? - Moved to hardware formerly used for Vastra2

2015-07-10 - Added lm-sensors and added temperature tracking to Cacti.

2015-07-01 - Replaced APC UPS with Cyberpower. Installed monitoring software.

2015-06-19 - reconnected the MX330 printer and shared it.

2015-06-18 - upgraded to Debian 8 Jessie

2013-12-29 - returned from X-Mas and discovered Bellman won't boot. Snarks about a degraded RAID. Darn.

2013 Mar - Installed Linux Mint 14 so that I could use Makerware with my new Replicator 2

2013 Jan - Seagate Barracuda 2TB Green drive died (ST2000DL003, S/N 5YD77CTE). Replaced with a Barracuda 2TB mirror.

2011 Dec - Been doing PostGIS experiments so I upgraded the hardware.

2010 Jan - I just started this section but I have had this machine online for at least a couple years now.

2015-06-19 back up

Note this includes /home but not /green.

cd /
tar --one-file-system -czvf /mnt/bellman_root.tar.gz .

2013-12-29 Rescue from boot fail

I no longer need a desktop environment on the small server, because I moved my main desktop next to the 3D printer. So I put Debian back on the server again. So I am going to try a Debian rescue image.

Diagnosis

Step 1. Build rescue thumbdrive. Download from http://debian.osuosl.org/ and copy image to thumbdrive

sudo cp debian-live-7.2-amd64-rescue.iso /dev/sdX
sudo sync
sudo eject /dev/sdX

where X is the appropriate drive letter. Do NOT use the wrong letter!
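
One way to lower the risk of picking the wrong letter is to check that the kernel considers the target removable before writing to it. A minimal sketch; is_removable is a hypothetical helper, and the check is only a hint (USB thumb drives normally report 1 here, but USB hard drives may report 0).

```shell
# is_removable is a hypothetical helper. $1 is a device such as sdd or
# /dev/sdd; $2 is the sysfs root, defaulting to /sys.
is_removable() {
    dev=${1#/dev/}
    sysfs=${2:-/sys}
    # The kernel writes 1 to this file for removable media.
    [ "$(cat "$sysfs/block/$dev/removable" 2>/dev/null)" = 1 ]
}

# Usage:
#   lsblk -o NAME,SIZE,MODEL,TRAN        # look for the usb transport
#   is_removable sdX && sudo cp debian-live-7.2-amd64-rescue.iso /dev/sdX
```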

Step 2. Boot Bellman with the thumb drive

Step 3. Look around

Using hdparm -i

  • sda Vertex SSD S/N OCZ-9UDI676M56Z4IR8P
  • sdb Seagate 2TB ST2000DM001-9YN164 S/N Z240BVP5
  • sdc Seagate 2TB ST2000DM001-9YN164 S/N Z240A0H1
  • sdd rescue drive
# fdisk -l /dev/sda

Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders, total 234441648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009c7c9

  Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048   218460159   109229056   83  Linux
/dev/sda2       218462206   234440703     7989249    5  Extended
/dev/sda5       218462208   234440703     7989248   82  Linux swap / Solaris

sdb and sdc don't have partition tables as they are used in a RAID (see 2013 Jan entry)

See LVM page

cat /proc/mdstat 
Personalities : [raid1] 
md126 : active raid1 sda[1]
      117218240 blocks [2/1] [_U]
      
md127 : active raid1 sdb[0] sdc[1]
      1953514496 blocks [2/2] [UU]
      
unused devices: <none>

mdadm --detail /dev/md126
/dev/md126:
        Version : 0.90
  Creation Time : Thu Feb 21 06:23:36 2013
     Raid Level : raid1
     Array Size : 117218240 (111.79 GiB 120.03 GB)
  Used Dev Size : 117218240 (111.79 GiB 120.03 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 126
    Persistence : Superblock is persistent

    Update Time : Thu Feb 21 06:30:49 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 9f48e120:81a0f612:edd8d016:611227ea
         Events : 0.12

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8        0        1      active sync   /dev/sda

mdadm --detail /dev/md127
/dev/md127:
        Version : 0.90
  Creation Time : Mon Jan  7 04:12:45 2013
     Raid Level : raid1
     Array Size : 1953514496 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Mon Dec 30 17:21:21 2013
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 462f6c0c:68770b3a:b268e686:64f77a36
         Events : 0.131

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc

Looks like there are 2 RAIDs, and md126 is the broken one. It should be the SSD and something else? Time to open the box and see what's in there.
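
The degraded array can also be picked out of /proc/mdstat mechanically: in the status line, an underscore in the final bracket marks a missing member. A small sketch of that check, with degraded_arrays as a hypothetical helper name.

```shell
# degraded_arrays is a hypothetical helper: print the name of every md
# array with a missing member. $1 defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        # Status lines look like "117218240 blocks [2/1] [_U]";
        # an underscore in the last field marks a missing member.
        /blocks/ && $NF ~ /_/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Against the mdstat output shown above this prints md126, which is easy to hook into a cron job or a login banner.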

fdisk /dev/md126

Command (m for help): p

Disk /dev/md126: 120.0 GB, 120031477760 bytes
255 heads, 63 sectors/track, 14592 cylinders, total 234436480 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009c7c9

      Device Boot      Start         End      Blocks   Id  System
/dev/md126p1   *        2048   218460159   109229056   83  Linux
/dev/md126p2       218462206   234440703     7989249    5  Extended
/dev/md126p5       218462208   234440703     7989248   82  Linux swap / Solaris

Command (m for help): 

Conclusion - I was planning on doing a RAID mirror and never got the second drive installed. I think I might have used it in Stellar instead; Stellar's drive failed and needed immediate replacement. Something failed on the SSD and now it's not booting, but from what I can tell this has nothing to do with the hardware. It complains about the RAID missing a drive, but that's not new.

2014 Jan 01 rebuild

Do as in the Linux Mint section below

Also note:

PRESERVE MYSQL!!

/etc/hdparm.conf

2013 Jan data mirror build

apt-get install mdadm lvm2
mdadm --create --metadata=0.90 --level=mirror --raid-devices=2 /dev/md0 /dev/sdb /dev/sdc
cat /proc/mdstat 
pvcreate /dev/md0 
vgcreate vg_mirror /dev/md0 
lvcreate --verbose --extents 100%FREE -n lv_mirror vg_mirror
mkfs.ext4 /dev/vg_mirror/lv_mirror 
mount /dev/vg_mirror/lv_mirror /green
dd if=/dev/zero of=/green/swapfile1 bs=1024 count=1048576

2013 Mar Linux Mint rebuild

Had to install mdadm and lvm2, but then it recognized the LVM drives. All I had to do was mount the RAID on /green.

sudo apt-get install synaptic nfs-kernel-server ssh mysql-server phpmyadmin ntp winbind smartmontools postfix

Re-install dropbox

Re-install squeezeboxserver from Logitech. http://bellman:9000/

Set up cups again

Copy over /etc/exports file

Need AFP support for Apple Timemachine. See Netatalk 3 on Debian

December 2011 upgrade

Bellman had an Intel Little Falls Atom 230 mini-itx main board + 2GB RAM until Dec 2011. Bellman used to be an Athlon desktop system; I recycled the name because I like it.

Hardware

Newegg 09/03/2017 Inv 153021116
Newegg 10/16/2016 Inv 143374043
Newegg 11/21/2014 Inv 120335149

  • SUPERMICRO SYS-5018A-FTN4 1U Rackmount Server Barebone FCBGA 1283 DDR3 1600/1333
  • SUPERMICRO MCP-220-00051-0N Single 2.5" Fixed HDD Mounting Bracket
  • 4 x Kingston 8GB 204-Pin DDR3 SO-DIMM ECC Unbuffered DDR3 1600 (PC3 12800) Server Memory Model KVR16LSE11 (3 added 2017-09-07)
  • sda = Samsung MZ7WD120HCFV-00003 120GB
  • sdb = Seagate Archive 8TB (Installed 3/18/18, purchased 9/03/17)

eth0 00:25:90:F7:37:72

Bellman is configured to bring up a management interface on this ethernet interface too. (Optionally there is a separate management interface. This server has 5 ethernet ports, 4 on the motherboard and 1 on the management card.)

Operating system

  • Debian 9 (Stretch)

Using BTRFS now on the Seagate drive. Sort of just to be consistent with what is on Tern though this is not RAID 0. Just one drive. I partitioned the Seagate this time, partition 1 could be a 50GB OS install, 2 is 50GB swap, and 3 is data (/green)

fstab


Printing

Canon MX330 "All in one" -- CUPS finds and sets it up if you plug it in and power it on.

This is my current /etc/cups/printers.conf

# Written by cupsd
# DO NOT EDIT THIS FILE WHEN CUPSD IS RUNNING
<Printer Brother_HL-2140_series>
UUID urn:uuid:24067d9a-1b41-370d-5ecf-dbb408aaa659
Info Brother HL-2140 series
Location Electronic Chronometry Laboratory
MakeModel Brother HL-2140 Foomatic/hl1250
DeviceURI usb://Brother/HL-2140%20series?serial=J8J894840
State Idle
StateTime 1388723199
Type 8433668
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy stop-printer
</Printer>
<Printer MX330-series>
UUID urn:uuid:54a86dc0-0994-37af-7d65-f084999a7307
Info Canon MX330 series
Location Electronic Chronometry Laboratory
MakeModel Canon PIXMA MX330 - CUPS+Gutenprint v5.2.9
DeviceURI usb://Canon/MX330%20series?serial=22F601&interface=1
State Idle
StateTime 1388637781
Type 4
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy retry-job
</Printer>

Software

Media server: it hosts my music collection. I keep the files in MP3 format, having transferred them from my CD's using grip. Music collection

File server: I keep my home directory here and NFS mount it on the desktop machine Raven. I removed Samba when upgrading to Stretch because I was not using it.

I edit files with emacs23

Spin down the Seagate drive

To reduce wear on the spinning hard drive, I am setting "apm" down to 127 (default is 254) so that the drive can spin down. I use the server mostly for ownCloud and media storage, so the drive can go to sleep at night and during long breaks. This should make it last longer.

smartctl -s apm,127 /dev/sdb

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
APM set to level 127 (intermediate level with standby)

smartctl -A /dev/sdb

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   100   006    Pre-fail  Always       -       92858576
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       32
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       149013
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       619
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       32
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   056   045    Old_age   Always       -       35 (Min/Max 19/35)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       67
194 Temperature_Celsius     0x0022   035   044   000    Old_age   Always       -       35 (0 17 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       619 (178 156 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1169083792
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       156038821
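
A lower APM level may make the drive park its heads and spin down more often, so it can be worth watching the Start_Stop_Count and Load_Cycle_Count attributes over time. A sketch for pulling one raw value out of the table above; smart_raw is a hypothetical helper that reads smartctl -A output on stdin.

```shell
# smart_raw is a hypothetical helper: print the RAW_VALUE column for one
# attribute name, reading `smartctl -A` output on stdin.
smart_raw() {
    # Column 2 is ATTRIBUTE_NAME, column 10 is the start of RAW_VALUE.
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage:
#   sudo smartctl -A /dev/sdb | smart_raw Load_Cycle_Count
```

Logging the value once a day would show whether the new setting causes a steep climb in load cycles.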

Automatic boot

I know it's possible to get this system to boot every day at a specific time because it's set to do that right now. I cannot find the setting! It's not in BIOS anywhere that I can see, and I can't find it in ipmitool either.

Sometimes I shut Bellman down at night, but it needs to boot in the morning before we get up so that the Logitech radio will work. The way to set it is NOT from BIOS; there is no user interface there. It's not from the IPMI web page either.

Maybe it's in here http://www.accuratesolution.net/asd/resume.htm

IPMI

Tips from Oracle: http://docs.oracle.com/cd/E19464-01/820-6850-11/IPMItool.html

Using ipmitool you can connect remotely so if the system is off you can turn it on. This means I could just script the turn on from another server...

ipmitool -H 192.168.1.3 -U ADMIN -P password chassis status
System Power         : off
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
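
Scripting the turn-on from another machine could look roughly like this. It is a sketch, not something tested against this BMC: the address and credentials are the placeholder values from the command above, and IPMITOOL and power_on_if_off are hypothetical names. The chassis power status and chassis power on subcommands are standard ipmitool commands.

```shell
# IPMITOOL is overridable so the logic can be tried without a BMC.
# Host, user and password are the placeholder values used above.
IPMITOOL=${IPMITOOL:-ipmitool}
BMC="-H 192.168.1.3 -U ADMIN -P password"

power_on_if_off() {
    # "chassis power status" prints "Chassis Power is on" or "... is off".
    # $BMC is left unquoted on purpose so it splits into separate options.
    state=$($IPMITOOL $BMC chassis power status)
    case $state in
        *"is off"*) $IPMITOOL $BMC chassis power on ;;
        *)          echo "already on: $state" ;;
    esac
}
```

Saved as a script on a machine that stays up, a cron entry such as "30 5 * * *" pointing at it would give a daily morning boot.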

Read environmental sensors

ipmitool -I lanplus -H 192.168.1.3 -P password -U ADMIN sdr elist full

CPU Temp         | 01h | lnr |  3.1 | 36 degrees C
System Temp      | 0Bh | ok  |  7.1 | 33 degrees C
Peripheral Temp  | 0Ch | ok  |  7.2 | 34 degrees C
DIMMA1 Temp      | B0h | ok  | 32.64 | 29 degrees C
DIMMA2 Temp      | B1h | ns  | 32.65 | No Reading
DIMMB1 Temp      | B4h | ns  | 32.68 | No Reading
DIMMB2 Temp      | B5h | ns  | 32.69 | No Reading
FAN1             | 41h | ok  | 29.1 | 3200 RPM
FAN2             | 42h | ns  | 29.2 | No Reading
FAN3             | 43h | ns  | 29.3 | No Reading
VCCP             | 20h | ok  |  3.2 | 0.82 Volts
VDIMM            | 24h | ok  | 32.1 | 1.33 Volts
12V              | 30h | ok  |  7.17 | 12.32 Volts
5VCC             | 31h | ok  |  7.33 | 4.95 Volts
3.3VCC           | 32h | ok  |  7.32 | 3.30 Volts
VBAT             | 33h | ok  |  7.18 | 2.97 Volts
5V Dual          | 37h | ok  |  7.15 | 4.95 Volts
3.3V AUX         | 38h | ok  |  7.12 | 3.28 Volts
Chassis Intru    | AAh | ok  | 23.1 | 
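
The sensor table is pipe-delimited, so it is straightforward to filter. A sketch that lists temperature sensors above a limit; hot_sensors is a hypothetical helper that reads the sdr elist output on stdin and takes the limit in degrees C as its argument.

```shell
# hot_sensors is a hypothetical helper: print temperature sensors whose
# reading exceeds $1 degrees C, reading "sdr elist" output on stdin.
hot_sensors() {
    awk -F'|' -v limit="$1" '
        $5 ~ /degrees C/ {
            split($5, parts, " ")      # parts[1] is the numeric reading
            if (parts[1] + 0 > limit) {
                gsub(/ +$/, "", $1)    # trim padding after the sensor name
                print $1 ": " parts[1]
            }
        }'
}

# Usage:
#   ipmitool -I lanplus -H 192.168.1.3 -P password -U ADMIN sdr elist full | hot_sensors 34
```

Sensors with "No Reading" and non-temperature rows such as fans and voltages are skipped.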

System event log (SEL)

ipmitool -I lanplus -H 192.168.1.3 -P password -U ADMIN sel list last 10

  43 |  Pre-Init  |0004692099| Unknown #0xff |  | Asserted
  44 |  Pre-Init  |0004692100| Unknown #0xff |  | Asserted
  45 |  Pre-Init  |0004692106| Unknown #0xff |  | Asserted
  46 |  Pre-Init  |0004692108| Unknown #0xff |  | Asserted
  47 |  Pre-Init  |0004692109| Unknown #0xff |  | Asserted
  48 |  Pre-Init  |0004692110| Unknown #0xff |  | Asserted
  49 |  Pre-Init  |0004692116| Unknown #0xff |  | Asserted
  4a |  Pre-Init  |0004692118| Unknown #0xff |  | Asserted
  4b |  Pre-Init  |0004692119| Unknown #0xff |  | Asserted
  4c |  Pre-Init  |0004692120| Unknown #0xff |  | Asserted

Backups

I am about to try Using Bacula for backups