Docker Swarm
Latest revision as of 21:51, 29 November 2023

Docker Swarm is an orchestrator for Docker containers and so is Kubernetes. Today I spent an hour examining Kubernetes, and it just adds more complexity that I don't need.

I ran it for about a year at home then went back to Docker Compose. Swarm is extra work I don't need.

Initialization

Here is the command to turn Bellman into a Docker Swarm manager and a node. The command made me pick an IP address, because Bellman has more than one network interface. This is the address of Bellman's primary Internet-facing interface.

bellman> docker swarm init --advertise-addr 192.168.123.2
Swarm initialized: current node (isk0jocx0rb37yonoafstyvoj) is now a manager.

To add a worker to this swarm, run the following command:

   docker swarm join --token SWMTKN-1-5b81dywl9xkis6769fxnsvjahfy361w2kxkz69nc35bz3nxt6s-43jxeopl6inw8xur1vpcl23w7 192.168.123.2:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

I can add more nodes on other machines using that token. I won't be doing this today. It would look like this.

tern> docker swarm join --token SWMTKN-1-5b81dywl9xkis6769fxnsvjahfy361w2kxkz69nc35bz3nxt6s-43jxeopl6inw8xur1vpcl23w7 192.168.123.2:2377
This node joined a swarm as a worker.
bellman> docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE  VERSION
isk0jocx0rb37yonoafstyvoj *   bellman             Ready               Active              Leader              19.03.5
vjbx2h8n8280ecib2btzkwcxw     tern                Ready               Active                                  18.09.1

Create a network

bellman> docker network create -d overlay --attachable testing
shaboxhgakqer14j1ve7zyysj

The "attachable" option is for containers not running in the swarm. I will need that soon, when I run bash in a Debian container for tests.

Spinning up my first Swarm-managed container

To try things out I will spin up 4 copies of a simple web server. When I add Tern it will spread them over the two nodes. I want it to use that "testing" network.

docker service create --name web --replicas 4 -p 80:80 --network testing --detach nginx:latest

Now I have 4 copies of nginx running. They were published on port 80, but that's inside the funny swarm network, so how do I see them? Well, they are also exposed directly on the host because I used the -p "publish" option, so I can do "curl http://localhost". I can get the id (or just use the name "web") and then kill them off:

docker service ls
docker service rm pmbrvm6wow7q
curl http://localhost

When I do "curl" with the nginx replicas shut down, I see the page served by Varnish (still running in Compose), it's showing me the Home Assistant instance. So I guess the swarm takes precedence over whatever is running in Compose. This bit me when I accidentally masked Psono by putting the test for nginx on port 81. That's where I run Psono.
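As I understand it, the reason Swarm wins is its ingress routing mesh: publishing a port with -p makes every node in the swarm listen on that port, ahead of whatever a plain Compose container had bound. A quick way to see what a service actually published (a sketch; "web" is the service name from above):

```shell
# Show the ports the swarm ingress is holding for the "web" service
docker service inspect --format '{{json .Endpoint.Ports}}' web
```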

I also tried skipping the "--network testing" parameter and everything still worked. Creating a separate network allows me to isolate containers, just like with Docker and Docker Compose. For example, there is no reason for my Home Assistant and Pihole containers to know about each other.

docker run -it --rm --network testing debian:bullseye bash
# apt update
# apt install -y bind9-dnsutils
# nslookup web
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   web
Address: 10.0.1.27
# apt install -y curl
# curl http://10.0.1.27
# curl http://web/

The standard nginx page is returned from curl both times, so I know it's hitting a replica and running under the name "web", which is what I assigned. Inside the container I can see my local LAN too, for example from the Debian instance I can "curl http://bellman.wildsong.biz:8123/" and get the Home Assistant page. So far, all this is easy.
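Swarm DNS has another trick worth knowing here: by default the service name "web" resolves to a single virtual IP that load-balances across the replicas, while "tasks.web" resolves to every individual replica. From inside the same Debian test container (a sketch, assuming the service is still named "web"):

```shell
# One virtual IP; swarm load-balances behind it
nslookup web
# The individual task IPs, one per replica
nslookup tasks.web
```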

Healthcheck

With nginx I can create my own Docker image and bake the healthcheck right into the image. I know "curl" is not the best way to do this (see https://blog.sixeyed.com/docker-healthchecks-why-not-to-use-curl-or-iwr/), but for now it's what I am using!

In my Dockerfile, I have this

FROM nginx:latest
HEALTHCHECK CMD curl --fail http://localhost || exit 1

$ docker build -t wildsong/nginx .
$ docker service rm web
$ docker service create --name web --replicas 1 -p 80:80 --network testing --detach wildsong/nginx
$ docker ps | grep web
01a3f36f7580   wildsong/nginx:latest      "/docker-entrypoint.…"   About a minute ago   Up About a minute (healthy)   80/tcp   web.1.ssssfmwmwp8je7g1dnsabevew
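The HEALTHCHECK defaults (check every 30 seconds, give up after 3 failures) can also be tuned right in the Dockerfile. A variation on the image above, still leaning on curl:

```dockerfile
FROM nginx:latest
# Check every 15s, time a check out after 5s, mark unhealthy after 3 misses
HEALTHCHECK --interval=15s --timeout=5s --retries=3 \
  CMD curl --fail http://localhost/ || exit 1
```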

When I create the service, I will get a warning because I have not pushed that image (wildsong/nginx) to a registry, but it still works because I am running only one node for now. When I do "docker ps" I can see that the container is marked as "healthy".

The service is bound to the IP address I gave to the "docker swarm init" command, so I can hit it with the localhost address or the one I specified: pointing a browser at http://192.168.123.2/ works.

I suspect this means Varnish will redirect traffic to it? If that's true, https://bellman.wildsong.biz/ should work. It does not; for some reason it's going to Home Assistant.

Okay. Some rule in Varnish was kicking in, not sure what, but I added Bellman support in there and now it's working as expected.

Replicas 0, Problems 1

Earlier in my testing I kept creating a stack and ending up with lots of services showing "Replicas 0/1". This means nothing is running. When I searched for the containers with "docker ps", there was nothing. I kept going and got past this. When I eventually figure out what I did wrong I will write it up here. But that is what 0/1 means: nothing running for that service.
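When this happens again, these are the commands I'd reach for first (a sketch; "web" stands in for whichever service is stuck at 0/1):

```shell
# Show every task for the service, including failed ones,
# with the full (untruncated) error message
docker service ps --no-trunc web

# Container logs, if any task got far enough to log something
docker service logs web
```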

Service migration plan

I moved most of the services I normally use on Bellman into the Swarm to try it out, for about a year.

  1. Varnish (and hitch)
  2. Pihole
  3. Psono (including PostgreSQL) (retired 9/9/2023 in favor of Bit Warden)
  4. Unifi (retired 9/9/2023)
  5. Logitech Media Server (Squeezebox)
  6. Home Assistant (including Mosquitto). This cannot move to Swarm because it currently uses a USB device for Zigbee.

Psono has to be accessible from the Internet but none of the others do; some things are in Varnish, but currently only for testing.

I see a copy of mysql; I wonder who that belongs to.

Before I can proceed, I need persistent data, and I need to deal with these issues.

Persistence: How do volumes work in a swarm?

Normally my life revolves around file storage. I never noticed there are also block storage and object storage options. For example, you can use a block device and then use the btrfs driver in Docker. There's a thing called the "devicemapper". All so exotic, beyond my attention span presently. Go read here: https://docs.docker.com/storage/storagedriver/select-storage-driver/

On a Docker Compose setup, the Docker Engine manages the volumes and I generally ignore how it does that. These are called "local" volumes.

With Swarm I would have to consider what happens if I have replica servers spread across several physical servers. But I don't.

Refer to Nigel Poulton's "Docker Deep Dive", chapter 13. He suggests putting the volumes onto a shared NFS server, and briefly notes that you can corrupt files quickly this way. Yeah, I can see that. Ha. Fortunately I normally mount most volumes READ ONLY when I can.
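For the record, the stock "local" volume driver can point at NFS without any extra plugin. A sketch of the compose-file form, where the server address and export path are placeholders, not a real server of mine:

```yaml
# An NFS-backed named volume in a stack/compose file.
# 192.168.123.10 and /export/media are made-up placeholders.
volumes:
  media:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.123.10,ro"   # mounted read-only, per my usual habit
      device: ":/export/media"
```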

Bring in Compose

By that I mean I want to deploy a stack of containers using a docker-compose.yml file as the configuration. So far I have not needed it; if I start just one container per project then "docker service" commands are fine.
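The stack version of the nginx test above might look something like this (a sketch; the deploy section is the part Swarm reads and plain docker-compose ignores):

```yaml
# docker-compose.yml, deployed with:
#   docker stack deploy -c docker-compose.yml web
version: "3.8"
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    networks:
      - testing
    deploy:
      replicas: 4
networks:
  testing:
    external: true   # the attachable overlay network created earlier
```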