Etcd and fleetd clustering on Google Compute Engine

I was trying to get Kubernetes running on an etcd/fleetd CoreOS cluster under Vagrant on my Mac, but Vagrant seems to love handing out the same IP address to NAT adapters, causing Flannel issues.

Given that nobody will actually run production code on such a setup, it’s over to a proper cloud provider, and Google are offering credit to get started.

The tl;dr summary:-

local$ curl -L -O https://bitbucket.org/AndrewGorton/googlecomputetesting/get/v1.0.0.tar.gz
local$ tar xvf v1.0.0.tar.gz 
local$ cd AndrewGorton-googlecomputetesting-ac3223c61502
local$ ./prestart.sh
local$ ./start_etcd.sh <your_project_id> etcd1 etcd2 etcd3
gcloud --project <your_project_id> compute instances create etcd1 etcd2 etcd3 --image https://www.googleapis.com/compute/v1/projects/coreos-cloud/global/images/coreos-alpha-457-0-0-v20141002 --zone europe-west1-a --machine-type f1-micro --metadata-from-file user-data=etcd.yml --metadata role=etcd
Created [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd1].
Created [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd2].
Created [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd3].
NAME  ZONE           MACHINE_TYPE INTERNAL_IP   EXTERNAL_IP    STATUS
etcd1 europe-west1-a f1-micro     xx.xxx.xx.xxx xxx.xxx.xx.xx  RUNNING
etcd2 europe-west1-a f1-micro     xx.xxx.xx.x   xxx.xxx.xx.xx  RUNNING
etcd3 europe-west1-a f1-micro     xx.xxx.xx.xx  xxx.xxx.xxx.xx RUNNING
 
local$ gcloud compute --project <your_project_id> ssh --zone europe-west1-a etcd1
user@etcd1$ fleetctl list-machines
MACHINE		IP		METADATA
6908f980...	xxx.xxx.xx.xx	private_ip=xx.xxx.xx.x,public_ip=xxx.xxx.xx.xx,role=etcd
82262946...	xxx.xxx.xxx.xx	private_ip=xx.xxx.xx.xx,public_ip=xxx.xxx.xxx.xx,role=etcd
ec0f638c...	xxx.xxx.xx.xx	private_ip=xx.xxx.xx.xxx,public_ip=xxx.xxx.xx.xx,role=etcd
 
user@etcd1$ exit
local$ ./start_etcd.sh <your_project_id> etcd4
gcloud --project <your_project_id> compute instances create etcd4 --image https://www.googleapis.com/compute/v1/projects/coreos-cloud/global/images/coreos-alpha-457-0-0-v20141002 --zone europe-west1-a --machine-type f1-micro --metadata-from-file user-data=etcd.yml --metadata role=etcd
Created [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd4].
NAME  ZONE           MACHINE_TYPE INTERNAL_IP    EXTERNAL_IP  STATUS
etcd4 europe-west1-a f1-micro     xx.xxx.xxx.xxx xxx.xxx.x.xx RUNNING
 
local$ gcloud compute --project <your_project_id> ssh --zone europe-west1-a etcd1
user@etcd1$ fleetctl list-machines
MACHINE		IP		METADATA
6908f980...	xxx.xxx.xx.xx	private_ip=xx.xxx.xx.x,public_ip=xxx.xxx.xx.xx,role=etcd
82262946...	xxx.xxx.xxx.xx	private_ip=xx.xxx.xx.xx,public_ip=xxx.xxx.xxx.xx,role=etcd
ec0f638c...	xxx.xxx.xx.xx	private_ip=xx.xxx.xx.xxx,public_ip=xxx.xxx.xx.xx,role=etcd
f09513bb...	xxx.xxx.x.xx	private_ip=xx.xxx.xxx.xxx,public_ip=xxx.xxx.x.xx,role=etcd
 
user@etcd1$ exit
local$ ./quickstop.sh <your_project_id>
Deleted [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd4].
Deleted [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd3].
Deleted [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd2].
Deleted [https://www.googleapis.com/compute/v1/projects/<your_project_id>/zones/europe-west1-a/instances/etcd1].

Here’s the in-depth discussion:-

Here’s the basic configuration YAML used by cloud-init:

#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you tear it down
    #discovery: https://discovery.etcd.io/<token>
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
  fleet:
    public-ip: $public_ipv4
    metadata: public_ip=$public_ipv4,private_ip=$private_ipv4,role=etcd
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start

Note that on Google Compute Engine you’ll want etcd to use the private IP address (the $private_ipv4 variable). It took me a while to realise I was using the public IP, which is why the nodes couldn’t connect to their peers!
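One way to catch that mistake before launching anything might be a small pre-flight check on the cloud-config file. The sketch below assumes the file layout shown above; the function name check_etcd_addrs is made up for illustration and is not part of my scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts): succeed only if
# both etcd address settings in a cloud-config file use $private_ipv4.
check_etcd_addrs() {
    config_file=$1
    for key in addr peer-addr; do
        # Matches e.g. "    addr: $private_ipv4:4001". The single-quoted part
        # keeps \$ intact so grep treats the dollar sign literally.
        pattern="^[[:space:]]*${key}:[[:space:]]*"'\$private_ipv4:'
        if ! grep -Eq "$pattern" "$config_file"; then
            echo "etcd ${key} does not use \$private_ipv4 in ${config_file}" >&2
            return 1
        fi
    done
    return 0
}
```

Something like `check_etcd_addrs etcd.yml || exit 1` at the top of a launch script would then stop early if either setting points at the public address.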

Also, you’ll need a new token each time you create a new cluster, so here’s a script snippet that takes the configuration above (saved as etcd.yml.template) and writes a copy with a fresh token filled in (etcd.yml).

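# Start from the template and fetch a fresh discovery URL for this cluster
cp etcd.yml.template etcd.yml
curl -s https://discovery.etcd.io/new > discovery.token
# BSD/macOS sed syntax; with GNU sed drop the empty '' after -i.
# "," is the sed delimiter, so the slashes in the URL need no escaping.
sed -i '' -e "s,#discovery: https://discovery.etcd.io/<token>,discovery: $(cat discovery.token)," etcd.yml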
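The in-place flag of sed behaves differently on macOS and Linux, so a variant that writes to a new file instead may be easier to carry between machines. This is only a sketch under that assumption: render_config is a hypothetical name, and the placeholder line is assumed to be exactly the one in the template above.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of the template with the placeholder
# discovery line uncommented and pointed at the given discovery URL.
render_config() {
    template=$1
    discovery_url=$2
    output=$3
    # "," as the sed delimiter means the slashes in the URL need no escaping.
    sed -e "s,#discovery: https://discovery.etcd.io/<token>,discovery: ${discovery_url}," \
        "$template" > "$output"
    # Fail if the placeholder was missing and nothing was substituted.
    grep -qF "discovery: ${discovery_url}" "$output"
}

# Example (needs network access, so it is left commented out):
# render_config etcd.yml.template "$(curl -s https://discovery.etcd.io/new)" etcd.yml
```

The final grep makes the function return non-zero when the template did not contain the placeholder, which would otherwise go unnoticed until the nodes failed to find each other.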

And finally, to create an etcd cluster in Google Compute Engine, run:

gcloud --project <your_project_here> \
  compute instances create etcd1 etcd2 \
  --image "https://www.googleapis.com/compute/v1/projects/coreos-cloud/global/images/coreos-alpha-457-0-0-v20141002" \
  --zone europe-west1-a \
  --machine-type f1-micro \
  --metadata-from-file user-data=etcd.yml \
  --metadata role=etcd
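To avoid retyping that command for different node names, it could be wrapped in a function that takes the project and the names as arguments. The sketch below only prints the command so it can be reviewed first; build_create_command is a hypothetical name, and the image, zone and machine type are simply the values used above.

```shell
#!/bin/sh
# Hypothetical sketch: print the gcloud command for a project and node names.
build_create_command() {
    project=$1
    if [ "$#" -lt 2 ]; then
        echo "usage: build_create_command <project_id> <node1> [<nodeN> ...]" >&2
        return 1
    fi
    shift
    image="https://www.googleapis.com/compute/v1/projects/coreos-cloud/global/images/coreos-alpha-457-0-0-v20141002"
    # "$*" joins the remaining arguments (the node names) with spaces.
    echo "gcloud --project ${project} compute instances create $* --image ${image} --zone europe-west1-a --machine-type f1-micro --metadata-from-file user-data=etcd.yml --metadata role=etcd"
}
```

Calling `build_create_command <your_project_here> etcd1 etcd2` prints the full command; once it looks right, it can be pasted into the shell or piped to `sh`.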

You can then connect to your freshly created cluster and view its machines with

gcloud compute --project <your_project_here> ssh --zone europe-west1-a etcd1
fleetctl list-machines

(Sometimes it asks for a password. Just try again after a bit and it should log you in without prompting.)
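The "try again after a bit" step can be scripted with a generic retry loop. This is a minimal sketch, assuming the failed login shows up as a non-zero exit status; an interactive password prompt would still have to be cancelled by hand. The name retry and its argument order are my own invention.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to max_attempts times, sleeping
# delay_seconds between failed attempts.
retry() {
    max_attempts=$1
    delay_seconds=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay_seconds"
    done
    return 0
}

# Example (left commented out because it needs a running cluster):
# retry 5 10 gcloud compute --project <your_project_here> ssh --zone europe-west1-a etcd1 --command "fleetctl list-machines"
```

The example uses the `--command` option of `gcloud compute ssh` to run fleetctl non-interactively, so each attempt either prints the machine list or fails outright.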

Provided you haven’t changed the discovery token, you can spin up more instances using the original gcloud creation script, and they’ll add themselves to the etcd cluster.
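Before adding nodes, it may be worth checking that the local files still agree on the token. The sketch below compares etcd.yml with the saved discovery.token from the snippet earlier; it is only a proxy, since it cannot see what the running instances were actually booted with, and token_unchanged is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the config still carries the discovery
# URL saved in the token file, i.e. new nodes should join the same cluster.
token_unchanged() {
    config_file=$1
    token_file=$2
    # An empty or missing token file means there is nothing to compare against.
    [ -s "$token_file" ] || return 1
    discovery_url=$(cat "$token_file")
    grep -qF "discovery: ${discovery_url}" "$config_file"
}
```

A guard such as `token_unchanged etcd.yml discovery.token || echo "token changed; new nodes would form a separate cluster" >&2` could sit in front of the creation command.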

I’ve packaged up the scripts into a Bitbucket repository. You’ll want to run:

local$ ./prestart.sh <your_project_id>
local$ ./start_etcd.sh <your_project_id> <node1> [<nodeN> ...]
local$ ./quickstop.sh <your_project_id>

This is my personal blog - all views are my own.
