Kubernetes beyond 5k nodes etcd-sharding
Background
I was once asked how to run large-scale Kubernetes clusters given etcd's size limitation, for example how to handle the 8GB limit. I was not satisfied with my answers, such as increasing memory, compaction, and defragmentation, because I knew several companies run large-scale clusters and 8GB might be a hard limit. However, I had no clue how they made that happen.
Since then, I have kept tabs on etcd by reading the etcd community meeting notes (Public). Last week, I noticed that etcd sharding is mentioned in The Implicit Kubernetes-ETCD Contract. I dug into the topic and spent a day figuring out how to implement a basic version of it.
Basic Idea
The idea is to set up extra etcd clusters for certain resource types and add an --etcd-servers-overrides flag to the apiserver manifest file.
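As a minimal sketch of the flag format: kube-apiserver documents each override as group/resource#servers, with multiple servers separated by semicolons and multiple overrides separated by commas. The helper name build_override is hypothetical, and the endpoint is the worker IP from the test environment in this post. Events belong to the core group, so the group part is empty and the resource is written as /events.

```shell
#!/usr/bin/env bash
# build_override RESOURCE URL [URL...]
# Hypothetical helper: prints one --etcd-servers-overrides flag for a single
# resource, joining the etcd endpoints with ';' as the flag format expects.
build_override() {
  local resource=$1
  shift
  local IFS=';'   # "$*" joins the remaining arguments with the first IFS char
  printf -- '--etcd-servers-overrides=%s#%s\n' "$resource" "$*"
}

# Single dedicated etcd for events (the setup used in this post).
build_override /events https://172.18.0.2:2379
```

With more than one endpoint, for example a three-member events cluster, the extra URLs would simply be passed as additional arguments.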
However, there is more to it than that: data replication, consistency, implications, and cleanup might need to be considered when migrating an existing cluster to this kind of setup. I will write a more complex version later, after I figure out how to do that.
Sharded Environment
The test environment consists of a Kubernetes cluster with 1 control plane node and 1 worker node. The cluster is set up using KIND. There will be two etcd instances in the cluster.
- etcd-kind-control-plane
- etcd-kind-worker (with correct keys, certs prepared by the script)
The implementation is in configs/etcdsharding-hook.sh and configs/etcd/cfssl/gen-key-cert.sh.
# boot up a sharded etcd for events resources
git clone https://github.com/jackliusr/k8s
cd k8s
./up.sh etcdsharding
kubectl get nodes -o custom-columns='NAME:metadata.name,IP:status.addresses[0].address'
NAME IP
kind-control-plane 172.18.0.3
kind-worker 172.18.0.2
Implementation details
- prepare certs for the new etcd instance (create them and copy them to the correct locations on the worker node kind-worker)
- copy the etcd.yaml manifest from kind-control-plane to kind-worker and replace the IP address; a static pod will be started on kind-worker
- add "--etcd-servers-overrides" to kube-apiserver.yaml on kind-control-plane
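The manifest-copy step above can be sketched roughly as follows. This is not the code from configs/etcdsharding-hook.sh, only an illustration under assumptions: the helper name retarget_manifest is hypothetical, and the etcd manifest is assumed to live next to kube-apiserver.yaml in /etc/kubernetes/manifests on the KIND nodes.

```shell
#!/usr/bin/env bash
# retarget_manifest OLD_IP NEW_IP
# Hypothetical helper: reads a manifest on stdin and writes it to stdout with
# every occurrence of OLD_IP replaced by NEW_IP.
retarget_manifest() {
  local old_ip=$1 new_ip=$2
  # Escape the dots so sed matches them literally instead of as "any char".
  local pattern=${old_ip//./\\.}
  sed "s/${pattern}/${new_ip}/g"
}

# Assumed usage against the KIND node containers (not executed here):
#   docker exec kind-control-plane cat /etc/kubernetes/manifests/etcd.yaml \
#     | retarget_manifest 172.18.0.3 172.18.0.2 \
#     | docker exec -i kind-worker sh -c 'cat > /etc/kubernetes/manifests/etcd.yaml'
```

Once the rewritten file lands in the worker's manifests directory, the kubelet there should pick it up as a static pod, provided the certs from the first step are already in place.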
Verifying
docker exec -it kind-control-plane sed -i '/127.0.0.1:2379/a \ \ \ \ - --etcd-servers-overrides=\/events#https:\/\/172.18.0.2:2379' /etc/kubernetes/manifests/kube-apiserver.yaml
# kube-apiserver will take a while to restart
kubectl exec -it \
-n kube-system etcd-kind-worker \
-- sh -c 'ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt \
ETCDCTL_CERT=/etc/kubernetes/pki/etcd/peer.crt \
ETCDCTL_KEY=/etc/kubernetes/pki/etcd/peer.key \
ETCDCTL_API=3 \
etcdctl \
get \
--keys-only \
--prefix=true \
"/registry/events/" ' | wc -l
# note down the number
kubectl run redis --image=redis
kubectl exec -it \
-n kube-system etcd-kind-worker \
-- sh -c 'ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt \
ETCDCTL_CERT=/etc/kubernetes/pki/etcd/peer.crt \
ETCDCTL_KEY=/etc/kubernetes/pki/etcd/peer.key \
ETCDCTL_API=3 \
etcdctl \
get \
--keys-only \
--prefix=true \
"/registry/events/" ' | wc -l
# note down the number again and compare it with the first one; it should be higher
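The before/after comparison can also be scripted instead of noted down by hand. A minimal sketch, assuming the same cert paths as above; count_keys and events_grew are hypothetical helper names. Note that etcdctl prints an empty line after each key when --keys-only is used, so wc -l reports roughly twice the number of keys; counting only the lines that start with / gives the actual key count.

```shell
#!/usr/bin/env bash
# count_keys: reads `etcdctl get --keys-only` output on stdin and prints the
# number of keys, ignoring blank separator lines and stray carriage returns.
count_keys() {
  tr -d '\r' | grep -c '^/' || true   # grep exits 1 on zero matches; still prints 0
}

# events_grew BEFORE AFTER: succeeds when AFTER is strictly greater than BEFORE.
events_grew() {
  [ "$2" -gt "$1" ]
}

# Assumed usage against the sharded etcd (not executed here):
#   list_events() {
#     kubectl exec -n kube-system etcd-kind-worker -- sh -c \
#       'ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt \
#          --cert=/etc/kubernetes/pki/etcd/peer.crt \
#          --key=/etc/kubernetes/pki/etcd/peer.key \
#          get --keys-only --prefix "/registry/events/"'
#   }
#   before=$(list_events | count_keys)
#   kubectl run redis --image=redis
#   sleep 10
#   after=$(list_events | count_keys)
#   events_grew "$before" "$after" && echo "events are stored in etcd-kind-worker"
```

The sleep is only a rough allowance for the pod's events to be written; the right wait time is a guess.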