Kubernetes-in-Kubernetes with kubeadm and Sysbox

Stéphane Este-Gracias
Nov 22, 2021

Using Kubernetes Pods as Kubernetes nodes (aka KinK)

This article presents a walkthrough to create Kubernetes clusters using Kubernetes Pods acting as the inner Kubernetes nodes with Sysbox as the container runtime. The outer Kubernetes nodes are installed with Ubuntu 20.04.

To learn more about Sysbox, read the What’s Sysbox by Nestybox? article.

Kubernetes-in-Kubernetes (KinK)

The resulting inner Kubernetes cluster is a set of Kubernetes Pods whose containers run under the Sysbox container runtime: one master node and two worker nodes.

Inside the Sysbox-powered containers, the inner containers are managed by containerd.

Container Image

The following image uses the sysbox-base image as a base (described in the What’s Sysbox by Nestybox? article).

Then, because the inner containers are managed by containerd, the Docker service and socket can be disabled.

Next, kubeadm and kubelet packages are installed.

Finally, the required images are pre-pulled into the built image (with the k8s-pull.sh script) to speed up Kubernetes cluster creation.

Dockerfile

FROM sysbox-base
ARG k8s_version=v1.20.12
ARG flannel_version=v0.15.1
# Disable Docker
RUN systemctl disable docker.service docker.socket \
&& rm -f /etc/containerd/config.toml
# Install kubeadm, kubelet
RUN curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - \
&& apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main" \
&& apt-get update && apt-get install --no-install-recommends -y \
kubeadm="${k8s_version#v}"-00 kubelet="${k8s_version#v}"-00 \
&& rm -rf /var/lib/apt/lists/*
# Pre-pull Kubernetes images
COPY k8s-pull.sh /usr/bin/
RUN chmod +x /usr/bin/k8s-pull.sh && k8s-pull.sh $k8s_version $flannel_version && rm /usr/bin/k8s-pull.sh

k8s-pull.sh

#!/usr/bin/sh
k8s_version=$1
flannel_version=$2
# Start containerd
containerd &
containerd_pid=$!
sleep 2
# Pull k8s images
kubeadm config images list --kubernetes-version=$k8s_version | xargs -n 1 ctr images pull
# Pull Flannel CNI images
ctr images pull docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.0.0
ctr images pull quay.io/coreos/flannel:$flannel_version
# List pulled images
ctr images list
# Stop containerd
kill $containerd_pid
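
The fixed sleep 2 in k8s-pull.sh assumes containerd is ready within two seconds. As a minimal sketch of a more defensive variant, the hypothetical helper below (wait_until is not part of the original script) polls a command once per second until it succeeds or a timeout expires. It assumes ctr version fails while the containerd socket is not yet accepting connections.

```shell
# Hypothetical helper (not in the original script): run a command once per
# second until it succeeds, or give up after the given number of attempts.
wait_until() {
  local remaining="$1"
  shift
  while ! "$@" > /dev/null 2>&1; do
    remaining=$((remaining - 1))
    if [ "$remaining" -le 0 ]; then
      echo "timed out waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Possible use inside k8s-pull.sh, in place of "sleep 2":
#   containerd &
#   containerd_pid=$!
#   wait_until 30 ctr version
```

The same helper could stand in for the sleep 10 in the kink-create script shown later, for example as wait_until 60 kubectl exec sysbox-master -- ctr version.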

Build the sysbox-k8s-node image with the regular docker build command, then push it to your container registry (identified by <registry> in the following commands).

$ docker build -t sysbox-k8s-node .
$ docker tag sysbox-k8s-node <registry>/sysbox-k8s-node:latest
$ docker push <registry>/sysbox-k8s-node:latest

kink-create script

The following bash script (kink-create) creates three Kubernetes Pods that use the sysbox-k8s-node image and the sysbox-runc runtime class.

To avoid network collisions between the outer and inner Kubernetes clusters, define a different service-cidr, service-dns-domain and pod-network-cidr for the inner cluster.
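
One way to check the CIDR choices before creating the cluster is a small overlap test. The sketch below is not part of the original scripts: ip_to_int, cidr_range and cidrs_overlap are hypothetical helper names, and the outer CIDRs in the example loop are assumptions (10.96.0.0/12 is the kubeadm default service CIDR, 10.244.0.0/16 is a common Flannel pod CIDR). Replace them with the values of your own outer cluster.

```shell
# Hypothetical helpers, not part of the original kink-create script.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) + ($b << 16) + ($c << 8) + $d ))
}

# Print the first and last address of a CIDR block as two integers.
cidr_range() {
  local ip="${1%/*}" prefix="${1#*/}" size base
  size=$(( 1 << (32 - $prefix) ))
  # Align the base address down to a multiple of the block size.
  base=$(( $(ip_to_int "$ip") / $size * $size ))
  echo "$base $(( $base + $size - 1 ))"
}

# Return 0 (true) when the two CIDR blocks share at least one address.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<EOF
$(cidr_range "$1")
EOF
  read -r s2 e2 <<EOF
$(cidr_range "$2")
EOF
  [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}

# Compare the inner CIDRs from kink-create with the ASSUMED outer CIDRs.
for outer in 10.96.0.0/12 10.244.0.0/16; do
  for inner in 10.112.0.0/12 10.245.0.0/16; do
    if cidrs_overlap "$inner" "$outer"; then
      echo "collision: $inner overlaps $outer"
    fi
  done
done
```

With these example values the loop prints nothing, because none of the ranges intersect.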

The sysbox.kubeconfig file is copied from the master node to the local machine for interacting with the inner Kubernetes API server.

#!/usr/bin/env bash
set -x
set -e
k8s_version=v1.20.12
k8s_node_image=<registry>/sysbox-k8s-node:latest
k8s_dns_domain=inner-cluster.local
k8s_service_cidr=10.112.0.0/12
k8s_pod_cidr=10.245.0.0/16
flannel_version=v0.15.1
# Create Kubernetes nodes
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: sysbox-master
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: node
    image: $k8s_node_image
    ports:
    - containerPort: 22
    - containerPort: 6443
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: sysbox-worker-1
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: node
    image: $k8s_node_image
    ports:
    - containerPort: 22
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: sysbox-worker-2
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: node
    image: $k8s_node_image
    ports:
    - containerPort: 22
  restartPolicy: Always
EOF
# Initialise k8s-master
kubectl wait --timeout=300s --for=condition=ready pod sysbox-master
sleep 10 # Wait for containerd ready
kubectl exec sysbox-master -- \
kubeadm init --kubernetes-version=$k8s_version --cri-socket=/var/run/containerd/containerd.sock \
--service-cidr=$k8s_service_cidr \
--service-dns-domain=$k8s_dns_domain \
--pod-network-cidr=$k8s_pod_cidr
# Download kubeconfig
kubectl cp sysbox-master:/etc/kubernetes/admin.conf sysbox.kubeconfig
# Configure Flannel CNI
kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf apply -f https://raw.githubusercontent.com/coreos/flannel/$flannel_version/Documentation/kube-flannel.yml
# Verify master node is good
kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf get all -A
# Join the worker nodes
join_cmd=$(kubectl exec sysbox-master -- kubeadm token create --print-join-command 2> /dev/null)
kubectl wait --timeout=300s --for=condition=ready pod sysbox-worker-1
kubectl exec sysbox-worker-1 -- $join_cmd
kubectl wait --timeout=300s --for=condition=ready pod sysbox-worker-2
kubectl exec sysbox-worker-2 -- $join_cmd
# Verify all is good
kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf get all -A
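
The three Pod manifests in kink-create differ only in the Pod name and in the extra API server port on the master. As a sketch of how that repetition could be factored out, the hypothetical function below (node_pod_manifest is not part of the original script) prints the manifest for one node; the fields are the same as in the script.

```shell
# Hypothetical helper: print the Pod manifest for one inner node.
# $1 = Pod name, $2 = node image, $3 = "master" to also expose port 6443.
node_pod_manifest() {
  local name="$1" image="$2" role="${3:-worker}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $name
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: node
    image: $image
    ports:
    - containerPort: 22
EOF
  # Only the master node needs the Kubernetes API server port.
  if [ "$role" = master ]; then
    echo "    - containerPort: 6443"
  fi
  echo "  restartPolicy: Always"
}

# Possible use in kink-create, in place of the inline manifests:
#   {
#     node_pod_manifest sysbox-master "$k8s_node_image" master
#     echo ---
#     node_pod_manifest sysbox-worker-1 "$k8s_node_image"
#     echo ---
#     node_pod_manifest sysbox-worker-2 "$k8s_node_image"
#   } | kubectl apply -f -
```

Generating the manifests this way also makes it easier to change the number of worker nodes, at the cost of a slightly less readable script.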

kink-delete script

The following bash script (named kink-delete) deletes the Kubernetes nodes.

#!/usr/bin/env bash
set -x
# Delete Kubernetes nodes
kubectl delete pods sysbox-master sysbox-worker-1 sysbox-worker-2

Create the Kubernetes cluster

  • Execute kink-create script
  • Verify inner Kubernetes nodes (i.e. outer Kubernetes Pods) are running
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
sysbox-master 1/1 Running 0 5m49s
sysbox-worker-1 1/1 Running 0 5m49s
sysbox-worker-2 1/1 Running 0 5m49s
  • Verify inner Kubernetes nodes are running with kubectl
$ kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes -o wide

NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
sysbox-master Ready control-plane,master 6m1s v1.20.12 10.244.3.56 <none> Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.12
sysbox-worker-1 Ready <none> 5m32s v1.20.12 10.244.3.57 <none> Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.12
sysbox-worker-2 Ready <none> 5m22s v1.20.12 10.244.3.58 <none> Ubuntu 20.04.3 LTS 5.4.0-89-generic containerd://1.4.12
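
For scripted checks, the node listing above can be filtered rather than read by eye. The sketch below assumes the default kubectl get nodes table layout, with the node name in the first column and the status in the second; not_ready_nodes is a hypothetical helper name.

```shell
# Hypothetical helper: read "kubectl get nodes" output on stdin and print the
# name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Possible use:
#   kubectl exec sysbox-master -- \
#     kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes | not_ready_nodes
```

An empty result means every listed node reports Ready. A cordoned node (status Ready,SchedulingDisabled) would also be printed by this filter.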

Use the Kubernetes cluster

For example, run the NGINX image and verify it is running.

$ kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf run nginx --image=nginx

pod/nginx created
$ kubectl exec sysbox-master -- \
kubectl --kubeconfig=/etc/kubernetes/admin.conf get pods

NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 10s
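
The commands above run kubectl inside the master Pod. As a sketch of an alternative, the sysbox.kubeconfig file downloaded by kink-create could be used from the local machine. This rests on several assumptions: the API server address stored in that file (the master Pod IP) is not routable from the local machine, so kubectl port-forward pod/sysbox-master 6443:6443 is running in another terminal; the API server certificate contains the name kubernetes, which kubeadm adds by default; and inner_kubectl is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: run kubectl against the inner cluster through a local
# port-forward. ASSUMES this is already running in another terminal:
#   kubectl port-forward pod/sysbox-master 6443:6443
inner_kubectl() {
  # --server overrides the Pod IP stored in the kubeconfig;
  # --tls-server-name makes certificate validation use a name that is
  # expected to be in the kubeadm-generated API server certificate.
  kubectl --kubeconfig=sysbox.kubeconfig \
    --server=https://127.0.0.1:6443 \
    --tls-server-name=kubernetes \
    "$@"
}

# Possible use:
#   inner_kubectl get nodes -o wide
#   inner_kubectl get pods
```

If the Pod network is directly routable from the local machine, plain kubectl --kubeconfig=sysbox.kubeconfig should be enough and the wrapper is unnecessary.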

Delete the Kubernetes cluster

  • Execute kink-delete script
