Lesson 3.4: Static Pods, Manual Scheduling, Labels, and Selectors
Static Pods
Static Pods are a special type of Pod in Kubernetes that are managed directly by the kubelet on a specific node, rather than by the Kubernetes API server. They are defined in a node's filesystem and are not controlled by the Kubernetes control plane (e.g., the API server, scheduler, or controllers). This makes them useful for running critical system components, such as the Kubernetes control plane itself, before the API server is fully operational.
Key Characteristics of Static Pods
- Managed by the Kubelet:
- Static Pods are created and managed by the kubelet on a specific node.
- The kubelet monitors the Pod's manifest file and ensures the Pod is running as specified.
- Not Managed by the API Server:
- Unlike regular Pods, Static Pods are not created, updated, or deleted through the Kubernetes API server.
- However, the API server can "see" Static Pods and reflects them in the cluster's state, but it cannot modify them.
- Defined in the Node's Filesystem:
- Static Pods are defined using Pod manifest files stored in a specific directory on the node (e.g., /etc/kubernetes/manifests).
- The kubelet periodically checks this directory for changes and creates or updates Pods accordingly.
- Use Cases:
- Running critical system components like kube-apiserver, kube-scheduler, kube-controller-manager, and etcd in a self-hosted Kubernetes cluster.
- Running node-specific services that must start before the Kubernetes control plane is fully operational.
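A minimal static Pod manifest is an ordinary Pod manifest placed in the watched directory. The sketch below is illustrative (the file name, Pod name, and nginx image are not from the lesson); on a kubeadm-style node, saving it under /etc/kubernetes/manifests is all it takes for the kubelet to start the Pod:

```yaml
# /etc/kubernetes/manifests/static-web.yaml  (file name is illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: static-web        # the mirror Pod appears as static-web-<node-name>
  labels:
    role: static-demo
spec:
  containers:
  - name: web
    image: nginx          # any image works; nginx is just an example
    ports:
    - containerPort: 80
```

Note the naming convention visible later in this lesson: mirror Pods are suffixed with the node name, which is why the control-plane Pods show up as etcd-cka-cluster2-control-plane, kube-scheduler-cka-cluster2-control-plane, and so on.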
How Static Pods Work
- Manifest File Location:
- The kubelet is configured to watch a specific directory (e.g., /etc/kubernetes/manifests) for Pod manifest files.
- This directory is set via the staticPodPath field in the kubelet's configuration file, or via the kubelet's --pod-manifest-path command-line flag.
- Pod Creation:
- When a Pod manifest file is placed in the directory, the kubelet reads the file and creates the Pod on the node.
- The kubelet ensures the Pod is always running and restarts it if it fails.
- Reflection in the API Server:
- The kubelet creates a "mirror Pod" object in the Kubernetes API server for each Static Pod.
- This allows the Static Pod to be visible in the cluster, but the API server cannot modify or delete it.
- Updates and Deletions:
- To update a Static Pod, you must modify the manifest file in the directory.
- To delete a Static Pod, you must remove the manifest file from the directory.
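The create/update/delete lifecycle above can be sketched as a shell session run on the node itself (paths follow the lesson's kubeadm-style layout; the file name is illustrative):

```shell
# Create: drop a manifest into the watched directory
cp static-web.yaml /etc/kubernetes/manifests/

# Update: edit the file in place; the kubelet recreates the Pod
vi /etc/kubernetes/manifests/static-web.yaml

# Delete: remove the file; the kubelet stops the Pod
rm /etc/kubernetes/manifests/static-web.yaml
```

Deleting the mirror Pod with kubectl delete does not help: the kubelet recreates the Static Pod as long as the manifest file is present on the node.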
Explanation with an Example
The scenario demonstrates how Static Pods work in Kubernetes, specifically focusing on the kube-scheduler and its role in scheduling workloads. Here's a detailed breakdown of what happened:
- The Kubernetes cluster is running with all core components (e.g., kube-apiserver, kube-controller-manager, kube-scheduler, etcd) as Static Pods managed by the kubelet on the control plane node.
- The kube-scheduler is responsible for assigning Pods to nodes in the cluster.
- The kubelet on the control plane node monitors the /etc/kubernetes/manifests/ directory for Static Pod manifests and ensures the corresponding Pods are running.
```
[root@master ~]# kubectl get pods -n kube-system
NAME                                                 READY   STATUS    RESTARTS   AGE
coredns-76f75df574-bm9qd                             1/1     Running   0          26h
coredns-76f75df574-sjzlk                             1/1     Running   0          26h
etcd-cka-cluster2-control-plane                      1/1     Running   0          26h
kindnet-f9b74                                        1/1     Running   0          26h
kindnet-hmlzh                                        1/1     Running   0          26h
kindnet-sm7v8                                        1/1     Running   0          26h
kube-apiserver-cka-cluster2-control-plane            1/1     Running   0          26h
kube-controller-manager-cka-cluster2-control-plane   1/1     Running   0          26h
kube-proxy-rdp47                                     1/1     Running   0          26h
kube-proxy-rlbzn                                     1/1     Running   0          26h
kube-proxy-vnm4k                                     1/1     Running   0          26h
kube-scheduler-cka-cluster2-control-plane            1/1     Running   0          26h
```
```
# Inside the Docker container for cka-cluster2-control-plane
[root@master ~]# docker exec -it cka-cluster2-control-plane bash
root@cka-cluster2-control-plane:/# cd /etc/kubernetes/manifests/
root@cka-cluster2-control-plane:/etc/kubernetes/manifests# ls
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
root@cka-cluster2-control-plane:/etc/kubernetes/manifests# mv kube-scheduler.yaml /tmp

# Scheduler Pod no longer present
[root@master ~]# kubectl get pods -n kube-system
NAME                                                 READY   STATUS    RESTARTS   AGE
coredns-76f75df574-bm9qd                             1/1     Running   0          34h
coredns-76f75df574-sjzlk                             1/1     Running   0          34h
etcd-cka-cluster2-control-plane                      1/1     Running   0          34h
kindnet-f9b74                                        1/1     Running   0          34h
kindnet-hmlzh                                        1/1     Running   0          34h
kindnet-sm7v8                                        1/1     Running   0          34h
kube-apiserver-cka-cluster2-control-plane            1/1     Running   0          34h
kube-controller-manager-cka-cluster2-control-plane   1/1     Running   0          34h
kube-proxy-rdp47                                     1/1     Running   0          34h
kube-proxy-rlbzn                                     1/1     Running   0          34h
kube-proxy-vnm4k                                     1/1     Running   0          34h

[root@master ~]# kubectl run nginx --image=nginx
pod/nginx created
[root@master ~]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          8s
[root@master ~]# kubectl describe pod nginx | grep Node:
Node:         <none>
[root@master ~]# kubectl describe pod nginx | grep Status:
Status:       Pending

# Back inside the container: restore the manifest
root@cka-cluster2-control-plane:/etc/kubernetes/manifests# mv /tmp/kube-scheduler.yaml .
root@cka-cluster2-control-plane:/etc/kubernetes/manifests# ls
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml

[root@master ~]# kubectl describe pod nginx | grep Node:
Node:         cka-cluster2-worker2/172.18.0.5
[root@master ~]# kubectl describe pod nginx | grep Status:
Status:       Running
```
- Removing the Scheduler Manifest:
- Moving kube-scheduler.yaml out of /etc/kubernetes/manifests/ caused the kubelet to terminate the kube-scheduler Pod.
- Result:
- kubectl get pods -n kube-system no longer showed the kube-scheduler Pod.
- Newly created Pods (e.g., nginx) stayed Pending with no node assignment.
- Restoring the Scheduler Manifest:
- Moving kube-scheduler.yaml back into the directory triggered the kubelet to restart the kube-scheduler Pod.
- Result:
- The kube-scheduler Pod reappeared in kube-system.
- The nginx Pod was scheduled to a node (cka-cluster2-worker2) and transitioned to Running.
Manual Scheduling
Manual scheduling refers to explicitly assigning a Pod to a specific node without relying on the Kubernetes scheduler (kube-scheduler). This is typically done by specifying the target node directly in the Pod’s configuration. While Kubernetes is designed to automate scheduling, manual scheduling can be useful in specific scenarios, such as debugging, testing, or enforcing strict placement policies.
How Manual Scheduling Works
- Using nodeName in the Pod spec: add the nodeName field to the Pod's YAML definition to force it onto a specific node.
- Example:
```
[root@master manualscheduler]# cat pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: cka-cluster2-worker2
  containers:
  - name: nginx
    image: nginx
[root@master manualscheduler]# vim pod.yml
[root@master manualscheduler]# kubectl apply -f pod.yml
pod/nginx created
[root@master manualscheduler]# kubectl describe pod nginx | grep Node:
Node:         cka-cluster2-worker2/172.18.0.5
```
Key Use Cases for Manual Scheduling
- Bypassing the Scheduler: When the kube-scheduler is unavailable (as in the earlier experiment), manually assigning Pods ensures they run.
- Example: Critical Pods that must run even if the scheduler is down.
- Debugging/Testing: Test Pod behavior on specific nodes (e.g., hardware compatibility).
- Workload Placement Control: Enforce strict policies (e.g., running a Pod on a node with GPU resources).
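To fill in nodeName you need the exact node names, which you can list with kubectl get nodes (the output below is illustrative of a kind-style cluster, not captured from the lesson's session):

```shell
kubectl get nodes
# Typical columns: NAME, STATUS, ROLES, AGE, VERSION
# e.g. cka-cluster2-control-plane with role control-plane,
#      cka-cluster2-worker and cka-cluster2-worker2 as workers
```

One caveat worth knowing: setting nodeName skips the scheduler entirely, so the usual checks (taints, resource availability) are not applied; if the named node cannot run the Pod, the Pod simply fails rather than being rescheduled elsewhere.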
Labels & Selectors
Labels are key/value pairs that are attached to objects such as Pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined. Each key must be unique for a given object.
Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.
```
[root@master selectors]# cat pod.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: redis-pod
    tier: frontend
    type: app1
  name: redis-pod
spec:
  containers:
  - image: redis
    name: redis-pod
[root@master selectors]# kubectl apply -f pod.yml
[root@master selectors]# kubectl get pod --show-labels
NAME        READY   STATUS    RESTARTS   AGE    LABELS
redis-pod   1/1     Running   0          44s    run=redis-pod,tier=frontend,type=app1
[root@master selectors]# kubectl get pods --selector type=app1
NAME        READY   STATUS    RESTARTS   AGE
redis-pod   1/1     Running   0          2m10s
[root@master selectors]# kubectl get pods --selector tier=frontend
NAME        READY   STATUS    RESTARTS   AGE
redis-pod   1/1     Running   0          2m18s
[root@master selectors]# kubectl get pods --selector run=redis-pod
NAME        READY   STATUS    RESTARTS   AGE
redis-pod   1/1     Running   0          2m25s
```
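Since labels can be added and modified at any time, kubectl label works on running objects too, and comma-separated equality selectors are ANDed together. A sketch, assuming the redis-pod from above exists (the env label is made up for illustration):

```shell
# Add a new label to the running Pod
kubectl label pod redis-pod env=dev

# Change an existing label value (requires --overwrite)
kubectl label pod redis-pod type=app2 --overwrite

# Comma-separated selectors must ALL match (logical AND)
kubectl get pods --selector tier=frontend,run=redis-pod

# Remove a label with a trailing dash on the key
kubectl label pod redis-pod env-
```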