While Rancher makes it easy to create Kubernetes clusters, a production-ready cluster takes more consideration and planning. There are three roles that can be assigned to nodes: etcd, controlplane and worker. The following sections describe each role in more detail.
When designing your cluster(s), you have two options:
- Use dedicated nodes for each role. This ensures resource availability for the components needed for the specified role. It also strictly isolates network traffic between each of the roles according to the Port Requirements.
- Assign the etcd and controlplane roles to the same nodes. These nodes must meet the hardware requirements for both roles.
Note: Do not add the worker role to any node configured with either the etcd or controlplane role. This will make the nodes schedulable for regular workloads, which could interfere with critical cluster components running on the nodes with the etcd or controlplane role.
etcd
Nodes with the etcd role run etcd, which is a consistent and highly available key value store used as Kubernetes’ backing store for all cluster data. etcd replicates the data to each node.
Note: Nodes with the etcd role are shown as Unschedulable in the UI, meaning no pods will be scheduled to these nodes by default.
Hardware Requirements
Please see Kubernetes: Building Large Clusters and etcd: Hardware Recommendations for the hardware requirements.
Count of etcd Nodes
The number of nodes that you can lose at once while maintaining cluster availability is determined by the number of nodes assigned the etcd role. For a cluster with n members, a majority of (n/2)+1 members (with n/2 rounded down) must remain available. Therefore, we recommend creating one etcd node in each of three different availability zones, so that the cluster survives the loss of one availability zone within a region. If you use only two zones, the cluster survives only the loss of the zone that does not hold the majority of etcd nodes.
| Nodes with etcd role | Majority | Failure Tolerance |
|---|---|---|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |
| 8 | 5 | 3 |
| 9 | 5 | 4 |
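The arithmetic behind this table can be reproduced with a short sketch. The function names and zone labels below are illustrative assumptions, not part of Rancher or etcd; the code only applies the majority rule described above.

```python
from typing import Mapping


def majority(members: int) -> int:
    """Smallest number of etcd members that still forms a majority."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """Number of members that can be lost while a majority remains."""
    return members - majority(members)


def survives_zone_loss(nodes_per_zone: Mapping[str, int]) -> bool:
    """True if losing any single zone still leaves a majority of members.

    nodes_per_zone maps a (hypothetical) zone name to its etcd node count.
    """
    total = sum(nodes_per_zone.values())
    needed = majority(total)
    return all(total - count >= needed for count in nodes_per_zone.values())


if __name__ == "__main__":
    # Prints the same three columns as the table above.
    for n in range(1, 10):
        print(n, majority(n), failure_tolerance(n))
    # One etcd node in each of three zones tolerates the loss of any one zone.
    print(survives_zone_loss({"zone-a": 1, "zone-b": 1, "zone-c": 1}))
    # With two zones, losing the zone that holds the majority breaks the cluster.
    print(survives_zone_loss({"zone-a": 2, "zone-b": 1}))
```

Note that an even member count adds no failure tolerance over the next lower odd count, which is why odd numbers of etcd nodes are the usual choice.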
Network Latency
Rancher recommends minimizing latency between the etcd nodes. The default setting for heartbeat-interval is 500, and the default setting for election-timeout is 5000 (both in milliseconds). These settings allow etcd to run in most networks, except those with very high latency.
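As a rough sanity check on these defaults, the sketch below applies a rule of thumb from the etcd tuning guidance: the election timeout should be at least about 10 times the round-trip time between members. The function names and the factor of 10 are assumptions for illustration, not Rancher settings, and the result is only a first-pass estimate.

```python
HEARTBEAT_INTERVAL_MS = 500   # default stated in this document
ELECTION_TIMEOUT_MS = 5000    # default stated in this document


def max_round_trip_ms(election_timeout_ms: int, factor: int = 10) -> float:
    """Largest round-trip time the rule of thumb allows for a given timeout.

    The factor of 10 is an assumed rule of thumb, not a hard limit.
    """
    return election_timeout_ms / factor


def timing_fits(round_trip_ms: float,
                heartbeat_ms: int = HEARTBEAT_INTERVAL_MS,
                election_ms: int = ELECTION_TIMEOUT_MS) -> bool:
    """Check measured latency between etcd nodes against the configured timings."""
    if round_trip_ms <= 0:
        raise ValueError("round-trip time must be positive")
    # The heartbeat must fire several times within one election timeout,
    # and the election timeout must leave headroom over network latency.
    return heartbeat_ms < election_ms and round_trip_ms <= max_round_trip_ms(election_ms)


if __name__ == "__main__":
    print(max_round_trip_ms(ELECTION_TIMEOUT_MS))
    print(timing_fits(round_trip_ms=50))
    print(timing_fits(round_trip_ms=800))
```

Feed in the round-trip time you actually measure between your etcd nodes (for example with ping) rather than relying on a provider's nominal figure.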
Backups
etcd is the location where the state of your cluster is stored. Losing etcd data means losing your cluster. Make sure you configure etcd Recurring Snapshots for your cluster(s), and make sure the snapshots are stored externally (off the node) as well.
controlplane
Nodes with the controlplane role run the Kubernetes master components (excluding etcd, as it's a separate role). See Kubernetes: Master Components for a detailed list of components.
Note: Nodes with the controlplane role are shown as Unschedulable in the UI, meaning no pods will be scheduled to these nodes by default.
Hardware Requirements
Please see Kubernetes: Building Large Clusters for the hardware requirements.
Count of controlplane Nodes
Adding more than one node with the controlplane role makes every master component highly available. See below for a breakdown of how high availability is achieved per component.
kube-apiserver
The Kubernetes API server (kube-apiserver) scales horizontally. Each node with the role controlplane will be added to the NGINX proxy on the nodes with components that need to access the Kubernetes API server. This means that if a node becomes unreachable, the local NGINX proxy on the node will forward the request to another Kubernetes API server in the list.
kube-controller-manager
The Kubernetes controller manager uses leader election based on an endpoint in Kubernetes. One instance of the kube-controller-manager creates an entry in the Kubernetes endpoints and updates that entry at a configured interval. Other instances see an active leader and wait for that entry to expire (for example, when a node is unresponsive).
kube-scheduler
The Kubernetes scheduler uses leader election based on an endpoint in Kubernetes. One instance of the kube-scheduler creates an entry in the Kubernetes endpoints and updates that entry at a configured interval. Other instances see an active leader and wait for that entry to expire (for example, when a node is unresponsive).
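The leader election used by kube-controller-manager and kube-scheduler can be pictured as a shared record that the leader keeps renewing. The sketch below is a simplified, in-memory model of that idea, not the Kubernetes implementation: the LeaderRecord class, the node names, and the 15-second duration are all assumptions for illustration, and the real components update the record atomically through the API server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderRecord:
    """Stand-in for the shared entry kept in Kubernetes (hypothetical model)."""
    holder: Optional[str] = None
    renewed_at: float = 0.0
    duration: float = 15.0  # seconds until an unrenewed entry expires (assumed)


def try_acquire_or_renew(record: LeaderRecord, candidate: str, now: float) -> bool:
    """Return True if `candidate` is the leader after this attempt."""
    expired = record.holder is None or now - record.renewed_at > record.duration
    if record.holder == candidate or expired:
        # Either the current leader renews, or a waiting instance takes over
        # an entry that was never claimed or has expired.
        record.holder = candidate
        record.renewed_at = now
        return True
    # Another instance holds an unexpired entry: keep waiting.
    return False


if __name__ == "__main__":
    record = LeaderRecord()
    print(try_acquire_or_renew(record, "cp-1", now=0))   # cp-1 claims leadership
    print(try_acquire_or_renew(record, "cp-2", now=5))   # cp-2 sees an active leader
    print(try_acquire_or_renew(record, "cp-1", now=10))  # cp-1 renews its entry
    # cp-1 becomes unresponsive and stops renewing after t=10.
    print(try_acquire_or_renew(record, "cp-2", now=20))  # entry not yet expired
    print(try_acquire_or_renew(record, "cp-2", now=26))  # entry expired, cp-2 takes over
```

Only one instance does active work at a time, so adding controlplane nodes improves availability of these two components rather than their throughput.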
worker
Nodes with the worker role run the Kubernetes node components. See Kubernetes: Node Components for a detailed list of components.
Hardware Requirements
The hardware requirements for nodes with the worker role mostly depend on your workloads. The minimum to run the Kubernetes node components is 1 CPU (core) and 1 GB of memory.
Count of worker Nodes
Adding more than one node with the worker role will make sure your workloads can be rescheduled if a node fails.
Networking
Cluster nodes should be located within a single region. Most cloud providers offer multiple availability zones within a region, which can be used to create higher availability for your cluster. Using multiple availability zones is fine for nodes with any role. If you are using Kubernetes Cloud Provider resources, consult the documentation for any restrictions (e.g., zone storage restrictions).
Cluster Diagram
This diagram is applicable to Kubernetes clusters built using RKE or Rancher Launched Kubernetes.
Lines show the traffic flow between components. Colors are used purely as a visual aid.
Production checklist
- Nodes should have one of the following role configurations:
  - etcd
  - controlplane
  - etcd and controlplane
  - worker (the worker role should not be used or added on nodes with the etcd or controlplane role)
- Network traffic is allowed only according to the Port Requirements.
- Have at least three nodes with the role etcd to survive losing one node. Increase this count for higher node fault tolerance, and spread them across (availability) zones to provide even better fault tolerance.
- Assign two or more nodes the controlplane role for master component high availability.
- Assign two or more nodes the worker role for workload rescheduling upon node failure.
- Enable etcd snapshots. Verify that snapshots are being created, and run a disaster recovery scenario to verify the snapshots are valid.
- Perform load tests on your cluster to verify that its hardware can support your workloads.
- Configure alerts/notifiers for Kubernetes components (System Service).
- Configure logging for cluster analysis and post-mortems.
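The role-related items in this checklist lend themselves to an automated check. The following is a minimal sketch, assuming you can export your cluster's nodes as a mapping from node name to its set of roles; the function name, node names, and message wording are hypothetical and not part of any Rancher tooling.

```python
from typing import Dict, List, Set

# Role combinations the checklist accepts for a single node.
ALLOWED_ROLE_SETS = [
    {"etcd"},
    {"controlplane"},
    {"etcd", "controlplane"},
    {"worker"},
]

# Minimum node count per role, taken from the checklist items above.
MINIMUM_COUNTS = {"etcd": 3, "controlplane": 2, "worker": 2}


def check_roles(nodes: Dict[str, Set[str]]) -> List[str]:
    """Return a list of checklist violations (empty if none were found)."""
    problems = []
    for name, roles in sorted(nodes.items()):
        if roles not in ALLOWED_ROLE_SETS:
            problems.append(f"{name}: unsupported role combination {sorted(roles)}")
    for role, minimum in MINIMUM_COUNTS.items():
        count = sum(role in roles for roles in nodes.values())
        if count < minimum:
            problems.append(f"only {count} node(s) with role {role}, need {minimum}")
    return problems


if __name__ == "__main__":
    layout = {
        "node-1": {"etcd", "controlplane"},
        "node-2": {"etcd", "controlplane"},
        "node-3": {"etcd", "controlplane"},
        "node-4": {"worker"},
        "node-5": {"worker"},
    }
    print(check_roles(layout))
```

The sketch covers only role assignment and node counts; snapshots, load tests, alerts, and logging still need to be verified separately.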
RKE cluster running Rancher HA
You may have noticed that our High Availability (HA) Install instructions do not meet our definition of a production-ready cluster, as there are no dedicated nodes for the worker role. However, for your Rancher installation, this three-node cluster is valid, because:
- It allows one etcd node failure.
- It maintains multiple instances of the master components by having multiple controlplane nodes.
- No other workloads than Rancher itself should be created on this cluster.