# Kubernetes cluster with RKE2

> Step-by-step setup for building a reliable Kubernetes environment on bare metal

Developed in collaboration with [eOracle](https://www.latitude.sh/customers/eoracle), this guide walks you through setting up a Kubernetes cluster on bare metal using [RKE2](https://docs.rke2.io/).

RKE2 (Rancher Kubernetes Engine 2) is a lightweight, secure, and production-ready Kubernetes distribution designed to simplify cluster deployment and management. Let's dive in!

## Prerequisites

* [Latitude.sh servers](https://www.latitude.sh/dashboard) (at least 3 control-plane nodes and 2 worker nodes).
* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed on your local machine.
* A DNS A-record pointing to the control-plane nodes.
* Properly configured IP addresses for all nodes.
* SSH access to all nodes with `root` privileges.
* A `server token` and an `agent token` to securely join nodes to the cluster.
* [Cilium](https://cilium.io/) as the CNI plugin.

## Step 1: Prepare DNS and generate tokens

<Steps>
  <Step title="Set up DNS record">
    Set up a DNS A-record that points to the external IP addresses of all control-plane nodes. This record will serve as the Kubernetes API endpoint and the RKE2 registration address.
  </Step>

  <Step title="Generate tokens">
    Run the following command twice to create a server-token (for control-plane nodes) and an agent-token (for worker nodes):

    ```shell theme={null}
    openssl rand -hex 32
    ```

    Save both tokens for use during the cluster setup.
  </Step>
</Steps>
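As a minimal sketch, both tokens can be generated and saved in one go. The file name `rke2-tokens.env` and the variable names `SERVER_TOKEN` and `AGENT_TOKEN` are illustrative assumptions, not something RKE2 requires.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name; any private location works.
TOKEN_FILE="${TOKEN_FILE:-./rke2-tokens.env}"

# Same command as above, run once per token.
SERVER_TOKEN="$(openssl rand -hex 32)"
AGENT_TOKEN="$(openssl rand -hex 32)"

# Restrict permissions for files created from here on, then write the secrets.
umask 077
printf 'SERVER_TOKEN=%s\nAGENT_TOKEN=%s\n' "$SERVER_TOKEN" "$AGENT_TOKEN" > "$TOKEN_FILE"

echo "Tokens written to $TOKEN_FILE"
```

Keeping the two tokens in a single owner-only file makes it easier to reuse the exact same values on every node later in the guide.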

## Step 2: Configure RKE2 on the first control-plane node

On the first server, create the RKE2 configuration file at `/etc/rancher/rke2/config.yaml` with the following content:

```yaml theme={null}
token: <server-token>
agent-token: <agent-token>
tls-san:
  - rke2.ext.example.com
  - <control_plane_ip>
node-name: control-plane-01
advertise-address: <control_plane_ip>
node-ip: <node_internal_ip>
node-external-ip: <node_external_ip>
disable:
  - rke2-ingress-nginx
cni: "cilium"
disable-kube-proxy: true
```

Replace `<server-token>` and `<agent-token>` with the generated tokens, and replace `rke2.ext.example.com` and the IP placeholders with your own DNS name and addresses.
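Before installing, it can help to confirm that no `<placeholder>` was left in the file. The helper below is a hypothetical sketch (the function name `has_placeholders` is not part of RKE2); it only looks for angle-bracket placeholders in the style used in this guide.

```shell
#!/usr/bin/env bash

# has_placeholders FILE
# Hypothetical helper: prints every line that still contains a <placeholder>.
# Returns 0 if at least one placeholder was found, 1 if none were found.
has_placeholders() {
  grep -nE '<[A-Za-z_-]+>' "$1"
}

# Example on the first control-plane node:
# if has_placeholders /etc/rancher/rke2/config.yaml; then
#   echo "Fill in the values listed above before installing RKE2." >&2
# fi
```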

## Step 3: Install RKE2

<Steps>
  <Step title="Set RKE2 version">
    Set the desired [RKE2 version](https://github.com/rancher/rke2/releases) as an environment variable:

    ```shell theme={null}
    export RKE2_VERSION=v1.31.1+rke2r1
    ```
  </Step>

  <Step title="Install RKE2 server">
    Run the following command to install the RKE2 distribution on the first control-plane node:

    ```shell theme={null}
    curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=${RKE2_VERSION} INSTALL_RKE2_TYPE=server INSTALL_RKE2_CHANNEL=stable INSTALL_RKE2_METHOD=tar sh -
    ```
  </Step>

  <Step title="Start RKE2 service">
    Enable RKE2 to start on boot, start the service in the background, and follow the service logs while it comes up:

    ```shell theme={null}
    systemctl enable rke2-server.service && systemctl restart rke2-server.service & journalctl -u rke2-server -f
    ```
  </Step>
</Steps>
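If you would rather script the wait than watch the logs, a small polling helper is one option. This is a sketch under assumptions: `wait_for` is a hypothetical function name, and the attempt count and delay in the example are arbitrary.

```shell
#!/usr/bin/env bash

# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND until it succeeds or ATTEMPTS runs out.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example on the first control-plane node: poll every 5 seconds, up to 60 times.
# wait_for 60 5 systemctl is-active --quiet rke2-server.service \
#   && echo "rke2-server is active"
```

Waiting for the first server to become active before joining other nodes avoids registration attempts against an endpoint that is not up yet.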

## Step 4: Configure additional control-plane nodes

<Steps>
  <Step title="Get tokens from first node">
    Once the first control-plane node is running, RKE2 stores the cluster tokens on that node. Use the token located at `/var/lib/rancher/rke2/server` as the server token and the one at `/var/lib/rancher/rke2/server/agent-token` as the agent token.

    <Note>Set the `server` option on every node except the first control-plane node.</Note>
  </Step>

  <Step title="Configure additional control-plane nodes">
    Create the configuration file `/etc/rancher/rke2/config.yaml` on each additional control-plane node.

    Example for `control-plane-02`:

    ```yaml theme={null}
    server: https://rke2.ext.example.com:9345
    token: <server-token>
    agent-token: <agent-token>
    tls-san:
      - rke2.ext.example.com
      - <control_plane_ip>
    node-name: control-plane-02
    advertise-address: <control_plane_ip>
    node-ip: <node_internal_ip>
    node-external-ip: <node_external_ip>
    disable:
      - rke2-ingress-nginx
    cni: "cilium"
    disable-kube-proxy: true
    ```

    Example for `control-plane-03`:

    ```yaml theme={null}
    server: https://rke2.ext.example.com:9345
    token: <server-token>
    agent-token: <agent-token>
    tls-san:
      - rke2.ext.example.com
      - <control_plane_ip>
    node-name: control-plane-03
    advertise-address: <control_plane_ip>
    node-ip: <node_internal_ip>
    node-external-ip: <node_external_ip>
    disable:
      - rke2-ingress-nginx
    cni: "cilium"
    disable-kube-proxy: true
    ```

    Repeat this process for all control-plane nodes, updating the node-name and IP addresses accordingly.
  </Step>

  <Step title="Configure worker nodes">
    On each worker (agent) node, create the following configuration file at `/etc/rancher/rke2/config.yaml`. Worker nodes join the cluster with the agent token:

    ```yaml theme={null}
    server: https://rke2.ext.example.com:9345
    token: <agent-token>
    node-name: worker-node-01
    node-ip: <worker_node_internal_ip>
    node-external-ip: <worker_node_external_ip>
    node-label:
      - "node.kubernetes.io/role=worker"
    ```
  </Step>

  <Step title="Install RKE2 agent">
    Set the same RKE2 version used on the control-plane nodes by running `export RKE2_VERSION=v1.31.1+rke2r1`.

    Install the RKE2 distribution:

    ```shell theme={null}
    curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=${RKE2_VERSION} INSTALL_RKE2_TYPE=agent INSTALL_RKE2_CHANNEL=stable INSTALL_RKE2_METHOD=tar sh -
    ```

    Enable RKE2 to start on boot, start the service in the background, and follow the service logs while it comes up:

    ```shell theme={null}
    systemctl enable rke2-agent.service && systemctl restart rke2-agent.service & journalctl -u rke2-agent -f
    ```
  </Step>
</Steps>
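Because the additional control-plane configurations differ only in node name and addresses, they can be rendered from one template. The sketch below assumes a hypothetical helper named `render_join_config` and expects `SERVER_TOKEN` and `AGENT_TOKEN` to be set in the environment; the IP addresses in the example are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# render_join_config NODE_NAME CONTROL_PLANE_IP INTERNAL_IP EXTERNAL_IP
# Hypothetical helper: prints the config for an additional control-plane node.
# Reads SERVER_TOKEN and AGENT_TOKEN from the environment.
render_join_config() {
  local node_name="$1" cp_ip="$2" internal_ip="$3" external_ip="$4"
  cat <<EOF
server: https://rke2.ext.example.com:9345
token: ${SERVER_TOKEN}
agent-token: ${AGENT_TOKEN}
tls-san:
  - rke2.ext.example.com
  - ${cp_ip}
node-name: ${node_name}
advertise-address: ${cp_ip}
node-ip: ${internal_ip}
node-external-ip: ${external_ip}
disable:
  - rke2-ingress-nginx
cni: "cilium"
disable-kube-proxy: true
EOF
}

# Example: run on control-plane-02 itself (addresses are placeholders).
# mkdir -p /etc/rancher/rke2
# render_join_config control-plane-02 203.0.113.11 10.0.0.11 203.0.113.11 \
#   > /etc/rancher/rke2/config.yaml
```

Rendering from a single template keeps the shared settings (`tls-san`, `cni`, `disable-kube-proxy`) identical across control-plane nodes.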

## Step 5: Acquire the kubeconfig file

<Steps>
  <Step title="Locate kubeconfig">
    Locate the kubeconfig file on any control-plane node at `/etc/rancher/rke2/rke2.yaml`.

    The file should look similar to this:

    ```yaml theme={null}
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: <CERTIFICATE_DATA>
        server: https://127.0.0.1:6443
      name: default
    contexts:
    - context:
        cluster: default
        user: default
      name: default
    current-context: default
    kind: Config
    preferences: {}
    users:
    - name: default
      user:
        client-certificate-data: <CLIENT_CERTIFICATE_DATA>
        client-key-data: <CLIENT_KEY_DATA>
    ```
  </Step>

  <Step title="Update server address">
    Replace the server value with the external registration address:

    `server: https://rke2.ext.example.com:6443`
  </Step>

  <Step title="Copy to local machine">
    Copy the file to your local machine and save it as `~/.kube/config` or another location (e.g., `~/.kube/test`).

    Use the following command to check the cluster nodes:

    ```shell theme={null}
    kubectl get nodes --kubeconfig ~/.kube/test
    ```
  </Step>
</Steps>
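The address replacement can also be done with `sed` while copying the file. This is a sketch only: `point_kubeconfig_at` is a hypothetical helper name, and it assumes the kubeconfig contains the loopback address exactly as shown above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# point_kubeconfig_at SRC DEST API_HOST
# Hypothetical helper: copies SRC to DEST, replacing the loopback API address
# with the external registration address on port 6443.
point_kubeconfig_at() {
  local src="$1" dest="$2" api_host="$3"
  sed "s#server: https://127\.0\.0\.1:6443#server: https://${api_host}:6443#" "$src" > "$dest"
  chmod 600 "$dest"   # the file contains client credentials
}

# Example on your local machine, after copying rke2.yaml from a control-plane node:
# point_kubeconfig_at ./rke2.yaml ~/.kube/test rke2.ext.example.com
# kubectl get nodes --kubeconfig ~/.kube/test
```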

## Step 6: Configure Cilium for kube-proxy replacement

<Steps>
  <Step title="Create Cilium configuration">
    Create a file named `rke2-cilium-values.yaml` with this content:

    ```yaml theme={null}
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: rke2-cilium
      namespace: kube-system
    spec:
      valuesContent: |-
        k8sServiceHost: "rke2.ext.example.com"
        k8sServicePort: "6443"
        kubeProxyReplacement: "true"
    ```

    <Info>Kube-proxy replacement mode requires the Kubernetes API host and port to be set explicitly. The `k8sServiceHost` and `k8sServicePort` values above are passed to Cilium as `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT`. See [HelmChartConfig](https://docs.rke2.io/helm#customizing-packaged-components-with-helmchartconfig) for more information.</Info>
  </Step>

  <Step title="Apply configuration">
    Run this command to apply the configuration:

    ```shell theme={null}
    kubectl apply -f ./rke2-cilium-values.yaml --kubeconfig ~/.kube/test
    ```

    If the configuration was applied, the command prints:

    ```shell theme={null}
    helmchartconfig.helm.cattle.io/rke2-cilium created
    ```

    Cilium is now configured to replace kube-proxy.
  </Step>
</Steps>
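To double-check what was stored in the cluster, you can read the values back from the `HelmChartConfig` resource. The wrapper below is a hypothetical sketch (`show_cilium_values` is not an RKE2 or Cilium command); it relies only on `kubectl get` with a JSONPath output.

```shell
#!/usr/bin/env bash

# show_cilium_values KUBECONFIG_PATH
# Hypothetical helper: prints the values stored in the rke2-cilium
# HelmChartConfig created in the previous step.
show_cilium_values() {
  kubectl get helmchartconfig rke2-cilium \
    --namespace kube-system \
    --kubeconfig "$1" \
    -o jsonpath='{.spec.valuesContent}'
}

# Example:
# show_cilium_values ~/.kube/test | grep kubeProxyReplacement
```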

## Step 7: Check cluster status

<Steps>
  <Step title="Check pod status">
    Run the following to check the status of all pods in all namespaces:

    ```shell theme={null}
    kubectl get pods -A --kubeconfig ~/.kube/test
    ```

    Ensure all pods are in the **Running** state. Pods created by one-off jobs, such as the Helm install jobs, show **Completed** instead.
  </Step>

  <Step title="Check node status">
    Check the status of all nodes:

    ```shell theme={null}
    kubectl get nodes --kubeconfig ~/.kube/test
    ```

    Nodes should show **Ready** in the Status column.
  </Step>
</Steps>
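On a larger cluster, scanning the full pod list by eye gets tedious. As a rough sketch, the hypothetical helper below (`list_unhealthy_pods` is an illustrative name) prints only the pods whose status is neither `Running` nor `Completed`.

```shell
#!/usr/bin/env bash

# list_unhealthy_pods KUBECONFIG_PATH
# Hypothetical helper: prints pods whose STATUS is neither Running nor Completed.
# With -A and --no-headers the columns are:
# NAMESPACE NAME READY STATUS RESTARTS AGE
list_unhealthy_pods() {
  kubectl get pods -A --no-headers --kubeconfig "$1" \
    | awk '$4 != "Running" && $4 != "Completed"'
}

# Example:
# list_unhealthy_pods ~/.kube/test
```

Empty output means every pod is either running or has finished successfully.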

## Troubleshooting

To troubleshoot directly on a control-plane node, add these aliases for the `kubectl` and `crictl` binaries bundled with RKE2:

```shell theme={null}
alias kubectl='/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml'
kubectl get nodes

alias crictl='/var/lib/rancher/rke2/bin/crictl'
export CONTAINER_RUNTIME_ENDPOINT="unix:///run/k3s/containerd/containerd.sock"
crictl ps
```

## That's it!

Your cluster is now set up and ready to handle workloads on bare metal. Check out our guide on [Kubernetes load balancing](/guides/kubernetes-load-balancing-on-bare-metal) to optimize traffic management across your nodes.
