Ready to customize your deployment? Use the pages in this section for the full checklist, installation methods, and platform-specific notes. Review the requirements first, then pick either the manifests or the Helm chart, and finish with the environment-specific guidelines for your platform.
Advanced Installation
- 1: Kind Quickstart
- 2: Amazon EKS Notes
- 3: Bare Metal Notes
- 4: Install with Helm
- 5: Install with kubectl/manifests
1 - Kind Quickstart
This walkthrough targets Linux hosts running Docker or Podman, since Kind nodes are containers that share the host kernel. macOS and Windows hosts cannot load the kernel modules Lustre requires, but you can still observe the driver boot sequence there. The shim below fakes mount.lustre with tmpfs so you can run the end-to-end demo locally without a real Lustre server.
Requirements
- Docker 20.10+ (or a compatible container runtime supported by Kind).
- Kind v0.20+.
- kubectl v1.27+ pointed at your Kind context.
- A GitHub personal access token with read:packages if you plan to pull images from GitHub Container Registry via an image pull secret (optional but recommended).
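A quick way to confirm the tooling is in place before you start:
docker version --format '{{.Server.Version}}'   # or: podman version
kind version
kubectl version --client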
1. Create a Kind cluster
Save the following Kind configuration and create the cluster:
cat <<'EOF' > kind-klustre.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
image: kindest/node:v1.29.2
- role: worker
image: kindest/node:v1.29.2
EOF
kind create cluster --name klustre-kind --config kind-klustre.yaml
kubectl cluster-info --context kind-klustre-kind
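Confirm both nodes registered and reached the Ready state before continuing:
kubectl get nodes --context kind-klustre-kind -o wide
kubectl wait --for=condition=Ready node --all --context kind-klustre-kind --timeout=120s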
2. Install a Lustre shim inside the nodes
The CSI plugin shells out to mount.lustre and umount.lustre. Kind nodes do not ship with the Lustre client, so we create lightweight shims that mount a tmpfs and behave like a Lustre mount. This allows the volume lifecycle to complete even though no real Lustre server exists.
cat <<'EOF' > lustre-shim.sh
#!/bin/bash
set -euo pipefail
SOURCE="${1:-tmpfs}"   # Lustre source (e.g. an MGS NID); accepted for interface parity but unused by the shim
TARGET="${2:-/mnt/lustre}"
shift 2 || true        # drop positional args; any -o mount options are ignored
mkdir -p "$TARGET"
if mountpoint -q "$TARGET"; then
  exit 0               # already mounted; keep the shim idempotent
fi
mount -t tmpfs -o size=512m tmpfs "$TARGET"
EOF
cat <<'EOF' > lustre-unmount.sh
#!/bin/bash
set -euo pipefail
TARGET="${1:?target path required}"
umount "$TARGET"
EOF
chmod +x lustre-shim.sh lustre-unmount.sh
for node in $(kind get nodes --name klustre-kind); do
docker cp lustre-shim.sh "$node":/usr/sbin/mount.lustre
docker cp lustre-unmount.sh "$node":/usr/sbin/umount.lustre
docker exec "$node" chmod +x /usr/sbin/mount.lustre /usr/sbin/umount.lustre
done
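To sanity-check the shims before installing the driver, you can call them by hand inside a node; /tmp/shim-test here is just a scratch path for the test:
docker exec klustre-kind-worker /usr/sbin/mount.lustre 10.0.0.1@tcp0:/lustre-fs /tmp/shim-test
docker exec klustre-kind-worker df -h /tmp/shim-test   # should report a tmpfs mount
docker exec klustre-kind-worker /usr/sbin/umount.lustre /tmp/shim-test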
3. Prepare node labels
Label the Kind worker node so it is eligible to run Lustre workloads:
kubectl label node klustre-kind-worker lustre.csi.klustrefs.io/lustre-client=true
The default klustre-csi-static storage class uses the label above inside allowedTopologies. Label any node that will run workloads needing Lustre.
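If you add more workers later, a small loop (a sketch that labels every non-control-plane node) avoids repeating the command per node:
for node in $(kubectl get nodes -l '!node-role.kubernetes.io/control-plane' -o name); do
  kubectl label "$node" lustre.csi.klustrefs.io/lustre-client=true --overwrite
done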
4. Deploy Klustre CSI Plugin
Install the driver into the Kind cluster using the published Kustomize manifests:
export KLUSTREFS_VERSION=main
kubectl apply -k "github.com/klustrefs/klustre-csi-plugin//manifests?ref=$KLUSTREFS_VERSION"
Then watch the pods come up:
kubectl get pods -n klustre-system -o wide
Wait for the daemonset rollout to complete:
kubectl rollout status daemonset/klustre-csi-node -n klustre-system --timeout=120s
The rollout is done once the klustre-csi-node daemonset reports READY pods on both the control-plane and worker nodes.
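You can also confirm the driver registered with the kubelet on each node. Assuming the CSIDriver object in the manifests is named after the driver (lustre.csi.klustrefs.io, the same name used in the PersistentVolume in the next step):
kubectl get csidriver lustre.csi.klustrefs.io
kubectl get csinode klustre-kind-worker -o jsonpath='{.spec.drivers[*].name}'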
5. Mount the simulated Lustre share
Create a demo manifest that provisions a static PersistentVolume and a BusyBox Deployment. Because the mount.lustre shim mounts tmpfs, data is confined to the worker node's memory and disappears when the pod restarts. Replace the source string with the Lustre target you plan to use later; in this lab it is only metadata.
Create the demo manifest with a heredoc:
cat <<'EOF' > lustre-demo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: lustre-demo-pv
spec:
storageClassName: klustre-csi-static
capacity:
storage: 1Ti
accessModes:
- ReadWriteMany
csi:
driver: lustre.csi.klustrefs.io
volumeHandle: lustre-demo
volumeAttributes:
# This is only metadata in the Kind lab; replace with a real target for production clusters.
source: 10.0.0.1@tcp0:/lustre-fs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: lustre-demo-pvc
spec:
storageClassName: klustre-csi-static
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Ti
volumeName: lustre-demo-pv
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: lustre-demo
spec:
replicas: 1
selector:
matchLabels:
app: lustre-demo
template:
metadata:
labels:
app: lustre-demo
spec:
containers:
- name: demo
image: busybox:1.36
command: ["sh", "-c", "sleep 3600"]
volumeMounts:
- name: lustre
mountPath: /mnt/lustre
volumes:
- name: lustre
persistentVolumeClaim:
claimName: lustre-demo-pvc
EOF
Apply the demo manifest:
kubectl apply -f lustre-demo.yaml
Wait for the demo deployment to become available:
kubectl wait --for=condition=available deployment/lustre-demo --timeout=180s
Confirm the Lustre (tmpfs) mount is visible in the pod:
kubectl exec deploy/lustre-demo -- df -h /mnt/lustre
Write and read back a test file:
kubectl exec deploy/lustre-demo -- sh -c 'echo "hello from $(hostname)" > /mnt/lustre/hello.txt'
kubectl exec deploy/lustre-demo -- cat /mnt/lustre/hello.txt
You should see the tmpfs mount reported by df and be able to write temporary files.
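To see the shim in action from the driver's side, check the node plugin logs for the demo volume. The exact log lines depend on the driver version, so treat the filter as a rough match, and note that kubectl logs against a daemonset picks a single pod (rerun against the worker pod directly if it selects the control-plane one):
kubectl logs daemonset/klustre-csi-node -n klustre-system -c klustre-csi | grep -i lustre-demo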
6. Clean up (optional)
Remove the demo PV, PVC, and Deployment:
kubectl delete -f lustre-demo.yaml
If you want to remove the driver and tear down the Kind environment as well:
kubectl delete -k "github.com/klustrefs/klustre-csi-plugin//manifests?ref=$KLUSTREFS_VERSION"
kind delete cluster --name klustre-kind
rm kind-klustre.yaml lustre-shim.sh lustre-unmount.sh lustre-demo.yaml
Troubleshooting
- If the daemonset pods crash with ImagePullBackOff, use kubectl describe daemonset/klustre-csi-node -n klustre-system and kubectl logs daemonset/klustre-csi-node -n klustre-system -c klustre-csi to inspect the error. The image is public on ghcr.io, so no image pull secret is required; ensure your nodes can reach ghcr.io (or your proxy) from inside the cluster.
- If the demo pod fails to mount /mnt/lustre, make sure the shim scripts were copied to every Kind node and are executable. You can rerun the docker cp ... mount.lustre/umount.lustre loop from step 2 after adding or recreating nodes.
- Remember that tmpfs lives in RAM. Large writes in the demo workload consume memory inside the Kind worker container and disappear after pod restarts. Move to a real Lustre environment for persistent data testing.
Use this local experience to get familiar with the manifests and volume lifecycle, then follow the main Introduction guide when you are ready to operate against real Lustre backends.
2 - Amazon EKS Notes
The AWS-oriented quickstart is under construction. It will cover:
- Preparing EKS worker nodes with the Lustre client (either Amazon Linux extras or the FSx-provided packages).
- Handling IAM roles for service accounts (IRSA) and pulling container images from GitHub Container Registry.
- Connecting to FSx for Lustre file systems (imported or linked to S3 buckets) and exposing them via static PersistentVolumes.
Until the full write-up lands, adapt the Introduction flow by:
- Installing the Lustre client on your managed node groups (e.g., with yum install lustre-client in your AMI or through user data; see the sketch after this list).
- Labeling the nodes that have Lustre access with lustre.csi.klustrefs.io/lustre-client=true.
- Applying the Klustre CSI manifests or Helm chart in the klustre-system namespace.
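As a rough starting point, user data for an Amazon Linux 2 based node group might look like the sketch below. Treat it as an illustration only: the package topic and bootstrap flags are assumptions to verify against the FSx for Lustre and EKS AMI documentation for your versions, and <cluster-name> is a placeholder.
#!/bin/bash
# Illustrative only: install the Lustre client, then join the cluster with the node label applied.
amazon-linux-extras install -y lustre
/etc/eks/bootstrap.sh <cluster-name> \
  --kubelet-extra-args '--node-labels=lustre.csi.klustrefs.io/lustre-client=true'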
Feedback on which AWS-specific topics matter most (FSx throughput tiers, PrivateLink, IAM policies, etc.) is welcome in the community discussions.
3 - Bare Metal Notes
This guide will describe how to prepare on-prem or colocation clusters where you manage the operating systems directly (kernel modules, Lustre packages, kubelet paths, etc.). While the detailed walkthrough is in progress, you can already follow the general Introduction page and keep the following considerations in mind:
- Ensure every node that should host Lustre-backed pods has the Lustre client packages installed via your distribution's package manager (for example, the lustre-client RPM/DEB).
- Label those nodes with lustre.csi.klustrefs.io/lustre-client=true.
- Grant the klustre-system namespace Pod Security admission exemptions (e.g., pod-security.kubernetes.io/enforce=privileged) because the daemonset requires hostPID, hostNetwork, and SYS_ADMIN; see the example after this list.
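For the Pod Security exemption, a minimal example once the klustre-system namespace exists (assuming you rely on Pod Security admission namespace labels and no stricter external policy engine is in place):
kubectl label namespace klustre-system \
  pod-security.kubernetes.io/enforce=privileged --overwrite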
If you are interested in helping us document more advanced configurations (multiple interfaces, bonded networks, RDMA, etc.), please open an issue or discussion in the GitHub repository.
4 - Install with Helm
The Helm chart is published under oci://ghcr.io/klustrefs/charts/klustre-csi-plugin.
1. Authenticate (optional)
If you use a GitHub personal access token for GHCR:
helm registry login ghcr.io -u <github-user>
Skip this step if anonymous pulls are permitted in your environment.
2. Install or upgrade
helm upgrade --install klustre-csi \
oci://ghcr.io/klustrefs/charts/klustre-csi-plugin \
--version 0.1.1 \
--namespace klustre-system \
--create-namespace \
  --set 'imagePullSecrets[0].name=ghcr-secret'
Adjust the release name, namespace, and imagePullSecrets as needed. You can omit the secret if GHCR is reachable without credentials.
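If you do need a pull secret, a minimal sketch that matches the ghcr-secret name used in the --set flag above (the namespace has to exist before the secret can be created in it):
kubectl create namespace klustre-system --dry-run=client -o yaml | kubectl apply -f -
kubectl create secret docker-registry ghcr-secret \
  --namespace klustre-system \
  --docker-server=ghcr.io \
  --docker-username=<github-user> \
  --docker-password=<personal-access-token>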
3. Override values
Common overrides:
- nodePlugin.logLevel – adjust verbosity (debug, info, etc.).
- nodePlugin.pluginDir, nodePlugin.kubeletRegistrationPath – change if /var/lib/kubelet differs on your hosts.
- storageClass.mountOptions – add Lustre mount flags such as flock or user_xattr.
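For more than a flag or two, a values file is easier to review. The sketch below uses the keys listed above; the nested structure and the path values are assumptions, so confirm them against helm show values for your chart version:
cat <<'EOF' > klustre-values.yaml
# Example values only; confirm key names and defaults with helm show values.
nodePlugin:
  logLevel: debug
  # Typical kubelet paths; change only if /var/lib/kubelet differs on your hosts.
  pluginDir: /var/lib/kubelet/plugins/lustre.csi.klustrefs.io
  kubeletRegistrationPath: /var/lib/kubelet/plugins_registry
storageClass:
  mountOptions:
    - flock
    - user_xattr
EOF
helm upgrade --install klustre-csi \
  oci://ghcr.io/klustrefs/charts/klustre-csi-plugin \
  --version 0.1.1 \
  --namespace klustre-system \
  --create-namespace \
  -f klustre-values.yaml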
View the full schema:
helm show values oci://ghcr.io/klustrefs/charts/klustre-csi-plugin --version 0.1.1
4. Check status
kubectl get pods -n klustre-system
helm status klustre-csi -n klustre-system
When pods are ready, continue with the validation instructions or deploy a workload that uses the Lustre-backed storage class.
5 - Install with kubectl/manifests
1. Install directly with Kustomize (no clone)
If you just want a default install, you don’t need to clone the repository. You can apply the published manifests directly from GitHub:
export KLUSTREFS_VERSION=main
kubectl apply -k "github.com/klustrefs/klustre-csi-plugin//manifests?ref=$KLUSTREFS_VERSION"
The manifests/ directory includes the namespace, RBAC, CSIDriver, daemonset, node service account, default StorageClass (klustre-csi-static), and settings config map.
2. Work from a local clone (recommended for customization)
If you plan to inspect or customize the manifests, clone the repo and work from a local checkout:
git clone https://github.com/klustrefs/klustre-csi-plugin.git
cd klustre-csi-plugin
You can perform the same default install from the local checkout:
kubectl apply -k manifests
3. Customize with a Kustomize overlay (optional)
To change defaults such as logLevel, nodeImage, or the CSI endpoint path without editing the base files, create a small overlay that patches the settings config map.
Create overlays/my-cluster/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../manifests
patches:
- path: configmap-klustre-csi-settings-patch.yaml
Create overlays/my-cluster/configmap-klustre-csi-settings-patch.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
name: klustre-csi-settings
namespace: klustre-system
data:
logLevel: debug
nodeImage: ghcr.io/klustrefs/klustre-csi-plugin:0.1.1
Then apply your overlay instead of the base:
kubectl apply -k overlays/my-cluster
You can add additional patches in the overlay (for example, to tweak the daemonset or StorageClass) as your cluster needs grow.
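Before applying, you can render the overlay locally or diff it against the live cluster to confirm that only the intended fields change:
kubectl kustomize overlays/my-cluster
kubectl diff -k overlays/my-cluster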
4. Verify rollout
kubectl get pods -n klustre-system -o wide
kubectl describe daemonset klustre-csi-node -n klustre-system
kubectl logs daemonset/klustre-csi-node -n klustre-system -c klustre-csi
After the daemonset is healthy on all Lustre-capable nodes, continue with the validation steps or jump to the sample workload.