KubeLedger Installation Guide

Step-by-step procedure to install KubeLedger on Kubernetes and OpenShift.

Prerequisites

Verify Metrics Server

Before installing, ensure the Metrics Server is running:

# Check if metrics-server is deployed
kubectl -n kube-system get deploy | grep metrics-server

# Verify it's working
kubectl top nodes

If not installed:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
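
After applying the manifest, the Metrics Server may need some time before kubectl top nodes returns data. The following is a minimal sketch of a wait step, assuming the upstream manifest's defaults (a Deployment named metrics-server in kube-system). The function name and the 120s default timeout are illustrative and not part of KubeLedger.

```shell
# Hypothetical helper: block until the metrics-server Deployment has rolled out.
# Assumes the Deployment is named "metrics-server" in the "kube-system" namespace.
wait_for_metrics_server() {
  # $1: optional timeout in kubectl duration syntax (default 120s, an arbitrary choice)
  kubectl -n kube-system rollout status deployment/metrics-server \
    --timeout="${1:-120s}"
}
```

For example, run wait_for_metrics_server 180s and then kubectl top nodes to confirm that metrics are being served.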

Installation

Option 1: Kustomize (Quick Start)

This approach deploys KubeLedger with default settings. Review the resources in ./manifests/kubeledger/kustomize/resources/ before applying them.

For advanced customization, use Helm instead.

Default settings:

  • Persistent volume with 1Gi storage request
  • Standard Kubernetes (not OpenShift)
  • Pod runs with UID/GID 4583

# Create namespace and deploy
kubectl create ns kubeledger
kubectl -n kubeledger apply -k ./manifests/kubeledger/kustomize

# Wait for the pod to start
kubectl -n kubeledger get po -w
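
In scripts or CI, a blocking wait can be more convenient than watching the pod list. A minimal sketch, assuming the pods carry the label app=kubeledger (the selector used for logs elsewhere in this guide). The function name and the 180s default are illustrative.

```shell
# Hypothetical helper: wait until the KubeLedger pods report Ready.
# Assumes the pods are labelled app=kubeledger in the kubeledger namespace.
wait_for_kubeledger() {
  # $1: optional timeout (default 180s, an arbitrary choice)
  kubectl -n kubeledger wait pod -l app=kubeledger \
    --for=condition=Ready --timeout="${1:-180s}"
}
```

Depending on the kubectl version, kubectl wait may fail immediately if no matching pod exists yet. In that case, retry after a few seconds.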

Option 2: Helm

# Add the Helm repository
helm repo add kubeledger https://realopslabs.github.io/kubeledger
helm repo update

# Install latest version
helm install kubeledger kubeledger/kubeledger \
  --namespace kubeledger \
  --create-namespace

# Or install a specific version
helm install kubeledger kubeledger/kubeledger \
  --version 1.0.0 \
  --namespace kubeledger \
  --create-namespace

Configuration

Create a values.yaml file to customize the installation. For example:

image:
  repository: ghcr.io/realopslabs/kubeledger
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

persistence:
  enabled: true
  size: 1Gi
  storageClass: ""  # Uses default storage class

dcgm:
  enabled: false
  endpoint: "http://dcgm-exporter.gpu-operator:9400/metrics"

resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"

Common Customizations

Setting                             Description
.dataVolume.persist: false          Use emptyDir for local testing
.dataVolume.persist: true           Use persistent volume (default)
.dataVolume.capacity                Persistent volume size (default: 1Gi)
.dataVolume.storageClass            Storage class name (uses cluster default if unset)
.securityContext.openshift: true    Enable OpenShift mode (binds nonroot-v2 SCC)
.dcgm.enable: true                  Enable DCGM integration for GPU monitoring
.dcgm.endpoint                      DCGM metrics endpoint URL (e.g., http://dcgm-exporter.gpu-operator.svc:9400/metrics)
.resources.requests.cpu             CPU request
.resources.requests.memory          Memory request
.envs                               Environment variables (see Configuration Settings)

Install with custom values:

helm install kubeledger oci://ghcr.io/realopslabs/charts/kubeledger \
  --namespace kubeledger \
  --create-namespace \
  -f values.yaml
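
Before installing, it can help to render the chart locally and inspect what your values produce. A minimal sketch, assuming Helm 3.8 or later (needed for OCI chart references) and the chart location used in this guide. The function name is illustrative.

```shell
# Hypothetical helper: render the chart with your values without installing anything.
# Assumes Helm 3.8+ and the OCI chart reference used in this guide.
render_kubeledger() {
  # $1: path to the values file (default: values.yaml)
  helm template kubeledger oci://ghcr.io/realopslabs/charts/kubeledger \
    --namespace kubeledger \
    -f "${1:-values.yaml}"
}
```

For example, render_kubeledger values.yaml | less lets you page through the generated manifests and check the volume, service, and resource settings.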

OpenShift Installation

# Create project
oc new-project kubeledger

# Install with OpenShift-specific settings
helm install kubeledger oci://ghcr.io/realopslabs/charts/kubeledger \
  --namespace kubeledger \
  --set securityContext.openshift=true

Verify Installation

# Check pod status
kubectl -n kubeledger get pods

# Check events if no pod appears
kubectl -n kubeledger get ev

# Check logs
kubectl -n kubeledger logs -l app=kubeledger

# Port-forward to access the dashboard
kubectl -n kubeledger port-forward svc/kubeledger 5483:80

Open http://localhost:5483 in your browser.
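
The same check can be done from a terminal. A minimal sketch, assuming the port-forward above is still running in another terminal. The function name is illustrative.

```shell
# Hypothetical helper: print the HTTP status code returned by the forwarded dashboard port.
# Assumes the port-forward from this guide is running in another terminal.
dashboard_status() {
  # $1: base URL (default matches the port-forward in this guide)
  curl -s -o /dev/null -w '%{http_code}' "${1:-http://localhost:5483}/"
}
```

A 2xx or 3xx code suggests the dashboard is responding. curl prints 000 when it cannot connect at all, which usually means the port-forward is not running.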


GPU Metrics (Optional)

GPU metrics require the NVIDIA DCGM Exporter. Check that it is deployed:

# Check if DCGM Exporter is running
kubectl get daemonset -A | grep dcgm

If not installed:

helm repo add gpu-helm-charts https://nvidia.github.io/dcgm-exporter/helm-charts
helm install dcgm-exporter gpu-helm-charts/dcgm-exporter \
  --namespace gpu-operator \
  --create-namespace

Enable GPU metrics in KubeLedger:

helm upgrade kubeledger oci://ghcr.io/realopslabs/charts/kubeledger \
  --namespace kubeledger \
  --set dcgm.enabled=true \
  --set dcgm.endpoint=http://dcgm-exporter.gpu-operator:9400/metrics
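
After the upgrade, you may want to confirm which values the release is actually using. A minimal sketch using helm get values, which prints the user-supplied values of a release. The function name is illustrative.

```shell
# Hypothetical helper: print the user-supplied values of the kubeledger release
# so you can confirm that the dcgm settings were applied.
show_kubeledger_values() {
  helm get values kubeledger --namespace kubeledger
}
```

Note that when helm upgrade is given --set or -f, values from earlier installs are not carried over unless --reuse-values is passed. If the release was installed with a custom values.yaml or with securityContext.openshift=true, consider passing those settings again.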

Upgrade

helm repo update
helm upgrade kubeledger kubeledger/kubeledger \
  --namespace kubeledger

Uninstall

# Remove the Helm release
helm uninstall kubeledger -n kubeledger

# Delete the namespace, including any PersistentVolumeClaim left in it
kubectl delete namespace kubeledger

Prometheus Integration

KubeLedger exposes Prometheus metrics at /metrics. Add the following scrape config under scrape_configs in your Prometheus configuration:

- job_name: 'kubeledger'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace]
      regex: kubeledger
      action: keep
    - source_labels: [__meta_kubernetes_pod_label_app]
      regex: kubeledger
      action: keep
    - source_labels: [__address__]
      regex: (.+):.*
      replacement: ${1}:5483
      target_label: __address__
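
The last relabel rule rewrites the port of each discovered pod address to 5483. The sketch below mimics that rewrite with sed to show its effect. It is an illustration only and not how Prometheus evaluates the rule internally. The addresses are made-up sample values.

```shell
# Illustration only: mimic the relabel rule with sed to see how __address__ is rewritten.
# Prometheus anchors the regex to the whole value, so ^ and $ are added here.
rewrite_address() {
  printf '%s\n' "$1" | sed -E 's/^(.+):.*$/\1:5483/'
}

rewrite_address "10.0.0.7:8080"   # prints 10.0.0.7:5483
rewrite_address "10.0.0.7"        # prints 10.0.0.7 (no colon, so the rule does not match)
```

The second call shows a limitation: an address without a port is left unchanged, which can happen when a pod declares no container ports.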

Troubleshooting

Pod stuck in CrashLoopBackOff

# Check logs
kubectl logs -f deployment/kubeledger -n kubeledger

# Verify RBAC permissions
kubectl auth can-i get pods --as=system:serviceaccount:kubeledger:kubeledger
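
When the container restarts quickly, the current logs may be empty, and the logs of the previous container instance are often more useful. A minimal sketch, assuming the Deployment is named kubeledger and the pods are labelled app=kubeledger, as elsewhere in this guide. The function name is illustrative.

```shell
# Hypothetical helper: gather the usual evidence for a crash-looping pod.
# Assumes a Deployment named kubeledger and pods labelled app=kubeledger.
crashloop_report() {
  # Logs from the previous (crashed) container instance
  kubectl -n kubeledger logs deployment/kubeledger --previous
  # Pod events and last termination state (exit code, reason such as OOMKilled)
  kubectl -n kubeledger describe pod -l app=kubeledger
}
```

In the describe output, the Last State and Events sections usually indicate whether the container was killed for exceeding its memory limit or exited on its own.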

No data appearing in dashboard

  • Wait 5-10 minutes for initial data collection
  • Verify the pod can reach the Kubernetes API
  • Confirm Metrics Server is working: kubectl top nodes

Metrics not appearing in Prometheus

  • Ensure the /metrics endpoint is accessible
  • Check ServiceMonitor/PodMonitor configuration if using Prometheus Operator
  • Verify network policies allow Prometheus to scrape the pod
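
To check the first point, you can fetch the endpoint yourself and count the samples it returns. A minimal sketch that counts sample lines in Prometheus text output read from stdin. The function name is illustrative, and the usage below assumes that the port-forward from the Verify Installation section also serves /metrics.

```shell
# Hypothetical helper: count sample lines in Prometheus text exposition read from stdin.
# Comment lines (# HELP, # TYPE) and blank lines are skipped.
count_samples() {
  grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}
```

With the port-forward running, curl -s http://localhost:5483/metrics | count_samples should print a number greater than zero if the endpoint is serving metrics.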

Support