Understand the core architecture and accounting principles of KubeLedger.
KubeLedger is designed to be the System of Record for Kubernetes resource accounting. Unlike transient monitoring tools, it focuses on long-term data retention, precise attribution of “hidden” overhead, and lightweight operation.
Data Collection & The “Ledger” Concept
KubeLedger operates as a highly efficient sampler. It periodically polls the Kubernetes API for CPU, Memory, and GPU usage, then consolidates this data into Round Robin Databases (RRD).
This design allows KubeLedger to maintain 12 months of historical data with a negligible memory footprint (<100MB), adhering to HPC (High-Performance Computing) efficiency principles.
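The constant footprint follows from the round-robin idea itself: an archive has a fixed number of slots, and once it is full each new point overwrites the oldest one. The sketch below illustrates that concept in plain Python. It is not how RRDtool lays out its files, and the `RoundRobinArchive` class and the choice of 8,760 hourly slots are assumptions made for the example.

```python
from collections import deque

HOURS_PER_YEAR = 365 * 24  # assumed slot count: one point per hour for ~12 months


class RoundRobinArchive:
    """Hypothetical fixed-size archive: when full, new points evict the oldest."""

    def __init__(self, slots: int) -> None:
        # deque(maxlen=...) silently drops the oldest item on overflow.
        self.points = deque(maxlen=slots)

    def update(self, timestamp: int, value: float) -> None:
        self.points.append((timestamp, value))


archive = RoundRobinArchive(slots=HOURS_PER_YEAR)

# Write 100 more hourly points than the archive can hold.
for hour in range(HOURS_PER_YEAR + 100):
    archive.update(timestamp=hour * 3600, value=0.5)

print(len(archive.points))   # size stays at the slot count
print(archive.points[0][0])  # oldest surviving timestamp
```

However long the sampler runs, the archive never grows beyond its slot count, which is why retention length and storage size can be decided up front.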
Key Design Pillars
- Namespace-as-Tenant: The fundamental unit of accounting is the Namespace. KubeLedger treats namespaces as distinct tenants to calculate resource shares, making it ideal for multi-tenant clusters.
- Revealing Hidden Costs (Non-Allocatable): Standard monitoring often ignores the “tax” of running Kubernetes itself (OS overhead, Kubelet, eviction thresholds). KubeLedger explicitly accounts for Non-Allocatable Capacities, ensuring that 100% of the cluster cost is visible, not just the portion used by pods.
- Hourly Resolution & Trends: Resource consumption is consolidated on an hourly basis. This granular approach captures usage spikes that daily averages might miss, providing a fair basis for cost allocation and identifying trends.
- Long-Term Retention: Data is pre-aggregated into Daily and Monthly views. This allows for instant retrieval of yearly reports without expensive query computation.
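To make the first two pillars concrete, the sketch below computes per-namespace shares of a node's CPU while keeping the non-allocatable portion as its own line item. It assumes the overhead can be taken as capacity minus allocatable. The function name, the `non-allocatable` label, and the sample numbers are hypothetical and do not come from KubeLedger's code.

```python
def namespace_shares(
    capacity: float,
    allocatable: float,
    usage_by_namespace: dict[str, float],
) -> dict[str, float]:
    """Return each entry's share of total capacity, as a percentage."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    entries = dict(usage_by_namespace)
    # Assumption: overhead is the gap between raw capacity and allocatable.
    entries["non-allocatable"] = capacity - allocatable
    return {name: 100.0 * used / capacity for name, used in entries.items()}


# A 16-core node of which 14 cores are allocatable to pods.
shares = namespace_shares(
    capacity=16.0,
    allocatable=14.0,
    usage_by_namespace={"team-a": 6.0, "team-b": 3.0},
)

for name, percent in shares.items():
    print(f"{name}: {percent:.2f}%")
```

Dividing by raw capacity rather than by allocatable capacity is what keeps the overhead visible: the two reserved cores show up as their own share instead of disappearing from the denominator.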
Usage Accounting Models
KubeLedger supports different accounting models to address various business needs, from simple capacity planning to strict financial chargeback.
The accounting model is defined at startup using the KL_COST_MODEL environment variable.
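As an illustration of startup-time selection, the sketch below reads `KL_COST_MODEL` from the environment, falls back to the documented default, and rejects unknown values. The `read_cost_model` helper and its validation behavior are assumptions made for the example; KubeLedger's own handling of invalid values may differ.

```python
import os
from collections.abc import Mapping

# The three values documented in this section.
VALID_MODELS = {"CUMULATIVE_RATIO", "RATIO", "CHARGE_BACK"}
DEFAULT_MODEL = "CUMULATIVE_RATIO"


def read_cost_model(env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: resolve the accounting model once at startup."""
    model = env.get("KL_COST_MODEL", DEFAULT_MODEL)
    if model not in VALID_MODELS:
        raise ValueError(f"unsupported KL_COST_MODEL: {model!r}")
    return model


print(read_cost_model({}))                           # falls back to the default
print(read_cost_model({"KL_COST_MODEL": "RATIO"}))   # explicit selection
```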
1. Cumulative Usage Model (Default)
Setting: KL_COST_MODEL=CUMULATIVE_RATIO
- How it works: Computes costs based on the accumulated resource usage (e.g., Core-Hours or GiB-Hours) over a period.
- Best for: Capacity Planning & Efficiency Analysis.
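A minimal sketch of the accumulation, assuming each sample is the average usage over one consolidation interval. The function name and sample values are hypothetical.

```python
def cumulative_usage(samples: list[float], interval_hours: float = 1.0) -> float:
    """Sum average usage per interval into resource-hours (e.g. core-hours)."""
    return sum(value * interval_hours for value in samples)


# Average CPU cores used by one namespace over four consecutive hours.
cpu_cores_per_hour = [0.5, 2.0, 1.5, 0.0]

core_hours = cumulative_usage(cpu_cores_per_hour)
print(core_hours)  # total core-hours over the four-hour window
```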
2. Normalized Ratio Model (Showback)
Setting: KL_COST_MODEL=RATIO
- How it works: Computes costs as a percentage (%) of the total cluster resources available during the period.
- Best for: Fair Share Showback.
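One way to read this model is as cumulative usage divided by the resource-hours the cluster offered over the same period. The sketch below follows that reading. The exact formula KubeLedger applies is not stated here, so the function and numbers are assumptions.

```python
def usage_ratio_percent(used_hours: float, available_hours: float) -> float:
    """Express usage as a percentage of what the cluster offered in the period."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return 100.0 * used_hours / available_hours


# A namespace used 36 core-hours on a 16-core cluster over one day.
available_core_hours = 16 * 24
ratio = usage_ratio_percent(used_hours=36.0, available_hours=available_core_hours)
print(f"{ratio:.3f}%")
```

Because the result is normalized by cluster size, the same percentage stays comparable across periods in which nodes were added or removed.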
3. Chargeback Model (Invoicing)
Setting: KL_COST_MODEL=CHARGE_BACK
- How it works: Computes actual monetary costs by multiplying the cumulative usage by a user-defined hourly rate (configured via KL_HOURLY_BILLING_RATE).
- Best for: Internal Invoicing & Cloud Bill Allocation.
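The multiplication can be sketched as follows, assuming the rate is expressed per resource-hour (for example, per core-hour). That unit, the `charge` helper, and the sample rate are assumptions; check how `KL_HOURLY_BILLING_RATE` is interpreted in your deployment before relying on the numbers.

```python
from decimal import Decimal


def charge(cumulative_usage: Decimal, hourly_rate: Decimal) -> Decimal:
    """Hypothetical chargeback: cumulative usage multiplied by the hourly rate."""
    if hourly_rate < 0:
        raise ValueError("hourly_rate must not be negative")
    return cumulative_usage * hourly_rate


# 36 core-hours billed at an assumed rate of 0.04 per core-hour.
invoice = charge(Decimal("36"), Decimal("0.04"))
print(invoice)
```

`Decimal` is used instead of `float` so that monetary amounts do not pick up binary rounding errors.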
Architecture Efficiency
KubeLedger is designed to be “set and forget,” leveraging a streamlined stack for maximum stability and minimal resource usage.
- Efficient Backend (Python): The core logic is implemented in Python, orchestrating data collection and RRDtool interactions with precision.
- Lightweight Visualization (D3.js): The default dashboard is built with pure JavaScript and D3.js, ensuring fast rendering of complex financial data without the bloat of heavy frontend frameworks.
- Stateless Persistence: No complex external database is required (RRD files are stored locally or on a PVC).
- Zero External Dependencies: Everything needed to run (including the embedded database engine) is contained in the container image.
