Odibi Philosophy
The core principles that guide Odibi's design and evolution.
Why Odibi Exists
Odibi was built for data engineers who work alone or in small teams: people who need to move fast without sacrificing quality. It encodes best practices into reusable patterns so you can focus on solving business problems, not reinventing infrastructure.
Core Belief: Data pipelines should be declared, not coded. You describe what you want; the framework handles how.
Non-Negotiable Principles
- Declarative over imperative: YAML is the source of truth
- Pydantic models, not dicts: Type safety at the boundary
- Explicit dependencies over implicit behavior: No magic
- Composition over inheritance: Build from proven patterns
- Fail-fast validation: Catch errors early, with context
- Performance awareness over cleverness: Simple scales
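As a rough illustration of "type safety at the boundary", "explicit dependencies", and "fail-fast validation", the sketch below parses an untyped dict (as it might arrive from a YAML file) into typed nodes and rejects bad input immediately, with context. It uses standard-library dataclasses as a dependency-free stand-in for Pydantic. The names `NodeConfig`, `ConfigError`, `parse_pipeline`, and the `kind` values are hypothetical and are not Odibi's actual schema.

```python
from dataclasses import dataclass

# Hypothetical node kinds for illustration; Odibi's real schema may differ.
ALLOWED_KINDS = {"read", "transform", "write"}


class ConfigError(ValueError):
    """Raised as soon as a declared pipeline is found to be invalid."""


@dataclass(frozen=True)
class NodeConfig:
    """Typed view of one declared node (stand-in for a Pydantic model)."""
    name: str
    kind: str
    depends_on: tuple = ()

    def __post_init__(self):
        # Fail fast: validate at construction time, not at run time.
        if not self.name:
            raise ConfigError("node name must be a non-empty string")
        if self.kind not in ALLOWED_KINDS:
            raise ConfigError(
                f"node '{self.name}': unknown kind '{self.kind}', "
                f"expected one of {sorted(ALLOWED_KINDS)}"
            )


def parse_pipeline(raw: dict) -> list:
    """Turn an untyped dict (e.g. loaded from YAML) into typed nodes."""
    nodes = [
        NodeConfig(
            name=item.get("name", ""),
            kind=item.get("kind", ""),
            depends_on=tuple(item.get("depends_on", ())),
        )
        for item in raw.get("nodes", [])
    ]
    known = {node.name for node in nodes}
    for node in nodes:
        # Explicit dependencies: every reference must name a declared node.
        for dep in node.depends_on:
            if dep not in known:
                raise ConfigError(
                    f"node '{node.name}': depends on undeclared node '{dep}'"
                )
    return nodes
```

Calling `parse_pipeline` with a node whose `kind` is not in `ALLOWED_KINDS`, or whose `depends_on` names a node that was never declared, raises `ConfigError` before any data is touched.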
The Five Laws
These laws guide every design decision:
- Robots Remember, Humans Forget: Checkpoint bookkeeping, not manual tracking. State is managed automatically.
- Raw is Sacred: Append-only, immutable. Never destroy original data.
- Rebuild the Bucket, Don't Patch the Hole: Reprocess windows, don't patch aggregates. When something breaks, rebuild cleanly.
- One-off logic is a smell: If you do it twice, canonize it. Patterns emerge from repetition.
- Proven patterns should be reused forever: Don't invent without evidence. Build on what works.
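The second and third laws can be sketched in a few lines of plain Python: raw records are only ever appended, and when an aggregate is wrong (for example, because a record arrived late) the affected window is recomputed from raw instead of being adjusted in place. The function names and the record shape (`day`, `amount`) are assumptions made for this example, not Odibi's API.

```python
from collections import defaultdict
from datetime import date


def append_raw(raw_log: list, records: list) -> None:
    """Raw is append-only: new records are added, existing ones never edited."""
    raw_log.extend(records)


def rebuild_window(raw_log: list, daily_totals: dict, start: date, end: date) -> dict:
    """Recompute every daily total in [start, end] from the raw records.

    Aggregates outside the window are carried over untouched; aggregates
    inside the window are discarded and rebuilt, never patched.
    """
    rebuilt = defaultdict(float)
    for rec in raw_log:
        if start <= rec["day"] <= end:
            rebuilt[rec["day"]] += rec["amount"]

    # Keep totals outside the window, drop everything inside it.
    result = {
        day: total
        for day, total in daily_totals.items()
        if not (start <= day <= end)
    }
    result.update(rebuilt)
    return result
```

A late record for a given day is handled by appending it to the raw log and calling `rebuild_window` for that day; the total is derived again from all raw records rather than incremented by hand.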
The Chimera Mindset
Odibi evolves through a continuous cycle:
We don't invent abstractions without proof. Features graduate to the framework only when real pipelines demonstrate the need.
How Features Get Added
- Build it only if 2+ real use cases show the need
- Reuse it if a pattern already exists
- Discard it if no one has used it in 3+ months
- Propose it with evidence (real examples), not speculation
Engine Parity Rule
If Pandas has it, Spark and Polars must too. No exceptions.
All three engines must produce identical results for the same input. You can develop locally with Pandas and deploy to production with Spark: same YAML config, same behavior.
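One way to picture the parity rule is as a test harness: run the same declared operation through every engine, normalize the outputs, and fail if any engine disagrees. The sketch below is only an illustration. To stay dependency-free it uses two plain-Python functions as stand-ins for real engines, and `check_parity`, `ParityError`, and the `filter_gt` operation are hypothetical names, not part of Odibi.

```python
class ParityError(Exception):
    """Raised when two engines return different results for the same input."""


def normalize(rows: list) -> list:
    """Make results comparable: ignore row order, round floats."""
    return sorted(
        tuple(sorted(
            (key, round(value, 9) if isinstance(value, float) else value)
            for key, value in row.items()
        ))
        for row in rows
    )


def check_parity(engines: dict, operation: dict, rows: list) -> None:
    """Run one declared operation on every engine and compare the outputs."""
    results = {
        name: normalize(run(operation, rows)) for name, run in engines.items()
    }
    baseline_name, baseline = next(iter(results.items()))
    for name, result in results.items():
        if result != baseline:
            raise ParityError(
                f"engine '{name}' disagrees with '{baseline_name}' on {operation}"
            )


# Two stand-in "engines" implementing the same declared filter.
def engine_loop(operation: dict, rows: list) -> list:
    kept = []
    for row in rows:
        if row[operation["column"]] > operation["value"]:
            kept.append(row)
    return kept


def engine_comprehension(operation: dict, rows: list) -> list:
    # Returns rows in reverse order to show that ordering is normalized away.
    return [r for r in reversed(rows) if r[operation["column"]] > operation["value"]]
```

In a real setup the stand-ins would be replaced by adapters that execute the operation in Pandas, Spark, and Polars and return rows as dicts; the comparison logic would stay the same.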
What Odibi Is
| Odibi Is | Description |
|---|---|
| A pipeline framework | Declarative YAML → executed pipelines |
| Multi-engine | Pandas, Spark, Polars with identical behavior |
| Pattern-driven | Dimension, Fact, SCD2, Aggregation built-in |
| Self-documenting | Every run generates a "Data Story" audit report |
| Quality-focused | Validation, contracts, quarantine, gates |
What Odibi Is NOT
| Odibi Is Not | Use Instead |
|---|---|
| A scheduler | Airflow, Databricks Workflows, Prefect |
| A BI tool | PowerBI, Tableau, Looker |
| A data catalog | Unity Catalog, DataHub, Amundsen |
| An agent/chat framework | LangChain, CrewAI, custom agents |
Contributing
Want to contribute? Read CONTRIBUTING.md first.
Remember: Propose features with evidence from real use cases, not speculation. The Chimera Mindset applies to contributions too.