Odibi Roadmap¶
Current State (v3.4.3)¶
Last updated: 2026-02-22
| Metric | Value |
|---|---|
| Tests | 4,832 collected, 4,694 passed, 33 skipped, 0 failures |
| Test Coverage | 62% (target: 80%) |
| Transformers | 54 |
| Patterns | 6 (Dimension, Fact, SCD2, Merge, Aggregation, Date Dimension) |
| Engines | 3 (Pandas/DuckDB, Spark, Polars) |
| Open Issues | 18 (3 bugs, 6 docs, 5 enhancements, 4 feature requests) |
| Python Support | 3.9-3.12 |
Coverage by Module (Low Priority = >80%, Medium = 50-80%, High = <50%)¶
Note on Spark Coverage: The 3% for Spark is a CI artifact — Spark tests are skipped in CI (no JVM). Spark is tested via mock-based tests in tests/integration/test_patterns_spark_mock.py and validated in production on Databricks.
| Module | Coverage | Priority | Notes |
|---|---|---|---|
engine/spark_engine.py |
3% | OK | CI-skipped; mock-tested + Databricks validated |
diagnostics/delta.py |
13% | HIGH | Needs mock tests |
catalog_sync.py |
43% | HIGH | |
cli/catalog.py |
46% | HIGH | |
catalog.py |
47% | HIGH | Core module, biggest blind spot |
node.py |
51% | HIGH | Core module, second biggest blind spot |
derived_updater.py |
52% | MEDIUM | |
story/generator.py |
54% | MEDIUM | |
story/doc_generator.py |
55% | MEDIUM | |
engine/polars_engine.py |
57% | MEDIUM | Up from 7% — major improvement |
transformers/merge_transformer.py |
61% | MEDIUM | Up from 36% |
engine/pandas_engine.py |
62% | MEDIUM | Up from 25% — primary engine |
connections/factory.py |
69% | LOW | |
validation/engine.py |
69% | LOW | |
state/__init__.py |
73% | LOW | Up from 33% |
connections/azure_adls.py |
76% | LOW | Up from 10% |
context.py |
77% | LOW | Up from 62% |
connections/api_fetcher.py |
79% | LOW | |
transformers/advanced.py |
79% | LOW | Up from 44% |
pipeline.py |
89% | LOW | |
diagnostics/manager.py |
95% | ✅ | |
graph.py |
98% | ✅ | |
utils/logging_context.py |
98% | ✅ | |
diagnostics/diff.py |
99% | ✅ |
Priority 1: Stability & Defect Triage ✅ LARGELY COMPLETE¶
Goal: Zero critical bugs, reliable cross-engine behavior
1.1 Bug Audit (Feb 2026) — DONE¶
Completed a comprehensive audit filing 46 bug issues (#238–#280). 43 bugs fixed and closed.
Remaining open bugs (3): - [x] ~~All critical and high-priority bugs~~ — Fixed - [ ] #268 SECURITY: MCP discovery tool vulnerable to SQL injection (deferred — MCP not in use) - [ ] #265 SECURITY: MCP execute.py allows arbitrary code execution (deferred — MCP not in use) - [ ] #199 AggregationPattern._load_existing_spark lacks multi-format support (low-pri, depends on #192)
Unlabeled open issue to triage: - [ ] #248 SCD2 Pandas change detection unreliable for float/NaN comparisons
1.2 Test Infrastructure — IN PROGRESS¶
- [x] Run
pytest --cov=odibi --cov-report=htmland identify modules < 80% coverage ✓ - [x] Establish strict 5-point fix checklist (ruff check, ruff format, pytest, no conftest changes, correct @patch targets) ✓
- [ ] Add parametrized tests for transformers across all 3 engines
- [ ] Add edge-case tests: empty DataFrames, null-only columns, Unicode data
- [ ] Increase
catalog.pycoverage from 47% to 80%+ - [ ] Increase
node.pycoverage from 51% to 80%+
Priority 2: Engine Parity ✅ FUNCTIONALLY COMPLETE¶
Goal: Every feature works identically on Pandas, Spark, and Polars
Engine parity is achieved — all features work across all 3 engines. See docs/reference/PARITY_TABLE.md for the full matrix (all ✅).
2.1 Parity Status¶
| Transformer | Pandas | Spark | Polars | Notes |
|---|---|---|---|---|
scd2 |
✓ | ✓ | ✓ | Verified |
pivot |
✓ | ✓ | ✓ | Verified |
window_calculation |
✓ | ✓ | ✓ | Verified |
normalize_json |
✓ | ✓ | ✓ | Verified |
2.2 Ongoing Maintenance¶
Engine parity is now a maintain-as-you-go concern, not a dedicated effort: - When adding a feature to one engine, add the matching implementation to the others - Polars coverage improved from 7% → 57% organically - [ ] #212 Missing Polars branches in some transformers (low-pri) - [ ] Document Spark SQL vs DuckDB SQL differences
Priority 3: Error Diagnostics & CLI¶
Goal: Misconfiguration is obvious, actionable errors
3.1 YAML Validation¶
- [x] Ensure
odibi validate config.yamlcatches ALL common mistakes ✓ - [ ] Add validation for node name format (alphanumeric + underscore only)
- [ ] Add validation for missing
format:in inputs/outputs - [ ] Improve error messages with line numbers and suggestions
3.2 Runtime Diagnostics¶
- [ ] Expand
diagnose_errorMCP tool with more error patterns - [ ] Add
odibi doctorchecks for environment issues - [x] Improve traceback cleaning for node execution errors ✓
Priority 4: Documentation & Examples¶
Goal: Golden path is clear, pitfalls are documented
4.1 Examples¶
- [ ] Verify all examples in
examples/run successfully - [ ] Add end-to-end example for each pattern
- [ ] Add "migration from raw SQL" example
4.2 Pitfall Documentation¶
- [ ] Document anti-patterns (e.g., using hyphens in node names)
- [ ] Document engine-specific gotchas
- [ ] Add troubleshooting section to AI assistant setup guide
Priority 5: Feature Gaps¶
Goal: Cover common use cases, stay focused
5.1 Potential Additions (Evaluate Need First)¶
- [ ]
apply_mappingtransformer for lookup-based value replacement - [ ]
flatten_structfor deeply nested JSON (beyond single level) - [ ]
row_numberas standalone transformer (simpler thanwindow_calculation) - [ ] CDC (Change Data Capture) pattern variant
5.2 NOT Adding¶
- Complex orchestration (use Dagster/Airflow)
- Agent/chat infrastructure (use Amp/Cursor/Cline)
- GUI/web interface
Next Actions¶
Immediate: Core Coverage Gaps¶
- catalog.py (47%) — Core data backbone, biggest blind spot
- System catalog registration, lineage tracking, metadata queries
-
Target: 80%+
-
node.py (51%) — Core execution unit, second biggest blind spot
- Node execution paths, error handling, retry logic
- Target: 80%+
Then: Remaining Coverage¶
- catalog_sync.py (43%) — Catalog synchronization logic
- derived_updater.py (52%) — Derived metric updates
- story/generator.py (54%) — Story generation paths
Completed (previously planned)¶
- ~~Pandas engine tests (25% → 62%)~~ ✅
- ~~Polars engine tests (7% → 57%)~~ ✅
- ~~Merge transformer (36% → 61%)~~ ✅
- ~~Advanced transformers (44% → 79%)~~ ✅
- ~~State management (33% → 73%)~~ ✅
- ~~Review GitHub issues and close stale ones~~ ✅ (43 bugs closed)
Existing Test Infrastructure (Not Reflected in Coverage)¶
The scripts/run_test_campaign.py runs end-to-end validation that pytest coverage doesn't capture:
| Phase | What It Tests | Engine |
|---|---|---|
| Phase 1 | CSV read, Parquet write, schema validation | Pandas |
| Phase 3 | State/HWM persistence | Pandas |
| Phase 4 | Merge pattern (upsert) | Pandas |
| Phase 5 | SCD2 pattern | Pandas |
| Phase 6 | Logical path resolution | Pandas |
| Phase 11 | 10k row scaling | Pandas |
Production validation: Spark engine runs in Databricks - not tested in CI but validated in production.
Documentation Gaps to Address¶
- [x] Add
scripts/run_test_campaign.pytodocs/guides/testing.md✓ - [x] Update
docs/features/engines.mdwith engine-specific testing notes ✓ - [ ] Document Spark/Databricks testing approach in
docs/tutorials/spark_engine.md - [ ] Add "How to run the test campaign" section to AGENTS.md or CONTRIBUTING.md
- [ ] #229 Add engine parity table for transformers
- [ ] #228 CHANGELOG missing entries for recent bug fixes
- [ ] #225 Add Delta Lake troubleshooting section
- [ ] #224 Add tutorial for validation and contracts workflow
- [ ] #223 Add tutorial for delete detection workflow
- [ ] #222 Add tutorial for quarantine/orphan handling workflow
Success Metrics¶
| Goal | Target | Current | Status |
|---|---|---|---|
| Test coverage | 80%+ | 62% | 🔶 In progress (+16% from 46%) |
| CI pass rate | 100% | 100% | ✅ Done |
| Engine parity | 100% | 100% | ✅ Done (all features ✅ across engines) |
| Bug backlog | 0 critical | 0 critical | ✅ Done (3 low-pri/deferred remain) |
| Open issues | < 10 | 18 | 🔶 Reduced from 46+ filed |
Long-term Vision¶
Odibi should be: - Declarative: YAML-first, SQL-based transformations - Portable: Works on laptop (Pandas), cluster (Spark), or serverless (Polars) - Stable: Comprehensive tests, predictable behavior - Documented: AI assistants can generate correct configs without trial and error
The ultimate test: Can you hand a business analyst the docs and have them build a working pipeline without your help?