Odibi Roadmap

Current State (v3.4.3)

Last updated: 2026-04-29

| Metric | Value |
| --- | --- |
| Tests | 4,832+ collected (expanding; see AGENTS.md for per-module counts) |
| Test Coverage | 80% (34,363 stmts, 6,854 missed) |
| Transformers | 54 |
| Patterns | 6 (Dimension, Fact, SCD2, Merge, Aggregation, Date Dimension) |
| Engines | 3 (Pandas/DuckDB, Spark, Polars) |
| Open Issues | 18 (3 bugs, 6 docs, 5 enhancements, 4 feature requests) |
| Python Support | 3.9-3.12 |
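
As a quick sanity check on the coverage row above, the percentage can be recomputed from the statement counts. This is a minimal sketch; `coverage_percent` is a hypothetical helper name and not part of odibi.

```python
def coverage_percent(statements: int, missed: int) -> float:
    """Return statement coverage as a percentage of executed statements."""
    if statements <= 0:
        raise ValueError("statements must be positive")
    if not 0 <= missed <= statements:
        raise ValueError("missed must be between 0 and statements")
    return 100.0 * (statements - missed) / statements


# Figures taken from the table above: 34,363 statements, 6,854 missed.
pct = coverage_percent(34_363, 6_854)
print(f"{pct:.2f}%")  # 80.05%
```

The result rounds to the 80% reported in the table.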

Coverage by Module (Low Priority = >80%, Medium = 50-80%, High = <50%)

Note on Spark Coverage: The 4% for Spark is a CI artifact — Spark tests are skipped in CI (no JVM). Spark is tested via mock-based tests in tests/integration/test_patterns_spark_mock.py and validated in production on Databricks.
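
The split described in the note (mock-based tests that run everywhere, real Spark tests that are skipped without a JVM) might look roughly like the sketch below. It uses only the standard library's `unittest` so it stays self-contained. `count_active_rows` and the `active` column are hypothetical stand-ins, and detecting a JVM by looking for `java` on the PATH is an assumption, not necessarily how odibi's test suite decides.

```python
import importlib.util
import shutil
import unittest
from unittest.mock import MagicMock


def count_active_rows(df):
    """Hypothetical engine helper: count rows where `active` is true."""
    return df.filter("active = true").count()


# Assumption: a usable Spark setup means `java` is on PATH and pyspark is installed.
HAS_SPARK = (
    shutil.which("java") is not None
    and importlib.util.find_spec("pyspark") is not None
)


class CountActiveRowsMockTest(unittest.TestCase):
    """Runs in any environment: the DataFrame is a mock, so no JVM is needed."""

    def test_filter_then_count(self):
        df = MagicMock(name="spark_df")
        df.filter.return_value.count.return_value = 3

        self.assertEqual(count_active_rows(df), 3)
        df.filter.assert_called_once_with("active = true")


@unittest.skipUnless(HAS_SPARK, "no JVM/pyspark available; real Spark test skipped")
class CountActiveRowsSparkTest(unittest.TestCase):
    """Skipped in JVM-less CI, which is why coverage tools report it as unexecuted."""

    def test_with_local_session(self):
        from pyspark.sql import SparkSession  # imported lazily on purpose

        spark = SparkSession.builder.master("local[1]").getOrCreate()
        try:
            df = spark.createDataFrame([(True,), (False,), (True,)], ["active"])
            self.assertEqual(count_active_rows(df), 2)
        finally:
            spark.stop()


if __name__ == "__main__":
    unittest.main()
```

The mock test verifies the call sequence rather than Spark's behavior, so it complements production validation and does not replace it.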

| Module | Coverage | Priority | Notes |
| --- | --- | --- | --- |
| `engine/spark_engine.py` | 4% | OK | CI-skipped; mock-tested + Databricks validated |
| `diagnostics/delta.py` | 13% | OK | Hard skip: Delta/Spark-bound |
| `pipeline.py` | 20% | HIGH | Utility methods covered; run() + PipelineManager uncovered (~1000 lines) |
| `transformers/scd.py` | 49% | HIGH | Pandas/DuckDB paths covered; Spark/Delta MERGE uncovered |
| `patterns/date_dimension.py` | 57% | MEDIUM | Pandas paths covered; Spark paths remaining |
| `engine/polars_engine.py` | 57% | MEDIUM | Up from 7%: major improvement |
| `transformers/merge_transformer.py` | 61% | MEDIUM | Pandas/DuckDB merge covered; Spark uncovered |
| `patterns/aggregation.py` | 63% | MEDIUM | validate + Pandas aggregate covered; Spark paths remaining |
| `transformers/delete_detection.py` | 64% | MEDIUM | Pandas paths covered; Spark skipped |
| `validation/engine.py` | 67% | MEDIUM | Bug fixed; 11 test types × Pandas + Polars; Spark `_validate_spark` remaining |
| `writers/sql_server_writer.py` | 67% | MEDIUM | 208 tests; all non-Spark paths covered |
| `transformers/manufacturing.py` | 67% | MEDIUM | Polars paths fully covered; Spark remaining |
| `transformers/relational.py` | 67% | MEDIUM | Join/union/pivot covered; Spark paths remaining |
| `validation/quarantine.py` | 69% | MEDIUM | Pandas + Polars covered; Spark remaining |
| `patterns/dimension.py` | 70% | MEDIUM | SCD0/1/2 Pandas covered; Spark remaining |
| `transformers/thermodynamics.py` | 71% | MEDIUM | CoolProp Pandas/Polars tested; Spark UDF remaining |
| `derived_updater.py` | 74% | LOW | Up from 52%; Pandas lifecycle + SQL Server covered; Spark remaining |
| `engine/pandas_engine.py` | 74% | LOW | Up from 62%; 118 new tests; Delta maintenance ops remaining |
| `patterns/fact.py` | 74% | LOW | Dimension lookups + quarantine covered; Spark remaining |
| `context.py` | 77% | LOW | Up from 62% |
| `transformers/units.py` | 78% | LOW | Pandas + Polars covered; Spark UDF remaining |
| `transformers/advanced.py` | 80% | ✅ | Up from 44%; Pandas sessionize/split fully covered |
| `state/__init__.py` | 80% | ✅ | Up from 33%; all backends + factories + sync covered |
| `catalog_sync.py` | 81% | ✅ | Up from 43%; 88 tests; SQL Server + Delta sync covered |
| `connections/factory.py` | 81% | ✅ | Up from 69%; all factory functions + auth auto-detection |
| `node.py` | 81% | ✅ | Up from 51%; 89 tests; execution paths + error handling covered |
| `story/generator.py` | 83% | ✅ | Up from 54%; 100 tests; generate + cleanup + remote paths |
| `catalog.py` | 85% | ✅ | Up from 47%; 126 tests; Pandas + Spark branches via mocks |
| `introspect.py` | 88% | ✅ | 55 tests; all rendering + module discovery covered |
| `connections/azure_adls.py` | 89% | ✅ | Up from 10%; 126 total tests; discovery + auth covered |
| `cli/system.py` | 89% | ✅ | 42 tests; all commands + helper functions covered |
| `story/doc_generator.py` | 90% | ✅ | Up from 55%; 117 tests; all rendering helpers covered |
| `connections/azure_sql.py` | 93% | ✅ | 87 tests; all methods + auth modes covered |
| `transformers/sql_core.py` | 93% | ✅ | 31 tests; 16 SQL transformers covered |
| `connections/api_fetcher.py` | 94% | ✅ | Up from 79%; 126 total tests; all pagination strategies |
| `cli/catalog.py` | 94% | ✅ | Up from 46%; 77 tests; all query commands + sync covered |
| `validation/gate.py` | 94% | ✅ | 32 tests; evaluate_gate + row count checks covered |
| `utils/setup_helpers.py` | 94% | ✅ | 22 tests; KeyVault + parallel connections covered |
| `diagnostics/manager.py` | 95% | ✅ | |
| `lineage.py` | 96% | ✅ | 71 tests; OpenLineageAdapter + LineageTracker fully covered |
| `connections/postgres.py` | 96% | ✅ | 85 tests; all methods + error suggestions covered |
| `cli/story.py` | 96% | ✅ | 44 tests; all commands covered |
| `cli/test.py` | 97% | ✅ | 41 tests; slugify + run_test_case + test_command loop |
| `cli/init_pipeline.py` | 97% | ✅ | 38 tests; interactive prompts + force overwrite |
| `cli/list_cmd.py` | 97% | ✅ | 41 tests; all list/explain commands |
| `cli/main.py` | 98% | ✅ | 70 tests; all commands + scaffold dispatch |
| `connections/local.py` | 98% | ✅ | 64 tests; all methods covered |
| `story/lineage.py` | 98% | ✅ | 70 tests; LineageGenerator fully covered |
| `graph.py` | 98% | ✅ | |
| `utils/logging_context.py` | 98% | ✅ | |
| `diagnostics/diff.py` | 99% | ✅ | |
| `tools/adf_profiler.py` | 99% | ✅ | 70 tests; all REST API + report generation covered |
| `utils/progress.py` | 100% | ✅ | 60 tests; Rich + plain text paths |
| `cli/lineage.py` | 100% | ✅ | 27 tests; all functions fully covered |
| `cli/schema.py` | 100% | ✅ | 27 tests; all functions fully covered |
| `story/lineage_utils.py` | 100% | ✅ | 62 tests; all 4 functions fully covered |
| `testing/source_pool.py` | 100% | ✅ | 44 tests; all Pydantic models + validators |
| `transformers/__init__.py` | 100% | ✅ | register_standard_library fully tested |
| `transformers/validation.py` | 100% | ✅ | |
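
A per-module list like the one above can be pulled from a coverage JSON report instead of being read off the HTML report by hand. The sketch below assumes the report layout written by coverage.py's JSON reporter (`files` → path → `summary` → `percent_covered`). The `modules_below` helper, the `odibi/` path prefix, and the sample data are illustrative only.

```python
from typing import List, Tuple


def modules_below(report: dict, threshold: float = 80.0) -> List[Tuple[str, float]]:
    """Return (path, percent) pairs under `threshold`, lowest coverage first."""
    rows = [
        (path, info["summary"]["percent_covered"])
        for path, info in report.get("files", {}).items()
    ]
    return sorted((row for row in rows if row[1] < threshold), key=lambda row: row[1])


# Hand-written sample mirroring three rows of the table; a real report would be
# loaded with json.load() from the file produced by the coverage run.
sample_report = {
    "files": {
        "odibi/graph.py": {"summary": {"percent_covered": 98.0}},
        "odibi/transformers/scd.py": {"summary": {"percent_covered": 49.0}},
        "odibi/pipeline.py": {"summary": {"percent_covered": 20.0}},
    }
}

for path, pct in modules_below(sample_report):
    print(f"{path}: {pct:.0f}%")
```

Sorting by ascending coverage puts the highest-priority gaps first, matching the ordering of the table.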

Priority 1: Stability & Defect Triage ✅ LARGELY COMPLETE

Goal: Zero critical bugs, reliable cross-engine behavior

1.1 Bug Audit (Feb 2026): DONE

Completed a comprehensive audit filing 46 bug issues (#238–#280). 43 bugs fixed and closed.

Remaining open bugs (3):

- [x] ~~All critical and high-priority bugs~~: fixed
- [ ] #268 SECURITY: MCP discovery tool vulnerable to SQL injection (deferred; MCP not in use)
- [ ] #265 SECURITY: MCP execute.py allows arbitrary code execution (deferred; MCP not in use)
- [ ] #199 AggregationPattern._load_existing_spark lacks multi-format support (low-pri, depends on #192)

Unlabeled open issue to triage:

- [ ] #248 SCD2 Pandas change detection unreliable for float/NaN comparisons
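
The float/NaN problem named in #248 comes down to two facts: `NaN != NaN` is always true under IEEE 754, and floats that should be equal can differ in their last bits. A minimal NaN-safe, tolerance-aware comparison might look like the sketch below. `values_changed` is a hypothetical helper shown only to illustrate the issue; it is not odibi's SCD2 implementation, and the tolerance default is an assumption.

```python
import math
from typing import Any


def _is_nan(value: Any) -> bool:
    return isinstance(value, float) and math.isnan(value)


def values_changed(old: Any, new: Any, rel_tol: float = 1e-9) -> bool:
    """Return True only when `old` and `new` differ in a meaningful way."""
    if _is_nan(old) and _is_nan(new):
        return False  # two missing values: no change
    if _is_nan(old) or _is_nan(new):
        return True  # a value appeared or disappeared
    if isinstance(old, float) and isinstance(new, float):
        return not math.isclose(old, new, rel_tol=rel_tol)
    return old != new


nan = float("nan")
print(nan != nan)                     # True: a naive comparison flags a change
print(values_changed(nan, nan))       # False
print(values_changed(0.1 + 0.2, 0.3)) # False: differs only by rounding error
print(values_changed(1.0, 2.0))       # True
```

With a naive `!=`, every row holding a NaN in a tracked column would look changed on every run and produce a new SCD2 version each time.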

1.2 Test Infrastructure: ✅ TARGET REACHED (80%)

[x] Run pytest --cov=odibi --cov-report=html and identify modules < 80% coverage ✓
[x] Establish strict 5-point fix checklist (ruff check, ruff format, pytest, no conftest changes, correct @patch targets) ✓
[ ] Add parametrized tests for transformers across all 3 engines
[ ] Add edge-case tests: empty DataFrames, null-only columns, Unicode data
[x] Increase catalog.py coverage from 47% to 80%+ ✓ (now 85%, 126 tests)
[x] Increase node.py coverage from 51% to 80%+ ✓ (now 81%, 89 tests)
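
The two open items above (cross-engine parametrization and edge-case inputs) can share one table of cases. The sketch below is deliberately dependency-free: the two `upper_names_*` functions are hypothetical stand-ins for engine-specific implementations of a single transformer, and rows are plain dicts instead of Pandas, Spark, or Polars DataFrames.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


def upper_names_loop(rows: List[Row]) -> List[Row]:
    """Stand-in for one engine's implementation (explicit loop)."""
    out = []
    for row in rows:
        name = row.get("name")
        out.append({**row, "name": name.upper() if name is not None else None})
    return out


def upper_names_comprehension(rows: List[Row]) -> List[Row]:
    """Stand-in for a second engine's implementation (comprehension)."""
    return [
        {**r, "name": None if r.get("name") is None else r["name"].upper()}
        for r in rows
    ]


ENGINES: Dict[str, Callable[[List[Row]], List[Row]]] = {
    "engine_a": upper_names_loop,
    "engine_b": upper_names_comprehension,
}

# (label, input rows, expected rows); escapes keep the Unicode case explicit.
EDGE_CASES = [
    ("empty input", [], []),
    ("null-only column", [{"name": None}, {"name": None}],
     [{"name": None}, {"name": None}]),
    ("unicode data", [{"name": "zo\u00eb"}, {"name": "\u6771\u4eac"}],
     [{"name": "ZO\u00cb"}, {"name": "\u6771\u4eac"}]),
]


def check_parity() -> int:
    """Run every edge case against every engine; return the number of checks."""
    checked = 0
    for engine_name, transform in ENGINES.items():
        for label, rows, expected in EDGE_CASES:
            result = transform(rows)
            assert result == expected, f"{engine_name} failed on {label}: {result!r}"
            checked += 1
    return checked


print(check_parity())  # 6
```

Under pytest, the same `ENGINES` and `EDGE_CASES` tables could feed `pytest.mark.parametrize`, so each engine and case pair is reported as its own test.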

Priority 2: Engine Parity ✅ FUNCTIONALLY COMPLETE

Goal: Every feature works identically on Pandas, Spark, and Polars

Engine parity is achieved — all features work across all 3 engines. See docs/reference/PARITY_TABLE.md for the full matrix (all ✅).

2.1 Parity Status

| Transformer | Pandas | Spark | Polars | Notes |
| --- | --- | --- | --- | --- |
| `scd2` | ✓ | ✓ | ✓ | Verified |
| `pivot` | ✓ | ✓ | ✓ | Verified |
| `window_calculation` | ✓ | ✓ | ✓ | Verified |
| `normalize_json` | ✓ | ✓ | ✓ | Verified |

2.2 Ongoing Maintenance

Engine parity is now a maintain-as-you-go concern, not a dedicated effort:

- When adding a feature to one engine, add the matching implementation to the others
- Polars coverage improved from 7% → 57% organically
- [ ] #212 Missing Polars branches in some transformers (low-pri)
- [ ] Document Spark SQL vs DuckDB SQL differences

Priority 3: Error Diagnostics & CLI

Goal: Misconfiguration is obvious, and errors are actionable

3.1 YAML Validation

[x] Ensure odibi validate config.yaml catches ALL common mistakes ✓
[ ] Add validation for node name format (alphanumeric + underscore only)
[ ] Add validation for missing format: in inputs/outputs
[ ] Improve error messages with line numbers and suggestions
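
The node-name rule and the "suggestions" item above could be combined in one check. This is a minimal sketch: `check_node_name` is a hypothetical helper, and reading "alphanumeric" as ASCII letters and digits is an assumption, not something confirmed by odibi's validator.

```python
import re
from typing import Optional

# Assumption: "alphanumeric + underscore" means ASCII letters, digits, and "_".
_VALID_NODE_NAME = re.compile(r"[A-Za-z0-9_]+")
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def check_node_name(name: str) -> Optional[str]:
    """Return None for a valid name, otherwise an actionable error message."""
    if _VALID_NODE_NAME.fullmatch(name):
        return None
    message = (
        f"Invalid node name {name!r}: use only letters, digits, and underscores."
    )
    suggestion = _INVALID_CHARS.sub("_", name)
    if suggestion:
        message += f" Did you mean {suggestion!r}?"
    return message


print(check_node_name("load_orders"))  # None
print(check_node_name("load-orders"))
```

The second call reports the hyphen and proposes `'load_orders'`, which is the kind of error the goal for this priority describes.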

3.2 Runtime Diagnostics

[ ] Expand diagnose_error MCP tool with more error patterns
[ ] Add odibi doctor checks for environment issues
[x] Improve traceback cleaning for node execution errors ✓

Priority 4: Documentation & Examples

Goal: Golden path is clear, pitfalls are documented

4.1 Examples

[ ] Verify all examples in examples/ run successfully
[ ] Add end-to-end example for each pattern
[ ] Add "migration from raw SQL" example

4.2 Pitfall Documentation

[ ] Document anti-patterns (e.g., using hyphens in node names)
[ ] Document engine-specific gotchas
[ ] Add troubleshooting section to AI assistant setup guide

Priority 5: Feature Gaps

Goal: Cover common use cases, stay focused

5.1 Potential Additions (Evaluate Need First)

[ ] apply_mapping transformer for lookup-based value replacement
[ ] flatten_struct for deeply nested JSON (beyond single level)
[ ] row_number as standalone transformer (simpler than window_calculation)
[ ] CDC (Change Data Capture) pattern variant
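
To make the `flatten_struct` candidate concrete, the sketch below shows one possible meaning of "beyond single level": recursively flattening nested mappings into separator-joined column names. The function signature, the `_` separator, and the choice to leave lists untouched are all assumptions for illustration; no such transformer exists yet.

```python
from typing import Any, Dict


def flatten_struct(
    record: Dict[str, Any], sep: str = "_", prefix: str = ""
) -> Dict[str, Any]:
    """Flatten nested dicts to any depth; non-dict values are kept as-is."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested structs, carrying the path as prefix.
            flat.update(flatten_struct(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


record = {
    "id": 1,
    "address": {"city": "Oslo", "geo": {"lat": 59.9, "lon": 10.7}},
    "tags": ["a", "b"],
}
print(flatten_struct(record))
```

For the sample record this yields the keys `id`, `address_city`, `address_geo_lat`, `address_geo_lon`, and `tags`. Key collisions (for example an existing `address_city` column) are not handled and would need a policy before this became a real transformer.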

5.2 NOT Adding

Complex orchestration (use Dagster/Airflow)
Agent/chat infrastructure (use Amp/Cursor/Cline)
GUI/web interface

Priority 6: Agent Hardening Campaign

Goal: Ensure AI agents produce correct, first-attempt code when working on odibi

See docs/AGENT_CAMPAIGN.md for the full campaign plan: structured prompts, validation tasks, and acceptance criteria for agent reliability.

Next Actions

Remaining Coverage Gaps (Spark paths dominate)

Most modules below 80% are there because Spark-specific branches are untested in CI (no JVM). The remaining coverage work is primarily:

pipeline.py (20%) — Pipeline.run() execution paths, PipelineManager.from_yaml/__init__/validate/run_node (~1000 lines, mostly integration-level)
transformers/scd.py (49%) — _scd2_spark and Delta MERGE paths
patterns/date_dimension.py (57%) — Spark generation paths
writers/sql_server_writer.py (67%) — Spark-only paths (~466 stmts)

Completed (previously planned)

~~Pandas engine tests (25% → 74%)~~ ✅
~~Polars engine tests (7% → 57%)~~ ✅
~~Merge transformer (36% → 61%)~~ ✅
~~Advanced transformers (44% → 80%)~~ ✅
~~State management (33% → 80%)~~ ✅
~~Review GitHub issues and close stale ones~~ ✅ (43 bugs closed)
~~catalog.py (47% → 85%)~~ ✅ (126 tests)
~~node.py (51% → 81%)~~ ✅ (89 tests)
~~catalog_sync.py (43% → 81%)~~ ✅ (88 tests)
~~story/generator.py (54% → 83%)~~ ✅ (100 tests)
~~story/doc_generator.py (55% → 90%)~~ ✅ (117 tests)
~~connections/azure_adls.py (10% → 89%)~~ ✅ (126 tests)
~~connections/api_fetcher.py (79% → 94%)~~ ✅ (126 tests)
~~cli/catalog.py (46% → 94%)~~ ✅ (77 tests)
~~connections/factory.py (69% → 81%)~~ ✅ (25 tests)

Existing Test Infrastructure (Not Reflected in Coverage)

The scripts/run_test_campaign.py script runs end-to-end validation that pytest coverage does not capture:

| Phase | What It Tests | Engine |
| --- | --- | --- |
| Phase 1 | CSV read, Parquet write, schema validation | Pandas |
| Phase 3 | State/HWM persistence | Pandas |
| Phase 4 | Merge pattern (upsert) | Pandas |
| Phase 5 | SCD2 pattern | Pandas |
| Phase 6 | Logical path resolution | Pandas |
| Phase 11 | 10k row scaling | Pandas |

Production validation: the Spark engine runs on Databricks. It is not tested in CI but is validated in production.

Documentation Gaps to Address

[x] Add scripts/run_test_campaign.py to docs/guides/testing.md ✓
[x] Update docs/features/engines.md with engine-specific testing notes ✓
[ ] Document Spark/Databricks testing approach in docs/tutorials/spark_engine.md
[ ] Add "How to run the test campaign" section to AGENTS.md or CONTRIBUTING.md
[ ] #229 Add engine parity table for transformers
[ ] #228 CHANGELOG missing entries for recent bug fixes
[ ] #225 Add Delta Lake troubleshooting section
[ ] #224 Add tutorial for validation and contracts workflow
[ ] #223 Add tutorial for delete detection workflow
[ ] #222 Add tutorial for quarantine/orphan handling workflow

Success Metrics

| Goal | Target | Current | Status |
| --- | --- | --- | --- |
| Test coverage | 80%+ | 80% (34,363 stmts, 6,854 missed) | ✅ Done |
| CI pass rate | 100% | 100% | ✅ Done |
| Engine parity | 100% | 100% | ✅ Done (all features ✅ across engines) |
| Bug backlog | 0 critical | 0 critical | ✅ Done (3 low-pri/deferred remain) |
| Open issues | < 10 | 18 | 🔶 Reduced from 46+ filed |

Long-term Vision

Odibi should be:

- Declarative: YAML-first, SQL-based transformations
- Portable: Works on laptop (Pandas), cluster (Spark), or serverless (Polars)
- Stable: Comprehensive tests, predictable behavior
- Documented: AI assistants can generate correct configs without trial and error

The ultimate test: Can you hand a business analyst the docs and have them build a working pipeline without your help?