
Odibi CLI: Zero to Hero

Ultimate Cheatsheet & Reference (v3.4.3)

The Command Line Interface (CLI) is your primary tool for managing Odibi projects.


🟢 Level 1: The Basics

1. Create a New Pipeline File

Generate a reference YAML config file with all features enabled.

odibi scaffold project my_pipeline

2. Run a Pipeline

Execute the pipeline defined in your YAML file.

odibi run my_pipeline.yaml

Common Flags:

* --dry-run: Simulate execution (don't write data).
* --resume: Resume from the last failure (skips successful nodes).
* --env prod: Load production environment variables.
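For orientation, the file you run might look roughly like the sketch below. The field names (pipeline, nodes, read, write, depends_on) are illustrative assumptions, not the authoritative Odibi schema; generate the real reference file with odibi scaffold project.

```yaml
# Hypothetical sketch only; field names are assumptions, not the real schema.
pipeline: my_pipeline
nodes:
  load_data:
    read:
      connection: local_files
      path: data/input.csv
      format: csv          # 'format' is required (see validate example below)
  save_output:
    depends_on: [load_data]
    write:
      connection: local_files
      path: data/output.parquet
      format: parquet
      mode: overwrite      # overwrite mode deletes existing data
```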


🟡 Level 2: Intermediate (Management)

1. Initialize a Full Project

Create a full folder structure from a template. Aliases: init, init-pipeline, create, generate-project.

Available Templates:

| Template | Description |
| :--- | :--- |
| hello | Hello World - Simple CSV read/write (start here) |
| scd2 | SCD Type 2 - Slowly Changing Dimension pattern |
| star-schema | Star Schema - Full dimensional model with fact table |

# Start with the simplest example
odibi init my_project --template hello

# SCD2 pattern
odibi init-pipeline my_project --template scd2

# Full star schema project
odibi init my_project --template star-schema

2. Validate Configuration

Check if your YAML is valid before running it.

odibi validate my_pipeline.yaml

What it checks:

- YAML schema validity (required fields, types)
- Pipeline logic (node dependencies, circular references)
- Transformer parameters (valid operations, required params)
- Connection references (all referenced connections exist)

Example output (valid config):

[OK] Config is valid

Example output (invalid config):

[!] Pipeline 'main_etl' Errors:
  - Node 'load_data': missing required field 'format'
  - Node 'transform': unknown transformer 'invalid_op'

[?] Pipeline 'main_etl' Warnings:
  - Node 'save_output': 'overwrite' mode will delete existing data

[X] Validation failed

3. Visualize Dependencies

Generate a dependency graph to understand flow.

# ASCII Art (Default)
odibi graph my_pipeline.yaml

# Mermaid Diagram (for Markdown)
odibi graph my_pipeline.yaml --format mermaid
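As an illustration of what Mermaid output can be pasted into Markdown docs (node names here are invented, and the exact layout Odibi emits may differ), a three-node flow could render as:

```mermaid
graph TD
    load_data --> transform
    transform --> save_output
```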


🔴 Level 3: Hero (Advanced Tools)

1. Deep Diff (Compare Runs)

Did a pipeline run suddenly output fewer rows? Use story diff to compare two runs.

# List available runs
odibi story list

# Compare two story JSON files
odibi story diff stories/runs/20231027_120000.json stories/runs/20231027_120500.json

Output: Shows execution time differences, row count changes, and success rates.
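Conceptually, the row-count part of story diff is a comparison of per-node counts between two run records. A minimal sketch, assuming a story JSON shaped like {"nodes": {name: {"rows": n}}} (the real Odibi story layout may differ):

```python
# Sketch of a row-count diff between two run stories.
# The {"nodes": {name: {"rows": n}}} shape is an assumption for illustration.
def diff_row_counts(old: dict, new: dict) -> dict:
    """Return per-node row-count deltas between two run stories."""
    old_nodes = old.get("nodes", {})
    new_nodes = new.get("nodes", {})
    deltas = {}
    for name in set(old_nodes) | set(new_nodes):
        before = old_nodes.get(name, {}).get("rows", 0)
        after = new_nodes.get(name, {}).get("rows", 0)
        if before != after:
            deltas[name] = after - before
    return deltas

run_a = {"nodes": {"load": {"rows": 1000}, "save": {"rows": 1000}}}
run_b = {"nodes": {"load": {"rows": 1000}, "save": {"rows": 800}}}
print(diff_row_counts(run_a, run_b))  # {'save': -200}
```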

2. Manage Secrets

Securely manage local secrets for your pipelines.

# Initialize secrets store (creates .env.template)
odibi secrets init odibi.yaml

# Validate all secrets are configured
odibi secrets validate odibi.yaml
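The generated .env.template pairs with the connections declared in your config; a hypothetical example (variable names invented, values intentionally left blank):

```
# Hypothetical .env.template contents; actual variable names depend on
# the connections referenced in odibi.yaml
AZURE_SQL_PASSWORD=
AZURE_BLOB_ACCOUNT_KEY=
```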


🧠 Level 4: System Catalog (The Brain)

Query the System Catalog

The System Catalog stores metadata about all your runs, pipelines, nodes, and state. Query it without manually reading Delta tables.

# List recent runs
odibi catalog runs config.yaml

# Filter by pipeline and status
odibi catalog runs config.yaml --pipeline my_etl --status SUCCESS --days 14

# List registered pipelines
odibi catalog pipelines config.yaml

# List nodes (optionally filter by pipeline)
odibi catalog nodes config.yaml --pipeline my_etl

# View HWM state checkpoints
odibi catalog state config.yaml

# Get execution statistics
odibi catalog stats config.yaml --days 30

Catalog Subcommands:

| Subcommand | Description |
| :--- | :--- |
| runs | List execution runs from meta_runs |
| pipelines | List registered pipelines from meta_pipelines |
| nodes | List registered nodes from meta_nodes |
| state | List HWM state checkpoints from meta_state |
| tables | List registered assets from meta_tables |
| metrics | List metrics definitions from meta_metrics |
| patterns | List pattern compliance from meta_patterns |
| stats | Show execution statistics (success rate, avg duration, etc.) |

Common Flags:

* --format json: Output as JSON instead of ASCII table
* --pipeline <name>: Filter by pipeline name
* --days <n>: Show data from last N days (default: 7)
* --limit <n>: Limit number of results (default: 20)


πŸ” Level 5: Schema & Lineage Tracking

Schema Version History

Track how table schemas evolve over time.

# View schema history for a table
odibi schema history silver/customers --config config.yaml

# Compare two schema versions
odibi schema diff silver/customers --config config.yaml --from-version 3 --to-version 5

# Output as JSON
odibi schema history silver/customers --config config.yaml --format json

Example Output:

Schema History: silver/customers
================================================================================
Version    Captured At            Changes
--------------------------------------------------------------------------------
v5         2024-01-30 10:15:00    +loyalty_tier
v4         2024-01-15 08:30:00    ~email (VARCHAR→STRING)
v3         2024-01-01 12:00:00    -legacy_id
v2         2023-12-15 09:00:00    +created_at, +updated_at
v1         2023-12-01 10:00:00    Initial schema (12 columns)
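Conceptually, a schema diff between two versions is a comparison of column maps: added, removed, and type-changed columns. A minimal sketch (the column layouts below are invented for illustration; the real diff comes from the catalog's stored versions):

```python
# Sketch of a schema diff over two {column: type} maps (invented data).
def schema_diff(old: dict, new: dict) -> dict:
    """Compare two {column: type} maps; report added/removed/changed columns."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "changed": changed}

v4 = {"id": "BIGINT", "email": "STRING", "created_at": "TIMESTAMP"}
v5 = {"id": "BIGINT", "email": "STRING", "created_at": "TIMESTAMP",
      "loyalty_tier": "STRING"}
print(schema_diff(v4, v5))
# {'added': ['loyalty_tier'], 'removed': [], 'changed': []}
```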

Cross-Pipeline Lineage

Trace data dependencies across pipelines.

# Trace upstream sources
odibi lineage upstream gold/customer_360 --config config.yaml

# Trace downstream consumers
odibi lineage downstream bronze/customers_raw --config config.yaml

# Impact analysis - what would be affected by changes?
odibi lineage impact bronze/customers_raw --config config.yaml

Example Output (upstream):

Upstream Lineage: gold/customer_360
============================================================
gold/customer_360
└── silver/dim_customers (silver_pipeline.process_customers)
    └── bronze/customers_raw (bronze_pipeline.ingest_customers)

Example Output (impact):

⚠️  Impact Analysis: bronze/customers_raw
============================================================

Changes to bronze/customers_raw would affect:

  Affected Tables:
    - silver/dim_customers (pipeline: silver_pipeline)
    - gold/customer_360 (pipeline: gold_pipeline)
    - gold/churn_features (pipeline: ml_pipeline)

  Summary:
    Total: 3 downstream table(s) in 2 pipeline(s)

Schema Subcommands:

| Subcommand | Description |
| :--- | :--- |
| history | View schema version history for a table |
| diff | Compare two schema versions |

Lineage Subcommands:

| Subcommand | Description |
| :--- | :--- |
| upstream | Trace upstream sources of a table |
| downstream | Trace downstream consumers of a table |
| impact | Impact analysis for schema changes |

Common Flags:

* --config <path>: Path to YAML config file (required)
* --depth <n>: Maximum depth to traverse (default: 3)
* --format json: Output as JSON
* --limit <n>: Limit results (schema history only)
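Under the hood, upstream tracing amounts to a depth-limited walk over the dependency graph stored in the System Catalog. A minimal sketch of that traversal (the edge map below is invented; --depth corresponds to the depth parameter):

```python
# Sketch of depth-limited upstream traversal over a {table: [sources]} map.
def upstream(table: str, edges: dict, depth: int = 3) -> list:
    """Collect upstream sources of a table, up to the given depth."""
    if depth == 0:
        return []
    result = []
    for src in edges.get(table, []):
        result.append(src)
        result.extend(upstream(src, edges, depth - 1))
    return result

edges = {
    "gold/customer_360": ["silver/dim_customers"],
    "silver/dim_customers": ["bronze/customers_raw"],
}
print(upstream("gold/customer_360", edges))
# ['silver/dim_customers', 'bronze/customers_raw']
```

Downstream and impact analysis are the same walk over the reversed edge map.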


🤖 Level 6: AI-Friendly Introspection

These commands help AI tools (and developers) discover available features programmatically.

List Available Features

# List all 54 transformers
odibi list transformers

# List all 6 patterns with descriptions
odibi list patterns

# List all connection types
odibi list connections

# JSON output (for AI parsing)
odibi list transformers --format json

Explain Any Feature

# Get detailed docs for a transformer
odibi explain fill_nulls

# Get detailed docs for a pattern (includes example YAML)
odibi explain dimension

# Get detailed docs for a connection type
odibi explain azure_sql

Generate YAML Templates

# List all available template types
odibi templates list

# Show connection template with all auth options
odibi templates show azure_blob

# Show all 11 validation test types
odibi templates show validation

# Show transformer params + example YAML
odibi templates transformer scd2
odibi templates transformer derive_columns

# Generate JSON schema for VS Code autocomplete
odibi templates schema
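
If you point VS Code's YAML extension (redhat.vscode-yaml) at the generated schema, pipeline files get autocomplete and inline validation. A sketch of .vscode/settings.json, assuming you saved the schema as odibi-schema.json (the filename and glob are assumptions):

```json
{
  "yaml.schemas": {
    "./odibi-schema.json": ["*pipeline*.yaml"]
  }
}
```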

Templates are generated directly from Pydantic models, so they are always in sync with the code.
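To illustrate the idea (Odibi's actual generator uses Pydantic; this sketch uses plain type annotations, and FillNullsParams is an invented stand-in): a YAML skeleton can be derived mechanically from a params model, which is why templates cannot drift from the code.

```python
import typing

def yaml_template(model: type) -> str:
    """Emit a commented YAML skeleton from a class's type annotations."""
    hints = typing.get_type_hints(model)
    return "\n".join(f"{name}:  # {t.__name__}" for name, t in hints.items())

# Hypothetical stand-in for a transformer's parameter model
class FillNullsParams:
    columns: list
    value: str

print(yaml_template(FillNullsParams))
# columns:  # list
# value:  # str
```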

AI Workflow Example:

# AI checks what's available
odibi list transformers --format json | jq '.[] | .name'

# AI looks up specific usage
odibi explain derive_columns

# AI validates generated config
odibi validate generated_pipeline.yaml


📄 Command Reference

| Command | Description |
| :--- | :--- |
| run | Execute a pipeline. |
| discover | Discover data sources. |
| scaffold | Generate scaffolds (project, sql-pipeline). |
| validate | Check YAML syntax and logic. |
| doctor | Run environment diagnostics. |
| doctor-path | Diagnose a path. |
| init / init-pipeline | Initialize a new Odibi project from a template. |
| story | Generate and manage pipeline documentation stories (generate, diff, list, last, show). |
| catalog | Query System Catalog metadata (runs, pipelines, nodes, state, tables, metrics, patterns, stats, sync). |
| lineage | Cross-pipeline lineage (upstream, downstream, impact). |
| schema | Schema version tracking (history, diff). |
| secrets | Manage secrets and environment variables (init, validate). |
| system | Manage System Catalog operations (sync, rebuild-summaries, optimize, cleanup). |
| templates | Generate YAML templates from Pydantic models (list, show, transformer, schema). |
| list | List available transformers, patterns, or connections. |
| explain | Explain a transformer, pattern, or connection. |
| export | Export pipeline to orchestration code (--target airflow\|dagster). |
| ui | Launch observability UI. |
| graph | Visualize pipeline dependency graph (ascii, dot, mermaid). |
| deploy | Deploy pipeline definitions to System Catalog. |
| test | Run Odibi unit tests. |