
From Zero to Semantic Model

This tutorial walks you through the full RevOS developer workflow — from a fresh install to a live semantic model with quality checks and a data chart. You'll use the bigquery-public-data.thelook_ecommerce public dataset throughout, so no data ingestion setup is required.

Persona: Data engineer at a B2C ecommerce company, first time using RevOS.
Time: ~30 minutes.

AI coding agents

This tutorial is written for use with an AI coding agent and has been validated with Claude Code. It should also work with any agent that reads the project's AGENTS.md file, which gives the agent the full context it needs to run the right skills.


Step 1 — Install the CLI and log in

Install the RevOS CLI globally:

npm install -g @revos/cli

Verify the installation:

revos --version

Then authenticate. This opens a browser window for OAuth:

revos auth login

After login, confirm you can see your organizations:

revos org list

For more details, see Installation and Authentication.


Step 2 — Scaffold a project

Create a new RevOS project called my-shop:

revos init my-shop

The CLI will prompt you to select an organization, then generate the following structure:

my-shop/
├── .devcontainer/       # VS Code Dev Container (Python, Node, dbt, gcloud, gh pre-installed)
├── .claude/
│   ├── settings.json    # security rules
│   └── skills/          # RevOS AI skills (explore-lakehouse, create-cubes, …)
├── dbt/models/
│   ├── bronze/          # Source declarations only (schema.yml, no SQL)
│   ├── silver/          # Cleaned & conformed (reads raw via source('bronze', …))
│   └── gold/            # Business-ready marts (reads silver via ref())
├── cubes/               # Cube.dev semantic model definitions (IaC)
├── connections/         # Sync connection definitions (IaC; reference Sources by id)
├── CLAUDE.md            # AI companion context
└── AGENTS.md            # Skill index for AI agents

Open my-shop in VS Code and click Reopen in Container when prompted. All subsequent commands in this tutorial run inside the Dev Container.

One-time setup inside the container

The Dev Container starts with no credentials shared in from your host. Open a terminal inside the container — the welcome banner tells you what to run. Do this once:

revos init # signs you in via browser (if needed), then provisions the GCP service-account key

You won't need to repeat this. Credentials persist in per-project Docker volumes across Rebuild Container, restarts, and reopens — every future container start lands in a fully-authenticated state and prints REVOS DEV CONTAINER — READY.

Tip

The project is permanently bound to the organization you selected. Every revos command inside the container targets that org automatically. See Project Scaffolding for details.


Step 3 — Load sample data

The sample dataset (bigquery-public-data.thelook_ecommerce) contains users, orders, order items, products, and events — a realistic ecommerce schema.

Inside the Dev Container, open your AI coding agent and ask:

Load the thelook_ecommerce sample data into our BigQuery dataset.

The agent will use the load-sample-data skill to copy the public tables into your organization's BigQuery dataset as Bronze-layer tables.

After the skill completes you should have tables like sample_users, sample_orders, sample_order_items, sample_products, and sample_events in your dataset.


Step 4 — Explore the lakehouse

Before transforming anything, understand what you have. Ask your AI coding agent:

What data do we have in our lakehouse?

The agent uses the explore-lakehouse skill to inspect your BigQuery dataset and return a plain-English summary: table names, row counts, column types, and interesting fields.

This is useful for deciding which columns to clean in the next step.


Step 5 — Build Bronze → Silver → Gold transformations

Now ask your AI coding agent to create dbt models across all three medallion layers:

Create dbt transformations:
- Silver: clean the bronze sources (cast types, remove duplicates, null-safe joins)
- Gold:
  - gold_order_items_enriched (grain: order item): fact spine joining order items
    to users, products, and distribution centers; carries traffic_source, country,
    category, brand, cost, and order_date on every row
  - gold_users_with_order_stats (grain: user): pre-aggregated order count, total
    spend, avg order value, first/last order, return rate
  - gold_product_performance (grain: product): pre-aggregated units sold, revenue,
    avg sale price, gross margin %, return rate
  - gold_traffic_source_revenue (grain: traffic source × day): pre-aggregated
    orders, revenue, avg order value, distinct users

The agent uses the create-dbt-transformations skill to write .sql files into dbt/models/silver/ and dbt/models/gold/. Review the generated SQL, then build all models:

dbt run

The Bronze layer is source-only

dbt/models/bronze/ holds just a schema.yml declaring the raw tables loaded in Step 3 as dbt sources. There are no .sql files in Bronze. The create-dbt-transformations skill adds your tables to that schema as needed, then writes silver/gold SQL under dbt/models/silver/ and dbt/models/gold/. Silver reads raw data via {{ source('bronze', '<table>') }}; Gold reads silver via {{ ref() }}. dbt run therefore only builds silver and gold; bronze is never recreated.
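To see why dbt run never rebuilds Bronze, it can help to picture the dependency graph that the source() and ref() calls imply. The sketch below is plain Python, not dbt internals. The silver model names (silver_orders, silver_order_items, silver_users) and the exact edges are hypothetical; only the sample_* and gold_* names come from this tutorial.

```python
from graphlib import TopologicalSorter

# Bronze sources are declared in schema.yml only; there is no SQL to build.
SOURCES = {"sample_orders", "sample_order_items", "sample_users"}

# node -> set of upstream nodes it reads from.
# Silver names and edges are hypothetical placeholders for what the skill generates.
DEPENDENCIES = {
    "silver_orders": {"sample_orders"},            # reads via source('bronze', ...)
    "silver_order_items": {"sample_order_items"},  # reads via source('bronze', ...)
    "silver_users": {"sample_users"},              # reads via source('bronze', ...)
    "gold_order_items_enriched": {                 # reads via ref()
        "silver_order_items",
        "silver_orders",
        "silver_users",
    },
    "gold_users_with_order_stats": {"silver_users", "silver_orders"},
}


def build_order(dependencies, sources):
    """Return buildable models in dependency order, skipping source-only nodes."""
    ordered = TopologicalSorter(dependencies).static_order()
    return [node for node in ordered if node not in sources]


if __name__ == "__main__":
    for model in build_order(DEPENDENCIES, SOURCES):
        print(model)
```

Every silver model appears before the gold models that depend on it, and no sample_* table appears in the build list at all.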

Once the run completes, verify the models with:

dbt test

After a successful run you'll have four Gold tables: the gold_order_items_enriched fact spine plus the gold_users_with_order_stats, gold_product_performance, and gold_traffic_source_revenue marts.
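The pre-aggregated measures in these marts are simple ratios. As a rough illustration in Python (the generated dbt SQL is what actually computes them), the sketch below assumes gross margin % is (revenue - cost) / revenue and return rate is returned items / total items. The tutorial does not spell out either definition, and the sample rows are made up.

```python
from dataclasses import dataclass


@dataclass
class OrderItem:
    order_id: int
    sale_price: float
    cost: float
    returned: bool


def avg_order_value(items):
    """Total revenue divided by the number of distinct orders (needs >= 1 item)."""
    orders = {item.order_id for item in items}
    return sum(item.sale_price for item in items) / len(orders)


def gross_margin_pct(items):
    """Assumed definition: 100 * (revenue - cost) / revenue."""
    revenue = sum(item.sale_price for item in items)
    cost = sum(item.cost for item in items)
    return 100.0 * (revenue - cost) / revenue


def return_rate(items):
    """Assumed definition: returned items / total items."""
    return sum(1 for item in items if item.returned) / len(items)


# Made-up rows: two orders, four items, one return.
ITEMS = [
    OrderItem(order_id=1, sale_price=40.0, cost=20.0, returned=False),
    OrderItem(order_id=1, sale_price=60.0, cost=30.0, returned=True),
    OrderItem(order_id=2, sale_price=100.0, cost=70.0, returned=False),
    OrderItem(order_id=2, sale_price=50.0, cost=30.0, returned=False),
]

if __name__ == "__main__":
    print(avg_order_value(ITEMS))   # 125.0
    print(gross_margin_pct(ITEMS))  # 40.0
    print(return_rate(ITEMS))       # 0.25
```

Comparing a few hand-computed values like these against the Gold tables is a quick sanity check when reviewing the generated SQL.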

Tip

The Dev Container has dbt-bigquery pre-installed and pre-configured for your org's dataset. You don't need to edit profiles.yml.


Step 6 — Create the semantic model

The semantic model defines the measures, dimensions, and relationships (joins) that RevOS and downstream consumers use to query your Gold tables. The create-cubes skill detects join keys across your gold models, classifies each link (one_to_one, one_to_many, many_to_one, many_to_many), and emits a joins: block in each cube definition so cubes can be queried together.
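One common way to classify a link is to check whether the join key is unique on each side. The sketch below illustrates that idea in Python. It is an assumption about the general technique, not the create-cubes skill's actual implementation, and the sample key values are made up.

```python
def is_unique(keys):
    """True if no non-null key value appears more than once."""
    present = [key for key in keys if key is not None]
    return len(present) == len(set(present))


def classify_link(left_keys, right_keys):
    """Classify the left -> right relationship from key uniqueness on each side."""
    left_unique = is_unique(left_keys)
    right_unique = is_unique(right_keys)
    if left_unique and right_unique:
        return "one_to_one"
    if left_unique:
        return "one_to_many"   # each left row can match several right rows
    if right_unique:
        return "many_to_one"   # several left rows share one right row
    return "many_to_many"


# The same link seen from the other side.
INVERSE = {
    "one_to_one": "one_to_one",
    "one_to_many": "many_to_one",
    "many_to_one": "one_to_many",
    "many_to_many": "many_to_many",
}

if __name__ == "__main__":
    fact_user_ids = [1, 1, 2, 3, 3, 3]  # order-item grain: user_id repeats
    mart_user_ids = [1, 2, 3]           # user grain: one row per user
    link = classify_link(fact_user_ids, mart_user_ids)
    print(link, INVERSE[link])
```

With the fact spine on the left and a user-grain mart on the right, the link comes out as many_to_one, and the reverse direction as one_to_many.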

Ask your AI coding agent:

Create a semantic model for our gold layer (gold_order_items_enriched,
gold_users_with_order_stats, gold_product_performance,
gold_traffic_source_revenue). Use gold_order_items_enriched as the fact spine
and define relationships between the cubes.

The agent will generate one cube YAML per gold model in cubes/. Across the four cubes it defines:

  • Dimensions: e.g. user_id, country, traffic_source, category, brand, order_date on the fact spine; per-cube identifiers and attributes on the marts.
  • Measures: counts, revenue, avg order value, gross margin %, return rate, distinct users.
  • Relationships: gold_order_items_enriched (fact spine) joined to gold_users_with_order_stats, gold_product_performance, and gold_traffic_source_revenue with cardinality declared in each direction.

Once the files look right, reconcile them with RevOS:

revos apply

Confirm the cubes were registered:

revos status

Visualize the semantic model

With the cubes in place, ask your AI coding agent:

Visualize the semantic model.

The agent uses the visualize-semantic-model skill to parse the cube files in cubes/, detect the fact spine, and render a directed graph of your cubes, their join keys, and cardinality. Open the generated PNG (model-graph.png) to confirm the fact spine sits in the middle and the relationships you intended are wired up correctly.

See Project Scaffolding for the full IaC reconciliation workflow.


Step 7 — Data quality and pipeline review

With all layers built and the semantic model pushed, ask your AI coding agent to review everything and produce a quality report:

Review all the transformations and semantic models we built, run quality checks across all layers, and write an engineering summary of what was done and what the pipeline now enables.

The agent will run dbt run && dbt test across all layers, inspect the cubes, and produce a written summary covering what was built, which quality checks passed, and what the pipeline now enables for downstream consumers.


Step 8 — Ask a business question and see a chart

The whole point of building a semantic model is to ask business questions without writing SQL. Try one now. Ask your AI coding agent:

Show me the top 5 traffic sources by revenue.

The agent uses the query-semantic-model skill to:

  1. Call revos cubes meta to discover the measures and dimensions your cubes expose.
  2. Compose a Cube.js query against the right cube (gold_order_items_enriched here).
  3. Run it with revos cubes query and render the result inline in the chat, first as an ASCII table, then as an ASCII bar chart underneath:

traffic_source     revenue
─────────────────  ─────────
Search             € 142,318
Organic            €  88,204
Email              €  41,907
Facebook           €  22,015
Display            €   9,471

Search    ████████████████████████████████████████ € 142,318
Organic   ████████████████████████▊ € 88,204
Email     ███████████▊ € 41,907
Facebook  ██████▏ € 22,015
Display   ██▋ € 9,471
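The bar lengths above follow from scaling each value against the largest one and using Unicode eighth-block characters for the fractional remainder. Here is a minimal Python sketch of that rendering, assuming a 40-character maximum bar width and rounding down to the nearest eighth. Both assumptions are inferred from the sample output, not taken from the skill's source.

```python
# Partial blocks for 1/8 .. 7/8 of a character cell; index 0 means no partial block.
PARTIALS = ["", "▏", "▎", "▍", "▌", "▋", "▊", "▉"]


def render_bar(value, max_value, width=40):
    """Return a bar whose length is proportional to value / max_value."""
    eighths = int(value * width * 8 // max_value)  # round down to whole eighths
    full, remainder = divmod(eighths, 8)
    return "█" * full + PARTIALS[remainder]


def render_chart(rows, width=40):
    """Render (label, value) rows as one labelled bar per line."""
    max_value = max(value for _, value in rows)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        bar = render_bar(value, max_value, width)
        lines.append(f"{label:<{label_width}}  {bar} € {value:,}")
    return "\n".join(lines)


ROWS = [
    ("Search", 142318),
    ("Organic", 88204),
    ("Email", 41907),
    ("Facebook", 22015),
    ("Display", 9471),
]

if __name__ == "__main__":
    print(render_chart(ROWS))
```

Eighth-blocks give the chart sub-character resolution, which is why Facebook and Display remain distinguishable even though both bars are short.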

The same skill handles other shapes — try:

Revenue by month for the last 12 months.
How many orders did we get yesterday from organic search?
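For questions like these, the composed query would typically follow Cube's JSON query format (measures, dimensions, timeDimensions, order, limit). Here is a hedged sketch of two such payloads, built in Python. The member names (gold_order_items_enriched.revenue, .traffic_source, .order_date) are assumptions about what the generated cubes expose, so check revos cubes meta for the real ones. How the payload is passed to revos cubes query is not covered here.

```python
import json

CUBE = "gold_order_items_enriched"  # fact spine from this tutorial


def top_traffic_sources_query(limit=5):
    """Top-N traffic sources by revenue (member names are assumed)."""
    return {
        "measures": [f"{CUBE}.revenue"],
        "dimensions": [f"{CUBE}.traffic_source"],
        "order": {f"{CUBE}.revenue": "desc"},
        "limit": limit,
    }


def monthly_revenue_query():
    """Revenue by month for the last 12 months (member names are assumed)."""
    return {
        "measures": [f"{CUBE}.revenue"],
        "timeDimensions": [
            {
                "dimension": f"{CUBE}.order_date",
                "granularity": "month",
                "dateRange": "last 12 months",
            }
        ],
        "order": {f"{CUBE}.order_date": "asc"},
    }


if __name__ == "__main__":
    print(json.dumps(top_traffic_sources_query(), indent=2))
    print(json.dumps(monthly_revenue_query(), indent=2))
```

The two payloads differ only in how they slice the measure: a plain dimension with a limit for the ranking, and a time dimension with a granularity for the trend.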

Any business question that can be answered from your cubes works here. The chart appears in the chat — no notebook, no UI hop.


Step 9 — Wrap-up

In ~30 minutes you've built a complete data engineering foundation:

Layer           What you built
──────────────  ──────────────────────────────────────────────────
Bronze          Raw thelook_ecommerce tables in BigQuery, declared as dbt sources in schema.yml
Silver          Cleaned, typed, deduplicated tables
Gold            Fact spine gold_order_items_enriched + 3 pre-aggregated marts (users, products, traffic source × day)
Semantic model  Cube definitions with measures, dimensions, and relationships, plus model-graph.png visualization
Quality report  Agent-run dbt tests + engineering summary
First chart     A real business question answered from the semantic model, rendered inline in chat

Where to go next

  • Project Scaffolding — IaC reconciliation workflow (revos apply, diff, pull, status)
  • Configuration — global options and environment variables
  • Sources — connect live data sources instead of sample data
  • Business-layer tutorial (coming soon) — Segments, Tables, and scheduled Actions built on top of this semantic model