Build, Test & Publish a Connector
This walkthrough takes a custom low-code connector from an empty folder to a published Source you can sync from — authoring it as code, testing it entirely on your machine, then registering it with RevOS. We'll build a real one: a connector that pulls NYC yellow-taxi trips from the City of New York's open-data API. It's public — no sign-up, no key — so you can run every step end to end, and it's a rich analytics source (fares, tips, distance, pickup/drop-off zones, by hour).
The same steps fit any REST API — your CRM, billing, or support tool. For the full command reference and the connector file model, see Connectors; this page is the hands-on path.
Before you start
- A RevOS project created with revos init (you run these commands from inside it).
- The connector test runtime on PATH, needed by check / discover / read. It's pre-installed in the Dev Container; if you work outside the container and it's missing, the command stops and prints the one-line install to run.
new, spec, apply, and delete don't need the runtime; only the local-run verbs do.
1. Scaffold a connector
revos connectors new nyc-taxi-trips
This creates a runnable hello-world under connectors/nyc-taxi-trips/:
connectors/nyc-taxi-trips/
  connector.yaml                 # RevOS resource wrapper (kind: Connector)
  manifest.yaml                  # the low-code connector manifest
  secrets/
    config.json                  # config values for local runs (gitignored)
  integration_tests/
    configured_catalog.json
It ships pointed at a public demo API so you can confirm your toolchain works before writing a line of your own. We'll smoke-test that next, then replace the manifest with the real taxi source.
2. Inspect the config schema
revos connectors spec nyc-taxi-trips
Pure-local — reads the spec block from manifest.yaml and prints the config fields the connector expects (name, type, whether it's a secret). No runtime, no network. Use it to see exactly what belongs in secrets/config.json.
3–5. Smoke-test the local runtime
Run the three local verbs against the scaffolded demo to confirm the runtime is wired up:
revos connectors check nyc-taxi-trips # connects? → SUCCEEDED / FAILED
revos connectors discover nyc-taxi-trips # lists the streams it exposes
revos connectors read nyc-taxi-trips # streams a sample of records
check / discover / read all run the connector locally against its manifest using secrets/config.json — nothing is uploaded. Once these pass on the demo, you know the loop works; now make it real.
6. Point it at the taxi data
The dataset is served over a plain REST endpoint that returns a JSON array and pages with $limit / $offset — no authentication:
https://data.cityofnewyork.us/resource/4b4i-vvec.json?$limit=100&$offset=0
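To make the paging concrete, here is a minimal Python sketch of the loop an offset paginator performs. It is not the RevOS runtime or the manifest syntax: page_url, iter_records, and the injected fetch_page callable are illustrative names, and stopping on a short page is an assumption about how the end of the data is detected.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://data.cityofnewyork.us/resource/4b4i-vvec.json"


def page_url(limit: int, offset: int) -> str:
    """Build the URL for one page; safe="$" keeps the literal $ in $limit/$offset."""
    query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return f"{BASE_URL}?{query}"


def iter_records(fetch_page: Callable[[str], list], limit: int = 100) -> Iterator[dict]:
    """Yield records page by page, advancing $offset by the page size.

    Assumption: a page shorter than `limit` (including an empty one) means
    there is nothing left to fetch.
    """
    offset = 0
    while True:
        page = fetch_page(page_url(limit, offset))
        yield from page
        if len(page) < limit:
            break
        offset += limit
```

Passing the fetcher in keeps the loop testable offline; in a real run, fetch_page would issue an HTTP GET and decode the JSON array.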
secrets/config.json can stay almost empty — the API needs no key. Public open-data endpoints are rate-limited, so we keep an optional app token as a config field for when you need a higher ceiling:
{ "app_token": "" }
Now edit manifest.yaml so the connector:
- calls base URL https://data.cityofnewyork.us, path /resource/4b4i-vvec.json, with $limit=100;
- pages by advancing $offset by the page size each request;
- optionally sends {{ config['app_token'] }} as a request header to lift the rate limit;
- exposes a trips stream mapping the fields you'll model downstream: tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count, trip_distance, fare_amount, tip_amount, total_amount, payment_type, pulocationid, dolocationid;
- derives a primary key from tpep_pickup_datetime + pulocationid + dolocationid. Taxi trips have no natural id, so synthesize one (the same trick the cube layer uses for composite keys).
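As a rough illustration of the last two bullets, the Python sketch below projects a raw record onto the mapped fields and synthesizes a key from the three key fields. This is one possible approach, not the manifest syntax or RevOS's own implementation; trip_key, to_trip, and the trip_id column name are hypothetical.

```python
# Fields the trips stream keeps, as listed above.
TRIP_FIELDS = (
    "tpep_pickup_datetime", "tpep_dropoff_datetime", "passenger_count",
    "trip_distance", "fare_amount", "tip_amount", "total_amount",
    "payment_type", "pulocationid", "dolocationid",
)
KEY_FIELDS = ("tpep_pickup_datetime", "pulocationid", "dolocationid")


def trip_key(record: dict) -> str:
    """Join the key fields with '|' so ('1', '23') and ('12', '3') stay distinct."""
    return "|".join(str(record.get(field, "")) for field in KEY_FIELDS)


def to_trip(record: dict) -> dict:
    """Keep only the mapped fields (None when absent) and attach the synthesized key."""
    trip = {field: record.get(field) for field in TRIP_FIELDS}
    trip["trip_id"] = trip_key(record)  # "trip_id" is a hypothetical column name
    return trip
```

Two trips that share the same pickup timestamp and both zone ids would receive the same key, so treat the result as a practical identifier rather than a guaranteed-unique one.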
Declare app_token in the manifest's config schema (mark it a secret) so revos connectors spec shows it and the UI prompts for it later.
You renamed the stream from the demo's, so point the read catalog at the new one: in integration_tests/configured_catalog.json, set the stream name to trips (the scaffold pre-fills it with the demo stream). read uses that file by default.
Re-run the loop — this time against the live taxi API — until it's green:
revos connectors spec nyc-taxi-trips # shows app_token
revos connectors check nyc-taxi-trips # reaches the open-data API → SUCCEEDED
revos connectors discover nyc-taxi-trips # lists the trips stream
revos connectors read nyc-taxi-trips # streams real taxi trips (Ctrl-C to stop)
read performs a full read through the paginator, and against this public dataset that means millions of rows. Once the first records confirm the shape, stop it with Ctrl-C; while iterating, keep $limit small. check and discover are quick and don't read the whole stream.
Swap the base URL, auth, and field mapping and this same shape ingests from any REST API — the local loop never changes.
Bound the data with a config option
A full read of every trip is a lot of rows. Instead of hardcoding a date window, expose it as a config option with a default — tunable per Source without touching the manifest. Add a start_date field to the config schema with a default, then filter the request by it:
- in the config schema, declare start_date with default: "2023-12-31T00:00:00", the last few days of this dataset (≈ 77,000 trips, a first sync of about 8 minutes: small enough to finish in a tutorial, real enough to be useful);
- in the request, add a filter that reads the value and falls back to the default when it's unset:
$where: tpep_pickup_datetime >= '{{ config['start_date'] | default('2023-12-31T00:00:00', true) }}'
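The fallback in that template can be mirrored in a few lines of Python, which may help when reasoning about what the request looks like with and without start_date. This is a sketch under assumptions: where_clause and query_string are illustrative helpers, not part of RevOS, and the `or` fallback stands in for the template's default(..., true), which also applies the default to empty values.

```python
from urllib.parse import urlencode

DEFAULT_START_DATE = "2023-12-31T00:00:00"


def where_clause(config: dict) -> str:
    """Build the $where filter, using the default when start_date is missing or empty."""
    # `or` falls back on both None and "", like default(..., true) in the template.
    start = config.get("start_date") or DEFAULT_START_DATE
    return f"tpep_pickup_datetime >= '{start}'"


def query_string(config: dict, limit: int = 100, offset: int = 0) -> str:
    """Combine paging and the date filter; safe="$" keeps the $ prefixes literal."""
    params = {"$limit": limit, "$offset": offset, "$where": where_clause(config)}
    return urlencode(params, safe="$")
```

Calling where_clause({}) and where_clause({"start_date": "2023-12-01T00:00:00"}) corresponds to the two read runs described in this section: the first uses the default window, the second the wider one.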
revos connectors spec now lists start_date with its default, and the Source form pre-fills it (editable). Verify both paths:
revos connectors read nyc-taxi-trips # no start_date set → trips from 2023-12-31 (the default)
# set "start_date": "2023-12-01T00:00:00" in secrets/config.json to widen the window, then:
revos connectors read nyc-taxi-trips # now trips from 2023-12-01
The same pattern fits any knob — page size, a category, a region: declare it in the config schema with a sensible default, reference it with {{ config['…'] }}, and each Source overrides only what it needs.
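One way to picture "each Source overrides only what it needs" is as a merge of declared defaults with per-Source values. The sketch below is illustrative only: the CONFIG_SCHEMA shape, the secret flag, the page_size knob, and effective_config are all hypothetical stand-ins, not the manifest's real schema format or RevOS behaviour.

```python
# Hypothetical stand-in for the connector's config schema (shape is assumed).
CONFIG_SCHEMA = {
    "app_token": {"type": "string", "secret": True},
    "start_date": {"type": "string", "default": "2023-12-31T00:00:00"},
    "page_size": {"type": "integer", "default": 100},
}


def effective_config(schema: dict, overrides: dict) -> dict:
    """Start from the declared defaults, then apply only the values a Source sets.

    Empty strings and None count as "unset", so the default survives.
    """
    config = {
        name: field["default"]
        for name, field in schema.items()
        if "default" in field
    }
    config.update({k: v for k, v in overrides.items() if v not in (None, "")})
    return config


# A Source that only widens the date window keeps the default page size.
wide = effective_config(CONFIG_SCHEMA, {"start_date": "2023-12-01T00:00:00"})
```

Fields without a default, such as the optional app_token here, simply stay absent until a Source provides them.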
7. Name it
Open connector.yaml — it carries two names:
apiVersion: revos/v1
kind: Connector
metadata:
  name: nyc-taxi-trips   # local slug: the folder name and the address other resources use
spec:
  name: NYC Taxi Trips   # the registered name shown in the RevOS UI
  type: declarative
metadata.name is never sent to the API; editing spec.name renames the connector upstream on the next apply. The full name model (and why renaming the folder is a local-only move) is in Connectors.
8. Publish
revos apply connectors/nyc-taxi-trips # one connector
# or
revos apply # the whole project, connectors included
apply registers the connector (or updates it, bumping the version when the manifest changed) and writes the assigned id back into connector.yaml. The payload contains no credentials — secrets/config.json stays on your machine.
9. Use it as a Source
Once applied, your connector shows up as a selectable Source type. Create a Source against it — any real credentials are entered server-side and never uploaded by apply:
revos sources create # opens the RevOS UI to configure the Source
Then reference that Source from a Connection to start syncing trips into your warehouse — from there dbt and a Cube turn them into metrics (total fares, average tip, trips by pickup zone and hour). See Sources and Connections.
10. Clean up (optional)
revos connectors delete nyc-taxi-trips # unpublish from the workspace, keep the files
revos connectors delete nyc-taxi-trips --purge # also delete the local directory
Where to go next
- Connectors — full command reference and file model
- Resource types — how connectors fit alongside Connections, Cubes, and dbt models
- From Zero to Semantic Model — build a full pipeline on top of your synced data