Data Masking & Stream Mappers

Mask, redact, encrypt, or filter sensitive data before it lands in your warehouse.

Connections expose server-side stream mappers: per-stream rules that transform records as they sync from the source into your warehouse. Because the transformation runs at ingest, records synced under a rule reach the warehouse already masked, so even a direct warehouse query sees only the masked values and there is no clean copy of those records to leak.

Where rules live

Mappers are declared per stream in a Connection's YAML, under streams[].mappers. Running revos apply ships them to the API, which enforces them at ingest.

apiVersion: revos/v1
kind: Connection
metadata:
  name: customers-pipeline
spec:
  name: "Customers pipeline"
  source: { id: src_abc123 }
  schedule: { units: 24, timeUnit: hours }
  status: active
  streams:
    - name: customers
      namespace: public
      syncMode: incremental_deduped_history
      cursorField: [updated_at]
      primaryKey: [[id]]
      mappers:
        - type: hashing
          mapperConfiguration:
            targetField: email
            method: SHA-256
            fieldNameSuffix: _hashed

Each mapper has a type and a mapperConfiguration whose shape depends on the type.

Supported mapper types

hashing

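One-way hash of a field's value. Rows stay joinable across pipelines on the hash, but the original value cannot be reversed out of it. Hashing is deterministic, so guessable values such as email addresses can still be matched by hashing candidate inputs.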

- type: hashing
  mapperConfiguration:
    targetField: email
    method: SHA-256
    fieldNameSuffix: _hashed

Methods: MD2, MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512.
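
The join property comes from hashing being deterministic: the same input always yields the same digest. The sketch below illustrates this with Python's hashlib. It assumes the mapper hashes the UTF-8 bytes of the value and stores a lowercase hex digest, which this page does not confirm; hash_value and the sample rows are hypothetical.

```python
import hashlib


def hash_value(value: str, method: str = "SHA-256") -> str:
    """Hash a field value. UTF-8 input and hex output are assumptions."""
    # hashlib names algorithms without the hyphen, e.g. "sha256".
    # Not every method listed above is available in every hashlib build.
    algorithm = method.replace("-", "").lower()
    return hashlib.new(algorithm, value.encode("utf-8")).hexdigest()


# Two hypothetical rows from different pipelines, hashed independently.
crm_row = {"id": 1, "email_hashed": hash_value("ada@example.com")}
billing_row = {"invoice": "inv_9", "email_hashed": hash_value("ada@example.com")}

# Equal inputs give equal digests, so the rows can be joined on the hash.
joinable = crm_row["email_hashed"] == billing_row["email_hashed"]
```

If a warehouse join on a hashed column returns no matches, differing methods or differing input normalization (case, whitespace) between the two pipelines are the first things to check.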

field-renaming

Rename a column on the way in. Only the name changes; the values are written as-is.

- type: field-renaming
  mapperConfiguration:
    originalFieldName: ssn
    newFieldName: ssn_redacted

field-filtering

Drop a column entirely; it never reaches the destination.

- type: field-filtering
  mapperConfiguration:
    targetField: internal_notes

row-filtering

Keep rows for which the predicate is true and drop the rest. Conditions support EQUAL for direct comparisons and NOT for negation; NOT wraps a non-empty list of nested conditions. Comparing against an empty string is unreliable, so match on a real sentinel value instead.

- type: row-filtering
  mapperConfiguration:
    conditions:
      type: NOT
      conditions:
        - type: EQUAL
          fieldName: is_test
          comparisonValue: "true"

The example above keeps every row except those flagged as test data.
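
To make the keep/drop logic concrete, here is a minimal Python sketch of how such a predicate could be evaluated. It is not the service's implementation. It assumes field values are compared as strings and that NOT is true when none of its nested conditions match; this page does not specify how NOT treats more than one nested condition. The matches function and the sample records are hypothetical.

```python
from typing import Any, Dict

Condition = Dict[str, Any]
Record = Dict[str, Any]


def matches(condition: Condition, record: Record) -> bool:
    """Return True if the record satisfies the condition (row is kept)."""
    kind = condition["type"]
    if kind == "EQUAL":
        # Assumption: values are compared as strings.
        return record.get(condition["fieldName"]) == condition["comparisonValue"]
    if kind == "NOT":
        nested = condition["conditions"]
        if not nested:
            raise ValueError("NOT requires a non-empty list of conditions")
        # Assumption: NOT holds when none of the nested conditions match.
        return not any(matches(c, record) for c in nested)
    raise ValueError(f"unsupported condition type: {kind}")


rule = {
    "type": "NOT",
    "conditions": [
        {"type": "EQUAL", "fieldName": "is_test", "comparisonValue": "true"}
    ],
}

records = [
    {"id": 1, "is_test": "true"},
    {"id": 2, "is_test": "false"},
]

# Rows where the predicate is true are kept; the rest are dropped.
kept_ids = [r["id"] for r in records if matches(rule, r)]
```

With these two records, only the row whose is_test is "false" survives the filter.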

encryption

Reversible encryption with either RSA (asymmetric, supply a publicKey) or AES (symmetric, supply a key plus mode and padding).

# RSA
- type: encryption
  mapperConfiguration:
    algorithm: RSA
    targetField: card_number
    fieldNameSuffix: _enc
    publicKey: ${env.RSA_PUBLIC_KEY}

# AES
- type: encryption
  mapperConfiguration:
    algorithm: AES
    targetField: card_number
    fieldNameSuffix: _enc
    key: ${env.AES_KEY}
    mode: GCM
    padding: NoPadding

AES modes: CBC, CFB, OFB, CTR, GCM, ECB. AES paddings: NoPadding, PKCS5Padding.

Use environment-variable interpolation (${env.VAR}) for keys so secrets stay out of the YAML committed to git.
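
The idea behind interpolation is that the committed file carries only a placeholder, and the real value is substituted from the environment at apply time. The sketch below shows that substitution in Python. How revos resolves placeholders internally is not documented here, so interpolate, the regular expression, and the fail-fast behavior on a missing variable are all assumptions made for illustration.

```python
import os
import re
from typing import Mapping, Optional

# Matches placeholders of the form ${env.NAME}.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${env.NAME} placeholders with values from the environment."""
    source = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            # Design choice of this sketch: fail rather than send an empty key.
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _PLACEHOLDER.sub(substitute, text)


committed_line = "key: ${env.AES_KEY}"
resolved_line = interpolate(committed_line, {"AES_KEY": "example-key-material"})
```

The committed line never contains the key itself; only the resolved form, built in memory, does.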

Workflow

  1. Edit connections/<your-connection>.yaml to add a mappers: block under the relevant stream.
  2. Run revos diff and confirm the planned change.
  3. Run revos apply to push the new mappers to the API.
  4. The next sync uses the updated rules. Rows already in BigQuery are not retroactively transformed; trigger a full_refresh_overwrite if you need to re-mask history.

Round-trip with pull

revos pull preserves any mappers configured server-side. After a successful apply, running pull and inspecting git diff should show no changes.
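
This round-trip check can be automated, for example in CI. The sketch below is one way to do it in Python, assuming revos pull can be run without arguments from the repository root and exits non-zero on failure; neither is confirmed on this page. round_trip_is_clean and run_command are hypothetical helpers. It relies on git diff --quiet, which exits 0 when the working tree has no changes and 1 when it does.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def round_trip_is_clean(run: Runner = run_command) -> bool:
    """Pull server-side state, then report whether the working tree is unchanged."""
    # Assumption: `revos pull` needs no arguments and fails with a non-zero code.
    if run(["revos", "pull"]) != 0:
        raise RuntimeError("revos pull failed")
    # `git diff --quiet` exits 0 for no changes, 1 when files differ.
    return run(["git", "diff", "--quiet"]) == 0
```

The runner is injectable so the check can be exercised without the CLI installed; in real use, call round_trip_is_clean() with the default runner after a successful apply.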