Data Masking & Stream Mappers
Mask, redact, encrypt, or filter sensitive data before it lands in your warehouse.
Connections expose server-side stream mappers: per-stream rules that transform records as they sync from the source into your warehouse. Because the transformation runs at ingest, even a direct warehouse query sees only the masked values — there is no clean copy to leak.
Where rules live
Mappers are declared on a Connection's stream in YAML, under streams[].mappers. Running revos apply sends them to the API, which enforces them at ingest.
apiVersion: revos/v1
kind: Connection
metadata:
  name: customers-pipeline
spec:
  name: "Customers pipeline"
  source: { id: src_abc123 }
  schedule: { units: 24, timeUnit: hours }
  status: active
  streams:
    - name: customers
      namespace: public
      syncMode: incremental_deduped_history
      cursorField: [updated_at]
      primaryKey: [[id]]
      mappers:
        - type: hashing
          mapperConfiguration:
            targetField: email
            method: SHA-256
            fieldNameSuffix: _hashed
Each mapper has a type and a mapperConfiguration whose shape depends on the type.
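As a rough illustration of "shape depends on the type", the sketch below checks a mapper against the configuration keys that appear in the examples on this page. It is not part of revos: the function name missing_keys and the idea that every listed key is required are assumptions made for the example.

```python
# Keys copied from the examples on this page. Whether the API treats each
# one as required is an assumption of this sketch, not documented behavior.
EXPECTED_KEYS = {
    "hashing": {"targetField", "method", "fieldNameSuffix"},
    "field-renaming": {"originalFieldName", "newFieldName"},
    "field-filtering": {"targetField"},
    "row-filtering": {"conditions"},
    "encryption": {"algorithm", "targetField", "fieldNameSuffix"},
}

# Encryption needs different key material depending on the algorithm.
ENCRYPTION_EXTRA = {
    "RSA": {"publicKey"},
    "AES": {"key", "mode", "padding"},
}


def missing_keys(mapper: dict) -> set:
    """Return the expected configuration keys that the mapper does not set."""
    mapper_type = mapper.get("type")
    if mapper_type not in EXPECTED_KEYS:
        raise ValueError(f"unknown mapper type: {mapper_type!r}")
    config = mapper.get("mapperConfiguration", {})
    expected = set(EXPECTED_KEYS[mapper_type])
    if mapper_type == "encryption":
        expected |= ENCRYPTION_EXTRA.get(config.get("algorithm"), set())
    return expected - set(config)


# Hypothetical usage: a hashing mapper that forgot its suffix.
incomplete = {
    "type": "hashing",
    "mapperConfiguration": {"targetField": "email", "method": "SHA-256"},
}
print(missing_keys(incomplete))  # {'fieldNameSuffix'}
```

A check like this could run locally before revos diff to catch a misspelled key early; the API remains the authority on what is valid.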
Supported mapper types
hashing
One-way hash of a field's value. Keeps rows joinable across pipelines on the hash, but the original value is unrecoverable.
- type: hashing
  mapperConfiguration:
    targetField: email
    method: SHA-256
    fieldNameSuffix: _hashed
Methods: MD2, MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512.
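To show why hashed values stay joinable, here is a minimal Python sketch of what a hashing mapper might do to a single record. It assumes the value is hashed as UTF-8 text, emitted as lowercase hex, and written to targetField plus the suffix in place of the original field; the actual server-side encoding and output format may differ. The function name apply_hashing is hypothetical.

```python
import hashlib

# Assumed mapping from the documented method names to hashlib names.
# MD2 is left out because Python's hashlib does not guarantee it.
METHODS = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-224": "sha224",
    "SHA-256": "sha256",
    "SHA-384": "sha384",
    "SHA-512": "sha512",
}


def apply_hashing(record: dict, target_field: str, method: str, suffix: str) -> dict:
    """Return a copy of record with target_field replaced by its hash."""
    out = dict(record)  # leave the caller's record untouched
    value = out.pop(target_field)
    digest = hashlib.new(METHODS[method], str(value).encode("utf-8")).hexdigest()
    out[target_field + suffix] = digest
    return out


# The same input always yields the same digest, so two pipelines that hash
# the same email can still be joined on the hashed column.
a = apply_hashing({"id": 1, "email": "ada@example.com"}, "email", "SHA-256", "_hashed")
b = apply_hashing({"order": 7, "email": "ada@example.com"}, "email", "SHA-256", "_hashed")
print(sorted(a))                               # ['email_hashed', 'id']
print(a["email_hashed"] == b["email_hashed"])  # True
```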
field-renaming
Rename a column on the way in.
- type: field-renaming
  mapperConfiguration:
    originalFieldName: ssn
    newFieldName: ssn_redacted
field-filtering
Drop a column entirely; it never reaches the destination.
- type: field-filtering
  mapperConfiguration:
    targetField: internal_notes
row-filtering
Keep rows for which the predicate is true and drop the rest. Conditions support EQUAL for direct comparisons and NOT for negation; NOT wraps a non-empty list of nested conditions. Comparing against an empty string is unreliable, so pin to a real sentinel value instead.
- type: row-filtering
  mapperConfiguration:
    conditions:
      type: NOT
      conditions:
        - type: EQUAL
          fieldName: is_test
          comparisonValue: "true"
The example above keeps every row except those flagged as test data.
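The predicate logic can be sketched in a few lines of Python. This is an illustration only: it assumes EQUAL compares the field's string form against comparisonValue, and it handles NOT with exactly one nested condition, because this page does not say how several nested conditions are combined. The names matches and filter_rows are hypothetical.

```python
def matches(condition: dict, record: dict) -> bool:
    """Evaluate one condition against a record (sketch semantics, see lead-in)."""
    kind = condition["type"]
    if kind == "EQUAL":
        # Assumption: values are compared as strings.
        return str(record.get(condition["fieldName"])) == condition["comparisonValue"]
    if kind == "NOT":
        nested = condition["conditions"]
        if len(nested) != 1:
            raise ValueError("this sketch only handles a single nested condition")
        return not matches(nested[0], record)
    raise ValueError(f"unsupported condition type: {kind}")


def filter_rows(records: list, condition: dict) -> list:
    """Keep the records for which the condition is true."""
    return [r for r in records if matches(condition, r)]


# Same condition as the YAML example: NOT(is_test EQUAL "true").
condition = {
    "type": "NOT",
    "conditions": [
        {"type": "EQUAL", "fieldName": "is_test", "comparisonValue": "true"},
    ],
}
rows = [
    {"id": 1, "is_test": "true"},
    {"id": 2, "is_test": "false"},
    {"id": 3},  # field absent: EQUAL is false, so NOT keeps the row
]
print([r["id"] for r in filter_rows(rows, condition)])  # [2, 3]
```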
encryption
Reversible encryption with either RSA (asymmetric, supply a publicKey) or AES (symmetric, supply a key plus mode and padding).
# RSA
- type: encryption
  mapperConfiguration:
    algorithm: RSA
    targetField: card_number
    fieldNameSuffix: _enc
    publicKey: ${env.RSA_PUBLIC_KEY}
# AES
- type: encryption
  mapperConfiguration:
    algorithm: AES
    targetField: card_number
    fieldNameSuffix: _enc
    key: ${env.AES_KEY}
    mode: GCM
    padding: NoPadding
AES modes: CBC, CFB, OFB, CTR, GCM, ECB. AES paddings: NoPadding, PKCS5Padding.
Use environment-variable interpolation (${env.VAR}) for keys so secrets stay out of the YAML committed to git.
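To make the placeholder syntax concrete, here is a minimal Python sketch of how ${env.VAR} tokens could be resolved from the environment before a file is sent. The real resolution happens inside revos and may behave differently, for example when a variable is unset; the function name interpolate and the fail-on-missing behavior are assumptions of this sketch.

```python
import os
import re

# Matches ${env.NAME} where NAME looks like a typical environment variable.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(text: str, env=None) -> str:
    """Replace every ${env.NAME} token in text with the value of NAME."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Failing loudly avoids shipping an empty or literal key.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, text)


# Hypothetical usage with an explicit mapping instead of the real environment.
line = "key: ${env.AES_KEY}"
print(interpolate(line, {"AES_KEY": "example-key"}))  # key: example-key
```

The committed YAML only ever contains the placeholder; the secret itself lives in the shell or CI environment that runs the CLI.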
Workflow
- Edit connections/<your-connection>.yaml to add a mappers: block under the relevant stream.
- revos diff: confirm the planned change.
- revos apply: push the new mappers to the API.
- The next sync uses the updated rules. Existing rows already in BigQuery are not retroactively transformed; trigger a full_refresh_overwrite if you need to re-mask history.
Round-trip with pull
revos pull preserves any mappers configured server-side. After a successful apply, running pull and inspecting git diff should show no changes.