Data Tooling Boundary v1¶
Summary¶
This note is historical context for why Retriever keeps dataset/export contracts in
retriever.types.data while keeping runtime execution semantics elsewhere.
The near-term recommendation is:
- Keep Retriever runtime independent from large external data-tool layers.
- Isolate a small read tool and a mock data write tool outside the main Retriever runtime package.
- Treat those tools as a small standalone test surface for data workflows.
- Keep only a narrow adapter boundary inside Retriever for any integration points that are genuinely useful today.
This lets data-focused collaborators experiment on a simpler surface without having to absorb the full Retriever runtime model.
Why Isolate This¶
Today, the strongest use cases for the data contract layer are around:
- typed stream identifiers and schema references
- recording and replay utilities
- offline data exchange and dataset tooling
- experiments that do not need the full runtime
The weaker use case is trying to make the data contract layer the main execution model for Retriever itself. The current runtime already has working flow, step, and replay semantics. Forcing data/export concerns into the core runtime would create unnecessary coupling.
In practice, the current shape suggests three layers:
- Retriever runtime
  - flow execution, scheduling, backends, replay orchestration, visualization
- Data tools
  - reading structured recordings or datasets
  - producing mock data for testing pipelines
- Shared data contracts
  - minimal typed identifiers and schemas used by the data tools
Only the last two need to be shared with external collaborators at this stage.
Recommended Split¶
1. Keep In Retriever¶
These belong in the main repo and should not depend on a broad all-in-one data layer:
- runtime stepping and flow execution
- backend-specific transport and scheduling
- pipeline composition
- visualization and live logging
- replay orchestration from persisted artifacts
Retriever should keep only small adapter points for external data tooling.
2. Move Out As Data-Collaborator Tools¶
These are good candidates for isolation into a small companion package or repo:
- read tool
  - load recordings or dataset shards
  - enumerate streams
  - return typed records in a stable offline-friendly format
- mock data write tool
  - emit synthetic or hand-authored test streams
  - generate deterministic fixtures
  - write small datasets or recording-like artifacts for collaborator testing
- minimal shared types
  - stream IDs
  - schema references
  - optional clock-domain metadata
This gives collaborators a self-contained loop:
- generate mock data
- write artifact
- read artifact
- validate expected typed structure
They should not need to run a full Retriever pipeline to do this.
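The loop above can be sketched with nothing but the standard library. This is a minimal illustration, not the project's tooling: the JSON Lines artifact format, the record fields (`stream_id`, `timestamp_ns`, `payload`), and all function names are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def generate_mock_records(stream_id: str, count: int) -> list[dict]:
    """Produce deterministic records: every field depends only on the index."""
    return [
        {"stream_id": stream_id, "timestamp_ns": i * 1_000_000, "payload": {"value": i}}
        for i in range(count)
    ]


def write_artifact(path: Path, records: list[dict]) -> None:
    """Write one JSON object per line (JSON Lines, an assumed format)."""
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, sort_keys=True) + "\n")


def read_artifact(path: Path) -> list[dict]:
    """Read the artifact back into plain Python dicts."""
    with path.open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def validate(records: list[dict]) -> None:
    """Check that every record has the expected typed structure."""
    for record in records:
        assert isinstance(record["stream_id"], str)
        assert isinstance(record["timestamp_ns"], int)
        assert isinstance(record["payload"], dict)


def round_trip(count: int = 3) -> list[dict]:
    """Generate, write, read, and validate without any runtime dependency."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "mock.jsonl"
        write_artifact(path, generate_mock_records("camera/front", count))
        records = read_artifact(path)
    validate(records)
    return records
```

Nothing in the sketch imports Retriever, which is the property the split is meant to preserve.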
Minimal Shared Surface¶
The shared layer should stay intentionally small.
Suggested scope:
- StreamId
- SchemaRef
- optional ClockDomain
- a small record envelope for offline events or samples
Suggested non-goals for now:
- full runtime event buffer semantics
- execution-time scheduling semantics
- broad flow typing integration
- trying to unify all Retriever runtime data movement under one new abstraction
If the shared layer cannot be explained on one page, it is too large for the current collaborator-testing phase.
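As a rough sense of scale, the suggested scope could fit in a few frozen dataclasses. The type names `StreamId`, `SchemaRef`, and `ClockDomain` come from the list above; the envelope name `Record` and every field shown here are assumptions for illustration, not the contents of retriever.types.data.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class StreamId:
    """Typed stream identifier (the `name` field is an assumed shape)."""
    name: str


@dataclass(frozen=True)
class SchemaRef:
    """Reference to a payload schema by name and version (assumed fields)."""
    name: str
    version: int


@dataclass(frozen=True)
class ClockDomain:
    """Optional metadata naming the clock a timestamp was taken from."""
    name: str


@dataclass(frozen=True)
class Record:
    """Hypothetical envelope for one offline event or sample."""
    stream: StreamId
    schema: SchemaRef
    timestamp_ns: int
    payload: Mapping[str, Any]
    clock: Optional[ClockDomain] = None  # omitted when the clock is unknown
```

Frozen dataclasses keep the identifiers immutable and comparable by value, so `StreamId` and `SchemaRef` can serve as dictionary keys when grouping records.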
Proposed External Tooling Shape¶
One reasonable split is:
data_tools/
    read_tool.py
    mock_write_tool.py
    types.py
    fixtures/
    tests/
The read tool should answer questions like:
- what streams exist?
- what schema is attached to each stream?
- what records are available in a given range?
- can I materialize those records as plain Python objects, Arrow-like rows, or simple typed payloads?
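The first three questions map naturally onto three small methods. The sketch below is one possible shape, assuming records are already loaded as plain dicts with `stream_id`, `schema`, `timestamp_ns`, and `payload` keys; the class name `InMemoryReader`, its methods, and the half-open range convention are all hypothetical.

```python
from typing import Any, Iterable


class InMemoryReader:
    """Hypothetical read tool over records held as plain dicts."""

    def __init__(self, records: Iterable[dict[str, Any]]) -> None:
        # Sort once so range queries return records in timestamp order.
        self._records = sorted(records, key=lambda r: r["timestamp_ns"])

    def streams(self) -> list[str]:
        """What streams exist?"""
        return sorted({r["stream_id"] for r in self._records})

    def schema_of(self, stream_id: str) -> str:
        """What schema is attached to this stream?"""
        schemas = {r["schema"] for r in self._records if r["stream_id"] == stream_id}
        if len(schemas) != 1:
            raise ValueError(
                f"expected exactly one schema for {stream_id!r}, found {sorted(schemas)}"
            )
        return schemas.pop()

    def records_in_range(
        self, stream_id: str, start_ns: int, end_ns: int
    ) -> list[dict[str, Any]]:
        """What records fall in the half-open range [start_ns, end_ns)?"""
        return [
            r
            for r in self._records
            if r["stream_id"] == stream_id and start_ns <= r["timestamp_ns"] < end_ns
        ]
```

Returning plain dicts answers the fourth question in the simplest way; a real tool might also offer columnar output, which is outside this sketch.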
The mock write tool should answer questions like:
- can I generate a tiny deterministic test artifact?
- can I generate a multi-stream artifact with known ordering?
- can I generate malformed or partial cases for robustness testing?
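The mock write tool's three questions can be illustrated the same way. This is a sketch under assumptions: the function names, the round-robin interleaving, the 1000 ns spacing, and the choice of "missing payload" as the malformed case are invented for the example. Determinism comes from a locally seeded `random.Random`, so the global random state is never touched.

```python
import random
from typing import Any


def make_records(streams: list[str], per_stream: int, seed: int = 0) -> list[dict[str, Any]]:
    """Interleave streams round-robin with strictly increasing timestamps."""
    rng = random.Random(seed)  # local generator: same seed, same values
    records = []
    timestamp_ns = 0
    for _ in range(per_stream):
        for stream_id in streams:
            records.append(
                {
                    "stream_id": stream_id,
                    "timestamp_ns": timestamp_ns,
                    "payload": {"value": rng.randint(0, 999)},
                }
            )
            timestamp_ns += 1000
    return records


def make_malformed(records: list[dict[str, Any]], seed: int = 0) -> list[dict[str, Any]]:
    """Return a copy in which one deterministically chosen record lacks its payload."""
    if not records:
        raise ValueError("need at least one record to damage")
    rng = random.Random(seed)
    damaged = [dict(r) for r in records]  # shallow copies; the input stays intact
    victim = rng.randrange(len(damaged))
    del damaged[victim]["payload"]
    return damaged
```

Because the ordering is fixed by construction, a reader test can assert on exact stream order and timestamps instead of only on counts.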
Retriever Integration Boundary¶
Retriever should integrate with these tools through a narrow boundary:
- import a minimal schema/type surface if needed
- call the read tool for offline replay or import
- call the mock write tool in tests or examples
Retriever should not become the implementation home for data-tool experiments.
That keeps the boundary clean:
- collaborators can iterate independently
- Retriever can consume stable outputs
- breaking changes stay localized
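One way to keep such a boundary narrow is to have Retriever depend on a structural interface rather than on the tool package itself. The sketch below assumes a hypothetical `ArtifactReader` protocol and a hypothetical replay helper; neither is an existing Retriever API.

```python
from typing import Any, Iterator, Protocol


class ArtifactReader(Protocol):
    """Hypothetical: the only surface the runtime needs from a read tool."""

    def streams(self) -> list[str]: ...

    def records_in_range(
        self, stream_id: str, start_ns: int, end_ns: int
    ) -> list[dict[str, Any]]: ...


def iter_replay_events(
    reader: ArtifactReader, start_ns: int, end_ns: int
) -> Iterator[tuple[int, str, dict[str, Any]]]:
    """Merge all streams into one sequence ordered by (timestamp, stream)."""
    events = []
    for stream_id in reader.streams():
        for record in reader.records_in_range(stream_id, start_ns, end_ns):
            events.append((record["timestamp_ns"], stream_id, record["payload"]))
    events.sort(key=lambda event: (event[0], event[1]))
    yield from events


class StubReader:
    """Tiny stand-in showing that any object with matching methods satisfies the protocol."""

    def __init__(self, data: dict[str, list[dict[str, Any]]]) -> None:
        self._data = data

    def streams(self) -> list[str]:
        return sorted(self._data)

    def records_in_range(
        self, stream_id: str, start_ns: int, end_ns: int
    ) -> list[dict[str, Any]]:
        return [
            r for r in self._data[stream_id] if start_ns <= r["timestamp_ns"] < end_ns
        ]
```

With this shape, a change inside the read tool only reaches Retriever if it changes the two protocol methods, which is what keeps breaking changes localized.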
Immediate Plan¶
Phase 1¶
- keep broad data/export experiments out of main runtime merges
- define a tiny shared type surface
- extract or rewrite a standalone read tool
- extract or rewrite a standalone mock data write tool
Phase 2¶
- add focused tests for cross-tool interoperability
- validate that collaborators can use the tools without Retriever runtime knowledge
- decide whether the shared types are stable enough for wider adoption
Phase 3¶
- only after repeated real usage, decide whether any part of this should move back into Retriever core
Merge Guidance¶
For current Retriever development:
- keep runtime, perception, recording, and replay improvements reviewable separately
- keep retriever.types.data focused on collection/replay/export contracts
- avoid forcing data/export concerns into the execution core
This reduces risk and preserves room to redesign the data surface with external collaborators before locking it into Retriever.
Success Criteria¶
This split is working if:
- a collaborator can generate mock artifacts without Retriever runtime setup
- a collaborator can read and inspect those artifacts with a tiny tool surface
- Retriever can import or replay those artifacts through a narrow adapter
- no core runtime code depends on a large experimental data contract package
That is the right bar for the current phase.
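The last criterion can be checked mechanically. The sketch below scans Python source for absolute imports of a forbidden package using the standard `ast` module; the helper name and the package name `big_data_layer` used in the usage note are placeholders, not real project names.

```python
import ast


def forbidden_imports(source: str, forbidden_prefix: str) -> list[str]:
    """Return imported module names that equal or fall under `forbidden_prefix`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            if name == forbidden_prefix or name.startswith(forbidden_prefix + "."):
                found.append(name)
    return found
```

A test in the main repo could run this over each core runtime file and fail when `forbidden_imports(text, "big_data_layer")` is non-empty. It only catches static import statements, so dynamic imports would need a separate check.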