nodata
Nodata is a simple binary that consists of four parts:
- Data ingest
- Data storage
- Data aggregation
- Data API / egress
Data ingest
Nodata presents a simple protobuf gRPC API for ingesting either single events or batches.
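To illustrate the single-versus-batch distinction, here is a minimal sketch in plain Python. The names `Event`, `Batch`, and `normalize` are hypothetical and are not taken from nodata's protobuf schema; the topic, id, and data fields mirror the ones listed under Data flow.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Event:
    """One published record (hypothetical shape: topic, id, payload)."""
    topic: str
    id: str
    data: bytes


@dataclass(frozen=True)
class Batch:
    """Several events sent in a single request (hypothetical)."""
    events: List[Event]


def normalize(request: Union[Event, Batch]) -> List[Event]:
    """Flatten either request kind into a list so later stages see one shape."""
    if isinstance(request, Batch):
        return list(request.events)
    return [request]
```

Treating a single event as a batch of one keeps the storage and aggregation stages free of special cases.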
Data storage
Nodata stores data locally in a partitioned Parquet layout.
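This README does not say which keys the partitioning uses. As a sketch only, assuming Hive-style partitioning by topic and UTC date (both assumptions, as is the helper name `partition_path`), a file location could be derived like this:

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def partition_path(root: str, topic: str, timestamp: datetime, part: int) -> PurePosixPath:
    """Build root/topic=<topic>/date=<YYYY-MM-DD>/part-<n>.parquet (assumed layout)."""
    # Normalize to UTC so the same instant always lands in the same partition.
    day = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return PurePosixPath(root) / f"topic={topic}" / f"date={day}" / f"part-{part:05d}.parquet"
```

Encoding the partition keys in the directory names lets a reader skip whole directories when it only needs one topic or one day.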
Data aggregation
Nodata accepts wasm routines that run aggregations over the data to be processed.
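The contract between nodata and a wasm routine is not specified here. The sketch below uses a plain Python function as a stand-in for such a routine, assuming a fold-style interface where the host feeds records in one at a time; `Routine`, `sum_by_key`, and `run_aggregation` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Tuple

Record = Tuple[str, float]   # (key, value); an assumed record shape
State = Dict[str, float]
# A routine folds one record into the running state (hypothetical contract).
Routine = Callable[[State, Record], State]


def sum_by_key(state: State, record: Record) -> State:
    """Example routine: keep a running total per key."""
    key, value = record
    updated = dict(state)
    updated[key] = updated.get(key, 0.0) + value
    return updated


def run_aggregation(routine: Routine, records: Iterable[Record]) -> State:
    """Host side: pass stored records through the routine one at a time."""
    state: State = {}
    for record in records:
        state = routine(state, record)
    return state
```

For example, `run_aggregation(sum_by_key, [("a", 1.0), ("b", 2.0), ("a", 3.0)])` returns `{"a": 4.0, "b": 2.0}`.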
Data egress
Nodata exposes aggregations as APIs, or as events sent to a service over gRPC streaming APIs.
Architecture
Data flow
Data enters nodata
- Application uses SDK to publish data
- Data is sent over gRPC using a topic, an id, and the data
- Data is sent to a topic
- A broadcast announces that said topic was updated at a given offset
- A client can consume from said topic, given a topic and id // 6. We need a partition in here to separate handling between partitions and consumer groups
- A running queue consumes each broadcast message and assigns jobs to each consumer group to delegate the messages
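The steps above can be sketched as a small in-memory model. This is an illustration under assumptions, not nodata's implementation: `Broker` and its methods are hypothetical names, the job queue is reduced to a per-group committed offset, and partitions are left out.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Key = Tuple[str, str]  # (topic, id), as in the steps above


class Broker:
    """In-memory stand-in: append-only logs plus an 'updated' broadcast."""

    def __init__(self) -> None:
        self.logs: DefaultDict[Key, List[bytes]] = defaultdict(list)
        # Callbacks invoked with (key, offset) after every publish.
        self.listeners: List[Callable[[Key, int], None]] = []
        # Next offset to read for each (consumer group, key).
        self.committed: Dict[Tuple[str, Key], int] = {}

    def publish(self, topic: str, id_: str, data: bytes) -> int:
        """Append data to the log and broadcast the offset it was written at."""
        key = (topic, id_)
        self.logs[key].append(data)
        offset = len(self.logs[key]) - 1
        for notify in self.listeners:
            notify(key, offset)
        return offset

    def consume(self, group: str, topic: str, id_: str) -> List[bytes]:
        """Return messages the group has not seen yet and advance its offset."""
        key = (topic, id_)
        log = self.logs.get(key, [])
        start = self.committed.get((group, key), 0)
        self.committed[(group, key)] = len(log)
        return log[start:]
```

Because offsets are tracked per consumer group, two groups reading the same topic each receive every message, while repeated reads within one group only return what is new.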
Components
A component is a consumer on a set topic; it acts as a source, a sink, or a transformation between topics. It can declare topics, use topics, transform data, and much more.
At its most basic, a component is a computational unit implementing a certain interface: source, sink, or transformation.
The simplest are the source and the sink, which respectively push data to or pull data from the topics.
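One way to picture the three roles is as implementations of a shared interface. The sketch below is hypothetical throughout: the class names, the `run` method, and the dictionary standing in for topics are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, List

Topics = Dict[str, List[bytes]]  # topic name -> messages (in-memory stand-in)


class Component(ABC):
    @abstractmethod
    def run(self, topics: Topics) -> None:
        """Do one unit of work against the topics."""


class Source(Component):
    """Pushes externally produced data into a topic."""

    def __init__(self, topic: str, items: Iterable[bytes]) -> None:
        self.topic = topic
        self.items = list(items)

    def run(self, topics: Topics) -> None:
        topics.setdefault(self.topic, []).extend(self.items)


class Transformation(Component):
    """Reads one topic, applies a function, and writes to another topic."""

    def __init__(self, src: str, dst: str, fn: Callable[[bytes], bytes]) -> None:
        self.src, self.dst, self.fn = src, dst, fn

    def run(self, topics: Topics) -> None:
        out = [self.fn(message) for message in topics.get(self.src, [])]
        topics.setdefault(self.dst, []).extend(out)


class Sink(Component):
    """Pulls data out of a topic for use outside the system."""

    def __init__(self, topic: str) -> None:
        self.topic = topic
        self.received: List[bytes] = []

    def run(self, topics: Topics) -> None:
        self.received.extend(topics.get(self.topic, []))
```

Running a `Source("raw", [b"a", b"b"])`, then a `Transformation("raw", "upper", bytes.upper)`, then a `Sink("upper")` against the same dictionary leaves `[b"A", b"B"]` in the sink's `received` list.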
What does it look like
As part of nodata, you'll be given nodata, the CLI. The CLI can bootstrap a variety of components,