Kasper Juul Hermansen a14ea1fbcc
fix(deps): update rust crate serde to v1.0.217
2024-12-28 01:50:37 +00:00
crates feat: with updated notmad 2024-11-19 17:30:18 +01:00
proto/nodata/v1 feat: add prometheus and protobuf messages 2024-11-10 13:42:19 +01:00
templates feat: add prometheus and protobuf messages 2024-11-10 13:42:19 +01:00
.drone.yml feat: add base 2024-08-10 00:09:11 +02:00
.env feat: add s3 and deployment 2024-11-17 16:07:58 +01:00
.gitignore feat: add base 2024-08-10 00:09:11 +02:00
buf.gen.yaml feat: add prometheus and protobuf messages 2024-11-10 13:42:19 +01:00
buf.yaml feat: add data ingest 2024-08-10 00:54:10 +02:00
Cargo.lock fix(deps): update rust crate serde to v1.0.217 2024-12-28 01:50:37 +00:00
Cargo.toml feat: with updated notmad 2024-11-19 17:30:18 +01:00
cuddle.yaml feat: added please and release 2024-11-17 20:57:48 +01:00
README.md feat: with dagger engine 2024-08-18 01:13:27 +02:00
renovate.json feat: add base 2024-08-10 00:09:11 +02:00

nodata

Nodata is a simple binary that consists of four parts:

  1. Data ingest
  2. Data storage
  3. Data aggregation
  4. Data API / egress

Data ingest

Nodata exposes a simple protobuf gRPC API for ingesting either single events or batches.
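
As a rough illustration of the two ingest modes, the sketch below models a single event and a batch as one request type. The names `Event`, `IngestRequest` and `into_events` are hypothetical and are not taken from the nodata protobuf definitions; the fields (topic, id, data) follow the data flow described later in this README.

```rust
/// Hypothetical event shape: a topic, an id and an opaque payload.
/// The real message layout lives in proto/nodata/v1 and may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub topic: String,
    pub id: String,
    pub data: Vec<u8>,
}

/// Hypothetical request type covering both ingest modes.
#[derive(Debug, Clone, PartialEq)]
pub enum IngestRequest {
    Single(Event),
    Batch(Vec<Event>),
}

impl IngestRequest {
    /// Flattens either mode into a list of events so that downstream
    /// code can treat single and batch ingestion uniformly.
    pub fn into_events(self) -> Vec<Event> {
        match self {
            IngestRequest::Single(event) => vec![event],
            IngestRequest::Batch(events) => events,
        }
    }
}
```

Normalising both modes to a list early keeps the rest of the pipeline free of special cases for single events.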

Data storage

Nodata stores data locally in a partitioned Parquet scheme.
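
The README does not say which keys the data is partitioned by. Purely as an assumption, the sketch below uses a Hive-style layout keyed by topic and date; `partition_path` and its parameters are hypothetical names chosen for illustration.

```rust
/// Builds a Hive-style partition path for a Parquet file.
/// The partition keys (topic and date) are assumptions for illustration;
/// nodata's actual on-disk layout may be different.
pub fn partition_path(root: &str, topic: &str, date: &str, part: u32) -> String {
    // The part number is zero-padded to five digits so files sort lexically.
    format!("{root}/topic={topic}/date={date}/part-{part:05}.parquet")
}
```

For example, `partition_path("data", "clicks", "2024-11-10", 3)` yields `data/topic=clicks/date=2024-11-10/part-00003.parquet`.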

Data aggregation

Nodata accepts WASM routines that run aggregations over the data to be processed.
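
The interface between nodata and a WASM routine is not specified here. As a plain Rust stand-in for the kind of logic such a routine might carry, the sketch below folds records into a running result. The `Aggregation` trait and the `SizeStats` type are hypothetical.

```rust
/// Hypothetical interface for an aggregation routine: it sees one
/// record at a time and can report a result at any point.
pub trait Aggregation {
    type Output;
    fn update(&mut self, record: &[u8]);
    fn finish(&self) -> Self::Output;
}

/// Example aggregation: counts records and sums their payload sizes.
#[derive(Default)]
pub struct SizeStats {
    count: u64,
    total_bytes: u64,
}

impl Aggregation for SizeStats {
    /// (number of records, total payload bytes)
    type Output = (u64, u64);

    fn update(&mut self, record: &[u8]) {
        self.count += 1;
        self.total_bytes += record.len() as u64;
    }

    fn finish(&self) -> Self::Output {
        (self.count, self.total_bytes)
    }
}
```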

Data egress

Nodata exposes aggregations as APIs, or as events sent to a service over streamed gRPC APIs.

Architecture

Data flow

Data enters nodata:

  1. Application uses SDK to publish data
  2. Data is sent over gRPC with a topic, an id and the data
  3. Data is sent to a topic
  4. A broadcast is sent that said topic was updated with a given offset
  5. A client can consume from said topic, given a topic and id
     Note: a partition is needed here to separate handling between partitions and consumer groups
  6. A queue is running consuming each broadcast message, assigning jobs for each consumer group to delegate messages
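
The flow above can be sketched in memory with the standard library alone. This is a simplified stand-in, not nodata's implementation: `Broker`, `TopicUpdated`, `publish`, `subscribe` and `consume` are hypothetical names, reads are addressed by offset rather than by id, and partitions and consumer groups (step 6) are left out.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Notification broadcast after a topic receives new data (step 4).
#[derive(Debug, Clone, PartialEq)]
pub struct TopicUpdated {
    pub topic: String,
    pub offset: usize,
}

/// In-memory stand-in for the broker: one append-only log per topic,
/// plus the list of subscribers listening for updates.
#[derive(Default)]
pub struct Broker {
    topics: HashMap<String, Vec<Vec<u8>>>,
    subscribers: Vec<Sender<TopicUpdated>>,
}

impl Broker {
    /// Registers a listener that receives every `TopicUpdated` broadcast.
    pub fn subscribe(&mut self) -> Receiver<TopicUpdated> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Steps 2-4: append the data to the topic, then broadcast the offset.
    pub fn publish(&mut self, topic: &str, data: Vec<u8>) -> usize {
        let log = self.topics.entry(topic.to_string()).or_default();
        log.push(data);
        let offset = log.len() - 1;
        let update = TopicUpdated { topic: topic.to_string(), offset };
        // Drop subscribers whose receiving end has gone away.
        self.subscribers.retain(|tx| tx.send(update.clone()).is_ok());
        offset
    }

    /// Step 5: read the message stored in a topic at a given offset.
    pub fn consume(&self, topic: &str, offset: usize) -> Option<&[u8]> {
        self.topics.get(topic)?.get(offset).map(|v| v.as_slice())
    }
}
```

A subscriber that receives a `TopicUpdated` can pass its `topic` and `offset` straight to `consume`, which is the point where a per-group job queue would sit in the full design.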

Components

A component is a consumer on a set topic. It acts as either a source, a sink or a transformation between topics. It can declare topics, use topics, transform data and much more.

A topic at its most basic is a computational unit implementing a certain interface: source, sink or transformation.

The simplest are a source and a sink, which respectively push data to or pull data from the topics.

A component implements any or all of the three SDK interfaces. To build and register one:

  1. Create a new sample rust application
  2. Add dependency nodata-component
  3. In the main function use: nodata_component::component
  4. Implement the interfaces you want to use
  5. Build the application into a Docker image, optionally using the nodata cli to build the app
  6. Register the application as a component
  7. nodata client add-component --image docker.io/kjuulh/nodata-example-transform:latest --transform <input-topic>:<output-topic>
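
The API of nodata-component is not shown in this README, so the following is only a sketch of what the three interfaces (source, sink, transformation) might look like. The traits, the `Uppercase` example and `run_pipeline` are all hypothetical and use only the standard library.

```rust
/// Hypothetical source interface: produces messages to push into a topic.
pub trait Source {
    fn next_message(&mut self) -> Option<Vec<u8>>;
}

/// Hypothetical sink interface: receives messages pulled from a topic.
pub trait Sink {
    fn write_message(&mut self, data: &[u8]);
}

/// Hypothetical transformation interface: maps a message from an
/// input topic to a message for an output topic.
pub trait Transformation {
    fn transform(&mut self, input: &[u8]) -> Vec<u8>;
}

/// Example transformation: upper-cases ASCII payloads.
pub struct Uppercase;

impl Transformation for Uppercase {
    fn transform(&mut self, input: &[u8]) -> Vec<u8> {
        input.to_ascii_uppercase()
    }
}

/// A vector iterator works as a simple in-memory source.
impl Source for std::vec::IntoIter<Vec<u8>> {
    fn next_message(&mut self) -> Option<Vec<u8>> {
        Iterator::next(self)
    }
}

/// A vector works as a simple in-memory sink.
impl Sink for Vec<Vec<u8>> {
    fn write_message(&mut self, data: &[u8]) {
        self.push(data.to_vec());
    }
}

/// Drains the source through the transformation into the sink and
/// returns the number of messages processed.
pub fn run_pipeline(
    source: &mut dyn Source,
    transform: &mut dyn Transformation,
    sink: &mut dyn Sink,
) -> usize {
    let mut processed = 0;
    while let Some(message) = source.next_message() {
        sink.write_message(&transform.transform(&message));
        processed += 1;
    }
    processed
}
```

Keeping the three roles as separate traits lets a component implement only the ones it needs, which matches the "any or all" wording above.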