> [!tldr] A continuous sequence of data samples taken over time
A data stream is a source of data that produces samples over time. This is my own working definition, not a formal one. In data streams, **order matters**.
A data stream can be processed in batches or in near-real time.
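
A rough sketch of the two processing styles, to make the difference concrete. The names `batches` and `running_mean` and the sample values are made up for illustration; both functions consume the samples in arrival order, which is the point.

```python
from itertools import islice
from typing import Iterable, Iterator


def batches(stream: Iterable[float], size: int) -> Iterator[list[float]]:
    """Batch style: collect up to `size` samples, then hand them over together."""
    it = iter(stream)
    while batch := list(islice(it, size)):
        yield batch


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Near-real-time style: emit an updated result after every sample."""
    total = 0.0
    for count, sample in enumerate(stream, start=1):
        total += sample
        yield total / count


# Illustrative samples, already in arrival order.
samples = [2.0, 4.0, 6.0, 8.0, 10.0]

batch_means = [sum(b) / len(b) for b in batches(samples, 2)]
live_means = list(running_mean(samples))
```

The batch version only produces a result once a whole batch has arrived, while the running version has an answer after each sample, at the cost of carrying state between samples.
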
**Types of data streams:**
- A [[Hash Table|key/value]] system with time-based keys
- A simple running log of measurements plus contextual data, like [[Oura Ring]]'s "movement" time-series data, which reports an [[Enumeration|enum]] value every 5 minutes.
- The [[PDW]] is a data stream, especially now that it receives data as it happens
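
A minimal sketch of the first two shapes combined: a key/value mapping with time-based keys whose values are an enum sampled every 5 minutes. The `Movement` levels, the start time, and the readings are hypothetical placeholders, not Oura's actual schema or data.

```python
from datetime import datetime, timedelta
from enum import Enum


class Movement(Enum):
    # Hypothetical levels for illustration only, not the real Oura values.
    NONE = 0
    LOW = 1
    HIGH = 2


start = datetime(2024, 1, 1, 0, 0)  # assumed start of the recording
step = timedelta(minutes=5)         # one sample every 5 minutes
readings = [Movement.NONE, Movement.LOW, Movement.HIGH, Movement.LOW]

# Key/value view: each timestamp maps to the sample taken at that time.
stream = {start + i * step: reading for i, reading in enumerate(readings)}

# Order matters, so read the stream back sorted by its time-based key.
ordered = [(ts, stream[ts]) for ts in sorted(stream)]
```

Because the key is the timestamp, sorting the keys recovers the stream order even if the samples were inserted out of order.
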
**Tooling:**
- [[Kafka]] for large-scale streams
- Simply appending to [[CSV]] or [[NDJSON]] files works
- [[Relational Databases]] can be used, but [[Big Data]] technologies may be more useful in practice
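
A small sketch of the append-to-a-file approach using NDJSON (one JSON object per line). The helper names, the file name `stream.ndjson`, and the sample fields `t` and `value` are assumptions for the example; it uses only the Python standard library.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def append_sample(path: Path, sample: dict) -> None:
    """Append one sample as a single JSON line; existing lines are untouched."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(sample) + "\n")


def read_samples(path: Path) -> Iterator[dict]:
    """Yield samples in the order they were appended, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Hypothetical log location and samples, written to a temp directory.
log_path = Path(tempfile.mkdtemp()) / "stream.ndjson"
append_sample(log_path, {"t": "2024-01-01T00:00:00", "value": 1})
append_sample(log_path, {"t": "2024-01-01T00:05:00", "value": 3})

samples = list(read_samples(log_path))
```

Append-only writes keep the file in arrival order for free, which is most of what a small stream needs.
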
****
# More
## Source
- self / grad school