-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
The Treatment and Control audiences need to be stored for future low-latency, high-reliability retrieval. Retrieval happens when we are delivering the survey, and informs the system which users to send surveys to. How is this achieved at Reddit’s scale? Users interact with ads, which generate events that are sent to our downstream systems for processing. At the output, these interactions are stored in DynamoDB as engagement records for easy access. Records are indexed on user ID and ad campaign ID to allow for efficient retrieval. The use of stream processing (Apache Flink) ensures this whole process happens within minutes, and keeps audiences up to date in real-time. The following high-level diagram summarizes the process:
Related posts
-
Stirling PDF: Self-hosted, web-based PDF manipulation tool
-
Ask HN: Best stack for building a desktop app?
-
Open VSX – Extensions for VS Code Compatible Editors
-
LLM Based Input Space Partitioning Testing for Library APIs (a.k.a. Bogus CVEs)
-
Intel submitted OpenJDK PRs for supporting new 64 bit general purpose registers