Reddit Sentiment Analysis Real-Time* Data Pipeline

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.

  • reddit-streaming-pipeline

    A real-time reddit data streaming pipeline for sentiment analysis of various subreddits

  • GitHub link: https://github.com/nama1arpit/reddit-streaming-pipeline

  • finnhub-streaming-data-pipeline

    Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more

  • I didn't use any specific guide. It was mostly build, test, integrate, and repeat for each component. For some of them, I went through the official getting-started documentation for the application and then implemented it in the cluster. However, I reckon you can find other tutorials to set up each application by itself. A few GitHub projects helped me plan the project architecture and codebase structure, like https://github.com/RSKriegs/finnhub-streaming-data-pipeline and https://gitlab.fit.cvut.cz/kozlovit/ni-dip-project-kozlovit.

  • world_cup_twitter_sentiment

    Measuring sentiment of World Cup matches using Tinybird and the Twitter API v2

  • Sure thing! Here’s the repo: https://github.com/tb-peregrine/world_cup_twitter_sentiment

