Why You Should NOT Build Your Data Pipeline on Top of Singer

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • getting-started

    This repository is a getting started guide to Singer. (by singer-io)

  • Singer has excellent documentation around its core protocol. It also does a nice job defining the suite of special metadata that it supports. When you start actually using Singer, however, mapping these primitives onto your integrations is difficult. For example, “replication-method” sets whether all the data from the source should be replicated (“full_table”) or just the new or updated data (“incremental”). What is unclear is which taps actually support “incremental” or “full_table” or both.

  • tap-hubspot

  • Some integrations help out by specifying what the configuration should look like in a readme or in a sample config. Even these lead to headaches. They often just list the fields that need to be passed in but do not explain what they mean, what their format is, or how to find them (good luck trying to find all the information you need to configure your Google Ads integration!). In other cases, they list only a subset, and you have to discover the rest by reading the integration's source (e.g., tap-salesforce didn't mention is_sandbox in the docs. UPDATE: someone has now added this field to the readme with this PR).
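
The replication-method question raised above can at least be checked per tap by reading the catalog that the tap emits in discovery mode. The following is a minimal sketch, not an official Singer utility. It assumes the catalog is a JSON object with a "streams" list, that stream-level metadata sits in the entry with an empty "breadcrumb", and that the method appears under "forced-replication-method" or "replication-method". The sample catalog and its stream names are made up for illustration, and values are lowercased only to match the spelling used above.

```python
import json

# Hypothetical catalog excerpt; the stream names and values are invented,
# and the layout is an assumption about what a tap's discovery mode emits.
SAMPLE_CATALOG = json.loads("""
{
  "streams": [
    {
      "tap_stream_id": "contacts",
      "metadata": [
        {"breadcrumb": [], "metadata": {"forced-replication-method": "INCREMENTAL"}}
      ]
    },
    {
      "tap_stream_id": "owners",
      "metadata": [
        {"breadcrumb": [], "metadata": {}}
      ]
    }
  ]
}
""")


def replication_methods(catalog):
    """Map each stream id to its declared replication method, or None."""
    result = {}
    for stream in catalog.get("streams", []):
        raw = None
        for entry in stream.get("metadata", []):
            # An empty breadcrumb is assumed to mark stream-level metadata.
            if entry.get("breadcrumb") == []:
                meta = entry.get("metadata", {})
                raw = meta.get("forced-replication-method") or meta.get("replication-method")
        # Lowercase so the result compares against "incremental" / "full_table".
        result[stream.get("tap_stream_id")] = raw.lower() if raw else None
    return result


print(replication_methods(SAMPLE_CATALOG))
```

A stream that maps to None declares nothing, which is exactly the ambiguous case described above: the catalog alone does not say whether the tap supports incremental replication, full-table replication, or both.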

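One way to soften the configuration problem described above is to keep your own annotated list of the fields a tap needs and check a config against it before running. This is only a sketch under stated assumptions: apart from is_sandbox, which the post mentions, every field name here is a hypothetical placeholder, and the descriptions (including the assumed meaning of is_sandbox) are illustrative rather than taken from any tap's readme.

```python
# Hand-maintained field notes. Only is_sandbox comes from the post; the
# other names and all descriptions are hypothetical placeholders.
FIELD_DOCS = {
    "client_id": "OAuth client id (hypothetical field)",
    "start_date": "ISO 8601 timestamp to sync from (hypothetical field)",
    "is_sandbox": "assumed to be true when targeting a sandbox account",
}


def missing_fields(config, field_docs=FIELD_DOCS):
    """Return {field: description} for each documented field absent from config."""
    return {name: doc for name, doc in field_docs.items() if name not in config}


# Example config that omits is_sandbox, as in the undocumented-field case.
config = {"client_id": "abc", "start_date": "2020-01-01T00:00:00Z"}
for name, doc in missing_fields(config).items():
    print(f"missing {name}: {doc}")
```

The check only reports absent keys. It does not validate formats or values, so it complements reading the integration's source and does not replace it.
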
