Why You Should NOT Build Your Data Pipeline on Top of Singer

getting-started

16 1,220 0.0 Makefile

This repository is a getting started guide to Singer. (by singer-io)

Singer has excellent documentation for its core protocol, and it does a good job of defining the special metadata it supports. Once you actually start using Singer, however, mapping these primitives onto your integrations is difficult. For example, “replication-method” sets whether all the data from the source should be replicated (“full_table”) or only the new or updated data (“incremental”). What is unclear is which taps actually support “incremental”, “full_table”, or both.
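One way to find out what a given tap supports is to inspect the catalog it emits in discovery mode. The following is a minimal sketch, not an official Singer utility: the stream names and the sample catalog are hypothetical, and the metadata keys ("forced-replication-method", "replication-method") and uppercase values follow the Singer discovery-mode documentation as I understand it, so treat them as assumptions to verify against the tap you use.

```python
import json

# Hypothetical catalog, shaped like the JSON a tap prints in discovery mode.
# Stream names and values are invented for illustration only.
CATALOG_JSON = """
{
  "streams": [
    {
      "tap_stream_id": "contacts",
      "metadata": [
        {"breadcrumb": [],
         "metadata": {"forced-replication-method": "INCREMENTAL",
                      "valid-replication-keys": ["updated_at"]}}
      ]
    },
    {
      "tap_stream_id": "owners",
      "metadata": [
        {"breadcrumb": [],
         "metadata": {"forced-replication-method": "FULL_TABLE"}}
      ]
    },
    {
      "tap_stream_id": "deals",
      "metadata": [{"breadcrumb": [], "metadata": {}}]
    }
  ]
}
"""


def replication_methods(catalog):
    """Map each stream id to the replication method its metadata declares."""
    result = {}
    for stream in catalog.get("streams", []):
        method = None
        for entry in stream.get("metadata", []):
            # Stream-level metadata is the entry with an empty breadcrumb.
            if entry.get("breadcrumb") == []:
                md = entry.get("metadata", {})
                method = (md.get("forced-replication-method")
                          or md.get("replication-method"))
        # Many taps declare nothing, which is exactly the ambiguity at issue.
        result[stream["tap_stream_id"]] = method or "unspecified"
    return result


methods = replication_methods(json.loads(CATALOG_JSON))
for stream_id, method in methods.items():
    print(f"{stream_id}: {method}")
```

A stream that comes back as "unspecified" tells you nothing either way, so you still end up reading the tap's source to learn what it really does.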

tap-hubspot

3 49 6.8 Python

Some integrations help by specifying what the configuration should look like in a readme or a sample config. Even these lead to headaches. They often just list the fields that need to be passed in without explaining what they mean, what format they take, or how to find them (good luck finding all the information you need to configure your Google Ads integration!). In other cases they list only a subset, and you have to discover the rest by reading the integration's code (e.g., tap-salesforce doesn’t mention is_sandbox in the docs. UPDATE: someone has now added this field to the readme with this PR).
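To make the gap concrete, here is a small sketch that compares a config against the fields a readme documents. Everything except is_sandbox is a hypothetical placeholder: the documented field set and the config values are assumptions for illustration, not the actual tap-salesforce configuration.

```python
def check_config(config, documented_fields):
    """Return (missing, undocumented) field names, each sorted."""
    # Fields the readme lists but the config does not provide.
    missing = sorted(f for f in documented_fields if f not in config)
    # Fields the config passes that the readme never mentions.
    undocumented = sorted(k for k in config if k not in documented_fields)
    return missing, undocumented


# Hypothetical set of fields a readme might list.
DOCUMENTED = {"client_id", "client_secret", "refresh_token", "start_date"}

# Hypothetical config; is_sandbox is the field the post says was undocumented.
config = {
    "client_id": "placeholder",
    "client_secret": "placeholder",
    "refresh_token": "placeholder",
    "is_sandbox": True,
}

missing, undocumented = check_config(config, DOCUMENTED)
print("missing:", missing)
print("undocumented:", undocumented)
```

A check like this only catches mismatches against the readme. It cannot tell you what a field means or which format it expects, which is the harder part of the problem described above.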

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Why do companies still build data ingestion tooling instead of using a third-party tool like Airbyte?
1 project | /r/dataengineering | 6 Dec 2023

Design patter for Python ETL
2 projects | /r/dataengineering | 2 Dec 2022

Basic data engineering question.
2 projects | /r/dataengineering | 16 Oct 2022

I have hundreds of API data endpoints with different schemas. How do I organize?
1 project | /r/dataengineering | 10 Oct 2022

CDC (Change Data Capture) with 3rd party APIs
1 project | /r/dataengineering | 23 Sep 2022

This page summarizes the projects mentioned and recommended in the original post on dev.to
singer ETL Tap etl-framework Python
Post date: 30 Nov 2020
