data-collection

Top 20 data-collection Open-Source Projects

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12

    I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.

    It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.

  • Snowplow

    The enterprise-grade behavioral data engine (web, mobile, server-side, webhooks), running cloud-natively on AWS and GCP

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

  • cloudquery

    The open source high performance ELT framework powered by Apache Arrow

  • Project mention: We might want to regularly keep track of how important each server is | news.ycombinator.com | 2024-02-06

    Check out CloudQuery - https://github.com/cloudquery/cloudquery for an easy cloud asset inventory.

  • hertzbeat

    Apache HertzBeat(incubating) is a real-time monitoring system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building capabilities.

  • Project mention: Apache HertzBeat(incubating) Another Prometheus, Zabbix | news.ycombinator.com | 2024-04-17

    It seems that the post, which was deleted by mistake, cannot be reposted. Could an administrator help restore it? Thank you.

    Hi,

    This is an open-source project that I have been developing full-time for over two years.

    It is named HertzBeat; in terms of functionality, it is similar to Prometheus and Zabbix.

    Recently, the project has just entered the Apache Foundation Incubator.

    Here, I want to share it with HN readers.

    In short, it is an easy-to-use, open source, real-time monitoring system that is agentless, supports high-performance clustering, is Prometheus-compatible, and offers powerful custom monitoring and status page building capabilities.

    github: https://github.com/apache/hertzbeat

    I hope this product is helpful and any feedback (even negative) would bring me joy.

  • jitsu

    Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.

  • Plan

    Player Analytics plugin for Minecraft Server platforms - View player activity of your server with ease. (by plan-player-analytics)

  • Project mention: What is everyone using now? | /r/admincraft | 2023-05-31

    Plan (short for PLayer ANalytics) gathers information about player activity and makes it easy for everyone to check their own statistics in a nice web UI. I found that long-term players like to have an idea of how much time and work they have put into the server since they joined. You as the admin get access to everyone else's statistics, plus some helpful server stats.

  • collect

    ODK Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments around the world. Contribute and make the world a better place! ✨📋✨

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

  • Disable-Firefox-Telemetry-and-Data-Collection

    How to disable Firefox Telemetry and Data Collection

  • Project mention: Disable-Firefox-Telemetry-and-Data-Collection: NEW Data - star count:147.0 | /r/algoprojects | 2023-10-17
  • library

    70+ CLI tools to build, browse, and blend your media library. An index for your archive. (by chapmanjacobd)

  • Project mention: Ask HN: Anyone looking for contributors for their open source projects | news.ycombinator.com | 2024-03-21

    Sure, I write small python CLI utils that help me solve media organization, media consumption, and sometimes data analysis. I use this every day on Linux and Android but I haven't tested it on other platforms. There are a lot of different subcommands and, although the CLI package will always be opinionated to some extent, there is a lot of niche functionality which might not need to exist. So I'm open to things being refactored or new subcommands being added. [1]

    I have a lot of ideas for new ones, for example, I want a CLI that can take an artist name like "Theodor Kittelsen" and fetch highest quality public domain images--but I realize any implementation that does this well will be somewhat fragile so I haven't really attempted that yet. Other ideas that I have are often solved by piping output from one of my existing commands to another.

    1. https://github.com/chapmanjacobd/library

  • ESPLogger

    An Arduino library providing a minimal interface to log data on flash memory and SD cards with ESP8266 and ESP32.

  • akvo-flow

    A data collection and monitoring tool that works anywhere.

  • rtdl

    rtdl makes it easy to build and maintain a real-time data lake (by realtimedatalake)

  • OpenCamera-Sensors

    Android app for synchronized recording of video and IMU data on one or multiple smartphones with advanced camera options, useful for 3D reconstruction, SLAM, AR, video stabilization. Supports remote control over network. (by prime-slam)

  • Project mention: Kharar mein Mausam 🤌🏻 | /r/Chandigarh | 2023-06-27

    Bro, use the OpenCamera or OpenCamera Sensors app. (GitHub / F-Droid)

  • shifting

    A privacy-focused list of alternatives to online services.

  • central-frontend

    Vue.js based frontend for ODK Central

  • wakeword-data-collector

    A prototype CLI in Python where a user can collect all of the recordings needed to produce a wakeword

  • Multi-Modal-Automation-Suite

    image based automation environment

  • Project mention: Open Automation project | /r/programming | 2023-12-07
  • system-info-collector

    App to collect ram/cpu usage from OS and show it in pretty graphs

  • Project mention: [media] System Info Collector - Fast and easy-to-use cli application for collecting RAM and CPU usage information over time | /r/rust | 2023-07-09

    Repository - https://github.com/qarmin/system-info-collector
    Binaries - https://github.com/qarmin/system-info-collector/releases

  • Shift

    Shift is a high performance better alternative to Airbyte, Singer, Meltano (by piyushsingariya)

  • Project mention: Alternative to Airbyte, Singer and Meltano | /r/dataengineering | 2023-08-11

    As a side hobby, I started working on this personal project: https://github.com/piyushsingariya/Kaku

  • walker.js

    walker.js has been renamed and moved to elbwalker/walkerOS

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Index

What are some of the best open-source data-collection projects? This list will help you:

Rank Project Stars
1 airbyte 14,054
2 Snowplow 6,734
3 cloudquery 5,584
4 hertzbeat 4,814
5 jitsu 3,845
6 Plan 803
7 collect 698
8 Disable-Firefox-Telemetry-and-Data-Collection 209
9 library 161
10 ESPLogger 77
11 akvo-flow 66
12 rtdl 43
13 OpenCamera-Sensors 37
14 shifting 36
15 central-frontend 29
16 wakeword-data-collector 13
17 Multi-Modal-Automation-Suite 10
18 system-info-collector 9
19 Shift 8
20 walker.js 3
