Python data-collection

Open-source Python projects categorized as data-collection

Top 4 Python data-collection Projects

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12

    I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.

    It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.

  • library

    An index for your archive. 70+ CLI tools to help you build, browse, and manage your media library. (by chapmanjacobd)

  • Project mention: Ask HN: Anyone looking for contributors for their open source projects | news.ycombinator.com | 2024-03-21

    Sure, I write small Python CLI utils that help me solve media organization, media consumption, and sometimes data analysis. I use this every day on Linux and Android but I haven't tested it on other platforms. There are a lot of different subcommands and, although the CLI package will always be opinionated to some extent, there is a lot of niche functionality which might not need to exist. So I'm open to things being refactored or new subcommands being added. [1]

    I have a lot of ideas for new ones, for example, I want a CLI that can take an artist name like "Theodor Kittelsen" and fetch highest quality public domain images--but I realize any implementation that does this well will be somewhat fragile so I haven't really attempted that yet. Other ideas that I have are often solved by piping output from one of my existing commands to another.

    1. https://github.com/chapmanjacobd/library

  • wakeword-data-collector

    A prototype CLI in Python where a user can collect all of the recordings needed to produce a wakeword

  • Multi-Modal-Automation-Suite

    Image-based automation environment

  • Project mention: Open Automation project | /r/programming | 2023-12-07

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source data-collection projects in Python? This list will help you:

#  Project                       Stars
1  airbyte                       13,923
2  library                       158
3  wakeword-data-collector       13
4  Multi-Modal-Automation-Suite  10
