Top 4 Python data-collection Projects
- airbyte
  The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
- library
  An index for your archive. 70+ CLI tools to help you build, browse, and manage your media library. (by chapmanjacobd)
- wakeword-data-collector
  A prototype CLI in Python where a user can collect all of the recordings needed to produce a wakeword
Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12
I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.
It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.
Project mention: Ask HN: Anyone looking for contributors for their open source projects | news.ycombinator.com | 2024-03-21
Sure, I write small Python CLI utils that help me solve media organization, media consumption, and sometimes data analysis. I use this every day on Linux and Android but I haven't tested it on other platforms. There are a lot of different subcommands and, although the CLI package will always be opinionated to some extent, there is a lot of niche functionality which might not need to exist. So I'm open to things being refactored or new subcommands being added. [1]
I have a lot of ideas for new ones, for example, I want a CLI that can take an artist name like "Theodor Kittelsen" and fetch highest quality public domain images--but I realize any implementation that does this well will be somewhat fragile so I haven't really attempted that yet. Other ideas that I have are often solved by piping output from one of my existing commands to another.
1. https://github.com/chapmanjacobd/library
Python data-collection related posts
- Open Automation project
- Building an API + query language for rich data like images and video
- The YC Winter 2022 Batch
- Locally vs cloud stored management systems
- AI video understanding in games
- Gauging sentiment in sales calls?
Index
What are some of the best open-source data-collection projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | airbyte | 13,923 |
2 | library | 158 |
3 | wakeword-data-collector | 13 |
4 | Multi-Modal-Automation-Suite | 10 |