databay
CKAN
| | databay | CKAN |
|---|---|---|
| Mentions | 2 | 6 |
| Stars | 185 | 4,267 |
| Growth | - | 0.7% |
| Activity | 0.0 | 9.8 |
| Last commit | 10 months ago | 1 day ago |
| Language | Python | Python |
| License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
databay
-
Hi! I created a library which simplifies and augments usage of Python threads - called SuperLoops. It provides support for thread maintenance, events, failure handling, health status propagation, and graceful termination. Hope you find it useful 👋
ps. I also have a couple of other open source libraries: for scheduled data flow (Databay) and for algo trading (IBeam).
-
What is 'the thing' that makes batches? A "Batcher"? Help us come up with a name.
I'm having a discussion with a collaborator on my library about what we should call "the object that produces batches". In simple terms - it turns a 1D array of items into a 2D array of arrays of items, for instance: turns [1,2,3,4] into [[1,2], [3,4]] - but you can visualise any implementation that makes sense to you, we only care about the naming here.
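To make the idea of "the object that produces batches" concrete, here is a minimal sketch of one possible implementation, assuming fixed-size consecutive batches. The function name `batch` and the parameter `size` are hypothetical and are not taken from Databay or any other library mentioned here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batch(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Step through the sequence in strides of `size`; the last batch may be shorter.
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


print(list(batch([1, 2, 3, 4], 2)))  # [[1, 2], [3, 4]]
```

With an input whose length is not a multiple of `size`, the final batch simply holds the leftover items; other batching strategies (by time window, by total weight, and so on) would fit the same naming question.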
CKAN
-
Open Source Flask-based web applications
CKAN - The Open Source Data Portal Software
-
Metadata Store - Which one to choose? OpenMetadata vs Datahub?
We use Kubernetes as our deployment platform. Any feedback on any of these open source data catalogs? https://atlas.apache.org/#/, https://opendatadiscovery.org/, https://open-metadata.org/, https://marquezproject.github.io/marquez/, https://datahubproject.io/, https://www.amundsen.io/, https://ckan.org/, https://magda.io/
-
What 'tool' is used to build OpenData sites?
CKAN (https://ckan.org/) is what data.gov and most state governments use.
-
Software and tools for (non-human) genomics data platform
Our first instinct is to use [CKAN](https://ckan.org) for cataloging (and storage, with modifications), especially since we know it and know that it has been used successfully elsewhere. However, we suspect that more specialized or better tools exist for this, which is why I kindly ask for your insights.
-
How to start Data Science and Machine Learning Career?
CKAN
-
We are digitisers at the Natural History Museum in London, on a mission to digitise 80 million specimens and free their data to the world. Ask us anything!
We publish all our data on the [Data Portal](https://data.nhm.ac.uk), a Museum project that's been running since 2014. Instead of MediaWiki it runs on an open-source Python framework called [CKAN](https://ckan.org), which is designed for hosting datasets - though we've had to adapt it in various ways so that it can handle such large amounts of data.
What are some alternatives?
hickory - Command line tool for scheduling Python scripts
ArchivesSpace - The ArchivesSpace archives management tool
PyFunctional - Python library for creating data pipelines with chain functional programming
ArchiveBox - 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
cloudflare_ddns - Cloudflare Dynamic DNS schedule daemon
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
ibeam - IBeam is an authentication and maintenance tool used for the Interactive Brokers Client Portal Web API Gateway.
Access to Memory (AtoM) - Open-source web application for archival description and public access.
superloops
Collective Access: Providence - Cataloguing and data/media management application
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
datahub - The Metadata Platform for your Data Stack