CKAN
kuwala
| | CKAN | kuwala |
|---|---|---|
| Mentions | 7 | 33 |
| Stars | 4,429 | 781 |
| Growth | 0.9% | 0.0% |
| Activity | 9.8 | 0.0 |
| Last commit | 1 day ago | about 2 years ago |
| Language | Python | JavaScript |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
CKAN
- Open Source takes center stage at United Nations
-
Open Source Flask-based web applications
CKAN The Open Source Data Portal Software
-
Metadata Store - Which one to Choose? OpenMetadata vs Datahub?
We use Kubernetes as our deployment platform. Any feedback on one of these open source data catalogs? - https://atlas.apache.org/#/ - https://opendatadiscovery.org/ - https://open-metadata.org/ - https://marquezproject.github.io/marquez/ - https://datahubproject.io/ - https://www.amundsen.io/ - https://ckan.org/ - https://magda.io/
-
What 'tool' is used to build OpenData sites?
CKAN (https://ckan.org/) is what data.gov and most state governments use.
-
Software and tools for (non-human) genomics data platform
Our first instinct is to use [CKAN](https://ckan.org) for cataloging (and storage, with modifications), especially since we know it and know that it has been used successfully elsewhere. However, we suspect that more specialized or better tools exist for this, which is why I kindly ask for your insights.
-
How to start Data Science and Machine Learning Career?
CKAN
-
We are digitisers at the Natural History Museum in London, on a mission to digitise 80 million specimens and free their data to the world. Ask us anything!
We publish all our data on the [Data Portal](https://data.nhm.ac.uk), a Museum project that's been running since 2014. Instead of MediaWiki it runs on an open-source Python framework called [CKAN](https://ckan.org), which is designed for hosting datasets - though we've had to adapt it in various ways so that it can handle such large amounts of data.
kuwala
-
Show HN: GeoSage - An ETL Webtool for Geo and Demographics Data from the Open Web
--> Google Trends Data for Regions (Coming Soon)
The tool goes beyond our previously published CLI tool (https://github.com/kuwala-io/kuwala/tree/master/kuwala) by providing a hostable solution with a user-friendly interface. We have not open-sourced it yet but a demo is available here: https://geosage.kuwala.io/.
Urban planners can utilize movement data to analyze foot traffic in different city zones. Marketers can leverage demographic data to tailor campaigns more effectively. Developers can build their apps on top of it.
To round it up .... GeoSage brings...
Unified Data Management: Access data from OSM, Facebook, and soon Google, all in one place.
-
Show HN: Free Datasets for Spatial Engineers and Location Analysts
--> https://github.com/kuwala-io/kuwala/blob/master/kuwala/pipelines/osm-poi/README.md
Google Popular Times: Movement data can also be found on Google. When you search for a location, it often shows how frequently the place was visited on an hourly and daily basis (on an index of 0-100). With this library you can access all the Popular Times data for a single location or for entire cities.
-
What are the 5 hottest dbt Repositories one should star on GitHub 2022?
dbt is a software framework that sits in the middle of the ELT process. It represents the transformation layer that runs after data has been loaded from an original source. dbt combines SQL with software engineering principles.
Here are my top 5!
- Lightdash (https://github.com/lightdash/lightdash): Lightdash converts dbt models and makes it possible to define and easily visualize additional metrics via a visual interface.
- re_data (https://github.com/re-data/re-data): Re-Data is an abstraction layer that helps users monitor dbt projects and their underlying data. For example, you get alerts when a test fails or a data anomaly occurs in a dbt project.
- evidence (https://github.com/evidence-dev/evidence): Evidence is another tool for lightweight BI reporting. With Evidence, you can build simple reports in "medium style" using SQL queries and Markdown.
- Kuwala (https://github.com/kuwala-io/kuwala): With Kuwala, a BI analyst can intuitively build advanced data workflows using a drag-and-drop interface on top of the modern data stack without coding. Behind the scenes, the dbt models are generated so that a more experienced engineer can customize the pipelines at any time.
- fal ai (https://github.com/fal-ai/fal): Fal helps to run Python scripts directly from the dbt project. For example, you can load dbt models directly into the Python context, which makes it possible to apply data science libraries like scikit-learn and Prophet to the dbt models.
- Show HN: Open-Source Data Workspace Powered by Dbt and Airbyte
-
What are the hottest dbt Repositories you should star on Github 2022? - Here are mine.
Kuwala ( https://github.com/kuwala-io/kuwala ) Kuwala is a data workspace that consolidates the Modern Data Stack and makes it usable for BI analysts and engineers. Even though dbt is originally targeted at BI analysts, it is mainly used by engineers. This shifts a large amount of pipeline engineering effort to the IT department. With Kuwala, a BI analyst can intuitively build advanced data workflows using a drag-and-drop interface on top of the modern data stack without coding. Consequently, the BI analyst can work more iteratively and maintain the complete workflow from source to metrics in a dashboard. Under the hood, the dbt models are generated so that a more experienced engineer can customize the pipelines at any time. In addition, engineers can easily convert dbt models into reusable "drag and drop" components.
-
What are your hottest dbt repositories in 2022 so far? Here are mine!
- 🧱 Kuwala: With Kuwala, a BI analyst can intuitively build advanced data workflows using a drag-and-drop interface on top of the modern data stack without coding. Behind the scenes, the dbt models are generated so that a more experienced engineer can customize the pipelines at any time.
-
Is Geoboundaries still a thing for GIS experts?
I then built this with a friend: https://github.com/kuwala-io/kuwala/tree/master/kuwala/pipelines/admin-boundaries . The script extracts the boundaries from OSM and cleans them (it forms the hierarchy and connects the shapes). However, it did not take off, and I did not have the feeling that this was, in the end, an interesting feature for the community. It would be wonderful to hear your feedback and maybe find someone to pick it up :-)
-
My open-source project: there shall be no difference between BI, Data Analysts and Data Engineer
Hi, we have a nice Slack channel here: https://kuwala-community.slack.com/ssb/redirect and my repo on GitHub is available here: https://github.com/kuwala-io/kuwala
-
I don't get the many shady location data providers when there are Google Popular Times and OpenStreetMap, which you can access with ease to draw similar conclusions.
**Global Admin Boundaries:** A huge problem that people often face when working with location data is aggregating the data into different geo-based slices (country level, admin level, or even smaller sub-districts). Here is a repo that cleans the OpenStreetMap data for geo boundaries worldwide, from very broad to very fine granularity --> https://github.com/kuwala-io/kuwala/blob/master/kuwala/pipelines/admin-boundaries/README.md
When it comes to location data, you should know OpenStreetMap. It's the biggest database with meta info on locations. It's not perfect, but big companies like Mapbox, Apple, and Microsoft rely on it. Since the API is kind of messy, you can use this repository to load whole cities' information smoothly into Postgres --> https://github.com/kuwala-io/kuwala/blob/master/kuwala/pipelines/osm-poi/README.md
What are some alternatives?
ArchivesSpace - ArchivesSpace, the archives management tool
uawardata - The data behind uawardata.com
ArchiveBox - Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
mara-pipelines - A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
lightdash - Self-serve BI to 10x your data team ⚡
Access to Memory (AtoM) - Open-source, web application for archival description and public access.
dbt-fal - do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning models.
Collective Access: Providence - Cataloguing and data/media management application
webcrumbs - Build, re(use) and share your own JavaScript plugins that effortlessly match your website's style. Star to support our work!
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
re_data - re_data - fix data issues before your users & CEO would discover them