recap
Work with your web service, database, and streaming schemas in a single format. (by recap-build)
amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data. (by amundsen-io)
Metric | recap | amundsen
---|---|---
Mentions | 2 | 7
Stars | 305 | 4,293
Growth | 0.0% | 1.0%
Activity | 8.7 | 7.8
Last commit | about 2 months ago | 5 days ago
Language | Python | Python
License | MIT License | Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
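As a rough illustration of the Growth figure, the sketch below computes a month-over-month percentage from two star counts. The exact formula the site uses is not stated here, so the simple relative-change formula, the function name, and the star counts are all assumptions made for illustration.

```python
def month_over_month_growth(stars_now: int, stars_last_month: int) -> float:
    """Return the percentage change in stars relative to last month's count.

    Assumption: growth is the simple relative change; the site may compute it differently.
    """
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


# Hypothetical star counts, chosen only for illustration.
growth = month_over_month_growth(stars_now=1010, stars_last_month=1000)
print(f"{growth:.1f}%")  # prints "1.0%"
```

Under this assumption, a project that gains no stars over the month shows 0.0% growth.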
recap
Posts with mentions or reviews of recap.
We have used some of these posts to build our list of alternatives
and similar projects.
- Recap: A Python library for describing database tables and serialization formats with minimal type coercion.
The GitHub repo: https://github.com/recap-build/recap
- Recap – A Data Catalog for Machines
amundsen
Posts with mentions or reviews of amundsen.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-07-19.
- Quick Start Guide to Amundsen Demo 🚀
We'll be using WSL2 for this guide, and we'll start by cloning this repo and its submodules:
- Apache Atlas or OpenMetaData?
You can use the Amundsen databuilder to send data to Apache Atlas: https://github.com/amundsen-io/amundsen/blob/main/databuilder/example/scripts/sample_atlas_search_extractor.py If you don't have to configure Apache Atlas, then why not, but server-side validation was absent the last time I used it: you couldn't validate the JSON body sent to the REST API endpoints.
- Searching for Delta Lake Cataloging
Other than that, maybe you could try Amundsen (https://github.com/amundsen-io/amundsen/issues/608), which now has a connector to extract Delta Lake metadata via Spark.
- Help with Data Discoverability in a Data Lake
- Launch YC S21: Meet the Batch, Thread #6
How does it differ from something like Amundsen: https://github.com/amundsen-io/amundsen
- Metadata and how to capture it
Metadata engines: Datahub (https://github.com/linkedin/datahub), Amundsen (https://github.com/amundsen-io/amundsen/), Marquez (https://marquezproject.github.io/), Egeria - Open Metadata and Governance (https://egeria.odpi.org)
- The State of Data Engineering in 2021
A final category worth highlighting is Discovery, where it seems every notable company developed an internal Data Catalogue tool that now is available as an open-source or paid service. Some examples are Amundsen (Lyft), Datahub (LinkedIn), Metacat (Netflix), Databook (Uber), and Dataportal (Airbnb).
What are some alternatives?
When comparing recap and amundsen you can also consider the following projects:
datahub - The Metadata Platform for your Data Stack
OpenLineage - An Open Standard for lineage metadata collection
marquez - Collect, aggregate, and visualize a data ecosystem's metadata
metacat
sickbeard_mp4_automator - Automatically convert video files to a standardized format with metadata tagging to create a beautiful and uniform media library
Medusa - Building blocks for digital commerce
amundsendatabuilder - Data ingestion library for Amundsen to build graph and search index
ytmdl - A simple app to get songs from YouTube in mp3 format with artist name, album name etc from sources like iTunes, Spotify, LastFM, Deezer, Gaana etc.