datahub VS OpenMetadata

Compare datahub vs OpenMetadata and see what their differences are.

OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration. (by open-metadata)
               datahub              OpenMetadata
Mentions       35                   29
Stars          9,977                5,653
Growth         1.4%                 3.9%
Activity       10.0                 10.0
Last commit    3 days ago           3 days ago
Language       Java                 TypeScript
License        Apache License 2.0   Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

datahub

Posts with mentions or reviews of datahub. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-10.

OpenMetadata

Posts with mentions or reviews of OpenMetadata. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-07-17.
  • Show HN: OpenMetadata – OSS platform for data discovery observability governance
    2 projects | news.ycombinator.com | 17 Jul 2024
    * It seems like DataHub has an async Kafka ingestion approach while OpenMetadata is API

    We do not use Kafka by default. If someone needs Kafka, they can add it. However, for metadata APIs, we do not feel that Kafka is needed. A lot of projects are becoming dependent on Kafka and calling it real-time. It's an unnecessary burden on users who are going to operate it in production; for 99% of use cases Kafka is not needed, coming from a Kafka committer :)

    2. Yes, all of our APIs and entity definitions are generated using JsonSchema. For us, JsonSchema has been awesome: all of our backend, ingestion, and UI are generated from JsonSchema, and it's easy to extend and add new models when needed.

    3. IMO, we have much more coverage; you can look at the types available here https://github.com/open-metadata/OpenMetadata/tree/main/open... and we have supported JsonSchema as a type for a long time.

  • OpenMetadata: Join the #1 Open Source Data Community
    1 project | news.ycombinator.com | 20 Jun 2024
  • How to Dynamically Adjust the Height of a Textarea in ReactJS
    1 project | dev.to | 25 Oct 2023
    In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.
  • Blog - Project Nessie: A Look in the Depths
    1 project | /r/bigdata | 11 Jul 2023
    How does this compare with https://github.com/open-metadata/OpenMetadata
  • What is your favorite data catalog?
    2 projects | /r/dataengineering | 25 Jun 2023
    u/cmcau try https://open-metadata.org, it is much easier to set up; for details see https://docs.open-metadata.org and for any support https://slack.open-metadata.org
  • Data Governance Hands On with Amazon DataZone
    1 project | dev.to | 22 May 2023
    Then, a pool of tools appeared on the market with features that cover some of the challenges cited, especially those related to data cataloging. Informatica's tool is perhaps the best known among the licensed ones. Among the open source tools, I highlight Data Hub (www.datahubproject.io), developed at LinkedIn, Open Metadata (https://open-metadata.org/), and Amundsen (https://www.amundsen.io/), powered by Lyft. In addition to cataloging and discovering data artifacts, these tools allow for a view of data lineage, including technical documentation and business terms, and building relationships between data artifacts. It is also possible to register data owners, the people responsible for the data, in those tools. This greatly facilitates the access request and evaluation process (which today is a major bottleneck).
  • What OSS are you using for data contracts?
    1 project | /r/dataengineering | 3 May 2023
    Probably, in order to have it integrate with tools like OpenLineage and OpenMetadata and such I will have to make open-source contributions.
  • Thoughts around decube.io (data observability and catalog platform)
    1 project | /r/dataengineering | 4 Apr 2023
    We are the team behind OpenMetadata. Our mission is to build a centralized metadata platform that offers data discovery, collaboration, governance, and quality. We believe that having a separate tool for each of these categories results not only in user frustration but also in metadata silos.
  • Great expectations?
    1 project | /r/dataengineering | 4 Apr 2023
    Has anyone ever tried open metadata for data QA testing? Curious about that: https://open-metadata.org/
  • Our data catalog is difficult to manage and not built for the wider org - what can we do?
    4 projects | /r/dataengineering | 10 Mar 2023
    We're looking to PoC https://open-metadata.org/ shortly

What are some alternatives?

When comparing datahub and OpenMetadata you can also consider the following projects:

amundsen - Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

marquez - Collect, aggregate, and visualize a data ecosystem's metadata

OpenLineage - An Open Standard for lineage metadata collection

odd-platform - First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

atlas - Manage your database schema as code

Hyperactive - An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

metacat

Deeplearning4j - Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...

Atlas - 🚀 An open and lightweight modification to Windows, designed to optimize performance, privacy and usability.

big-data-pipeline-lambda-arch - A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra.

monosi - Open source data observability platform

awesome-wardley-maps - Wardley maps community hub. Useful Wardley mapping resources


Did you know that Java is the 8th most popular programming language based on number of mentions?