Shades of Open Source - Understanding The Many Meanings of "Open"

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • incubator-xtable

    Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

    In the world of table formats, there are three competing standards: Apache Iceberg, Apache Hudi, and Delta Lake, with two out of the three being Apache projects (and there is also Apache XTable for interoperability between these and future formats). For catalogs, options include Nessie, Gravitino, Polaris, and Unity Catalog, all of which are open source but not yet Apache projects.

  • polaris

    Polaris Catalog is an open source catalog for Apache Iceberg (by polaris-catalog)

  • MongoDB

    The MongoDB Database

    Many popular open source projects are beloved and closely tied to particular vendors. For example, web frameworks like React and Angular are associated with Meta and Google, respectively. Database software like MongoDB, Elasticsearch, and Redis are also tied to specific commercial entities but are widely used and praised for their functionality. When there is a clear driver of a project, it can offer some benefits:

  • dremio-oss

    Dremio - the missing link in modern data

    This practice, in itself, isn't inherently bad. Many businesses maintain commercial proprietary forks of open-source projects, but usually, the commercial version has a different name than the open-source project. For example, in the world of data catalogs, Dremio is the main developer of Nessie, and Snowflake drives Polaris. Both aim to become community-driven projects over time but will also drive integrated features in their respective commercial products under different names. For instance, if you set up your own Nessie catalog, it has a distinct name compared to the Dremio Enterprise Catalog (formerly Arctic) integrated into Dremio Cloud. The Dremio Enterprise Catalog is powered by Nessie but has additional features, so the different names prevent confusion about available features or which documentation to reference.

  • Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

    In contrast, Databricks maintains internal forks of Spark, Delta Lake, and Unity Catalog, using the same names for both the open-source versions and the features specific to the Databricks platform. While they do provide separate documentation, online discussions often reflect confusion about how to use features in the open-source versions that only exist on the Databricks platform. This creates a "muddying of the waters" between what is open and what is proprietary. This isn't an issue if you are a Databricks user, but it can be quite confusing for those who want to use these tools outside of the Databricks ecosystem.

  • Redis

    Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.

  • React

    The library for web and native user interfaces.

    In reality, independence isn't always crucial. Many open-source standards in web development, like React, are not Apache projects and are heavily directed by their creators, such as Meta. However, a web framework like React isn't responsible for the interoperability of web applications. Instead, long-standing standards like REST and HTTP serve as the glue that connects web applications across various backend languages, frontend frameworks, and more.

  • hudi

    Upserts, Deletes And Incremental Processing on Big Data.

  • Apache Arrow

    Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

    It's this kind of certainty that underscores the vital role of the Apache Software Foundation (ASF). Many first encounter Apache through its pioneering project, the Apache HTTP Server, an open-source web server that remains widely used in web operations today. The ASF was initially created to hold the intellectual property and assets of the Apache project, and it has since evolved into a cornerstone for open-source projects worldwide. The ASF enforces strict standards for diverse contributions, independence, and activity in its projects, so that they can withstand the test of time as standards in software development. Many open-source projects strive to become Apache projects to gain the community credibility necessary for adoption as standard software building blocks, such as Apache Tomcat for Java web applications, Apache Arrow for in-memory data representation, and Apache Parquet for data file formatting, among others.
