import jinja2

template = jinja2.Template("""
# Awesome Big Data

A curated list of awesome big data frameworks, libraries, software and resources.

Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
""")
template.render()
ListItem(
    'Kubernetes',
    'https://kubernetes.io/',
    'Container Engines and Orchestration',
    """Kubernetes is an open-source container-orchestration system for automating computer application deployment, scaling, and management."""
),
ListItem(
    'Podman',
    'https://podman.io/',
    'Container Engines and Orchestration',
    """Podman is a daemonless, open-source, Linux-native tool designed to make it easy to find, run, build, share and deploy applications using Open Containers Initiative (OCI) Containers and Container Images."""
),
# Data Storage :: Block Storage
ListItem(
    'Amazon EBS',
    'https://aws.amazon.com/ebs/',
    'Data Storage :: Block Storage',
    """Amazon Elastic Block Store (Amazon EBS) is an easy-to-use, scalable, high-performance block-storage service designed for Amazon Elastic Compute Cloud (Amazon EC2)."""
),
ListItem(
    'OpenEBS',
    'https://openebs.io/',
    'Data Storage :: Block Storage',
    """OpenEBS is an open-source container-attached storage solution for Kubernetes that provides persistent block storage for stateful workloads."""
),
# Data Storage :: Cluster Storage
ListItem(
    'Ceph',
    'https://ceph.io/en/',
    'Data Storage :: Cluster Storage',
    """Ceph is an open-source software storage platform that implements object storage on a single distributed computer cluster and provides 3-in-1 interfaces for object-, block- and file-level storage."""
),
ListItem(
    'Hadoop Distributed File System',
    'https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html',
    'Data Storage :: Cluster Storage',
    """The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware."""
),
# Data Storage :: Object Storage
ListItem(
    'Amazon S3',
    'https://aws.amazon.com/s3/',
    'Data Storage :: Object Storage',
    """Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services that provides scalable object storage through a web service interface."""
)
ListItem(name='Apache Superset', website='https://superset.apache.org/', category='Visualization Frameworks', short_description='Apache Superset is an open-source software cloud-native application for data exploration and data visualization able to handle data at petabyte scale.'),
ListItem(name='Apache Spark', website='https://spark.apache.org/', category='Batch Processing', short_description='Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.'),
ListItem(name='Apache Hive', website='https://hive.apache.org/', category='Interactive Query', short_description='Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop.'),
ListItem(name='CouchDB', website='https://couchdb.apache.org/', category='NoSQL :: Document Databases', short_description='Apache CouchDB is an open-source document-oriented NoSQL database, implemented in Erlang. CouchDB uses multiple formats and protocols to store, transfer, and process its data. It uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API.'),
ListItem(name='Apache Beam', website='https://beam.apache.org/', category='Batch Processing', short_description='Apache Beam is an open-source unified programming model to define and execute data processing pipelines, including ETL, batch and stream processing.'),
ListItem(name='AWS Lambda', website='https://aws.amazon.com/lambda/', category='Serverless Functions', short_description='AWS Lambda is an event-driven, serverless computing platform provided by Amazon as a part of Amazon Web Services. It is a computing service that runs code in response to events and automatically manages the computing resources required by that code.'),
ListItem(name='Apache Airflow', website='https://airflow.apache.org/', category='Workflow Engine', short_description='Apache Airflow is an open-source workflow management platform for data engineering pipelines.'),
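The snippets above call `ListItem` but never show its definition. The following is a minimal sketch of what such a container might look like and how a list of entries could be grouped into Markdown sections. The field names come from the keyword arguments used above. The dataclass itself, the `render_markdown` helper, and the output layout are assumptions made for illustration, not the project's actual code. The sketch uses only the standard library rather than the Jinja2 template shown earlier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ListItem:
    """One list entry; field order matches the positional calls above (assumed)."""
    name: str
    website: str
    category: str
    short_description: str


def render_markdown(items):
    """Hypothetical helper: group items by category and emit Markdown bullets."""
    by_category = defaultdict(list)
    for item in items:
        by_category[item.category].append(item)

    lines = []
    for category in sorted(by_category):
        lines.append(f"## {category}")
        # Sort entries alphabetically within each category.
        for item in sorted(by_category[category], key=lambda i: i.name.lower()):
            lines.append(f"- [{item.name}]({item.website}) - {item.short_description}")
        lines.append("")
    return "\n".join(lines)


items = [
    ListItem(name='Apache Spark', website='https://spark.apache.org/',
             category='Batch Processing',
             short_description='Apache Spark is an open-source unified analytics engine for large-scale data processing.'),
    ListItem(name='Apache Beam', website='https://beam.apache.org/',
             category='Batch Processing',
             short_description='Apache Beam is an open-source unified programming model for data processing pipelines.'),
    ListItem(name='Apache Airflow', website='https://airflow.apache.org/',
             category='Workflow Engine',
             short_description='Apache Airflow is an open-source workflow management platform for data engineering pipelines.'),
]

print(render_markdown(items))
```

Because the fields are declared in the order `name`, `website`, `category`, `short_description`, both the positional and the keyword call styles seen above would construct the same object under this sketch.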