nomad
opteryx
nomad | opteryx | |
---|---|---|
5 | 1 | |
0 | 45 | |
- | - | |
4.1 | 9.8 | |
4 months ago | 5 days ago | |
Python | Python | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nomad
opteryx
-
Pure Python Distributed SQL Engine
Thanks for sharing.
I have a SQL Engine in Python too (https://github.com/mabel-dev/opteryx). I focused my initial effort on supporting SQL statements and making the usage feel like a database - that probably reflects the problem I had in front of me when I set out - only handling handfuls of gigabytes in a batch environment for ETLs with a group of new-to-data-engineering engineers. Have recently started looking more at real-time performance, such as distributing work. Am interesting in how you've approached.
What are some alternatives?
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
quokka - Making data lake work for time series
awesome-aws - A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
influxdb3-python - Python module that provides a simple and convenient way to interact with InfluxDB 3.0.
STARS - A multi-cloud DNS record scanner that aims to help cybersecurity/IT analysts identify dangling CNAME records in their cloud DNS services that could possibly lead to subdomain takeover scenarios.
pg8000 - A Pure-Python PostgreSQL Driver
prowler - Prowler is an Open Source Security tool for AWS, Azure, GCP and Kubernetes to do security assessments, audits, incident response, compliance, continuous monitoring, hardening and forensics readiness. Includes CIS, NIST 800, NIST CSF, CISA, FedRAMP, PCI-DSS, GDPR, HIPAA, FFIEC, SOC2, GXP, Well-Architected Security, ENS and more
datafusion-ballista - Apache Arrow Ballista Distributed Query Engine
cloud-custodian - Rules engine for cloud security, cost optimization, and governance, DSL in yaml for policies to query, filter, and take actions on resources
emr-serverless-samples - Example code for running Spark and Hive jobs on EMR Serverless.
gcp-storage-emulator - Local emulator for Google Cloud Storage
datafusion-python - Apache DataFusion Python Bindings