BigQuery

Top 23 BigQuery Open-Source Projects

  • Hasura

    Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.

  • Project mention: Serious flaws in SQL – Edgar F. Codd (1990) | news.ycombinator.com | 2024-04-25

    > 2. ORMs do not hide SQL nastiness.

    This is certainly true!

    I mean: ORMs are now well known to "make the easy queries slightly more easy, while making intermediate queries really hard and complex queries impossible".

    I think the era of ORMs is over. It simply did not deliver.

    If a book on SQL is --say-- 100 pages, a book on Hibernate is 400 pages. So much to learn just to make the easy queries slightly easier to type? Just not worth it.

    I prefer jooq any day over ORMs. And don't get me started on what tools like Hasura have to offer.

    There are also some languages (forgot the names) that are SQL-done-right. Select in the back, more type safe, more logic, more in the same steps as the query gets executed. These need to be adopted by PG and MySQL and we're good to go. (IMHO)

    https://www.jooq.org/

    https://hasura.io/

  • Redash

    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

  • Project mention: Redash: Connect to data source, easily visualize, dashboard and share your data | news.ycombinator.com | 2024-03-20
  • cube.js

    📊 Cube — The Semantic Layer for Building Data Applications

  • Project mention: MQL – Client and Server to query your DB in natural language | news.ycombinator.com | 2024-04-07

    I should have clarified. There's a large number of apps that are:

    1. taking info strictly from SQL (e.g. information_schema, query history)

    2. taking a user input / question

    3. writing SQL to answer that question

    An app like this is what I call "text-to-sql". Totally agree a better system would pull in additional documentation (which is what we're doing), but I'd no longer consider it "text-to-sql". In our case, we're not even directly writing SQL, but rather generating semantic layer queries (i.e. https://cube.dev/).
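The three steps above can be sketched as a small pipeline. This is a minimal illustration, not the commenter's system: it uses SQLite's `sqlite_master` and `PRAGMA table_info` as a stand-in for `information_schema`, and `generate_sql` is a hypothetical callable that represents whatever model turns the prompt into SQL.

```python
import sqlite3


def describe_schema(conn):
    """Step 1: collect table and column names from the database catalog."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        col_desc = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        lines.append(f"{table}({col_desc})")
    return "\n".join(lines)


def build_prompt(schema, question):
    """Step 2: combine the schema description with the user's question."""
    return f"Schema:\n{schema}\n\nQuestion: {question}\nSQL:"


def text_to_sql(conn, question, generate_sql):
    """Step 3: ask the (hypothetical) generator for SQL, then run it."""
    prompt = build_prompt(describe_schema(conn), question)
    sql = generate_sql(prompt)
    return conn.execute(sql).fetchall()
```

In a real application `generate_sql` would call a language model and its output would need validation before execution; here any function that maps a prompt string to a SQL string will do.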

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12

    I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.

    It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.

  • doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27

    As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly-released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
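The cost of query-time parsing can be illustrated with a toy comparison. This is only a sketch of the general idea, not how Apache Doris implements its Variant type: the field names (`latency_ms`) and helper functions are hypothetical. The first function re-parses every JSON row on each query; the second approach parses once at ingestion and keeps one list per field, so later queries only touch the column they need.

```python
import json


def avg_latency_from_json(raw_rows):
    """Parse every raw JSON row at query time, then aggregate one field."""
    values = [json.loads(row)["latency_ms"] for row in raw_rows]
    return sum(values) / len(values)


def extract_columns(raw_rows, fields):
    """Parse each row once at ingestion and keep one list per field."""
    columns = {field: [] for field in fields}
    for row in raw_rows:
        doc = json.loads(row)
        for field in fields:
            # Missing fields become None, since there is no fixed schema.
            columns[field].append(doc.get(field))
    return columns


def avg_latency_from_columns(columns):
    """Aggregate directly over the pre-extracted column, with no parsing."""
    values = columns["latency_ms"]
    return sum(values) / len(values)
```

With the columnar layout, repeated queries skip `json.loads` entirely and read only the fields they reference, which is the kind of saving the passage attributes to moving away from raw JSON storage.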

  • cloudquery

    The open source high performance ELT framework powered by Apache Arrow

  • Project mention: We might want to regularly keep track of how important each server is | news.ycombinator.com | 2024-02-06

    Check out CloudQuery - https://github.com/cloudquery/cloudquery for an easy cloud asset inventory.

  • growthbook

    Open Source Feature Flagging and A/B Testing Platform

  • Project mention: GrowthBook: Open-source feature flagging and A/B testing platform | /r/opensource | 2023-10-20
  • sqlglot

    Python SQL Parser and Transpiler

  • Project mention: Transpile Any SQL to PostgreSQL Dialect | news.ycombinator.com | 2024-03-18

    Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects

    Tools like this are helpful for:

    - Rendering SQL in a consistent way, eg for snapshot testing

  • ibis

    the portable Python dataframe library

  • Project mention: Show HN: Hashquery, a Python library for defining reusable analysis | news.ycombinator.com | 2024-04-23

    I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]

    0: https://ibis-project.org/

  • franchise

    🍟 a notebook sql client. what you get when you have a lot of sequels.

  • Rudderstack

    Privacy and Security focused Segment-alternative, in Golang and React

  • Project mention: Rudderstack Switches to Elastic License | news.ycombinator.com | 2023-09-08
  • jitsu

    Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set-up a real-time data pipeline in minutes, not days

  • tbls

    tbls is a CI-Friendly tool for documenting a database, written in Go.

  • Project mention: FLaNK 25 December 2023 | dev.to | 2023-12-26
  • ethereum-etl

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

  • Project mention: Blockchain transactions decoding: making wallet activity understandable | dev.to | 2023-10-27

    An event is a log entity that EVM smart contracts can emit during transaction execution. Events are very good at signalling that some action has taken place on-chain. Applications can subscribe and listen to events to trigger off-chain logic, or they can index, transform and store events in off-chain storage (see The Graph protocol or Ethereum ETL).
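The "index, transform and store" path can be sketched for one common event type. This is a minimal example, not code from Ethereum ETL: it assumes logs arrive as dictionaries with `address`, `topics` and `data` hex strings (the shape typically returned by JSON-RPC nodes), and it handles only the standard ERC-20 `Transfer(address,address,uint256)` event. The table name and helper functions are hypothetical.

```python
import sqlite3

# Well-known topic0 for the ERC-20 event Transfer(address,address,uint256).
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(log):
    """Decode an ERC-20 Transfer log into (token, sender, recipient, value)."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None  # not an ERC-20 Transfer event
    # Indexed address arguments are left-padded to 32 bytes; keep the last 20.
    sender = "0x" + topics[1][-40:]
    recipient = "0x" + topics[2][-40:]
    # The non-indexed uint256 amount is carried in the data field.
    value = int(log["data"], 16)
    return log["address"], sender, recipient, value


def index_logs(conn, logs):
    """Store decoded transfers in off-chain storage; return how many were stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transfers "
        "(token TEXT, sender TEXT, recipient TEXT, value TEXT)"
    )
    decoded = [row for row in map(decode_transfer, logs) if row is not None]
    # Values are stored as text because uint256 can exceed SQLite's 64-bit integers.
    conn.executemany(
        "INSERT INTO transfers VALUES (?, ?, ?, ?)",
        [(token, sender, recipient, str(value)) for token, sender, recipient, value in decoded],
    )
    return len(decoded)
```

A subscriber would call `index_logs` on each batch of logs it receives; the same decoded tuples could equally be loaded into a warehouse table instead of SQLite.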

  • professional-services

    Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

  • Scio

    A Scala API for Apache Beam and Google Cloud Dataflow.

  • ingestr

    ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

  • Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04
  • elementary

    The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

  • logica

    Logica is a logic programming language that compiles to SQL. It runs on Google BigQuery, PostgreSQL and SQLite.

  • Project mention: Prolog language for PostgreSQL proof of concept | news.ycombinator.com | 2024-03-30

    If you're interested in this I would also recommend you check out Logica[0], which is a datalog-like language that is explicitly made to compile to SQL queries.

    0: https://logica.dev/

  • peerdb

    A fast, simple and cost-effective tool to replicate data from Postgres to Data Warehouses, Queues and Storage

  • Project mention: Pgwire: a Rust library for PostgreSQL compatible application | news.ycombinator.com | 2024-03-20

    We at PeerDB (https://github.com/PeerDB-io/peerdb) were early adopters of Pgwire to implement our Postgres-compatible SQL Layer to do ETL. Very easy to work with. Saved us multiple months of effort to build it from scratch.

    Project mention: GitHub - swirlai/swirl-search: Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously, finds the best results using a reader LLM, then prompts Generative AI, enabling you to get answers based on your data. | /r/programming | 2023-12-05
  • DataflowTemplates

    Cloud Dataflow Google-provided templates for solving in-Cloud data tasks

  • bigquery-utils

    Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.

  • Project mention: Swirl: An open-source search engine with LLMs and ChatGPT to provide all the answers you need 🌌 | dev.to | 2023-09-06

    Using the Galaxy UI, knowledge workers can systematically review the best results from all configured services including Apache Solr, ChatGPT, Elastic, OpenSearch, PostgreSQL, Google BigQuery, plus generic HTTP/GET/POST with configurations for premium services like Google's Programmable Search Engine, Miro and Northern Light Research.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source BigQuery projects? This list will help you:

# Project Stars
1 Hasura 30,810
2 Redash 24,917
3 cube.js 17,135
4 airbyte 13,923
5 doris 11,314
6 cloudquery 5,581
7 growthbook 5,517
8 sqlglot 5,441
9 ibis 4,074
10 franchise 3,988
11 Rudderstack 3,926
12 jitsu 3,831
13 tbls 3,057
14 ethereum-etl 2,819
15 professional-services 2,723
16 Scio 2,520
17 ingestr 2,308
18 elementary 1,739
19 logica 1,680
20 peerdb 1,595
21 swirl-search 1,509
22 DataflowTemplates 1,089
23 bigquery-utils 1,028
