Big Data Is Dead

duckdb-wasm

Mentions: 13 | Stars: 975 | Activity: 9.5 | Language: C++

WebAssembly version of DuckDB

I have been seeing overengineering around "big" data tools and pipelines for many years. For a lot of use cases, data warehouses and data lakes are only in the gigabyte or single-digit terabyte range, so their architecture could be much simpler, e.g. running DuckDB on a decent EC2 instance.
In my experience, doing this yields query results faster than some other systems can even start executing the query (yes, I'm looking at you, Athena).
I even think that a lot of queries can be run from a browser nowadays, which is why I created https://sql-workbench.com/ with the help of DuckDB WASM (https://github.com/duckdb/duckdb-wasm) and perspective.js (https://github.com/finos/perspective).

sql-workbench

Mentions: 5 | Stars: 15 | Activity: 2.6

Public issue-tracking and feature suggestion for sql-workbench.com

polars

Mentions: 146 | Stars: 27,185 | Activity: 10.0 | Language: Rust

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Polars: an alternative to Pandas
2 projects | /r/datasciencebr | 13 Jun 2023

I used multiprocessing and multithreading at the same time to drop the execution time of my code from 155+ seconds to just over 2+ seconds
1 project | /r/Python | 29 May 2023

Test On 4 Concurrent Jobs Using Python-Polars 0.17.11 to GroupBy Billion Rows
3 projects | /r/Python | 7 May 2023

Welcome to InfluxDB IOx: InfluxData’s New Storage Engine
5 projects | news.ycombinator.com | 26 Oct 2022

Working with more than 10gb csv
3 projects | /r/datascience | 5 Oct 2022

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Post date: 27 May 2024
