C++ Analytics

Open-source C++ projects categorized as Analytics

Top 10 C++ Analytic Projects

  • ClickHouse

    ClickHouse® is a free analytics DBMS for big data

    Project mention: Float Compression 3: Filters | news.ycombinator.com | 2023-02-01

    Interesting to match with the observations from the practice of using ClickHouse[1][2] for time series:

    1. Reordering to SOA helps a lot - this is the whole point of column-oriented databases.

    2. Specialized codecs like Gorilla[3], DoubleDelta[4], and FPC[5] lose to simply using ZSTD[6] compression in most cases, both in compression ratio and in performance.

    3. Specialized time-series DBMS like InfluxDB or TimescaleDB lose to general-purpose relational OLAP DBMS like ClickHouse [7][8][9].

    [1] https://clickhouse.com/blog/optimize-clickhouse-codecs-compr...

    [2] https://github.com/ClickHouse/ClickHouse

    [3] https://clickhouse.com/docs/en/sql-reference/statements/crea...

    [4] https://clickhouse.com/docs/en/sql-reference/statements/crea...

    [5] https://clickhouse.com/docs/en/sql-reference/statements/crea...

    [6] https://github.com/facebook/zstd/

    [7] https://arxiv.org/pdf/2204.09795.pdf "SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things" (2022)

    [8] https://gitlab.com/gitlab-org/incubation-engineering/apm/apm...

    [9] https://www.sciencedirect.com/science/article/pii/S187705091...
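Point 1 of the comment above (reordering to SOA) can be illustrated with a small sketch. This is not ClickHouse code: the `Sample` and `SampleColumns` types and the `to_columns`, `delta_encode`, and `delta_decode` functions are hypothetical names made up for illustration. Plain delta encoding is used here as a simpler relative of the DoubleDelta codec the comment mentions, and the sketch assumes that differences between neighbouring timestamps fit in a 64-bit signed integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row type: one time-series sample in array-of-structures (AoS) form.
struct Sample {
    std::int64_t timestamp_ms;
    double value;
};

// The same data in structure-of-arrays (SoA) form: one contiguous array per column.
struct SampleColumns {
    std::vector<std::int64_t> timestamp_ms;
    std::vector<double> value;
};

// Transpose rows into columns so that values of the same kind sit next to each other.
SampleColumns to_columns(const std::vector<Sample>& rows) {
    SampleColumns cols;
    cols.timestamp_ms.reserve(rows.size());
    cols.value.reserve(rows.size());
    for (const Sample& row : rows) {
        cols.timestamp_ms.push_back(row.timestamp_ms);
        cols.value.push_back(row.value);
    }
    return cols;
}

// Delta-encode a column: the first element is stored as-is (difference from 0),
// every later element is stored as the difference from its predecessor.
// Assumption: the differences do not overflow std::int64_t.
std::vector<std::int64_t> delta_encode(const std::vector<std::int64_t>& column) {
    std::vector<std::int64_t> deltas;
    deltas.reserve(column.size());
    std::int64_t previous = 0;
    for (std::int64_t current : column) {
        deltas.push_back(current - previous);
        previous = current;
    }
    return deltas;
}

// Inverse of delta_encode: a running sum restores the original column.
std::vector<std::int64_t> delta_decode(const std::vector<std::int64_t>& deltas) {
    std::vector<std::int64_t> column;
    column.reserve(deltas.size());
    std::int64_t running = 0;
    for (std::int64_t delta : deltas) {
        running += delta;
        column.push_back(running);
    }
    return column;
}
```

Once the timestamps sit in their own column, regularly spaced samples turn into a run of small, repetitive deltas, which is the kind of input a general-purpose compressor such as ZSTD tends to handle well. Whether a specialized codec beats plain ZSTD on a given dataset is an empirical question, as point 2 of the comment suggests.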

  • duckdb

    DuckDB is an in-process SQL OLAP Database Management System

    Project mention: F/OSS Spotlight: 🦆 DuckDB | dev.to | 2023-01-24

    DuckDB is the SQLite of OLAP queries.

  • perspective

    A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

    Project mention: Ask HN: Who is hiring? (February 2023) | news.ycombinator.com | 2023-02-01

    We're looking for senior product managers and engineers of all experience levels to build the next generation of collaborative data visualization. At the Prospective Co., you'll contribute to our existing open-source project as well as help design our enterprise offering.

    https://perspective.finos.org/

    We're looking for any of:

    - Familiarity with WebAssembly, data visualization, WebGL/OpenGL, data science, Jupyter/notebook, web/desktop/mobile UI development, compiler/language or database design, finance services.

    - Primary stack is Rust (targeting WebAssembly). JavaScript, C++ and Python are a big plus.

    - We <3 GitHub contributors - opt to discuss your GitHub work in lieu of a technical interview.

    Contact [email protected]

  • velox

    A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.

    Project mention: Velox: An open source unified execution engine | news.ycombinator.com | 2022-09-01

  • stonedb

    StoneDB is an open-source MySQL HTAP and MySQL-native database for OLTP and real-time analytics, a counterpart of MySQL HeatWave. (https://stonedb.io)

    Project mention: Show HN: An Open-Source MySQL HTAP Database for OLTP, Real-Time Analytics | news.ycombinator.com | 2022-12-14

  • clp

    Compressed Log Processor (CLP) is a free tool capable of compressing text logs and searching the compressed logs without decompression. (by y-scope)

    Project mention: FOSS, cloud native, log storage and query engine build with Apache Arrow & Parquet, written in Rust and React. | reddit.com/r/rust | 2022-10-01

    Thoughts on integrating CLP with this infra? Not sure whether this even makes sense to try? LINK

  • oneDAL

    oneAPI Data Analytics Library (oneDAL)

    Project mention: Is there a no-compromise (presumably C/C++) platform similar to Apache Spark? | reddit.com/r/dataengineering | 2022-07-27

  • mc2

    A Platform for Secure Analytics and Machine Learning

    Project mention: Intel deprecates SGX on Core series processors | news.ycombinator.com | 2022-04-15

    Analytics and ML on confidential data are some interesting server side use cases. See the MC2 open source project, for example: https://github.com/mc2-project/mc2

  • QtFirebase

    An effort to bring Google's Firebase C++ API to Qt + QML

    Project mention: [Weekly] What is everybody working on? Share your progress, discoveries, tips and tricks! | reddit.com/r/QtFramework | 2022-06-29

    Have you seen https://github.com/Larpon/QtFirebase ?

  • nebula

    A distributed block-based data storage and compute engine (by varchar-io)

    Project mention: Show HN: Turn any data into a fast analytical API | news.ycombinator.com | 2022-04-10

    we use our in-house baked engine - open sourced here https://github.com/varchar-io/nebula

    Yeah, Tinybird has lots of similarities, I will do more research on it, thanks for the reference.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-02-01.

Index

What are some of the best open-source Analytic projects in C++? This list will help you:

#    Project       Stars
1    ClickHouse    26,938
2    duckdb         8,117
3    perspective    5,203
4    velox          2,061
5    stonedb          687
6    clp              537
7    oneDAL           537
8    mc2              266
9    QtFirebase       266
10   nebula           131