perspective
ClickHouse
Metric | perspective | ClickHouse
---|---|---
Mentions | 32 | 110
Stars | 5,203 | 26,938
Growth | 1.3% | 1.5%
Activity | 9.6 | 10.0
Last commit | about 14 hours ago | 1 day ago
Language | C++ | C++
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
perspective
-
Ask HN: Who is hiring? (February 2023)
We're looking for senior product managers and engineers of all experience levels to build the next generation of collaborative data visualization. At the Prospective Co., you'll contribute to our existing open-source project as well as help design our enterprise offering.
https://perspective.finos.org/
We're looking for any of:
- Familiarity with WebAssembly, data visualization, WebGL/OpenGL, data science, Jupyter/notebook, web/desktop/mobile UI development, compiler/language or database design, financial services.
- Primary stack is Rust (targeting WebAssembly). JavaScript, C++ and Python are a big plus.
- We <3 GitHub contributors - opt to discuss your GitHub work in lieu of a technical interview.
Contact [email protected]
- NYC Slice
- Data Visualization Framework for React, Angular, Svelte, TypeScript, JavaScript
- Nocodb: Turns Any MySQL, Postgres, SQLite into a Spreadsheet with REST APIs
- Ask HN: Who is hiring? (October 2022)
- Ask HN: Who is hiring? (September 2022)
-
Official /r/rust "Who's Hiring" thread for job-seekers and job-offerers [Rust 1.63]
DESCRIPTION: We're looking for senior product managers and engineers of all experience levels to build the next generation of collaborative data visualization. At the Prospective Co., you'll contribute to our existing open-source project (Perspective https://perspective.finos.org/) as well as help design our enterprise offering.
We're looking for any of:
- Familiarity with WebAssembly, data visualization, WebGL/OpenGL, data science, Jupyter/notebook, web/desktop/mobile UI development, compiler/language or database design, financial services.
- Primary stack is Rust (targeting WebAssembly, especially Yew). JavaScript, C++ and Python are a big plus.
- We <3 GitHub contributors - opt to discuss your GitHub work in lieu of a technical interview.
- Show HN: Grid.js – Advanced table library that works everywhere (2020)
- Memray is a memory profiler for Python by Bloomberg
-
Is React performant enough for trading applications?
Thank you. I saw this, https://github.com/finos/perspective
ClickHouse
-
Float Compression 3: Filters
It is interesting to compare this with observations from using ClickHouse[1][2] for time series in practice:
1. Reordering to SOA helps a lot - this is the whole point of column-oriented databases.
2. Specialized codecs like Gorilla[3], DoubleDelta[4], and FPC[5] lose to simply using ZSTD[6] compression in most cases, both in compression ratio and in performance.
3. Specialized time-series DBMS like InfluxDB or TimescaleDB lose to general-purpose relational OLAP DBMS like ClickHouse [7][8][9].
[1] https://clickhouse.com/blog/optimize-clickhouse-codecs-compr...
[2] https://github.com/ClickHouse/ClickHouse
[3] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[4] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[5] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[6] https://github.com/facebook/zstd/
[7] https://arxiv.org/pdf/2204.09795.pdf "SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things" (2022)
[8] https://gitlab.com/gitlab-org/incubation-engineering/apm/apm... https://gitlab.com/gitlab-org/incubation-engineering/apm/apm...
[9] https://www.sciencedirect.com/science/article/pii/S187705091...
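Point 1 above can be tried out on synthetic data. The sketch below is not ClickHouse code: the row layout (`sensor_id`, `status`, `value`) is made up for illustration, and `zlib` from the Python standard library stands in for ZSTD. It packs the same rows once row by row (array of structures) and once column by column (structure of arrays), then compresses both.

```python
import random
import struct
import zlib


def make_rows(n, seed=0):
    """Hypothetical readings: (sensor_id, status, value)."""
    rng = random.Random(seed)
    # sensor_id changes rarely, status is constant, value is noisy.
    return [(i // 1000, 0, rng.random()) for i in range(n)]


def pack_aos(rows):
    """Array of structures: the fields of each row are stored together."""
    return b"".join(struct.pack("<iid", *row) for row in rows)


def pack_soa(rows):
    """Structure of arrays: each column is stored contiguously."""
    n = len(rows)
    sensor_ids = struct.pack(f"<{n}i", *(r[0] for r in rows))
    statuses = struct.pack(f"<{n}i", *(r[1] for r in rows))
    values = struct.pack(f"<{n}d", *(r[2] for r in rows))
    return sensor_ids + statuses + values


rows = make_rows(10_000)
aos, soa = pack_aos(rows), pack_soa(rows)
aos_size = len(zlib.compress(aos))
soa_size = len(zlib.compress(soa))
print(f"raw={len(aos)}  AoS compressed={aos_size}  SoA compressed={soa_size}")
```

On this data the column layout should come out smaller, because the two low-cardinality columns turn into long runs once they are stored contiguously. Real data and a real codec will give different ratios.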
-
Features I'd Like in PostgreSQL
Simply by larger sizes of compressed blocks, which are limited to page size in Postgres, and by improving the data locality by sorting, which is inherent for LSM-trees.
But if you want higher compression, you need to consider column-oriented DBMS, such as ClickHouse[1]. They are unbeatable in terms of data compression.
[1] https://github.com/ClickHouse/ClickHouse
Disclaimer: I'm a developer of ClickHouse.
-
I redesigned and open-sourced my SC2 search engine! Now featuring interactive filtering, fuzzy matching and search categories
Database: ClickHouse (OLAP DB) via Tinybird
-
anyone have experience writing data to parquet files? Is there a better alternative for storing large amounts of financial tick data?
clickhouse
-
Setting the TZ environment variable avoids thousands of system calls
Syscalls can be heavier than expected. One example is when an application is run inside gVisor. Another example is when a lot of eBPF code is attached. A third example is when a program is run under strace.
Disclaimer: I'm working on ClickHouse[1], and it is used by thousands of companies in unimaginable environments. It has to work in every possible condition... That's why we set the TZ variable at startup and also embed the timezones into the binary. And we don't use the glibc functions for timezone operations because they are astonishingly slow.
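The "set TZ at startup" idea can be sketched in a few lines. This is only an illustration of the startup pattern in Python, not what ClickHouse does internally; the helper name `pin_timezone` and the `UTC` default are assumptions. Whether a given runtime actually saves system calls depends on which C library time functions it ends up calling.

```python
import os
import time


def pin_timezone(default="UTC"):
    """Set TZ once at startup unless the user already chose a value."""
    os.environ.setdefault("TZ", default)
    time.tzset()  # Unix only: make the C library re-read TZ
    return os.environ["TZ"]


if __name__ == "__main__":
    print("TZ =", pin_timezone())
    print(time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Using `setdefault` keeps an explicit `TZ` from the environment intact, so the program only fills in a value when none was provided.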
-
Ask HN: What's your favorite illustration in Computer Science?
https://en.wikipedia.org/wiki/De_Bruijn_sequence
I want to make use of it in ClickHouse, but we have not done so yet; see https://github.com/ClickHouse/ClickHouse/issues/41195
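For readers who have not met the structure, here is a minimal sketch of the standard recursive construction of a de Bruijn sequence B(k, n) by concatenating Lyndon words. It only shows what the sequence is; it says nothing about how the linked ClickHouse issue proposes to use it.

```python
def de_bruijn(k, n):
    """Return a de Bruijn sequence B(k, n) over the alphabet 0..k-1."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            # A Lyndon word whose length divides n contributes to the result.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


if __name__ == "__main__":
    print("".join(map(str, de_bruijn(2, 3))))  # prints 00010111
```

Read cyclically, every window of length n in the result is a distinct word, so all k**n words appear exactly once.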
- Faster PostgreSQL to BigQuery
-
Efficient and performance-portable vector software
It is a nice paper, but for practical applications on fixed-size numbers up to 64 bits, as well as for tuples of fixed-size numbers, and for partial sorting, it does not beat Radix Sort (on moderate-sized arrays).
I tested it in ClickHouse, but ended up with this:
https://github.com/ClickHouse/ClickHouse/blob/master/src/Com...
PS. It's strange that Google's paper dismisses djbsort: https://sorting.cr.yp.to/ - also in the class of sorting networks, but introduced a few years ago.
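As a reminder of what radix sort does for fixed-size keys, here is a minimal least-significant-digit sketch for unsigned 64-bit integers. It is a textbook version written for this page, not the ClickHouse implementation behind the truncated link above, and the function name and `bits_per_pass` parameter are assumptions.

```python
def radix_sort_u64(values, bits_per_pass=8):
    """LSD radix sort for integers in the range [0, 2**64)."""
    mask = (1 << bits_per_pass) - 1
    out = list(values)
    for shift in range(0, 64, bits_per_pass):
        # Histogram of the current digit.
        counts = [0] * (mask + 1)
        for v in out:
            counts[(v >> shift) & mask] += 1
        # Exclusive prefix sums give each bucket's starting offset.
        total = 0
        for d in range(mask + 1):
            counts[d], total = total, total + counts[d]
        # Stable scatter into the buckets.
        tmp = [0] * len(out)
        for v in out:
            d = (v >> shift) & mask
            tmp[counts[d]] = v
            counts[d] += 1
        out = tmp
    return out
```

Each pass is a stable counting sort on one digit, so the number of passes is fixed by the key width rather than by the number of elements.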
-
Show HN: HyperLogLog in Zig
[1] https://github.com/ClickHouse/ClickHouse/
What is often forgotten when designing a data structure for a cardinality estimator is that it should work well not only for a few large states but also for a large number of small sets.
For example, in a query like the following:
SELECT URL, COUNT(DISTINCT UserID) FROM pageviews GROUP BY URL
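One common way to serve both cases is a state that stays exact while the set is small and only allocates a fixed-size sketch once it grows. The following is a minimal illustration under stated assumptions: the class name, the `exact_limit` threshold, the BLAKE2b hash, and the plain HyperLogLog estimator are all choices made for this sketch, not a description of ClickHouse's aggregate functions.

```python
import hashlib
import math


class AdaptiveDistinctCounter:
    """Exact for small sets, HyperLogLog after exact_limit distinct items."""

    def __init__(self, exact_limit=16, precision=12):
        self.exact_limit = exact_limit
        self.p = precision
        self.m = 1 << precision
        self.small = set()     # exact state (stores 64-bit hashes)
        self.registers = None  # allocated only when the set grows

    @staticmethod
    def _hash64(item):
        digest = hashlib.blake2b(repr(item).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _add_hash(self, h):
        idx = h >> (64 - self.p)                      # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def add(self, item):
        h = self._hash64(item)
        if self.registers is not None:
            self._add_hash(h)
            return
        self.small.add(h)
        if len(self.small) > self.exact_limit:
            # Convert the exact state into the approximate one.
            self.registers = [0] * self.m
            for x in self.small:
                self._add_hash(x)
            self.small = None

    def count(self):
        if self.registers is None:
            return len(self.small)
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        estimate = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * m and zeros:
            estimate = m * math.log(m / zeros)  # small-range correction
        return round(estimate)


def count_distinct_by_key(pairs):
    """Rough equivalent of COUNT(DISTINCT item) ... GROUP BY key."""
    counters = {}
    for key, item in pairs:
        if key not in counters:
            counters[key] = AdaptiveDistinctCounter()
        counters[key].add(item)
    return {key: c.count() for key, c in counters.items()}
```

With many URLs that each have only a handful of visitors, every group stays in the small exact state and never pays for the register array.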
-
Show HN: ClickHouse-local – a small tool for serverless data analytics
ClickHouse will throw an exception when there is not enough memory and continue to serve other queries. Under certain configurations it can OOM as well.
> How confident are you that your chosen dataset is neutral?
I have no idea if it is "neutral", I picked it randomly.
I test ClickHouse on every interesting dataset, see here: https://github.com/ClickHouse/ClickHouse/issues?q=is%3Aissue...
The reason - I love working with data :) If I see a dataset, I load it into ClickHouse - this is the first thing I do. This is not a kind of marketing or promotion of ClickHouse - you know, if it were some directed task, it would be uninteresting for me.
What are some alternatives?
Trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
loki - Like Prometheus, but for logs.
VictoriaMetrics - VictoriaMetrics: fast, cost-effective monitoring solution and time series database
duckdb - DuckDB is an in-process SQL OLAP Database Management System
TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
RocksDB - A library that provides an embeddable, persistent key-value store for fast storage.
arrow-datafusion - Apache Arrow DataFusion SQL Query Engine
materialize - Materialize is a fast, distributed SQL database built on streaming internals.
PostgreSQL - Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch
MongoDB Libbson
TileDB - The Universal Storage Engine
Adminer - Database management in a single PHP file