paradedb
septum
| | paradedb | septum |
|---|---|---|
| Mentions | 16 | 15 |
| Stars | 3,962 | 369 |
| Growth | 11.0% | - |
| Activity | 9.8 | 6.4 |
| Last commit | 4 days ago | 2 months ago |
| Language | Rust | Ada |
| License | GNU Affero General Public License v3.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
paradedb
- Using ClickHouse to scale an events engine
- Code Search Is Hard
  Elasticsearch is good, and it does scale, but it is much more cumbersome and expensive to scale and operate than Postgres. If you use the managed service, you'll pay for the operational pain in the form of higher pricing.
  The Postgres movement is strong, and extensions like ParadeDB (https://github.com/paradedb/paradedb) are designed specifically to solve this pain point. (Disclaimer: I work for ParadeDB.)
- Ask HN: Best way to mirror a Postgres database to parquet?
  No timeline yet, but we know it's a high-priority feature and are working hard on it. The best way to follow along is to join our Slack (link here: https://github.com/paradedb/paradedb/blob/dev/README.md). It will be in the coming weeks/months, though.
- Transforming Postgres into a Fast OLAP Database
  You're right. We're working on this currently. You can track the issue here: https://github.com/paradedb/paradedb/issues/717
- We built our customer data warehouse all on Postgres
  There are definitely ways to cleanly make Postgres scale for analytics. We didn't discuss them in this blog post, but we will be writing about them in the future. For example, check out what the folks at ParadeDB are doing: https://github.com/paradedb/paradedb. Neon is doing an awesome job separating compute from storage. Supabase-contributed foreign data wrappers make it super easy to read from S3 into Postgres. Lots of great work going on out there :)
- Show HN: Pg_analytics – Speed Up Postgres Analytical Queries by 94x
- Multi-Database Support in DuckDB
  Check out https://github.com/paradedb/paradedb/tree/dev/pg_analytics; we're shipping this week.
- ParadeDB – PostgreSQL for Search
- Postgresql index
  Shameless plug, but I'm one of the makers of `pg_bm25` (https://github.com/paradedb/paradedb). We're making a faster tsvector/tsrank as a Postgres extension. Maybe it can help: our benchmarks show much faster performance, especially as row count increases.
- Building an open source vector database. Looking for advice.
septum
- Code Search Is Hard
  https://github.com/pyjarrett/septum
  The hardest part about getting code search right imo is grabbing the right amount of surrounding context, which septum is aimed at solving on a per-file basis.
  Another one I'm surprised hasn't been mentioned is stack-graphs (https://github.com/github/stack-graphs), which tries to incrementally resolve symbolic relationships across the whole codebase. It powers GitHub's cross-file precise indexing and conceptually makes a lot of sense, though I've struggled to get the open source version to work.
- Getting up to speed on a c++ codebase
  septum - interactive searching for contexts matching and excluding parameters
- Getting Ada into the mainstream (Dec 1990 edition ^^)
  I do a lot of weird and experimental work in Ada. Some of it works, whereas a lot of it doesn't. While I have done this sort of work in Python, Ruby, Rust, C or C++ in the past, when I do it in Ada, I end up saving time later on since the language forces many "good practices."
- Septum 0.0.7 released (experimental Mac support)
  I'd appreciate any issues or suggestions you want to report on GitHub to help me improve this.
- Septum: Context-based code search tool
- Zig self hosted compiler is now capable of building itself
  Ada is another option without a GC. I wrote a search tool for large codebases with it (https://github.com/pyjarrett/septum), and the easy multitasking and pinning to CPUs allows you to easily go wide if the problem you're solving supports it.
  There's very little allocation since it supports returning VLAs (like strings) from functions via a secondary stack. Its Alire tool does the toolchain install and provides package management, so trying the language out is super easy. I've done a few bindings to things in C with it, which is ridiculously easy.
- April 2022 What Are You Working On?
  I mentioned my project Septum in a Hacker News comment, which caused it to pick up over 200 GitHub stars. That seemed to give Ada some publicity since it's a general purpose tool, so I'll also publish a new up-to-date version (0.0.6) here soon.
- Ask HN: How do you search large code-base before adding a feature or fixing bug?
  I work on code bases with millions of lines, so I wrote a tool called Septum to help me (https://github.com/pyjarrett/septum/). This isn't meant to replace grep, ripgrep, or The Silver Searcher; those are all great tools you should have!
  Septum is a neighborhood-based (context-based) search, so you can find contiguous groups of lines which contain specific things but exclude other things. It's also interactive, so you can add and remove filters as needed. This makes it useful for cases where terms change meaning based on their context, since you can exclude terms related to the contexts you don't want to keep. It reads .septum/config, which contains its normal commands to load directories and settings, so you can have a different config for each project you're working on.
- Ada Crate of the Year: Interactive code search
  Here's a short demo video of his Septum tool mentioned in the article: https://asciinema.org/a/459292
- What Did You Work On in 2021?
  I also did a few things:
  - Wrote an online e-book about Ada
  - Septum - context-based source code search for multi-million line codebases (I use this nearly every day at work. It's being submitted as my Ada crate of the year.)
  - dir_iterators - library similar to the incredible walkdir.
  - project_indicators - library for spinners and progress bars.
  - trendy_terminal - library for cross-platform terminal setup, VT100 support, and GNU readline-like behavior.
  - trendy_test - library for simple unit testing, which runs tests in parallel.
  - Ada Ray Tracer - an Ada port of Ray Tracing in One Weekend.
  - dirs_to_graphviz - Make graphviz files from directory trees.
  - rst_tables - a tool to draw RST table outlines.
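
The "Ask HN" mention above describes Septum's search as neighborhood based: it finds contiguous groups of lines that contain some terms and exclude others. A minimal sketch of that idea follows. It is not Septum's actual implementation (Septum is written in Ada and is interactive); the names `Context`, `find_contexts`, and `width`, and the plain substring matching, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A contiguous group of lines (hypothetical type for this sketch)."""
    start: int  # first line index, inclusive, 0-based
    end: int    # last line index, inclusive


def find_contexts(lines, include, exclude=(), width=2):
    """Return contexts that contain every `include` term and no `exclude` term."""
    # 1. Seed a window of +/- `width` lines around each line matching any include term.
    windows = []
    for i, line in enumerate(lines):
        if any(term in line for term in include):
            windows.append((max(0, i - width), min(len(lines) - 1, i + width)))

    # 2. Merge overlapping or adjacent windows into contiguous groups.
    merged = []
    for start, end in windows:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # 3. Keep only groups with all include terms and none of the exclude terms.
    results = []
    for start, end in merged:
        text = "\n".join(lines[start:end + 1])
        if all(t in text for t in include) and not any(t in text for t in exclude):
            results.append(Context(start, end))
    return results


source = [
    "def load_config(path):",
    "    data = read_file(path)",
    "    return parse(data)",
    "",
    "",
    "",
    "def test_load_config():",
    "    cfg = load_config('x.toml')",
    "    assert cfg is not None",
]

# Find uses of load_config, but drop neighborhoods that look like test code.
hits = find_contexts(source, include=["load_config"], exclude=["assert"], width=1)
print(hits)  # [Context(start=0, end=1)]
```

Here the definition at the top of the file is kept, while the neighborhood around the test function is dropped because it contains the excluded term. Filtering whole neighborhoods rather than single lines is what lets an exclusion remove a match whose nearby lines show it belongs to an unwanted context.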
What are some alternatives?
MeiliSearch - A lightning-fast search API that fits effortlessly into your apps, websites, and workflow
liburing-ada - liburing/io_uring bindings for Ada
tantivy - Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
ews - The Embedded Web Server is designed for use in embedded systems with limited resources (eg, no disk). It supports both static (converted from a standard web tree, including graphics and Java class files) and dynamic pages. It is written in GCC Ada.
prism - Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python.
hound - Lightning fast code searching made easy
retake - PostgreSQL for Search [Moved to: https://github.com/paradedb/paradedb]
Ada_GUI - An Ada-oriented GUI
bionicgpt - BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality [Moved to: https://github.com/bionic-gpt/bionic-gpt]
ada-ray-tracer
qdrant - Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Ada-SPARK-Crate-Of-The-Year