Metric | resin | lucene |
---|---|---|
Mentions | 2 | 15 |
Stars | 569 | 2,925 |
Growth | 0.2% | 1.9% |
Activity | 4.6 | 9.8 |
Last commit | 12 days ago | 3 days ago |
Language | C# | Java |
License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
resin
- Ask HN: May I sell the copyright to my code?
I've built a search engine. It has its own query language, data structures and binary file formats, it's MIT licensed and it has around 60 forks. Nobody uses it, though, even though it has been around for some years. It was not until recently that I managed to solve the last of the most crucial bugs, so I don't find that surprising at all.
It works well now for a Wikipedia-sized, text-based corpus, even though it's still in beta and contains code that can still be optimized. However, before I go any further with the project, I'd like to see if I can sell it, the copyright to my code, that is. Because maybe I want to be in the business of creating smart code, then sell it, then move on to the next thing? And maybe some company would like to have a search engine in their software portfolio? Suppose we meet, have a drink, see what happens.
Do I have the rights to sell it, though?
I'm the author of 99.999% of the commits.
https://github.com/kreeben/resin
- Show HN: Hardware-accelerated vector-based search engine for image and text
lucene
- Unveiling Apache Lucene: Open Source Innovation, Funding, and Community
Apache Lucene began as a research-driven project that quickly established itself as a critical component for implementing high-performance text searching in diverse applications. Its evolutionary journey is marked by continuous improvements, bolstered by contributions from a worldwide community. The project’s development is transparently showcased on its official GitHub repository, where passionate developers, testers, and system architects work together to enhance its robust indexing and search capabilities. What sets Apache Lucene apart is its open source business model—a model driven by community participation and corporate sponsorship. This approach not only fuels rapid innovation but also provides a sustainable framework for long-term project maintenance. The dual benefit of cutting-edge innovation and financial sustainability has led many companies, including major tech players, to adopt and support Apache Lucene as the backbone of their search functionalities.
- Lucene and I
I’ve been a Lucene-ite since Doug Cutting’s creation migrated from Sourceforge to Jakarta. “I Love[d] Lucene” so much that I volunteered a large part of my time over 14 months of my life (and shout out to Otis and Mike for sharing the arduous grinds) to co-author two editions of Lucene in Action. The process of discovering a really cool, super powerful, and easy to use full text search library, realizing the word about it needed to get out widely, had me dig deep into the community and codebase with my Apache hat on, tinkering, contributing, committing, and generally loitering on the shoulders of folks way smarter than me.
- Don't defer Close() on writable files
"if you are creating a file, to ensure full synchronisation you also need to fsync the parent directory, otherwise the file can be fsynced but the update to the directory lost."
And if you need this in Java you still have to resort to ugly hacks.
https://github.com/apache/lucene/issues/7231
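For readers wondering what such a hack might look like, here is a minimal sketch, not taken from the linked issue. It assumes a POSIX-like platform where a directory can be opened as a read-only `FileChannel` and forced to disk. The class and method names (`DurableFile`, `create`, `fsyncDirectory`) are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: names are illustrative, not part of any library.
final class DurableFile {

    /** Creates a new file, then fsyncs both the file and its parent directory. */
    static Path create(Path dir, String name, byte[] data) throws IOException {
        Path file = dir.resolve(name);
        // try-with-resources closes the channel and propagates any close() error
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush file content and metadata to the device
        }
        // Without this step the new directory entry itself may be lost on a crash.
        fsyncDirectory(dir);
        return file;
    }

    /**
     * Best-effort directory fsync. Opening a directory as a FileChannel is
     * platform-dependent (assumption: it works on Linux and macOS, and fails
     * with an IOException on Windows), so failure is reported, not thrown.
     */
    static boolean fsyncDirectory(Path dir) {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("durable-demo");
        Path file = create(dir, "segment.bin", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + Files.size(file) + " bytes to " + file.getFileName());
    }
}
```

Whether a failed directory sync should be ignored or treated as fatal depends on the application; this sketch only reports it through the return value.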
- No SNAPSHOTs
Even ASF does not use Maven to build some of its projects anymore: Beam, Groovy, Lucene, Geode, POI, and Solr are not built with Maven. Those are not the most popular ASF projects, I know, but still, it is something.
- Building an efficient sparse keyword index in Python
First, a review of the landscape. As said in the introduction, there aren't a ton of good options. Apache Lucene is by far the best traditional search index from a speed, performance and functionality standpoint. It's the base for Elasticsearch/OpenSearch and many other projects. But it requires Java.
- Java Panama Vector API Integrated with Apache Lucene
https://github.com/apache/lucene/issues/10047
2. The Panama Vector API allows CPUs that support it to accelerate vector operations: https://openjdk.org/jeps/438
So this allows fast ANN on Lucene for semantic search!
How did people do this before Lucene supported it? Only through entirely different tools?
- What Is a Vector Database
Are they forking Lucene or somehow getting the Lucene devs to increase that limit? Because this PR has been open for over a year now: https://github.com/apache/lucene/issues/11507
- An alternative to Elasticsearch that runs on a few MBs of RAM
- Lucene 9.4 (optionally) uses Panama's mapped MemorySegments when JDK 19 is detected
What are some alternatives?
kitten - A statically typed concatenative systems programming language.
OpenSearch - 🔎 Open source distributed and RESTful search engine.
Studybyte - Studybyte is a search engine designed to help students find educational content effortlessly.
pisa - PISA: Performant Indexes and Search for Academia
RarbgAdvancedSearch - Rarbg Advanced Search is an advanced search tool for the popular torrent site Rarbg
RoaringBitmap - A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others