wtpsplit vs lance
| | wtpsplit | lance |
|---|---|---|
| Mentions | 1 | 10 |
| Stars | 499 | 3,296 |
| Growth | - | 3.4% |
| Activity | 7.4 | 9.8 |
| Last commit | 5 days ago | 1 day ago |
| Language | Python | Rust |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wtpsplit
- Typo correction using NLP
  Source: I'm the author of nlprule and nnsplit which are quite well used for grammatical error correction and sentence boundary detection, respectively.
lance
- The Nimble File Format by Meta
- Supabase Storage: now supports the S3 protocol
  you should look at lance (https://lancedb.github.io/lance/)
- Understanding Parquet, Iceberg and Data Lakehouses
  Parquet has been the lakehouse file format of choice for nearly half a decade. But we are starting to see other contenders that are optimized more for lower latency like lance https://github.com/lancedb/lance
- FLaNK Stack Weekly for 12 June 2023
- FLaNK Stack 5-June-2023
- [Show HN] Lance is a Rust-based alternative to Parquet for ML data
- Show HN: Lance is a Rust-based alternative to Parquet for ML data
  getting bunch of 404s on the docs. for example https://eto-ai.github.io/lance/format.html (But this works: https://lancedb.github.io/lance/*)
  Did you guys just pivot from eto-ai to lancedb?
- Any job processing framework like Spark but in Rust?
  For Feature Stores check out: https://github.com/eto-ai/lance
- Show HN: Lance – Deep Learning with DuckDB and Arrow
What are some alternatives?
SymSpell - SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
roop - one-click face swap
tangram - Tangram makes it easy for programmers to train, deploy, and monitor machine learning models.
deeplake - Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
tch-rs - Rust bindings for the C++ api of PyTorch.
Lixur - Lixur is an open-sourced project that seeks to build a scalable, feeless, decentralized, quantum-secure, and easy-to-use blockchain with smart, and intelligent (A.I.) contract functionality.
NLP-progress - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
tangram - Tangram is an all-in-one automated machine learning framework. [Moved to: https://github.com/tangramdotdev/tangram]
Rio - A hardware-accelerated GPU terminal emulator focusing to run in desktops and browsers.
nlprule - A fast, low-resource Natural Language Processing and Text Correction library written in Rust.
chatdocs - Chat with your documents offline using AI.