| | incubator-gluten | narrator |
|---|---|---|
| Mentions | 3 | 5 |
| Stars | 988 | 4,265 |
| Growth | 3.0% | - |
| Activity | 9.9 | 6.4 |
| Last commit | 7 days ago | 17 days ago |
| Language | Scala | Python |
| License | Apache License 2.0 | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
incubator-gluten
- A glimpse into the future of data processing infrastructure.
  When I first learned about the Gluten project from Intel, I thought Databricks was going to be in trouble.
- FLaNK Stack for 04 December 2023
- Blaze: Fast query execution engine for Apache Spark
  Interesting, looks like it is just DataFusion engine for Spark. There is a similar project: https://github.com/oap-project/gluten - it brings ClickHouse as an engine to Spark.
narrator
- FLaNK Stack for 04 December 2023
- David Attenborough narrates your life
- David Attenborough is now narrating my life
  This is hilarious. Can’t wait for the Werner Herzog version.
  And you can use it yourself: https://github.com/cbh123/narrator
What are some alternatives?
LearningSparkV2 - This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Jlama - Jlama is a modern Java inference engine for LLMs
opaque-sql - An encrypted data analytics platform
langchain4j - Java version of LangChain
blaze - Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.
nougat - Implementation of Nougat Neural Optical Understanding for Academic Documents
blaze - NumPy and Pandas interface to Big Data
onnx-models - A copy of ONNX models, datasets, and code all in one GitHub repository. Follow the README to learn more.
Jupyter Scala - A Scala kernel for Jupyter
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
kyuubi - Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
PyMuPDF - PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.