Metric | synthesizer | thinc |
---|---|---|
Mentions | 4 | 4 |
Stars | 566 | 2,804 |
Growth | - | 0.5% |
Activity | 10.0 | 7.3 |
Last commit | 5 months ago | 11 days ago |
Language | Python | Python |
License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
synthesizer
- Phibrarian Alpha - the first model checkpoint from SciPhi's Mistral-7b
The run is a few days in on an 8x 80GB A100 cluster, and I quietly released the first-epoch checkpoint here. I am building the model in association with our synthetic data efforts here at SciPhi.
- With LLMs we can create a fully open-source Library of Alexandria.
I am updating because we have another interesting result: by going deeper instead of broader, and by combining new techniques like RAG, we can make incredibly descriptive textbooks. This one here was generated by an almost fully AI-driven pipeline. The pipeline goes MIT OCW -> Syllabus -> Table of Contents -> Textbook. The last step is grounded through vector lookups over the whole of Wikipedia.
- Textbook was authored with an AI pipeline
- Looking for fine-tuners who want to build an exciting new model
The timing is great, because yesterday I introduced RAG into the synthetic generation pipeline [here](https://github.com/emrgnt-cmplxty/sciphi/tree/main). I'm in the process of indexing the entirety of this PyPI dataset using ChromaDB in the cloud. It will be relatively easy to plug this into SciPhi when done.
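The comments above describe a generation step that is grounded through vector lookups over a reference corpus. As a rough illustration of that idea only, here is a minimal sketch using a toy word-count embedding and cosine similarity from the standard library. It is not SciPhi's code: the function names (`embed`, `retrieve`, `build_prompt`) and the sample passages are hypothetical, and a real pipeline would presumably use a learned embedding model and a vector store such as the ChromaDB index mentioned above.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(section_title, sources):
    """Assemble a generation prompt that quotes the retrieved sources (hypothetical format)."""
    context = "\n".join(f"- {s}" for s in sources)
    return f"Write the textbook section '{section_title}' using only these sources:\n{context}"


# Hypothetical mini-corpus standing in for the indexed reference text.
passages = [
    "A binary search tree keeps keys in sorted order for fast lookup.",
    "The French Revolution began in 1789.",
    "Gradient descent updates parameters along the negative gradient.",
]

top = retrieve("binary search tree lookup", passages, k=1)
prompt = build_prompt("Binary search trees", top)
```

The resulting `prompt` would then be passed to whatever language model writes the section; that call is left out here because the source does not describe it.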
thinc
- JAX – NumPy on the CPU, GPU, and TPU, with great automatic differentiation
Agree, though I wouldn’t call PyTorch a drop-in for NumPy either. CuPy is the drop-in. Excepting some corner cases, you can use the same code for both. Thinc’s ops work with both NumPy and CuPy:
https://github.com/explosion/thinc/blob/master/thinc/backend...
- Tinygrad: A simple and powerful neural network framework
I love those tiny DNN frameworks; some examples that I studied in the past (I still use PyTorch for work-related projects):
thinc, by the creators of spaCy: https://github.com/explosion/thinc
- good examples of functional-like python code that one can study?
thinc: defining neural nets in a functional way. jax: a new deep learning framework that puts emphasis on functions rather than tensors. I've tested it for a couple of applications and it's really cool; you can write code the way you'd write math expressions in papers, using NumPy. That speeds up development significantly and makes code much more readable.
- thinc - A refreshing functional take on deep learning, compatible with your favorite libraries
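Two ideas come up in the comments above: writing array code once so it runs on either NumPy or CuPy, and defining networks by composing functions. The sketch below illustrates both in plain NumPy. It is an assumption-laden toy, not Thinc's API: the `chain` helper here is a hypothetical stand-in that only composes forward functions, and the `xp` parameter is simply whichever array module the caller passes in.

```python
import numpy as np


def relu(x, xp=np):
    """Elementwise ReLU, using only functions that NumPy and CuPy both provide."""
    return xp.maximum(x, 0)


def softmax(x, xp=np):
    """Row-wise softmax written against the shared NumPy/CuPy interface."""
    shifted = x - xp.max(x, axis=-1, keepdims=True)  # subtract the row max for stability
    exps = xp.exp(shifted)
    return exps / xp.sum(exps, axis=-1, keepdims=True)


def chain(*layers):
    """Compose layer functions left to right (hypothetical helper, forward pass only)."""
    def forward(x, xp=np):
        for layer in layers:
            x = layer(x, xp=xp)
        return x
    return forward


# The model is just a composition of functions.
model = chain(relu, softmax)

scores = np.array([[-1.0, 0.0, 2.0], [3.0, 3.0, 3.0]])
probs = model(scores)  # each row of probs sums to 1
```

Assuming CuPy is installed and a GPU is available, the same `model` could be called as `model(cupy.asarray(scores), xp=cupy)`, since every call inside the layers exists in both libraries.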
What are some alternatives?
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
quantulum3 - Library for unit extraction - fork of quantulum for python3
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
pytorch-lightning - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. [Moved to: https://github.com/PyTorchLightning/pytorch-lightning]
horovod - Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
pytorch-lightning - Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
extending-jax - Extending JAX with custom C++ and CUDA code
stableagents - Stable, Semi-Autonomous, Reliable and Steerable LLM Agents for production use cases.
dm-haiku - JAX-based neural network library
cover-agent - CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
AIF360 - A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.