tensorflow
mesh-transformer-jax
Metric | tensorflow | mesh-transformer-jax
---|---|---
Mentions | 221 | 52
Stars | 182,456 | 6,213
Growth | 0.8% | -
Activity | 10.0 | 0.0
Last commit | 2 days ago | over 1 year ago
Language | C++ | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
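As a rough illustration of the Growth figure, the sketch below computes a month-over-month percentage change in stars. The page does not give its exact formula, so the simple percentage-change definition is an assumption, and the previous month's star count (181,008) is a hypothetical value chosen only to make the arithmetic concrete.

```python
def month_over_month_growth(stars_now: int, stars_last_month: int) -> float:
    """Percentage change in stars relative to the previous month.

    Assumption: plain percentage change; the site's real formula is not stated.
    """
    return (stars_now - stars_last_month) / stars_last_month * 100.0


# 182,456 is the star count from the table above.
# 181,008 is a hypothetical count for the previous month (not from the page).
growth = month_over_month_growth(182_456, 181_008)
print(f"{growth:.1f}%")
```

Under these assumed inputs the result rounds to the 0.8% shown in the table.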
tensorflow
- TensorFlow-metal on Apple Mac is junk for training
- 🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬
  To get up to speed with TensorFlow, check their quickstart. Support TensorFlow on GitHub ⭐
- One .gitignore to rule them all
- 10 Github repositories to achieve Python mastery
  Explore here.
- GitHub and Developer Ecosystem Control
  Part of the major userbase pull in GitHub revolves around hosting a considerable number of popular projects, including Angular, React, Kubernetes, cpython, Ruby, tensorflow, and, well, even Forem, the software that powers this site.
- Non-determinism in GPT-4 is caused by Sparse MoE
  Right, but that's not an inherent GPU determinism issue. It's a software issue.
  https://github.com/tensorflow/tensorflow/issues/3103#issueco... is correct that it's not necessary, it's a choice.
  Your line of reasoning appears to be "GPUs are inherently non-deterministic, don't be quick to judge someone's code", which as far as I can tell is dead wrong.
  Admittedly there are some cases and instructions that may result in non-determinism but they are inherently necessary. The author should think carefully before introducing non-determinism. There are many scenarios where it is irrelevant, but ultimately the issue we are discussing here isn't the GPU's fault.
- Can someone explain how keras code gets into the Tensorflow package?
  and things like y = layers.ELU()(y) work as expected. I wanted to see a list of the available layers so I went to the Tensorflow GitHub repository and to the keras directory. There's a warning in that directory that says:
- Is it even possible to design a ML model without using Python or MATLAB? Like using C++, C or Java?
  Exactly what language do you think TensorFlow is written in? :)
- How to do deep learning with Caffe?
  You can use Tensorflow's deep learning API for this.
- When the documentation has TODOs
  Since you've specifically mentioned ML, here's TensorFlow's GitHub. I'm sure a quick glance through that will change your mind.
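The non-determinism thread quoted above argues that run-to-run differences come from software choices rather than from the GPU itself. One well-known source of such differences is that floating-point addition is not associative, so a reduction whose accumulation order varies can return different results for the same inputs. The sketch below shows this in plain Python; it is an illustration of that general effect, not necessarily the mechanism the commenter had in mind, and the helper name `sum_in_order` is hypothetical.

```python
def sum_in_order(values, order):
    """Accumulate values in the given index order using float addition."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [1e16, 1.0, -1e16, 1.0]

# Same inputs, two accumulation orders.
forward = sum_in_order(values, [0, 1, 2, 3])    # the first 1.0 is absorbed by 1e16
reordered = sum_in_order(values, [0, 2, 1, 3])  # the large terms cancel first

print(forward, reordered)  # 1.0 2.0
```

Fixing the accumulation order makes the result reproducible, which is why determinism here is a design choice, usually traded against speed.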
mesh-transformer-jax
- Large Language Models: Comparing Gen2/Gen3 Models (GPT-3, GPT-J, MT5 and More)
  GPT-J is an LLM case study with two goals: training an LLM with a data source containing unique material, and using the training framework Mesh Transformer JAX to achieve high training efficiency through parallelization. There is no research paper about GPT-J, but on its GitHub pages, the model, different checkpoints, and the complete source code for training are given.
- [R] Parallel Attention and Feed-Forward Net Design for Pre-training and Inference on Transformers
  This idea has already been proposed in ViT-22B and GPT-J-6B.
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
- [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset
  Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
- Boss wants me to use ChatGPT for work, but I refuse to input my personal phone number. Any advice?
- Let's build GPT: from scratch, in code, spelled out by Andrej Karpathy
  You can skip to step 4 using something like GPT-J as far as I understand: https://github.com/kingoflolz/mesh-transformer-jax#links
  The pretrained model is already available.
- Best coding model?
  The Github repo suggests it's possible you can change the number of checkpoints to make it run on a GPU.
- Ask HN: What language models can I fine-tune at home?
- selfhosted/ open-source ChatGPT alternative?
  GPT-J, which uses mesh-transformer-jax: https://github.com/kingoflolz/mesh-transformer-jax
- GPT-J, an open-source alternative to GPT-3
  They hinted at it in the screenshot, but the goods are linked from the https://6b.eleuther.ai page: https://github.com/kingoflolz/mesh-transformer-jax#gpt-j-6b (Apache 2)
What are some alternatives?
PaddlePaddle - PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)
DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
LightGBM - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
KoboldAI-Client
scikit-learn - scikit-learn: machine learning in Python
alpaca-lora - Instruct-tune LLaMA on consumer hardware
LightFM - A Python implementation of LightFM, a hybrid recommendation algorithm.
Finetune_LLMs - Repo for fine-tuning Causal LLMs