tensorflow
mesh-transformer-jax
| | tensorflow | mesh-transformer-jax |
|---|---|---|
| Mentions | 221 | 52 |
| Stars | 181,467 | 6,169 |
| Growth | 0.7% | - |
| Activity | 10.0 | 0.0 |
| Latest commit | 7 days ago | about 1 year ago |
| Language | C++ | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tensorflow
-
🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬
To get up to speed with TensorFlow, check their quickstart. Support TensorFlow on GitHub ⭐
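For reference, the quickstart boils down to a few lines of Keras. This is only a rough sketch of what that tutorial covers; the dataset, layer sizes, and epoch count are illustrative rather than prescriptive:

```python
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Small feed-forward classifier in the style of the TF2 quickstart.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```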
- One .gitignore to rule them all
-
10 Github repositories to achieve Python mastery
Explore here.
-
GitHub and Developer Ecosystem Control
Part of GitHub's major user-base pull revolves around hosting a considerable number of popular projects, including Angular, React, Kubernetes, cpython, Ruby, tensorflow, and even the software that powers this site, Forem.
-
Non-determinism in GPT-4 is caused by Sparse MoE
Right but that's not an inherent GPU determinism issue. It's a software issue.
https://github.com/tensorflow/tensorflow/issues/3103#issueco... is correct that it's not necessary, it's a choice.
Your line of reasoning appears to be "GPUs are inherently non-deterministic, so don't be quick to judge someone's code," which as far as I can tell is dead wrong.
Admittedly, there are some cases and instructions that may result in non-determinism and are inherently necessary, but the author should think carefully before introducing non-determinism. There are many scenarios where it is irrelevant, but ultimately the issue we are discussing here isn't the GPU's fault.
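For what it's worth, recent TensorFlow releases expose that choice directly. A minimal sketch, assuming TF 2.8 or later:

```python
import tensorflow as tf

# Seed Python, NumPy, and TensorFlow RNGs in one call.
tf.keras.utils.set_random_seed(42)

# Opt into deterministic kernels where they exist; ops that have no
# deterministic implementation will raise an error instead of silently
# producing run-to-run differences.
tf.config.experimental.enable_op_determinism()
```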
-
Can someone explain how keras code gets into the Tensorflow package?
and things like y = layers.ELU()(y) work as expected. I wanted to see a list of the available layers so I went to the Tensorflow GitHub repository and to the keras directory. There's a warning in that directory that says:
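You don't actually need to browse the repository tree for that; the installed package can be inspected directly. A quick sketch (the Input shape below is an arbitrary example):

```python
import tensorflow as tf
from tensorflow.keras import layers

# List the public names exported by tf.keras.layers.
available = [name for name in dir(layers) if not name.startswith("_")]
print(len(available), "public names, e.g.:", available[:10])

# The pattern from the question works the same way via the re-export:
x = tf.keras.Input(shape=(8,))
y = layers.ELU()(x)
print(y.shape)
```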
-
How to do deep learning with Caffe?
You can use Tensorflow's deep learning API for this.
-
Ask HN: What is a AI chip and how does it work?
This is indeed the bread-and-butter, but all sorts of standard linear algebra algorithms are used. You can check the various XLA-related (accelerated linear algebra) folders in the tensorflow repo, or the torch folders in pytorch, to see what is used [1], [2].
[1] https://github.com/tensorflow/tensorflow/tree/8d9b35f442045b...
[2] https://github.com/pytorch/pytorch/blob/6e3e3dd477e0fb9768ee...
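As a concrete illustration of the XLA point in [1], TensorFlow can route a computation through that compiler with `jit_compile=True`. A minimal sketch (the shapes are arbitrary):

```python
import tensorflow as tf

# jit_compile=True compiles the function with XLA (accelerated linear
# algebra), which can fuse the matmul, bias add, and activation into
# fewer device kernels.
@tf.function(jit_compile=True)
def fused_matmul(a, b, c):
    return tf.nn.relu(tf.matmul(a, b) + c)

a = tf.random.normal((256, 512))
b = tf.random.normal((512, 128))
c = tf.random.normal((128,))
print(fused_matmul(a, b, c).shape)
```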
-
Mastering Data Science: Top 10 GitHub Repos You Need to Know
2. TensorFlow: Developed by the Google Brain team, TensorFlow is a powerful open-source machine learning framework that’s perfect for deep learning and neural network projects. With TensorFlow, you can build and train complex models using an intuitive and flexible API, making it an essential tool for any data scientist looking to delve into deep learning.
-
Tensorflow V2 - LSTM Penn Tree Bank Dataset
I found the official Tensorflow V1 code from a Github branch here (https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/models/rnn/ptb/ptb_word_lm.py). All code necessary to run that file is in the /ptb folder (except data).
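There doesn't appear to be an official V2 port linked here, but the architecture from ptb_word_lm.py translates fairly directly to Keras. A rough sketch (the layer sizes and the synthetic batch are placeholders, not the tutorial's hyperparameters):

```python
import tensorflow as tf

vocab_size, embed_dim, hidden_dim, num_steps = 10000, 200, 200, 35

# Embedding -> stacked LSTMs -> logits over the vocabulary,
# mirroring the structure of the old r0.7 PTB example.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.LSTM(hidden_dim, return_sequences=True),
    tf.keras.layers.LSTM(hidden_dim, return_sequences=True),
    tf.keras.layers.Dense(vocab_size),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# x: (batch, num_steps) token ids; y: the same sequence shifted by one step.
x = tf.random.uniform((32, num_steps), maxval=vocab_size, dtype=tf.int32)
y = tf.random.uniform((32, num_steps), maxval=vocab_size, dtype=tf.int32)
model.fit(x, y, epochs=1)
```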
mesh-transformer-jax
-
Large Language Models: Comparing Gen2/Gen3 Models (GPT-3, GPT-J, MT5 and More)
GPT-J is an LLM case study with two goals: training an LLM on a data source containing unique material, and using the training framework Mesh Transformer JAX to achieve high training efficiency through parallelization. There is no research paper about GPT-J, but its GitHub pages provide the model, different checkpoints, and the complete source code for training.
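To make the parallelization idea concrete: mesh-transformer-jax shards both the data and the model itself across a TPU/GPU device mesh. The toy sketch below shows only the simpler data-parallel half of that idea, using jax.pmap; it is not code from the repo, and the tiny linear model is purely illustrative:

```python
from functools import partial

import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@partial(jax.pmap, axis_name="batch")   # one replica of the step per device
def train_step(params, x, y):
    grads = jax.grad(loss_fn)(params, x, y)
    # Average gradients across devices so all replicas stay in sync.
    grads = jax.lax.pmean(grads, axis_name="batch")
    return jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)

n_dev = jax.local_device_count()
params = jax.device_put_replicated(
    {"w": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}, jax.local_devices()
)
x = jnp.ones((n_dev, 8, 4))   # leading axis = device axis
y = jnp.ones((n_dev, 8, 1))
params = train_step(params, x, y)
```

The actual repo additionally splits the transformer's weights across the mesh (model parallelism), which is what makes a 6B-parameter model trainable on TPU pods.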
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
-
[D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset
Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
-
Let's build GPT: from scratch, in code, spelled out by Andrej Karpathy
You can skip to step 4 using something like GPT-J as far as I understand: https://github.com/kingoflolz/mesh-transformer-jax#links
The pretrained model is already available.
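A minimal sketch of what "skipping to step 4" can look like in practice, using the Hugging Face port of the released GPT-J-6B checkpoint rather than the repo's own JAX inference path (the prompt and sampling settings are arbitrary):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The pretrained GPT-J-6B weights are on the Hugging Face Hub, so no
# pretraining is needed. float16 wants a GPU with roughly 16 GB of memory.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("The meaning of life is", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```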
-
Ask HN: Self-hosted/open-source ChatGPT alternative? Like Stable Diffusion
I know nothing, but have heard Hugging Face is in that direction.
https://github.com/huggingface/transformers
> Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
> These models can be applied on:
> - Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages.
> - Images, for tasks like image classification, object detection, and segmentation.
> - Audio, for tasks like speech recognition and audio classification.
---
Also read about GPT-J, whose capability is comparable with GPT-3.
https://github.com/kingoflolz/mesh-transformer-jax
But I believe it requires buying or renting GPUs.
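For reference, the pipeline API quoted above is roughly a one-liner per task. A small sketch with two of the listed task types; the default and small checkpoints used here run on CPU, unlike GPT-J-6B, and the model name is chosen purely for illustration:

```python
from transformers import pipeline

# Text classification with the library's default checkpoint.
classifier = pipeline("sentiment-analysis")
print(classifier("Self-hosting a ChatGPT alternative sounds great."))

# Text generation with a small model.
generator = pipeline("text-generation", model="distilgpt2")
print(generator("Open-source language models", max_new_tokens=30))
```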
-
[D]: Are there any alternatives to Huggingface in the use of GPT-Neo?
Well, many models hosted on Hugging Face were actually developed without HF Transformers first (and then were ported to HF Transformers by the community). It is the case with GPT-J. Here is the original GPT-J implementation: https://github.com/kingoflolz/mesh-transformer-jax
-
dalle update
GPT-J with 6B parameters barely scrapes by on a 16GB GPU (using KoboldAI, dunno what impact different scripts and stuff might have)
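The 16GB figure lines up with a quick back-of-the-envelope estimate; activations, context cache, and framework overhead take the remaining headroom:

```python
params = 6e9
bytes_per_param = 2                     # fp16/bf16 weights
print(params * bytes_per_param / 1e9)   # ~12 GB of weights alone on a 16 GB card
```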
-
Yandex opensources 100B parameter GPT-like model
Doesn't seem the code is there - pretrained models are there. https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b
https://huggingface.co/EleutherAI/gpt-j-6B
Isn't that so?
-
Meta announces a GPT3-size language model you can download
175B * 16 bits = 350GB, but it does compress a bit.
GPT-J-6B, which you can download at https://github.com/kingoflolz/mesh-transformer-jax, is 6B parameters but weighs 9GB. It does decompress to 12GB as expected. Assuming the same compression ratio, download size would be 263GB, not 350GB.
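Reproducing that estimate:

```python
uncompressed_gb = 175e9 * 2 / 1e9   # 175B params * 16 bits = 350 GB
ratio = 9 / 12                      # GPT-J: 9 GB download vs 12 GB decompressed
print(uncompressed_gb * ratio)      # 262.5 -> the ~263 GB figure
```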
-
[D] Connor Leahy on EleutherAI, Replicating GPT-2/GPT-3, AI Risk and Alignment
GPT-J
What are some alternatives?
PaddlePaddle - PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the "PaddlePaddle" core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
LightGBM - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
scikit-learn - scikit-learn: machine learning in Python
LightFM - A Python implementation of LightFM, a hybrid recommendation algorithm.
xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
PyBrain
Deeplearning4j - Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learning using automatic differentiation.
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
MLflow - Open source platform for the machine learning lifecycle