datatap-python
simpleT5
Metric | datatap-python | simpleT5 |
---|---|---|
Mentions | 9 | 2 |
Stars | 34 | 381 |
Growth | - | - |
Activity | 0.0 | 2.5 |
Last commit | over 1 year ago | 12 months ago |
Language | Python | Python |
License | GNU General Public License v3.0 only | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
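The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes an exponential half-life weighting for commits and a percentile-rank mapping onto a 0-10 scale. The function names, the 30-day half-life, and the mapping are all hypothetical choices, not the site's actual method.

```python
from bisect import bisect_left


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum commit weights; a commit's weight halves every `half_life_days`.

    The half-life weighting is an assumption made for illustration.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw_score, all_raw_scores):
    """Map a raw score to a 0-10 scale by percentile rank (hypothetical mapping).

    A result of 9.0 means 90% of tracked projects have a strictly lower raw
    score, i.e. the project is in the top 10%.
    """
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw_score)  # projects with a strictly lower score
    return round(10.0 * below / len(ranked), 1)


# Recent commits count for more than old ones.
recent = recency_weighted_commits([1, 2, 3])
stale = recency_weighted_commits([300, 400, 500])
print(recent > stale)
```

Under these assumptions, a project that has had no commits for over a year ends up with a raw score close to zero, which is consistent with an activity of 0.0 in the table.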
datatap-python
- [Project] DataTap provides droplets (containers for datasets) to make working on popular deep learning datasets easy.
Learn more about how you can start using it here: https://github.com/zensors/datatap-python
- Stream any deep learning dataset with just 3 lines of code into Pytorch, Tensorflow or any python project.
- Data droplets make dataset management & sharing simple -- The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.
- The data droplets specification lets you unify and easily share deep learning datasets. Droplets are designed for complex annotations and let you focus on deep learning rather than data manipulation.
- The fastest format to store, access & manage labelled data for any deep learning project
http://datatap.dev/ is an open-source platform that lets you easily pull in any dataset in a standard format, so you can start training a deep learning model in < 3 minutes.
- Setting up a feedback loop for performance evaluation and retraining of a model.
You should import the data into https://github.com/zensors/datatap-python; it will make managing data for the feedback loop easier.
- Show HN: Free user-friendly platform for visual data management
Looking for a user-friendly data management tool? With DataTap, you focus on algorithm design, not on data wrangling. DataTap is a visual data management platform from Zensors.
Check out the repository (https://github.com/zensors/datatap-python)
The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.
Cool Features
simpleT5
- Transformers: How to compare performance to base model?
Currently I just took ~42000 samples and trained a translation task directly on codeT5 with https://github.com/Shivanandroy/simpleT5. Validation loss and at least the qualitative results are not too bad. I'm now going to try to compare it to the base codeT5 model with the *.loss function as suggested above.
- [P] SimpleT5: Train T5 models in just 3 lines of code
🌟 GitHub: https://github.com/Shivanandroy/simpleT5
🌟 Medium: https://snrspeaks.medium.com/simplet5-train-t5-models-in-just-3-lines-of-code-by-shivanand-roy-2021-354df5ae46ba
🌟 Colab Notebook: https://colab.research.google.com/drive/1JZ8v9L0w0Ai3WbibTeuvYlytn0uHMP6O?usp=sharing
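For context, a minimal sketch of what the "3 lines" workflow looks like, based on the usage shown in the simpleT5 README as best recalled; check the calls against the version you install. The helper `to_simplet5_rows`, the `"translate: "` prefix, the `t5-base` checkpoint, and the hyperparameter values are illustrative assumptions, not taken from the posts above.

```python
def to_simplet5_rows(pairs, prefix="translate: "):
    """Build rows with the `source_text` / `target_text` columns simpleT5 expects.

    `pairs` is an iterable of (source, target) strings; the prefix is a
    hypothetical task prefix chosen for illustration.
    """
    return [{"source_text": prefix + src, "target_text": tgt} for src, tgt in pairs]


def train_simplet5(train_rows, eval_rows, model_name="t5-base"):
    """Fine-tune a T5 checkpoint with simpleT5 (requires `pip install simplet5`)."""
    # Imported lazily so the helper above can be used without the heavy dependencies.
    import pandas as pd
    from simplet5 import SimpleT5

    model = SimpleT5()
    model.from_pretrained(model_type="t5", model_name=model_name)
    model.train(
        train_df=pd.DataFrame(train_rows),  # columns: source_text, target_text
        eval_df=pd.DataFrame(eval_rows),    # same two columns
        source_max_token_len=128,
        target_max_token_len=128,
        batch_size=8,
        max_epochs=3,
        use_gpu=False,  # set to True when a GPU is available
    )
    return model


rows = to_simplet5_rows([("Guten Morgen", "Good morning")])
print(rows[0]["source_text"])
```

Keeping a separate evaluation set matters for the comparison question raised above: validation loss is only meaningful when `eval_rows` contains examples that were not used for training.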
What are some alternatives?
whylogs - An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
reformer-pytorch - Reformer, the efficient Transformer, in Pytorch
seq2seq - A general-purpose encoder-decoder framework for Tensorflow
ModelZoo.pytorch - Hands on Imagenet training. Unofficial ModelZoo project on Pytorch. MobileNetV3 Top1 75.64🌟 GhostNet1.3x 75.78🌟
coral-cnn - Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation
frame-semantic-transformer - Frame Semantic Parser based on T5 and FrameNet
iterative-stratification - scikit-learn cross validators for iterative stratification of multilabel data
KeyPhraseTransformer - KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction | Keyword extraction
Schematics - Python Data Structures for Humans™.
TencentPretrain - Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
analog-watch-recognition - Reading time from analog clocks
fastT5 - ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.