| | applied-ml | tensorflow |
|---|---|---|
| Mentions | 13 | 223 |
| Stars | 25,984 | 182,575 |
| Growth | - | 0.5% |
| Activity | 3.0 | 10.0 |
| Last commit | 5 days ago | 3 days ago |
| Language | - | C++ |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
applied-ml
-
[D] Favorite ML Youtube Channels/Blogs/Newsletters
Also, have any of you stumbled across any cool GitHub repos like this one: https://github.com/eugeneyan/applied-ml ?
- Curated Papers on Machine Learning in Production
-
Top Github repo trends in 2021
The second repo I LOVE is Eugene Yan’s Applied ML repository. This is a brilliant idea to create and actually something I was planning on sort of casually doing in my non-existent free time… Anyhow, it is a curated list of technical posts from top engineering teams (Netflix, Amazon, Pinterest, LinkedIn, etc.) detailing how they built out different types of AI/ML systems (e.g. forecasting, recommenders, search and ranking, etc.). Ofc, it focuses on AI/ML, but something similar could be made for the traditional or BI-oriented analytics stack, as well as the streaming world. Super high value for practitioners! Btw, one of my favorite things at BCG used to be looking at our IT architecture team’s reference architecture diagrams… the best way to understand technologies is to look at how a ton of stuff is architected… and it's fun!
- Curated papers, articles, & blogs on data science and ML in production
-
Messed up my career by pivoting to DS. Wondering if it's too late to switch to MLE
Applied ML: A collection of papers, articles, and blogs on ML in production by different companies (Netflix, Uber, Facebook, LinkedIn, etc.)
-
[D] A dilemma of an ML guy in industry
Eugene Yan's applied-ml has tons of case studies.
- Papers & tech blogs by companies sharing their work on data science & machine learning in production.
-
My information dump for people trying to break into data science/interview notes
https://github.com/eugeneyan/applied-ml You may find some of his links interesting. I would avoid anything that refers to scaling up a platform, as these have more of a backend engineering focus. The more relevant posts to you are probably on the scale of blog posts that are product oriented, like the ones I listed in section 4 (e.g. we wanted to solve X for our users and this is how we scoped and defined it). The technical aspects should take a backseat to the business aspects. There are definitely a lot of companies/blog posts that he missed, but the internet is huge.
-
[D] Can anyone point me to resources/case studies of companies/business creating infrastructure for their data needs?
Check the resources mentioned in applied-ml. It includes blog posts/papers from many companies describing how they built some ML product X.
-
What content would be useful to intermediate Data Scientist
Check out this repo. They collect hundreds of case studies, broken down by dozens of methodologies, from large real-world companies such as Airbnb, Nvidia, Uber, Netflix, etc.
tensorflow
-
Side Quest Devblog #1: These Fakes are getting Deep
    # L2-normalize the encoding tensors
    image_encoding = tf.math.l2_normalize(image_encoding, axis=1)
    audio_encoding = tf.math.l2_normalize(audio_encoding, axis=1)

    # Find euclidean distance between image_encoding and audio_encoding
    # Essentially trying to detect if the face is saying the audio
    # Will return nan without the 1e-12 offset due to https://github.com/tensorflow/tensorflow/issues/12071
    d = tf.norm((image_encoding - audio_encoding) + 1e-12, ord='euclidean', axis=1, keepdims=True)

    discriminator = keras.Model(inputs=[image_input, audio_input], outputs=[d], name="discriminator")
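The snippet above can be hard to reason about without TensorFlow at hand, so here is a minimal plain-Python sketch of the same two steps on a single pair of vectors. It is not the post author's code: the helper names (`l2_normalize`, `euclidean_distance`, `distance_gradient`) and the `eps` parameter are illustrative assumptions. The gradient of a Euclidean norm is `diff / ||diff||`, which divides zero by zero when the two encodings are identical. Plain Python raises `ZeroDivisionError` there, and the excerpt reports that TensorFlow returns NaN in the same situation.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale v to unit length (hypothetical helper; eps guards an all-zero v)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def euclidean_distance(a, b, offset=1e-12):
    """Euclidean distance with the same small offset the excerpt adds."""
    return math.sqrt(sum(((x - y) + offset) ** 2 for x, y in zip(a, b)))


def distance_gradient(a, b, offset=1e-12):
    """Analytical gradient of the distance with respect to a: diff / ||diff||."""
    diff = [(x - y) + offset for x, y in zip(a, b)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]  # divides by zero if diff is all zeros


image_encoding = l2_normalize([3.0, 4.0])
audio_encoding = l2_normalize([3.0, 4.0])  # identical on purpose

# With the offset, both the distance and its gradient stay finite.
print(euclidean_distance(image_encoding, audio_encoding))
print(distance_gradient(image_encoding, audio_encoding))
```

Calling `distance_gradient(image_encoding, audio_encoding, offset=0.0)` fails with a division by zero, which is the degenerate case the offset is there to avoid.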
-
Google lays off its Python team
[3]: https://github.com/tensorflow/tensorflow/graphs/contributors
- TensorFlow-metal on Apple Mac is junk for training
-
🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬
To get up to speed with TensorFlow, check their quickstart.
- One .gitignore to rule them all
-
10 Github repositories to achieve Python mastery
Explore here.
-
GitHub and Developer Ecosystem Control
Part of the major userbase pull in GitHub revolves around hosting a considerable number of popular projects including Angular, React, Kubernetes, CPython, Ruby, TensorFlow, and, well, even the software that powers this site, Forem.
-
Non-determinism in GPT-4 is caused by Sparse MoE
Right but that's not an inherent GPU determinism issue. It's a software issue.
https://github.com/tensorflow/tensorflow/issues/3103#issueco... is correct that it's not necessary, it's a choice.
Your line of reasoning appears to be "GPUs are inherently non-deterministic don't be quick to judge someone's code" which as far as I can tell is dead wrong.
Admittedly there are some cases and instructions that may result in non-determinism but they are inherently necessary. The author should think carefully before introducing non-determinism. There are many scenarios where it is irrelevant, but ultimately the issue we are discussing here isn't the GPU's fault.
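The thread excerpt does not say where such non-determinism comes from. One commonly cited source, offered here as an illustrative assumption rather than something the commenter stated, is that floating-point addition is not associative, so a reduction whose accumulation order varies between runs can produce different bits. The sketch below uses a hypothetical helper, `sum_in_order`, to show that fixing the order fixes the result, which is consistent with the point that this is a software choice.

```python
def sum_in_order(values, order):
    """Accumulate values in the given index order (hypothetical helper)."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [0.1, 0.2, 0.3]

forward = sum_in_order(values, [0, 1, 2])
backward = sum_in_order(values, [2, 1, 0])

# Different accumulation orders can round differently.
print(forward == backward)

# The same order always reproduces the same result.
print(sum_in_order(values, [0, 1, 2]) == forward)
```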
-
Can someone explain how keras code gets into the Tensorflow package?
and things like y = layers.ELU()(y) work as expected. I wanted to see a list of the available layers so I went to the Tensorflow GitHub repository and to the keras directory. There's a warning in that directory that says:
-
Is it even possible to design a ML model without using Python or MATLAB? Like using C++, C or Java?
Exactly what language do you think TensorFlow is written in? :)
What are some alternatives?
awesome-mlops - A curated list of references for MLOps
PaddlePaddle - PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)
awesome-ml-blogs - Curated list of technical blogs on machine learning · AI/ML/DL/CV/NLP/MLOps
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
machine-learning-roadmap - A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Cookbook - The Data Engineering Cookbook
LightGBM - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
ml-surveys - 📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, graphs, etc.
scikit-learn - scikit-learn: machine learning in Python
pipebase - data integration framework
LightFM - A Python implementation of LightFM, a hybrid recommendation algorithm.