Top 23 Kaggle Open-Source Projects
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
-
LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
-
Pytorch-UNet
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
-
catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
-
Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials
A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.
-
fastdup
fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.
-
upgini
Data search & enrichment library for Machine Learning: easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
-
deepfake-detection
DeepFake Detection: Detect whether a video is fake or not using InceptionResNetV2. (by xinyooo)
-
Paper-Recommendation-System
Web interface to search ArXiv papers using NLP Sentence-Transformers, Faiss and Streamlit
Project mention: SIRUS.jl: Interpretable Machine Learning via Rule Extraction | /r/Julia | 2023-06-29
SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. It works by first fitting a random forest and then converting this forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.
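To make the "convert trees to rules" idea concrete, here is a minimal Python sketch. It is not SIRUS.jl's API and not the SIRUS algorithm itself: the tree, the feature names (`age`, `nodes`), the thresholds, and the helper `tree_to_rules` are all hypothetical, and the sketch leaves out the rule selection and stabilization that the actual algorithm performs. It only shows how each root-to-leaf path of a decision tree can be read as an if-then rule.

```python
# Toy illustration only: a hand-written tree, not output from SIRUS.jl.
# A leaf is a plain number (the prediction); an internal node is a dict
# with a feature name, a split threshold, and left/right subtrees.

def tree_to_rules(tree, conditions=()):
    """Return one (conditions, prediction) pair per root-to-leaf path."""
    if not isinstance(tree, dict):
        # Reached a leaf: the accumulated conditions form one rule.
        return [(list(conditions), tree)]
    feature, threshold = tree["feature"], tree["threshold"]
    left = tree_to_rules(
        tree["left"], conditions + (f"{feature} < {threshold}",)
    )
    right = tree_to_rules(
        tree["right"], conditions + (f"{feature} >= {threshold}",)
    )
    return left + right

# Hypothetical tree with made-up features and thresholds.
toy_tree = {
    "feature": "age",
    "threshold": 40,
    "left": 0.2,
    "right": {
        "feature": "nodes",
        "threshold": 3,
        "left": 0.5,
        "right": 0.9,
    },
}

rules = tree_to_rules(toy_tree)
for conds, pred in rules:
    print("if " + " and ".join(conds) + f" then {pred}")
```

Applying the same extraction to every tree in a forest would yield a pool of candidate rules; how that pool is reduced to a short, stable rule list is the part this sketch does not cover.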
Project mention: CatBoost: Open-source gradient boosting library | news.ycombinator.com | 2024-03-05
Visualizing your dataset (especially large ones) in a low-dimensional embedding space can tell you a lot about the patterns and clusters in your dataset.
We recently released a notebook showing how you can visualize your dataset using DINOv2 models by running it on your CPU.
Yes! No GPUs needed.
We used it to find clusters of similar images, duplicates, and outliers in a subset of the LAION dataset.
Try it on your own dataset:
Colab notebook - https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/dinov2_notebook.ipynb
GitHub repo - https://github.com/visual-layer/fastdup
Project mention: How are deepfakes different from beauty face filters? | /r/computervision | 2023-05-27
For example, I used a scanner using this model: https://github.com/selimsef/dfdc_deepfake_challenge/blob/master/README.md
Project mention: The fastest way to improve quality of ML model on tabular data | /r/learnmachinelearning | 2023-06-18
web: https://upgini.com
Kaggle related posts
- The fastest way to improve quality of ML model on tabular data
- How are deepfakes different from beauty face filters?
- [Project] Google ArXiv Papers with NLP semantic-search! Link to Github in the comments!!
- [P] Collection of Kaggle Past Solutions (to learn ideas and techniques)
- How to enrich ML models with open data for free: an in-depth review of 5 python libraries
- Completed all the Kaggle courses.
- How I complete my email addresses lists with demographic insights with Python
Index
What are some of the best open-source Kaggle projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | data-science-ipython-notebooks | 26,459 |
2 | d2l-en | 21,628 |
3 | LightGBM | 16,043 |
4 | Pytorch-UNet | 8,358 |
5 | catboost | 7,744 |
6 | kaggle-solutions | 3,745 |
7 | Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials | 3,643 |
8 | pytorch-toolbelt | 1,483 |
9 | MLBox | 1,475 |
10 | fastdup | 1,403 |
11 | dfdc_deepfake_challenge | 670 |
12 | upgini | 290 |
13 | benchmarks | 163 |
14 | crypto | 141 |
15 | xgboost_ray | 131 |
16 | deepfake-detection | 86 |
17 | Hello-Kaggle | 78 |
18 | kaggle-courses | 47 |
19 | Paper-Recommendation-System | 19 |
20 | apple-appstore-apps | 12 |
21 | kaggle-look-alike | 9 |
22 | YouTubers-saying-things | 7 |
23 | YouTube-thumbnail-dataset | 4 |