Kaggle

Top 23 Kaggle Open-Source Projects

  1. data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  2. d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

  3. LightGBM

    A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

  4. Pytorch-UNet

    PyTorch implementation of the U-Net for image semantic segmentation with high-quality images

  5. catboost

    A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

    Project mention: 🚀 Why Your ML Service Needs Rust + CatBoost: A Setup Guide That Actually Works | dev.to | 2025-01-19

    [package]
    name = "MLApp"
    version = "0.1.0"
    edition = "2021"

    [dependencies]
    catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35"}

  6. kaggle-solutions

    🏅 Collection of Kaggle Solutions and Ideas 🏅

  7. Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

    A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.

  8. pytorch-toolbelt

    PyTorch extensions for fast R&D prototyping and Kaggle farming

  9. MLBox

    MLBox is a powerful Automated Machine Learning Python library.

  10. dfdc_deepfake_challenge

    A prize-winning solution for the DFDC challenge

  11. upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  12. benchmarks

    Comparison tools (by catboost)

  13. xgboost_ray

    Distributed XGBoost on Ray

  14. crypto

    Cryptocurrency Historical Market Data R Package (by JesseVent)

  15. deepfake-detection

    DeepFake Detection: Detect whether a video is fake or not using InceptionResNetV2. (by xinyooo)

  16. Hello-Kaggle

    For someone who is new to Kaggle

  17. kaggle-courses

    Courses on Kaggle

  18. kaggle-look-alike

    Kaggle Data Explorer UI look-alike built in React.

  19. Paper-Recommendation-System

    Web interface to search ArXiv papers using NLP Sentence-Transformers, Faiss and Streamlit

  20. apple-appstore-apps

    Apple App Store apps dataset (1.2 million apps, 21 attributes)

  21. ailert

    An open-source platform that aggregates AI content from 230+ sources including research papers, GitHub trends, and industry news, making AI knowledge accessible to everyone.

    Project mention: Building an Open-Source AI Newsletter Engine: The Story of AiLert | dev.to | 2025-01-12

    Code: https://github.com/anuj0456/ailert
    Docs: https://github.com/anuj0456/ailert/blob/main/README.md

  22. YouTubers-saying-things

    Dataset containing video subtitles from popular YouTubers' channels

  23. YouTube-thumbnail-dataset

    Most versatile dataset of YouTube thumbnails.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Kaggle related posts

  • The fastest way to improve quality of ML model on tabular data

    1 project | /r/learnmachinelearning | 18 Jun 2023
  • How are deepfakes different from beauty face filters?

    1 project | /r/computervision | 27 May 2023
  • [Project] Google ArXiv Papers with NLP semantic-search! Link to Github in the comments!!

    1 project | /r/MachineLearning | 19 Feb 2023
  • [P] Collection of Kaggle Past Solutions (to learn ideas and techniques)

    1 project | /r/MachineLearning | 19 Sep 2022
  • How to enrich ML models with open data for free: an in-depth review of 5 python libraries

    1 project | /r/Python | 2 Sep 2022
  • Completed all the Kaggle courses.

    1 project | dev.to | 31 Jul 2022
  • How I complete my email addresses lists with demographic insights with Python

    1 project | /r/Python | 27 Jul 2022
Index

What are some of the best open-source Kaggle projects? This list will help you:

#   Project                                                            Stars
1   data-science-ipython-notebooks                                    27,837
2   d2l-en                                                            24,864
3   LightGBM                                                          16,937
4   Pytorch-UNet                                                       9,645
5   catboost                                                           8,242
6   kaggle-solutions                                                   5,074
7   Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials   3,817
8   pytorch-toolbelt                                                   1,529
9   MLBox                                                              1,503
10  dfdc_deepfake_challenge                                              795
11  upgini                                                               322
12  benchmarks                                                           169
13  xgboost_ray                                                          147
14  crypto                                                               143
15  deepfake-detection                                                   101
16  Hello-Kaggle                                                          80
17  kaggle-courses                                                        53
18  kaggle-look-alike                                                     33
19  Paper-Recommendation-System                                           20
20  apple-appstore-apps                                                   19
21  ailert                                                                13
22  YouTubers-saying-things                                                8
23  YouTube-thumbnail-dataset                                              4
