Python Kaggle

Open-source Python projects categorized as Kaggle

Top 12 Python Kaggle Projects

  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

  • Pytorch-UNet

    PyTorch implementation of U-Net for semantic segmentation of high-quality images

  • catboost

    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

  • Project mention: CatBoost: Open-source gradient boosting library | news.ycombinator.com | 2024-03-05

  • Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

    A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.

  • pytorch-toolbelt

    PyTorch extensions for fast R&D prototyping and Kaggle farming

  • MLBox

    MLBox is a powerful Automated Machine Learning python library.

  • fastdup

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.

  • Project mention: Visualize your dataset using DINOv2 embedding | news.ycombinator.com | 2023-05-02

    Visualizing your dataset (especially large ones) in a low-dimensional embedding space can tell you a lot about the patterns and clusters in your dataset.

    We recently released a notebook showing how to visualize your dataset using DINOv2 models running on your CPU.

    Yes! No GPUs needed.

    We used it to find clusters of similar images, duplicates, and outliers in a subset of the LAION dataset.

    Try it on your own dataset:

    Colab notebook - https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/dinov2_notebook.ipynb

    GitHub repo - https://github.com/visual-layer/fastdup

  • dfdc_deepfake_challenge

    A prize-winning solution for the DFDC challenge

  • Project mention: How are deepfakes different from beauty face filters? | /r/computervision | 2023-05-27

    For example I used a scanner using this model https://github.com/selimsef/dfdc_deepfake_challenge/blob/master/README.md

  • upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  • Project mention: The fastest way to improve quality of ML model on tabular data | /r/learnmachinelearning | 2023-06-18

    web: https://upgini.com

  • xgboost_ray

    Distributed XGBoost on Ray

  • Paper-Recommendation-System

    Web interface to search ArXiv papers using NLP Sentence-Transformers, Faiss and Streamlit

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Kaggle projects in Python? This list will help you:

#   Project                                                            Stars
1   data-science-ipython-notebooks                                    26,459
2   d2l-en                                                            21,628
3   Pytorch-UNet                                                       8,358
4   catboost                                                           7,744
5   Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials   3,638
6   pytorch-toolbelt                                                   1,483
7   MLBox                                                              1,475
8   fastdup                                                            1,403
9   dfdc_deepfake_challenge                                              670
10  upgini                                                               289
11  xgboost_ray                                                          131
12  Paper-Recommendation-System                                           19
