Data Science toolset summary from 2021

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • Prophet

    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

    Prophet - A time-series forecasting library built by Facebook. It implements a procedure for forecasting time series data based on an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Link - https://github.com/facebook/prophet

  • examples

    TensorFlow examples (by tensorflow)

    TensorFlow - It is mainly used for training ML models based on neural networks and deep learning. TensorFlow was developed by the Google Brain team for internal Google use. It can be used from a variety of programming languages, most notably Python, as well as JavaScript, C++, and Java. This flexibility lends itself to a range of applications in many different sectors. Link - https://www.tensorflow.org/

  • PostgreSQL

    Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

  • MySQL

    MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.

    MySQL - https://www.mysql.com/

  • MongoDB

    The MongoDB Database

    MongoDB - https://www.mongodb.com/

  • scikit-learn

    scikit-learn: machine learning in Python

    Scikit-learn - One of the most widely used frameworks for Python-based data science tasks. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Link - https://scikit-learn.org/

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    PyTorch - PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab. It is free and open-source software released under the Modified BSD license. Link - https://pytorch.org/

  • MLflow

    Open source platform for the machine learning lifecycle

    MLflow - https://mlflow.org/

  • Keras

    Deep Learning for humans

    Keras - Keras is an open-source software library that provides a Python interface for artificial neural networks. Keras acts as an interface for the TensorFlow library. Link - https://keras.io/

  • huggingface_hub

    All the open source things related to the Hugging Face Hub.

    Hugging Face - An open-source library for building transformer-based language models, used in the field of natural language processing. Large language models such as BERT and GPT are implemented using this library. Link - https://huggingface.co/

  • guildai

    Experiment tracking, ML developer tools

    Guild.ai - https://guild.ai/

  • nodejs-bigquery

    Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

    Google Cloud BigQuery - https://cloud.google.com/bigquery

  • catboost

    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

    CatBoost - An open-source software library developed by Yandex. It provides a gradient boosting framework that handles categorical features using a permutation-driven alternative to the classical algorithm. Link - https://catboost.ai/
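
The "permutation-driven" handling of categorical features mentioned in the CatBoost entry can be illustrated with a small pure-Python sketch of ordered target encoding. This is a simplified illustration of the general idea, not CatBoost's actual implementation: the function name `ordered_target_encode` and the parameters `prior`, `prior_weight`, and `seed` are hypothetical names chosen for this example. Each row's category is replaced by a smoothed mean of the targets of rows that appear earlier in a random permutation, so a row's own target is never used to encode it.

```python
import random
from collections import defaultdict


def ordered_target_encode(categories, targets, prior=0.5, prior_weight=1.0, seed=0):
    """Encode a categorical column using only rows seen earlier in a random order.

    Illustrative sketch only; all names and defaults here are assumptions.
    """
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)  # the random permutation of row indices

    sums = defaultdict(float)  # running sum of targets per category
    counts = defaultdict(int)  # running number of rows per category
    encoded = [0.0] * len(categories)

    for i in order:
        cat = categories[i]
        # Smoothed mean over rows that precede row i in the permutation.
        # The row's own target is added only afterwards, so it cannot leak.
        encoded[i] = (sums[cat] + prior * prior_weight) / (counts[cat] + prior_weight)
        sums[cat] += targets[i]
        counts[cat] += 1
    return encoded


if __name__ == "__main__":
    colors = ["red", "blue", "red", "red", "blue"]
    clicked = [1, 0, 1, 0, 1]
    print(ordered_target_encode(colors, clicked))
```

The first row of each category in the permutation has no history, so it receives the prior value. Later rows move toward the observed target mean of their category as more history accumulates.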
