PSA: You don't need fancy stuff to do good work.

This page summarizes the projects mentioned and recommended in the original post on /r/datascience.

  • xgboost

    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

  • Finally, when it comes to building models and making predictions, Python and R have a plethora of options available. Libraries like scikit-learn, statsmodels, and TensorFlow in Python, or caret, randomForest, and xgboost in R, provide powerful machine learning algorithms and statistical models that can be applied to a wide range of problems. What's more, these libraries are open-source and have extensive documentation and community support, making it easy to learn and apply new techniques without needing specialized training or expensive software licenses.

  • examples

    TensorFlow examples (by tensorflow)

  • seaborn

    Statistical data visualization in Python

  • Python's pandas, NumPy, and SciPy libraries offer powerful functionality for data manipulation, while matplotlib, seaborn, and plotly provide versatile tools for creating visualizations. Similarly, in R, you can use dplyr, tidyverse, and data.table for data manipulation, and ggplot2, lattice, and shiny for visualization. These packages enable you to create insightful visualizations and perform statistical analyses without relying on expensive or proprietary software.

  • scikit-learn

    scikit-learn: machine learning in Python

  • rvest

    Simple web scraping for R

  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

  • NumPy

    The fundamental package for scientific computing with Python.

  • cheatsheets

    Official Matplotlib cheat sheets (by matplotlib)

  • ggplot2

    An implementation of the Grammar of Graphics in R

  • dplyr

    dplyr: A grammar of data manipulation

  • Before diving into advanced machine learning algorithms or statistical models, we need to start with the basics: collecting and organizing data. Fortunately, both Python and R offer a wealth of libraries that make it easy to collect data by scraping web pages, calling APIs, or reading files. Key libraries in Python include requests, BeautifulSoup, and pandas, while R has httr, rvest, and dplyr.
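
As a rough illustration of the "collecting and organizing data" step quoted above, here is a minimal Python sketch that uses only pandas and the standard library. The inline CSV text and JSON string are hypothetical stand-ins for a downloaded file and an API response, and the column names are invented for the example. In a live script, requests would typically fetch the JSON, which is left out here so the sketch runs offline.

```python
import io
import json

import pandas as pd

# Hypothetical inline data standing in for a downloaded file and an API response.
CSV_TEXT = """city,year,sales
Austin,2022,120
Austin,2023,150
Boston,2022,90
Boston,2023,110
"""
API_JSON = (
    '[{"city": "Austin", "region": "South"},'
    ' {"city": "Boston", "region": "Northeast"}]'
)


def load_sales(csv_text: str) -> pd.DataFrame:
    """Read CSV text into a DataFrame; pd.read_csv accepts a file path the same way."""
    return pd.read_csv(io.StringIO(csv_text))


def load_regions(json_text: str) -> pd.DataFrame:
    """Turn a JSON array of records (as an API might return) into a DataFrame."""
    return pd.DataFrame(json.loads(json_text))


sales = load_sales(CSV_TEXT)
regions = load_regions(API_JSON)

# Organize: attach the region to every sales row with a left join on city.
combined = sales.merge(regions, on="city", how="left")
print(combined)
```

Keeping each source behind its own small loader function means the file or API can be swapped out later without touching the join.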
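
The manipulation and visualization libraries named in the list (pandas, NumPy, matplotlib) can be combined in a few lines. The sketch below is only one possible arrangement, using a hypothetical toy table with made-up column names. It aggregates with pandas, adds a NumPy-derived column, and renders a bar chart to an in-memory PNG so that it runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston", "Boston"],
        "year": [2022, 2023, 2022, 2023],
        "sales": [120, 150, 90, 110],
    }
)

# Manipulation: total and mean sales per city, plus a log-transformed column.
summary = df.groupby("city")["sales"].agg(["sum", "mean"])
df["log_sales"] = np.log(df["sales"])

# Visualization: one bar per city, written to an in-memory PNG buffer.
fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(summary.index, summary["sum"])
ax.set_xlabel("city")
ax.set_ylabel("total sales")
ax.set_title("Total sales by city")
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(summary)
```

To keep the chart on disk instead, pass a file name to `fig.savefig` in place of the buffer.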
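
For the modeling step quoted in the list, a small scikit-learn sketch might look like the following. The choice of the bundled iris dataset, a random forest, and a 75/25 split are assumptions made for illustration, not something the original post prescribes. Any other scikit-learn estimator could be dropped in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Bundled example dataset, so no download is needed.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows, stratified by class, for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit on the training rows only, then predict the held-out rows.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"held-out accuracy: {accuracy:.2f}")
```

Fixing `random_state` in both the split and the model keeps the run repeatable, which makes it easier to compare changes.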
