Looking for open source projects in Machine Learning and Data Science

This page summarizes the projects mentioned and recommended in the original post on /r/ArtificialInteligence.

  • nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  • What about addressing some of the issues on nanoGPT? https://github.com/karpathy/nanoGPT

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

  • You could try spaCy, an open-source library for advanced NLP in Python. Another option is DocArray, which is good for preprocessing, modeling, and analysis of text data.

  • docarray

    Represent, send, store and search multimodal data

  • Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
