Jupyter Notebook AI

Open-source Jupyter Notebook projects categorized as AI

Top 23 Jupyter Notebook AI Projects

  • google-research

    Google Research

    Project mention: Multi-bitrate JPEG compression perceptual evaluation dataset 2023 | news.ycombinator.com | 2024-01-31
  • AI-For-Beginners

    12 Weeks, 24 Lessons, AI for All!

    Project mention: Artificial Intelligence for Beginners – A Curriculum | news.ycombinator.com | 2023-06-13

This is a good summary of most topics in AI/ML. The only thing that it seems to be missing (or maybe I'm just not seeing it) is a section on generative AI for images and video (DALL-E, Stable Diffusion, etc.).

    They do cover LLMs, which is generative AI for text, though: https://github.com/microsoft/AI-For-Beginners/blob/main/less...

  • generative-ai-for-beginners

    18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

    Project mention: Generative AI for Beginners | news.ycombinator.com | 2023-11-24

    Create an issue at https://github.com/microsoft/generative-ai-for-beginners. There is a call to action for feedback, and it looks like at least one of the contributors is in education, so they will probably take the feedback on board.

  • learnopencv

    Learn OpenCV : C++ and Python Examples

    Project mention: YOLO-NAS Pose | /r/pytorch | 2023-11-16

    Deci's YOLO-NAS Pose: Redefining Pose Estimation! Elevating healthcare, sports, tech, and robotics with precision and speed. Github link and blog link down below! Repo: https://github.com/spmallick/learnopencv/tree/master/YOLO-NAS-Pose

  • h4cker

    This repository is primarily maintained by Omar Santos (@santosomar) and includes thousands of resources related to ethical hacking, bug bounties, digital forensics and incident response (DFIR), artificial intelligence security, vulnerability research, exploit development, reverse engineering, and more.

  • StableLM

    StableLM: Stability AI Language Models

    Project mention: Stable LM 3B: Bringing Sustainable, High-Performance LMs to Smart Devices | news.ycombinator.com | 2023-10-02


    Looking at the 3B results (here: https://github.com/Stability-AI/StableLM#stablelm-alpha-v2 ?), it looks like Mistral (which outperforms Llama-2 13B) is far more powerful.

  • stable-diffusion-webui-colab

    stable diffusion webui colab

    Project mention: Stable-Diffusion-Webui-Colab | news.ycombinator.com | 2023-07-24
  • dopamine

    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

    Project mention: Fast and hackable frameworks for RL research | /r/reinforcementlearning | 2023-03-08

    I'm tired of having my 200m frames of Atari take 5 days to run with dopamine, so I'm looking for another framework to use. I haven't been able to find one that's fast and hackable, preferably distributed or with vectorized environments. Anybody have suggestions? seed-rl seems promising but is archived (and in TF2). sample-factory seems super fast but to the best of my knowledge doesn't work with replay buffers. I've been trying to get acme working but documentation is sparse and many of the features are broken.
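The "vectorized environments" the commenter is looking for amortise per-step overhead by stepping many environment copies as one batch, so the learner always receives a full batch of transitions per call. A minimal, framework-free sketch of the idea (all names here are hypothetical; real libraries add shared memory, subprocesses, and GPU batching):

```python
import random

class ToyEnv:
    """Trivial stand-in environment: episode ends after 10 steps."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        reward = self.rng.random()
        done = self.t >= 10
        return float(self.t), reward, done

class VectorEnv:
    """Steps N environments together, auto-resetting finished ones,
    so every call yields a full batch of N transitions."""
    def __init__(self, n):
        self.envs = [ToyEnv(seed=i) for i in range(n)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        batch = []
        for env, action in zip(self.envs, actions):
            obs, reward, done = env.step(action)
            if done:
                obs = env.reset()  # auto-reset keeps the batch full
            batch.append((obs, reward, done))
        return batch

venv = VectorEnv(n=8)
obs = venv.reset()
transitions = venv.step(actions=[0] * 8)
print(len(transitions))  # 8 transitions per step instead of 1
```

The payoff is that the policy network is queried once per batch of 8 observations rather than 8 times, which is where most of the wall-clock savings come from.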

  • ML-Papers-of-the-Week

    🔥Highlighting the top ML papers every week.

    Project mention: [D] Where can I find a list of the foundational academic papers in RL/ML/DL and what are your go-to places to find new academic papers in RL/ML/DL? | /r/MachineLearning | 2023-07-07

    Labml.ai stopped working in May. I like https://github.com/dair-ai/ML-Papers-of-the-Week

  • generative-ai

    Sample code and notebooks for Generative AI on Google Cloud (by GoogleCloudPlatform)

    Project mention: Google Imagen 2 | news.ycombinator.com | 2023-12-13

    I've used code based on similar examples from GitHub [1]. According to the docs [2], imagegeneration@005 was released on the 11th, so I guessed it's Imagen 2, though there's no confirmation.

    [1] https://github.com/GoogleCloudPlatform/generative-ai/blob/ma...

    [2] https://console.cloud.google.com/vertex-ai/publishers/google...

  • nlpaug

    Data augmentation for NLP

  • ArtLine

    A Deep Learning based project for creating line art portraits.

    Project mention: [P] ControlNet + ArtLine, Transform portrait styles with written instructions. GitHub Link in comments | /r/MachineLearning | 2023-02-23

    GitHub Link: https://github.com/vijishmadhavan/ArtLine

  • Dreambooth-Stable-Diffusion

    Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focused on training faces, objects, and styles. (by JoePenna)

    Project mention: Will there be comprehensive tutorials for fine-tuning SD XL when it comes out? | /r/StableDiffusion | 2023-07-01

    Tons of stuff here, no? https://github.com/JoePenna/Dreambooth-Stable-Diffusion/

  • clip-retrieval

    Easily compute clip embeddings and build a clip retrieval system with them

    Project mention: [D] data for handwriting recognition | /r/MLQuestions | 2023-05-09

    The tool clip-retrieval lets you filter those 400 million images down to whatever subsets you're interested in; for example, 10,000 images of (mostly) handwriting.
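The retrieval idea underneath is straightforward: embed the text query and every image with CLIP, rank images by cosine similarity to the query, and keep the top matches. A minimal sketch with tiny stand-in vectors (real CLIP embeddings are 512+-dimensional and come from the model, not hand-written lists):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Stand-in embeddings: in practice these come from a CLIP model.
query = [0.9, 0.1, 0.0]  # e.g. the text "a page of handwriting"
images = {
    "handwriting_1.jpg": [0.8, 0.2, 0.1],
    "handwriting_2.jpg": [0.9, 0.0, 0.2],
    "cat.jpg":           [0.0, 0.1, 0.9],
}

def filter_by_query(query, images, k):
    """Return the k image names most similar to the query embedding."""
    ranked = sorted(images, key=lambda name: cosine(query, images[name]),
                    reverse=True)
    return ranked[:k]

top = filter_by_query(query, images, k=2)
print(top)  # ['handwriting_1.jpg', 'handwriting_2.jpg']
```

At LAION scale the sort is replaced by an approximate nearest-neighbour index, but the ranking criterion is the same.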

  • machine-learning-experiments

    🤖 Interactive Machine Learning experiments: 🏋️models training + 🎨models demo

  • imodels

    Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).

  • vertex-ai-samples

    Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud

    Project mention: Gemini 1.5 outshines GPT-4-Turbo-128K on long code prompts, HVM author | news.ycombinator.com | 2024-02-18
  • tensor-house

    A collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more.

  • Deep-Learning-In-Production

    Build, train, deploy, scale and maintain deep learning models. Understand ML infrastructure and MLOps using hands-on examples.

  • chameleon-llm

    Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".

    Project mention: Giving GPT “Infinite” Knowledge | news.ycombinator.com | 2023-05-08

    > Do you know any active research in this area? I briefly considered playing with this, but my back-of-the-envelope semi-educated feeling for now is that it won't scale.

    I am aware of a couple of potentially promising research directions. One is a formal academic effort called Chameleon [0], and one is more of a grassroots organic effort that aims to build an actually functional Auto-GPT-like system, called Agent-LLM [1]. I have read the Chameleon paper, and I must say I'm quite impressed with their architecture. It adds a few bits and pieces that most of the early GPT-based agents didn't have, and I have a strong intuition that these will contribute to such agents actually working.

    Auto-GPT is another, relatively famous piece of work in this area. However, at least as of v0.2.2, I found it relatively underwhelming. For any online knowledge retrieval+synthesis and retrieval+usage tasks it seemed to get stuck, but it did sort-of-kind-of OK on plain online knowledge retrieval. After having a look at the Auto-GPT source code, my intuition (yes, I know, "fuzzy feelings without a solid basis", but I don't have the AI background to explain it with crystal-clear wording) is that the poor performance of the current version comes down to insufficient skill in prompt-chain architecture and the surprisingly low-quality, at times buggy, code.

    I think Auto-GPT has some potential. The implementation lets down the concept, but that's just a question of refactoring the prompts and the overall code, which the upstream GitHub repo seems to have been quite busy with, so I might give it another go in a couple of weeks to see how far it's moved forward.

    > Specifically, as task complexity grows, the amount of results to combine will quickly exceed the context window size of the "combiner" GPT-4. Sure, you can stuff another layer on top, turning it into a tree/DAG, but eventually, I think the partial result itself will be larger than 8k, or even 32k tokens - and I feel this "eventually" will be hit rather quickly. But maybe my feelings are wrong and there is some mileage in this approach.

    Auto-GPT uses an approach based on summarisation and something I'd term 'micro-agents'. For example, when Auto-GPT is searching for an answer to a particular question online, for each search result it finds, it spins up a sub-chain that gets asked a question like 'What does this page say about X?' or 'Based on the contents of this page, how can you do Y?'. Ultimately, intelligence is about lossy compression, and this is starkly exposed when it comes to LLMs because you have no choice but to lose some information.
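    The micro-agent pattern described above can be sketched roughly like this, with a stub standing in for the real LLM call (all names hypothetical):

```python
def ask_llm(prompt):
    """Stub standing in for a real LLM API call; returns a canned summary."""
    return f"summary of: {prompt[:40]}..."

def micro_agent(page_text, question):
    """Spin up a sub-chain that answers one narrow question about one page."""
    prompt = f"Based on the following page, {question}\n\n{page_text}"
    return ask_llm(prompt)

def research(question, pages):
    # One micro-agent per search result; the top-level agent only ever
    # sees the (lossy) per-page summaries, never the full pages.
    notes = [micro_agent(page, question) for page in pages]
    combined = "\n".join(notes)
    return ask_llm(f"Synthesise an answer to '{question}' from:\n{combined}")

answer = research("what does this page say about X?",
                  ["page one text", "page two text"])
print(answer)
```

    The compression happens at both layers: each micro-agent discards most of its page, and the combiner sees only the summaries, which is exactly where the information loss the commenter describes creeps in.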

    > I think the partial result itself will be larger than 8k, or even 32k tokens - and I feel this "eventually" will be hit rather quickly. But maybe my feelings are wrong and there is some mileage in this approach.

    The solution to that would be to synthesise output section by section, or even as an "output stream" that can be captured and/or edited outside the LLM, in whole or in chunks. I do think there's some mileage to be exploited in a recursive "store, summarise, synthesise" approach, but the problem will be that of signal loss. Every time you pass a subtask to a sub-agent, or summarise the outcome of that sub-agent into your current knowledge base, some noise is introduced. The signal-to-noise ratio may degrade as higher- and higher-order LLM chains are used, analogously to how terrible it was to use electricity or radio waves before amplification technology became available.

    One possible avenue to explore to crack down on decreasing SNR (based on my own original research, but I can also see some people disclosing online that they are exploring the same path), is to have a second LLM in the loop, double-checking the result of the first one. This has some limitations, but I have successfully used this approach to verify that, for example, the LLM does not outright refuse to carry out a task. This is currently cost-prohibitive to do in a way that would make me personally satisfied and confident enough in the output to make it run full-auto, but I expect that increasing ability to run AI locally will make people more willing to experiment with massive layering of cooperating LLM chains that check each others' work, cooperate, and/or even repeat work using different prompts to pick the best output a la redundant avionics computers.

    [0]: https://github.com/lupantech/chameleon-llm
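    The two-model checking loop from the last paragraph of that comment might look roughly like the sketch below. The stubs are hypothetical stand-ins for real model calls, and a real verifier would do far more than scan for refusal phrases; this only illustrates the control flow of "generate, check, retry":

```python
def worker_llm(task):
    """Stub for the primary model; a real call might refuse or drift off-task."""
    return f"draft answer for: {task}"

def verifier_llm(task, draft):
    """Stub for the second model that checks the first one's output.
    Here it only flags outright refusals, as the comment describes."""
    refusal_markers = ("i cannot", "i can't", "as an ai")
    return not any(marker in draft.lower() for marker in refusal_markers)

def run_with_check(task, max_attempts=3):
    """Retry until the verifier accepts the draft, or give up."""
    for _ in range(max_attempts):
        draft = worker_llm(task)
        if verifier_llm(task, draft):
            return draft
    raise RuntimeError("no acceptable output after retries")

result = run_with_check("summarise the paper")
print(result)  # draft answer for: summarise the paper
```

    The cost concern in the comment follows directly from this structure: every accepted answer costs at least two model calls, and every rejection multiplies that.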

  • PConv-Keras

    Unofficial implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions". Try at: www.fixmyphoto.ai

  • Basic-Mathematics-for-Machine-Learning

    The motive behind creating this repo is to feel the fear of mathematics and still do whatever you want to do in Machine Learning, Deep Learning, and other fields of AI.

  • Reactors

    🌱 Join a community of developers at Microsoft Reactor and connect with people, skills, and technology to build your career or personal learning. We offer free livestreams, on-demand content, and hybrid/in-person events daily around the world. Access our projects and code here.

    Project mention: Michael Mumbauer speaks to a packed crowd at Microsoft Reactor SF during GDC 2023, talking all things Ashfall, the multimedia AAA IP utilizing Hedera to unleash the full potential of web3 entertainment. I'll post the video when available. | /r/Hedera | 2023-03-22

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-18.

What are some of the best open-source AI projects in Jupyter Notebook? This list will help you:

Project Stars
1 google-research 32,196
2 AI-For-Beginners 27,371
3 generative-ai-for-beginners 23,929
4 learnopencv 19,978
5 h4cker 15,899
6 StableLM 15,782
7 stable-diffusion-webui-colab 14,861
8 dopamine 10,318
9 ML-Papers-of-the-Week 7,132
10 generative-ai 4,749
11 nlpaug 4,248
12 ArtLine 3,532
13 Dreambooth-Stable-Diffusion 3,133
14 clip-retrieval 1,985
15 machine-learning-experiments 1,572
16 imodels 1,256
17 vertex-ai-samples 1,220
18 tensor-house 1,120
19 Deep-Learning-In-Production 1,041
20 chameleon-llm 985
21 PConv-Keras 893
22 Basic-Mathematics-for-Machine-Learning 548
23 Reactors 498