Top 23 PyTorch Open-Source Projects
-
For those interested in this innovative tool, accessing the GitHub repository at https://github.com/AUTOMATIC1111/stable-diffusion-webui provides further details and instructions on how to utilize its features effectively. Embrace the future of creativity and unlock new possibilities with this enhanced web interface for Stable Diffusion.
-
    import os
    from pathlib import Path
    from tempfile import mkdtemp

    from dotenv import load_dotenv
    from langchain_core.prompts import PromptTemplate
    from langchain_docling.loader import ExportType
    from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline  # Import pipeline
    from langchain_community.llms import HuggingFacePipeline


    def _get_env_from_colab_or_os(key):
        # Prefer a Colab secret when running in Colab; otherwise fall back to the environment.
        try:
            from google.colab import userdata

            try:
                return userdata.get(key)
            except userdata.SecretNotFoundError:
                pass
        except ImportError:
            pass
        return os.getenv(key)


    load_dotenv()

    # https://github.com/huggingface/transformers/issues/5486:
    os.environ["TOKENIZERS_PARALLELISM"] = "false"

    HF_TOKEN = _get_env_from_colab_or_os("HF_TOKEN")
    print(f"The value of HF_TOKEN is: '{HF_TOKEN}'")
    FILE_PATH = ["https://arxiv.org/pdf/2408.09869"]  # Docling Technical Report
    EMBED_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"
    GEN_MODEL_ID = "mistralai/Mixtral-8x7B-Instruct-v0.1"

    # Added modifications - Mistral 7B Instruct v0.1
    tokenizer = AutoTokenizer.from_pretrained(GEN_MODEL_ID, token=HF_TOKEN)
    model = AutoModelForCausalLM.from_pretrained(GEN_MODEL_ID, token=HF_TOKEN)
    # Create the text-generation pipeline (named so it does not shadow the imported pipeline())
    text_gen_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
    # Initialize Langchain LLM using the pipeline
    llm = HuggingFacePipeline(pipeline=text_gen_pipeline)
    ### END of added modifications

    EXPORT_TYPE = ExportType.DOC_CHUNKS
    QUESTION = "Which are the main AI models in Docling?"
    PROMPT = PromptTemplate.from_template(
        "Context information is below.\n---------------------\n{context}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {input}\nAnswer:\n",
    )
    TOP_K = 3
    MILVUS_URI = str(Path(mkdtemp()) / "docling.db")
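As a rough illustration of how the `PROMPT` template and `TOP_K` in the snippet fit together, the sketch below fills the same template text with plain `str.format`, standing in for LangChain's `PromptTemplate`. The `build_prompt` helper, the way chunks are joined, and the sample chunks are all assumptions made for illustration; in the actual pipeline the chunks would presumably come from the Milvus-backed retriever.

```python
# Plain-Python stand-in for the PromptTemplate in the snippet (assumption:
# retrieved chunks are joined with blank lines before being inserted).
PROMPT_TEXT = (
    "Context information is below.\n---------------------\n{context}\n"
    "---------------------\nGiven the context information and not prior "
    "knowledge, answer the query.\nQuery: {input}\nAnswer:\n"
)
TOP_K = 3


def build_prompt(chunks, question, top_k=TOP_K):
    """Hypothetical helper: keep the first top_k chunks and fill the template."""
    context = "\n\n".join(chunks[:top_k])
    return PROMPT_TEXT.format(context=context, input=question)


# Made-up chunks, assumed to be ordered by relevance already.
sample_chunks = ["chunk A", "chunk B", "chunk C", "chunk D"]
prompt = build_prompt(sample_chunks, "Which are the main AI models in Docling?")
print(prompt)
```

Only the first `TOP_K` chunks reach the model, so raising `TOP_K` trades a longer prompt for more context.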
-
ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Method 1 (Basic API): websockets_api_example.py
-
>Chollet, a French computer scientist and one of the industry’s sharpest skeptics
I feel like this description really buries the lede on Chollet's expertise. (For those who don't know, he's the creator of and lead contributor[0] to Keras)
[0] https://github.com/keras-team/keras/graphs/contributors
-
nn
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
Project mention: ChatGPT unexpectedly began speaking in a user's cloned voice during testing | news.ycombinator.com | 2024-08-11
-
There are several implementations of the YOLO algorithm available, but for ease-of-use, we will use the Ultralytics implementation in this guide. We will implement and test the code locally and then deploy to Koyeb's GPUs for higher inference speed.
-
Resource: vLLM
-
While I appreciate the pictures, really at the end of the day all you have is a glossary and slightly more detailed arbitrary hand waving.
What specific architecture is used to build a basic model?
Why is that specific combination of basic building blocks used?
Why does it work when other similar ones don’t?
I generally approve of simplifications, but these LLM simplifications are too vague and broad to be useful or meaningful.
Here’s my challenge: take that article and write an LLM.
No?
How about an article on raytracing?
Why is building an LLM miles of explanation of concepts and nothing concrete you can actually build?
Where’s my “LLM in a weekend” that covers the theory and how to actually implement one?
The distinction between this and something like https://github.com/rasbt/LLMs-from-scratch is stark.
My hot take is that if you haven’t built one, you don’t actually understand how they work; you just have a vague, kind-of-heard-of-it understanding, which is not the same thing.
-
Project mention: Show HN: Using YOLO to Detect Office Chairs in 40M Hotel Photos | news.ycombinator.com | 2025-01-25
They did it on their own computer. https://github.com/ultralytics/ultralytics
-
Project mention: Show HN: Voice-Pro – AI Voice Cloning Magic: Transform Any Voice in 15 Seconds | news.ycombinator.com | 2024-11-27
It's really easy for a technical person to do as well.
I use Coqui TTS[0] as part of my home automation. I wrote a small Python script that lets me upload a voice clip for it to clone after I got the idea from HeyWillow[1], and a small shim that lets me send the output to a Home Assistant media player instead of using their standard output device. I run the TTS container on a VM with a Tesla P4 (~£100 to buy) and get about 1x-2x (roughly the same time it'd take to say it, to process) using the large model.
Just for a giggle, I uploaded a few 3-5 second clips of myself speaking and cloned my voice, then executed a command to our living room media player to call my wife into the room; from another room, she was 100% convinced it was me speaking words I'd never spoken.
I tried playing with a variety of sentences for a few hours and overall, it sounded almost exactly like me, to me, with the exception of some "attitude" and "intonation" I know I wouldn't use in my speech. I didn't notice much of an improvement using much longer clips; the short ones were "good enough".
Tangentially, it really bugs me that most phone providers in the UK insist you record a "personal greeting" now before they'll let you check your voice mail box, I just record silence, because the last thing I want/need is a voicemail greeting in my voice confirming to some randomer I didn't want calling me, who I am and that my number is active, even more so knowing how I can
[0] https://github.com/coqui-ai/TTS
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Project mention: DeepSpeed-Domino: Communication-Free LLM Training Engine | news.ycombinator.com | 2024-11-26
-
Ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
I'm guessing this comment is some kind of "if you know, you know." Likely starting from https://docs.ray.io/en/latest/cluster/vms/user-guides/launch... and then trawling through one of these I guess https://github.com/ray-project/ray/issues?q=is%3Aissue+prem+...
-
Project mention: Deep Live Cam: Real-Time Face Swapping and One-Click Video Deepfake Tool | news.ycombinator.com | 2024-08-10
Interesting... This project is built upon "GFPGAN v1.4" (https://github.com/TencentARC/GFPGAN) and "FaceSwap Extension - Automatic 1111 - Proof of Concept" (https://github.com/revolverocelot1/-webui-faceswap-unlocked). The GFPGAN project is in turn based on the paper "GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior" by Wang et al.
-
MockingBird
🚀 AI voice cloning: clone your voice in 5 seconds and generate arbitrary speech content in real time
-
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Project mention: This PR content was generated automatically using cover-agent | news.ycombinator.com | 2024-11-19
Those are some pointless tests.
E.g. test_activation_stats_functions [1] just checks that the returned value is a float and that it can take random numbers as input.
test_get_state_dict_custom_unwrap [2] is probably supposed to check that custom_unwrap is invoked, but since it neither records being called nor transforms its input, the assertions can't actually check that it was called.
[1] https://github.com/huggingface/pytorch-image-models/pull/233...
[2] https://github.com/huggingface/pytorch-image-models/pull/233...
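The second point can be made concrete with a small sketch. It does not use the actual library code or the tests from the linked PR: `get_state_dict` below is a simplified, hypothetical stand-in that applies an unwrap function before reading the state dict, and `FakeModel` is made up for illustration. Wrapping the unwrap function in `unittest.mock.Mock` records the call, so the test fails if the function is never invoked.

```python
from unittest.mock import Mock


class FakeModel:
    """Made-up model exposing only state_dict()."""

    def state_dict(self):
        return {"weight": 1.0}


def get_state_dict(model, unwrap_fn):
    # Hypothetical stand-in for the function under test: unwrap, then read state.
    return unwrap_fn(model).state_dict()


def test_custom_unwrap_is_invoked():
    model = FakeModel()
    # An identity unwrap wrapped in a Mock so that each call is recorded.
    custom_unwrap = Mock(side_effect=lambda m: m)

    state = get_state_dict(model, custom_unwrap)

    # This fails if get_state_dict ignores the custom unwrap function.
    custom_unwrap.assert_called_once_with(model)
    assert state == {"weight": 1.0}


test_custom_unwrap_is_invoked()
```

An alternative with the same effect would be an unwrap function that transforms its input (for example, returning a different model object), so the returned state dict itself proves the call happened.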
-
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
-
Project mention: Show HN: Mycelium – A graph viewer library for neural networks | news.ycombinator.com | 2024-09-06
-
pytorch-lightning
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
Project mention: SB-1047 will stifle open-source AI and decrease safety | news.ycombinator.com | 2024-04-29
It's very easy to get started, right in your Terminal, no fees! No credit card at all.
And there are cloud providers like https://replicate.com/ and https://lightning.ai/ that will let you use your LLM via an API key just like you did with OpenAI if you need that.
You don't need OpenAI - nobody does.
PyTorch related posts
-
Google DeepMind Unveils QuestBench to Enhance LLM Evaluation
-
Lovely Tensors: Tensors, for human consumption
-
Show HN: A Medical Research Agent Built with BioMCP and Haystack
-
Building a RAG with Docling and LangChain
-
Complete Large Language Model (LLM) Learning Roadmap
-
This Bench Does Not Exist
-
🩷 Build an AI chatbot that offers encouragement with Python and Transformers
-
A note from our sponsor - Judoscale
judoscale.com | 23 Apr 2025
Index
What are some of the best open-source PyTorch projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | stable-diffusion-webui | 151,591 |
2 | transformers | 143,133 |
3 | ComfyUI | 74,692 |
4 | Keras | 62,884 |
5 | nn | 60,047 |
6 | Real-Time-Voice-Cloning | 54,052 |
7 | yolov5 | 53,449 |
8 | vllm | 45,365 |
9 | LLMs-from-scratch | 44,756 |
10 | ultralytics | 39,737 |
11 | TTS | 39,540 |
12 | Made-With-ML | 38,420 |
13 | DeepSpeed | 38,004 |
14 | Ray | 36,619 |
15 | GFPGAN | 36,597 |
16 | MockingBird | 36,152 |
17 | pytorch-image-models | 33,822 |
18 | fairseq | 31,337 |
19 | mmdetection | 30,848 |
20 | pytorch-tutorial | 30,730 |
21 | Real-ESRGAN | 30,586 |
22 | netron | 29,962 |
23 | pytorch-lightning | 29,314 |