Python gpt-2

Open-source Python projects categorized as gpt-2

Top 23 Python gpt-2 Projects

  • RWKV-LM

    RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

  • Project mention: Do LLMs need a context window? | news.ycombinator.com | 2023-12-25

    https://github.com/BlinkDL/RWKV-LM#rwkv-discord-httpsdiscord... lists a number of implementations of various versions of RWKV.

    https://github.com/BlinkDL/RWKV-LM#rwkv-parallelizable-rnn-w... :

    > RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced as "RwaKuv", from 4 major params: R W K V)

    > RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode.

    > So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding (using the final hidden state).

    > "Our latest version is RWKV-6,*
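
    The property quoted above (only the state at position t is needed to reach position t+1, while a parallel pass over the prefix can produce the same state) can be illustrated with a toy recurrence. This is a minimal sketch and not RWKV's actual formulation: it assumes a scalar decayed sum, and `DECAY`, `step`, `run_recurrent`, and `run_parallel` are hypothetical names chosen only for illustration.

```python
from typing import List

DECAY = 0.5  # hypothetical decay factor, chosen only for illustration


def step(state: float, x: float, decay: float = DECAY) -> float:
    """One recurrent step: the new state depends only on the previous state and the current input."""
    return decay * state + x


def run_recurrent(xs: List[float], decay: float = DECAY) -> List[float]:
    """Sequential ("RNN-style") pass: carry a single state from one position to the next."""
    states = []
    state = 0.0
    for x in xs:
        state = step(state, x, decay)
        states.append(state)
    return states


def run_parallel(xs: List[float], decay: float = DECAY) -> List[float]:
    """Prefix ("GPT-style") pass: each position is computed directly from the inputs
    up to that position, so positions do not depend on each other's results."""
    return [
        sum(decay ** (t - i) * xs[i] for i in range(t + 1))
        for t in range(len(xs))
    ]


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    print(run_recurrent(xs))
    print(run_parallel(xs))
```

    In this toy setting both passes give the same states, so the last state from the prefix pass can seed further sequential steps, for example `step(run_parallel(xs)[-1], 5.0)`.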

  • LoRA

    Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

  • Project mention: DECT NR+: A technical dive into non-cellular 5G | news.ycombinator.com | 2024-04-02

    This seems to be an order of magnitude better than LoRa (https://lora-alliance.org/ not https://arxiv.org/abs/2106.09685). LoRa doesn't have all the features this one does like OFDM, TDM, FDM, and HARQ. I didn't know there's spectrum dedicated for DECT use.

  • GPT2-Chinese

    Chinese version of GPT2 training code, using BERT tokenizer.

  • awesome-pretrained-chinese-nlp-models

    Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pre-trained models, large models, multimodal models, and large language models

  • picoGPT

    An unnecessarily tiny implementation of GPT-2 in NumPy.

  • Project mention: Understanding Automatic Differentiation in 30 lines of Python | news.ycombinator.com | 2023-08-24

    In that case, you might also enjoy https://jaykmody.com/blog/gpt-from-scratch/

    (here's the raw code: https://github.com/jaymody/picoGPT/blob/main/gpt2.py)

  • xTuring

    Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6

  • Project mention: I'm developing an open-source AI tool called xTuring, enabling anyone to construct a Language Model with just 5 lines of code. I'd love to hear your thoughts! | /r/machinelearningnews | 2023-09-07

    Explore the project on GitHub here.

  • DialoGPT

    Large-scale pretraining for dialogue

  • Code-LMs

    Guide to using pre-trained large language models of source code

  • Project mention: PolyCoder LLM integration | /r/neovim | 2023-05-23
  • transfer-learning-conv-ai

    🦄 State-of-the-Art Conversational AI with Transfer Learning

  • Discord-AI-Chatbot

    This Discord chatbot is incredibly versatile. Powered by the incredibly fast Groq API.

  • Project mention: Discord bot for OpenAI API Key? | /r/ChatGPT | 2023-12-07
  • this-word-does-not-exist

    This Word Does Not Exist

  • Project mention: Ask HN: How do you name software? | news.ycombinator.com | 2024-02-10
  • TencentPretrain

    Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo

  • TextRL

    Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in huggingface's transformer (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

  • DialogRPT

    EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"

  • MAGIC

    Language Models Can See: Plugging Visual Controls in Text Generation (by yxuansu)

  • CapDec

    CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)

  • Project mention: Open source – Unsupervised captioning getting closer to supervised captioning | news.ycombinator.com | 2024-04-20
  • transformer-lm

    Transformer language model (GPT-2) with sentencepiece tokenizer

  • pistoBot

    Create an AI that chats like you

  • openai-detector

    AI classifier for indicating AI-written text

  • namekrea

    NameKrea is an AI Domain Name Generator which uses GPT-2

  • nanoChatGPT

    nanogpt turned into a chat model (by VatsaDev)

  • Project mention: A full tutorial on turning GPT-2 into a conversational AI | news.ycombinator.com | 2023-08-31

    Hi, Vatsa here, this is a tutorial on turning GPT-2 into a conversational bot. It was a fun project, and I hope you like it!

    github -> https://github.com/VatsaDev/nanoChatGPT

  • AdaVAE

    [Preprint] AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling PyTorch Implementation

  • Extracting-Training-Data-from-Large-Langauge-Models

    A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source gpt-2 projects in Python? This list will help you:

#   Project                                              Stars
1   RWKV-LM                                              11,619
2   LoRA                                                 8,956
3   GPT2-Chinese                                         7,348
4   awesome-pretrained-chinese-nlp-models                4,193
5   picoGPT                                              3,081
6   xTuring                                              2,515
7   DialoGPT                                             2,315
8   Code-LMs                                             1,716
9   transfer-learning-conv-ai                            1,712
10  Discord-AI-Chatbot                                   1,259
11  this-word-does-not-exist                             1,009
12  TencentPretrain                                      975
13  TextRL                                               518
14  DialogRPT                                            336
15  MAGIC                                                245
16  CapDec                                               169
17  transformer-lm                                       163
18  pistoBot                                             139
19  openai-detector                                      98
20  namekrea                                             49
21  nanoChatGPT                                          47
22  AdaVAE                                               32
23  Extracting-Training-Data-from-Large-Langauge-Models  26
