Python attention-mechanism

Open-source Python projects categorized as attention-mechanism

Top 23 Python attention-mechanism Projects

  • vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
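
    For a sense of the API, here is a minimal usage sketch along the lines of the project's README (the hyperparameters below are arbitrary, not recommendations):

      import torch
      from vit_pytorch import ViT

      # Build a small ViT: images are split into 32x32 patches and fed
      # through a single transformer encoder to produce class logits.
      v = ViT(
          image_size = 256,
          patch_size = 32,
          num_classes = 1000,
          dim = 1024,
          depth = 6,
          heads = 16,
          mlp_dim = 2048
      )

      img = torch.randn(1, 3, 256, 256)
      preds = v(img)  # shape (1, 1000)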

  • RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

    Project mention: Do LLMs need a context window? | news.ycombinator.com | 2023-12-25

    https://github.com/BlinkDL/RWKV-LM#rwkv-discord-httpsdiscord... lists a number of implementations of various versions of RWKV.

    https://github.com/BlinkDL/RWKV-LM#rwkv-parallelizable-rnn-w... :

    > RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced as "RwaKuv", from 4 major params: R W K V)

    > RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode.

    > So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding (using the final hidden state).

    > "Our latest version is RWKV-6,*

  • DALLE-pytorch

    Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

  • x-transformers

    A concise but complete full-attention transformer with a set of promising experimental features from various papers

    Project mention: x-transformers | news.ycombinator.com | 2024-03-31
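
    As a quick illustration, a GPT-style decoder can be assembled in a few lines, following the pattern in the project's README:

      import torch
      from x_transformers import TransformerWrapper, Decoder

      # Autoregressive decoder-only transformer over a 20k-token vocabulary.
      model = TransformerWrapper(
          num_tokens = 20000,
          max_seq_len = 1024,
          attn_layers = Decoder(
              dim = 512,
              depth = 6,
              heads = 8
          )
      )

      x = torch.randint(0, 20000, (1, 1024))
      logits = model(x)  # shape (1, 1024, 20000)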
  • awesome-graph-classification

    A collection of important graph embedding, classification and representation learning papers with implementations.

  • GAT

    Graph Attention Networks (https://arxiv.org/abs/1710.10903)
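
    The core of the paper is a learned attention coefficient over graph neighbors. A toy single-head, dense-adjacency layer written from the paper's equations (not from this repo's code) looks like:

      import torch
      import torch.nn.functional as F

      # e_ij = LeakyReLU(a^T [W h_i || W h_j]); alpha = softmax over neighbors.
      def gat_layer(h, adj, W, a):
          z = h @ W                                      # (n, f_out)
          n = z.size(0)
          pairs = torch.cat([z.repeat_interleave(n, 0),  # every (i, j) pair
                             z.repeat(n, 1)], dim=-1)
          e = F.leaky_relu(pairs @ a, 0.2).view(n, n)    # attention logits
          e = e.masked_fill(adj == 0, float('-inf'))     # mask non-neighbors
          return torch.softmax(e, dim=-1) @ z            # aggregate features

      n, f_in, f_out = 5, 8, 4
      h, adj = torch.randn(n, f_in), torch.eye(n)        # self-loops at minimum
      out = gat_layer(h, adj, torch.randn(f_in, f_out), torch.randn(2 * f_out))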

  • a-PyTorch-Tutorial-to-Image-Captioning

    Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

  • whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

    Project mention: Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old | news.ycombinator.com | 2024-02-28

    Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]

    [0] https://github.com/linto-ai/whisper-timestamped
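
    A minimal usage sketch along the lines of the project's README (the file name and language below are placeholders):

      import whisper_timestamped as whisper

      audio = whisper.load_audio("AUDIO.wav")
      model = whisper.load_model("tiny", device="cpu")
      result = whisper.transcribe(model, audio, language="en")

      # Each segment carries per-word timestamps and confidence scores.
      for segment in result["segments"]:
          for word in segment.get("words", []):
              print(word["text"], word["start"], word["end"], word["confidence"])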

  • reformer-pytorch

    Reformer, the efficient Transformer, in Pytorch

  • swarms

The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Join our community: https://discord.com/servers/agora-999382051935506503

    Project mention: Swarm, a new agent framework by OpenAI | news.ycombinator.com | 2024-10-11

Worth noting there is an interesting multi-agent open-source project named Swarms. When I saw this on X earlier, I thought maybe the team had joined OpenAI, but there's no connection between these projects.

    > "Swarms: The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework"

    [0] https://github.com/kyegomez/swarms

    [1] https://docs.swarms.world/en/latest/

  • alphafold2

    To eventually become an unofficial Pytorch implementation / replication of Alphafold2, as details of the architecture get released

  • soundstorm-pytorch

    Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch

  • flamingo-pytorch

    Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

  • perceiver-pytorch

    Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch

  • performer-pytorch

    An implementation of Performer, a linear attention-based transformer, in Pytorch
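
    The underlying trick is to reorder attention so its cost is linear in sequence length. A minimal sketch of that idea follows, using the simple elu+1 feature map of Katharopoulos et al. for clarity; Performer itself approximates the softmax kernel with random features (FAVOR+), which this sketch does not implement.

      import torch

      # softmax(Q K^T) V costs O(n^2); phi(Q) (phi(K)^T V) costs O(n).
      def linear_attention(q, k, v):
          q = torch.nn.functional.elu(q) + 1             # (b, n, d)
          k = torch.nn.functional.elu(k) + 1
          kv = torch.einsum('bnd,bne->bde', k, v)        # (b, d, e)
          z = 1 / torch.einsum('bnd,bd->bn', q, k.sum(dim=1))
          return torch.einsum('bnd,bde,bn->bne', q, kv, z)

      q = k = v = torch.randn(2, 128, 64)
      out = linear_attention(q, k, v)                    # (2, 128, 64)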

  • CoCa-pytorch

    Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

  • RETRO-pytorch

    Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

  • PaLM-pytorch

    Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways

  • tab-transformer-pytorch

    Implementation of TabTransformer, attention network for tabular data, in Pytorch

  • TimeSformer-pytorch

    Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification

  • memorizing-transformers-pytorch

    Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch

    Project mention: HMT: Hierarchical Memory Transformer for Long Context Language Processing | news.ycombinator.com | 2024-05-17

    Code: https://github.com/OswaldHe/HMT-pytorch

This looks really interesting. I've added the paper to my reading list and look forward to playing with the code. I'm curious to see what kinds of improvements we can get by augmenting Transformers and other generative language/sequence models with this and other mechanisms implementing hierarchical memory.[a]

    We sure live in interesting times!

    ---

    [a] In the past, I experimented a little with transformers that had access to external memory using https://github.com/lucidrains/memorizing-transformers-pytorc... and also using routed queries with https://github.com/glassroom/heinsen_routing . Both approaches seemed to work, but I never attempted to build any kind of hierarchy with those approaches.
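
    For intuition, the mechanism boils down to retrieving the top-k most similar cached (key, value) pairs and attending over just those. A toy exact-search version of that lookup follows; the paper and repo use approximate nearest neighbors for scale, which this sketch does not.

      import torch

      # Cache keys/values from past context, retrieve top-k per query,
      # then do ordinary attention over only the retrieved slots.
      def knn_memory_attention(q, mem_k, mem_v, top_k=4):
          sims = mem_k @ q                         # (m,) similarity scores
          idx = sims.topk(top_k).indices           # nearest memory slots
          attn = torch.softmax(sims[idx], dim=0)   # attention over retrieved
          return attn @ mem_v[idx]                 # (d,) weighted value sum

      d, m = 64, 1000
      mem_k, mem_v = torch.randn(m, d), torch.randn(m, d)
      out = knn_memory_attention(torch.randn(d), mem_k, mem_v)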

  • nuwa-pytorch

Implementation of NÜWA, state-of-the-art attention network for text-to-video synthesis, in Pytorch

  • parti-pytorch

    Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Index

What are some of the best open-source attention-mechanism projects in Python? This list will help you:

Project Stars
1 vit-pytorch 20,969
2 RWKV-LM 12,755
3 DALLE-pytorch 5,587
4 x-transformers 4,885
5 awesome-graph-classification 4,728
6 GAT 3,199
7 a-PyTorch-Tutorial-to-Image-Captioning 2,761
8 whisper-timestamped 2,114
9 reformer-pytorch 2,097
10 swarms 1,870
11 alphafold2 1,565
12 soundstorm-pytorch 1,436
13 flamingo-pytorch 1,203
14 perceiver-pytorch 1,096
15 performer-pytorch 1,088
16 CoCa-pytorch 1,076
17 RETRO-pytorch 849
18 PaLM-pytorch 821
19 tab-transformer-pytorch 818
20 TimeSformer-pytorch 694
21 memorizing-transformers-pytorch 623
22 nuwa-pytorch 540
23 parti-pytorch 524

