Python BERT

Open-source Python projects categorized as BERT

Top 23 Python BERT Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Project mention: GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s | reddit.com/r/deeplearning | 2022-12-02

    Looked into this last night and yeah, NVLink works the way you described; the marketing is misleading. There is no contiguous memory pool, just a faster interconnect, so model parallelisation may scale a bit better, but you still have to implement it. I also saw an example where some PyTorch GPT2 models scaled horrifically in training across multiple PCIe V100s and 3090s that didn't have NVLink, so that's a caveat with dual 4090s not having NVLink.

  • clip-as-service

    🏄 Embed/reason/rank images and sentences with CLIP models

    Project mention: Image Similarity Score using transfer learning | reddit.com/r/MLQuestions | 2022-08-31

  • PaddleNLP

    👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis and 🖼 Diffusion AIGC system etc.

    Project mention: The 10 Trending Python Repositories on GitHub (May 2022) | dev.to | 2022-06-23

    PaddleNLP

  • haystack

    🔍 Haystack is an open source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications.

    Project mention: New free tool that uses fine-tuned BERT model to surface answers from research papers | reddit.com/r/LanguageTechnology | 2022-10-28

    Some cool tools like HayStack that would be useful in putting some of these together.

  • ERNIE

    Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

    Project mention: ERNIE - ViLG 2.0 by Baidu | reddit.com/r/singularity | 2022-10-31
  • BERT-pytorch

    Google AI 2018 BERT PyTorch implementation

  • bertviz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    Project mention: using bert for relation extraction | reddit.com/r/MLQuestions | 2022-08-10

    2) BERT learns a lot in its embeddings: the BERTOLOGY paper (https://arxiv.org/abs/2002.12327) provides a great in-depth look at some of the broader linguistic traits that BERT learns. Different layers often learn different patterns, so the embeddings aren't really interpretable, but you can use something like bertviz (https://github.com/jessevig/bertviz) to explore attention weights across layers for predetermined examples

  • BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics.

    Project mention: How can I group domain specific keywords based on their word embeddings? | reddit.com/r/MLQuestions | 2022-11-30
  • Top2Vec

    Top2Vec learns jointly embedded topic, document and word vectors.

    Project mention: How can I group domain specific keywords based on their word embeddings? | reddit.com/r/MLQuestions | 2022-11-30
  • KeyBERT

    Minimal keyword extraction with BERT

    Project mention: [D]: Predict the most probable document including the answer to a given question | reddit.com/r/MachineLearning | 2022-04-28

    Keyword similarity using KeyBERT: https://github.com/MaartenGr/KeyBERT (i.e. load the keywords for each of the given documents and compare them to the keywords of the question)
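
The keyword-comparison idea in this mention can be sketched roughly as follows. This is a minimal illustration, not KeyBERT's API: `extract_keywords` is a hypothetical stand-in that keeps distinct words longer than three characters, and Jaccard overlap is an assumed choice of similarity. In practice the stand-in would be replaced by keywords produced by a real extractor such as KeyBERT.

```python
import re


def extract_keywords(text: str) -> set[str]:
    """Hypothetical stand-in for a keyword extractor such as KeyBERT:
    keep the distinct lowercase words longer than three characters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_documents(question: str, documents: dict[str, str]) -> list[tuple[str, float]]:
    """Score every document by keyword overlap with the question, best first."""
    question_keywords = extract_keywords(question)
    scores = {
        name: jaccard(question_keywords, extract_keywords(text))
        for name, text in documents.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy documents, invented for illustration only.
docs = {
    "bert": "BERT is a transformer encoder trained with masked language modeling.",
    "gpu": "NVLink is a fast interconnect between graphics cards.",
}
ranking = rank_documents("Which model uses masked language modeling?", docs)
print(ranking[0][0])  # name of the document most likely to contain the answer
```

Exact set overlap misses synonyms, which is one reason an embedding-based comparison of the keywords may work better than this sketch.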

  • ABSA-PyTorch

    Aspect Based Sentiment Analysis, PyTorch Implementations.

    Project mention: Is there an open-source way to replicate entity-level sentiment from Google's Cloud Natural Language API? | reddit.com/r/LanguageTechnology | 2021-12-06

    I'm learning about NLP and was really impressed with Google's Natural Language API. It seems that entity-level sentiment analysis is the future of NLP. Has anyone in the community come across open-source libraries that replicate the API (although of course with lower F1 scores)? I found an excellent repo called ABSA-PyTorch, but it seems that all the implementations are classification-based; that is, they return "positive/negative" rather than a spectrum between positive and negative. Is there a subfield of Aspect-Based Sentiment Analysis (ABSA) that isn't classification-based? I wasn't able to find any keywords despite hours of Google searching.

  • FARM

    🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

  • jiant

    jiant is an NLP toolkit

    Project mention: Any recommendation for the replacement of the toolkit jiant? [Research] [Discussion] | reddit.com/r/MachineLearning | 2022-06-11

    I am doing research in NLP with the toolkit jiant (https://github.com/nyu-mll/jiant). It is quite a nice and easy-to-use tool. Unfortunately, it is no longer maintained. Can anyone recommend another toolkit to replace it?

  • DeBERTa

    The implementation of DeBERTa

  • scibert

    A BERT model for scientific text.

    Project mention: Galactica: an AI trained on humanity's scientific knowledge | news.ycombinator.com | 2022-11-15
  • adapter-transformers

    Huggingface Transformers + Adapters = ❤️

    Project mention: [D] NLP question: does fine-tuning train input embedding? | reddit.com/r/MachineLearning | 2022-08-07

    Usually in computer vision resnets, people finetune only the last layers, but in NLP you tune the entire model. There are also plenty of instances where people try to not do this, such as in adapters, however.

  • BERT-NER

    Pytorch-Named-Entity-Recognition-with-BERT

  • contextualized-topic-models

    A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

    Project mention: Extract words from large data set of reviews by sentiment | reddit.com/r/MLQuestions | 2022-05-23

    Use CTM (https://github.com/MilaNLProc/contextualized-topic-models) with sentiment labels to build a distribution of words over labels

  • bertsearch

    Elasticsearch with BERT for advanced document search.

  • gector

    Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

  • Transformers4Rec

    Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.

    Project mention: New item prediction modules in open source libraries | reddit.com/r/datascience | 2022-02-22
  • beir

    A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

    Project mention: An alternative to Elasticsearch that runs on a few MBs of RAM | news.ycombinator.com | 2022-10-24

    There are actually benchmarks that allow measuring search relevancy objectively, e.g. BEIR[1]. The Manticore Search team made an effort to submit a PR to include it in the list. The results are here [2]. Unfortunately the BEIR team seems to be too busy to review a whole pile of PRs, including one about Vespa. Nevertheless it would be nice to have both Meilisearch and Typesense there too, since it would be interesting to see how those non-tf-idf based search engines perform compared to BM25-based and vector search engines.

    [1] https://github.com/beir-cellar/beir
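
To make "measuring search relevancy objectively" concrete, here is a minimal sketch of nDCG@k, the kind of ranking metric that IR benchmarks such as BEIR commonly report. It is not BEIR's own evaluation code: the function names are hypothetical, the linear-gain form of DCG is an assumed variant, and the document IDs and judgments are invented for illustration.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_doc_ids: list[str], judgments: dict[str, float], k: int = 10) -> float:
    """DCG of the system ranking divided by the DCG of an ideal ranking."""
    gains = [judgments.get(doc_id, 0.0) for doc_id in ranked_doc_ids]
    ideal = sorted(judgments.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy relevance judgments for one query: d1 and d2 are relevant, d3 is not.
judgments = {"d1": 1.0, "d2": 1.0}
perfect = ndcg_at_k(["d1", "d2", "d3"], judgments)
worse = ndcg_at_k(["d3", "d1", "d2"], judgments)
print(perfect, worse)
```

Because the score is normalized by the ideal ranking, values from different queries and engines are comparable, which is what allows BM25-based, vector, and other search engines to be placed on one leaderboard.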

  • PatrickStar

    PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-12-02.

Index

What are some of the best open-source BERT projects in Python? This list will help you:

Rank Project Stars
1 transformers 75,115
2 clip-as-service 11,034
3 PaddleNLP 6,682
4 haystack 6,122
5 ERNIE 5,329
6 BERT-pytorch 5,244
7 bertviz 4,641
8 BERTopic 3,427
9 Top2Vec 2,325
10 KeyBERT 2,001
11 ABSA-PyTorch 1,667
12 FARM 1,597
13 jiant 1,449
14 DeBERTa 1,195
15 scibert 1,186
16 adapter-transformers 1,105
17 BERT-NER 1,079
18 contextualized-topic-models 968
19 bertsearch 836
20 gector 713
21 Transformers4Rec 701
22 beir 691
23 PatrickStar 628