Metric | spark-nlp | spark-nlp-workshop |
---|---|---|
Mentions | 87 | 16 |
Stars | 3,695 | 999 |
Growth | 1.2% | 1.1% |
Activity | 9.3 | 9.6 |
Last commit | 10 days ago | 5 days ago |
Language | Scala | Jupyter Notebook |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
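As a rough illustration of the Growth and Activity definitions above, the arithmetic could be sketched as follows. The function names and the example star counts are hypothetical, and the linear mapping from activity score to "top X%" is an assumption extrapolated from the single stated data point (9.0 corresponds to the top 10%); the site's actual formula is not given here.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Return the change in stars from one month to the next, as a percentage."""
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) / previous_stars * 100.0


def top_share_from_activity(activity: float) -> float:
    """Map an activity score (0 to 10) to the 'top X%' share it would imply.

    ASSUMPTION: a linear scale anchored on the one documented point,
    9.0 -> top 10%. This is an illustrative guess, not the tracker's formula.
    """
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity must be between 0 and 10")
    return (10.0 - activity) * 10.0


if __name__ == "__main__":
    # Hypothetical star counts: 1,000 last month and 1,012 this month.
    print(round(month_over_month_growth(1000, 1012), 1))
    # Activity scores taken from the comparison table.
    print(round(top_share_from_activity(9.3), 1))
    print(round(top_share_from_activity(9.6), 1))
```

Under that linear assumption, the activity scores of 9.3 and 9.6 in the table would place the two projects in roughly the top 7% and top 4% of tracked projects.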
spark-nlp
- Spark NLP 5.1.0: Introducing state-of-the-art OpenAI Whisper speech-to-text, OpenAI Embeddings and Completion transformers, MPNet text embeddings, ONNX support for E5 text embeddings, new multi-lingual BART Zero-Shot text classification, and much more!
- PySpark for NLP Workshop - Materials and Jupyter Notebooks
  I recently had the opportunity to run a workshop at ODSC East, focusing on using PySpark for Natural Language Processing (NLP). I had a great time explaining PySpark's fundamentals and exploring the Spark NLP library.
- Spark-NLP 4.4.0: New BART for Text Translation & Summarization, new ConvNeXT Transformer for Image Classification, new Zero-Shot Text Classification by BERT, more than 4000+ state-of-the-art models, and many more! · JohnSnowLabs/spark-nlp
- Transformers.js
  I'd like to use this transformer model in Rust (because it's on the backend, because I can use data munging and it will be faster, and for other reasons). It looks like a good model! But it doesn't compile on Apple Silicon because of weird linking issues that aren't apparent - https://github.com/guillaume-be/rust-bert/issues/338. I've spent a large part of today and yesterday trying to find out why. The only other library that I've found for doing this kind of thing programmatically (particularly sentiment analysis) is this one (https://github.com/JohnSnowLabs/spark-nlp). Some of the models look a little older, which is OK, but it does mean that I'd have to do this in another language.
  Does anyone know of any sentiment analysis software that can be tuned (other than VADER - I'm looking for something more along the lines of a transformer model like BERT) but is pretrained and can be used in Rust or Python? Otherwise I'll probably end up using spark-nlp and having to spin up another process.
  Thanks.
- Release John Snow Labs Spark-NLP 4.3.0: New HuBERT for speech recognition, new Swin Transformer for Image Classification, new Zero-shot annotator for Entity Recognition, CamemBERT for question answering, new Databricks and EMR with support for Spark 3.3, 1000+ state-of-the-art models and many more!
spark-nlp-workshop
- FLaNK Stack Weekly 19 Feb 2024
- Spark-NLP 4.1.0 Released: Vision Transformer (ViT) is here! The very first Computer Vision pipeline for the state-of-the-art Image Classification task, AWS Graviton/ARM64 support, new EMR & Databricks support, 1000+ state-of-the-art models, and more!
  You can visit Spark NLP Workshop for 100+ examples
- Spark-NLP 4.0.0 🚀: New modern extractive Question answering (QA) annotators for ALBERT, BERT, DistilBERT, DeBERTa, RoBERTa, Longformer, and XLM-RoBERTa, official support for Apple silicon M1, support oneDNN to improve CPU up to 97%, improved transformers on GPU up to +700%, 1000+ SOTA models
  I submitted a pull request here: https://github.com/JohnSnowLabs/spark-nlp-workshop/pull/552 that I think addresses both of those.
- How AI is used for mental health therapy
  In John Snow Labs' implementation, for example, they wrote a search function called get_clinical_entities that finds all mentions of medications for 100 patients, along with any details about the quantity and frequency with which each medication is taken. The location of each sentence within the overall document is also recorded, which makes the information easier to find.
- John Snow Labs Spark-NLP 3.4.0: New OpenAI GPT-2, new ALBERT, XLNet, RoBERTa, XLM-RoBERTa, and Longformer for Sequence Classification, support for Spark 3.2, new distributed Word2Vec, extend support to more Databricks & EMR runtimes, new state-of-the-art transformer models, bug fixes, and lots more!
  There are so many examples here for Python users (I would start from tutorials/Certificate_Trainings): https://github.com/JohnSnowLabs/spark-nlp-workshop
- John Snow Labs Spark-NLP 3.1.0: Over 2600+ new models and pipelines in 200+ languages, new DistilBERT, RoBERTa, and XLM-RoBERTa transformers, support for external Transformers, and lots more!
  Spark NLP Workshop notebooks
- Release John Snow Labs Spark-NLP 2.7.0: New T5 and MarianMT seq2seq transformers, detect up to 375 languages, word segmentation, over 720+ models and pipelines, support for 192+ languages, and many more! · JohnSnowLabs/spark-nlp
  Spark NLP training certification notebooks for Google Colab and Databricks
What are some alternatives?
onnxruntime - ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
spark-nlp-display - A library for the simple visualization of different types of Spark NLP annotations.
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
proton - A streaming SQL engine, a fast and lightweight alternative to ksqlDB and Apache Flink, 🚀 powered by ClickHouse.
nlu - 1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
pytorch-sentiment-analysis - Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
magika - Detect file content types with deep learning
clj-djl - Clojure wrapper for Deep Java Library (DJL.ai)
Tribuo - Tribuo - A Java machine learning library
libpython-clj - Python bindings for Clojure
gector - Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)