-
Swahili-sentiment-Analysis-using-transformers
A repository demonstrating how you can use transformers for Swahili text classification.
I copied the first news article from the Train.csv file to see how the Swahili model handles it, and the model classifies it correctly. Because the text is long, it is not reproduced here; you can check it in the notebook.
Let's dive into the main topic of this article: training a transformer model for Swahili news classification. Since transformers are large, we need a wrapper to keep the task simple. If you are comfortable with PyTorch, you can use PyTorch Lightning, a wrapper for high-performance AI research, to wrap the transformers, but today let's go with ktrain, a lightweight Python wrapper for TensorFlow Keras.
With the Transformer API in ktrain, we can select any Hugging Face transformers model appropriate for our data. Since we are dealing with Swahili, we will use multilingual BERT, which ktrain normally uses for non-English datasets in its alternative text_classifier API. You can, however, opt for any other multilingual transformer model.
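As a rough illustration of that choice, here is a minimal sketch of setting up a multilingual BERT classifier with ktrain's Transformer API. The checkpoint name `bert-base-multilingual-cased`, the helper `build_learner`, and its `maxlen` and `batch_size` defaults are assumptions made for this example, not values taken from the notebook. The texts and labels are expected to be plain Python lists that you have already loaded from your own data.

```python
# Assumed checkpoint: the cased multilingual BERT model on the Hugging Face hub.
MODEL_NAME = "bert-base-multilingual-cased"


def build_learner(x_train, y_train, x_val, y_val, class_names,
                  maxlen=128, batch_size=16):
    """Hypothetical helper: wrap a Hugging Face model with ktrain.

    x_train / x_val are lists of raw text strings; y_train / y_val are the
    matching labels; class_names lists the label names.
    """
    # Imported here so the heavy dependencies load only when training starts.
    import ktrain
    from ktrain import text

    # The Transformer object handles tokenization for the chosen model.
    t = text.Transformer(MODEL_NAME, maxlen=maxlen, class_names=class_names)
    trn = t.preprocess_train(x_train, y_train)
    val = t.preprocess_test(x_val, y_val)

    # Build the classification model and wrap it in a ktrain learner.
    model = t.get_classifier()
    learner = ktrain.get_learner(model, train_data=trn, val_data=val,
                                 batch_size=batch_size)
    return t, learner
```

From there, training could look like `learner.fit_onecycle(5e-5, 3)`, where the learning rate and epoch count are only illustrative, and `ktrain.get_predictor(learner.model, preproc=t)` would give a predictor for new Swahili text. To try a different multilingual model, change `MODEL_NAME` to another checkpoint name.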