liar_dataset
dataset liar (by tfs4)
Fake-News-Classification
Exploration of Natural Language Processing techniques to create a prediction model using the LIAR dataset. (by moscatena)
liar_dataset | Fake-News-Classification | |
---|---|---|
1 | 1 | |
28 | 0 | |
- | - | |
10.0 | 10.0 | |
about 5 years ago | about 2 years ago | |
Jupyter Notebook | Jupyter Notebook | |
- | - |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
liar_dataset
Posts with mentions or reviews of liar_dataset.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-05-24.
-
On Data Quality
For our capstone project at Flatiron school we had to not only pitch the project we wanted to do, but also find a dataset that would allow us to accomplish that project. I chose to make a Fake News Classifier, and pitched a dataset I found on Kaggle for it. My instructor was quick in turning it down. It had no information on how that data was acquired, how it was labelled, and I had no means of verifying it. With some more research done I found the Liar dataset, which contained thousands of data points, humanly labelled by editors from politifact.com using a truthiness scale and which contained extensive metadata on each instance, making it verifiable. Once I settled on my final model, I decided to train a version of it on the rejected dataset, just for curiosity. The Accuracy it provided for the test data from that dataset was way higher than the one trained in the Liar dataset. Why was that? The model wasn't actually making correct predictions, it was just better at identifying the labels (which were not verified) from the dataset.
Fake-News-Classification
Posts with mentions or reviews of Fake-News-Classification.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-05-24.
-
On Data Quality
For our capstone project at Flatiron school we had to not only pitch the project we wanted to do, but also find a dataset that would allow us to accomplish that project. I chose to make a Fake News Classifier, and pitched a dataset I found on Kaggle for it. My instructor was quick in turning it down. It had no information on how that data was acquired, how it was labelled, and I had no means of verifying it. With some more research done I found the Liar dataset, which contained thousands of data points, humanly labelled by editors from politifact.com using a truthiness scale and which contained extensive metadata on each instance, making it verifiable. Once I settled on my final model, I decided to train a version of it on the rejected dataset, just for curiosity. The Accuracy it provided for the test data from that dataset was way higher than the one trained in the Liar dataset. Why was that? The model wasn't actually making correct predictions, it was just better at identifying the labels (which were not verified) from the dataset.
What are some alternatives?
When comparing liar_dataset and Fake-News-Classification you can also consider the following projects:
awesome-public-datasets - A topic-centric list of HQ open datasets.
chatgpt-comparison-detection - Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥
tf-transformers - State of the art faster Transformer with Tensorflow 2.0 ( NLP, Computer Vision, Audio ).