go-trafilatura
go-trafilatura is a Go port of the trafilatura Python library. (by markusmobius)
article-extraction-benchmark
Article extraction benchmark: dataset and evaluation scripts (by scrapinghub)
Metric | go-trafilatura | article-extraction-benchmark
---|---|---
Mentions | 1 | 1
Stars | 32 | 242
Growth | - | 5.8%
Activity | 7.9 | 0.0
Last commit | 10 months ago | almost 3 years ago
Language | HTML | Python
License | GNU General Public License v3.0 only | MIT License
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
go-trafilatura
Posts with mentions or reviews of go-trafilatura.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-03-30.
article-extraction-benchmark
Posts with mentions or reviews of article-extraction-benchmark.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-03-30.
What are some alternatives?
When comparing go-trafilatura and article-extraction-benchmark you can also consider the following projects:
unclutter - A modern reader mode and article library for your browser.
dom-distiller - Distills the DOM
go-domdistiller - Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no dependencies on Chromium and is meant to run as a command line program or on a server.
go-dateparser - go parser for human readable dates ported from the dateparser python package
arc90-readability - A copy of the original Arc90 repo with links to many of the current ports.
soup-strainer - A reimplementation of the Readability/Decruft algorithm using BeautifulSoup and html5lib
go-htmldate - CLI and Go package for extracting the publication date of web pages.