Metric | storm-crawler | Sparkler
---|---|---
Mentions | - | -
Stars | 862 | 409
Growth | 1.6% | 0.0%
Activity | 8.8 | 3.0
Last commit | 5 days ago | about 1 year ago
Language | HTML | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
We haven't tracked posts mentioning storm-crawler or Sparkler yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
Apache Nutch - Apache Nutch is an extensible and scalable web crawler
jsoup - jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
Crawler4j - Open Source Web Crawler for Java
PeARS-orchard - This is the development version of PeARS, the people's search engine. More compact but less robust than PeARS-federated. If you just want to use PeARS in real life, use PeARS-federated instead.
lucene - Apache Lucene open-source search software
scaling-to-distributed-crawling - Repository for the Mastering Web Scraping in Python: Scaling to Distributed Crawling blogpost with the final code.
Apache Solr - Apache Lucene and Solr open-source search software