| anon | wtf_wikipedia
---|---|---
Mentions | 3 | 1
Stars | 966 | 743
Growth | - | -
Activity | 0.0 | 8.0
Last commit | over 1 year ago | 16 days ago
Language | JavaScript | JavaScript
License | Creative Commons Zero v1.0 Universal | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
anon
-
Show HN: Explore Wikipedia edits made by institutions, companies and governments
There was a fun time when Ed Summers made a tool to monitor Wikipedia edits from a pool of IP addresses in real time, and it turned into a worldwide effort, with Twitter bots monitoring many governments and big corporations and highlighting a lot of cringeworthy edits and poor attempts to remove information from Wikipedia. Many bots are still active; you can find the source code and a list of bots here: https://github.com/edsu/anon
There is also an analysis of old edits (2002-2014) using the IP ranges collected for the bots: https://jarib.github.io/anon-history/ (source code: https://github.com/jarib/anon-history)
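The core idea in the comment above, matching the IP address of an anonymous edit against known IP ranges, can be sketched in a few lines of JavaScript. This is only an illustration, not anon's actual code: the `watchList` names and the helpers `ipToInt`, `inCidr` and `matchEditor` are hypothetical, the ranges are documentation-only addresses, and only IPv4 is handled.

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipToInt(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  const [a, b, c, d] = parts.map(Number);
  // ">>> 0" reinterprets the signed 32-bit result as unsigned.
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True if `ip` falls inside the CIDR block, e.g. "192.0.2.0/24".
function inCidr(ip, cidr) {
  const [base, bitsStr] = cidr.split('/');
  const bits = Number(bitsStr);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`Invalid CIDR block: ${cidr}`);
  }
  // JavaScript shifts are taken modulo 32, so a /0 mask is special-cased.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Hypothetical watch list (assumption: names and ranges are made up,
// using address blocks reserved for documentation).
const watchList = {
  'Example Agency': ['192.0.2.0/24'],
  'Example Corp': ['198.51.100.0/25', '203.0.113.0/24'],
};

// Return the name of every organisation whose ranges contain the editor's IP.
function matchEditor(ip, ranges = watchList) {
  return Object.entries(ranges)
    .filter(([, cidrs]) => cidrs.some((cidr) => inCidr(ip, cidr)))
    .map(([name]) => name);
}
```

With this sketch, `matchEditor('192.0.2.55')` returns `['Example Agency']`, while `matchEditor('198.51.100.200')` returns an empty array because the address lies outside the /25 block. A real bot would additionally need a feed of recent edits and IPv6 support.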
-
Sexual Assault Allegations Vanished from Potential Cori Bush Challenger’s Wikipedia Page | The edits to the page for state Sen. Steven Roberts came from an IP address in the Missouri Capitol.
Hell yea! It's open source!
-
My dad fuck my life up
anon, which powers @congressedits, among many others
wtf_wikipedia
-
Experimental library for scraping websites using OpenAI's GPT API
This may finally be a solution for scraping Wikipedia and turning it into structured data. (Or do we even need structured data in the post-AI age?)
MediaWiki markup is notoriously hard to parse:
* https://github.com/spencermountain/wtf_wikipedia#ok-first- - why it's hard
* https://techblog.wikimedia.org/2022/04/26/what-it-takes-to-p... - an entire article about parsing page TITLES
* https://osr.cs.fau.de/wp-content/uploads/2017/09/wikitext-pa... - a paper published about a wikitext parser
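One small reason wikitext resists simple parsing is that templates can nest. The sketch below is an illustration only, not how wtf_wikipedia or any real parser works: the function names and the sample string are hypothetical. It contrasts a naive regex with a depth-counting scan.

```javascript
// Naive approach: a non-greedy regex stops at the first "}}",
// so a nested template leaves debris behind.
function stripTemplatesNaive(wikitext) {
  return wikitext.replace(/\{\{.*?\}\}/gs, '');
}

// Depth-counting approach: track "{{" and "}}" pairs and drop
// everything that sits inside the outermost template.
function stripTemplates(wikitext) {
  let depth = 0;
  let out = '';
  for (let i = 0; i < wikitext.length; i++) {
    const pair = wikitext.slice(i, i + 2);
    if (pair === '{{') {
      depth++;
      i++; // skip the second brace
    } else if (pair === '}}' && depth > 0) {
      depth--;
      i++; // skip the second brace
    } else if (depth === 0) {
      out += wikitext[i];
    }
  }
  return out;
}

// Hypothetical sample with one template nested inside another.
const sample = 'Born {{birth date|{{year|1970}}|1|1}} in Toronto.';
```

Here `stripTemplatesNaive(sample)` yields `'Born |1|1}} in Toronto.'`, while `stripTemplates(sample)` yields `'Born  in Toronto.'`. Even the second version ignores triple-brace parameters, tables, links, `<nowiki>` sections and much else, which hints at why full parsers are so involved.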
What are some alternatives?
UserScripts - Novem Linguae's Wikipedia user scripts
sdow - Six Degrees of Wikipedia
socks5-client - SOCKS v5 client socket implementation in JavaScript for Node.JS.
duckling - Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
SwitchyOmega - Manage and switch between multiple proxies quickly & easily.
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python
javascript-x-server - JavaScript X Server (current protocol prototyping in Node.js, hoping to port to HTML5 for graphics)
scrapeghost - 👻 Experimental library for scraping websites using OpenAI's GPT API.
node-geoip-web - 🌎 a small server that returns the location of a given IP address
node-dhcpjs - dhcpjs provides native DHCP support in Node.js
ssdapi - Student Service Delivery API
contrail-web-core - Contrail web ui backend code