code-gov
datasette-scraper
| Metric | code-gov | datasette-scraper |
|---|---|---|
| Mentions | 3 | 1 |
| Stars | 257 | 57 |
| Growth | 1.9% | - |
| Activity | 3.4 | 2.5 |
| Last commit | about 1 month ago | about 1 year ago |
| Language | Python | |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
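As a rough illustration of the growth figure described above, month-over-month growth can be read as the percentage change in stars between two monthly snapshots. The page does not give the exact formula the tracker uses, so the function name `month_over_month_growth` and the snapshot values below are assumptions for the sketch, not data from either project.

```python
def month_over_month_growth(stars_now: int, stars_prev: int) -> float:
    """Percentage change in stars between two monthly snapshots.

    Assumed formula; the tracker's real calculation is not documented here.
    """
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) * 100.0 / stars_prev


# Hypothetical snapshots, not taken from the comparison table.
growth = month_over_month_growth(105, 100)
print(f"{growth:.1f}%")
```

A project with no earlier snapshot has no defined growth under this reading, which is one plausible reason a growth cell might show "-".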
code-gov
GitHub – GSA/code-gov: An informative repo for all Code.gov repos
datasette-scraper
https://github.com/cldellow/datasette-scraper/#architecture
(TIL datasette-scraper parses HTML with selectolax; and Selectolax with Modest or Lexbor is ~25x faster at HTML parsing than BeautifulSoup in the selectolax benchmark:
What are some alternatives?
extruct - Extract embedded metadata from HTML markup
kylo - Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
code-json-generator - Automation that scrapes USEPA github and provides that metadata for code.gov
hugo-obsidian - simple GitHub action to parse Markdown Links into a .json file for Hugo
NYSenate.gov-D7 - Designed to increase public participation in the legislative process, this web application serves as the central digital presence of the New York State Senate.
datasette-lite - Datasette running in your browser using WebAssembly and Pyodide
datasette - An open source multi-tool for exploring and publishing data
dotgov-data - Official list of .gov domains