| | awesome-semantic-web | extruct |
|---|---|---|
| Mentions | 5 | 3 |
| Stars | 1,319 | 821 |
| Growth | 1.2% | 1.1% |
| Activity | 6.1 | 3.8 |
| Last commit | 12 days ago | 8 days ago |
| Language | - | Python |
| License | Creative Commons Zero v1.0 Universal | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-semantic-web
- GitHub – GSA/code-gov: An informative repo for all Code.gov repos
https://github.com/semantalytics/awesome-semantic-web#csvw
A GitHub Action would run regularly, fetch each code.json, save each to a git repo, and then upsert each into a SQLite database to be published with e.g. datasette or datasette-lite.
- Super-Structured Data: Rethinking the Schema
- Python Tools for the Semantic Web, an Overview
Have you taken a look at the list's Python section (https://github.com/semantalytics/awesome-semantic-web#python)? It would be great to further this list along, given its breadth and age.
- Looking for software
You might find some of what you need here: https://github.com/semantalytics/awesome-semantic-web
- A Review of the Semantic Web Field
https://github.com/semantalytics/awesome-semantic-web#progra...
Why are you spreading FUD?
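One of the mentions above describes a workflow in which a GitHub Action fetches each code.json and upserts it into a SQLite database. A minimal sketch of just the upsert step, using only Python's standard library, might look like the following. The table name `code_json`, the helper `upsert_code_json`, and the example URL are hypothetical, and the fetching and git-commit steps are left out. It assumes a SQLite build recent enough to support `ON CONFLICT ... DO UPDATE` (3.24 or later).

```python
import json
import sqlite3


def upsert_code_json(conn, source_url, document):
    """Insert or update one fetched code.json document, keyed by its source URL.

    `code_json` is a hypothetical table name chosen for this sketch.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS code_json (
            source_url TEXT PRIMARY KEY,
            payload    TEXT NOT NULL
        )
        """
    )
    # A second call with the same URL overwrites the stored payload
    # instead of adding a duplicate row.
    conn.execute(
        """
        INSERT INTO code_json (source_url, payload)
        VALUES (?, ?)
        ON CONFLICT(source_url) DO UPDATE SET payload = excluded.payload
        """,
        (source_url, json.dumps(document, sort_keys=True)),
    )
    conn.commit()


# Usage with an in-memory database and a made-up source URL.
conn = sqlite3.connect(":memory:")
upsert_code_json(conn, "https://agency.example/code.json", {"releases": []})
upsert_code_json(
    conn, "https://agency.example/code.json", {"releases": [{"name": "demo"}]}
)
rows = conn.execute("SELECT source_url, payload FROM code_json").fetchall()
```

Keying on the source URL means a scheduled run can be repeated safely. The resulting database file is the kind of artifact that could then be published with a tool such as datasette, as the comment suggests.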
extruct
- GitHub – GSA/code-gov: An informative repo for all Code.gov repos
https://github.com/rushter/selectolax#simple-benchmark
Apache Nutch is a Java-based web crawler which supports e.g. CommonCrawl (which backs various foundational LLMs): https://en.wikipedia.org/wiki/Apache_Nutch#Search_engines_bu... But extruct extracts more types of metadata and data than Nutch, AFAIU: https://github.com/scrapinghub/extruct
datasette-graphql adds a GraphQL HTTP API to a SQLite database.
- Alternative to extruct python library? (scraping schema.org, jsonld, twitter and fb card)
Is there an alternative to the extruct Python library in Golang?
- Scraping MMA fighter stats from a list of names
Seems like sherdog.com supports schema.org data markup, which is really easy to scrape! There's a brilliant Python parser for it: https://github.com/scrapinghub/extruct
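To illustrate why schema.org markup is easy to scrape, here is a stdlib-only sketch that pulls JSON-LD blocks out of a page. It is not extruct and does not use its API. It covers only the JSON-LD case, whereas extruct handles several embedded-metadata syntaxes. The class `JsonLdCollector`, the helper `extract_jsonld`, and the sample HTML are all hypothetical and invented for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than abort the whole page


def extract_jsonld(html):
    """Return a list with one parsed object per JSON-LD block found in `html`."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up page content; a real scraper would download the HTML first.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Example Fighter"}
</script>
</head><body><p>Profile page</p></body></html>
"""

items = extract_jsonld(SAMPLE_HTML)
```

Because the structured data arrives as ready-made JSON, no site-specific CSS selectors are needed. A dedicated library is still the better choice for real pages, which often mix JSON-LD with microdata or Open Graph tags.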
What are some alternatives?
clojure-graph-resources - A curated list of Clojure resources for dealing with graph-like data.
rdflib - RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
zed - A novel data lake based on super-structured data
PyLD - JSON-LD processor written in Python
EasierRDF - Making RDF easy enough for most developers
contextualise - Contextualise is an effective tool particularly suited for organising information-heavy projects and activities consisting of unstructured and widely diverse data and information resources
lv2 - The LV2 audio plugin specification
code-gov - An informative repo for all Code.gov repos
trifid - Lightweight Linked Data Server and Proxy
kylo - Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
awesome-knowledge-management - A curated list of amazingly awesome articles, people, applications, software libraries and projects related to the knowledge management space
metatron - A Python 3.x HTML Meta tag parser, with emphasis on OpenGraph and complex meta tag schemes