What is Structured Data?

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • opengraph

    A Python module to parse the Open Graph Protocol

    Web pages are an interesting example of both structured and unstructured data. There are specific elements one could inspect for certain information, such as the <title> element or semantic elements like <article> or <section>. The problem is that these elements are more like our "address" example earlier: they often contain more than the strict data we are looking for. A title might carry a prefix or suffix with the website's name. An article or section might contain many nested layers of <div> or other elements that help form the site's structure. To top it off, HTML structure varies wildly from site to site, so extracting data from multiple websites gets very hard very fast.

    That said, there are several ways to embed structured data in web pages. A page could use Microdata, RDFa, JSON-LD or Open Graph to express structured data, and it can use more than one of these at the same time. Open Graph is commonly used to define the details of a link preview, while the others might express more complex data such as product pricing or reviews.
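
    As a rough illustration of two of these formats living side by side in one page, the sketch below uses only Python's standard library to pull Open Graph meta tags and JSON-LD script blocks out of an HTML string. The class name StructuredDataParser and the sample page are made up for this example; this is not how the opengraph or PyLD projects listed here are implemented, and real pages need more defensive handling (malformed JSON, Microdata, RDFa).

```python
import json
from html.parser import HTMLParser


class StructuredDataParser(HTMLParser):
    """Hypothetical helper: collects Open Graph tags and JSON-LD blocks."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}      # e.g. {"og:title": "..."}
        self.json_ld = []         # parsed JSON-LD documents
        self._in_json_ld = False  # True while inside a JSON-LD <script>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            self.open_graph[attrs["property"]] = attrs.get("content", "")
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            self.json_ld.append(json.loads("".join(self._buffer)))


# Made-up page that uses Open Graph and JSON-LD at the same time.
SAMPLE_HTML = """
<html><head>
<title>Example Widget | Example Shop</title>
<meta property="og:title" content="Example Widget">
<meta property="og:type" content="website">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

parser = StructuredDataParser()
parser.feed(SAMPLE_HTML)
print(parser.open_graph)
print(parser.json_ld[0]["offers"]["price"])
```

    Note how the <title> text carries the shop name as a suffix, while og:title and the JSON-LD name hold only the product name.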

    Standard formats like Microdata or JSON-LD are a good start, but they only define the format of the data; we also need a common vocabulary so we can understand the data those formats encode. One common vocabulary is Schema.org, which provides over 700 types, including types to describe people, places, products, recipes, reviews, vehicles, movies and medical devices. Using Schema.org for structured data on a website can help search engines provide richer experiences in their search results.
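
    To make the split between format and vocabulary concrete, here is a minimal sketch: JSON-LD supplies the syntax (@context, @type), while Schema.org supplies the meaning of terms such as Recipe, recipeIngredient and Person. The recipe data and the helper types_used are invented for illustration; a full JSON-LD processor such as PyLD does far more than this (expansion, compaction and so on).

```python
import json

# A made-up recipe described with the Schema.org vocabulary in JSON-LD form.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Simple Pancakes",
    "recipeIngredient": ["200 g flour", "2 eggs", "300 ml milk"],
    "author": {"@type": "Person", "name": "Jane Doe"},
}


def types_used(node):
    """Recursively collect every @type value in a JSON-LD-style structure."""
    found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(declared)
        for value in node.values():
            found |= types_used(value)
    elif isinstance(node, list):
        for item in node:
            found |= types_used(item)
    return found


# This string is what would go inside a <script type="application/ld+json"> tag.
print(json.dumps(recipe, indent=2))
print(sorted(types_used(recipe)))  # ['Person', 'Recipe']
```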

    Summary

    Structured data, through standardising expected properties and value formats, makes the sharing and processing of data easier. Web pages in particular benefit from encoding structured data in their mark-up where it can be used by search engines and other tools.

  • PyLD

    JSON-LD processor written in Python



