Our great sponsors
-
jekyll-sitemap
Jekyll plugin to silently generate a sitemaps.org compliant sitemap for your Jekyll site
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
I use GitHub Pages for my personal website, as well as for several project sites. Although some static site generators include support for sitemap generation (e.g., Jekyll has a plugin for sitemaps), my personal website is generated by a custom static site generator that I built for a few specialized reasons, and most of my project sites for Java libraries consist of a single hand-written HTML page combined with javadoc-generated documentation. So a while back I implemented a GitHub Action, generate-sitemap, that can generate an XML sitemap by crawling a GitHub repository containing the HTML of the site. It uses the last commit date of each file to produce the tags. By default, it includes URLs for HTML and PDF files in the sitemap, and skips other file extensions in the repository. But it can be configured to include URLs corresponding to whatever file extensions you want included. It checks the head of HTML pages for noindex meta tags, and excludes such files from the sitemap, and it likewise excludes files from the sitemap if they match a Disallow rule in your robots.txt. The generate-sitemap can be configured in a few other ways as well (see the documentation in the GitHub repository for all details). The generate-sitemap action is implemented in Python as a container action.
I use GitHub Pages for my personal website, as well as for several project sites. Although some static site generators include support for sitemap generation (e.g., Jekyll has a plugin for sitemaps), my personal website is generated by a custom static site generator that I built for a few specialized reasons, and most of my project sites for Java libraries consist of a single hand-written HTML page combined with javadoc-generated documentation. So a while back I implemented a GitHub Action, generate-sitemap, that can generate an XML sitemap by crawling a GitHub repository containing the HTML of the site. It uses the last commit date of each file to produce the tags. By default, it includes URLs for HTML and PDF files in the sitemap, and skips other file extensions in the repository. But it can be configured to include URLs corresponding to whatever file extensions you want included. It checks the head of HTML pages for noindex meta tags, and excludes such files from the sitemap, and it likewise excludes files from the sitemap if they match a Disallow rule in your robots.txt. The generate-sitemap can be configured in a few other ways as well (see the documentation in the GitHub repository for all details). The generate-sitemap action is implemented in Python as a container action.