Configuring GitHub's Linguist to Improve Repository Language Reporting

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • linguist

    Language Savant. If your repository's language is being reported incorrectly, send us a pull request!

  • In this post, I explain how to configure GitHub's Linguist within your repository for more accurate and more relevant language reporting, with examples from a few of my own repositories. Every repository on GitHub has a chart that shows the distribution of languages detected in the repository. GitHub's Linguist detects the language of each file within your repository, and the reported percentages are based on file sizes. For example, "Java 50%" means that Java files account for 50% of the total size of all detected files in the repository. There are also third-party tools that display language statistics, such as the user-statistician GitHub Action that I developed and maintain, which generates an SVG that includes, among other things, a pie chart summarizing the language distribution across all of your public repositories (excluding forks). The data for that chart comes from GitHub's GraphQL API, which returns the language data for each of your repositories as reported by Linguist.
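
The size-based percentages described in the excerpt above can be illustrated with a minimal sketch. It is not Linguist's implementation; the function name `language_percentages` and the byte counts are hypothetical and chosen only to reproduce the "Java 50%" example.

```python
def language_percentages(sizes_by_language):
    """Return each language's share of the total detected bytes, in percent."""
    total = sum(sizes_by_language.values())
    if total == 0:
        # No detected files: nothing to report.
        return {}
    return {
        language: 100.0 * size / total
        for language, size in sizes_by_language.items()
    }


# Hypothetical byte counts per detected language (illustration only).
sizes = {"Java": 50_000, "HTML": 30_000, "TeX": 20_000}
shares = language_percentages(sizes)

for language, share in shares.items():
    print(f"{language} {share:.0f}%")
```

Because the shares are computed from bytes rather than file counts, one large file can outweigh many small ones.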

  • user-statistician

    Generate a GitHub stats SVG for your GitHub Profile README in GitHub Actions


  • InteractiveBinPacking

    Self-guided tutorial on combinatorial optimization, the bin packing problem, and constructive heuristics, suitable for use as course assignments, or by self-directed learners.

  • For example, one of my repositories, InteractiveBinPacking, is an educational tool implemented in Java, with a few HTML files for the contents of dialog boxes, etc., and a directory of example assignments with LaTeX source so that course instructors can easily customize assignments. HTML and LaTeX are both classified as markup languages, and Java as a programming language, so all three are included by default, and a language chart with Java, HTML, and TeX makes sense. So far, no configuration is necessary. I published a short journal article about the tool in the Journal of Open Source Education. That journal conducts its peer review within the repository itself, with a paper directory holding a Markdown file with the content of the paper, and usually a BibTeX file with the citation data for the paper's references. Markdown is automatically excluded as prose, which is fine here. However, the BibTeX file would by default be included in the TeX count. The directory of example assignments in LaTeX is part of the purpose of the repository, but this BibTeX file is in a sense part of the documentation of the tool.
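
Linguist reads per-path overrides from a repository's `.gitattributes` file, and a line such as `paper/* linguist-documentation` is one way to mark files as documentation so that they are left out of the language statistics. Whether the author used exactly that line is an assumption; the excerpt does not show it. The sketch below only models the effect of such an override on the byte totals. The file paths, sizes, and the helper `detected_sizes` are hypothetical, and `fnmatch` is a stand-in that does not reproduce `.gitattributes` pattern semantics exactly.

```python
from fnmatch import fnmatch


def detected_sizes(files, documentation_patterns=()):
    """Sum bytes per language, skipping paths marked as documentation."""
    totals = {}
    for path, (language, size) in files.items():
        if any(fnmatch(path, pattern) for pattern in documentation_patterns):
            # Treated as documentation: excluded from the statistics.
            continue
        totals[language] = totals.get(language, 0) + size
    return totals


# Hypothetical repository contents: path -> (detected language, size in bytes).
files = {
    "src/Main.java": ("Java", 60_000),
    "help/dialog.html": ("HTML", 10_000),
    "assignments/example.tex": ("TeX", 20_000),
    "paper/paper.bib": ("TeX", 10_000),
}

default = detected_sizes(files)                  # BibTeX counted as TeX
configured = detected_sizes(files, ["paper/*"])  # paper directory excluded

print("default TeX bytes:", default["TeX"])
print("configured TeX bytes:", configured["TeX"])
```

In this model the example assignments still count toward TeX, while the journal paper's BibTeX file no longer inflates it.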

  • Chips-n-Salsa

    A Java library of Customizable, Hybridizable, Iterative, Parallel, Stochastic, and Self-Adaptive Local Search Algorithms

  • GitHub Language Chart From https://github.com/cicirello/Chips-n-Salsa

  • awesome-readme

    A curated list of awesome READMEs

  • About user-statistician

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Hacktoberfest 2023 Update from Maintainer of the user-statistician GitHub Action

    2 projects | dev.to | 9 Oct 2023
  • Hacktoberfest 2023 Contributors Wanted: Additional Translations for the user-statistician GitHub Action

    3 projects | dev.to | 30 Sep 2023
  • The user-statistician GitHub Action mentioned in Awesome-README

    5 projects | dev.to | 25 Aug 2022
  • 🚀 A Comprehensive Guide to Personalizing Your GitHub Profile README

    3 projects | dev.to | 24 Apr 2024
  • Readme: A Curated List of READMEs

    1 project | news.ycombinator.com | 12 Feb 2024