Configuring GitHub's Linguist to Improve Repository Language Reporting

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  1. linguist

    Language Savant. If your repository's language is being reported incorrectly, send us a pull request!

    In this post, I explain how to configure GitHub's Linguist within your repository to get more accurate and more relevant repository language reporting, with examples from a few of my own repositories. Every repository on GitHub has a chart that shows the distribution of languages detected in the repository. GitHub's Linguist is responsible for detecting the language of each file within your repository, and the reported percentages are based on file sizes. For example, "Java 50%" means that Java files account for 50% of the total size of all detected files in the repository. There are also third-party tools that display language statistics, such as the user-statistician GitHub Action that I developed and maintain, which generates an SVG that includes (among other things) a pie chart summarizing the language distribution across all of your public repositories (excluding forks). The language data needed to generate that chart comes from GitHub's GraphQL API, which returns it as reported by Linguist for each of your repositories.

  3. user-statistician

    Generate a GitHub stats SVG for your GitHub Profile README in GitHub Actions


  4. InteractiveBinPacking

    Self-guided tutorial on combinatorial optimization, the bin packing problem, and constructive heuristics, suitable for use as course assignments, or by self-directed learners.

    For example, one of my repositories, InteractiveBinPacking, is an educational tool implemented in Java, with a few HTML files for the contents of dialog boxes, etc. It also has a directory of example assignments with LaTeX source so that course instructors can easily customize assignments. HTML and LaTeX are both classified as markup languages, and Java as a programming language, so all three are included by default, and a language chart with Java, HTML, and TeX makes sense. So far, no configuration is necessary. I published a short journal article about the tool in the Journal of Open Source Education. That journal conducts the peer review within the repository itself, with a paper directory holding a Markdown file with the content of the paper, and usually a BibTeX file with the citation data for the paper's references. Markdown is automatically excluded as prose, which is fine here. However, the BibTeX file would by default be included in the TeX count. The directory of example assignments in LaTeX is part of the purpose of the repository, but this BibTeX file is in a sense part of the documentation of the tool.

  5. Chips-n-Salsa

    A Java library of Customizable, Hybridizable, Iterative, Parallel, Stochastic, and Self-Adaptive Local Search Algorithms

    GitHub Language Chart From https://github.com/cicirello/Chips-n-Salsa

  6. awesome-readme

    A curated list of awesome READMEs

    About user-statistician

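The linguist and InteractiveBinPacking entries above describe size-based percentages and a BibTeX file that inflates the TeX count. The following Python sketch illustrates that arithmetic only. It is not Linguist's implementation: the function name `language_shares`, the file paths, and the byte sizes are all hypothetical, and the `excluded` set merely stands in for whatever override a repository applies (Linguist overrides such as `linguist-documentation` are set in a `.gitattributes` file).

```python
from collections import defaultdict


def language_shares(files, excluded=frozenset()):
    """Return {language: percent of total bytes}, skipping excluded paths.

    Hypothetical helper for illustration; it mimics size-based reporting
    and does not call or reproduce Linguist itself.
    """
    totals = defaultdict(int)
    for path, (language, size) in files.items():
        if path in excluded:
            continue  # treated as documentation, so left out of the count
        totals[language] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: 100.0 * size / grand_total for lang, size in totals.items()}


# Hypothetical repository contents: path -> (detected language, size in bytes)
files = {
    "src/Main.java": ("Java", 6000),
    "src/help.html": ("HTML", 1000),
    "assignments/hw1.tex": ("TeX", 2000),
    "paper/paper.bib": ("TeX", 1000),
}

# Default view: the BibTeX file counts toward TeX.
print(language_shares(files))

# Same repository with the BibTeX file left out of the statistics.
print(language_shares(files, excluded={"paper/paper.bib"}))
```

With these made-up sizes, the first call reports Java at 60%, HTML at 10%, and TeX at 30%. Excluding the BibTeX file lowers the TeX share and raises the others, because the percentages are always relative to the total size of the files that remain in the count.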


Related posts

  • Hacktoberfest 2023 Update from Maintainer of the user-statistician GitHub Action

    2 projects | dev.to | 9 Oct 2023
  • Hacktoberfest 2023 Contributors Wanted: Additional Translations for the user-statistician GitHub Action

    3 projects | dev.to | 30 Sep 2023
  • The user-statistician GitHub Action mentioned in Awesome-README

    5 projects | dev.to | 25 Aug 2022
  • 🚀 Create An Attractive GitHub Profile README 📝

    21 projects | dev.to | 29 Jun 2024
  • Update your Profile README with Streak Stats

    2 projects | dev.to | 17 May 2024
