    apologies if this should be a discussion/issue/whatever but:

    Do you envision going up against CodeQL and/or <> by making semantic information available to the AST nodes? OT1H, I can imagine it could be an overwhelming increase in project scope, but OTOH it could also truly lead to some stunning transformation patterns.

    e.g. or even more "textual" semantics such as

      var foo = "hello".substring(1); // knowing "foo" is a String
    While not trivial because it is not documented, you can create a database with your own facts. Some of the extractors that create the required files are open source.
    Efforts: Dependabot, CodeQL, Coverity, Facebook's Infer tool, etc.
    > But why not for instance use a build system in some "container"?

    I am not sure how this helps.

    > I think the project could "bother" contributors with something like that, couldn't it?

    Which project?

    > An embedded C developer I've talked with quite often on some other forum, who imho is quite competent, said that Coverity is a poor tool that generates way too much false negatives and overlooks at the same time glaring issues.

    He likely violated a license agreement with Coverity, since no one is allowed to say anything comparing Coverity to anything else.

    > Said that's mostly an issue with all OpenSource tools for static C analysis.

    I have been filing bug reports.

    > OTOH the commercial ones are very expensive usually, with a target market of critical things like aviation or safety systems in cars and military use, places where they spend billions on projects. Nothing there for the average company, and especially not for (frankly often underfunded) OpenSource projects.

    So you understand my pain.

    > CodeQL? It's mostly a semantic search and replace tool, as far as I know? Is it that helpful? (I had a look, but the projects I'm working on don't require it. One would just use the IDE. No need for super large-scale refactorings, across projects, in our case).

    I have never heard of that function. It is a static analyzer whose checks are written in the CodeQL language. However, it is very immature. When GitHub acquired it, they banished the less reliable checks to the extended-and-security suite, leaving only about 50 checks for C/C++ code. Those catch very little, although in the rare instances that they do catch things, the catches are somewhat amazing. Unfortunately, at least one of those checks gives explanations of the problem that are technically correct yet difficult to understand, so most developers would dismiss its reports as false positives even though they are correct:

    There are probably more issues like that, but I have yet to see and report them.

    > SonarCloud, hmm… This one I've used (around web development though). But am not a fan of. It bundles other "scanner" tools, with varying quality and utility. At least what they had for the languages I've actively used it was mostly about "style issues". And when it showed real errors, the IDE would do the same… (The question then is how this could be committed in the first place. But OK, some people just don't care. For them you need additional checks like SonarCloud I guess.)

    It is supposed to be able to integrate into GitHub's code scanning feature, so any newly detected issues are reported in the PR that introduced them. Anyway, it is something that I am considering. I wanted to use it much sooner, but it required authorization to make changes on GitHub on my behalf, which made me cautious about how I try it. It is basically at the bottom of my todo list right now.

    > Wouldn't it be easy to add at least this to the build by using some "build container"?

    I do not understand your question. To use it, we need a few things:

    1. To be able to show any newly introduced defect reports in the PR that generated them shortly after it was filed.

    2. To be able to scan the kernel modules. Right now, it cannot because of a bad interaction between the build system and how compiler interposition is done. As of a few days ago, I have a bunch of hacks locally that enable kernel module scans, but this needs more work.

    > Well, that's why I think something equivalent to `-Wall -Werror` should be switched on before writing the first line of code, in any language.

    OpenZFS has had that in place for more than a decade. I do not know precisely when it was first used (although I could look if anyone is particularly interested), but my guess is 2008 when ZFSOnLinux started. Perhaps it was done at Sun before then, but both events predate me. I became involved in 2012 and it is amazing to think that I am now considered one of the early OpenZFS contributors.

    Interestingly, the earliest commits in the OpenZFS repository referencing static analysis are from 2009 (with the oldest commit being from 2008 when ZFSOnLinux started). Those commits are ports of changes from OpenSolaris based on defect reports made by Coverity. There would be no more commits mentioning static analysis until 2014 when I wrote patches fixing things reported by Clang's static analyzer. Coverity was (re)introduced in 2016.

    As far as the current OpenZFS repository is concerned, knowledge of static analysis died with OpenSolaris and we lost an entire form of QA until we rediscovered it during attempts to improve QA years later.

    > But I guess I will stay with engraving my data into solid rock. Proven for at least hundred thousand years.

    That method is no longer reliable due to acid rain. You would need to bury it in a tomb to protect it from acid rain. That has the pesky problem of the pointers being lost over time.

    > At least someone needs to preserve the cat pictures and meme of our current human era for the cockroach people of the distant future. I'm not sure they will have a compatible Linux kernel and compiler available to build the ZFS drivers, or even punch card readers…

    GitHub's code vault found a solution for that:

    I vaguely recall another effort trying to include the needed hardware in time capsules, but I could be misremembering.

