2.2. Frontend Library (libStaticAnalyzerFrontend)

2.2.1. Introduction

This document will describe the frontend of the Static Analyzer, basically everything from compiling the analyzer from source, through it’s invocation up to the beginning of the analysis. It will touch on topics such as

  • How the analyzer is compiled, how tools such as TableGen are used to generate some of the code,
  • How to invoke the analyzer,
  • How crucial objects of the analyzer are initialized before the actual analysis begins, like
    • The AnalyzerOptions class, which entails how the command line options are parsed,
    • The CheckerManager class, which entails how the checkers of the analyzer are registered and loaded into it,
    • No list is complete without at least a third item.
  • How certain errors are handled with regards to backward compatibility,

starting from how an entry in the TableGen gets processed during the compilation of the project, how this process begins runtime when the analyzer is invoked, up to the point where the actual analysis begins.

The document will rely on the reader having a basic understanding about what checkers are, have invoked the analyzer at least a few times from the command line. If you also have at least registered a checker in the past up to the point where it shows up in clang -cc1 -analyzer-checker-help, that’s a plus, but not a requirement.

2.2.1.1. Overview

The following section is sort of a summary, and severeral items will be later revisited in greater detail.

2.2.1.1.1. Compilation

The Static Analyzer consists of 3 libraries, libStaticAnalyzerCore, libStaticAnalyzerCheckers and libStaticAnalyzerFrontend. The checker library depends on core, and frontend depends on both. Before any of them are compiled, TableGen is run on Checkers.td, according to the rules defined in ClangSACheckersEmitter.cpp, and generates the file Checkers.inc. By using the preprocessor, this, and other definition files (with the extension *.def) are converted into actual code, such as fields within AnalyzerOptions and function calls for registering checkers in CheckerRegistry.

Following this, the compilation goes on as usual. The fastest way of obtaining the analyzer for development is by configuring CMake with the following options:

  • Use the Ninja build system
  • Build in Release with asserts enabled (Only recommended for slower computers!)
  • Build shared libraries
  • Only build a single target triple
  • Use clang as the C/C++ compiler
  • Use gnu gold, or even better, LLD as a linker

An example configuration:

cmake \
  -G "Ninja" \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_TARGETS_TO_BUILD=X86 \
  -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_ENABLE_SPHINX=ON \
  -DSPHINX_WARNINGS_AS_ERROS=OFF \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DCMAKE_C_COMPILER=clang \
  -DLLVM_USE_LINKER=lld \
  -fuse-ld=lld \
  ../llvm

If you want to build the analyzer and nothing else, compile the target clang. For development purposes, compile check-clang-analysis.

2.2.1.1.2. Invocation

Other documents detail the difference between the driver and the frontend of clang far more precisely, but we’ll touch on this briefly: When you input clang into the command line, you invoke the driver. This compiler driver has the “look and feel” of a standard GCC compiler – it invokes several compiler components, collectively called the frontend, with options appropriate for your system, which is for example why you don’t have to specify where the standard libraries are. The Static Analyzer itself is a compiler component, or frontend action. You can tell the driver to invoke it with a default set of options with the --analyze flag:

# We might as well use the -c flag too, in order to skip code generation.
clang myfile.c --analyze

You won’t be able to see the command line options for frontend actions with the regular --help flag, nor will you be able to use them – for that, you’ll have to enter clang’s “frontend mode” with the -cc1 flag:

# Display all command line options
clang -cc1 --help

# Display all Static Analyzer options
clang -cc1 --help | grep analyze

You can, however, use the driver mode and still pass some options to the frontend, if you use -Xclang before each frontend command line option.

clang myfile.c --analyze -Xclang -analyzer-output=html

Every driver option is implicitly a frontend option too, so with -cc1, you can use whatever option you’d like without -Xclang or anything similar.

Currently, the only Static Analyzer related command line option for the driver is --analyze. Note that in frontend mode, clang doesn’t use a default set of options, so the bare minimum you’ll need is enabling the Static Analyzer frontend action with -analyze, enable at least a single checker, and specify an input file.

clang -cc1 -analyze -analyzer-checker=core filename.c

Although we don’t support running the analyzer without enabling the entire core package, it is possible, but might lead to crashes and incorrect reports.

2.2.1.1.2.1. Analyzer configurations

Two of the frontend analyzer flags, -analyzer-config-help and -analyzer-checker-option-help shows even more configuration options (or config options), that when specified in the command line, has to be preceded by -analyzer-config:

clang -cc1 [analyzer flags] -analyzer-config notes-as-events=true \
    -analyzer-config unix.DynamicMemoryModeling:Optimistic=true

One can always retrieve from a given analyzer invocation the full configuration, by enabling the debug.ConfigDumper checker:

clang -cc1 [analyzer flags] -analyzer=checker=debug.ConfigDumper

For backward compatibility reasons, these options will always be verified by default in frontend mode, but not in driver mode. This is configurable by the analyzer-config-compatibility-mode frontend flag.

Should the user supply the same option multiple times (with possibly different values), only the last one will be regarded.

2.2.1.1.3. Initializing the analyzer

First, ParseAnalyzerArgs in (clang repository)/lib/Frontend/CompilerInvocation.cpp parses every analyzer related command line arguments, validates them, with the exception of checker options.

Later, in (clang repository)/lib/FrontendTool/ExecuteCompilerInvocation.cpp, AnalysisAction is created, which creates an AnalysisConsumer. It’s constructor will inspect AnalyzerOptions and set up all initialization functions according to it. These functions will be called in AnalysisConsumer::Initialize, which will create all the necessary classes needed for the actual analysis. The most important among these is CheckerManager and AnalysisManager.

CheckerManager owns every checker object, and it’s interface allows AnalysisManager to run specific checkers on specific events. The most important part of it’s initialization is loading, or in other terms, registering checkers into it.

Checker registration is handled mostly by the CheckerRegistry class, which is constructed specifically for CheckerManager’s initialization, and is destructed right after it. After that, AnalyzerOptions is also regarded as fully initialized, as CheckerRegistry also validates all checker options.

The actual analysis begins after AnalysisConsumer::Initialize() is executed.

2.2.2. Checkers and checker registration

This section will detail

  • What we actually mean under the term “checker”,
  • How are they registered (and what registering actually means!),
  • How can the user create and load checker plugins,
  • How can we establish dependencies in between checkers,
  • How can we add checker options.

If you are only developing a single checker, chances are that you won’t need to read this entire document. However, if you are a long term developer or maintainer in the Static Analyzer, the more you know the better.

2.2.2.1. Terminology

As the analyzer matured over the years, specific terms that described one specific function can now mean a variety of different things. For example, in the early 2010s, we used the term “checks” (similarly to clang-tidy) instead of “checkers”, and there still are some remnants of this in class/object names and documentation. Among the most commonly misused words is “registration”.

This section aims to clarify most of these things. It will talk about things that will only be detailed later on, so feel free to skip some parts if they are unclear just yet.

2.2.2.1.1. Common file names

The short file names (as of writing this document) will refer to the following files:

  • Checkers.td: (clang repository)/include/clang/StaticAnalyzer/Checkers/Checkers.td
  • Checkerbase.td: (clang repository)/include/clang/StaticAnalyzer/Checkers/CheckerBase.td
  • Checkers.inc: (build directory)/tools/clang/include/clang/StaticAnalyzer/Checkers/ Checkers.inc
  • ClangSACheckersEmitter.cpp : (clang repository)/utils/TableGen/ClangSACheckersEmitter.cpp
  • RegisterCustomCheckersTest.cpp : (clang repository)/unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp

2.2.2.1.2. “Registering a checker”

The term “registering” will be used quite a bit in this document, so it’s important to note that what we actually mean under it. Unfortunately, in the code, “registering a checker” can misleadingly mean a couple different things, like

  • When CheckerManager::registerChecker is called, which is what we will refer to, when saying “registering a checker”,
  • When you add a new entry to Checkers.td, we will call this “making an entry for a builtin checker”,
  • When CheckerRegistry::addChecker is called, we will call this “adding a checker”.

2.2.2.1.3. Checkers

Checkers are basically the bread and butter of the analyzer. When specific events (such as a call to a function) happen, checkers may register to that event by implementing a callback (a method), that will be called.

2.2.2.1.3.1. The parts of a checker

Most checkers have their own file in (clang repository)/lib/StaticAnalyzer/Checkers/, which will contain a checker class on the top, a checker registry function and a checker shouldRegister function on the bottom. If the latter return with true, the checker registry function creates a single instance of the checker class called the checker object, which is owned by CheckerManager.

A package is not much more than a single string, used for bundling checkers into logical categories. Every checker is a part of a package, and any package can be a subpackage of another. If package builtin is a subpackge of core, it’s full name will be core.builtin, and it’s name will be builtin. Similarly if checker X is within the package Y, its full name is Y.X, and it’s name is X.

2.2.2.1.3.2. Checker dependencies

Checkers can depend on one another. If a dependency is disabled, so must be every checker that depends on it.

Should we imagine checker dependencies as a graph, it would be a directed forest, where the nodes are checkers: each directed tree describes a group of checker’s dependencies, a node’s parent would be it’s dependency, and is ensured to be registered before it’s children.

Currently, we don’t allow directed circles within this graph, but it would certainly be a great addition. Depending on packages, and packages dependning on either packages or checkers also isn’t supported yet.

2.2.2.1.3.3. “Builtin” and “plugin” checkers

We call a checker builtin, if it has an entry in Checkers.td. A checker is a plugin checker, if it was loaded from a plugin runtime.

There is a third category of checkers in this regard, that do not have an entry in the TableGen file, but neither is a plugin checker, for example in RegisterCustomCheckersTest.cpp. These go through the same process are builtin checkers, but without the code being generated for them.

Similarly, builtin packages have an entry in Checkers.td, and plugin packages are loaded from a plugin runtime.

2.2.2.1.3.4. Subcheckers

As stated earlier, most checkers have a single checker object, but not all. Subcehckers do not have one on their own, as they are most commonly built in another checker that does. For example, many checkers are implemented by having a checker object which models something (like dynamic memory allocation), and enabling certain subcheckers of it will make the modeling part emit certain reports (like emitting a report for double delete errors). Practically, subcheckers most of the time can be regarded as checker options to the main checker.

Natually, all subcheckers depend on their main checkers.

2.2.2.1.3.5. Command line options

Both checkers and packages can possess options. Each package option transitively belongs to all of its subpackages and checkers. These of these options must be preceded by -analyzer-config and must have the following format:

-analyzer-config CheckerOrPackageFullName:OptionName=Value

Should the user supply the same option multiple times (with possibly different values), only the last one will be regarded. If compatibility mode (which is implicitly enabled in driver mode) is disabled, these options will be verified, and additional verifications can be added to the checker’s registry function.

2.2.2.2. Creating a checker plugin

Checker plugins can be compiled on their own, but can only be used with a specific clang version. At the very least, it is a dynamic library that exports clang_analyzerAPIVersionString. This should be defined as follows:

extern "C"
const char clang_analyzerAPIVersionString[] =
    CLANG_ANALYZER_API_VERSION_STRING;

This is used to check whether the current version of the analyzer compatible with the plugin. Attempting to load plugins with incompatible version strings, or without a version string at all, will result in warnings and the plugins not being loaded.

To add a custom checker to the analyzer, the plugin must also define the function clang_registerCheckers.

extern "C"
void clang_registerCheckers(CheckerRegistry &registry) {
  registry.addChecker<MainCallChecker>(
      "example.MainCallChecker", "Disallows calls to functions called main",
      "");

  // Register more checkers, plugins, checker dependencies, options...
}

The clang_registerCheckers function may add any number of checkers to the registry. We’ll later discuss in detail the usage of CheckerRegistry.

2.2.2.2.1. Compiling a plugin

Compilation should be done with the help of an LLVM tool called llvm-config, and additionally, linked against libStaticAnalyzerCore. Please refer to it’s documentation page for details. We’ve created a github repository that contains a very minimal out-of-tree (not within the Clang repository) Static Analyzer plugin: https://github.com/Szelethus/minimal_csa_plugin/. For an in-tree implementation, see examples/analyzer-plugin.

2.2.2.2.2. Loading a plugin

To load a checker plugin, specify the full path to the dynamic library as the argument to the -load frontend option.

clang -cc1 -load </path/to/plugin.dylib> -analyze \
    -analyzer-checker=example.MainCallChecker

clang -Xclang -load -Xclang </path/to/plugin.so> --analyze \
    -Xclang -analyzer-checker=example.MainCallChecker

2.2.2.3. Non-generated, statically linked checkers

We briefly touched on a class called AnalysisAction, but that’s nowhere near the entire story. AnalysisAction is a derived class of ASTFrontendAction, that as of now houses a single overriden method CreateASTConsumer, that at the end of the day creates an AnalysisConsumer. However, any ASTFrontendAction descendant that does at least this much can run the analyzer.

A prime example of this can be found in RegisterCustomCheckersTest.cpp, which does this for unittesting purposes.

class TestAction : public ASTFrontendAction {
  class DiagConsumer : public PathDiagnosticConsumer {
    llvm::raw_ostream &Output;

  public:
    DiagConsumer(llvm::raw_ostream &Output) : Output(Output) {}
    void FlushDiagnosticsImpl(std::vector<const PathDiagnostic *> &Diags,
                              FilesMade *filesMade) override {
      for (const auto *PD : Diags)
        Output << PD->getCheckName() << ":" << PD->getShortDescription();
    }

    StringRef getName() const override { return "Test"; }
  };

  llvm::raw_ostream &DiagsOutput;

public:
  TestAction(llvm::raw_ostream &DiagsOutput) : DiagsOutput(DiagsOutput) {}

  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &Compiler,
                                                 StringRef File) override {
    std::unique_ptr<AnalysisASTConsumer> AnalysisConsumer =
        CreateAnalysisConsumer(Compiler);

    AnalysisConsumer->AddDiagnosticConsumer(new DiagConsumer(DiagsOutput));

    AnalysisConsumer->AddCheckerRegistrationFn([](CheckerRegistry &Registry) {
      Registry.addChecker<CheckerT>(
          "custom.CustomChecker", "Description", "");

      // Register more checkers, plugins, checker dependencies, options...
    });

    return std::move(AnalysisConsumer);
  }
};

bool runCheckerOnCode(const std::string &Code, std::string &Diags) {
  llvm::raw_string_ostream OS(Diags);
  return tooling::runToolOnCode(new TestAction(OS), Code);
}

Using AnalysisConsumer::AddCheckerRegistrationFn, the user can gain access to a a CheckerRegisrty object, from which point checker registration is pretty much the same with plugin checkers.

2.2.2.4. Checker registration

The checker registration, or initialization process begins when the CheckerRegistry object is created. It will store a CheckerRegisty::CheckerInfo object for each checker containing their full name, a pointer to their checker registry function, and some other things that we will detail later. It’ll parse the user’s input about which checker should be enabled, resolves dependencies, validates checker options, and eventually calls the checker registry functions by supplying each with a CheckerManager object. By the time the CheckerRegistry object is destructed, all necessary checker objects have been created and initialized.

2.2.2.4.1. Registering non-builtin checkers

Both statically linked- and plugin checkers have to access to CheckerRegistry object, through which they can register themselves.

2.2.2.4.1.1. Registering a package

A new package can be added via CheckerRegistry::addPackage(), which expect a package full name.

A new package option can be added via CheckerRegistry::addPackageOption, which expects the package’s full name, the option’s name, the default value of it, a human-readable description and the option’s type. You can add several package options to a single package by supplying the same package full name when calling addPackageOption again.

2.2.2.4.1.2. Registering a checker

A new checker can be added via the CheckerRegisty::addChecker template method, which expects a full checker name, a human-readable description, a pointer to the checker registry function, a pointer to the checker’s shouldRegister function, a (preferably existing) link to the checker’s documentation page as regular parameters and the checker class as a template parameter.

A new checker option can be added via CheckerRegistry::addCheckerOption, which expects the checker’s full name, the option’s name, the default value of it, a human-readable description and the option’s type. You can add several checker options to a single checker by supplying the same checker full name when calling addCheckerOption again.

One can establish dependencies in between checkers by calling CheckerRegistry::addDependency, which expects in order the dependendt checker’s full name, and the dependency-checker’s full name.

2.2.2.4.2. Registering builtin checkers

Creating a new builtin checker is an easy process, as the code required for adding a checker, ensuring that it’s dependencies are registered beforehand, and few other things are generated from TableGen files according to the entry that was made for it. Usually, adding 5-10 lines to Checkers.td is all you need to do.

During the compilation of the analyzer, Checkers.td will be processed by TableGen, which will generate the Checkers.inc file according to how the generation was specified in ClangSACheckersEmitter.cpp. CheckerBase.td (basically the header file of Checkers.td) defines the actual structure of a checker entry.

2.2.2.4.2.1. Creating a basic entry for a builtin package

A package entry has a

  • Name,
  • (optional) Parent package, which expects a package as an argument. This is how one can express that this entry is a subpacke, and is used for generating the plugin’s full name,
  • (optional) Package options.
def PackageClassName : Package<"PackageName">;

With all optional fields:

def AnotherPackage : Package<"AnotherPackage">,
  ParentPackage<PackageClassName>,
  PackageOptions<[
    CmdLineOption<CommandLineOptionType,
                  "OptionName",
                  "OptionDescription",
                  "DefaultValue">,
    CmdLineOption<CommandLineOptionType2,
                  "OptionName2",
                  "OptionDescription2",
                  "DefaultValue2">,
  ]>;

We’ll define checkers inside packages:

let ParentPackage = AnotherPackage in {

// List of checker entries for the "core.builtin" package...

} // end "core.builtin"
2.2.2.4.2.2. Creating a basic entry for a builtin checker

A checker entry has a

  • Parent package, which specified that which package dies this checker belong to. This is assigned implicitly according to which let ParentPackage = ??? in { /* checker entry */ } block was the checker defined in.
  • Class name, that will be used for function name generation,
  • Checker name, that specifies the name of the checker, which will be used to generate the checker’s full name,
  • Description, which will be displayed for -analyzer-checker-help,
  • (optional) Dependencies, which specifies that what other checkers need to be registered before the current one,
  • (optional) Checker options.
  • Documentation state specifier, which specifies whether the checker has documentation, and is needed for certain output types (detailed in a later section).
def ClassName : Checker<"CheckerName">,
  HelpText<"Description">,
  Documentation<DocumentationStateSpecifier>;

With all optional fields:

def ClassName : Checker<"CheckerName">,
  HelpText<"Description">,
  Dependencies<[AnotherClassName, YetAnotherClassName]>,
  CheckerOptions<[
    CmdLineOption<CommandLineOptionType,
                  "OptionName",
                  "OptionDescription",
                  "DefaultValue">,
    CmdLineOption<CommandLineOptionType2,
                  "OptionName2",
                  "OptionDescription2",
                  "DefaultValue2">,
  ]>,
  Documentation<DocumentationStateSpecifier>;

2.2.3. Analyzer Outputs

Work in progress

2.2.4. Model injector

Work in progress