diffblue-cbmc/CODING_STANDARD.md

395 lines
18 KiB
Markdown
Raw Normal View History

Here a few minimalistic coding rules for the CPROVER source tree.
# Interfaces
- With any changes, consider the impact on other users of the code base. Users
frequently carry their own set of patches or build tools on top of the code
base. Large-scale changes negatively impact both scenarios.
- Tools that link against the code base can reasonably expect a stable
interface. We consider as public interface any objects, procedures, or classes
(and their methods) that are meant to be used outside a single directory.
- See below for how to document interfaces and how to mark parts of interfaces
as deprecated.
# Whitespaces
2017-10-16 21:34:49 +08:00
Formatting is enforced using clang-format. For more information about this, see
`COMPILING.md`. A brief summary of the formatting rules is given below:
- Use 2 spaces indent, no tabs.
- No lines wider than 80 chars.
- When line is wider, do the following:
- Subsequent lines should be indented two more than the initial line
- Break after `=` if it is part of an assignment
- For chained calls, prefer immediately before `.`
2017-06-23 01:33:45 +08:00
- For other operators (e.g. `&&`, `+`) prefer immediately after the
operator
- For brackets, break after the bracket
- In the case of function calls, put each argument on a separate line if
they do not fit after one line break
- Nested function calls do not need to be broken up into separate lines
even if the outer function call does.
- If a method is bigger than 50 lines, break it into parts.
- Put matching `{ }` into the same column, except for initializer lists and
lambdas, where you should place `{` directly after the closing `)`. This is
to comply with clang-format, which doesn't support aligned curly braces in
these cases.
2017-10-16 21:34:49 +08:00
- Spaces around binary operators (`=`, `+`, `==` ...)
- Space after comma (parameter lists, argument lists, ...)
- Space after colon inside `for`
- For pointers and references, the `*`/`&` should be attached to the variable
name as opposed to the type. E.g. for a pointer to an int the syntax would
be: `int *x;`
- No whitespaces at end of line
- No whitespaces in blank lines
- Put argument lists on next line (and indent 2 spaces) if too long
- Put parameters on separate lines (and indent 2 spaces) if too long
2017-10-16 21:34:49 +08:00
- Spaces around colon for inheritance, put inherited into separate lines in
case of multiple inheritance
- The initializer list follows the constructor with a whitespace around the
colon.
- `if(...)`, `else`, `for(...)`, `do`, and `while(...)` are always in a
separate line
- Break expressions in `if`, `for`, `while` if necessary and align them with
the first character following the opening parenthesis
- Use `{}` instead of `;` for the empty statement
- Single line blocks without `{ }` are allowed, but put braces around
multi-line blocks
- Use blank lines to visually separate logically cohesive code blocks within a
function
- Have a newline at the end of a file
# Comments
- Do not use `/* */`
- Each source and header file must start with a comment block stating the
author. See existing source for an example of the format of this block. This
should be followed by a Doxygen `\file` comment:
```c++
/// \file
/// <Some information about this file goes here>
```
Note that the `\file` tag must be immediately followed by a newline in order
for Doxygen to relate the comment to the current file.
- Each class, member variable and function should be preceded by a Doxygen
comment describing it when it is not immediately obvious what it does. The
format should match the [LLVM
guidelines](http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments),
with one extension: for functions, `\param` and `\return` comments longer than
a single line should have subsequent lines indented by two spaces, so that the
tags stand out. An example:
```c++
/// This sentence, until the first dot followed by whitespace, becomes
/// the brief description. More detailed text follows. Feel free to
/// break this into paragraphs to aid readability.
/// \param arg: This parameter doesn't need much description
/// \param long_arg: This parameter needs a long description. Extra
/// information about the parameter gets indented an extra two columns,
/// like this.
/// \param [out] out_arg: This parameter is mutated by the function, and
/// its value at the beginning of the function is not used
/// \param [in,out] in_out_arg: This parameter is mutated by the function, and
/// its value at the beginning of the function is used
/// \return The return value is literally the value returned by the
/// function. For out-parameters, use "\param [out]" or "\param [in,out]".
ret_typet function(argt arg, argt long_arg, argt &out_arg, argt &in_out_arg)
```
- The priority of documentation is readability. Therefore, feel free to use
Doxygen features, or to add whitespace for multi-paragraph comment blocks if
necessary.
- For functions or methods that provide an interface to a module, a comment
block should immediately precede the declaration of the entity
it documents. This implies that it will live in the header file.
The documentions should focus on the externally visible behavior, and
should discuss the implementation only to the extent necessary for
understanding usage. The implementation should be explained using
comments in the definition of the function or method.
- Functions or methods that are internal to a module are documented in a
comment block that precedes the definition.
- The documentation block must *immediately* precede the entity it documents.
Don't insert empty lines between docs and functions, because this will
confuse Doxygen.
- Put comments on a separate line
- Use comments to explain the non-obvious
- Use #if 0 for commenting out source code
- Use #ifdef DEBUG to guard debug code
- When deprecating interfaces, use the `DEPRECATED` macro, preferably together
with `SINCE`. For example,
```
DEPRECATED(SINCE(2019, 1, 31, "use other_method() instead"))
void deprecated_method();
```
marks `deprecated_method` as deprecated as of 2019-01-31. Any deprecated
functionality should remain in place for at least six months from the date of
deprecation. Before deprecating code, all in-tree uses should be replaced or
marked as deprecated.
- In the brief of function descriptions, prefer imperatives such as
`Remove first element...` over `Removes first element...`.
# Naming
2018-08-10 18:03:02 +08:00
- Identifiers should make clear the purpose of the thing they are naming.
- Identifiers may use the characters `[a-z0-9_]` and should start with a
lower-case letter (parameters in constructors may start with `_`).
- Omit names of parameters or exception objects when they are not used. If
parameter names help documenting an interface, keep the name and use
`(void)parameter_name;` in the body of the method.
- Use American spelling for identifiers.
- Separate basic words by `_`
- Avoid abbreviations (e.g. prefer `symbol_table` to `st`).
- User defined type identifiers have to be terminated by `t`. Moreover, before
2017-06-23 16:47:38 +08:00
`t` may not be `_`.
- Do not use `m_` prefix nor `_` suffix for names of attributes of structured
types.
- Enum values may use the characters `[A-Z0-9_]`
# Header files
- Avoid unnecessary `#include`s, especially in header files
- Prefer forward declaration to includes, but forward declare at the top of the
header file rather than in line
- Guard headers with `#ifndef CPROVER_DIRECTORIES_FILE_H`, etc
- The corresponding header for a given source file should always be the *first*
include in the source file. For example, given `foo.h` and `foo.cpp`, the
line `#include "foo.h"` should precede all other include statements in
`foo.cpp`.
2017-08-04 21:53:24 +08:00
- Use the C++ versions of C headers (e.g. `cmath` instead of `math.h`).
Some of the C headers use macros instead of functions which can have
unexpected consequences.
# Makefiles
- Each source file should appear on a separate line
- The final source file should still be followed by a trailing slash
- The last line should be a comment to not be deleted, i.e. should look like:
```makefile
SRC = source_file.cpp \
source_file2.cpp \
# Empty last line
```
2017-06-23 16:47:38 +08:00
This ensures the Makefiles can be easily merged.
# Program Command Line Options
- Each program contains a `program_name_parse_optionst` class which should
contain a define `PROGRAM_NAME_PARSE_OPTIONS` which is a string of all the
parse options in brackets (with a colon after the bracket if it takes a
parameter)
- Each parameter should be one per line to yield easy merging
- If parameters are shared between programs, they should be pulled out into a
common file and then included using a define
- The defines should be `OPT_FLAG_NAMES` which should go into the `OPTIONS`
define
- The defines should include `HELP_FLAG_NAMES` which should contain the help
output in the format:
```
" --flag explanations\n" \
" --another flag more explanation\n" \
<-------30 chars------>
```
- The defines may include `PARSE_OPTIONS_FLAG_NAMES` which move the options
from the command line into the options
# C++ features
- Do not use namespaces, except for anonymous namespaces.
- Prefer use of `using` instead of `typedef`.
- Prefer use of `class` instead of `struct` if there is an invariant between
members.
- Write type modifiers before the type specifier.
- Make references `const` whenever possible
- Make member functions `const` whenever possible
- Do not hide base class functions
- When overriding a virtual function, use `override` (without `virtual`)
- Single argument constructors must be `explicit`
- Avoid implicit conversions
- Avoid `friend` declarations
- Avoid iterators, use ranged `for` instead
- Avoid allocation with `new`/`delete`, use `unique_ptr`
- Avoid pointers, use references
- Avoid `char *`, use `std::string`
- For numbers, use `int`, `unsigned`, `long`, `unsigned long`, `double`
- Use `mp_integer`, not `BigInt`
- Use the functions in util for conversions between numbers and strings
- Avoid C-style functions. Use classes with an `operator()` instead.
- Use `irep_idt` for identifiers (not `std::string`)
- Avoid destructive updates if possible. The `irept` has constant time copy.
- Use instances of `std::size_t` for comparison with return values of `.size()`
of STL containers and algorithms, and use them as indices to arrays or
vectors.
- Do not use default values in public functions
- Use assertions to detect programming errors, e.g. whenever you make
assumptions on how your code is used
- Use exceptions only when the execution of the program has to abort because of
erroneous user input
- We allow to use 3rd-party libraries directly. No wrapper matching the coding
rules is required. Allowed libraries are: STL.
- When throwing, omit the brackets, i.e. `throw "error"`.
- Error messages should start with a lower case letter.
- Use the `auto` keyword if and only if one of the following
- The type is explicitly repeated on the RHS (e.g. a constructor call)
- Adding the type will increase confusion (e.g. iterators, function pointers)
- Avoid `assert`. If the condition is an actual invariant, use INVARIANT,
PRECONDITION, POSTCONDITION, CHECK_RETURN, UNREACHABLE or DATA_INVARIANT (also
see the documentation of the macros in `src/util/invariant.h`). If there are
possible reasons why it might fail, throw an exception.
- Use "should" style statements for messages in invariants (e.g. "array
should have a non-zero size") to make both the violation and the expected
behavior clear. (As opposed to "no zero size arrays" where it isn't clear
if the zero-size array is the problem, or the lack of it).
- The statements should start with a lower case letter.
- All raw pointers (such as those returned by `symbol_tablet::lookup`) are
assumed to be non-owning, and should not be `delete`d. Raw pointers that
point to heap-allocated memory should be private data members of an object
which safely manages the pointer. As such, `new` should only be used in
constructors, and `delete` in destructors. Never use `malloc` or `free`.
- Prefer brace style initialisation (i.e. `type_name{arguments...}`) over
parentheses for constructor calls
- The [C++ Core Guidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines)
are a useful reference for writing good modern C++.
# CProver conventions
- Avoid if at all possible using irept methods like `get(ID_name)`, instead cast
to a derived type (e.g. `class_typet`) and use the wrapper method `get_name`
- Use `can_cast_type`/`can_cast_expr` instead of directly checking the `id()`
of an `irept`.
# Architecture-specific code
- Avoid if possible.
- Use `__LINUX__`, `__MACH__`, and `_WIN32` to distinguish the architectures.
- Don't include architecture-specific header files without `#ifdef` ...
# Output
- Do not output to `cout` or `cerr` directly (except in temporary debug code,
and then guard `#include <iostream>` by `#ifdef DEBUG`)
- Derive from `messaget` if the class produces output and use the streams
provided (`status()`, `error()`, `debug()`, etc)
- Use `\n` instead of `std::endl`
# Unit tests
- Unit tests are written using [Catch](https://github.com/philsquared/Catch)
- For large classes:
- Create a separate file that contains the tests for each method of each
class
- The file should be named according to
`unit/class/path/class_name/function_name.cpp`
- For small classes:
- Create a separate file that contains the tests for all methods of each
class
- The file should be named according to `unit/class/path/class_name.cpp`
- Catch supports tagging, tests should be tagged with all the following tags:
- [core] should be used for all tests unless the test takes more than 1
second to run, then it should be tagged with [long]
- [folder_name] of the file being tested
- [class_name] of the class being tested
- [function_name] of the function being tested
---
You are allowed to break rules if you have a good reason to do so.
---
# Pre-commit hook to run cpplint locally
To install the hook
```sh
cp .githooks/pre-commit .git/hooks/pre-commit
```
or use a symbolic link. Then, when running git commit, you should get the
linter output (if any) before being prompted to enter a commit message. To
bypass the check (e.g. if there was a false positive), add the option
`--no-verify`.
2018-06-22 16:35:10 +08:00
# CODE COVERAGE
Code coverage metrics are provided using gcov and lcov. Ensure that you have
installed lcov from http://ltp.sourceforge.net/coverage/lcov.php note for
ubuntu lcov is available in the standard apt-get repos.
To get coverage metrics run the following script from the regression directory:
```
get_coverage.sh
```
This will:
1) Rebuild CBMC with gcov enabled
2) Run all the regression tests
3) Collate the coverage metrics
4) Provide an HTML report of the current coverage
# USING CLANG-FORMAT
CBMC uses clang-format to ensure that the formatting of new patches is readable
and consistent. There are two main ways of running clang-format: remotely, and
locally.
## RUNNING CLANG-FORMAT REMOTELY
When patches are submitted to CBMC, they are automatically run through
continuous integration (CI). One of the CI checks will run clang-format over
the diff that your pull request introduces. If clang-format finds formatting
issues at this point, the build will be failed, and a patch will be produced
in the CI output that you can apply to your code so that it conforms to the
style guidelines.
To apply the patch, copy and paste it into a local file (`patch.txt`) and then
run:
```
patch -p1 -i patch.txt
```
Now, you can commit and push the formatting fixes.
## RUNNING CLANG-FORMAT LOCALLY
### INSTALLATION
To avoid waiting until you've made a PR to find formatting issues, you can
install clang-format locally and run it against your code as you are working.
Different versions of clang-format have slightly different behaviors. CBMC uses
clang-format-7 as it is available the repositories for Ubuntu 18.04 and
2018-06-22 16:35:10 +08:00
Homebrew.
To install on a Unix-like system, try installing using the system package
manager:
```
apt-get install clang-format-7 # Run this on Ubuntu, Debian etc.
brew install clang-format@7 # Run this on a Mac with Homebrew installed
2018-06-22 16:35:10 +08:00
```
If your platform doesn't have a package for clang-format, you can download a
pre-built binary, or compile clang-format yourself using the appropriate files
from the [LLVM Downloads page](http://releases.llvm.org/download.html).
An installer for Windows (along with a Visual Studio plugin) can be found at
the [LLVM Snapshot Builds page](http://llvm.org/builds/).
### FORMATTING A RANGE OF COMMITS
Clang-format is distributed with a driver script called git-clang-format-7.
2018-06-22 16:35:10 +08:00
This script can be used to format git diffs (rather than entire files).
After committing some code, it is recommended to run:
```
git-clang-format-7 upstream/develop
2018-06-22 16:35:10 +08:00
```
*Important:* If your branch is based on a branch other than `upstream/develop`,
use the name or checksum of that branch instead. It is strongly recommended to
rebase your work onto the tip of the branch it's based on before running
`git-clang-format` in this way.
Note: By default, git-clang-format uses the git config variable
`clangformat.binary`. If you have multiple versions of clang-format installed,
you might need to update this, or explicitly specify the binary to use via
`--binary clang-format-7`.
2018-06-22 16:35:10 +08:00
### RETROACTIVELY FORMATTING INDIVIDUAL COMMITS
If your works spans several commits and you'd like to keep the formatting
correct in each individual commit, you can automatically rewrite the commits
with correct formatting.
The following command will stop at each commit in the range and run
clang-format on the diff at that point. This rewrites git history, so it's
*unsafe*, and you should back up your branch before running this command:
```
git filter-branch --tree-filter 'git-clang-format-7 upstream/develop' \
2018-06-22 16:35:10 +08:00
-- upstream/develop..HEAD
```
*Important*: `upstream/develop` should be changed in *both* places in the
command above if your work is based on a different branch. It is strongly
recommended to rebase your work onto the tip of the branch it's based on before
running `git-clang-format` in this way.