Coccinelle
Coccinelle is a tool for pattern matching and text transformation that has many uses in kernel development, including the application of complex, tree-wide patches and detection of problematic programming patterns.
Note
Linux and macOS development environments are supported, but not Windows.
Getting Coccinelle
The semantic patches included in the kernel use features and options
which are provided by Coccinelle version 1.0.0-rc11 and above.
Using earlier versions will fail as the option names used by
the Coccinelle files and coccicheck
have been updated.
Coccinelle is available through the package manager of many distributions, e.g. :
Debian
Fedora
Ubuntu
OpenSUSE
Arch Linux
NetBSD
FreeBSD
Some distribution packages are obsolete and it is recommended to use the latest version released from the Coccinelle homepage at http://coccinelle.lip6.fr/
Or from Github at:
https://github.com/coccinelle/coccinelle
Once you have it, run the following commands:
./autogen
./configure
make
as a regular user, and install it with:
sudo make install
More detailed installation instructions to build from source can be found at:
https://github.com/coccinelle/coccinelle/blob/master/install.txt
Supplemental documentation
For Semantic Patch Language(SmPL) grammar documentation refer to:
https://coccinelle.gitlabpages.inria.fr/website/documentation.html
Using Coccinelle on Zephyr
coccicheck
checker is the front-end to the Coccinelle infrastructure
and has various modes:
Four basic modes are defined: patch
, report
, context
, and
org
. The mode to use is specified by setting --mode=<mode>
or
-m=<mode>
.
patch
proposes a fix, when possible.report
generates a list in the following format: file:line:column-column: messagecontext
highlights lines of interest and their context in a diff-like style.Lines of interest are indicated with-
.org
generates a report in the Org mode format of Emacs.
Note that not all semantic patches implement all modes. For easy use
of Coccinelle, the default mode is report
.
Two other modes provide some common combinations of these modes.
chain
tries the previous modes in the order above until one succeeds.rep+ctxt
runs successively the report mode and the context mode. It should be used with the C option (described later) which checks the code on a file basis.
Examples
To make a report for every semantic patch, run the following command:
./scripts/coccicheck --mode=report
To produce patches, run:
./scripts/coccicheck --mode=patch
The coccicheck
target applies every semantic patch available in the
sub-directories of scripts/coccinelle
to the entire source code tree.
For each semantic patch, a commit message is proposed. It gives a description of the problem being checked by the semantic patch, and includes a reference to Coccinelle.
As any static code analyzer, Coccinelle produces false positives. Thus, reports must be carefully checked, and patches reviewed.
To enable verbose messages set --verbose=1
option, for example:
./scripts/coccicheck --mode=report --verbose=1
Coccinelle parallelization
By default, coccicheck
tries to run as parallel as possible. To change
the parallelism, set the --jobs=<number>
option. For example, to run
across 4 CPUs:
./scripts/coccicheck --mode=report --jobs=4
As of Coccinelle 1.0.2 Coccinelle uses Ocaml parmap for parallelization, if support for this is detected you will benefit from parmap parallelization.
When parmap is enabled coccicheck
will enable dynamic load balancing by using
--chunksize 1
argument, this ensures we keep feeding threads with work
one by one, so that we avoid the situation where most work gets done by only
a few threads. With dynamic load balancing, if a thread finishes early we keep
feeding it more work.
When parmap is enabled, if an error occurs in Coccinelle, this error
value is propagated back, the return value of the coccicheck
command captures this return value.
Using Coccinelle with a single semantic patch
The option --cocci
can be used to check a single
semantic patch. In that case, the variable must be initialized with
the name of the semantic patch to apply.
For instance:
./scripts/coccicheck --mode=report --cocci=<example.cocci>
or:
./scripts/coccicheck --mode=report --cocci=./path/to/<example.cocci>
Controlling which files are processed by Coccinelle
By default the entire source tree is checked.
To apply Coccinelle to a specific directory, pass the path of specific directory as an argument.
For example, to check drivers/usb/
one may write:
./scripts/coccicheck --mode=patch drivers/usb/
The report
mode is the default. You can select another one with the
--mode=<mode>
option explained above.
Debugging Coccinelle SmPL patches
Using coccicheck
is best as it provides in the spatch command line
include options matching the options used when we compile the kernel.
You can learn what these options are by using verbose option, you could
then manually run Coccinelle with debug options added.
Alternatively you can debug running Coccinelle against SmPL patches
by asking for stderr to be redirected to stderr, by default stderr
is redirected to /dev/null, if you’d like to capture stderr you
can specify the --debug=file.err
option to coccicheck
. For
instance:
rm -f cocci.err
./scripts/coccicheck --mode=patch --debug=cocci.err
cat cocci.err
Debugging support is only supported when using Coccinelle >= 1.0.2.
Additional Flags
Additional flags can be passed to spatch through the SPFLAGS variable. This works as Coccinelle respects the last flags given to it when options are in conflict.
./scripts/coccicheck --sp-flag="--use-glimpse"
Coccinelle supports idutils as well but requires coccinelle >= 1.0.6. When no ID file is specified coccinelle assumes your ID database file is in the file .id-utils.index on the top level of the kernel, coccinelle carries a script scripts/idutils_index.sh which creates the database with:
mkid -i C --output .id-utils.index
If you have another database filename you can also just symlink with this name.
./scripts/coccicheck --sp-flag="--use-idutils"
Alternatively you can specify the database filename explicitly, for instance:
./scripts/coccicheck --sp-flag="--use-idutils /full-path/to/ID"
Sometimes coccinelle doesn’t recognize or parse complex macro variables
due to insufficient definition. Therefore, to make it parsable we
explicitly provide the prototype of the complex macro using the
---macro-file-builtins <headerfile.h>
flag.
The <headerfile.h>
should contain the complete prototype of
the complex macro from which spatch engine can extract the type
information required during transformation.
For example:
Z_SYSCALL_HANDLER
is not recognized by coccinelle. Therefore, we
put its prototype in a header file, say for example mymacros.h
.
$ cat mymacros.h
#define Z_SYSCALL_HANDLER int xxx
Now we pass the header file mymacros.h
during transformation:
./scripts/coccicheck --sp-flag="---macro-file-builtins mymacros.h"
See spatch --help
to learn more about spatch options.
Note that the --use-glimpse
and --use-idutils
options
require external tools for indexing the code. None of them is
thus active by default. However, by indexing the code with
one of these tools, and according to the cocci file used,
spatch could proceed the entire code base more quickly.
SmPL patch specific options
SmPL patches can have their own requirements for options passed to Coccinelle. SmPL patch specific options can be provided by providing them at the top of the SmPL patch, for instance:
// Options: --no-includes --include-headers
Proposing new semantic patches
New semantic patches can be proposed and submitted by kernel
developers. For sake of clarity, they should be organized in the
sub-directories of scripts/coccinelle/
.
The cocci script should have the following properties:
The script must have
report
mode.The first few lines should state the purpose of the script using
///
comments . Usually, this message would be used as the commit log when proposing a patch based on the script.
Example
/// Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element
A more detailed information about the script with exceptional cases or false positives (if any) can be listed using
//#
comments.
Example
//# This makes an effort to find cases where ARRAY_SIZE can be used such as
//# where there is a division of sizeof the array by the sizeof its first
//# element or by any indexed element or the element type. It replaces the
//# division of the two sizeofs by ARRAY_SIZE.
Confidence: It is a property defined to specify the accuracy level of the script. It can be either
High
,Moderate
orLow
depending upon the number of false positives observed.
Example
// Confidence: High
Virtual rules: These are required to support the various modes framed in the script. The virtual rule specified in the script should have the corresponding mode handling rule.
Example
virtual context
@depends on context@
type T;
T[] E;
@@
(
* (sizeof(E)/sizeof(*E))
|
* (sizeof(E)/sizeof(E[...]))
|
* (sizeof(E)/sizeof(T))
)
Detailed description of the report
mode
report
generates a list in the following format:
file:line:column-column: message
Example
Running:
./scripts/coccicheck --mode=report --cocci=scripts/coccinelle/array_size.cocci
will execute the following part of the SmPL script:
<smpl>
@r depends on (org || report)@
type T;
T[] E;
position p;
@@
(
(sizeof(E)@p /sizeof(*E))
|
(sizeof(E)@p /sizeof(E[...]))
|
(sizeof(E)@p /sizeof(T))
)
@script:python depends on report@
p << r.p;
@@
msg="WARNING: Use ARRAY_SIZE"
coccilib.report.print_report(p[0], msg)
</smpl>
This SmPL excerpt generates entries on the standard output, as illustrated below:
ext/hal/nxp/mcux/drivers/lpc/fsl_wwdt.c:66:49-50: WARNING: Use ARRAY_SIZE
ext/hal/nxp/mcux/drivers/lpc/fsl_ctimer.c:74:53-54: WARNING: Use ARRAY_SIZE
ext/hal/nxp/mcux/drivers/imx/fsl_dcp.c:944:45-46: WARNING: Use ARRAY_SIZE
Detailed description of the patch
mode
When the patch
mode is available, it proposes a fix for each problem
identified.
Example
Running:
./scripts/coccicheck --mode=patch --cocci=scripts/coccinelle/misc/array_size.cocci
will execute the following part of the SmPL script:
<smpl>
@depends on patch@
type T;
T[] E;
@@
(
- (sizeof(E)/sizeof(*E))
+ ARRAY_SIZE(E)
|
- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)
|
- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)
)
</smpl>
This SmPL excerpt generates patch hunks on the standard output, as illustrated below:
diff -u -p a/ext/lib/encoding/tinycbor/src/cborvalidation.c b/ext/lib/encoding/tinycbor/src/cborvalidation.c
--- a/ext/lib/encoding/tinycbor/src/cborvalidation.c
+++ b/ext/lib/encoding/tinycbor/src/cborvalidation.c
@@ -325,7 +325,7 @@ static inline CborError validate_number(
static inline CborError validate_tag(CborValue *it, CborTag tag, int flags, int recursionLeft)
{
CborType type = cbor_value_get_type(it);
- const size_t knownTagCount = sizeof(knownTagData) / sizeof(knownTagData[0]);
+ const size_t knownTagCount = ARRAY_SIZE(knownTagData);
const struct KnownTagData *tagData = knownTagData;
const struct KnownTagData * const knownTagDataEnd = knownTagData + knownTagCount;
Detailed description of the context
mode
context
highlights lines of interest and their context
in a diff-like style.
Note
The diff-like output generated is NOT an applicable patch. The
intent of the context
mode is to highlight the important lines
(annotated with minus, -
) and gives some surrounding context
lines around. This output can be used with the diff mode of
Emacs to review the code.
Example
Running:
./scripts/coccicheck --mode=context --cocci=scripts/coccinelle/array_size.cocci
will execute the following part of the SmPL script:
<smpl>
@depends on context@
type T;
T[] E;
@@
(
* (sizeof(E)/sizeof(*E))
|
* (sizeof(E)/sizeof(E[...]))
|
* (sizeof(E)/sizeof(T))
)
</smpl>
This SmPL excerpt generates diff hunks on the standard output, as illustrated below:
diff -u -p ext/lib/encoding/tinycbor/src/cborvalidation.c /tmp/nothing/ext/lib/encoding/tinycbor/src/cborvalidation.c
--- ext/lib/encoding/tinycbor/src/cborvalidation.c
+++ /tmp/nothing/ext/lib/encoding/tinycbor/src/cborvalidation.c
@@ -325,7 +325,6 @@ static inline CborError validate_number(
static inline CborError validate_tag(CborValue *it, CborTag tag, int flags, int recursionLeft)
{
CborType type = cbor_value_get_type(it);
- const size_t knownTagCount = sizeof(knownTagData) / sizeof(knownTagData[0]);
const struct KnownTagData *tagData = knownTagData;
const struct KnownTagData * const knownTagDataEnd = knownTagData + knownTagCount;
Detailed description of the org
mode
org
generates a report in the Org mode format of Emacs.
Example
Running:
./scripts/coccicheck --mode=org --cocci=scripts/coccinelle/misc/array_size.cocci
will execute the following part of the SmPL script:
<smpl>
@r depends on (org || report)@
type T;
T[] E;
position p;
@@
(
(sizeof(E)@p /sizeof(*E))
|
(sizeof(E)@p /sizeof(E[...]))
|
(sizeof(E)@p /sizeof(T))
)
@script:python depends on org@
p << r.p;
@@
coccilib.org.print_todo(p[0], "WARNING should use ARRAY_SIZE")
</smpl>
This SmPL excerpt generates Org entries on the standard output, as illustrated below:
* TODO [[view:ext/lib/encoding/tinycbor/src/cborvalidation.c::face=ovl-face1::linb=328::colb=52::cole=53][WARNING should use ARRAY_SIZE]]
Coccinelle Mailing List
Subscribe to the coccinelle mailing list:
Archives: