This section presents a range of examples. Each example is presented along with some C code to which it is applied. The description explains the rules and the matching process.
One of the primary goals of Coccinelle is to perform software
evolution. For instance, Coccinelle could be used to perform function
renaming. In the following example, every occurrence of a call to the
function foo is replaced by a call to the
function bar.
Before

    #DEFINE TEST "foo"

    printf("foo");

    int main(int i) {
      //Test
      int k = foo();

      if(1) {
        foo();
      } else {
        foo();
      }

      foo();
    }

Semantic patch

    @@

    @@

    - foo()
    + bar()

After

    #DEFINE TEST "foo"

    printf("foo");

    int main(int i) {
      //Test
      int k = bar();

      if(1) {
        bar();
      } else {
        bar();
      }

      bar();
    }
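In the transformed code, the string literals "foo" are left untouched, because the semantic patch matches calls rather than text. For contrast, the following minimal Python sketch applies a purely textual substitution to two lines of the example. This is not how Coccinelle works; the `snippet` variable is only an illustrative excerpt.

```python
# Purely textual renaming, shown only for contrast with the semantic patch.
# `snippet` is an illustrative two-line excerpt of the code being transformed.
snippet = "\n".join([
    'printf("foo");',
    'int k = foo();',
])

# str.replace has no notion of C syntax, so it rewrites every occurrence.
naive = snippet.replace("foo", "bar")

print(naive)
```

The call is renamed as intended, but the string literal passed to printf is rewritten as well, which the semantic patch avoids.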
Another important kind of evolution is the introduction or deletion of a
function argument. In the following example, the rule rule1 looks
for definitions of functions having return type irqreturn_t and
two parameters. A second anonymous rule then looks for calls to the
previously matched functions that have three arguments. The third argument
is then removed to correspond to the new function prototype.
    @ rule1 @
    identifier fn;
    identifier irq, dev_id;
    typedef irqreturn_t;
    @@

    static irqreturn_t fn (int irq, void *dev_id)
    {
       ...
    }

    @@
    identifier rule1.fn;
    expression E1, E2, E3;
    @@

     fn(E1, E2
    -   ,E3
       )
drivers/atm/firestream.c at line 1653 before transformation

    static void fs_poll (unsigned long data)
    {
      struct fs_dev *dev = (struct fs_dev *) data;

      fs_irq (0, dev, NULL);
      dev->timer.expires = jiffies + FS_POLL_FREQ;
      add_timer (&dev->timer);
    }

drivers/atm/firestream.c at line 1653 after transformation

    static void fs_poll (unsigned long data)
    {
      struct fs_dev *dev = (struct fs_dev *) data;

      fs_irq (0, dev);
      dev->timer.expires = jiffies + FS_POLL_FREQ;
      add_timer (&dev->timer);
    }
To avoid code duplication or error-prone code, the kernel provides
macros such as BUG_ON, DIV_ROUND_UP and
FIELD_SIZEOF. In these cases, the semantic patches look for
the old code pattern and replace it with the new code.
A semantic patch to introduce uses of the DIV_ROUND_UP macro
looks for the corresponding expression, i.e., (n + d − 1) /
d. When some code is matched, the metavariables n and d
are bound to their corresponding expressions. Finally, Coccinelle rewrites
the code with the DIV_ROUND_UP macro using the values bound to
n and d, as illustrated in the patch that follows.
Semantic patch to introduce uses of the DIV_ROUND_UP macro

    @ haskernel @
    @@

    #include <linux/kernel.h>

    @ depends on haskernel @
    expression n,d;
    @@

    (
    - ((((n) + (d)) - 1) / (d))
    + DIV_ROUND_UP(n,d)
    |
    - (((n) + ((d) - 1)) / (d))
    + DIV_ROUND_UP(n,d)
    )
Example of a generated patch hunk

    --- a/drivers/atm/horizon.c
    +++ b/drivers/atm/horizon.c
    @@ -698,7 +698,7 @@ got_it:
       if (bits)
         *bits = (div<<CLOCK_SELECT_SHIFT) | (pre-1);
       if (actual) {
    -    *actual = (br + (pre<<div) - 1) / (pre<<div);
    +    *actual = DIV_ROUND_UP(br, pre<<div);
         PRINTD (DBG_QOS, "actual rate: %u", *actual);
       }
       return 0;
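The arithmetic behind this rewrite can be checked with a small sketch. The Python functions below are stand-ins written for this illustration, not kernel code: `div_round_up` mirrors the expression (n + d - 1) / d, and `first_branch` and `second_branch` mirror the two parenthesizations matched by the semantic patch. The check assumes a non-negative n and a positive d, where Python's floor division agrees with C's truncating integer division.

```python
def div_round_up(n, d):
    """Stand-in for DIV_ROUND_UP: ceiling of n / d for n >= 0 and d > 0."""
    return (n + d - 1) // d

def first_branch(n, d):
    # Shape matched by the first branch of the patch: ((n + d) - 1) / d
    return ((n + d) - 1) // d

def second_branch(n, d):
    # Shape matched by the second branch of the patch: (n + (d - 1)) / d
    return (n + (d - 1)) // d

def exact_ceiling(n, d):
    # Integer-only ceiling, used as an independent reference.
    quotient, remainder = divmod(n, d)
    return quotient + (1 if remainder else 0)

# Both matched shapes and the macro stand-in agree with the reference.
for n in range(0, 200):
    for d in range(1, 17):
        expected = exact_ceiling(n, d)
        assert first_branch(n, d) == expected
        assert second_branch(n, d) == expected
        assert div_round_up(n, d) == expected

print(div_round_up(10, 4))  # 3: 10 / 4 rounded up
print(div_round_up(8, 4))   # 2: exact division is unchanged
```

Since both shapes compute the same value, replacing either with the macro preserves behavior for such operands.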
The BUG_ON macro makes an assertion about the value of an expression. However, because some parts of the kernel define BUG_ON to be the empty statement when debugging is not wanted, care must be taken when the asserted expression may have side effects, as is the case with a function call. Thus, we create a rule that introduces BUG_ON only when the asserted expression does not perform a function call.
One particular piece of code that has the form of a function call is a use
of unlikely, which informs the compiler that a particular
expression is unlikely to be true. In this case, because unlikely
does not perform any side effect, it is safe to use BUG_ON. The
second rule takes care of this case. It furthermore disables the
isomorphism that allows a call to unlikely to be replaced with its
argument, as then the second rule would be the same as the first one.
    @@
    expression E,f;
    @@

    (
      if (<+... f(...) ...+>) { BUG(); }
    |
    - if (E) { BUG(); }
    + BUG_ON(E);
    )

    @ disable unlikely @
    expression E,f;
    @@

    (
      if (<+... f(...) ...+>) { BUG(); }
    |
    - if (unlikely(E)) { BUG(); }
    + BUG_ON(E);
    )
For instance, using the semantic patch above, Coccinelle generates
patches like the following one.
    --- a/fs/ext3/balloc.c
    +++ b/fs/ext3/balloc.c
    @@ -232,8 +232,7 @@ restart:
         prev = rsv;
       }
       printk("Window map complete.\n");
    -  if (bad)
    -    BUG();
    +  BUG_ON(bad);
     }
     #define rsv_window_dump(root, verbose) \
       __rsv_window_dump((root), (verbose), __FUNCTION__)
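The side-effect concern that makes both rules skip conditions containing a function call can be illustrated with a small Python analogy. This is only a sketch: Python functions always evaluate their arguments, so the asserted expression is modeled as a zero-argument callable, and the names `read_status`, `bug_on_debug` and `bug_on_nodebug` are hypothetical, not kernel or Coccinelle APIs.

```python
calls = []

def read_status():
    """Hypothetical check with a side effect; returns True on failure."""
    calls.append("read_status")
    return False

def bug_on_debug(condition):
    # Debugging build: the asserted expression is evaluated and checked.
    if condition():
        raise RuntimeError("BUG")

def bug_on_nodebug(condition):
    # Non-debugging build: the macro is the empty statement,
    # so the asserted expression is never evaluated.
    pass

bug_on_debug(read_status)    # the side effect happens
bug_on_nodebug(read_status)  # the side effect is silently lost

print(calls)
```

After both calls, `calls` records a single invocation. An original `if (f(...)) { BUG(); }` would always call f, so rewriting it to BUG_ON could change behavior when BUG_ON is empty.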
This SmPL match looks for NULL dereferences. Once an
expression has been compared to NULL and found to be equal, a
dereference of this expression is prohibited unless the pointer
variable is reassigned.
Original

    foo = kmalloc(1024);
    if (!foo) {
      printk ("Error %s", foo->here);
      return;
    }
    foo->ok = 1;
    return;

Semantic match

    @@
    expression E, E1;
    identifier f;
    statement S1,S2,S3;
    @@

    * if (E == NULL)
      {
        ... when != if (E == NULL) S1 else S2
            when != E = E1
    *   E->f
        ... when any
        return ...;
      }
      else S3

Matched lines

    foo = kmalloc(1024);
    if (!foo) {
      printk ("Error %s", foo->here);
      return;
    }
    foo->ok = 1;
    return;
Coccinelle can embed Python code. Python code is used inside special SmPL rules annotated with script:python. Python rules inherit metavariables, such as identifiers or token positions, from other SmPL rules. The inherited metavariables can then be manipulated by the Python code.
The following semantic match looks for a call to the of_find_node_by_name function. This call increments a counter, which must be decremented to release the resource. If a return statement is then reached with no call to of_node_put and no new assignment to the device_node variable n, a bug is detected and the positions p1 and p2 are bound. As the Python script rule depends only on the positions p1 and p2, it is then evaluated. In the following case, some Emacs Org mode data are produced. This example illustrates the various fields that can be accessed in the Python code from a position variable.
     1 @ r exists @
     2 local idexpression struct device_node *n;
     3 position p1, p2;
     4 statement S1,S2;
     5 expression E,E1;
     6 @@
     7
     8 (
     9 if (!(n@p1 = of_find_node_by_name(...))) S1
    10 |
    11 n@p1 = of_find_node_by_name(...)
    12 )
    13 <... when != of_node_put(n)
    14     when != if (...) { <+... of_node_put(n) ...+> }
    15     when != true !n || ...
    16     when != n = E
    17     when != E = n
    18 if (!n || ...) S2
    19 ...>
    20 (
    21   return <+...n...+>;
    22 |
    23   return@p2 ...;
    24 |
    25   n = E1
    26 |
    27   E1 = n
    28 )
    29
    30 @ script:python @
    31 p1 << r.p1;
    32 p2 << r.p2;
    33 @@
    34
    35 print("* TODO [[view:%s::face=ovl-face1::linb=%s::colb=%s::cole=%s][inc. counter:%s::%s]]" % (p1[0].file,p1[0].line,p1[0].column,p1[0].column_end,p1[0].file,p1[0].line))
    36 print("[[view:%s::face=ovl-face2::linb=%s::colb=%s::cole=%s][return]]" % (p2[0].file,p2[0].line,p2[0].column,p2[0].column_end))
Lines 13 to 17 list a variety of constructs that should not appear
between a call to of_find_node_by_name and a buggy return
site. Examples are a call to of_node_put (line 13) and a
transition into the then branch of a conditional testing whether
n is NULL (line 15). Any number of conditionals
testing whether n is NULL are allowed as indicated
by the use of a nest <... ...> to describe the path between
the call to of_find_node_by_name, the return and the
conditional in the pattern on line 18.
The previous semantic match has been used to generate the following
lines. They may be edited using Emacs Org mode to navigate in the code
from one site to another.
Note: Coccinelle provides some predefined Python functions, namely cocci.print_main, cocci.print_sec and cocci.print_secs. One could alternatively write the following SmPL rule instead of the previously presented one.
    @ script:python @
    p1 << r.p1;
    p2 << r.p2;
    @@

    cocci.print_main(p1)
    cocci.print_sec(p2,"return")
The function cocci.print_secs is used when several
positions are matched by a single position variable and
every matched position should be printed.
Any metavariable can be inherited by the Python code. However, the accessible fields are not currently supported equally across all kinds of metavariables.
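The position fields used in the Org mode example (file, line, column and column_end) can be tried outside Coccinelle with a small sketch. The `Position` tuple below is a hypothetical stand-in for the objects Coccinelle passes to a script rule, and the file name and coordinates are made-up values chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a Coccinelle position object; only the four
# fields used by the Org mode example are modeled.
Position = namedtuple("Position", ["file", "line", "column", "column_end"])

def org_todo(p1, p2):
    """Format the two Org mode lines from lists of positions."""
    first = ("* TODO [[view:%s::face=ovl-face1::linb=%s::colb=%s::cole=%s]"
             "[inc. counter:%s::%s]]" % (p1[0].file, p1[0].line, p1[0].column,
                                         p1[0].column_end, p1[0].file, p1[0].line))
    second = ("[[view:%s::face=ovl-face2::linb=%s::colb=%s::cole=%s][return]]"
              % (p2[0].file, p2[0].line, p2[0].column, p2[0].column_end))
    return first + "\n" + second

# A position variable is accessed as a list, hence the [0] above.
p1 = [Position("example.c", "10", "4", "5")]
p2 = [Position("example.c", "20", "8", "14")]

print(org_todo(p1, p2))
```

Indexing with [0] selects the first matched position, which is why the script rule in the example writes p1[0] and p2[0].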
If you consider the following SmPL file, which uses the regexp functionality to filter the identifiers that contain, begin with or end with foo,
    @anyid@
    type t;
    identifier id;
    @@
    t id () {...}

    @script:python@
    x << anyid.id;
    @@
    print("Identifier: %s" % x)

    @contains@
    type t;
    identifier foo =~ ".*foo";
    @@
    t foo () {...}

    @script:python@
    x << contains.foo;
    @@
    print("Contains foo: %s" % x)

    @endsby@
    type t;
    identifier foo =~ ".*foo$";
    @@

    t foo () {...}

    @script:python@
    x << endsby.foo;
    @@
    print("Ends by foo: %s" % x)

    @beginsby@
    type t;
    identifier foo =~ "^foo";
    @@
    t foo () {...}

    @script:python@
    x << beginsby.foo;
    @@
    print("Begins by foo: %s" % x)
and the following C program, which defines the functions
foo, bar, foobar, barfoobar and
barfoo, you will get the result shown after it.

C program

    int foo () { return 0; }
    int bar () { return 0; }
    int foobar () { return 0; }
    int barfoobar () { return 0; }
    int barfoo () { return 0; }

Result

    Identifier: foo
    Identifier: bar
    Identifier: foobar
    Identifier: barfoobar
    Identifier: barfoo
    Contains foo: foo
    Contains foo: foobar
    Contains foo: barfoobar
    Contains foo: barfoo
    Ends by foo: foo
    Ends by foo: barfoo
    Begins by foo: foo
    Begins by foo: foobar
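The filtering performed by the three regular expressions can be reproduced with a short sketch. It uses Python's re module as a stand-in for Coccinelle's own regexp engine, which is an assumption made for illustration; `filter_names` is a hypothetical helper, and re.search is used because the patterns are not implicitly anchored.

```python
import re

names = ["foo", "bar", "foobar", "barfoobar", "barfoo"]

# Labels and patterns taken from the contains, endsby and beginsby rules.
rules = [
    ("Contains foo", r".*foo"),
    ("Ends by foo", r".*foo$"),
    ("Begins by foo", r"^foo"),
]

def filter_names(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [name for name in candidates if re.search(pattern, name)]

for label, pattern in rules:
    for name in filter_names(pattern, names):
        print("%s: %s" % (label, name))
```

Under that assumption, the printed lines agree with the corresponding part of the result: bar is rejected by all three patterns, and only the anchored patterns distinguish foobar from barfoo.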