scripts/checkpatch.pl · 5269b605853c8f7581bf1c552aa94ddcbd8eba34 · matisse / android_kernel_samsung_matisse

checkpatch: strlen and strcmp should not be banned · f40e12fc
David Brown authored Jul 10, 2012


Software security is an important issue, in general, but it is
especially important in Linux kernel code.  Buffer overflows can have
wide-reaching ramifications and can often be readily exploited to
compromise the entire system.  It is important for every developer to
be aware of security issues while writing code.

However, I've noticed a few "rules" about coding that are resulting in
code that isn't any more secure, and has the disadvantage of obscuring
what the code is doing.  In most instances, the "corrected" code is
actually wrong: we've traded a perceived lack of safety for incorrect
behavior.  These obfuscations also make this code more distant from
upstream kernel standards.

I'm only going to focus here on strcmp/strncmp and strlen/strnlen.  I
choose these two, because in the context of the kernel, it's not easy
to make a general rule, such as "always use the 'n' variant".  These
function have different behavior, and the 'n' isn't just a blanket fix
that makes them better.  In many instances, the correct call is the
plain variant (strcpy has a strlcpy variant which is usually helpful).

Let's start with strlen/strnlen.

	size_t strlen(const char *);
	size_t strnlen(const char *, size_t);

The strlen() function scans for a NUL byte within a string, and
returns the number of characters that had to be skipped to get there.

The strnlen() call is similar, but will stop after after a maximal
number of characters, and return that result.  This variant was
intended for storing variable length strings in fixed-sized buffers,
where the full-length case did not have a trailing NUL.  This storage
model is very uncommon, as is the use of strnlen().

The question becomes, what is the maximal length you should be giving
to strnlen().  If the string is truly variable length (allocated and
filled), there really isn't a meaningful value to use for this.  The
only time that a max length makes sense is when you have something
like:

	char name[MAX_NAME_LENGTH];

but, in this case, strnlen() is still probably not what you want to be
using.  It would be safe to use, if you check the result, and if it is
MAX_NAME_LENGTH, raise some kind of error case.  If later code assumes
there is a NUL at the end, there will still be a buffer overflow.  In
this case, it is much better to check the length before storing it in
this field, and make sure there is room for the NUL.

If the string is a constant, passing in a length doesn't make sense,
since you would have to know the length of the string to check that.
There is no safety issue with calling strlen() on a constant.

So, the simple rule for strnlen()/strlen() is:

  - If the string doesn't have an obvious bound length, such as an
    allocated string, use strlen().

  - If the string is a constant, use strlen().

  - If there is a fixed buffer, strnlen() might make sense, but it is
    probably better to change the design to avoid these types of
    strings.

The only case where strnlen really makes sense is when you have a
string that is passed in from the user.  In this case, it is very
important to check the result, and if the length is at the maximum,
return an error, and don't try to do any processing on the string.

Moving on to strcmp/strncmp.  These functions are similar, except that
they take two string arguments, which gives a lot more combinations.

	int strcmp(const char *, const char *);
	int strncmp(const char *, const char *, size_t);

These will walk both strings (at the same time), and stop when
reaching the first difference between them.  This means that they will
never go further than the length of the shortest string being
compared.  As in strnlen, the max argument to strncmp sets a limit on
the comparison.  Similar to strnlen, the results are unusual when the
limit is reached, but in a sense, even worse, since it may consider
the strings equal without ever reaching the end of either.

Looking over the 200 some uses of strncmp in the msm code, almost all
of them do something akin to:

	strncmp(value, constant, strlen(constant))

If the call has added 1 to the 'strlen' result, the strncmp would just
become an inefficient, but otherwise identical version of strcmp.
However, without the +1, this compares the prefix of 'value' instead
of the entire string.  Only one instance of strncmp in the code
appears to be intentionally checking for a prefix.  The rest have
changed a simple string compare into an unintentional prefix match.

Because there are two strings, it is a little more complex to
determine which to use, but it is very similar.  It might seem that
strncmp() would be useful for checking an unknown buffer (say from
userspace).  However, since strncmp()'s result doesn't distinguish
between finding the end of the string, or hitting the max, there's no
way to know.  Some guidelines:

  - If one of the strings has a known bound length (such as a
    constant, or another string whose length has already been
    checked), AND this bound length is within the expected buffer of
    the other string, it is safe to use strcmp().

  - Otherwise, you may need to use something like strnlen() to
    determine a maximum length before calling strcmp().

  - strncmp() is useful to test a string for a prefix.  No other uses
    make sense.

To facilitate fixing these, remove strlen(), strcmp(), and
strcasecmp() from the list of calls that are banned.  Problems with
these calls need to be caught at a higher level (such as review), and
replacing them with the 'n' variants doesn't help anything.

This will be followed by some patches that fixup the incorrect code
introduced by this "ban".

Change-Id: I77dfe1f2f58e8c951e4b38b23f4ec79f8209b1dc
Signed-off-by: David Brown <davidb@codeaurora.org>
f40e12fc
checkpatch.pl 103 KB
Replace checkpatch.pl