• David Brown's avatar
    checkpatch: strlen and strcmp should not be banned · f40e12fc
    David Brown authored
    
    
    Software security is an important issue, in general, but it is
    especially important in Linux kernel code.  Buffer overflows can have
    wide-reaching ramifications and can often be readily exploited to
    compromise the entire system.  It is important for every developer to
    be aware of security issues while writing code.
    
    However, I've noticed a few "rules" about coding that are resulting in
    code that isn't any more secure, and has the disadvantage of obscuring
    what the code is doing.  In most instances, the "corrected" code is
    actually wrong: we've traded a perceived lack of safety for incorrect
    behavior.  These obfuscations also make this code more distant from
    upstream kernel standards.
    
    I'm only going to focus here on strcmp/strncmp and strlen/strnlen.  I
    choose these two, because in the context of the kernel, it's not easy
    to make a general rule, such as "always use the 'n' variant".  These
    function have different behavior, and the 'n' isn't just a blanket fix
    that makes them better.  In many instances, the correct call is the
    plain variant (strcpy has a strlcpy variant which is usually helpful).
    
    Let's start with strlen/strnlen.
    
    	size_t strlen(const char *);
    	size_t strnlen(const char *, size_t);
    
    The strlen() function scans for a NUL byte within a string, and
    returns the number of characters that had to be skipped to get there.
    
    The strnlen() call is similar, but will stop after after a maximal
    number of characters, and return that result.  This variant was
    intended for storing variable length strings in fixed-sized buffers,
    where the full-length case did not have a trailing NUL.  This storage
    model is very uncommon, as is the use of strnlen().
    
    The question becomes, what is the maximal length you should be giving
    to strnlen().  If the string is truly variable length (allocated and
    filled), there really isn't a meaningful value to use for this.  The
    only time that a max length makes sense is when you have something
    like:
    
    	char name[MAX_NAME_LENGTH];
    
    but, in this case, strnlen() is still probably not what you want to be
    using.  It would be safe to use, if you check the result, and if it is
    MAX_NAME_LENGTH, raise some kind of error case.  If later code assumes
    there is a NUL at the end, there will still be a buffer overflow.  In
    this case, it is much better to check the length before storing it in
    this field, and make sure there is room for the NUL.
    
    If the string is a constant, passing in a length doesn't make sense,
    since you would have to know the length of the string to check that.
    There is no safety issue with calling strlen() on a constant.
    
    So, the simple rule for strnlen()/strlen() is:
    
      - If the string doesn't have an obvious bound length, such as an
        allocated string, use strlen().
    
      - If the string is a constant, use strlen().
    
      - If there is a fixed buffer, strnlen() might make sense, but it is
        probably better to change the design to avoid these types of
        strings.
    
    The only case where strnlen really makes sense is when you have a
    string that is passed in from the user.  In this case, it is very
    important to check the result, and if the length is at the maximum,
    return an error, and don't try to do any processing on the string.
    
    Moving on to strcmp/strncmp.  These functions are similar, except that
    they take two string arguments, which gives a lot more combinations.
    
    	int strcmp(const char *, const char *);
    	int strncmp(const char *, const char *, size_t);
    
    These will walk both strings (at the same time), and stop when
    reaching the first difference between them.  This means that they will
    never go further than the length of the shortest string being
    compared.  As in strnlen, the max argument to strncmp sets a limit on
    the comparison.  Similar to strnlen, the results are unusual when the
    limit is reached, but in a sense, even worse, since it may consider
    the strings equal without ever reaching the end of either.
    
    Looking over the 200 some uses of strncmp in the msm code, almost all
    of them do something akin to:
    
    	strncmp(value, constant, strlen(constant))
    
    If the call has added 1 to the 'strlen' result, the strncmp would just
    become an inefficient, but otherwise identical version of strcmp.
    However, without the +1, this compares the prefix of 'value' instead
    of the entire string.  Only one instance of strncmp in the code
    appears to be intentionally checking for a prefix.  The rest have
    changed a simple string compare into an unintentional prefix match.
    
    Because there are two strings, it is a little more complex to
    determine which to use, but it is very similar.  It might seem that
    strncmp() would be useful for checking an unknown buffer (say from
    userspace).  However, since strncmp()'s result doesn't distinguish
    between finding the end of the string, or hitting the max, there's no
    way to know.  Some guidelines:
    
      - If one of the strings has a known bound length (such as a
        constant, or another string whose length has already been
        checked), AND this bound length is within the expected buffer of
        the other string, it is safe to use strcmp().
    
      - Otherwise, you may need to use something like strnlen() to
        determine a maximum length before calling strcmp().
    
      - strncmp() is useful to test a string for a prefix.  No other uses
        make sense.
    
    To facilitate fixing these, remove strlen(), strcmp(), and
    strcasecmp() from the list of calls that are banned.  Problems with
    these calls need to be caught at a higher level (such as review), and
    replacing them with the 'n' variants doesn't help anything.
    
    This will be followed by some patches that fixup the incorrect code
    introduced by this "ban".
    
    Change-Id: I77dfe1f2f58e8c951e4b38b23f4ec79f8209b1dc
    Signed-off-by: default avatarDavid Brown <davidb@codeaurora.org>
    f40e12fc
checkpatch.pl 103 KB