The Reactive C++ Toolbox
The strncpy() function may lead to buffer overruns when used incorrectly.
The purpose of this article is to explain the contract of strncpy(), the use case it was designed for, and how to use it safely.
The strcpy() function will result in a buffer overrun if the source string, including its zero terminator, is larger than the destination:
The problem here is that the zero terminator will be written beyond the end of the buffer:
    +---+---+---+---+---+---+
    | A | y | l | e | t | t |\0
    +---+---+---+---+---+---+ ^
      0   1   2   3   4   5   write overrun
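The overrun described above follows from the sizes alone. The sketch below assumes a six-byte buffer named buf, as in the diagram; strcpy_fits() and overrun_demo() are hypothetical names introduced for illustration, and the offending call is left commented out because executing it is undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if strcpy() of src fits into dst_size bytes.
// strcpy() writes strlen(src) characters plus one zero terminator.
inline bool strcpy_fits(std::size_t dst_size, const char* src)
{
    return std::strlen(src) + 1 <= dst_size;
}

inline void overrun_demo()
{
    char buf[6];
    assert(!strcpy_fits(sizeof(buf), "Aylett")); // 7 bytes needed, 6 available
    // std::strcpy(buf, "Aylett");               // would overrun: undefined behaviour
    (void)buf;
}
```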
Programmers often reach for strncpy() as a possible solution, believing it to be a "bounded" version of strcpy():
But there are subtle differences. In this case, the function simply stops copying characters when the destination buffer is exhausted, and it does not terminate the destination string with a zero terminator. This may lead to a read overrun if the consumer assumes a zero-terminated (c-style) string:
    +---+---+---+---+---+---+
    | A | y | l | e | t | t |
    +---+---+---+---+---+---+
      0   1   2   3   4   5
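A minimal sketch of this behaviour, assuming the same six-byte buffer as in the diagram. The helper is_terminated() is a hypothetical name; it only inspects bytes inside the buffer, so the check itself never overruns.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if a zero terminator exists within the n bytes of buf.
inline bool is_terminated(const char* buf, std::size_t n)
{
    return std::memchr(buf, '\0', n) != nullptr;
}

inline bool copy_surname_terminated()
{
    char buf[6];
    // The source has exactly six characters, so every byte of buf is used
    // and strncpy() writes no terminator.
    std::strncpy(buf, "Aylett", sizeof(buf));
    return is_terminated(buf, sizeof(buf));
}
```

Here copy_surname_terminated() returns false, so passing buf to any function that expects a c-style string would read past the end of the buffer.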
This often leaves programmers wondering why strncpy() has such a seemingly broken contract. In order to answer this question, we need to stop thinking of strncpy() as a bounded version of strcpy(), and consider the use-case it was designed for.
One of the advantages of C and C++ is that they allow programmers to lay out their data structures in contiguous regions of memory:
These "packed" data structures are ideal for wire protocols and file formats. When populating such structures, however, programmers must take care to avoid security vulnerabilities:
This is bad, because memory locations beyond the zero-terminator are leaked to the outside world:
    +---+---+---+---+---+---+-------+---+
    | M | a | r | k |\0 | ? |  ...  | ? |
    +---+---+---+---+---+---+-------+---+
      0   1   2   3   4   5    ...   63
If these memory locations happen to contain sensitive data, then your system is vulnerable to attack.
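The leak can be made visible with a short sketch. The Record type, its 64-byte name field (matching the diagram) and the helper names are assumptions for illustration; the '?' fill stands in for whatever previously occupied the memory.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wire-format record with a fixed-length name field.
struct Record {
    char name[64];
};

// Counts the non-zero bytes after the first zero terminator, that is,
// stale bytes that would be sent along with the record.
inline std::size_t stale_bytes(const Record& rec)
{
    const char* end = rec.name + sizeof(rec.name);
    const char* p = static_cast<const char*>(
        std::memchr(rec.name, '\0', sizeof(rec.name)));
    std::size_t n = 0;
    for (; p != nullptr && p != end; ++p) {
        if (*p != '\0') {
            ++n;
        }
    }
    return n;
}

inline std::size_t leak_demo()
{
    Record rec;
    std::memset(rec.name, '?', sizeof(rec.name)); // simulate stale memory
    std::strcpy(rec.name, "Mark");                // writes 5 bytes, leaves 59 untouched
    return stale_bytes(rec);
}
```

In this sketch leak_demo() reports 59 stale bytes, which is everything after the terminator at index 4.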
To avoid exposing private memory locations, we must ensure that any unused gaps in our data structures are suitably filled. What we need is a function that copies the source string to the destination, and then fills any remaining bytes in the destination buffer with a pad character:
The API can be simplified by inferring the size of the destination buffer:
This function can then be called as follows:
This results in a "space padded" string:
    +---+---+---+---+---+---+
    | M | a | r | k |   |   |
    +---+---+---+---+---+---+
      0   1   2   3   4   5
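One possible shape for such a padding function is sketched below. The name pad_copy() and its signatures are assumptions, not the original listing: the first overload takes an explicit size, and the second infers the size from the array type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copy src into dst, then fill the rest of dst with pad.
// Writes exactly n bytes and never appends a zero terminator.
inline void pad_copy(char* dst, const char* src, std::size_t n, char pad)
{
    std::size_t i = 0;
    for (; i < n && src[i] != '\0'; ++i) {
        dst[i] = src[i];
    }
    for (; i < n; ++i) {
        dst[i] = pad;
    }
}

// Overload that infers the destination size from the array type.
template <std::size_t N>
void pad_copy(char (&dst)[N], const char* src, char pad)
{
    pad_copy(dst, src, N, pad);
}

inline void pad_copy_demo()
{
    char buf[6];
    pad_copy(buf, "Mark", ' ');
    assert(buf[3] == 'k' && buf[4] == ' ' && buf[5] == ' ');
}
```

Inferring the size from the array reference removes the chance of passing a size that does not match the buffer.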
Note that it is technically more correct to refer to this string encoding as "space padded" rather than "space terminated", because the string will not be "space terminated" when the source string length is greater than or equal to the size of the destination buffer.
We are now ready to tackle the notorious strncpy() contract. The FreeBSD man page states that:
strncpy() copies at most len characters from src into dst. If src is less than len characters long, the remainder of dst is filled with '\0' characters.
Notice anything familiar about this? It is precisely the function described in the previous section, except the pad character is now a zero (‘\0’) rather than a space.
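The zero padding is easy to observe. In this sketch the buffer is first filled with 'x' so that any byte strncpy() leaves untouched would remain visible; the function name is illustrative only.

```cpp
#include <cstring>

inline bool strncpy_pads_with_zeros()
{
    char buf[6];
    std::memset(buf, 'x', sizeof(buf));     // make untouched bytes visible
    std::strncpy(buf, "Mark", sizeof(buf)); // copies 4 characters, then pads
    // Both remaining bytes were overwritten with zeros, not just the first.
    return buf[4] == '\0' && buf[5] == '\0';
}
```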
The following idiom is often used when zero termination is required:
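A common form of this idiom, sketched with an assumed six-byte buffer: copy at most one character fewer than the buffer holds, then write the terminator into the last byte explicitly.

```cpp
#include <cstddef>
#include <cstring>

inline std::size_t idiom_demo()
{
    char buf[6];
    // Reserve the final byte for the terminator.
    std::strncpy(buf, "Aylett", sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    // buf now holds "Aylet": truncated, but safely zero-terminated.
    return std::strlen(buf);
}
```

The price of guaranteed termination is silent truncation, so callers that care about losing data need to check the source length separately.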
This idiom is neatly encapsulated by the following C++ template, which avoids common programming errors regarding the buffer size:
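A template along these lines might look as follows. The name strncpyz() is hypothetical; the array reference parameter lets the compiler deduce the buffer size, so the caller cannot pass a mismatched length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical name: copy src into dst, truncating if necessary,
// and always leave dst zero-terminated.
template <std::size_t N>
void strncpyz(char (&dst)[N], const char* src)
{
    std::strncpy(dst, src, N - 1);
    dst[N - 1] = '\0';
}

inline void strncpyz_demo()
{
    char buf[6];
    strncpyz(buf, "Aylett");
    assert(std::strcmp(buf, "Aylet") == 0); // truncated to fit the terminator
}
```

Because the parameter is a reference to an array, passing a plain pointer fails to compile, which rules out the classic mistake of applying sizeof to a pointer.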
Although we now have a better understanding of the strncpy() contract, we still need a solution for reading these zero-padded strings without overrunning the buffer. Consider the following member function:
If surname_ is a zero-padded string that fills the buffer, leaving no room for padding, then it will not be zero-terminated and the unwary user will likely overrun the buffer when reading. Assuming that we do not want a zero-terminated string, the cleanest solution is to use std::string_view:
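The following sketch shows both accessors side by side. The Person class and its six-byte surname_ field are assumptions for illustration, and std::string_view requires C++17. The view ends at the first zero, or at the end of the buffer when there is none.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string_view>

// Hypothetical record with a zero-padded, fixed-length surname field.
class Person {
public:
    explicit Person(const char* surname)
    {
        // Zero-padded, but not necessarily zero-terminated.
        std::strncpy(surname_, surname, sizeof(surname_));
    }

    // Unsafe for callers that expect a c-style string:
    // const char* surname() const { return surname_; }

    // Bounded view: never reads beyond the end of surname_.
    std::string_view surname() const noexcept
    {
        const char* end = std::find(std::begin(surname_), std::end(surname_), '\0');
        return {surname_, static_cast<std::size_t>(end - surname_)};
    }

private:
    char surname_[6];
};

inline void person_demo()
{
    const Person p{"Aylett"};        // fills all six bytes, no terminator
    assert(p.surname() == "Aylett"); // the view stops at the buffer end
}
```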
Again, the key point is to think of strncpy() as a function that copies a c-style string into a fixed length buffer, and then zero pads any remaining bytes.
In summary:
- strncpy() is a poorly named function that is often misunderstood;
- before using strncpy(), decide whether you actually want a zero-terminated string or a zero-padded fixed length string;
- for zero-padded fixed length strings, use strncpy() to write the string and std::string_view to read the string.

One final word of advice: don't gloss over the details. Taking time to revisit the basics will often lead to new insights and fresh perspectives.