C string handling
"C string" redirects here. For the underwear and swimwear, see C-string (clothing).
In computer programming, a C string is a character string stored as an array containing the characters and terminated with a null character ('\0', called NUL in ASCII). The name refers to the C programming language, which uses this null-terminated string representation. Alternative names are null-terminated string and ASCIIZ (note that C strings do not imply the use of ASCII).
The length of a C string is found by searching for the (first) NUL byte. This can be slow, as it takes O(n) (linear time) with respect to the string length. It also means that a NUL cannot be inside the string, as the only NUL is the one marking the end.
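The linear search described above can be sketched as follows. The name my_strlen is hypothetical and chosen only to avoid clashing with the standard strlen; real library implementations are usually more heavily optimised, so this is an illustration of the idea rather than how any particular library does it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name; for valid input it returns the same value as strlen. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);          /* precondition: s points to a string */
    const char *p = s;
    while (*p != '\0')          /* one comparison per byte, hence O(n) */
        ++p;
    return (size_t)(p - s);     /* the terminating null byte is not counted */
}
```

Because the search stops at the first null byte, my_strlen("ab\0cd") yields 2: the bytes after the embedded null are never examined.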
Definitions
The term string is used in C to describe a contiguous sequence of characters terminated by and including the first null byte.[1] A common misconception is that a string is an array, because string literals are converted to arrays during the compilation (or translation) phase.[2] It is important to remember that a string ends at the first null byte. An array or string literal that contains a null byte before the last byte therefore contains a string, or possibly several strings, but is not itself a string.[3]
The term pointer to a string is used in C to describe a pointer to the initial (lowest addressed) byte of a string.[1] As pointers are used to pass a reference to a string to functions in C, documentation (including this page) will often use the term string when the correct term is pointer to a string.
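The difference between an array and the strings it contains can be illustrated with a literal that has an embedded null byte. In this sketch the names two_strings and second_string are hypothetical; the helper assumes the caller already knows that a second string exists inside the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The literal initialises an array of 8 bytes:
   'a' 'b' 'c' '\0' 'd' 'e' 'f' '\0'
   The array is not itself a string; it contains two strings. */
static const char two_strings[] = "abc\0def";

/* Returns a pointer to the string that starts just after the first
   null byte of buf. The caller must guarantee that such a string exists. */
const char *second_string(const char *buf)
{
    assert(buf != NULL);
    return buf + strlen(buf) + 1;
}
```

Here sizeof two_strings is 8, while strlen(two_strings) is 3, and second_string(two_strings) is a pointer to the string "def".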
The term length of a string is used in C to describe the number of bytes preceding the null character.[1] strlen is a standardised function commonly used to determine the length of a string.
Overview of functions
Most of the functions that operate on C strings are declared in the string.h header (cstring header in C++). This header contains declarations of functions and types used not only for handling C strings but also for general memory handling; the name is thus something of a misnomer.
Functions declared in string.h are extremely popular since, as part of the C standard library, they are guaranteed to work on any platform which supports C. However, some security issues exist with these functions, such as buffer overflows, leading programmers to prefer safer, possibly less portable variants. Also, the string functions only work with character encodings made of bytes, such as ASCII and UTF-8. In historical documentation the term "character" was often used instead of "byte", which if followed literally would mean that multi-byte encodings such as UTF-8 were not supported. The BSD documentation has been fixed to make this clear, but POSIX, Linux, and Windows documentation still uses "character" in many places. Handling of character encodings made up of code units larger than bytes, such as UTF-16, is generally achieved through wchar.h.
Constants and types
- NULL - macro expanding to the null pointer constant; that is, a constant representing a pointer value which is guaranteed not to be a valid address of an object in memory.
- size_t - an unsigned integer type which is the type of the result of the sizeof operator.
Functions
String manipulation
- strcpy - copies one string to another
- strncpy - writes exactly n bytes to the destination, copying from the source and padding with null bytes if the source is shorter than n
- strcat - appends one string to another
- strncat - appends no more than n bytes from one string to another
- strxfrm - transforms a string according to the current locale
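A detail of strncpy that is easy to miss is that it always writes exactly n bytes: it pads with null bytes when the source is shorter, and it adds no terminator at all when the source has n or more bytes. The following sketch wraps it in a hypothetical helper, copy_with_strncpy, that always leaves a valid string in the destination; it assumes the destination capacity is at least 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size > 0),
   truncating if necessary, and always null-terminates the result. */
void copy_with_strncpy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);  /* pads with '\0' if src is shorter */
    dst[dst_size - 1] = '\0';         /* strncpy alone would not terminate
                                         a truncated copy */
}
```

With a 4-byte destination, copying "abcdef" leaves "abc"; copying "hi" into a 6-byte destination sets every byte after 'i' to the null byte.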
String examination
- strlen - returns the length of a string
- strcmp - compares two strings
- strncmp - compares at most a specified number of bytes of two strings
- strcoll - compares two strings according to the current locale
- strchr - finds the first occurrence of a byte in a string
- strrchr - finds the last occurrence of a byte in a string
- strspn - returns the length of the initial part of a string that consists only of bytes in a set of bytes (the index of the first byte not in the set)
- strcspn - returns the length of the initial part of a string that contains no bytes from a set of bytes (the index of the first byte in the set)
- strpbrk - finds the first occurrence in a string of a byte in a set of bytes
- strstr - finds the first occurrence of a substring
- strtok - finds the next occurrence of a token
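The standard functions strspn, strcspn and strstr are easy to confuse, so a small sketch may help. The three helper names below are hypothetical wrappers introduced only for illustration; the set of delimiters ",;" is likewise an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of decimal digits at the start of s (strspn counts bytes
   that ARE in the set). */
size_t count_leading_digits(const char *s)
{
    assert(s != NULL);
    return strspn(s, "0123456789");
}

/* Number of bytes before the first ',' or ';' in s, or the whole
   length if neither occurs (strcspn counts bytes NOT in the set). */
size_t length_before_delim(const char *s)
{
    assert(s != NULL);
    return strcspn(s, ",;");
}

/* Non-zero if needle occurs somewhere in haystack. */
int contains(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    return strstr(haystack, needle) != NULL;
}
```

For example, count_leading_digits("123abc") is 3 and length_before_delim("key;value") is 3, while length_before_delim("nodelim") is 7 because no delimiter is present.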
Miscellaneous
- strerror - returns a C string containing an error message derived from the error code passed in as errnum.[4] Its prototype is char* strerror(int errnum); the function is not guaranteed to be reentrant.
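Because strerror may return a pointer to storage that a later call overwrites, a common precaution is to copy the message into a caller-owned buffer straight away. The sketch below does this with a hypothetical helper, describe_error; the wording of the message itself is implementation-defined, so no particular text should be expected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "context: message" into buf, truncating
   if needed. Returns the value of snprintf, i.e. the length the full
   text would have had. */
int describe_error(char *buf, size_t size, const char *context, int errnum)
{
    assert(buf != NULL && context != NULL && size > 0);
    return snprintf(buf, size, "%s: %s", context, strerror(errnum));
}
```

A call such as describe_error(buf, sizeof buf, "strtol", ERANGE) leaves a message starting with "strtol: " in buf. This does not make strerror itself thread-safe; it only shortens the window in which the shared storage is in use.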
Memory manipulation
- memset - fills a buffer with a repeated byte
- memcpy - copies one buffer to another
- memmove - copies one buffer to another, possibly overlapping, buffer
- memcmp - compares two buffers
- memchr - finds the first occurrence of a byte in a buffer
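The distinction between memcpy and memmove matters whenever source and destination overlap, for instance when shifting the contents of a string within its own buffer. The helper drop_prefix below is a hypothetical example of such a case, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the first n bytes of the string s in
   place. If n exceeds the length, the result is the empty string. */
void drop_prefix(char *s, size_t n)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (n > len)
        n = len;
    /* Source and destination overlap, so memmove is required;
       the +1 moves the terminating null byte as well. */
    memmove(s, s + n, len - n + 1);
}
```

Applied to a buffer holding "hello world" with n equal to 6, the buffer afterwards holds "world". Using memcpy here would have undefined behaviour because the regions overlap.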
Numeric conversions
The C standard library contains several functions for numeric conversions. They are all declared in the stdlib.h header (cstdlib header in C++).
- atof - converts a string to a floating-point value
- atoi, atol, atoll (C99/C++11) - converts a string to an integer
- strtof (C99/C++11), strtod, strtold (C99/C++11) - converts a string to a floating-point value
- strtol, strtoll - converts a string to a signed integer
- strtoul, strtoull - converts a string to an unsigned integer
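The strto* family reports more than the ato* family: atoi returns 0 both for the input "0" and for input with no digits at all, whereas strtol exposes where parsing stopped and sets errno on overflow. The following sketch shows one way to use that information; parse_long is a hypothetical helper, and rejecting trailing characters is a policy chosen for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses text as a base-10 long.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_long(const char *text, long *out)
{
    assert(text != NULL && out != NULL);
    char *end;
    errno = 0;                       /* strtol only sets errno on error */
    long value = strtol(text, &end, 10);
    if (end == text)
        return 0;                    /* no digits were consumed */
    if (*end != '\0')
        return 0;                    /* trailing bytes after the number */
    if (errno == ERANGE)
        return 0;                    /* value does not fit in a long */
    *out = value;
    return 1;
}
```

With this helper "42" and "-7" parse successfully, while "", "12abc" and a digit sequence too large for long are rejected. Leading white space is still accepted, since strtol skips it.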
Popular extensions
- memccpy - SVID, POSIX - copies up to a specified number of bytes between two memory areas, which must not overlap, stopping when a given byte is found[5]
- mempcpy - GNU - a variant of memcpy returning a pointer to the byte following the last written byte
- strcat_s - ISO/IEC WDTR 24731 - a variant of strcat that checks for errors, such as the destination buffer being too small, before copying
- strcpy_s - ISO/IEC WDTR 24731 - a variant of strcpy that checks for errors, such as the destination buffer being too small, before copying
- strdup - POSIX - allocates and duplicates a string
- strerror_r - POSIX 1 - a variant of strerror that is thread-safe
- strlcpy - a variant of strcpy that truncates the copied string if the destination is too small[6]
- strlcat - a variant of strcat that truncates the appended string if the destination is too small[6]
- strsignal - POSIX:2008 - returns a string representation of a signal code. Not thread-safe[7]
- strtok_r - POSIX - a variant of strtok that is thread-safe[8]
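Since several of these extensions are not available everywhere, portable code sometimes carries its own equivalents. The sketch below is modelled on the truncating behaviour described for strlcpy, using only standard functions; copy_truncating is a hypothetical name, and the code is an approximation for illustration, not the BSD implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper in the style of strlcpy: copies at most
   dst_size - 1 bytes of src into dst and always null-terminates
   (when dst_size > 0). Returns strlen(src), so a return value
   >= dst_size tells the caller that truncation occurred. */
size_t copy_truncating(char *dst, size_t dst_size, const char *src)
{
    assert(src != NULL);
    size_t src_len = strlen(src);
    if (dst_size > 0) {
        assert(dst != NULL);
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

Copying "abcdef" into a 4-byte buffer stores "abc" and returns 6, which the caller can compare with the buffer size to detect the truncation.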
Criticism
strcat_s and strcpy_s attracted considerable criticism because, even though they are defined in the ISO/IEC WDTR 24731 standard, they are currently supported only by Microsoft Visual C++. Warning messages produced by Microsoft's compilers suggesting that programmers use these functions instead of the standard ones have been speculated by some to be a Microsoft attempt to lock developers into its platform.[9][10]
strlcpy and strlcat have been criticised on the basis that they create more problems than they solve[11] and that they lack documentation.[12] Consequently they have not been included in Linux, even though several other operating systems, notably OpenBSD, FreeBSD, Solaris and Mac OS X, implement them.
See also
- String functions
- Null-terminated string
References
- ^ a b c "The C99 standard draft + TC3". Section 7.1.1p1. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf. Retrieved 7 January 2011.
- ^ "The C99 standard draft + TC3". Section 6.4.5p7. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf. Retrieved 7 January 2011.
- ^ "The C99 standard draft + TC3". Section 6.4.5 footnote 66. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf. Retrieved 7 January 2011.
- ^ strerror
- ^ "memccpy". Pubs.opengroup.org. http://pubs.opengroup.org/onlinepubs/009695399/functions/memccpy.html. Retrieved 9 November 2011.
- ^ a b Todd C. Miller; Theo de Raadt (1999). "strlcpy and strlcat - consistent, safe, string copy and concatenation.". USENIX '99. http://www.gratisoft.us/todd/papers/strlcpy.html.
- ^ "strsignal". Pubs.opengroup.org. http://pubs.opengroup.org/onlinepubs/9699919799/functions/strsignal.html. Retrieved 9 November 2011.
- ^ "strtok". Pubs.opengroup.org. http://pubs.opengroup.org/onlinepubs/009695399/functions/strtok.html. Retrieved 9 November 2011.
- ^ Danny Kalev. "They're at it again". InformIT. http://www.informit.com/blogs/blog.aspx?uk=Theyre-at-it-again. Retrieved 10 November 2011.
- ^ "Security Enhanced CRT, Safer Than Standard Library?". http://fsfoundry.org/codefreak/2008/09/15/security-crt-safer-than-standard-library/. Retrieved 10 November 2011.
- ^ libc-alpha mailing list, selected messages from 8 August 2000 thread: 53, 60, 61
- ^ Antill, James. "Security with string APIs: Security relevant things to look for in a string library API". http://www.and.org/vstr/security#strncpy-ex7. Retrieved 10 November 2011.
Wikimedia Foundation. 2010.