- printf format string
A printf format string ("printf" stands for "print formatted") is the control parameter used by a class of functions found in many programming languages. The format string specifies how to render an arbitrary number of parameters of varied data types into a string. By default this string is printed on the standard output stream, but variants exist that perform other tasks with the result. Characters in the format string are usually copied literally into the function's output, while the other parameters are rendered into the resulting text at points marked by format specifiers, which are typically introduced by a % character.
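The copy-literally-and-substitute behaviour described above can be illustrated with a deliberately tiny formatter. This is only a sketch under stated assumptions: mini_format is a hypothetical helper, not part of any library, and it understands just %d, %s and %%, with no flags, width, precision or length.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of any library: a tiny printf-like
   formatter that understands only %d, %s and %%. It stores at most
   size-1 characters plus a terminating NUL in out and returns the
   number of characters stored. */
size_t mini_format(char *out, size_t size, const char *format, ...)
{
    va_list args;
    size_t used = 0;

    if (size == 0)
        return 0;

    va_start(args, format);
    for (const char *p = format; *p != '\0' && used + 1 < size; p++) {
        if (*p != '%') {
            out[used++] = *p;      /* ordinary character: copy literally */
            continue;
        }
        p++;                       /* look at the conversion character */
        if (*p == '\0')
            break;                 /* lone '%' at the end: stop */

        char piece[32];
        const char *text = piece;
        if (*p == 'd')
            snprintf(piece, sizeof piece, "%d", va_arg(args, int));
        else if (*p == 's')
            text = va_arg(args, const char *);
        else if (*p == '%')
            text = "%";
        else
            text = "?";            /* unsupported conversion in this sketch */

        size_t room = size - 1 - used;
        size_t len = strlen(text);
        if (len > room)
            len = room;            /* truncate rather than overflow */
        memcpy(out + used, text, len);
        used += len;
    }
    va_end(args);

    out[used] = '\0';
    return used;
}
```

Calling mini_format(buf, sizeof buf, "%s is %d%% done", "job", 75) would leave "job is 75% done" in buf. A real printf implementation handles far more cases than this sketch does.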
Timeline
Many programming languages implement a printf function to output a formatted string. It originated in the C programming language, where it has a prototype similar to the following:
int printf(const char *format, ...)
The string constant format provides a description of the output, with placeholders marked by "%" escape characters, to specify both the relative location and the type of output that the function should produce. The return value yields the number of printed characters.
Fortran, COBOL
Fortran's variadic PRINT statement referenced a non-executable FORMAT statement.
      PRINT 601, 123456, 1000.0, 3.1415, 250
  601 FORMAT (8H RED NUM,I7,4H EXP,E8.1,5H REAL,F5.2,4H VAL,I4)
will print the following (on a new line, because of the leading blank character)[1]:
RED NUM 123456 EXP 1.0E 03 REAL 3.14 VAL 250
COBOL provided formatting via hierarchical data structure specification:
01 out-rec.
   02 out-name   picture x(20).
   02 out-amount picture $9,999.99.
...
move me to out-name.
move amount to out-amount.
write out-rec.
1960s: BCPL, ALGOL 68, Multics PL/I
C's variadic printf has its origins in BCPL's writef function.
The ALGOL 68 Draft and Final reports had the functions inf and outf; these were subsequently revised out of the original language and replaced with the now more familiar readf/getf and printf/putf.
printf(($"Color "g", number1 "6d,", number2 "4zd,", hex "16r2d,", float "-d.2d,", unsigned value"-3d"."l$, "red", 123456, 89, BIN 255, 3.14, 250));
Multics has a standard function called ioa_ with a wide variety of control codes. It was based on a machine-language facility from Multics's BOS (Bootstrap Operating System).
call ioa_ ("Hello, ^a", "World!");
1970s: C, Lisp
printf("Color %s, number1 %d, number2 %05d, hex %x, float %5.2f, unsigned value %u.\n", "red", 123456, 89, 255, 3.14159, 250);
will print the following line (including new-line character, \n):
Color red, number1 123456, number2 00089, hex ff, float  3.14, unsigned value 250.
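The output above can be checked without writing to standard output by using snprintf, the C99 variant that formats into a caller-supplied buffer. The following is a minimal sketch: format_color_line is a hypothetical helper name, and the unsigned arguments are written with a u suffix so that they match the %x and %u conversions exactly.

```c
#include <stdio.h>

/* Hypothetical helper: formats the example line into buf instead of
   printing it, and returns whatever snprintf returns (the length of
   the complete line, or a negative value on error). */
int format_color_line(char *buf, size_t size)
{
    return snprintf(buf, size,
        "Color %s, number1 %d, number2 %05d, hex %x, float %5.2f, "
        "unsigned value %u.\n",
        "red", 123456, 89, 255u, 3.14159, 250u);
}
```

Because %5.2f asks for a minimum width of five characters, the four-character result 3.14 is padded with one leading space, which is why two spaces appear after "float".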
The printf function returns the number of characters printed, or a negative value if an output error occurs.
Common Lisp has the format function.
(format t "Hello, ~a" "World!")
prints "Hello, World!" on the standard output stream. If the first argument is nil, format returns the string to its caller. The first argument can also be any output stream. format was introduced into ZetaLisp at MIT in 1978, based on the Multics ioa_, and was later adopted into the Common Lisp standard.
1980s: Perl, Shell
Perl also has a printf function. Common Lisp has a format function which acts according to the same principles as printf, but uses different characters for output conversion. The GLib library contains g_print, an implementation of printf.
Some Unix systems have a printf program for use in shell scripts. This can be used instead of echo in situations where the latter is not portable. For example:
echo -n -e "$FOO\t$BAR"
may be rewritten portably as:
printf "%s\t%s" "$FOO" "$BAR"
1990s: PHP, Python
1991: Python's % operator harks back to printf's syntax when interpolating the contents of a tuple. This operator can, for example, be used with the print function:
print("%s\t%s" % (foo,bar))
Version 2.6 of Python added the str.format() method, which was then recommended in preference to the older % operator; the % operator nevertheless remains available in current versions of Python:
print("If you multiply five and six you get {0}.".format(5*6))
1995: PHP also has the printf function, with the same specifications and usage as that in C/C++. MATLAB does not have printf, but does have its two extensions sprintf and fprintf, which use the same formatting strings. sprintf returns a formatted string instead of producing a visual output.
2000s: Java
2004: Java supported printf from version 1.5 onwards as a member of the PrintStream[2] class, giving it the functionality of both the printf and fprintf functions. At the same time, sprintf-like functionality was added to the String class by adding the format(String, Object... args) method.[3]
// Write "Hello, World!" to standard output (like printf)
System.out.printf("%s, %s", "Hello", "World!");
// Create a String object with the value "Hello, World!" (like sprintf)
String myString = String.format("%s, %s", "Hello", "World!");
Unlike most other implementations, Java's implementation of printf throws an exception on encountering a malformed format string.
Format placeholders
Formatting takes place via placeholders within the format string. For example, if a program wanted to print out a person's age, it could present the output by prefixing it with "Your age is ". To denote that we want the integer for the age to be shown immediately after that message, we may use the format string:
"Your age is %d."
The syntax for a format placeholder is "%[parameter][flags][width][.precision][length]type".
- Parameter can be omitted or can be:
  - n$: n is the number of the parameter to display using this format specifier, allowing the parameters provided to be output multiple times, using varying format specifiers or in different orders. This is a POSIX extension and not in C99. Example: printf("%2$d %2$#x; %1$d %1$#x",16,17) produces "17 0x11; 16 0x10"
- Flags can be zero or more (in any order) of:
  - +: always denote the sign '+' or '-' of a number (the default is to omit the sign for positive numbers). Only applicable to numeric types.
  - space: prefixes non-negative signed numbers with a space.
  - -: left-align the output of this placeholder (the default is to right-align the output).
  - #: alternate form. For 'g' and 'G', trailing zeros are not removed. For 'f', 'F', 'e', 'E', 'g', 'G', the output always contains a decimal point. For 'o', 'x', and 'X', a 0, 0x, and 0X, respectively, is prepended to non-zero numbers.
  - 0: use 0 instead of spaces to pad a field when the width option is specified. For example, printf("%2d", 3) results in " 3", while printf("%02d", 3) results in "03".
- Width specifies a minimum number of characters to output, and is typically used to pad fixed-width fields in tabulated output, where the fields would otherwise be smaller, although it does not cause truncation of oversized fields. A leading zero in the width value is interpreted as the zero-padding flag mentioned above, and a negative value is treated as the positive value in conjunction with the left-alignment "-" flag also mentioned above.
- Precision usually specifies a maximum limit on the output, depending on the particular formatting type. For floating point numeric types, it specifies the number of digits to the right of the decimal point to which the output should be rounded. For the string type, it limits the number of characters that should be output, after which the string is truncated.
- Length can be omitted or be any of:
  - hh: For integer types, causes printf to expect an int-sized integer argument which was promoted from a char.
  - h: For integer types, causes printf to expect an int-sized integer argument which was promoted from a short.
  - l: For integer types, causes printf to expect a long-sized integer argument.
  - ll: For integer types, causes printf to expect a long long-sized integer argument.
  - L: For floating point types, causes printf to expect a long double argument.
  - z: For integer types, causes printf to expect a size_t-sized integer argument.
  - j: For integer types, causes printf to expect an intmax_t-sized integer argument.
  - t: For integer types, causes printf to expect a ptrdiff_t-sized integer argument.
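The flags, width, precision and length fields listed above can be seen together in one call. The following is a minimal sketch; format_row is a hypothetical helper name, and snprintf is used so that the result lands in a buffer rather than on standard output.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: renders one row that exercises flags, width,
   precision and length modifiers. */
int format_row(char *buf, size_t size)
{
    long long big = 9000000000LL;   /* needs the ll length modifier */
    size_t count = 42;              /* needs the z length modifier  */

    return snprintf(buf, size,
        "[%+d] [% d] [%-5d] [%05d] [%#x] [%8.3f] [%.3s] [%lld] [%zu]",
        7,          /* +  : sign always shown          -> "+7"       */
        7,          /* ' ': space before non-negative  -> " 7"       */
        7,          /* -  : left-aligned in width 5    -> "7    "    */
        7,          /* 0  : zero-padded to width 5     -> "00007"    */
        255u,       /* #  : 0x prefix for hexadecimal  -> "0xff"     */
        3.14159,    /* width 8, precision 3            -> "   3.142" */
        "abcdef",   /* precision truncates the string  -> "abc"      */
        big,
        count);
}
```

The square brackets are ordinary characters copied from the format string; they are there only to make the padding visible.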
Additionally, several platform-specific length options came to exist prior to widespread use of the ISO C99 extensions:
- I: For signed integer types, causes printf to expect a ptrdiff_t-sized integer argument; for unsigned integer types, causes printf to expect a size_t-sized integer argument. Commonly found in Win32/Win64 platforms.
- I32: For integer types, causes printf to expect a 32-bit (double word) integer argument. Commonly found in Win32/Win64 platforms.
- I64: For integer types, causes printf to expect a 64-bit (quad word) integer argument. Commonly found in Win32/Win64 platforms.
- q: For integer types, causes printf to expect a 64-bit (quad word) integer argument. Commonly found in BSD platforms.
ISO C99 includes the inttypes.h header file that includes a number of macros for use in platform-independent printf coding. Example macros include:
- PRId32: Typically equivalent to I32d (Win32/Win64) or d
- PRId64: Typically equivalent to I64d (Win32/Win64), lld (32-bit platforms) or ld (64-bit platforms)
- PRIi32: Typically equivalent to I32i (Win32/Win64) or i
- PRIi64: Typically equivalent to I64i (Win32/Win64), lli (32-bit platforms) or li (64-bit platforms)
- PRIu32: Typically equivalent to I32u (Win32/Win64) or u
- PRIu64: Typically equivalent to I64u (Win32/Win64), llu (32-bit platforms) or lu (64-bit platforms)
- PRIx64: Typically equivalent to I64x (Win32/Win64), llx (32-bit platforms) or lx (64-bit platforms)
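Each of these macros expands to a string literal, so it is combined with the rest of the format string by C's adjacent-literal concatenation. The following is a minimal sketch; format_fixed_width is a hypothetical helper name and the values are arbitrary.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: prints fixed-width integers portably by
   splicing the inttypes.h macros into the format string. */
int format_fixed_width(char *buf, size_t size)
{
    int32_t  a = -123;
    int64_t  b = INT64_C(-9000000000);
    uint32_t c = UINT32_C(4000000000);
    uint64_t d = UINT64_C(0xDEADBEEFCAFE);

    /* "%" PRId64 becomes "%ld", "%lld" or "%I64d" depending on the platform. */
    return snprintf(buf, size,
        "%" PRId32 " %" PRId64 " %" PRIu32 " %" PRIx64,
        a, b, c, d);
}
```

Note that the macros supply only the length and type characters; the leading % and any flags or width are still written by hand.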
- Type can be any of:
  - d, i: int as a signed decimal number. '%d' and '%i' are synonymous for output, but are different when used with scanf() for input.
  - u: Print decimal unsigned int.
  - f, F: double in normal (fixed-point) notation. 'f' and 'F' differ only in how the strings for an infinite number or NaN are printed ('inf', 'infinity' and 'nan' for 'f'; 'INF', 'INFINITY' and 'NAN' for 'F').
  - e, E: double value in standard form ([-]d.ddd e[+/-]ddd). An E conversion uses the letter E (rather than e) to introduce the exponent. The exponent always contains at least two digits; if the value is zero, the exponent is 00. In Windows, the exponent contains three digits by default, e.g. 1.5e002, but this can be altered by the Microsoft-specific _set_output_format function.
  - g, G: double in either normal or exponential notation, whichever is more appropriate for its magnitude. 'g' uses lower-case letters, 'G' uses upper-case letters. This type differs slightly from fixed-point notation in that insignificant zeroes to the right of the decimal point are not included. Also, the decimal point is not included on whole numbers.
  - x, X: unsigned int as a hexadecimal number. 'x' uses lower-case letters and 'X' uses upper-case.
  - o: unsigned int in octal.
  - s: null-terminated string.
  - c: char (character).
  - p: void * (pointer to void) in an implementation-defined format.
  - n: Print nothing, but write the number of characters successfully written so far into an integer pointer parameter.
  - %: a literal '%' character (this type doesn't accept any flags, width, precision or length).
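Most of the conversion types above can be tried with a short sketch such as the following. The helper names format_integer_types and format_float_types are hypothetical. The e, p and n conversions are left out on purpose, because their output or availability varies between platforms (the exponent width of e is one example noted above).

```c
#include <stdio.h>

/* Hypothetical helper: integer, character, string and literal-%
   conversions. */
int format_integer_types(char *buf, size_t size)
{
    return snprintf(buf, size, "%d %i %u %x %X %o %c %s %%",
                    -5, -5, 5u, 255u, 255u, 8u, 'A', "str");
}

/* Hypothetical helper: fixed-point output with f, and g choosing the
   shorter plain notation and dropping insignificant zeroes. */
int format_float_types(char *buf, size_t size)
{
    return snprintf(buf, size, "%.2f %g %g", 2.5, 0.0001, 100000.0);
}
```

With these arguments the first helper yields "-5 -5 5 ff FF 10 A str %" and the second yields "2.50 0.0001 100000".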
The width and precision formatting parameters may be omitted, or they can be a fixed number embedded in the format string, or passed as another function argument when indicated by an asterisk "*" in the format string. For example, printf("%*d", 5, 10) will result in "   10" being printed, with a total width of 5 characters, and printf("%.*s", 3, "abcdef") will result in "abc" being printed.
If the syntax of a conversion specification is invalid, behavior is undefined, and can cause program termination. If there are too few function arguments provided to supply values for all the conversion specifications in the template string, or if the arguments are not of the correct types, the results are also undefined. Excess arguments are ignored. In a number of cases, the undefined behavior has led to "Format string attack" security vulnerabilities.
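The asterisk forms of width and precision described above are mainly useful when the field size is only known at run time. The following is a minimal sketch; format_in_width and format_prefix are hypothetical helper names.

```c
#include <stdio.h>

/* Hypothetical helper: right-aligns value in a field whose width is
   chosen at run time. A negative width behaves like the '-' flag
   combined with the corresponding positive width. */
int format_in_width(char *buf, size_t size, int width, int value)
{
    return snprintf(buf, size, "%*d", width, value);
}

/* Hypothetical helper: copies at most max_chars characters of text. */
int format_prefix(char *buf, size_t size, int max_chars, const char *text)
{
    return snprintf(buf, size, "%.*s", max_chars, text);
}
```

Each asterisk consumes one int argument, which must appear before the value it applies to. Passing an argument of another type there falls under the undefined behavior described above.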
Some compilers, like the GNU Compiler Collection, will statically check the format strings of printf-like functions and warn about problems (when using the flags -Wall or -Wformat). GCC will also warn about user-defined printf-style functions if the non-standard "format" __attribute__ is applied to the function.
Risks of using field width versus explicit delimiters in tabular output
Using only field widths to provide for tabulation, as with a format like "%8d%8d%8d" for three integers in three 8-character columns, will not guarantee that field separation will be retained if large numbers occur in the data. Loss of field separation can easily lead to corrupt output. In systems which encourage the use of programs as building blocks in scripts, such corrupt data can often be forwarded into and corrupt further processing, regardless of whether the original programmer expected the output would only be read by human eyes. Such problems can be eliminated by including explicit delimiters, even spaces, in all tabular output formats. Simply changing the dangerous example from before to " %7d %7d %7d" addresses this, formatting identically until numbers become larger, but then preventing them from becoming merged on output because the spaces are part of the format string itself. Similar strategies apply to string data.
Custom format placeholders
There are a few implementations of printf-like functions that allow extensions to the escape-character-based mini-language, thus allowing the programmer to have a specific formatting function for non-builtin types. One of the most well-known is glibc's (now deprecated) register_printf_function(). However, it is rarely used because it conflicts with static format string checking. Another is Vstr custom formatters, which allows adding multi-character format names, and can work with static format checkers.
Some applications (like the Apache HTTP Server) include their own printf-like function, and embed extensions into it. However, these all tend to have the same problems that register_printf_function() has.
Most non-C languages that have a printf-like function work around the lack of this feature by just using the "%s" format and converting the object to a string representation. C++ offers a notable exception, in that it has a printf function inherited from its C history, but also has a completely different mechanism that is preferred.
Programming languages with printf
- AMPL
- awk
- Bourne shell (sh) and derivatives such as Korn shell (ksh), Bourne again shell (bash), or Z shell (zsh)
- C
- C++ (also provides overloaded shift operators and manipulators as an alternative for formatted output - see iostream and iomanip)
- Objective-C
- Clojure
- D
- F#
- GNU MathProg
- GNU Octave
- Go
- Haskell
- Java (since version 1.5)
- Maple
- Mathematica
- MATLAB
- Mythryl
- Objective Caml
- PHP
- Perl
- Python (using the % operator)
- R
- Ruby
- Vala (via print() and FileStream.printf())
See also
- C standard library
- Format string attack
- iostream
- ML (programming language)
- printf debugging
- printk (print kernel messages)
- scanf
Notes
- ^ "ASA Print Control Characters". http://www.felgall.com/asa.htm. Retrieved 2010-02-12.
- ^ "PrintStream (Java 2 Platform SE 5.0)". Sun Microsystems Inc. 1994. http://java.sun.com/j2se/1.5.0/docs/api/java/io/PrintStream.html#printf(java.lang.String,%20java.lang.Object...). Retrieved 2008-11-18.
- ^ "String (Java 2 Platform SE 5.0)". Sun Microsystems Inc. 1994. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#format(java.lang.String,%20java.lang.Object...). Retrieved 2008-11-18.
External links
- C++ reference for std::fprintf
- gcc printf format specifications quick reference
- The Single UNIX® Specification, Issue 7 from The Open Group: print formatted output – System Interfaces Reference
- The Formatter specification in Java 1.5
- GNU Bash printf(1) builtin
Wikimedia Foundation. 2010.