887 lines
28 KiB
HTML
887 lines
28 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|
|
"http://www.w3.org/TR/html4/strict.dtd">
|
|
<html>
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html;charset=US-ASCII">
|
|
<title>Text Formatting</title>
|
|
|
|
<style type="text/css">
|
|
|
|
body { color: #000000; background-color: #FFFFFF; }
|
|
del { text-decoration: line-through; color: #8B0040; }
|
|
ins { text-decoration: underline; color: #005100; }
|
|
|
|
p.example { margin-left: 2em; }
|
|
pre.example { margin-left: 2em; }
|
|
div.example { margin-left: 2em; }
|
|
|
|
code.extract { background-color: #F5F6A2; }
|
|
pre.extract { margin-left: 2em; background-color: #F5F6A2;
|
|
border: 1px solid #E1E28E; }
|
|
|
|
p.function { }
|
|
.attribute { margin-left: 2em; }
|
|
.attribute dt { float: left; font-style: italic;
|
|
padding-right: 1ex; }
|
|
.attribute dd { margin-left: 0em; }
|
|
|
|
blockquote.std { color: #000000; background-color: #F1F1F1;
|
|
border: 1px solid #D1D1D1;
|
|
padding-left: 0.5em; padding-right: 0.5em; }
|
|
blockquote.stddel { text-decoration: line-through;
|
|
color: #000000; background-color: #FFEBFF;
|
|
border: 1px solid #ECD7EC;
|
|
padding-left: 0.5empadding-right: 0.5em; ; }
|
|
|
|
blockquote.stdins { text-decoration: underline;
|
|
color: #000000; background-color: #C8FFC8;
|
|
border: 1px solid #B3EBB3; padding: 0.5em; }
|
|
|
|
table { border: 1px solid black; border-spacing: 0px;
|
|
margin-left: auto; margin-right: auto; }
|
|
th { text-align: left; vertical-align: top;
|
|
padding-left: 0.8em; border: none; }
|
|
td { text-align: left; vertical-align: top;
|
|
padding-left: 0.8em; border: none; }
|
|
|
|
</style>
|
|
|
|
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"
|
|
type="text/javascript"> </script>
|
|
|
|
<script type="text/javascript">$(function() {
|
|
var next_id = 0
|
|
function find_id(node) {
|
|
// Look down the first children of 'node' until we find one
|
|
// with an id. If we don't find one, give 'node' an id and
|
|
// return that.
|
|
var cur = node[0];
|
|
while (cur) {
|
|
if (cur.id) return curid;
|
|
if (cur.tagName == 'A' && cur.name)
|
|
return cur.name;
|
|
cur = cur.firstChild;
|
|
};
|
|
// No id.
|
|
node.attr('id', 'gensection-' + next_id++);
|
|
return node.attr('id');
|
|
};
|
|
|
|
// Put a table of contents in the #toc nav.
|
|
|
|
// This is a list of <ol> elements, where toc[N] is the list for
|
|
// the current sequence of <h(N+2)> tags. When a header of an
|
|
// existing level is encountered, all higher levels are popped,
|
|
// and an <li> is appended to the level
|
|
var toc = [$("<ol/>")];
|
|
$(':header').not('h1').each(function() {
|
|
var header = $(this);
|
|
// For each <hN> tag, add a link to the toc at the appropriate
|
|
// level. When toc is one element too short, start a new list
|
|
var levels = {H2: 0, H3: 1, H4: 2, H5: 3, H6: 4};
|
|
var level = levels[this.tagName];
|
|
if (typeof level == 'undefined') {
|
|
throw 'Unexpected tag: ' + this.tagName;
|
|
}
|
|
// Truncate to the new level.
|
|
toc.splice(level + 1, toc.length);
|
|
if (toc.length < level) {
|
|
// Omit TOC entries for skipped header levels.
|
|
return;
|
|
}
|
|
if (toc.length == level) {
|
|
// Add a <ol> to the previous level's last <li> and push
|
|
// it into the array.
|
|
var ol = $('<ol/>')
|
|
toc[toc.length - 1].children().last().append(ol);
|
|
toc.push(ol);
|
|
}
|
|
var header_text = header.text();
|
|
toc[toc.length - 1].append(
|
|
$('<li/>').append($('<a href="#' + find_id(header) + '"/>')
|
|
.text(header_text)));
|
|
});
|
|
$('#toc').append(toc[0]);
|
|
})
|
|
</script>
|
|
|
|
</head>
|
|
<body>
|
|
<h1>Text Formatting</h1>
|
|
|
|
<p>
|
|
2016-08-19
|
|
</p>
|
|
|
|
<address>
|
|
Victor Zverovich, victor.zverovich@gmail.com
|
|
</address>
|
|
|
|
<div id="toc"></div>
|
|
|
|
<h2><a name="Introduction">Introduction</a></h2>
|
|
|
|
<p>
|
|
This paper proposes a new text formatting functionality that can be used as a
|
|
safe and extensible alternative to the <code>printf</code> family of functions.
|
|
It is intended to complement the existing C++ I/O streams library and reuse
|
|
some of its infrastructure such as overloaded insertion operators for
|
|
user-defined types.
|
|
</p>
|
|
|
|
<p>
|
|
Example:
|
|
|
|
<pre class="example">
|
|
<code>std::string message = std::format("The answer is {}.", 42);</code>
|
|
</pre>
|
|
|
|
<h2><a name="Design">Design</a></h2>
|
|
|
|
<h3><a name="Syntax">Format String Syntax</a></h3>
|
|
|
|
<p>
|
|
Variations of the printf format string syntax are arguably the most popular
|
|
among the programming languages and C++ itself inherits <code>printf</code>
|
|
from C <a href="#1">[1]</a>. The advantage of the printf syntax is that many
|
|
programmers are familiar with it. However, in its current form it has a number
|
|
of issues:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>Many format specifiers like <code>hh</code>, <code>h</code>, <code>l</code>,
|
|
<code>j</code>, etc. are used only to convey type information.
|
|
They are redundant in type-safe formatting and would unnecessarily
|
|
complicate specification and parsing.</li>
|
|
<li>There is no standard way to extend the syntax for user-defined types.</li>
|
|
<li>There are subtle differences between different implementations. For example,
|
|
POSIX positional arguments <a href="#2">[2]</a> are not supported on
|
|
some systems <a href="#6">[6]</a>.</li>
|
|
<li>Using <code>'%'</code> in a custom format specifier, e.g. for
|
|
<code>put_time</code>-like time formatting, poses difficulties.</li>
|
|
</ul>
|
|
|
|
<p>
|
|
Although it is possible to address these issues, this will break compatibility
|
|
and can potentially be more confusing to users than introducing a different
|
|
syntax.
|
|
</p>
|
|
|
|
<p>
|
|
Therefore we propose a new syntax based on the ones used in Python
|
|
<a href="#3">[3]</a>, the .NET family of languages <a href="#4">[4]</a>,
|
|
and Rust <a href="#5">[5]</a>. This syntax employs <code>'{'</code> and
|
|
<code>'}'</code> as replacement field delimiters instead of <code>'%'</code>
|
|
and it is described in details in the <a href="#SyntaxRef">syntax reference</a>.
|
|
Here are some of the advantages:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>Consistent and easy to parse mini-language focused on formatting rather
|
|
than conveying type information</li>
|
|
<li>Extensibility and support for custom format strings for user-defined
|
|
types</li>
|
|
<li>Positional arguments</li>
|
|
<li>Support for both locale-specific and locale-independent formatting (see
|
|
<a href="#Locale">Locale Support</a>)</li>
|
|
<li>Formatting improvements such as better alignment control, fill character,
|
|
and binary format
|
|
</ul>
|
|
|
|
<p>
|
|
The syntax is expressive enough to enable translation, possibly automated,
|
|
of most printf format strings. The correspondence between <code>printf</code>
|
|
and the new syntax is given in the following table.
|
|
</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>printf</th><th>new</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr><td>-</td><td><</td></tr>
|
|
<tr><td>+</td><td>+</td></tr>
|
|
<tr><td><em>space</em></td><td><em>space</em></td></tr>
|
|
<tr><td>#</td><td>#</td></tr>
|
|
<tr><td>0</td><td>0</td></tr>
|
|
<tr><td>hh</td><td>unused</td></tr>
|
|
<tr><td>h</td><td>unused</td></tr>
|
|
<tr><td>l</td><td>unused</td></tr>
|
|
<tr><td>ll</td><td>unused</td></tr>
|
|
<tr><td>j</td><td>unused</td></tr>
|
|
<tr><td>z</td><td>unused</td></tr>
|
|
<tr><td>t</td><td>unused</td></tr>
|
|
<tr><td>L</td><td>unused</td></tr>
|
|
<tr><td>c</td><td>c (optional)</td></tr>
|
|
<tr><td>s</td><td>s (optional)</td></tr>
|
|
<tr><td>d</td><td>d (optional)</td></tr>
|
|
<tr><td>i</td><td>d (optional)</td></tr>
|
|
<tr><td>o</td><td>o</td></tr>
|
|
<tr><td>x</td><td>x</td></tr>
|
|
<tr><td>X</td><td>X</td></tr>
|
|
<tr><td>u</td><td>d (optional)</td></tr>
|
|
<tr><td>f</td><td>f</td></tr>
|
|
<tr><td>F</td><td>F</td></tr>
|
|
<tr><td>e</td><td>e</td></tr>
|
|
<tr><td>E</td><td>E</td></tr>
|
|
<tr><td>a</td><td>a</td></tr>
|
|
<tr><td>A</td><td>A</td></tr>
|
|
<tr><td>g</td><td>g (optional)</td></tr>
|
|
<tr><td>G</td><td>G</td></tr>
|
|
<tr><td>n</td><td>unused</td></tr>
|
|
<tr><td>p</td><td>p (optional)</td></tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
Width and precision are represented similarly in <code>printf</code> and the
|
|
proposed syntax with the only difference that runtime value is specified by
|
|
<code>*</code> in the former and <code>{}</code> in the latter, possibly with
|
|
the index of the argument inside the braces.
|
|
</p>
|
|
|
|
<p>
|
|
As can be seen from the table above, most of the specifiers remain the same
|
|
which simplifies migration from <code>printf</code>. Notable difference is
|
|
in the alignment specification. The proposed syntax allows left, center,
|
|
and right alignment represented by <code>'<'</code>, <code>'^'</code>,
|
|
and <code>'>'</code> respectively which is more expressive than the
|
|
corresponding <code>printf</code> syntax. The latter only supports left and
|
|
right (the default) alignment.
|
|
</p>
|
|
|
|
<p>
|
|
The following example uses center alignment and <code>'*'</code> as a fill
|
|
character:
|
|
</p>
|
|
|
|
<pre class="example">
|
|
<code>std::format("{:*^30}", "centered");</code>
|
|
</pre>
|
|
|
|
<p>
|
|
resulting in <code>"***********centered***********"</code>.
|
|
The same formatting cannot be easily achieved with <code>printf</code>.
|
|
</p>
|
|
|
|
<h3><a name="Extensibility">Extensibility</a></h3>
|
|
|
|
<p>
|
|
Both the format string syntax and the API are designed with extensibility in
|
|
mind. The mini-language can be extended for user-defined types and users can
|
|
provide functions that do parsing and formatting for such types.
|
|
</p>
|
|
|
|
<p>The general syntax of a replacement field in a format string is
|
|
|
|
<pre>
|
|
<code>replacement-field ::= '{' [arg-id] [':' format-spec] '}'</code>
|
|
</pre>
|
|
|
|
<p>
|
|
where <code>format-spec</code> is predefined for built-in types, but can be
|
|
customized for user-defined types. For example, the syntax can be extended
|
|
for <code>put_time</code>-like date and time formatting
|
|
</p>
|
|
|
|
<pre class="example">
|
|
<code>std::time_t t = std::time(nullptr);
|
|
std::string date = std::format("The date is {0:%Y-%m-%d}.", *std::localtime(&t));</code>
|
|
</pre>
|
|
|
|
<p>by providing an overload of <code>std::format_arg</code> for
|
|
<code>std::tm</code>:</p>
|
|
|
|
<p>TODO: example</p>
|
|
|
|
<h3><a name="Safety">Safety</a></h3>
|
|
|
|
<p>
|
|
Formatting functions rely on variadic templates instead of the mechanism
|
|
provided by <code><cstdarg></code>. The type information is captured
|
|
automatically and passed to formatters guaranteeing type safety and making
|
|
many of the <code>printf</code> specifiers redundant (see <a href="#Syntax">
|
|
Format String Syntax</a>). Buffer management is also automatic to prevent
|
|
buffer overflow errors common to <code>printf</code>.
|
|
</p>
|
|
|
|
<h3><a name="Locale">Locale Support</a></h3>
|
|
|
|
<p>
|
|
As pointed out in
|
|
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r1.html">
|
|
P0067R1: Elementary string conversions</a> there is a number of use
|
|
cases that do not require internationalization support, but do require high
|
|
throughput when produced by a server. These include various text-based
|
|
interchange formats such as JSON or XML. The need for locale-independent
|
|
functions for conversions between integers and strings and between
|
|
floating-point numbers and strings has also been highlighted in
|
|
<a href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4412.html">
|
|
N4412: Shortcomings of iostreams</a>. Therefore a user should be able to
|
|
easily control whether to use locales or not during formatting.
|
|
</p>
|
|
|
|
<p>
|
|
We follow Python's approach <a href="#3">[3]</a> and designate a separate format
|
|
specifier <code>'n'</code> for locale-aware numeric formatting. It applies to
|
|
all integral and floating-point types. All other specifiers produce output
|
|
unaffected by locale settings. This can also have positive peformance effect
|
|
because locale-independent formatting can be implemented more efficiently.
|
|
</p>
|
|
|
|
<h3><a name="PosArguments">Positional Arguments</a></h3>
|
|
|
|
<p>
|
|
An important feature for localization is the ability to rearrange formatting
|
|
arguments because the word order may vary in different languages
|
|
<a href="#3">[3]</a>. For example:
|
|
</p>
|
|
|
|
<pre class="example">
|
|
<code>printf("String `%s' has %d characters\n", string, length(string)));</code>
|
|
</pre>
|
|
|
|
<p>A possible German translation of the format string might be:</p>
|
|
|
|
<pre class="example">
|
|
<code>"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"</code>
|
|
</pre>
|
|
|
|
<p>
|
|
using POSIX positional arguments <a href="#2">[2]</a>. Unfortunately these
|
|
positional specifiers are not portable <a href="#6">[6]</a>. The C++ I/O
|
|
streams don't support such rearranging of arguments by design because they
|
|
are interleaved with the portions of the literal string:
|
|
</p>
|
|
|
|
<pre class="example">
|
|
<code>std::cout << "String `" << string << "' has " << length(string) << " characters\n";</code>
|
|
</pre>
|
|
|
|
<p>
|
|
The current proposal allows both positional and automatically numbered
|
|
arguments, for example:
|
|
</p>
|
|
|
|
<pre class="example">
|
|
<code>std::format("String `{}' has {} characters\n", string, length(string)));</code>
|
|
</pre>
|
|
|
|
<p>with the German translation of the format string:</p>
|
|
|
|
<pre class="example">
|
|
<code>"{1} Zeichen lang ist die Zeichenkette `{0}'\n"</code>
|
|
</pre>
|
|
|
|
<h3><a name="Locale">Performance</a></h3>
|
|
|
|
<p>TODO</p>
|
|
|
|
<h3><a name="Footprint">Binary Footprint</a></h3>
|
|
|
|
<p>TODO</p>
|
|
|
|
<h2><a name="Wording">Proposed Wording</a></h2>
|
|
|
|
<h3>Header <code><format></code> synopsis</h3>
|
|
|
|
<pre>
|
|
<code>namespace std {
|
|
class format_error;
|
|
|
|
class format_args;
|
|
|
|
template <class Char>
|
|
basic_string<Char> format(const Char* fmt, format_args args);
|
|
|
|
template <class Char, class ...Args>
|
|
basic_string<Char> format(const Char* fmt, const Args&... args);
|
|
}</code>
|
|
</pre>
|
|
|
|
<h3><a name="SyntaxRef">Format string syntax</a></h3>
|
|
|
|
<p>
|
|
Format strings contain <em>replacement fields</em> surrounded by curly braces
|
|
<code>{}</code>. Anything that is not contained in braces is considered literal
|
|
text, which is copied unchanged to the output. A brace character can be
|
|
included in the literal text by doubling: <code>{{</code> and <code>}}</code>.
|
|
</p>
|
|
|
|
<p>
|
|
The grammar for a replacement field is as follows:
|
|
</p>
|
|
|
|
<!-- The notation is the same as in n4296 22.4.3.1. -->
|
|
<pre>
|
|
<code>replacement-field ::= '{' [arg-id] [':' format-spec] '}'
|
|
arg-id ::= integer
|
|
integer ::= digit+
|
|
digit ::= '0'...'9'</code>
|
|
</pre>
|
|
|
|
<p>
|
|
In less formal terms, the replacement field can start with an
|
|
<code>arg-id</code> that specifies the argument whose value is to be formatted
|
|
and inserted into the output instead of the replacement field. The
|
|
<code>arg-id</code> is optionally followed by a <code>format-spec</code>,
|
|
which is preceded by a colon <code>':'</code>. These specify a non-default
|
|
format for the replacement value.
|
|
</p>
|
|
|
|
<p>
|
|
See also the <a href="FormatSpec">Format specification mini-language</a>
|
|
section.
|
|
</p>
|
|
|
|
<p>
|
|
If the numerical <code>arg-id</code>s in a format string are 0, 1, 2, ... in
|
|
sequence, they can all be omitted (not just some) and the numbers 0, 1, 2, ...
|
|
will be automatically inserted in that order.
|
|
</p>
|
|
|
|
<p>
|
|
Some simple format string examples:
|
|
</p>
|
|
|
|
<pre>
|
|
<code>"First, thou shalt count to {0}" // References the first argument
|
|
"Bring me a {}" // Implicitly references the first argument
|
|
"From {} to {}" // Same as "From {0} to {1}"</code>
|
|
</pre>
|
|
|
|
<p>
|
|
The <code>format-spec</code> field contains a specification of how the value
|
|
should be presented, including such details as field width, alignment, padding,
|
|
decimal precision and so on. Each value type can define its own <em>formatting
|
|
mini-language</em> or interpretation of the <code>format-spec</code>.
|
|
</p>
|
|
|
|
<p>
|
|
Most built-in types support a common formatting mini-language, which is
|
|
described in the next section.
|
|
</p>
|
|
|
|
<p>
|
|
A <code>format-spec</code> field can also include nested replacement fields
|
|
in certain position within it. These nested replacement fields can contain only
|
|
an argument index; format specifications are not allowed. This allows the
|
|
formatting of a value to be dynamically specified.
|
|
</p>
|
|
|
|
<h4><a name="FormatSpec">Format specification mini-language</a></h4>
|
|
|
|
<p>
|
|
<em>Format specifications</em> are used within replacement fields contained
|
|
within a format string to define how individual values are presented (see
|
|
<a href="SyntaxRef">Format string syntax</a>). Each formattable type may define
|
|
how the format specification is to be interpreted.
|
|
</p>
|
|
|
|
<p>
|
|
Most built-in types implement the following options for format specifications,
|
|
although some of the formatting options are only supported by the numeric types.
|
|
</p>
|
|
|
|
<p>
|
|
The general form of a <em>standard format specifier</em> is:
|
|
</p>
|
|
|
|
<pre>
|
|
<code>format-spec ::= [[fill] align] [sign] ['#'] ['0'] [width] ['.' precision] [type]
|
|
fill ::= <a character other than '{' or '}'>
|
|
align ::= '<' | '>' | '=' | '^'
|
|
sign ::= '+' | '-' | ' '
|
|
width ::= integer | '{' arg-id '}'
|
|
precision ::= integer | '{' arg-id '}'
|
|
type ::= int-type | 'a' | 'A' | 'c' | 'e' | 'E' | 'f' | 'F' | 'g' | 'G' | 'p' | 's'
|
|
int-type ::= 'b' | 'B' | 'd' | 'o' | 'x' | 'X'</code>
|
|
</pre>
|
|
|
|
<p>
|
|
The <code>fill</code> character can be any character other than <code>'{'</code>
|
|
or <code>'}'</code>. The presence of a fill character is signaled by the
|
|
character following it, which must be one of the alignment options. If the
|
|
second character of <code>format-spec</code> is not a valid alignment option,
|
|
then it is assumed that both the fill character and the alignment option are
|
|
absent.
|
|
|
|
<p>
|
|
The meaning of the various alignment options is as follows:
|
|
</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Option</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'<'</code></td>
|
|
<td>Forces the field to be left-aligned within the available space (this is
|
|
the default for most objects).</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'>'</code></td>
|
|
<td>Forces the field to be right-aligned within the available space (this is
|
|
the default for numbers).</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'='</code></td>
|
|
<td>Forces the padding to be placed after the sign (if any) but before the
|
|
digits. This is used for printing fields in the form
|
|
<code>+000000120</code>. This alignment option is only valid for numeric
|
|
types.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'^'</code></td>
|
|
<td>Forces the field to be centered within the available space.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
Note that unless a minimum field width is defined, the field width will always
|
|
be the same size as the data to fill it, so that the alignment option has no
|
|
meaning in this case.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>sign</code> option is only valid for number types, and can be one of
|
|
the following:
|
|
</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Option</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'+'</code></td>
|
|
<td>Indicates that a sign should be used for both positive as well as negative
|
|
numbers.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'-'</code></td>
|
|
<td>Indicates that a sign should be used only for negative numbers (this is
|
|
the default behavior).</td>
|
|
</tr>
|
|
<tr>
|
|
<td>space</td>
|
|
<td>Indicates that a leading space should be used on positive numbers, and a
|
|
minus sign on negative numbers.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
The <code>'#'</code> option causes the <em>alternate form</em to be used for
|
|
the conversion. The alternate form is defined differently for different types.
|
|
This option is only valid for integer and floating-point types. For integers,
|
|
when binary, octal, or hexadecimal output is used, this option adds the prefix
|
|
respective <code>"0b"</code> (<code>"0B"</code>), <code>"0"</code>, or
|
|
<code>"0x"</code> (<code>"0X"</code>) to the output value. Whether the prefix
|
|
is lower-case or upper-case is determined by the case of the type specifier,
|
|
for example, the prefix <code>"0x"</code> is used for the type <code>'x'</code>
|
|
and <code>"0X"</code> is used for <code>'X'</code>. For floating-point numbers
|
|
the alternate form causes the result of the conversion to always contain a
|
|
decimal-point character, even if no digits follow it. Normally, a decimal-point
|
|
character appears in the result of these conversions only if a digit follows it.
|
|
In addition, for <code>'g'</code> and <code>'G'</code> conversions, trailing
|
|
zeros are not removed from the result.
|
|
</p>
|
|
|
|
<p>
|
|
<code>width</code> is a decimal integer defining the minimum field width. If
|
|
not specified, then the field width will be determined by the content.
|
|
</p>
|
|
|
|
<p>
|
|
Preceding the <code>width</code> field by a zero (<code>'0'</code>) character
|
|
enables sign-aware zero-padding for numeric types. This is equivalent to a
|
|
<code>fill</code> character of <code>'0'</code> with an <code>alignment</code>
|
|
type of <code>'='</code>.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>precision</code> is a decimal number indicating how many digits should
|
|
be displayed after the decimal point for a floating-point value formatted with
|
|
<code>'f'</code> and <code>'F'</code>, or before and after the decimal point
|
|
for a floating-point value formatted with <code>'g'</code> or <code>'G'</code>.
|
|
For non-number types the field indicates the maximum field size - in other
|
|
words, how many characters will be used from the field content. The
|
|
<code>precision</code> is not allowed for integer, character, Boolean, and
|
|
pointer values.
|
|
</p>
|
|
|
|
<p>
|
|
Finally, the <code>type</code> determines how the data should be presented.
|
|
</p>
|
|
|
|
<p>The available string presentation types are:</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Type</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'s'</code></td>
|
|
<td>String format. This is the default type for strings and may be omitted.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>none</td>
|
|
<td>The same as <code>'s'</code>.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>The available character presentation types are:</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Type</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'c'</code></td>
|
|
<td>Character format. This is the default type for characters and may be
|
|
omitted.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>none</td>
|
|
<td>The same as <code>'c'</code>.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>The available integer presentation types are:</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Type</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'b'</code></td>
|
|
<td>Binary format. Outputs the number in base 2. Using the <code>'#'</code>
|
|
option with this type adds the prefix <code>"0b"</code> to the output
|
|
value.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'B'</code></td>
|
|
<td>Binary format. Outputs the number in base 2. Using the <code>'#'</code>
|
|
option with this type adds the prefix <code>"0B"</code> to the output
|
|
value.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'d'</code></td>
|
|
<td>Decimal integer. Outputs the number in base 10.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'o'</code></td>
|
|
<td>Octal format. Outputs the number in base 8.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'x'</code></td>
|
|
<td>Hex format. Outputs the number in base 16, using lower-case letters for the
|
|
digits above 9. Using the <code>'#'</code> option with this type adds the
|
|
prefix <code>"0x"</code> to the output value.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'X'</code></td>
|
|
<td>Hex format. Outputs the number in base 16, using upper-case letters for the
|
|
digits above 9. Using the <code>'#'</code> option with this type adds the
|
|
prefix <code>"0X"</code> to the output value.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'n'</code></td>
|
|
<td>Number. This is the same as <code>'d'</code>, except that it uses the
|
|
current locale setting to insert the appropriate number separator
|
|
characters.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>none</td>
|
|
<td>The same as <code>'d'</code>.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
Integer presentation types can also be used with character and Boolean values.
|
|
Boolean values are formatted using textual representation, either true or false,
|
|
if the presentation type is not specified.
|
|
</p>
|
|
|
|
<p>The available presentation types for floating-point values are:</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Type</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'a'</code></td>
|
|
<td>Hexadecimal floating point format. Prints the number in base 16 with prefix
|
|
<code>"0x"</code> and lower-case letters for digits above 9. Uses
|
|
<code>'p'</code> to indicate the exponent.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'A'</code></td>
|
|
<td>Same as <code>'a'</code> except it uses upper-case letters for the prefix,
|
|
digits above 9 and to indicate the exponent.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'e'</code></td>
|
|
<td>Exponent notation. Prints the number in scientific notation using the
|
|
letter <code>'e'</code> to indicate the exponent.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'E'</code></td>
|
|
<td>Exponent notation. Same as <code>'e'</code> except it uses an upper-case
|
|
<code>'E'</code> as the separator character.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'f'</code></td>
|
|
<td>Fixed point. Displays the number as a fixed-point number.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'F'</code></td>
|
|
<td>Fixed point. Same as <code>'f'</code>, but converts <code>nan</code> to
|
|
<code>NAN</code> and <code>inf</code> to <code>INF</code>.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'g'</code></td>
|
|
<td>General format. For a given precision <code>p >= 1</code>, this rounds the
|
|
number to <code>p</code> significant digits and then formats the result in
|
|
either fixed-point format or in scientific notation, depending on its
|
|
magnitude.
|
|
|
|
A precision of <code>0</code> is treated as equivalent to a precision of
|
|
<code>1</code>.</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>'n'</code></td>
|
|
<td>Number. This is the same as <code>'g'</code>, except that it uses the
|
|
current locale setting to insert the appropriate number separator
|
|
characters.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>none</td>
|
|
<td>The same as <code>'g'</code>.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>The available presentation types for pointers are:</p>
|
|
|
|
<table>
|
|
<thead>
|
|
<tr><th>Type</th><th>Meaning</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>'p'</code></td>
|
|
<td>Pointer format. This is the default type for pointers and may be
|
|
omitted.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>none</td>
|
|
<td>The same as <code>'p'</code>.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<h3>Class <code>format_error</code></h3>
|
|
|
|
<pre>
|
|
<code>class format_error : public std::runtime_error {
|
|
public:
|
|
explicit format_error(const string& what_arg);
|
|
explicit format_error(const char* what_arg);
|
|
};</code>
|
|
</pre>
|
|
|
|
<p>
|
|
The class <code>format_error</code> defines the type of objects thrown as
|
|
exceptions to report errors from the formatting library.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>format_error(const string& what_arg);</code></dt>
|
|
<dd>
|
|
<p><i>Effects</i>: Constructs an object of class <code>format_error</code>.</p>
|
|
<p><i>Postcondition</i>: <code>strcmp(what(), what_arg.c_str()) == 0</code>.</p>
|
|
</dd>
|
|
<dt><code>format_error(const char* what_arg);</code></dt>
|
|
<dd>
|
|
<p><i>Effects</i>: Constructs an object of class <code>format_error</code>.</p>
|
|
<p><i>Postcondition</i>: <code>strcmp(what(), what_arg) == 0</code>.</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<h3>Class <code>format_args</code></h3>
|
|
|
|
<p>TODO</p>
|
|
|
|
<h3>Function template <code>format</code></h3>
|
|
|
|
<dl>
|
|
<dt>
|
|
<code>template <class Char><br>
|
|
basic_string<Char> format(const Char* fmt, format_args args);<br>
|
|
<br>
|
|
template <class Char, class ...Args><br>
|
|
basic_string<Char> format(const Char* fmt, const Args&... args);</code>
|
|
</dt>
|
|
<dd>
|
|
|
|
<p><i>Requires</i>: <code>fmt</code> shall not be a null pointer.</p>
|
|
<p>
|
|
<i>Effects</i>: Each function returns a <code>basic_string</code> object
|
|
constructed from the format string argument <code>fmt</code> with each
|
|
replacement field substituted with the character representation of the
|
|
argument it refers to, formatted according to the specification given in the
|
|
field.
|
|
</p>
|
|
<p><i>Returns</i>: The formatted string.</p>
|
|
<p><i>Throws</i>: <code>format_error</code> if <code>fmt</code> is not a valid
|
|
format string.</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<h2><a name="Implementation">Implementation</a></h2>
|
|
|
|
<p>
|
|
The ideas proposed in this paper have been implemented in the open-source fmt
|
|
library. TODO: link and mention other implementations (Boost Format, FastFormat)
|
|
</p>
|
|
|
|
<h2><a name="References">References</a></h2>
|
|
|
|
<p>
|
|
<a name="1">[1]</a>
|
|
<cite>The <code>fprintf</code> function. ISO/IEC 9899:2011. 7.21.6.1.</cite><br>
|
|
<a name="2">[2]</a>
|
|
<cite><a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/fprintf.html">
|
|
fprintf, printf, snprintf, sprintf - print formatted output</a>. The Open
|
|
Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Edition.</cite><br>
|
|
<a name="3">[3]</a>
|
|
<cite><a href="https://docs.python.org/3/library/string.html#format-string-syntax">
|
|
6.1.3. Format String Syntax</a>. Python 3.5.2 documentation.</cite><br>
|
|
<a name="4">[4]</a>
|
|
<cite><a href="https://msdn.microsoft.com/en-us/library/system.string.format(v=vs.110).aspx">
|
|
String.Format Method</a>. .NET Framework Class Library.</cite><br>
|
|
<a name="5">[5]</a>
|
|
<cite><a href="https://doc.rust-lang.org/std/fmt/">
|
|
Module <code>std::fmt</code></a>. The Rust Standard Library.</cite><br>
|
|
<a name="6">[6]</a>
|
|
<cite><a href="https://msdn.microsoft.com/en-us/library/56e442dc(v=vs.120).aspx">
|
|
Format Specification Syntax: printf and wprintf Functions</a>. C++ Language and
|
|
Standard Libraries.</cite><br>
|
|
<a name="7">[7]</a>
|
|
<cite><a href="ftp://ftp.gnu.org/old-gnu/Manuals/gawk-3.1.0/html_chapter/gawk_11.html">
|
|
10.4.2 Rearranging printf Arguments</a>. The GNU Awk User's Guide.</cite><br>
|
|
</p>
|
|
|
|
</body>
|