From d9ad7a8b05c508d0efb378d3dfec52008a3697c1 Mon Sep 17 00:00:00 2001 From: Arseny Kapoulkine Date: Thu, 26 Nov 2020 09:49:09 -0800 Subject: [PATCH] Update docs for 1.11 --- docs/manual.adoc | 39 ++++- docs/manual.html | 378 +++++++++++++++++++++++++------------------ docs/quickstart.adoc | 4 +- docs/quickstart.html | 284 ++++++++++++++++---------------- 4 files changed, 404 insertions(+), 301 deletions(-) diff --git a/docs/manual.adoc b/docs/manual.adoc index def9ced..1f80929 100644 --- a/docs/manual.adoc +++ b/docs/manual.adoc @@ -46,7 +46,7 @@ Thanks to *Vyacheslav Egorov* for documentation proofreading and fuzz testing. The pugixml library is distributed under the MIT license: .... -Copyright (c) 2006-2019 Arseny Kapoulkine +Copyright (c) 2006-2020 Arseny Kapoulkine Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation @@ -74,7 +74,7 @@ This means that you can freely use pugixml in your applications, both open-sourc .... This software is based on pugixml library (https://pugixml.org). -pugixml is Copyright (C) 2006-2019 Arseny Kapoulkine. +pugixml is Copyright (C) 2006-2020 Arseny Kapoulkine. .... [[install]] @@ -270,7 +270,7 @@ The XML document is represented with a tree data structure. The root of the tree [[xml_node_type]] The tree nodes can be of one of the following types (which together form the enumeration `xml_node_type`): -* Document node ([[node_document]]`node_document`) - this is the root of the tree, which consists of several child nodes. This node corresponds to <> class; note that <> is a sub-class of <>, so the entire node interface is also available. However, document node is special in several ways, which are covered below. There can be only one document node in the tree; document node does not have any XML representation. +* Document node ([[node_document]]`node_document`) - this is the root of the tree, which consists of several child nodes. 
This node corresponds to <> class; note that <> is a sub-class of <>, so the entire node interface is also available. However, document node is special in several ways, which are covered below. There can be only one document node in the tree; document node does not have any XML representation. Document generally has one child element node (see [[xml_document::document_element]]`document_element()`), although documents parsed from XML fragments (see [[parse_fragment]]`parse_fragment`) can have more than one. * Element/tag node ([[node_element]]`node_element`) - this is the most common type of node, which represents XML elements. Element nodes have a name, a collection of attributes and a collection of child nodes (both of which may be empty). The attribute is a simple name/value pair. The example XML representation of element nodes is as follows: + @@ -749,7 +749,7 @@ These flags control the resulting tree contents: * [[parse_embed_pcdata]]`parse_embed_pcdata` determines if PCDATA contents is to be saved as element values. Normally element nodes have names but not values; this flag forces the parser to store the contents as a value if PCDATA is the first child of the element node (otherwise PCDATA node is created as usual). This can significantly reduce the memory required for documents with many PCDATA nodes. To retrieve the data you can use `xml_node::value()` on the element nodes or any of the higher-level functions like `child_value` or `text`. This flag is *off* by default. Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is *off* by default. -* [[parse_fragment]]`parse_fragment` determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. 
text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid. This flag is *off* by default. +* [[parse_fragment]]`parse_fragment` determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes. This flag is *off* by default. CAUTION: Using in-place parsing (<>) with `parse_fragment` flag may result in the loss of the last character of the buffer if it is a part of PCDATA. Since PCDATA values are null-terminated strings, the only way to resolve this is to provide a null-terminated buffer as an input to `load_buffer_inplace` - i.e. `doc.load_buffer_inplace("test\0", 5, pugi::parse_default | pugi::parse_fragment)`. @@ -818,6 +818,7 @@ As for rejecting invalid XML documents, there are a number of incompatibilities * XML data is not required to begin with document declaration; additionally, document declaration can appear after comments and other nodes. * Invalid document type declarations are silently ignored in some cases. * Unicode validation is not performed so invalid UTF sequences are not rejected. +* Document can contain multiple top-level element nodes. 
[[access]] == Accessing document data @@ -1287,7 +1288,9 @@ bool xml_attribute::set_value(unsigned int rhs); bool xml_attribute::set_value(long rhs); bool xml_attribute::set_value(unsigned long rhs); bool xml_attribute::set_value(double rhs); +bool xml_attribute::set_value(double rhs, int precision); bool xml_attribute::set_value(float rhs); +bool xml_attribute::set_value(float rhs, int precision); bool xml_attribute::set_value(bool rhs); bool xml_attribute::set_value(long long rhs); bool xml_attribute::set_value(unsigned long long rhs); @@ -1378,16 +1381,18 @@ include::samples/modify_add.cpp[tags=code] [[modify.remove]] === Removing nodes/attributes -[[xml_node::remove_attribute]][[xml_node::remove_child]] +[[xml_node::remove_attribute]][[xml_node::remove_attributes]][[xml_node::remove_child]][[xml_node::remove_children]] If you do not want your document to contain some node or attribute, you can remove it with one of the following functions: [source] ---- bool xml_node::remove_attribute(const xml_attribute& a); +bool xml_node::remove_attributes(); bool xml_node::remove_child(const xml_node& n); +bool xml_node::remove_children(); ---- -`remove_attribute` removes the attribute from the attribute list of the node, and returns the operation result. `remove_child` removes the child node with the entire subtree (including all descendant nodes and attributes) from the document, and returns the operation result. Removing fails if one of the following is true: +`remove_attribute` removes the attribute from the attribute list of the node, and returns the operation result. `remove_child` removes the child node with the entire subtree (including all descendant nodes and attributes) from the document, and returns the operation result. `remove_attributes` removes all the attributes of the node, and returns the operation result. `remove_children` removes all the child nodes of the node, and returns the operation result. 
Removing fails if one of the following is true: * The node the function is called on is null; * The attribute/node to be removed is null; @@ -1437,7 +1442,9 @@ bool xml_text::set(unsigned int rhs); bool xml_text::set(long rhs); bool xml_text::set(unsigned long rhs); bool xml_text::set(double rhs); +bool xml_text::set(double rhs, int precision); bool xml_text::set(float rhs); +bool xml_text::set(float rhs, int precision); bool xml_text::set(bool rhs); bool xml_text::set(long long rhs); bool xml_text::set(unsigned long long rhs); @@ -2131,6 +2138,24 @@ Because of the differences in document object models, performance considerations :!numbered: +[[v1.11]] +=== v1.11 ^2020-11-26^ + +Maintenance release. Changes: + +* New features: + . Add xml_node::remove_attributes and xml_node::remove_children + . Add a way to customize floating point precision via xml_attribute::set_value and xml_text::set overloads + +* XPath improvements: + . XPath parser now limits recursion depth which prevents stack overflow on malicious queries + +* Compatibility improvements: + . Fix Visual Studio warnings when built using clang-cl compiler + . Fix Wconversion warnings in gcc + . Fix Wzero-as-null-pointer-constant warnings in pugixml.hpp + . 
Work around several static analysis false positives + [[v1.10]] === v1.10 ^2019-09-15^ @@ -2868,8 +2893,10 @@ const unsigned int +++parse_wnorm_attribute bool +++remove_attribute+++(const xml_attribute& a); bool +++remove_attribute+++(const char_t* name); + bool +++remove_attributes+++(); bool +++remove_child+++(const xml_node& n); bool +++remove_child+++(const char_t* name); + bool +++remove_children+++(); xml_parse_result +++append_buffer+++(const void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto); diff --git a/docs/manual.html b/docs/manual.html index 4a1d501..35acf77 100644 --- a/docs/manual.html +++ b/docs/manual.html @@ -2,22 +2,21 @@ - + - + -pugixml 1.10 manual +pugixml 1.11 manual - @@ -750,9 +686,9 @@ pugixml is Copyright (C) 2006-2019 Arseny Kapoulkine.

You can download the latest source distribution as an archive:

-

pugixml-1.10.zip (Windows line endings) +

pugixml-1.11.zip (Windows line endings) / -pugixml-1.10.tar.gz (Unix line endings)

+pugixml-1.11.tar.gz (Unix line endings)

The distribution contains library source, documentation (the manual you’re reading now and the quick start guide) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive.

@@ -773,7 +709,7 @@ pugixml is Copyright (C) 2006-2019 Arseny Kapoulkine.
git clone https://github.com/zeux/pugixml
 cd pugixml
-git checkout v1.10
+git checkout v1.11
@@ -790,7 +726,7 @@ git checkout v1.10
-
svn checkout https://github.com/zeux/pugixml/tags/v1.10 pugixml
+
svn checkout https://github.com/zeux/pugixml/tags/v1.11 pugixml
@@ -838,7 +774,7 @@ git checkout v1.10
-
pugixml.cpp(3477) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add '#include "stdafx.h"' to your source?
+
pugixml.cpp(3477) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add '#include "stdafx.h"' to your source?
@@ -1057,13 +993,13 @@ In that example PUGIXML_API is inconsistent between several source
  • -

    Document node (node_document) - this is the root of the tree, which consists of several child nodes. This node corresponds to xml_document class; note that xml_document is a sub-class of xml_node, so the entire node interface is also available. However, document node is special in several ways, which are covered below. There can be only one document node in the tree; document node does not have any XML representation.

    +

    Document node (node_document) - this is the root of the tree, which consists of several child nodes. This node corresponds to xml_document class; note that xml_document is a sub-class of xml_node, so the entire node interface is also available. However, document node is special in several ways, which are covered below. There can be only one document node in the tree; document node does not have any XML representation. Document generally has one child element node (see document_element()), although documents parsed from XML fragments (see parse_fragment) can have more than one.

  • Element/tag node (node_element) - this is the most common type of node, which represents XML elements. Element nodes have a name, a collection of attributes and a collection of child nodes (both of which may be empty). The attribute is a simple name/value pair. The example XML representation of element nodes is as follows:

    -
    <node attr="value"><child/></node>
    +
    <node attr="value"><child/></node>
    @@ -1074,7 +1010,7 @@ In that example PUGIXML_API is inconsistent between several source

    Plain character data nodes (node_pcdata) represent plain text in XML. PCDATA nodes have a value, but do not have a name or children/attributes. Note that plain character data is not a part of the element node but instead has its own node; an element node can have several child PCDATA nodes. The example XML representation of text nodes is as follows:

    -
    <node> text1 <child/> text2 </node>
    +
    <node> text1 <child/> text2 </node>
    @@ -1085,7 +1021,7 @@ In that example PUGIXML_API is inconsistent between several source

    Character data nodes (node_cdata) represent text in XML that is quoted in a special way. CDATA nodes do not differ from PCDATA nodes except in XML representation - the above text example looks like this with CDATA:

    -
    <node> <![CDATA[text1]]> <child/> <![CDATA[text2]]> </node>
    +
    <node> <![CDATA[text1]]> <child/> <![CDATA[text2]]> </node>
    @@ -1096,7 +1032,7 @@ In that example PUGIXML_API is inconsistent between several source

    Comment nodes (node_comment) represent comments in XML. Comment nodes have a value, but do not have a name or children/attributes. The example XML representation of a comment node is as follows:

    -
    <!-- comment text -->
    +
    <!-- comment text -->
    @@ -1107,7 +1043,7 @@ In that example PUGIXML_API is inconsistent between several source

    Processing instruction node (node_pi) represent processing instructions (PI) in XML. PI nodes have a name and an optional value, but do not have children/attributes. The example XML representation of a PI node is as follows:

    -
    <?name value?>
    +
    <?name value?>
    @@ -1118,7 +1054,7 @@ In that example PUGIXML_API is inconsistent between several source

    Declaration node (node_declaration) represents document declarations in XML. Declaration nodes have a name ("xml") and an optional collection of attributes, but do not have value or children. There can be only one declaration node in a document; moreover, it should be the topmost node (its parent should be the document). The example XML representation of a declaration node is as follows:

    -
    <?xml version="1.0"?>
    +
    <?xml version="1.0"?>
    @@ -1129,7 +1065,7 @@ In that example PUGIXML_API is inconsistent between several source

    Document type declaration node (node_doctype) represents document type declarations in XML. Document type declaration nodes have a value, which corresponds to the entire document type contents; no additional nodes are created for inner elements like <!ENTITY>. There can be only one document type declaration node in a document; moreover, it should be the topmost node (its parent should be the document). The example XML representation of a document type declaration node is as follows:

    -
    <!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>
    +
    <!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>
    @@ -1792,7 +1728,7 @@ You should use the usual bitwise arithmetics to manipulate the bitmask: to enabl Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is off by default.

  • -

    parse_fragment determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid. This flag is off by default.

    +

    parse_fragment determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes. This flag is off by default.

@@ -1972,6 +1908,9 @@ The current behavior for Unicode conversion is to skip all invalid UTF sequences
  • Unicode validation is not performed so invalid UTF sequences are not rejected.

  • +
  • +

    Document can contain multiple top-level element nodes.

    +
  • @@ -2661,7 +2600,9 @@ All attributes have name and value, both of which are strings (value may be empt bool xml_attribute::set_value(long rhs); bool xml_attribute::set_value(unsigned long rhs); bool xml_attribute::set_value(double rhs); +bool xml_attribute::set_value(double rhs, int precision); bool xml_attribute::set_value(float rhs); +bool xml_attribute::set_value(float rhs, int precision); bool xml_attribute::set_value(bool rhs); bool xml_attribute::set_value(long long rhs); bool xml_attribute::set_value(unsigned long long rhs); @@ -2833,17 +2774,19 @@ Nodes and attributes do not exist without a document tree, so you can’t cr

    6.3. Removing nodes/attributes

    -

    +

    If you do not want your document to contain some node or attribute, you can remove it with one of the following functions:

    bool xml_node::remove_attribute(const xml_attribute& a);
    -bool xml_node::remove_child(const xml_node& n);
    +bool xml_node::remove_attributes(); +bool xml_node::remove_child(const xml_node& n); +bool xml_node::remove_children();
    -

    remove_attribute removes the attribute from the attribute list of the node, and returns the operation result. remove_child removes the child node with the entire subtree (including all descendant nodes and attributes) from the document, and returns the operation result. Removing fails if one of the following is true:

    +

    remove_attribute removes the attribute from the attribute list of the node, and returns the operation result. remove_child removes the child node with the entire subtree (including all descendant nodes and attributes) from the document, and returns the operation result. remove_attributes removes all the attributes of the node, and returns the operation result. remove_children removes all the child nodes of the node, and returns the operation result. Removing fails if one of the following is true:

      @@ -2918,7 +2861,9 @@ If you do not want your document to contain some node or attribute, you can remo bool xml_text::set(long rhs); bool xml_text::set(unsigned long rhs); bool xml_text::set(double rhs); +bool xml_text::set(double rhs, int precision); bool xml_text::set(float rhs); +bool xml_text::set(float rhs, int precision); bool xml_text::set(bool rhs); bool xml_text::set(long long rhs); bool xml_text::set(unsigned long long rhs); @@ -4041,6 +3986,58 @@ If exceptions are disabled, then in the event of parsing failure the query is in

      9. Changelog

      +

      v1.11 2020-11-26

      +
      +

      Maintenance release. Changes:

      +
      +
      +
        +
      • +

        New features:

        +
        +
          +
        1. +

          Add xml_node::remove_attributes and xml_node::remove_children

          +
        2. +
        3. +

          Add a way to customize floating point precision via xml_attribute::set_value and xml_text::set overloads

          +
        4. +
        +
        +
      • +
      • +

        XPath improvements:

        +
        +
          +
        1. +

          XPath parser now limits recursion depth which prevents stack overflow on malicious queries

          +
        2. +
        +
        +
      • +
      • +

        Compatibility improvements:

        +
        +
          +
        1. +

          Fix Visual Studio warnings when built using clang-cl compiler

          +
        2. +
        3. +

          Fix Wconversion warnings in gcc

          +
        4. +
        5. +

          Fix Wzero-as-null-pointer-constant warnings in pugixml.hpp

          +
        6. +
        7. +

          Work around several static analysis false positives

          +
        8. +
        +
        +
      • +
      +
      +
      +

      v1.10 2019-09-15

      Maintenance release. Changes:

      @@ -5641,8 +5638,10 @@ If exceptions are disabled, then in the event of parsing failure the query is in bool remove_attribute(const xml_attribute& a); bool remove_attribute(const char_t* name); + bool remove_attributes(); bool remove_child(const xml_node& n); bool remove_child(const char_t* name); + bool remove_children(); xml_parse_result append_buffer(const void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto); @@ -5862,8 +5861,79 @@ If exceptions are disabled, then in the event of parsing failure the query is in
      + \ No newline at end of file diff --git a/docs/quickstart.adoc b/docs/quickstart.adoc index b4d621d..ee15665 100644 --- a/docs/quickstart.adoc +++ b/docs/quickstart.adoc @@ -255,7 +255,7 @@ If filing an issue is not possible due to privacy or other concerns, you can con The pugixml library is distributed under the MIT license: .... -Copyright (c) 2006-2019 Arseny Kapoulkine +Copyright (c) 2006-2020 Arseny Kapoulkine Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation @@ -283,5 +283,5 @@ This means that you can freely use pugixml in your applications, both open-sourc .... This software is based on pugixml library (https://pugixml.org). -pugixml is Copyright (C) 2006-2019 Arseny Kapoulkine. +pugixml is Copyright (C) 2006-2020 Arseny Kapoulkine. .... diff --git a/docs/quickstart.html b/docs/quickstart.html index 26173aa..57e6a98 100644 --- a/docs/quickstart.html +++ b/docs/quickstart.html @@ -2,22 +2,21 @@ - + - + -pugixml 1.10 quick start guide +pugixml 1.11 quick start guide - @@ -1095,8 +1030,79 @@ pugixml is Copyright (C) 2006-2019 Arseny Kapoulkine.
      + \ No newline at end of file