pugixml

Author	SHA1	Message	Date
Arseny Kapoulkine	257fbb4e1b	Use raw pointers in xml_node::traverse implementation This makes it a bit faster and matches other internal code better.	2017-11-13 19:29:42 -08:00
Arseny Kapoulkine	344c74a74c	XPath: Always allocate xpath_strings on temporary stack for concat The static_buffer optimization seems to come from the time where the on-heap buffer was allocated using global memory operations. At this point the temporary buffer and temporary string storage all come from the evaluation stack (that can be partially allocated on heap...), so the extra logic isn't relevant for performance.	2017-11-13 19:10:36 -08:00
Arseny Kapoulkine	3860b5076f	Fix -Wshadow warning	2017-11-13 09:27:38 -08:00
Arseny Kapoulkine	4bd8771c2f	Implement correct move error handling for compact mode In compact mode, we currently can not support zero-allocation moves since some pointer assignments required during the move need to allocate hash table slots. This is mostly applicable to xml_document_struct::first_child, since the pointer to this element is used as a hash table key, but there are some contrived cases where parents of root's children need a hash slot and didn't have it before. These cases can be fixed by changing the compact encoding to be a bit more move friendly, but for now it's easier to handle the error and throw/return during move. When this happens, the source document doesn't change.	2017-11-13 08:57:16 -08:00
Arseny Kapoulkine	91a3c28862	Add count argument to compact_hash_table::rehash/reserve This allows us to do a single reserve for a known amount of assignments that is larger than the default minimum per reserve (16).	2017-11-13 08:37:34 -08:00
Arseny Kapoulkine	3af93a39d7	Clarify a note about compact hash behavior during move After move some nodes in the hash table can have keys that point to other; this makes the table somewhat larger but this does not impact correctness. The reason is that for us to access a key in the hash table, there should be a compact_pointer/string object with the state indicating that it is stored in a hash table, and with the address matching the key. For this to happen, we had to have put this object into this state which would mean that we'd overwrite the hash entry with the new, correct value. When nodes/pages are being removed, we do not clean up keys from the hash table - it's safe for the same reason, and thus move doesn't introduce additional contracts here.	2017-10-20 21:57:14 -07:00
Arseny Kapoulkine	febf25d1af	Fix -Wshadow warning	2017-09-25 21:48:37 -07:00
Arseny Kapoulkine	a567f12d76	Implement move support for xml_document This change implements the initial version of move construction and assignment support for documents. When moving a document to another document, we always make sure move target is in "clean" state (empty document), and proceed by relocating all structures in the most efficient way possible. Complications arise from the fact that the root (document) node is embedded into xml_document object, so all pointers to it have to change; this includes parent pointers of all first-level children as well as allocator pointers in all memory pages and previous pointer in the first on-heap memory page. Additionally, compact mode makes everything even more complicated because some of the pointers we need to update are stored in the hash table (in fact, document first_child pointer is very likely to be there; some parent pointers in first-level children will be using compact_shared_parent but some won't be) which requires allocating a new hash table which can fail. Some details of this process are not fully fleshed out, especially for compact mode; and this definitely requires many tests.	2017-09-25 19:31:18 -07:00
Arseny Kapoulkine	77d7e60379	Fix Clang/C2 compatibility Clang/C2 does not implement __builtin_expect; additionally we need to work around deprecation warnings for fopen by disabling them.	2017-07-17 22:15:35 -07:00
Arseny Kapoulkine	853333cd70	Use PUGI__MSVC_CRT_VERSION instead of _MSC_VER It's not clear whether we still need PUGI__MSVC_CRT_VERSION, but it's more consistent for now to use it for _snprintf_s since this is relying on a CRT extension, not on a compiler feature.	2017-06-22 20:28:06 -07:00
Arseny Kapoulkine	2252927c04	Deprecate xml_document::load(const char*) and xml_node::select_single_node These functions were deprecated via comments in 1.5 but never got the deprecated attribute; now is the time! Using deprecated functions produces a warning; to silence it, this change moves the relevant tests to a separate translation unit that has deprecation disabled.	2017-06-22 09:13:10 -07:00
Arseny Kapoulkine	208e2cf043	Change PUGI__SNPRINTF to use _countof for MSVC The macro only works correctly when the input argument is an array with a statically known size - pointers or arrays decayed to pointers won't work silently. While this is unlikely to surface issues that aren't caught in tests/code review, use _countof for MSVC to prevent such code from compiling.	2017-06-19 07:06:47 -07:00
Arseny Kapoulkine	b6995f06b9	Fix BorlandC compilation Rename partition to partition3 to resolve conflicts with std::partition.	2017-06-16 00:32:01 -07:00
Arseny Kapoulkine	95f013ba80	Refactor snprintf support Instead of branching code at each invocation site, use variadic macros to create a wrapping macro that uses snprintf for the buffer of a statically known size. Variadic macros are supported by all C++11 compilers, as is snprintf; on MSVC 2005+ we don't necessarily have snprintf, but we can use _snprintf_s with _TRUNCATE to get the same behavior. In all other cases we fall back to sprintf, which (theoretically) can lead to a stack buffer overflow. In practice all snprintfs used in pugixml use buffers that should be large enough to never overflow, but snprintf is safe even if this is not the case.	2017-06-15 23:35:20 -07:00
Arseny Kapoulkine	207bc788e9	Use buffer with a static size in convert_number_to_mantissa_exponent We use references to arrays elsewhere in the codebase and there's just one caller for this function so it's easier to fix the size. This will simplify snprintf refactoring.	2017-06-15 22:58:46 -07:00
Arseny Kapoulkine	cd2804d3ee	Merge pull request #145 from noresources/snprintf use snprintf instead of sprintf	2017-06-15 21:34:04 -07:00
Arseny Kapoulkine	b3b44841f0	Mark all assert(false) statements as unreachable Now we can exclude these from code coverage since it's logically impossible to hit them in tests.	2017-06-15 09:26:23 -07:00
Renaud Guillard	0d8022eced	use snprintf if available, _snprintf or sprintf otherwise	2017-06-11 18:33:28 +02:00
Renaud Guillard	810f1f600d	use _snprintf if MSVC	2017-06-05 13:31:58 +02:00
Renaud Guillard	b5e9d933ad	use snprintf instead of sprintf	2017-06-04 21:10:19 +02:00
Arseny Kapoulkine	38edf255ae	Work around -fsanitize=integer issues Integer sanitizer is flagging unsigned integer overflow in several functions in pugixml; unsigned integer overflow is well defined but it may not necessarily be intended. Apart from hash functions, both string_to_integer and integer_to_string use unsigned overflow - string_to_integer uses it to perform two's-complement negation so that the bulk of the operation can run using unsigned integers. This makes it possible to simplify overflow checking. Similarly integer_to_string negates the number before generating a decimal representation, but negating is impossible without unsigned overflow or special-casing certain integer limits. For now just silence the integer overflow using a special attribute; also move unsigned overflow into string_to_integer from get_value_* so that we have fewer functions marked with the attribute. Fixes #133.	2017-04-03 23:35:24 -07:00
Arseny Kapoulkine	101f32884f	Add missing PUGI__FN to string_to_integer	2017-03-21 22:06:19 -07:00
Arseny Kapoulkine	956be4ca4b	Revert "Fix gcc-4.8 compilation warning when using -Wstrict-overflow" This reverts commit `79109a8546`. This warning does not happen on gcc-4.8.4; the workaround introduces an unsigned integer overflow which results in a runtime error when compiled with integer sanitizer.	2017-03-21 21:57:16 -07:00
Stephan Beyer	87fc170cdf	Silence g++ 7.0.1 -Wimplicit-fallthrough warnings This is accomplished by putting a // fallthrough comment at the right place. This seems to be more portable than an attribute-based solution like [[fallthrough]] or __attribute__((fallthrough)).	2017-03-05 22:12:10 +01:00
Arseny Kapoulkine	8ce4592e15	Simplify compact_hash_table implementation Instead of a separate implementation for find/insert, use just one that can do both. This reduces the code size and simplifies code coverage; the resulting code is close to what we had in terms of performance and since hash table is a fall back should not affect any real workloads.	2017-03-03 07:11:22 -08:00
Arseny Kapoulkine	0991c1d283	Add invalid type assertion for offset_debug This will make sure we don't forget to implement offset_debug for new node types if they ever happen (really it's mostly for consistency).	2017-02-07 20:34:49 -08:00
Arseny Kapoulkine	2162a0d80c	XPath: Simplify sorting implementation Instead of a complicated partitioning scheme that tries to maintain the equal area in the middle, use a scheme where we keep the equal area in the left part of the array and then move it to the middle. Since generally sorted arrays don't contain many duplicates this extra copy is not too expensive, and it significantly simplifies the logic and maintains good complexity for sorting arrays with many equal elements nonetheless (unlike Hoare partitioning). Instead of a median of 9 just use a median of 3 - it performs pretty much identically on some internal performance tests, despite needing a few more comparisons in some cases. Finally, change the insertion sort threshold to 16 elements since that appears to have slightly better performance.	2017-02-07 00:05:50 -08:00
Arseny Kapoulkine	774d5fe9df	XPath: Optimize insertion_sort The previous implementation opted for doing two comparisons per element in the sorted case in order to remove one iterator bounds check per moved element when we actually need to copy. In our case however the comparator is pretty expensive (except for remove_duplicates which is fast as it is) so an extra object comparison hurts much more than an iterator comparison saves. This makes sorting by document order up to 3% faster for random sequences.	2017-02-06 19:28:33 -08:00
Arseny Kapoulkine	8cc3144e7b	XPath: Remove redundant calls from xml_node::select_nodes et al Instead of delegating to a method that just forwards the call to xpath_query call the relevant method directly.	2017-02-05 21:52:30 -08:00
Arseny Kapoulkine	00e39c581a	XPath: Remove evaluate_string_impl It adds one stack frame to string query evaluation and does not really simplify the code.	2017-02-05 21:50:13 -08:00
Arseny Kapoulkine	bcc7ed57a2	XPath: Simplify evaluation error flow Instead of having two checks for out-of-memory when exceptions are enabled, do just one and decide what to do based on whether we can throw.	2017-02-03 20:33:40 -08:00
Arseny Kapoulkine	33159924b1	XPath: Clean up out-of-memory parse error handling Instead of relying on a specific string in the parse result, use allocator error state to report the error and then convert it to a string if necessary. We currently have to manually trigger the OOM error in two places because we use global allocator in rare cases; we don't really need to do this so this will be cleaned up later.	2017-02-02 18:40:20 -08:00
Arseny Kapoulkine	0e3ccc7396	Remove redundant branch from xml_node::path() The code works fine regardless of the *j->name check, and omitting this makes the code more symmetric between the "count" and "write" stage; additionally this improves coverage - due to how strcpy_insitu works it's not really possible to get an empty non-NULL name in the node.	2017-02-01 21:05:37 -08:00
Arseny Kapoulkine	9c7897b8d2	Remove null pointer test from first_element_by_path All other functions treat null pointer inputs as invalid; now this function does as well.	2017-01-30 23:55:31 -08:00
Arseny Kapoulkine	f500435cb4	XPath: Remove (re)allocate_throw and setjmp Now error handling in XPath implementation relies on explicit error propagation and is converted to an appropriate result at the end.	2017-01-30 22:31:57 -08:00
Arseny Kapoulkine	9e40c58532	XPath: Replace all (re)allocate_throw with (re)allocate_nothrow This generates some out-of-memory code paths that are not covered by existing tests, which will need to be resolved later.	2017-01-30 22:28:57 -08:00
Arseny Kapoulkine	c370d1190d	XPath: Fix reallocate_nothrow to preserve existing state Instead of rolling back the allocation and trying to allocate again, explicitly handle inplace reallocate if possible, and allocate a new block otherwise. This is going to be important once we use reallocate_nothrow from a non-throwing context.	2017-01-30 22:10:13 -08:00
Arseny Kapoulkine	1a2e4b88ee	XPath: Use nonthrowing allocations in duplicate_string This requires explicit error handling for xpath_string::data calls.	2017-01-30 21:58:53 -08:00
Arseny Kapoulkine	ac150d504e	XPath: Throw std::bad_alloc if we got an out-of-memory error This allows us to gradually convert exception handling of out-of-memory during evaluation to a non-throwing approach without changing the observable behavior.	2017-01-30 21:58:53 -08:00
Arseny Kapoulkine	1b3e8614e7	XPath: Reword brace mismatch errors for clarity	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	1ed6d2102b	XPath: Improve error message for expressions like .[1] W3C specification does not allow predicates after abbreviated steps. Currently this results in parsing terminating at the step, which leads to confusing error messages like "Invalid query" or "Unmatched braces".	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	bc1e444694	XPath: Track allocation errors more explicitly Any time an allocation fails xpath_allocator can set an externally provided bool. The plan is to keep this bool up until evaluation ends, so that we can use it to discard the potentially malformed result.	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	635fe02801	XPath: Provide non-throwing and throwing allocations in xpath_allocator For both allocate and reallocate, provide both _nothrow and _throw functions; this change renames allocate() to allocate_throw() (same for reallocate) to make it easier to change the code to remove throwing variants.	2017-01-29 22:02:58 -08:00
Arseny Kapoulkine	6abf1d7c1a	XPath: Minor error handling refactoring Handle node type error before creating expression node	2017-01-29 21:53:23 -08:00
Arseny Kapoulkine	4fa2241d7b	XPath: Route out-of-memory errors through the exceptionless path We currently need to convert error based on the text to a different type of C++ exceptions when C++ exceptions are enabled.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	bd8e2d782e	XPath: Forward all node constructors through alloc_node This allows us to handle OOM during node allocation without triggering undefined behavior that occurs when placement new gets a NULL pointer.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	293fccf3b0	XPath: Do not use exceptions to propagate parsing errors Instead, return 0 and rely on parsing logic to propagate that all the way down, and convert result to exception to maintain existing interface.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	7bb433b141	XPath: Assume that every function can fail and return 0 Propagate the failure to the caller manually. This is a first step to parser structure that does not depend on exceptions or longjmp for error handling (and thus matches the XML parser). To preserve semantics we'll have to convert error code to exception later.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	d72c0763f9	XPath: Minor parsing refactoring Simplify function argument parsing by folding arg 0 parsing into the main loop, reuse expression parsing logic for unary expression	2017-01-29 20:15:14 -08:00
Arseny Kapoulkine	60e580c2a8	XPath: Remove parse_function_helper It was only used in three places and didn't really make the code more readable.	2017-01-29 20:04:34 -08:00
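
The unsigned-negation idea described in commit 38edf255ae can be illustrated with a small sketch. This is not pugixml's actual string_to_integer: the function name `parse_int_sketch`, the clamping behavior on overflow, and the decimal-only input are all assumptions made for illustration. It accumulates the magnitude in an unsigned integer, checks overflow against a single limit, and negates with `0u - result`, which wraps modulo 2^N and is well defined for unsigned types.

```cpp
#include <climits>

// Hypothetical helper for illustration only; not pugixml's implementation.
// Parses an optionally signed decimal string and clamps on overflow.
int parse_int_sketch(const char* s)
{
    bool negative = (*s == '-');
    if (*s == '-' || *s == '+') ++s;

    // Largest magnitude we may produce: INT_MAX, or INT_MAX + 1 (i.e. |INT_MIN|)
    // for negative inputs. Both fit in unsigned int.
    const unsigned int limit = static_cast<unsigned int>(INT_MAX) + (negative ? 1u : 0u);

    unsigned int result = 0;

    for (; *s >= '0' && *s <= '9'; ++s)
    {
        unsigned int digit = static_cast<unsigned int>(*s - '0');

        // result * 10 + digit would exceed limit: clamp and stop.
        if (result > (limit - digit) / 10)
        {
            result = limit;
            break;
        }

        result = result * 10 + digit;
    }

    // Negate in unsigned arithmetic; the wraparound is well defined.
    // Converting the out-of-range unsigned value back to int is
    // implementation-defined before C++20, but yields the expected
    // two's-complement value on mainstream compilers.
    return negative ? static_cast<int>(0u - result) : static_cast<int>(result);
}
```

Because only one unsigned accumulator is involved, the overflow check is a single comparison per digit, and INT_MIN needs no special case. An integer sanitizer would still flag the `0u - result` wraparound, which is why the commit silences it with an attribute rather than restructuring the arithmetic.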


680 Commits