pugixml

Author	SHA1	Message	Date
Arseny Kapoulkine	774d5fe9df	XPath: Optimize insertion_sort The previous implementation opted for doing two comparisons per element in the sorted case in order to remove one iterator bounds check per moved element when we actually need to copy. In our case however the comparator is pretty expensive (except for remove_duplicates which is fast as it is) so an extra object comparison hurts much more than an iterator comparison saves. This makes sorting by document order up to 3% faster for random sequences.	2017-02-06 19:28:33 -08:00
Arseny Kapoulkine	8cc3144e7b	XPath: Remove redundant calls from xml_node::select_nodes et al Instead of delegating to a method that just forwards the call to xpath_query call the relevant method directly.	2017-02-05 21:52:30 -08:00
Arseny Kapoulkine	00e39c581a	XPath: Remove evaluate_string_impl It adds one stack frame to string query evaluation and does not really simplify the code.	2017-02-05 21:50:13 -08:00
Arseny Kapoulkine	bcc7ed57a2	XPath: Simplify evaluation error flow Instead of having two checks for out-of-memory when exceptions are enabled, do just one and decide what to do based on whether we can throw.	2017-02-03 20:33:40 -08:00
Arseny Kapoulkine	33159924b1	XPath: Clean up out-of-memory parse error handling Instead of relying on a specific string in the parse result, use allocator error state to report the error and then convert it to a string if necessary. We currently have to manually trigger the OOM error in two places because we use global allocator in rare cases; we don't really need to do this so this will be cleaned up later.	2017-02-02 18:40:20 -08:00
Arseny Kapoulkine	0e3ccc7396	Remove redundant branch from xml_node::path() The code works fine regardless of the *j->name check, and omitting this makes the code more symmetric between the "count" and "write" stage; additionally this improves coverage - due to how strcpy_insitu works it's not really possible to get an empty non-NULL name in the node.	2017-02-01 21:05:37 -08:00
Arseny Kapoulkine	9c7897b8d2	Remove null pointer test from first_element_by_path All other functions treat null pointer inputs as invalid; now this function does as well.	2017-01-30 23:55:31 -08:00
Arseny Kapoulkine	f500435cb4	XPath: Remove (re)allocate_throw and setjmp Now error handling in XPath implementation relies on explicit error propagation and is converted to an appropriate result at the end.	2017-01-30 22:31:57 -08:00
Arseny Kapoulkine	9e40c58532	XPath: Replace all (re)allocate_throw with (re)allocate_nothrow This generates some out-of-memory code paths that are not covered by existing tests, which will need to be resolved later.	2017-01-30 22:28:57 -08:00
Arseny Kapoulkine	c370d1190d	XPath: Fix reallocate_nothrow to preserve existing state Instead of rolling back the allocation and trying to allocate again, explicitly handle inplace reallocate if possible, and allocate a new block otherwise. This is going to be important once we use reallocate_nothrow from a non-throwing context.	2017-01-30 22:10:13 -08:00
Arseny Kapoulkine	1a2e4b88ee	XPath: Use nonthrowing allocations in duplicate_string This requires explicit error handling for xpath_string::data calls.	2017-01-30 21:58:53 -08:00
Arseny Kapoulkine	ac150d504e	XPath: Throw std::bad_alloc if we got an out-of-memory error This allows us to gradually convert exception handling of out-of-memory during evaluation to a non-throwing approach without changing the observable behavior.	2017-01-30 21:58:53 -08:00
Arseny Kapoulkine	1b3e8614e7	XPath: Reword brace mismatch errors for clarity	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	1ed6d2102b	XPath: Improve error message for expressions like .[1] W3C specification does not allow predicates after abbreviated steps. Currently this results in parsing terminating at the step, which leads to confusing error messages like "Invalid query" or "Unmatched braces".	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	bc1e444694	XPath: Track allocation errors more explicitly Any time an allocation fails xpath_allocator can set an externally provided bool. The plan is to keep this bool up until evaluation ends, so that we can use it to discard the potentially malformed result.	2017-01-30 11:51:07 -08:00
Arseny Kapoulkine	635fe02801	XPath: Provide non-throwing and throwing allocations in xpath_allocator For both allocate and reallocate, provide both _nothrow and _throw functions; this change renames allocate() to allocate_throw() (same for reallocate) to make it easier to change the code to remove throwing variants.	2017-01-29 22:02:58 -08:00
Arseny Kapoulkine	6abf1d7c1a	XPath: Minor error handling refactoring Handle node type error before creating expression node	2017-01-29 21:53:23 -08:00
Arseny Kapoulkine	4fa2241d7b	XPath: Route out-of-memory errors through the exceptionless path We currently need to convert error based on the text to a different type of C++ exceptions when C++ exceptions are enabled.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	bd8e2d782e	XPath: Forward all node constructors through alloc_node This allows us to handle OOM during node allocation without triggering undefined behavior that occurs when placement new gets a NULL pointer.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	293fccf3b0	XPath: Do not use exceptions to propagate parsing errors Instead, return 0 and rely on parsing logic to propagate that all the way down, and convert result to exception to maintain existing interface.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	7bb433b141	XPath: Assume that every function can fail and return 0 Propagate the failure to the caller manually. This is a first step to parser structure that does not depend on exceptions or longjmp for error handling (and thus matches the XML parser). To preserve semantics we'll have to convert error code to exception later.	2017-01-29 21:09:12 -08:00
Arseny Kapoulkine	d72c0763f9	XPath: Minor parsing refactoring Simplify function argument parsing by folding arg 0 parsing into the main loop, reuse expression parsing logic for unary expression	2017-01-29 20:15:14 -08:00
Arseny Kapoulkine	60e580c2a8	XPath: Remove parse_function_helper It was only used in three places and didn't really make the code more readable.	2017-01-29 20:04:34 -08:00
Arseny Kapoulkine	f11c4d6847	XPath: alloc_string no longer returns NULL NULL return value will be reserved for the OOM error indicator.	2017-01-29 20:00:44 -08:00
Arseny Kapoulkine	d3b9e4e1e8	Update copyright year to 2017	2017-01-26 20:12:06 -08:00
Arseny Kapoulkine	05edb250ee	Work around cray++ compiler issue It's still not clear as to what exactly makes it emit this error when compiling string_to_integer: CC-3059 crayc++: INTERNAL __C_FILE_SCOPE_DATA__, File = <pugixml>/src/pugixml.cpp, Line = 4524, Column = 4 Expected no overflow in routine. But a viable workaround for now is to exploit the knowledge that it uses two's-complement arithmetic and invert the sign manually. Fixes #125.	2016-12-01 20:49:46 -08:00
Arseny Kapoulkine	8df9f97cda	Silence 'cast increases required alignment of target type' warnings These warnings are emitted on some GCC versions when targeting ARM; the alignment is guaranteed to be correct due to how page offsets are set up but the compiler doesn't know.	2016-11-18 09:49:31 -08:00
Arseny Kapoulkine	9366f25136	Rename set_value_convert to set_value_bool It's too dangerous to overload here - easy to accidentally mix floating point path with boolean one.	2016-11-17 21:37:27 -08:00
Arseny Kapoulkine	2af2524db5	Fix 'comparison of unsigned expression < 0 is always false' warnings Unfortunately, some compilers don't suppress these kinds of warnings in template instantiations; solve this by moving the responsibility for computing negative bool to the caller. Also since we're doing that we don't really need to convert to unsigned in the implementation - might as well have the caller do it, which removes some type dispatch logic and slightly reduces binary size.	2016-11-17 21:33:54 -08:00
Arseny Kapoulkine	1e23402eb2	Change status_end_element_mismatch to point to closing tag name Previously the error offset pointed to the first mismatching character, which can be confusing especially if the start tag name is a prefix of the end tag name. Instead, move the offset to the first character of the name - that way it should be more obvious that the problem is that the entire name mismatches. Fixes #112.	2016-11-13 16:59:14 -08:00
Arseny Kapoulkine	cd7e0b04f6	Add format_no_empty_element_tags flag Setting this flag outputs start and end tag for every element, including empty elements. Fixes #118.	2016-11-09 09:11:30 -08:00
Arseny Kapoulkine	c75e3c45e5	Update version to 1.8 everywhere	2016-11-09 09:02:44 -08:00
Arseny Kapoulkine	17a215523c	XPath: Fix source indentation Split some lines into two and add braces in some places to make the code more readable.	2016-11-08 07:14:59 -08:00
Arseny Kapoulkine	e4c43a0aa2	Move compact hash table pointer setup to xml_document This keeps all code that creates document/allocator/page structures together.	2016-11-07 19:31:34 -08:00
Arseny Kapoulkine	9bc497267b	Remove xml_allocator copying during parsing The separate copy of allocator state in parser was meant to increase parsing performance by reducing aliasing/indirection, but benchmarks against the current source don't indicate that this is worthwhile. Removing this simplifies the code slightly and makes it possible to move compact hash table to the allocator.	2016-11-07 08:43:14 -08:00
Arseny Kapoulkine	2f98c62172	Rename xml_document::create/destroy for consistency	2016-11-07 08:22:54 -08:00
Arseny Kapoulkine	0d015e9a2c	Reduce MSVC version cutoff for move semantics support MSVC 2010 supported move semantics (partially - but should be good enough for our use case).	2016-11-06 11:51:16 -08:00
Arseny Kapoulkine	aa117cce42	Refactor move semantics support detection Do it in one place and set PUGIXML_HAS_MOVE if it's available.	2016-11-06 11:49:10 -08:00
iFarbod	b3fc28d177	Add VS2013 check for C++11 availability (#121) VS 2013 supports C++11, but the __cplusplus macro isn't updated and is still 199711, so the old check always fails even though the compiler supports C++11.	2016-11-06 11:43:03 -08:00
Pavel Kryukov	d0b0cc75ad	Fix a comment before PUGIXML_OVERRIDE macro	2016-10-18 00:53:00 +03:00
Pavel Kryukov	3b58103157	Add 'override' keyword if C++11 is enabled	2016-10-05 20:11:07 +03:00
Arseny Kapoulkine	666a01d335	Use references for output variables While I grew to dislike references for this case, there are other functions in the source that use references so switch to that for consistency.	2016-07-15 19:12:21 -05:00
Arseny Kapoulkine	70d7c7904e	Implement encoding detection by name. This adds about 40 cycles for parsing <?xml version='1.0'?> declaration and about 70 cycles for parsing <?xml version='1.0' encoding='utf-8'?>, as measured on a Core i7, which should be negligible for all documents. Fixes #16.	2016-07-14 22:44:23 -07:00
Arseny Kapoulkine	2d5980b406	Adjust XML allocation pages to have the exact specified size Previously the page size was defining the data size, and due to additional headers (+ recently removed allocation padding) the actual allocation was a bit bigger. The problem is that some allocators round 2^N+k allocations to 2^N+M, which can result in noticeable waste of space. Specifically, on 64-bit OSX allocating the previous page size (32k+40) resulted in 32k+512 allocation, thereby wasting 472 bytes, or 1.4%. Now we have the allocation size specified exactly and just recompute the available data size, which can result in small space savings depending on the allocator.	2016-04-14 08:43:06 -07:00
Arseny Kapoulkine	2e0ed8284b	Remove extra space in an empty tag for format_raw When using format_raw the space in the empty tag (<node />) is the only character that does not have to be there; so format_raw almost results in a minimal XML but not quite. It's pretty unlikely that this is crucial for any users - the formatting change should be benign, and it's better to improve format_raw than to add yet another flag. Fixes #87.	2016-04-14 00:30:24 -07:00
Arseny Kapoulkine	c6539ccef0	Refactor auto_deleter now that we only need to support one signature Also rename auto_deleter_fclose to close_file.	2016-04-03 13:30:34 -07:00
QUSpilPrgm	0564d55e19	Do not assume that fclose can be converted to int(*)(FILE*) because some compilers use a special calling convention for stdlib functions like fclose	2016-03-24 17:33:10 +01:00
Arseny Kapoulkine	607e46f209	Refactor conversion from integer to string Unify the implementations by automatically deducing the unsigned type from its signed counterpart. That allows us to use a templated function instead of duplicating code.	2016-02-02 10:44:35 -08:00
Arseny Kapoulkine	f441c63ea4	Implement set/set_value/operator= for long types This makes the coverage for basic numeric types complete (sans long double). Fixes #78.	2016-02-02 08:39:45 -08:00
Stephan Beyer	f7aa65db8a	Fix whitespace issues Git warns when it finds "whitespace errors". This commit gets rid of these whitespace errors for code and adoc files.	2016-01-24 14:05:44 +01:00


653 Commits