-
- Introduce `limits_error` method in `QPDFParser` for centralized limit-related error handling. - Enhance warnings and error messages with detailed limit identifiers (e.g., `parser-max-nesting`). - Refactor limit checks to improve maintainability and ensure uniformity in error reporting. - Update tests and output to reflect adjusted error handling approach.
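The centralized error path described above can be sketched as follows. This is a hypothetical illustration only; the class and method names here (other than the `parser-max-nesting` identifier quoted in the entry) are assumptions, not qpdf's actual API.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical sketch: one helper formats every limit-related parser
// error uniformly, tagging it with a stable identifier such as
// "parser-max-nesting". Names are illustrative, not qpdf's real API.
class ParserLimits
{
  public:
    static std::string
    limits_error(std::string const& limit_id, std::string const& detail)
    {
        // Every limit violation goes through this single code path so
        // the message format and identifier are consistent everywhere.
        return "limit exceeded (" + limit_id + "): " + detail;
    }
};
```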
-
…ted tests, documentation, and references across the codebase.
-
Enhance the `global` namespace by introducing `limit_errors` for tracking the number of exceeded limits. Update related tests and documentation to ensure functionality and clarity.
-
…sting return values for uninitialized objects, and cleaning up error handling logic.
-
…ndles on invalid input.
-
…eplacing redundant logic with centralized functions, and streamlining bad token handling for improved readability and maintainability.
-
…raints, replacing hardcoded values for nesting, container size, and error limits.
-
…oved shared pointer handling, remove deprecated object methods, and update all references.
-
Relocate `Objects` to `QPDF::Doc` for improved encapsulation of object-related logic. Adjust all relevant methods and references to use the new placement.
-
Relocate `reconstructed_xref` to `QPDF::Doc` for improved encapsulation of cross-reference reconstruction state. Adjust all references to use the updated placement.
-
Relocate `ParseGuard` to `QPDF::Doc` for better encapsulation of parsing logic. Adjust references in `QPDFParser` accordingly to use the new placement.
-
…std::string_view`, improving performance and code clarity.
-
Enhanced handling of unexpected tokens during xref table reconstruction. Adjusted logic for invalid tokens, ensuring better robustness when parsing corrupt PDF files.
-
Implemented stricter sanity checks to handle unexpected tokens like array/dictionary close and endobj/endstream more effectively. Improved warning messages and handling of corrupt objects to enhance PDF parsing robustness.
-
Converted multiple occurrences of `count()` to `contains()` throughout the codebase where the goal was to check key existence in containers. This improves code readability and aligns with modern C++ practices, particularly with C++20, making the intent more explicit and potentially aiding performance.
-
Reject objects containing arrays or dictionaries with more than 5000 elements. We are by definition dealing with damaged files, and such objects are extremely likely to be invalid or malicious.
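The sanity check described above amounts to a simple size test once the parser is in damage-recovery mode. A minimal sketch, assuming an illustrative constant and function name (only the 5000 limit comes from the entry):

```cpp
#include <cstddef>

// Hypothetical sketch of the container-size sanity check: when
// recovering from a damaged file, any array or dictionary with more
// than 5000 elements is rejected as almost certainly invalid or
// malicious. Constant and function names are illustrative.
constexpr std::size_t max_damaged_container_size = 5000;

inline bool
container_too_large(std::size_t n_elements)
{
    return n_elements > max_damaged_container_size;
}
```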
-
... containing objects with no white-space between them. To enforce the rule that objects end at the start-offset of the next object, each object is parsed in its own object stream. To facilitate this, a new private API input source `is::OffsetBuffer` has been added which only contains the object but reports offsets relative to the start of the object stream. This is adapted from OffsetInputSource by changing the direction of the offset, endowing it with its own BufferInputSource and stripping out checks duplicated in BufferInputSource. Fixes the expected failure in the test case added in #1266.
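The offset translation described above can be sketched as a buffer that holds only one object's bytes but reports positions relative to the start of the containing object stream. This is a hypothetical illustration under assumed names, not the actual `is::OffsetBuffer` implementation.

```cpp
#include <cstddef>

// Hypothetical sketch: the buffer holds only one object's bytes, but
// positions reported to callers are relative to the start of the
// containing object stream. Names and members are illustrative.
class OffsetBufferSketch
{
  public:
    OffsetBufferSketch(std::size_t object_offset, std::size_t object_size) :
        object_offset_(object_offset),
        object_size_(object_size)
    {
    }

    // tell() adds the object's offset within the stream -- the reverse
    // direction from an OffsetInputSource-style wrapper, which
    // subtracts a global offset from the underlying position.
    std::size_t
    tell() const
    {
        return object_offset_ + pos_;
    }

    bool
    advance(std::size_t n)
    {
        if (pos_ + n > object_size_) {
            return false; // cannot read past the end of the object
        }
        pos_ += n;
        return true;
    }

  private:
    std::size_t object_offset_;
    std::size_t object_size_;
    std::size_t pos_{0};
};
```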
-
Add static parse methods. Make all external access to QPDFParser through static methods. Make all non-static methods including constructors private.
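The access pattern described above looks roughly like this. The class and member names are illustrative, not QPDFParser's actual interface:

```cpp
#include <string>
#include <utility>

// Sketch of the pattern: the constructor is private, so the static
// parse method is the only public entry point; it owns the parser
// instance for the duration of the call. Names are illustrative.
class Parser
{
  public:
    static std::string
    parse(std::string const& input)
    {
        Parser p(input);
        return p.run();
    }

  private:
    explicit Parser(std::string input) :
        input_(std::move(input))
    {
    }

    std::string
    run()
    {
        return "parsed:" + input_;
    }

    std::string input_;
};
```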
-
Only build strings when needed.
-
#1349 introduced a limit on the maximum size of arrays and dictionaries contained in objects that generate errors during parsing, and #1354 reduced that limit to 5000 objects. However, the limit was only imposed once a further error was encountered. Stop adding objects to containers once the limit is reached. Fixes oss-fuzz issue 398060137
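The shape of the fix can be sketched as a guard on insertion rather than a check deferred until the next error. Everything below except the 5000 limit is an assumed, illustrative name:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of the fix: instead of testing the container
// size only when a later error occurs, refuse to append once the
// limit is reached, so a damaged file cannot grow a container
// without bound. Names are illustrative.
constexpr std::size_t max_container_size = 5000;

inline bool
try_append(std::vector<std::string>& items, std::string value)
{
    if (items.size() >= max_container_size) {
        // Drop the element rather than growing past the limit.
        return false;
    }
    items.push_back(std::move(value));
    return true;
}
```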
-
This improves indentation of long strings. This commit also fixes some trailing whitespace in ChangeLog.
-
Reduce the container size for which a single bad token will cause a failure from 100,000 to 5,000. Count missing dictionary keys as errors.
-
Fail if a bad token is encountered while parsing an array or dictionary with more than 100,000 elements. Fixes oss-fuzz case 388571629.
-
Currently, QPDFParser gives up attempting to parse an object if 5 near-consecutive bad tokens are encountered. Add a limit of a total of 15 bad tokens in a single object before giving up.
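The two thresholds combine into a small two-counter policy, sketched below. The class name and reset rule are assumptions for illustration; only the limits of 5 near-consecutive and 15 total bad tokens come from the entry.

```cpp
#include <cstddef>

// Sketch of the two-counter policy: give up after 5 near-consecutive
// bad tokens or 15 bad tokens in total within one object. Names and
// the reset rule are illustrative.
class BadTokenBudget
{
  public:
    // Returns true when parsing should stop.
    bool
    record_bad_token()
    {
        ++consecutive_;
        ++total_;
        return consecutive_ >= 5 || total_ >= 15;
    }

    void
    record_good_token()
    {
        // A good token resets the consecutive run but not the total.
        consecutive_ = 0;
    }

  private:
    std::size_t consecutive_{0};
    std::size_t total_{0};
};
```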
-
Create unresolved objects only for objects in the xref table (except during parsing of the xref table). Do not add indirect nulls into the object cache as the result of a cache miss during a call to `getObject` except during parsing or creation/updating from JSON. To support this behaviour, add new private methods `getObjectForParser` and `getObjectForJSON`. As a result of this change, dangling references are treated as direct nulls rather than indirect nulls.
-
Also, don't search for /Contents name unless the result is used.
-
Also, fix test cases.