-
…otations in page objects - Repair invalid or missing Resources in page object trees with warnings - Remove invalid Annots arrays with warnings - Warn about duplicate annotations - Update test cases and output to reflect new error handling. - Improve robustness for annotation and resource validation.
-
Fuzzer was temporarily renamed in #1466 in order to allow a (fixed) time-out to age-out.
-
…logic in call to `initializeSpecialStreams`.
-
…or deterministic ID generation (fixes #1235).
-
…bnail processing loop to ensure consistency.
-
…eams, improving robustness against damaged PDFs.
-
After xref reconstruction treat the input file as suspect and apply sanity checks to all subsequent object reads. Remove `in_xref_reconstruction` flag and update references to use `reconstructed_xref` for simplified state management during xref processing. Adjust warnings for invalid dictionary keys in test output.
-
Enhanced handling of unexpected tokens during xref table reconstruction. Adjusted logic for invalid tokens, ensuring better robustness during PDF parsing of corrupt PDF files.
-
Implemented stricter sanity checks to handle unexpected tokens like array/dictionary close and endobj/endstream more effectively. Improved warning messages and handling of corrupt objects to enhance PDF parsing robustness.
-
Return the display value if the choices entry is an array of strings rather than a single string. Test file is need-appearances.pdf modified to contain one array entry.
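
A minimal sketch of the idea, not the qpdf implementation: in PDF, a choice field option is either a single text string or a two-element array holding the export value followed by the display value. The helper name `display_value` is hypothetical and used only for illustration.

```python
def display_value(entry):
    """Return the text to show for one choice option (hypothetical helper).

    A plain string is its own display value. A two-string array is taken
    to be [export value, display value], so the second element is used.
    Anything else yields None.
    """
    if isinstance(entry, str):
        return entry
    if (
        isinstance(entry, (list, tuple))
        and len(entry) == 2
        and all(isinstance(s, str) for s in entry)
    ):
        return entry[1]
    return None


def display_values(options):
    """Map a list of option entries to their display strings."""
    return [display_value(e) for e in options]
```

For example, `display_values(["Red", ["g", "Green"]])` yields the strings a viewer would list for the field.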
-
Thanks to GitHub user @cdosborn for the basic enhancement.
-
Only top-level XMP metadata is supposed to be left unencrypted. All other metadata is not treated specially.
-
As discussed in #1396.
-
Also, silently fix any angle that is a multiple of 90.
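
As a rough illustration only: the message does not say what "fix" means here, so this sketch assumes it means reducing the angle to one of 0, 90, 180 or 270. The function name is hypothetical.

```python
def normalize_rotation(angle):
    """Reduce a rotation to 0, 90, 180 or 270 (assumed meaning of "fix").

    Angles that are not a multiple of 90 are rejected, since they cannot
    be repaired silently.
    """
    if angle % 90 != 0:
        raise ValueError(f"rotation {angle} is not a multiple of 90")
    # Python's % always returns a non-negative result for a positive
    # modulus, so negative angles are handled as well.
    return angle % 360
```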
-
... to remove the /Root /StructTreeRoot and /MarkInfo entries.
-
... containing objects with no white-space between them. To enforce the rule that objects end at the start-offset of the next object, each object is parsed in its own object stream. To facilitate this, a new private API input source is::OffsetBuffer has been added which only contains the object but reports offsets relative to the start of the object stream. This is adapted from OffsetInputSource by changing the direction of the offset, endowing it with its own BufferInputSource and stripping out checks duplicated in BufferInputSource. Fixes the expected failure in the test case added in #1266.
-
This was due to the use of last_object_description, which is not set for the object stream itself. Also, modify the messages introduced in #1391 and #1392 to report the supposed offset of the objects.
-
If startxref cannot be found in the last 1024 bytes, try finding it in the whole file and check whether it is valid.
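
The fallback can be sketched as follows. This is an illustration under stated assumptions, not qpdf's code: the function name is hypothetical, and the validity check (a decimal offset that points before the keyword) is an assumed stand-in for whatever checks are really applied.

```python
def find_startxref(data: bytes, tail: int = 1024):
    """Return the offset that follows the last 'startxref', or None."""
    marker = b"startxref"
    # First look only at the tail of the file.
    pos = data.rfind(marker, max(0, len(data) - tail))
    if pos < 0:
        # Fall back to searching the whole file.
        pos = data.rfind(marker)
    if pos < 0:
        return None
    tokens = data[pos + len(marker):].split()
    if not tokens or not tokens[0].isdigit():
        return None
    offset = int(tokens[0])
    # Assumed sanity check: the xref section must precede the keyword.
    return offset if offset < pos else None
```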
-
Split reconstruction into three passes - scanning of input for objects and trailer, insertion of objects into the xref table, and loading the trailer. This allows insertion to take place in the usual reverse order and removes the need for a separate insertReconstructedXrefEntry method. It also allows trailers to be tried from most recent to oldest. Ignore any found trailers without a /Root entry.
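
A simplified sketch of the three passes, for illustration only. The regular expressions, the function name, and the "first insertion wins" rule are assumptions of this sketch. Real parsing tokenizes the input rather than pattern-matching it, and here the table is keyed by object id alone and nested dictionaries inside a trailer are not handled.

```python
import re

# Simplified patterns; assumptions of this sketch.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
TRAILER_RE = re.compile(rb"trailer\s*<<(.*?)>>", re.S)


def reconstruct(data: bytes):
    # Pass 1: scan the input for objects and trailers, in file order.
    found_objects = [
        (int(m.group(1)), int(m.group(2)), m.start())
        for m in OBJ_RE.finditer(data)
    ]
    found_trailers = [m.group(1) for m in TRAILER_RE.finditer(data)]

    # Pass 2: insert in reverse order so that the definition closest to
    # the end of the file is inserted first and is never overwritten.
    xref = {}
    for obj_id, gen, offset in reversed(found_objects):
        xref.setdefault(obj_id, (gen, offset))

    # Pass 3: try trailers from most recent to oldest, ignoring any
    # that have no /Root entry.
    trailer = next(
        (t for t in reversed(found_trailers) if b"/Root" in t), None
    )
    return xref, trailer
```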
-
The original test file contains multiple entries with id 0 and offset 0. One entry has been modified such that the id is valid (6). Object streams with invalid offsets are a source of unreproducible oss-fuzz time-outs.
-
Also add debugging information so we can save time if $^O used in GitHub Actions changes again.
-
Reduce the container size for which a single bad token will cause a failure from 100,000 to 5,000. Count missing dictionary keys as errors.
-
This requires a special build option.
-
When recovering XRef streams, start with the stream with the largest /Size rather than the largest offset. Also, if reconstruction fails to find a trailer with a valid /Root entry, search for a root object.
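
The selection rule can be sketched as below. The dictionary layout and the function name are hypothetical, and breaking ties by offset is an assumption of this sketch that the message does not state.

```python
def pick_xref_stream(candidates):
    """Choose the candidate xref stream to try first.

    Each candidate is a dict with 'size' (its /Size value) and 'offset'
    (its position in the file). The largest /Size wins. Ties fall back
    to the largest offset, which is an assumption of this sketch.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c["size"], c["offset"]))
```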
-
Change QPDFWriter stream_decode_level default to qpdf_dl_generalized (fixes #1286)
-
Exercise stream containing objects with no white-space between them.
-
Also, fix disabling of preserve_encryption to ignore stream_decode_level, but disable preserve_encryption if compress_streams is false. Fixes #1286.
-
without filtering
-
Test fixing /P entry.
-
Fix QPDF::getAllPagesInternal warning
-
Provide correct obj_gen and offset.
-
Provide correct obj_gen.
-
This reverts commit ff2a78f579ebdd06b417e34260a17dba06e71137, reversing changes made to 8f54319f7a6514110f4b05cbbf1cb1c9fc8cb6a0.
-
This reverts commit 0e92cf6bf399249c603c3d0212e898fd29e71fcd, reversing changes made to 7d34b89a69e8e89c098dd373442f7df809c28eff.
-
Ghostscript 10.0.2 failed to handle the files changed in this commit, but ghostscript 10.0.4 handles them fine, as do earlier versions. These files all have a hybrid xref in the form of a file with an xref table appended with a section that has an xref stream. They all have /PageLabels pointing to 107 0 R in the original file, with 107 higher than the highest object. The spec says that this should be treated as null, which results in /PageLabels null, which results in ghostscript errors in that version. While ghostscript 10.0.2 may be handling the file incorrectly, the file does something that's not really kosher, and it's easier to fix the files, which had not been changed since the very first open source release of qpdf, than to try to work around the issue. This was discovered when the GitHub Actions runner was bumped to Ubuntu 24.04, which contains the buggy version of ghostscript. I was not able to find a specific ghostscript issue that addressed this, but the problem went away in either 10.0.3 or 10.0.4. Commenting out /PageLabels without changing offsets was a pragmatic move to avoid having to regenerate the xref tables manually. I just had to manually edit the binary xref stream to change the offset of one item (the new object 1), which I put at the end to avoid breaking other things.