-
Add constructors that take a QPDF object so we can issue warnings and create new indirect objects.
-
Make some more methods in QPDFPageObjectHelper work with form XObjects, provide forEach methods to walk through nested form XObjects, possibly recursively. This should make it easier to work with form XObjects from user code.
-
Also removes preclusion of stream references in stream parameters of filterable streams and reduces write times by about 8% by eliminating an extra traversal of the objects.
-
Refactor QPDF_Stream to use stream filter classes to handle supported stream filters as well.
-
StreamDataProvider::provideStreamData now has a rich enough API for it to effectively proxy to pipeStreamData.
-
If a page contains a form XObject, also filter the form XObject and remove its unreferenced resources.
-
fix-qdf assumes there is exactly one XRef stream and that it is at the end of the file.
-
If the value of /CS in the inline image dictionary was is key in the page's /Resource -> /ColorSpace dictionary, properly resolve it by referencing the proper colorspace, and not just the name, in the external image dictionary.
-
Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that the value of Contents entry is a hexadecimal string representation when ByteRange is specified. This commit makes QPDF always uses hexadecimal strings representation instead of literal strings for it.
-
It's detected in QPDFWriter instead of at parse time because I can't figure out how to construct a test case in a reasonable time. This commit moves the fuzz file into the regular test suite for a QTC coverage case.
-
For some reason, qpdf from the beginning was replacing indirect references to null with literal null in arrays even after removing the old behavior of flattening scalar references. This seems like a bad idea.
-
On read, ignore /DecodeParms when empty list; on write, delete it. Some files have been found that include an empty list for /DecodeParms, but this is not technically compliant with the spec, and the only sensible interpretation is to treat it as if there are no decode parameters.
-
Also add test cases for additional coverage on image optimization.
-
Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.
-
We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.
-
Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.
-
When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.
-
Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.
-
If we end up using our fallback font size when generating appearances for text fields, reflect that in the Tf operator used in the appearance stream.
-
When generating appearance streams for variable text annotations, properly handle the cases of there being no appearance dictionary, no appearance stream, or an appearance stream with no BMC..EMC marker.
-
With the exception of form field annotations when /NeedAppearances is true, remove annotations that don't have appearance streams when flattening. There is no reason to keep these when flattening since they are invisible. This may include unchecked checkboxes, unshown popup windows, etc.