Commit 6aa811e5cd138beb7dab3b2e5d93cd91461bcdaa

Authored by Jay Berkenbilt
1 parent 10fe5143

Update documentation on zlib compatibility

ChangeLog
1 1 2023-12-20 Jay Berkenbilt <ejb@ql.org>
2 2  
  3 + * Update code and tests so that qpdf's test suite no longer
  4 + depends on the output of any specific zlib implementation. This
  5 + makes it possible to get a fully passing test suite with any
  6 + API-compatible zlib library. CI tests with the default zlib as
  7 + well as zlib-ng (including verifying that zlib-ng is not the
  8 + default), but any zlib implementation should work. Fixes #774.
  9 +
3 10 * Bug fix: with --compress-streams=n, don't compress object, XRef,
4 11 or linearization hint streams.
5 12  
... ...
README-maintainer.md
... ... @@ -275,7 +275,99 @@ Building docs from pull requests is also enabled.
275 275  
276 276 ## ZLIB COMPATIBILITY
277 277  
278   -XXX Write this
  278 +The qpdf test suite is designed to be independent of the output of any
  279 +particular version of zlib. There are several strategies to make this
  280 +work:
  281 +
  282 +* `build-scripts/test-alt-zlib` runs in CI and runs the test suite
  283 + with a non-default zlib. Please refer to that code for an example of
  284 + how to do this in case you want to test locally.
  285 +
  286 +* The test suite is full of cases that compare output PDF files with
  287 + expected PDF files in the test suite. If the file contains data that
  288 + was compressed by QPDFWriter, then the output file will depend on
  289 + the behavior of zlib. As such, using a simple comparison won't work.
  290 + There are several strategies used by the test suite.
  291 +
  292 + * A new program called `qpdf-test-compare` is, in most cases, a
  293 + drop-in replacement for a simple file comparison. This code makes
  294 + sure the two files have exactly the same number of objects with the
  295 + same object and generation numbers, and that corresponding objects
  296 + are identical with the following allowances (consult its source
  297 + code for all the details):
  298 + * The `/Length` key is not compared in stream dictionaries.
  299 + * The second element of `/ID` is not compared.
  300 + * If the first and second elements of `/ID` are the same, then the
  301 + first element of `/ID` is also not compared.
  302 + * If a stream is compressed with `/FlateDecode`, the
  303 + _uncompressed_ stream data is compared. Otherwise, the raw
  304 + stream data is compared.
  305 + * Generated fields in the `/Encrypt` dictionary are not compared,
  306 + though password-protected files must have the same password.
  307 + * Differences in the contents of `/XRef` streams are ignored.
  308 +
  309 + To use this, run `qpdf-test-compare actual.pdf expected.pdf`, and
  310 + expect the output to match `expected.pdf`. For example, if a test
  311 + used to be written like this:
  312 + ```perl
  313 + $td->runtest("check output",
  314 + {$td->FILE => "a.pdf"},
  315 + {$td->FILE => "out.pdf"});
  316 + ```
  317 + then write it like this instead:
  318 + ```perl
  319 + $td->runtest("check output",
  320 + {$td->COMMAND => "qpdf-test-compare a.pdf out.pdf"},
  321 + {$td->FILE => "out.pdf", $td->EXIT_STATUS => 0});
  322 + ```
  323 + You can look at `compare-for-test/qtest/compare.test` for
  324 + additional examples.
  325 +
  326 + Here's what's going on:
  327 + * If the files "match" according to the rules of
  328 + `qpdf-test-compare`, the output of the program is the expected
  329 + file.
  330 + * If the files do not match, the output is the actual file. The
  331 + reason is that, if a change is made that results in an expected
  332 + change to the expected file, the output of the comparison can be
  333 + used to replace the expected file (as long as it is definitely
  334 + known to be correct—no shortcuts here!). That way, it doesn't
  335 + matter which zlib you use to generate test files.
  336 + * As a special debugging tool, you can set the `QPDF_COMPARE_WHY`
  337 + environment variable to any value. In this case, if the files
  338 + don't match, the output is a description of the first thing in
  339 + the file that doesn't match. This is mostly useful for debugging
  340 + `qpdf-test-compare` itself, but it can also be helpful as a
  341 + sanity check that the differences are expected. If you are
  342 + trying to find out the _real_ differences, a suggestion is to
  343 + convert both files to qdf and compare them lexically.
  344 +
  345 + * There are some cases where `qpdf-test-compare` can't be used. For
  346 + example, if you need to actually test one of the things that
  347 + `qpdf-test-compare` ignores, you'll need some other mechanism.
  348 + There are tests for deterministic ID creation and xref streams
  349 + that have to implement other mechanisms. Also, linearization hint
  350 + streams and the linearization dictionary in a linearized file
  351 + contain file offsets. Rather than ignoring those, it can be
  352 + helpful to create linearized files using `--compress-streams=n`.
  353 + In that case, `QPDFWriter` won't compress any data, so the PDF
  354 + will be independent of the output of any particular zlib
  355 + implementation.
  356 +
  357 +You can find many examples of how tests were rewritten by looking at
  358 +the commits preceding the one that added this section of this README
  359 +file.
  360 +
  361 +Note about `/ID`: many test cases use `--static-id` to have a
  362 +predictable `/ID` for testing. Many other test cases use
  363 +`--deterministic-id`. While `--static-id` is unaffected by file
  364 +contents, `--deterministic-id` is based on file contents and so is
  365 +dependent on zlib output if there is any newly compressed data. By
  366 +using `qpdf-test-compare`, it's actually not necessary to use either
  367 +`--static-id` or `--deterministic-id`. It may still be necessary to
  368 +use `--static-aes-iv` if comparing encrypted files, but since
  369 +`qpdf-test-compare` ignores `/Perms`, a wider range of encrypted files
  370 +can be compared using `qpdf-test-compare`.
279 371  
280 372 ## HOW TO ADD A COMMAND-LINE ARGUMENT
281 373  
... ...
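The debugging workflow described in the new ZLIB COMPATIBILITY text (set `QPDF_COMPARE_WHY`, then fall back to comparing QDF versions of both files) could be sketched in shell roughly as follows. This is an illustration only: it assumes `qpdf` and `qpdf-test-compare` are on `PATH`, and the helper function names and the `QPDF` / `QPDF_TEST_COMPARE` override variables are hypothetical conveniences, not part of qpdf.

```shell
#!/bin/sh
# Sketch only. The helper names and the QPDF / QPDF_TEST_COMPARE
# override variables are hypothetical; the commands they default to
# are the ones named in README-maintainer.md.
QPDF="${QPDF:-qpdf}"
QPDF_TEST_COMPARE="${QPDF_TEST_COMPARE:-qpdf-test-compare}"

# Ask qpdf-test-compare to describe the first mismatch instead of
# emitting a PDF file. Any value of QPDF_COMPARE_WHY enables this.
why_mismatch() {
    QPDF_COMPARE_WHY=1 "$QPDF_TEST_COMPARE" "$1" "$2"
}

# Convert both files to QDF and diff them as text to see the real
# differences. --static-id keeps /ID from adding noise to the diff.
qdf_diff() {
    tmp=$(mktemp -d) || return 1
    "$QPDF" --qdf --static-id "$1" "$tmp/actual.qdf"
    "$QPDF" --qdf --static-id "$2" "$tmp/expected.qdf"
    diff -u "$tmp/expected.qdf" "$tmp/actual.qdf"
    status=$?
    rm -rf "$tmp"
    return $status
}
```

A typical use would be `why_mismatch actual.pdf expected.pdf` to get a quick hint, followed by `qdf_diff actual.pdf expected.pdf` when the full set of differences is needed.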
... ... @@ -18,27 +18,6 @@ Contents
18 18  
19 19 - [HISTORICAL NOTES](#historical-notes)
20 20  
21   -zlib-ng
22   -=======
23   -
24   -* Write ZLIB COMPATIBILITY section of README-maintainer.md.
25   -* Note: deterministic IDs are affected by choice of zlib
26   -
27   -```
28   -cd /tmp
29   -git clone https://github.com/zlib-ng/zlib-ng
30   -cd zlib-ng
31   -cmake -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/inst -DZLIB_COMPAT=ON
32   -cmake --build build -j $(nproc)
33   -(cd build; ctest --verbose)
34   -cmake --install build
35   -```
36   -
37   -Then run qpdf's test suite with
38   -```
39   -LD_PRELOAD=/tmp/inst/lib/libz.so.1 ctest --verbose
40   -```
41   -
42 21 Always
43 22 ======
44 23  
... ...
manual/release-notes.rst
... ... @@ -39,6 +39,21 @@ Planned changes for future 12.x (subject to change):
39 39 .. x.y.z: not yet released
40 40  
41 41 11.7.0: not yet released
  42 + - Bug fixes:
  43 +
  44 + - With ``--compress-streams=n``, qpdf was still compressing cross
  45 + reference streams, linearization hint streams, and object
  46 + streams. This has been fixed.
  47 +
  48 + - Build Enhancements:
  49 +
  50 + - The qpdf test suite now passes when qpdf is linked with an
  51 + alternative ``zlib`` implementation. There are no dependencies
  52 + anywhere in the qpdf test suite on any particular ``zlib``
  53 + output. Consult the ``ZLIB COMPATIBILITY`` section of
  54 + ``README-maintainer.md`` for a detailed explanation of how to
  55 + maintain this.
  56 +
42 57 - Library Enhancements:
43 58  
44 59 - Add C++ functions ``qpdf_c_wrap`` and ``qpdf_c_get_qpdf`` to the
... ...
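The `--compress-streams=n` fix noted above is what makes it practical to write linearized test files whose bytes do not depend on zlib. A minimal shell sketch of generating such a file, assuming `qpdf` is on `PATH`; the helper name and the `QPDF` override variable are hypothetical and exist only for this illustration:

```shell
#!/bin/sh
# Sketch only. The helper name and the QPDF override variable are
# hypothetical; the options passed are regular qpdf options.
QPDF="${QPDF:-qpdf}"

# Write a linearized copy of $1 to $2 without compressing any newly
# written stream data, and with a fixed /ID, so the result can be
# checked with a plain file comparison.
make_uncompressed_linearized() {
    "$QPDF" --linearize --compress-streams=n --static-id "$1" "$2"
}
```

For example, `make_uncompressed_linearized in.pdf out.pdf` followed by `cmp out.pdf expected.pdf` compares the result byte for byte.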