Commit 6aa811e5cd138beb7dab3b2e5d93cd91461bcdaa

Authored by Jay Berkenbilt
1 parent 10fe5143

Update documentation on zlib compatibility

ChangeLog
1 2023-12-20 Jay Berkenbilt <ejb@ql.org> 1 2023-12-20 Jay Berkenbilt <ejb@ql.org>
2 2
  3 + * Update code and tests so that qpdf's test suite no longer
  4 + depends on the output of any specific zlib implementation. This
  5 + makes it possible to get a fully passing test suite with any
  6 + API-compatible zlib library. CI tests with the default zlib as
  7 + well as zlib-ng (including verifying that zlib-ng is not the
  8 + default), but any zlib implementation should work. Fixes #774.
  9 +
3 * Bug fix: with --compress-streams=n, don't compress object, XRef, 10 * Bug fix: with --compress-streams=n, don't compress object, XRef,
4 or linearization hint streams. 11 or linearization hint streams.
5 12
README-maintainer.md
@@ -275,7 +275,99 @@ Building docs from pull requests is also enabled. @@ -275,7 +275,99 @@ Building docs from pull requests is also enabled.
275 275
276 ## ZLIB COMPATIBILITY 276 ## ZLIB COMPATIBILITY
277 277
278 -XXX Write this 278 +The qpdf test suite is designed to be independent of the output of any
  279 +particular version of zlib. There are several strategies to make this
  280 +work:
  281 +
  282 +* `build-scripts/test-alt-zlib` runs in CI and runs the test suite
  283 + with a non-default zlib. Please refer to that code for an example of
  284 + how to do this in case you want to test locally.
  285 +
  286 +* The test suite is full of cases that compare output PDF files with
  287 + expected PDF files in the test suite. If the file contains data that
  288 + was compressed by QPDFWriter, then the output file will depend on
  289 + the behavior of zlib. As such, using a simple comparison won't work.
  290 + There are several strategies used by the test suite.
  291 +
  292 + * A new program called `qpdf-test-compare`, in most cases, is a
  293 + drop-in replacement for a simple file comparison. This code makes sure
  294 + the two files have exactly the same number of objects with the
  295 + same object and generation numbers, and that corresponding objects
  296 + are identical with the following allowances (consult its source
  297 + code for all the details):
  298 + * The `/Length` key is not compared in stream dictionaries.
  299 + * The second element of `/ID` is not compared.
  300 + * If the first and second elements of `/ID` are the same, then the
  301 + first element of `/ID` is also not compared.
  302 + * If a stream is compressed with `/FlateDecode`, the
  303 + _uncompressed_ stream data is compared. Otherwise, the raw
  304 + stream data is compared.
  305 + * Generated fields in the `/Encrypt` dictionary are not compared,
  306 + though password-protected files must have the same password.
  307 + * Differences in the contents of `/XRef` streams are ignored.
  308 +
  309 + To use this, run `qpdf-test-compare actual.pdf expected.pdf`, and
  310 + expect the output to match `expected.pdf`. For example, if a test
  311 + used to be written like this:
  312 + ```perl
  313 + $td->runtest("check output",
  314 + {$td->FILE => "a.pdf"},
  315 + {$td->FILE => "out.pdf"});
  316 + ```
  317 + then write it like this instead:
  318 + ```perl
  319 + $td->runtest("check output",
  320 + {$td->COMMAND => "qpdf-test-compare a.pdf out.pdf"},
  321 + {$td->FILE => "out.pdf", $td->EXIT_STATUS => 0});
  322 + ```
  323 + You can look at `compare-for-test/qtest/compare.test` for
  324 + additional examples.
  325 +
  326 + Here's what's going on:
  327 + * If the files "match" according to the rules of
  328 + `qpdf-test-compare`, the output of the program is the expected
  329 + file.
  330 + * If the files do not match, the output is the actual file. The
  331 + reason is that, if a change is made that results in an expected
  332 + change to the expected file, the output of the comparison can be
  333 + used to replace the expected file (as long as it is definitely
  334 + known to be correct; no shortcuts here!). That way, it doesn't
  335 + matter which zlib you use to generate test files.
  336 + * As a special debugging tool, you can set the `QPDF_COMPARE_WHY`
  337 + environment variable to any value. In this case, if the files
  338 + don't match, the output is a description of the first thing in
  339 + the file that doesn't match. This is mostly useful for debugging
  340 + `qpdf-test-compare` itself, but it can also be helpful as a
  341 + sanity check that the differences are expected. If you are
  342 + trying to find out the _real_ differences, a suggestion is to
  343 + convert both files to qdf and compare them lexically.
  344 +
  345 + * There are some cases where `qpdf-test-compare` can't be used. For
  346 + example, if you need to actually test one of the things that
  347 + `qpdf-test-compare` ignores, you'll need some other mechanism.
  348 + There are tests for deterministic ID creation and xref streams
  349 + that have to implement other mechanisms. Also, linearization hint
  350 + streams and the linearization dictionary in a linearized file
  351 + contain file offsets. Rather than ignoring those, it can be
  352 + helpful to create linearized files using `--compress-streams=n`.
  353 + In that case, `QPDFWriter` won't compress any data, so the PDF
  354 + will be independent of the output of any particular zlib
  355 + implementation.
  356 +
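As a rough illustration of the stream rules listed above, the following Python sketch compares two toy stream objects: `/Length` is skipped, and `/FlateDecode` data is compared after decompression. This is not the `qpdf-test-compare` implementation. The dict-based object model and the `streams_match` and `make_stream` helpers are hypothetical, and two compression levels stand in for two different zlib implementations.

```python
import zlib


def streams_match(a, b):
    """Compare two toy streams of the form {"dict": {...}, "data": bytes}."""
    # Skip /Length: it reflects the compressed size, which varies by compressor.
    dict_a = {k: v for k, v in a["dict"].items() if k != "/Length"}
    dict_b = {k: v for k, v in b["dict"].items() if k != "/Length"}
    if dict_a != dict_b:
        return False
    if dict_a.get("/Filter") == "/FlateDecode":
        # Compare the uncompressed data so the compressor's output is irrelevant.
        return zlib.decompress(a["data"]) == zlib.decompress(b["data"])
    # Otherwise compare the raw stream data.
    return a["data"] == b["data"]


def make_stream(payload, level):
    """Build a toy Flate-compressed stream (hypothetical helper)."""
    data = zlib.compress(payload, level)
    return {"dict": {"/Filter": "/FlateDecode", "/Length": len(data)}, "data": data}


payload = b"BT /F1 12 Tf (hello) Tj ET\n" * 50
stream_a = make_stream(payload, 0)  # level 0 stores the data uncompressed
stream_b = make_stream(payload, 9)  # level 9 compresses it

raw_equal = stream_a["data"] == stream_b["data"]
logical_equal = streams_match(stream_a, stream_b)
```

Here the raw bytes and `/Length` values differ between the two streams, yet they still match under the relaxed rules, which is the property the test suite relies on.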
  357 +You can find many examples of how tests were rewritten by looking at
  358 +the commits preceding the one that added this section of this README
  359 +file.
  360 +
  361 +Note about `/ID`: many test cases use `--static-id` to have a
  362 +predictable `/ID` for testing. Many other test cases use
  363 +`--deterministic-id`. While `--static-id` is unaffected by file
  364 +contents, `--deterministic-id` is based on file contents and so is
  365 +dependent on zlib output if there is any newly compressed data. By
  366 +using `qpdf-test-compare`, it's actually not necessary to use either
  367 +`--static-id` or `--deterministic-id`. It may still be necessary to
  368 +use `--static-aes-iv` if comparing encrypted files, but since
  369 +`qpdf-test-compare` ignores `/Perms`, a wider range of encrypted files
  370 +can be compared using `qpdf-test-compare`.
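To see why an ID derived from file contents depends on zlib output, consider this minimal sketch. It assumes a hypothetical `toy_deterministic_id` that is just a digest of the output bytes; the actual computation in `QPDFWriter` is not shown here and may differ in detail. As before, two compression levels stand in for two zlib implementations.

```python
import hashlib
import zlib


def toy_deterministic_id(file_bytes):
    # Hypothetical stand-in for a content-based ID: a digest of the output bytes.
    return hashlib.sha256(file_bytes).hexdigest()


content = b"q 1 0 0 1 72 720 cm Q\n" * 40

# Same logical content, two different compressed representations.
out_a = b"%PDF-1.7\n" + zlib.compress(content, 0)
out_b = b"%PDF-1.7\n" + zlib.compress(content, 9)

ids_differ = toy_deterministic_id(out_a) != toy_deterministic_id(out_b)
same_bytes_same_id = toy_deterministic_id(out_a) == toy_deterministic_id(out_a)
```

The two outputs decode to the same content but get different IDs, which is why a byte-for-byte comparison of files written with `--deterministic-id` is sensitive to the zlib in use.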
279 371
280 ## HOW TO ADD A COMMAND-LINE ARGUMENT 372 ## HOW TO ADD A COMMAND-LINE ARGUMENT
281 373
@@ -18,27 +18,6 @@ Contents @@ -18,27 +18,6 @@ Contents
18 18
19 - [HISTORICAL NOTES](#historical-notes) 19 - [HISTORICAL NOTES](#historical-notes)
20 20
21 -zlib-ng  
22 -=======  
23 -  
24 -* Write ZLIB COMPATIBILITY section of README-maintainer.md.  
25 -* Note: deterministic IDs are affected by choice of zlib  
26 -  
27 -```  
28 -cd /tmp  
29 -git clone https://github.com/zlib-ng/zlib-ng  
30 -cd zlib-ng  
31 -cmake -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/inst -DZLIB_COMPAT=ON  
32 -cmake --build build -j $(nproc)  
33 -(cd build; ctest --verbose)  
34 -cmake --install build  
35 -```  
36 -  
37 -Then run qpdf's test suite with  
38 -```  
39 -LD_PRELOAD=/tmp/inst/lib/libz.so.1 ctest --verbose  
40 -```  
41 -  
42 Always 21 Always
43 ====== 22 ======
44 23
manual/release-notes.rst
@@ -39,6 +39,21 @@ Planned changes for future 12.x (subject to change): @@ -39,6 +39,21 @@ Planned changes for future 12.x (subject to change):
39 .. x.y.z: not yet released 39 .. x.y.z: not yet released
40 40
41 11.7.0: not yet released 41 11.7.0: not yet released
  42 + - Bug fixes:
  43 +
  44 + - With ``--compress-streams=n``, qpdf was still compressing cross
  45 + reference streams, linearization hint streams, and object
  46 + streams. This has been fixed.
  47 +
  48 + - Build Enhancements:
  49 +
  50 + - The qpdf test suite now passes when qpdf is linked with an
  51 + alternative ``zlib`` implementation. There are no dependencies
  52 + anywhere in the qpdf test suite on any particular ``zlib``
  53 + output. Consult the ``ZLIB COMPATIBILITY`` section of
  54 + ``README-maintainer.md`` for a detailed explanation of how to
  55 + maintain this.
  56 +
42 - Library Enhancements: 57 - Library Enhancements:
43 58
44 - Add C++ functions ``qpdf_c_wrap`` and ``qpdf_c_get_qpdf`` to the 59 - Add C++ functions ``qpdf_c_wrap`` and ``qpdf_c_get_qpdf`` to the