Commit 5d63730b9347a755d2906f7a929db9dba71ea37f
1 parent
12d065c7
Clean up documentation
Showing 6 changed files with 153 additions and 114 deletions
TODO
| ... | ... | @@ -9,7 +9,9 @@ Before Release: |
| 9 | 9 | * Release qtest with updates to qtest-driver and copy back into qpdf |
| 10 | 10 | |
| 11 | 11 | Next: |
| 12 | -* JSON v2 fixes | |
| 12 | +* Support json v2 in the C API. At a minimum, write_json, | |
| 13 | + create_from_json, and update_from_json need to be there and should | |
| 14 | + take the same kinds of functions as the C API for logger. | |
| 13 | 15 | |
| 14 | 16 | Pending changes: |
| 15 | 17 | |
| ... | ... | @@ -65,19 +67,6 @@ direct objects, which are always "resolved" in QPDFObjectHandle. |
| 65 | 67 | |
| 66 | 68 | Soon: Break ground on "Document-level work" |
| 67 | 69 | |
| 68 | - | |
| 69 | -JSON v2 fixes | |
| 70 | -============= | |
| 71 | - | |
| 72 | -* Support json v2 in the C API. At a minimum, write_json, | |
| 73 | - create_from_json, and update_from_json need to be there and should | |
| 74 | - take the same kinds of functions as the C API for logger. | |
| 75 | - | |
| 76 | -* Address json.rst comment from m-holger: "The discussion of stream | |
| 77 | - objects is very wordy. Would a table similar to the style of the PDF | |
| 78 | - spec be easier to use?" | |
| 79 | - | |
| 80 | - | |
| 81 | 70 | Possible future JSON enhancements |
| 82 | 71 | ================================= |
| 83 | 72 | ... | ... |
job.sums
| ... | ... | @@ -8,10 +8,10 @@ include/qpdf/auto_job_c_pages.hh b3cc0f21029f6d89efa043dcdbfa183cb59325b6506001c |
| 8 | 8 | include/qpdf/auto_job_c_uo.hh ae21b69a1efa9333050f4833d465f6daff87e5b38e5106e49bbef5d4132e4ed1 |
| 9 | 9 | job.yml f9564f18b08a45d17328af43652645771d3498471820c858b8c9013a193e1412 |
| 10 | 10 | libqpdf/qpdf/auto_job_decl.hh 7844eba58edffb9494b19e8eca6fd59a24d6e152ca606c3b07da569f753df2da |
| 11 | -libqpdf/qpdf/auto_job_help.hh 700d7600b34588169c80f3e325e39e592e2f5c1af1cdac16614150ff38424b40 | |
| 11 | +libqpdf/qpdf/auto_job_help.hh 53306e4aef8aaca641c0087bc9e064ada1c44a94b826c0bcac7b4eb0c8c41fd5 | |
| 12 | 12 | libqpdf/qpdf/auto_job_init.hh fd1635a5ad6ba16b7ae008467145560a59a5ecfd10d29c5ef7cd0d8347747cd2 |
| 13 | 13 | libqpdf/qpdf/auto_job_json_decl.hh 06caa46eaf71db8a50c046f91866baa8087745a9474319fb7c86d92634cc8297 |
| 14 | 14 | libqpdf/qpdf/auto_job_json_init.hh 59545578a2e47c660ff98516ed53f06638be75eb4658e2a09d32cc08e0cb7268 |
| 15 | 15 | libqpdf/qpdf/auto_job_schema.hh 5352ef1be1ad7cc6f4f36dab88f2937d278e6bd3a0e2d46259794dc226c8ba6b |
| 16 | 16 | manual/_ext/qpdf.py 6add6321666031d55ed4aedf7c00e5662bba856dfcd66ccb526563bffefbb580 |
| 17 | -manual/cli.rst bbce4cfb662a96c8df0c8563f8065844b77aca7b4ec6385955546b9a455d9953 | |
| 17 | +manual/cli.rst 41ee93f23f46160fe9eaf7c99fd2ab3bd2e0f6792a341a35bdac1a41cb853ed5 | ... | ... |
libqpdf/qpdf/auto_job_help.hh
| ... | ... | @@ -813,7 +813,8 @@ ap.addOptionHelp("--json-key", "json", "limit which keys are in JSON output", R" |
| 813 | 813 | |
| 814 | 814 | This option is repeatable. If given, only the specified |
| 815 | 815 | top-level keys will be included in the JSON output. Otherwise, |
| 816 | -all keys will be included. | |
| 816 | +all keys will be included. With --json-output, when not given, | |
| 817 | +only the "qpdf" key will appear in the output. | |
| 817 | 818 | )"); |
| 818 | 819 | ap.addOptionHelp("--json-object", "json", "limit which objects are in JSON", R"(--json-object={trailer|obj[,gen]} |
| 819 | 820 | ... | ... |
manual/cli.rst
| ... | ... | @@ -913,7 +913,7 @@ Related Options |
| 913 | 913 | qpdf will recompress streams with generalized filters using flate |
| 914 | 914 | compression, effectively eliminating LZW and ASCII-based filters. |
| 915 | 915 | This is usually desirable behavior but can be disabled with |
| 916 | - ``--decode-level=none``. Note that ``--decode-level=node`` is the | |
| 916 | + ``--decode-level=none``. Note that ``--decode-level=none`` is the | |
| 917 | 917 | default when :qpdf:ref:`--json-output` is specified, but it can be |
| 918 | 918 | overridden in that case as well. |
| 919 | 919 | |
| ... | ... | @@ -3197,7 +3197,8 @@ Related Options |
| 3197 | 3197 | Starting with qpdf 11, when this option is specified, an output |
| 3198 | 3198 | file is optional (for backward compatibility) and defaults to |
| 3199 | 3199 | standard output. You may specify an output file to write the JSON |
| 3200 | - to a file rather than standard output. | |
| 3200 | + to a file rather than standard output. (Example: ``qpdf --json | |
| 3201 | + in.pdf out.json``) | |
| 3201 | 3202 | |
| 3202 | 3203 | Stream data is only included if :qpdf:ref:`--json-output` is |
| 3203 | 3204 | specified or if a value other than ``none`` is passed to |
| ... | ... | @@ -3225,14 +3226,16 @@ Related Options |
| 3225 | 3226 | |
| 3226 | 3227 | This option is repeatable. If given, only the specified |
| 3227 | 3228 | top-level keys will be included in the JSON output. Otherwise, |
| 3228 | - all keys will be included. | |
| 3229 | + all keys will be included. With --json-output, when not given, | |
| 3230 | + only the "qpdf" key will appear in the output. | |
| 3229 | 3231 | |
| 3230 | 3232 | This option is repeatable. If given, only the specified top-level |
| 3231 | 3233 | keys will be included in the JSON output. Otherwise, all keys will |
| 3232 | - be included. ``version`` and ``parameters`` will always appear in | |
| 3233 | - the output. If not given, all keys will be included, unless | |
| 3234 | + be included. If not given, all keys will be included, unless | |
| 3234 | 3235 | :qpdf:ref:`--json-output` was specified, in which case, only the |
| 3235 | - ``"qpdf"`` key will be included by default. | |
| 3236 | + ``"qpdf"`` key will be included by default. If | |
| 3237 | + :qpdf:ref:`--json-output` was not given, the ``version`` and | |
| 3238 | + ``parameters`` keys will always appear in the output. | |
| 3236 | 3239 | |
| 3237 | 3240 | .. qpdf:option:: --json-object={trailer|obj[,gen]} |
| 3238 | 3241 | |
| ... | ... | @@ -3311,8 +3314,8 @@ Related Options |
| 3311 | 3314 | output, but you can add additional keys with |
| 3312 | 3315 | :qpdf:ref:`--json-key`. |
| 3313 | 3316 | |
| 3314 | - - Excludes the ``"version"`` and ``"parameters"`` keys from the | |
| 3315 | - JSON output. | |
| 3317 | + - The ``"version"`` and ``"parameters"`` keys will be excluded from | |
| 3318 | + the JSON output. | |
| 3316 | 3319 | |
| 3317 | 3320 | If you want to look at the contents of streams easily as you would |
| 3318 | 3321 | in QDF mode (see :ref:`qdf`), you can use | ... | ... |
manual/conf.py
manual/json.rst
| ... | ... | @@ -24,28 +24,33 @@ represents the contents of a PDF file. This is distinct from the |
| 24 | 24 | interacting with qpdf the way the command-line tool does. For |
| 25 | 25 | information about that, see :ref:`qpdf-job`. |
| 26 | 26 | |
| 27 | -The qpdf JSON format is specific to qpdf. With JSON version 2, the | |
| 28 | -:qpdf:ref:`--json` command-line flag causes creation of a JSON | |
| 29 | -representation of all the objects in a PDF file. This includes an | |
| 30 | -unambiguous representation of the PDF object structure and also | |
| 31 | -provides JSON-formatted summaries of other information about the file. | |
| 32 | -This functionality is built into ``QPDFJob`` and can be accessed from | |
| 33 | -the ``qpdf`` command-line tool or from the ``QPDFJob`` C or C++ API. | |
| 34 | - | |
| 35 | -By default, stream data is omitted, but it can be included by | |
| 36 | -specifying the :qpdf:ref:`--json-stream-data` option. With stream data | |
| 37 | -included, the generated JSON file completely represents a PDF file. | |
| 38 | -You can think of this as using JSON as an *alternative syntax* for | |
| 39 | -representing a PDF file. Using qpdf JSON, it is possible to convert a | |
| 40 | -PDF file to JSON, manipulate the structure or contents of the objects | |
| 41 | -at a low level, and convert the results back to a PDF file. This | |
| 42 | -functionality can be accessed from the command-line with the | |
| 43 | -:qpdf:ref:`--json-input`, and :qpdf:ref:`--update-from-json` flags, or | |
| 44 | -from the API using the ``QPDF::writeJSON``, ``QPDF::createFromJSON``, | |
| 45 | -and ``QPDF::updateFromJSON`` methods. The :qpdf:ref:`--json-output` | |
| 46 | -flag changes a handful of defaults so that the resulting JSON is as | |
| 47 | -close as possible to the original input and is ready for being | |
| 48 | -converted back to PDF. | |
| 27 | +The qpdf JSON format is specific to qpdf. The :qpdf:ref:`--json` | |
| 28 | +command-line flag causes creation of a JSON representation of the objects | |
| 29 | +in a PDF file along with JSON-formatted summaries of other information | |
| 30 | +about the file. This functionality is built into ``QPDFJob`` and can | |
| 31 | +be accessed from the ``qpdf`` command-line tool or from the | |
| 32 | +``QPDFJob`` C or C++ API. | |
| 33 | + | |
| 34 | +Starting with qpdf JSON version 2, from qpdf 11.0.0, the JSON output | |
| 35 | +includes an unambiguous and complete representation of the PDF objects | |
| 36 | +and header. The information without the JSON-formatted summaries of | |
| 37 | +other information is also available using the ``QPDF::writeJSON`` | |
| 38 | +method. | |
| 39 | + | |
| 40 | +By default, stream data is omitted from the JSON data, but it can be | |
| 41 | +included by specifying the :qpdf:ref:`--json-stream-data` option. With | |
| 42 | +stream data included, the generated JSON file completely represents a | |
| 43 | +PDF file. You can think of this as using JSON as an *alternative | |
| 44 | +syntax* for representing a PDF file. Using qpdf JSON, it is possible | |
| 45 | +to convert a PDF file to JSON, manipulate the structure or contents of | |
| 46 | +the objects at a low level, and convert the results back to a PDF | |
| 47 | +file. This functionality can be accessed from the command-line with | |
| 48 | +the :qpdf:ref:`--json-input` and :qpdf:ref:`--update-from-json` | |
| 49 | +flags, or from the API using the ``QPDF::createFromJSON`` and | |
| 50 | +``QPDF::updateFromJSON`` methods. The :qpdf:ref:`--json-output` flag | |
| 51 | +changes a handful of defaults so that the resulting JSON is as close | |
| 52 | +as possible to the original input and is ready for being converted | |
| 53 | +back to PDF. | |
| 49 | 54 | |
| 50 | 55 | .. _json-terminology: |
| 51 | 56 | |
| ... | ... | @@ -71,7 +76,8 @@ This manual is not entirely consistent about its use of *dictionary* |
| 71 | 76 | vs. *object* because sometimes one term or another is clearer in |
| 72 | 77 | context. Just be aware of the ambiguity when reading the manual. We |
| 73 | 78 | frequently use the term *dictionary* to refer to a JSON object because |
| 74 | -of the consistency with PDF terminology. | |
| 79 | +of the consistency with PDF terminology, particularly when referring to | |
| 80 | +a dictionary that contains information about PDF objects. | |
| 75 | 81 | |
| 76 | 82 | .. _what-qpdf-json-is-not: |
| 77 | 83 | |
| ... | ... | @@ -121,12 +127,14 @@ qpdf JSON Object Representation |
| 121 | 127 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 122 | 128 | |
| 123 | 129 | This section describes the representation of PDF objects in qpdf JSON |
| 124 | -version 2. PDF objects are represented within the ``"qpdf"`` entry of | |
| 125 | -a qpdf JSON file. The ``"qpdf"`` entry is a two-element array. The | |
| 126 | -first element is a dictionary containing header-like information about | |
| 127 | -the file such as the PDF version. The second element is a dictionary | |
| 128 | -containing all the objects in the PDF file. We refer to this as the | |
| 129 | -*objects dictionary*. | |
| 130 | +version 2. An example appears in :ref:`json.example`. | |
| 131 | + | |
| 132 | +PDF objects are represented within the ``"qpdf"`` entry of a qpdf JSON | |
| 133 | +file. The ``"qpdf"`` entry is a two-element array. The first element | |
| 134 | +is a dictionary containing header-like information about the file such | |
| 135 | +as the PDF version. The second element is a dictionary containing all | |
| 136 | +the objects in the PDF file. We refer to this as the *objects | |
| 137 | +dictionary*. | |
| 130 | 138 | |
| 131 | 139 | The first element contains the following keys: |
| 132 | 140 | |
| ... | ... | @@ -136,17 +144,19 @@ The first element contains the following keys: |
| 136 | 144 | - ``"pdfversion"`` -- a string containing PDF version as indicated in |
| 137 | 145 | the PDF header (e.g. ``"1.7"``, ``"2.0"``) |
| 138 | 146 | |
| 139 | -- ``pushedinheritedpageresources`` -- a boolean indicating whether | |
| 140 | - the library pushed inherited resources down to the page level. | |
| 141 | - Certain library calls cause this to happen, and qpdf needs to know | |
| 142 | - when reading a JSON file back in whether it should do this as it may | |
| 143 | - cause certain objects to be renumbered. | |
| 147 | +- ``pushedinheritedpageresources`` -- a boolean indicating whether the | |
| 148 | + library pushed inherited resources down to the page level. Certain | |
| 149 | + library calls cause this to happen, and qpdf needs to know when | |
| 150 | + reading a JSON file back in whether it should do this as it may | |
| 151 | + cause certain objects to be renumbered. This field is ignored when | |
| 152 | + :qpdf:ref:`--update-from-json` was not given. | |
| 144 | 153 | |
| 145 | 154 | - ``calledgetallpages`` -- a boolean indicating whether |
| 146 | 155 | ``getAllPages`` was called prior to writing the JSON output. This |
| 147 | 156 | method causes page tree repair to occur, which may renumber some |
| 148 | 157 | objects (in very rare cases of corrupted page trees), so qpdf needs |
| 149 | - to know this information when reading a JSON file back in. | |
| 158 | + to know this information when reading a JSON file back in. This | |
| 159 | + field is ignored when :qpdf:ref:`--update-from-json` was not given. | |
| 150 | 160 | |
| 151 | 161 | - ``"maxobjectid"`` -- a number indicating the object ID of the |
| 152 | 162 | highest numbered object in the file. This is provided to make it |
| ... | ... | @@ -162,12 +172,12 @@ The first element contains the following keys: |
| 162 | 172 | if objects are removed from a PDF file.) |
| 163 | 173 | |
| 164 | 174 | The second element is the objects dictionary. Each key in the objects |
| 165 | -dictionary is either ``"trailer"`` or a string of the form ``"obj:O G | |
| 166 | -R"`` where ``O`` and ``G`` are the object and generation numbers and | |
| 167 | -``R`` is the literal string ``R``. This is the PDF syntax for the | |
| 168 | -indirect object reference prepended by ``obj:``. The value, | |
| 169 | -representing the object itself, is a JSON object whose structure is | |
| 170 | -described below. | |
| 175 | +dictionary is either ``"trailer"`` or a string of the form | |
| 176 | +:samp:`"obj:{O} {G} R"` where :samp:`{O}` and :samp:`{G}` are the | |
| 177 | +object and generation numbers and ``R`` is the literal string ``R``. | |
| 178 | +This is the PDF syntax for the indirect object reference prepended by | |
| 179 | +``obj:``. The value, representing the object itself, is a JSON object | |
| 180 | +whose structure is described below. | |
| 171 | 181 | |
| 172 | 182 | Top-level Stream Objects |
| 173 | 183 | Stream objects are represented as a JSON object with the single key |
| ... | ... | @@ -234,11 +244,11 @@ Object Values |
| 234 | 244 | JSON as ``"/text/plain"`` and in PDF as ``"/text#2fplain"``. |
| 235 | 245 | |
| 236 | 246 | - Indirect object references are represented as JSON strings that |
| 237 | - look like a PDF indirect object reference and have the form ``"O G | |
| 238 | - R"`` where ``O`` and ``G`` are the object and generation numbers | |
| 239 | - and ``R`` is the literal string ``R``. For example, ``"3 0 R"`` | |
| 240 | - would represent a reference to the object with object ID 3 and | |
| 241 | - generation 0. | |
| 247 | + look like a PDF indirect object reference and have the form | |
| 248 | + :samp:`"{O} {G} R"` where :samp:`{O}` and :samp:`{G}` are the | |
| 249 | + object and generation numbers and ``R`` is the literal string | |
| 250 | + ``R``. For example, ``"3 0 R"`` would represent a reference to the | |
| 251 | + object with object ID 3 and generation 0. | |
| 242 | 252 | |
| 243 | 253 | - PDF strings are represented as JSON strings in one of two ways: |
| 244 | 254 | |
| ... | ... | @@ -288,11 +298,11 @@ Object Values |
| 288 | 298 | |
| 289 | 299 | Note that writing JSON output is done by ``QPDF``, not ``QPDFWriter``. |
| 290 | 300 | As such, none of the things ``QPDFWriter`` does apply. This includes |
| 291 | -recompression of streams, renumbering of objects, anything to do with | |
| 292 | -object streams (which are not represented by qpdf JSON at all since | |
| 293 | -they are PDF syntax, not semantics), encryption, decryption, | |
| 294 | -linearization, QDF mode, etc. See :ref:`rewriting` for a more in-depth | |
| 295 | -discussion. | |
| 301 | +recompression of streams, renumbering of objects, removal of | |
| 302 | +unreferenced objects, anything to do with object streams (which are | |
| 303 | +not represented by qpdf JSON at all since they are PDF syntax, not | |
| 304 | +semantics), encryption, decryption, linearization, QDF mode, etc. See | |
| 305 | +:ref:`rewriting` for a more in-depth discussion. | |
| 296 | 306 | |
| 297 | 307 | .. _json.example: |
| 298 | 308 | |
| ... | ... | @@ -311,36 +321,55 @@ qpdf JSON format. |
| 311 | 321 | "pdfversion": "1.3", |
| 312 | 322 | "pushedinheritedpageresources": false, |
| 313 | 323 | "calledgetallpages": false, |
| 314 | - "maxobjectid": 5 | |
| 324 | + "maxobjectid": 6 | |
| 315 | 325 | }, |
| 316 | 326 | { |
| 317 | 327 | "obj:1 0 R": { |
| 318 | 328 | "value": { |
| 319 | - "/Pages": "2 0 R", | |
| 329 | + "/Pages": "3 0 R", | |
| 320 | 330 | "/Type": "/Catalog" |
| 321 | 331 | } |
| 322 | 332 | }, |
| 323 | 333 | "obj:2 0 R": { |
| 324 | 334 | "value": { |
| 335 | + "/Author": "u:Digits of π", | |
| 336 | + "/CreationDate": "u:D:20220731155308-05'00'", | |
| 337 | + "/Creator": "u:A person typing in Emacs", | |
| 338 | + "/Keywords": "u:potato, example", | |
| 339 | + "/ModDate": "u:D:20220731155308-05'00'", | |
| 340 | + "/Producer": "u:qpdf", | |
| 341 | + "/Subject": "u:Example", | |
| 342 | + "/Title": "u:Something potato-related" | |
| 343 | + } | |
| 344 | + }, | |
| 345 | + "obj:3 0 R": { | |
| 346 | + "value": { | |
| 325 | 347 | "/Count": 1, |
| 326 | - "/Kids": [ "3 0 R" ], | |
| 348 | + "/Kids": [ | |
| 349 | + "4 0 R" | |
| 350 | + ], | |
| 327 | 351 | "/Type": "/Pages" |
| 328 | 352 | } |
| 329 | 353 | }, |
| 330 | - "obj:3 0 R": { | |
| 354 | + "obj:4 0 R": { | |
| 331 | 355 | "value": { |
| 332 | - "/Contents": "4 0 R", | |
| 333 | - "/MediaBox": [ 0, 0, 612, 792 ], | |
| 334 | - "/Parent": "2 0 R", | |
| 356 | + "/Contents": "5 0 R", | |
| 357 | + "/MediaBox": [ | |
| 358 | + 0, | |
| 359 | + 0, | |
| 360 | + 612, | |
| 361 | + 792 | |
| 362 | + ], | |
| 363 | + "/Parent": "3 0 R", | |
| 335 | 364 | "/Resources": { |
| 336 | 365 | "/Font": { |
| 337 | - "/F1": "5 0 R" | |
| 366 | + "/F1": "6 0 R" | |
| 338 | 367 | } |
| 339 | 368 | }, |
| 340 | 369 | "/Type": "/Page" |
| 341 | 370 | } |
| 342 | 371 | }, |
| 343 | - "obj:4 0 R": { | |
| 372 | + "obj:5 0 R": { | |
| 344 | 373 | "stream": { |
| 345 | 374 | "data": "eJxzCuFSUNB3M1QwMlEISQOyzY2AyEAhJAXI1gjIL0ksyddUCMnicg3hAgDLAQnI", |
| 346 | 375 | "dict": { |
| ... | ... | @@ -348,7 +377,7 @@ qpdf JSON format. |
| 348 | 377 | } |
| 349 | 378 | } |
| 350 | 379 | }, |
| 351 | - "obj:5 0 R": { | |
| 380 | + "obj:6 0 R": { | |
| 352 | 381 | "value": { |
| 353 | 382 | "/BaseFont": "/Helvetica", |
| 354 | 383 | "/Encoding": "/WinAnsiEncoding", |
| ... | ... | @@ -360,10 +389,11 @@ qpdf JSON format. |
| 360 | 389 | "value": { |
| 361 | 390 | "/ID": [ |
| 362 | 391 | "b:98b5a26966fba4d3a769b715b2558da6", |
| 363 | - "b:98b5a26966fba4d3a769b715b2558da6" | |
| 392 | + "b:6bea23330e0b9ff0ddb47b6757fb002e" | |
| 364 | 393 | ], |
| 394 | + "/Info": "2 0 R", | |
| 365 | 395 | "/Root": "1 0 R", |
| 366 | - "/Size": 6 | |
| 396 | + "/Size": 7 | |
| 367 | 397 | } |
| 368 | 398 | } |
| 369 | 399 | } |
| ... | ... | @@ -410,9 +440,6 @@ Here are some important things to know about qpdf JSON input. |
| 410 | 440 | - ``"maxobjectid"`` is ignored, so it is not necessary to update it |
| 411 | 441 | when adding new objects. |
| 412 | 442 | |
| 413 | - - ``"calledgetallpages"`` and ``"pushedinheritedpageresources"`` are | |
| 414 | - treated as false if omitted. | |
| 415 | - | |
| 416 | 443 | - ``"/Length"`` is ignored in all stream dictionaries. qpdf doesn't |
| 417 | 444 | put it there when it creates JSON output, and it is not necessary |
| 418 | 445 | to add it. |
| ... | ... | @@ -420,16 +447,24 @@ Here are some important things to know about qpdf JSON input. |
| 420 | 447 | - ``"/Size"`` is ignored if it appears in a trailer dictionary as |
| 421 | 448 | that is always recomputed by ``QPDFWriter``. |
| 422 | 449 | |
| 423 | - - Unknown keys at the to top level of the file, within ``objects``, | |
| 424 | - at the top level of each individual object (inside the object that | |
| 425 | - has the ``"value"`` or ``"stream"`` key) and directly within | |
| 426 | - ``"stream"`` are ignored for future compatibility. This includes | |
| 427 | - other top-level keys generated by ``qpdf`` itself (such as | |
| 428 | - ``"pages"``). As such, those keys don't have to be consistent with | |
| 429 | - the ``"qpdf"`` key if modifying a JSON file for conversion back to | |
| 430 | - PDF. If you wish to store application-specific metadata, you can | |
| 431 | - do so by adding a key whose name starts with ``x-``. qpdf is | |
| 432 | - guaranteed not to add any of its own keys that starts with ``x-``. | |
| 450 | + - Unknown keys at the top level of the file, within ``"qpdf"``, and | |
| 451 | + at the top level of each individual PDF object (inside the | |
| 452 | + dictionary that has the ``"value"`` or ``"stream"`` key) and | |
| 453 | + directly within ``"stream"`` are ignored for future compatibility. | |
| 454 | + This includes other top-level keys generated by ``qpdf`` itself | |
| 455 | + (such as ``"pages"``). As such, those keys don't have to be | |
| 456 | + consistent with the ``"qpdf"`` key if modifying a JSON file for | |
| 457 | + conversion back to PDF. If you wish to store application-specific | |
| 458 | + metadata, you can do so by adding a key whose name starts with | |
| 459 | + ``x-``. qpdf is guaranteed not to add any of its own keys that | |
| 460 | + start with ``x-``. Note that any ``"version"`` key at the top | |
| 461 | + level is ignored. The JSON version is obtained from the | |
| 462 | + ``"jsonversion"`` key of the first element of the ``"qpdf"`` | |
| 463 | + field. | |
| 464 | + | |
| 465 | +- The values of ``"calledgetallpages"`` and | |
| 466 | + ``"pushedinheritedpageresources"`` are ignored when creating a file. | |
| 467 | + When updating a file, they are treated as ``false`` if omitted. | |
| 433 | 468 | |
| 434 | 469 | - When qpdf reads a PDF file, the internal object numbers are always |
| 435 | 470 | preserved. However, when qpdf writes a file using ``QPDFWriter``, |
| ... | ... | @@ -465,14 +500,14 @@ Here are some important things to know about qpdf JSON input. |
| 465 | 500 | ``QPDF::updateFromJSON``), existing objects are updated in place. |
| 466 | 501 | This has the following implications: |
| 467 | 502 | |
| 468 | - - You may omit both ``"data"`` and ``"datafile"`` if the object you | |
| 469 | - are updating is already a stream. In that case the original stream | |
| 503 | + - If the object you are updating is a stream, you may omit both | |
| 504 | + ``"data"`` and ``"datafile"``. In that case the original stream | |
| 470 | 505 | data is preserved. You must always provide a stream dictionary, |
| 471 | 506 | but it may be empty. Note that an empty stream dictionary will |
| 472 | 507 | clear the old dictionary. There is no way to indicate that an old |
| 473 | 508 | stream dictionary should be left alone, so if your intention is to |
| 474 | - replace the stream data and preserve the dictionary, the | |
| 475 | - original dictionary must appear in the JSON file. | |
| 509 | + replace the stream data and preserve the dictionary, the original | |
| 510 | + dictionary must appear in the JSON file. | |
| 476 | 511 | |
| 477 | 512 | - You can change one object type to another object type including |
| 478 | 513 | replacing a stream with a non-stream or a non-stream with a |
| ... | ... | @@ -577,11 +612,14 @@ Compatibility |
| 577 | 612 | change would be any change that involves removal of a key, a change |
| 578 | 613 | to the format of data pointed to by a key, or a semantic change |
| 579 | 614 | that requires a different interpretation of a previously existing |
| 580 | - key. | |
| 615 | + key. Note that, starting with version 2, the JSON version also | |
| 616 | + appears in the ``"jsonversion"`` field of the first element of | |
| 617 | + the ``"qpdf"`` field. | |
| 581 | 618 | |
| 582 | - With a specific qpdf JSON version, future versions of qpdf are free | |
| 583 | - to add additional keys but not to remove keys or change the type of | |
| 584 | - object that a key points to. | |
| 619 | + Within a specific qpdf JSON version, future versions of qpdf are | |
| 620 | + free to add additional keys but not to remove keys or change the | |
| 621 | + type of object that a key points to. That means that consumers of | |
| 622 | + qpdf JSON should ignore keys they don't know about. | |
| 585 | 623 | |
| 586 | 624 | Documentation |
| 587 | 625 | The :command:`qpdf` command can be invoked with the |
| ... | ... | @@ -634,7 +672,13 @@ Directness and Simplicity |
| 634 | 672 | functions in that it allows you to look at certain aspects of the |
| 635 | 673 | PDF file without having to understand all the nuances of the PDF |
| 636 | 674 | specification, while the raw objects allow you to mine the PDF for |
| 637 | - anything that the higher-level interfaces are lacking. | |
| 675 | + anything that the higher-level interfaces are lacking. It is | |
| 676 | + especially useful to create a JSON file with the ``"pages"`` and | |
| 677 | + ``"qpdf"`` keys and to use the ``"pages"`` information to find a | |
| 678 | + page rather than navigating the pages tree manually. This can be | |
| 679 | + done safely, and changes can be made to the objects dictionary without | |
| 680 | + worrying about keeping ``"pages"`` up to date since it is ignored | |
| 681 | + when reading the file back in. | |
| 638 | 682 | |
| 639 | 683 | .. _json.considerations: |
| 640 | 684 | |
| ... | ... | @@ -741,10 +785,11 @@ version 2. |
| 741 | 785 | dictionary within ``"objects"``, and the PDF version was not |
| 742 | 786 | captured at all. |
| 743 | 787 | |
| 744 | -- Within the objects dictionary, keys are now ``"obj:O G R"`` where | |
| 745 | - ``O`` and ``G`` are the object and generation number. ``"trailer"`` | |
| 746 | - remains the key for the trailer dictionary. In v1, the ``obj:`` | |
| 747 | - prefix was not present. The rationale for this change is as follows: | |
| 788 | +- Within the objects dictionary, keys are now :samp:`"obj:{O} {G} R"` | |
| 789 | + where :samp:`{O}` and :samp:`{G}` are the object and generation | |
| 790 | + numbers. ``"trailer"`` remains the key for the trailer dictionary. In | |
| 791 | + v1, the ``obj:`` prefix was not present. The rationale for this | |
| 792 | + change is as follows: | |
| 748 | 793 | |
| 749 | 794 | - Having a unique prefix (``obj:``) makes it much easier to search |
| 750 | 795 | in the JSON file for the definition of an object | ... | ... |
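
The json.rst changes in this commit describe the ``"qpdf"`` entry as a two-element array (a header dictionary followed by the objects dictionary), with object keys of the form ``"obj:O G R"`` and indirect references of the form ``"O G R"``. The following is a minimal Python sketch of how a consumer might read that layout. It is an illustration only, not part of qpdf: the helper names (`split_qpdf`, `parse_object_key`, `resolve`) are hypothetical, and the sample is a trimmed copy of the example in the diff, with ``"jsonversion": 2`` assumed from the prose.

```python
import json
import re

# Trimmed copy of the structure shown in the json.rst example.
# "jsonversion": 2 is an assumption based on the text, which says the
# JSON version appears in the first element of the "qpdf" array.
SAMPLE = """
{
  "qpdf": [
    {
      "jsonversion": 2,
      "pdfversion": "1.3",
      "pushedinheritedpageresources": false,
      "calledgetallpages": false,
      "maxobjectid": 6
    },
    {
      "obj:1 0 R": {"value": {"/Pages": "3 0 R", "/Type": "/Catalog"}},
      "obj:3 0 R": {"value": {"/Count": 1, "/Kids": ["4 0 R"], "/Type": "/Pages"}},
      "trailer": {"value": {"/Root": "1 0 R", "/Size": 7}}
    }
  ]
}
"""

# Keys in the objects dictionary: "obj:O G R"; references: "O G R".
OBJ_KEY = re.compile(r"^obj:(\d+) (\d+) R$")
REF = re.compile(r"^(\d+) (\d+) R$")


def split_qpdf(doc):
    """Return (header, objects) from the two-element "qpdf" array."""
    header, objects = doc["qpdf"]
    return header, objects


def parse_object_key(key):
    """Return (object, generation) for "obj:O G R" keys.

    Returns None for "trailer" and for any key this sketch does not
    recognize, in line with the advice to ignore unknown keys.
    """
    m = OBJ_KEY.match(key)
    return (int(m.group(1)), int(m.group(2))) if m else None


def resolve(objects, ref):
    """Look up the entry an indirect reference string such as "1 0 R" points to."""
    if REF.match(ref) is None:
        raise ValueError(f"not an indirect reference: {ref!r}")
    return objects["obj:" + ref]


header, objects = split_qpdf(json.loads(SAMPLE))
root = resolve(objects, objects["trailer"]["value"]["/Root"])
print(header["jsonversion"], root["value"]["/Type"])
```

Because the object key is just the reference string with ``obj:`` prepended, resolving a reference is a single dictionary lookup, which is one practical consequence of the key format introduced in v2.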