Commit 3fe6a1f5e9810266c90a28b731f16f207b543ce3

Authored by Jay Berkenbilt
1 parent a3c890c0

TODO: update JSON and other changes

Showing 1 changed file with 80 additions and 19 deletions
... ... @@ -11,19 +11,10 @@ Before Release:
11 11 Next:
12 12 * output capture
13 13 * QPDFPagesTree -- avoid ever flattening the pages tree.
  14 +* JSON v2 fixes
14 15  
15 16 Pending changes:
16 17  
17   -* Think about whether additional JSON use cases would be served by
18   - having qpdf-v2 contain things other than "objects" or making qpdf
19   - --json include everything that --json-output includes. Right now, if
20   - you wanted to do something in json objects based on page
21   - information, you'd have to run qpdf --json and also qpdf
22   - --json-output separately. Also "qpdf-v2" doesn't follow the naming
23   - convention. See pinned email from m-holger with subject "qpdf
24   - json.rst" from June 5
25   - (04ad60e5-3274-4a9e-abde-3de97640d370@www.fastmail.com)
26   -* Good C API for json v2
27 18 * Check about runpath in the linux-bin distribution. I think the
28 19 appimage build specifically is setting the runpath, which is
29 20 actually desirable in this case. Make sure to understand and
... ... @@ -43,14 +34,10 @@ Pending changes:
43 34 reveal additional details, --show-encryption could potentially retry
44 35 with this option if the first time doesn't work. Then, with the file
45 36 open, we can read the encryption dictionary normally.
46   -* Nice to have:
47   - * In libtests, separate executables that need the object library
48   - from those that strictly use public API. Move as many of the test
49   - drivers from the qpdf directory into the latter category as long
50   - as doing so isn't too troublesome from a coverage standpoint.
51   - * Rework tests so that nothing is written into the source directory.
52   - Ideally then the entire build could be done with a read-only
53   - source tree.
  37 +* In libtests, separate executables that need the object library
  38 + from those that strictly use public API. Move as many of the test
  39 + drivers from the qpdf directory into the latter category as long
  40 + as doing so isn't too troublesome from a coverage standpoint.
54 41 * Consider adding fuzzer code for JSON
55 42  
56 43 Soon: Break ground on "Document-level work"
... ... @@ -128,6 +115,78 @@ sure /Count and /Parent are correct.
128 115 refs/attic/QPDFPagesTree-old -- original, abandoned branch -- clean up
129 116 when done.
130 117  
  118 +
  119 +JSON v2 fixes
  120 +=============
  121 +
  122 +* Get rid of separate format for --json and --json-output. Instead,
  123 + --json-output can just require an outfile and change some defaults
  124 + like which keys are present and json-stream-data. This makes it
  125 + easier to support use cases like being able to use information in
  126 + other top-level keys ("pages", "attachments", etc.) to drive
  127 + modifications made to objects without having to run qpdf twice. I
  128 + think --json-output should make the default key be only "qpdf" and
  129 + the default json-stream-data mode be inline, but make it so you can
  130 + use --json-stream-data and --json-stream-prefix with --json and
  131 + --json-keys with --json-output. These would be exactly the same:
  132 +
  133 + --json-output --json-keys=all -
  134 + --json --json-stream-data=inline
  135 +
  136 + And these:
  137 +
  138 + --json-output -
  139 + --json --json-stream-data=inline --json-key=qpdf
  140 +
  141 +* Change the name of the "qpdf-v2" key to "qpdf". Use that in place of
  142 + "objects" and change its content to a two-element array whose first
  143 + element is metadata required (or useful) for parsing and whose
  144 + second element contains the actual data. Use of an array is the only
  145 + way to ensure that the metadata is guaranteed to be parsed before we
  146 + start parsing the objects. Example:
  147 +
  148 +   {
  149 +     "qpdf": [
  150 +       {
  151 +         "jsonversion": 2,
  152 +         "repairpagestree": false,
  153 +         "maxobjectid": 10
  154 +       },
  155 +       {
  156 +         "pdfversion": "1.3",
  157 +         "objects": {
  158 +           ...
  159 +         }
  160 +       }
  161 +     ]
  162 +   }
  163 +
  164 + This implies a few things:
  165 +
  166 + * QPDF::writeJSON will have to take an argument indicating whether
  167 + additional keys are being written which determines whether it
  168 + outputs the outer braces or not.
  169 +
  170 + * This changes the policy about additional extra keys. Have a
  171 + guarantee that qpdf will never add a key whose name is or starts
  172 + with "xdata". We still have to ignore unknown keys for future
  173 + compatibility, but at least this gives people a namespace they can
  174 + know will never conflict with future keys.
  175 +
  176 + * Change schema validation so that if the schema contains an array
  177 + with more than one element, the output has to have an array with
  178 + the same number of elements whose individual elements are
  179 + validated according to the regular rules.
  180 +
  181 +* Support json v2 in the C API. At a minimum, write_json,
  182 + create_from_json, and update_from_json need to be there and should
  183 + take the same kinds of functions as the C API for logger.
  184 +
  185 +* Address json.rst comment from m-holger: "The discussion of stream
  186 + objects is very wordy. Would a table similar to the style of the PDF
  187 + spec be easier to use?"
  188 +
  189 +
131 190 Possible future JSON enhancements
132 191 =================================
133 192  
... ... @@ -439,7 +498,9 @@ I find it useful to make reference to them in this list.
439 498  
440 499 * Look at https://bestpractices.coreinfrastructure.org/en
441 500  
442   - * Get rid of remaining assert() calls from non-test code.
  501 + * Rework tests so that nothing is written into the source directory.
  502 + Ideally then the entire build could be done with a read-only
  503 + source tree.
443 504  
444 505 * Large file tests fail with linux32 before and after cmake. This was
445 506 first noticed after 10.6.3. I don't think it's worth fixing.
... ...
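
The schema-validation change proposed under "JSON v2 fixes" (a schema array with more than one element requires an output array of the same length, with each element validated against its counterpart) might be sketched roughly as below. This is an illustration in Python, not qpdf's actual C++ validator: the function name `validate`, the `SCHEMA` constant, the error strings, and the treatment of single-element schema arrays as "every item matches this element" are all assumptions made for the example. Unknown keys are ignored, in line with the note about future compatibility; missing keys are not reported in this sketch.

```python
# Hypothetical sketch of the proposed multi-element array rule.
# Names and error messages are illustrative, not taken from qpdf.

def validate(schema, value, path="$"):
    """Return a list of error strings; an empty list means no problems found."""
    errors = []
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for key, sub_schema in schema.items():
            # Only keys that are present are checked; unknown keys are ignored.
            if key in value:
                errors += validate(sub_schema, value[key], f"{path}.{key}")
    elif isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        if len(schema) > 1:
            # Proposed rule: same number of elements, validated pairwise.
            if len(value) != len(schema):
                return [f"{path}: expected {len(schema)} elements, "
                        f"got {len(value)}"]
            for i, (sub_schema, item) in enumerate(zip(schema, value)):
                errors += validate(sub_schema, item, f"{path}[{i}]")
        elif len(schema) == 1:
            # Assumed rule: every item must match the single schema element.
            for i, item in enumerate(value):
                errors += validate(schema[0], item, f"{path}[{i}]")
    # Any other schema value is treated as a description and accepts anything.
    return errors


# Schema mirroring the two-element "qpdf" array from the example in the diff.
SCHEMA = {
    "qpdf": [
        {"jsonversion": "int", "repairpagestree": "bool", "maxobjectid": "int"},
        {"pdfversion": "string", "objects": {}},
    ]
}

GOOD = {
    "qpdf": [
        {"jsonversion": 2, "repairpagestree": False, "maxobjectid": 10},
        {"pdfversion": "1.3", "objects": {}},
    ]
}
```

With these definitions, `validate(SCHEMA, GOOD)` returns an empty list, while a document whose "qpdf" array has only one element produces a single length-mismatch error.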