Commit 37bface8a2984e299f2a7d7105f6876f77185939

Authored by Jay Berkenbilt
1 parent 55b00248

Brush up roadmap in TODO.md

Showing 1 changed file with 38 additions and 38 deletions
... ... @@ -31,23 +31,18 @@ Next
31 31  
32 32 * Spell check: Have the spell-check script synchronize cSpell.json with .idea/dictionaries/qpdf.xml,
33 33 which should be set to the union of all the validated user dictionaries.
34   -* Fix #874 -- make args in --encrypt to match the json and make positional fill in the gaps
35   -* Maybe fix #553 -- use file times for attachments
  34 +* Maybe fix #553 -- use file times for attachments (trivial with C++20)
36 35 * std::string_view transition -- work being done by m-holger
37   -* Break ground on "Document-level work" -- TODO-pages.md lives on a separate branch.
38   -* Standard for CLI and Job JSON support for JSON-based command-line arguments. Come up with a
39   - standard way of supporting command-line arguments that take JSON specifications of things so that
40   - * there is a predictable way to indicate whether an argument is a file or a JSON blob
41   - * with QPDFJob JSON, make sure it is possible to directly include the JSON rather than having to
42   - stringify a JSON blob
43   - * One option might be to prepend file:// to a filename or otherwise to take a JSON blob. We could
44   - have that as a particular type of argument that would behave properly for both job JSON and CLI.
45   -* Support digital signatures. This probably requires support for incremental updates. See
46   - "incremental updates" in rejected ideas. That description is out of date but would need to be
47   - cleaned up. See also issue #22.
  36 +* Support incremental updates. See "incremental updates" in [General](#general). See also issue #22.
48 37 * Make it possible to see incremental updates in qdf mode.
49 38 * Make it possible to add incremental updates.
50 39 * We may want a writing mode that preserves object IDs. See #339.
  40 +* Support digital signatures. This probably requires support for incremental updates. First, add
  41 + support for verifying digital signatures. Then we can consider adding support for signing
  42 + documents, though the ability to sign documents is less useful without an interactive process of
  43 + filling in a field. We may want to support only a subset of digital signatures: those with
  44 + invisible signature fields or with existing fields.
  45 +* Support public key security handler (Section 7.6.5.)
51 46  
52 47 Possible future JSON enhancements
53 48 =================================
... ... @@ -328,6 +323,36 @@ NOTE: Some items in this list refer to files in my personal home directory or th
328 323 publicly accessible. This includes things sent to me by email that are specifically not public. Even
329 324 so, I find it useful to make reference to them in this list.
330 325  
  326 +* Provide support in QPDFWriter for writing incremental updates. Provide support in qpdf for
  327 + preserving incremental updates. The goal is for QDF mode to be fully functional for files with
  328 + incremental updates, including fix_qdf. This will work best if original object IDs are
  329 + preserved when a file is written. We will also have to preserve generations, which are, I believe,
  330 + completely ignored by QPDFWriter. If an update adds an object with a higher generation, any
  331 + reference to the object with a lower generation resolves to the null object. Increasing the
  332 + generation represents reusing an object number, while keeping the generation the same is updating
  333 + an object. I think qpdf must handle generations correctly, but make sure to test this carefully.
  334 +
  335 + Note that there's nothing that says an indirect object in one update can't refer to an object that
  336 + doesn't appear until a later update. This means that QPDF has to hang onto indirect nulls,
  337 + including when they appear as dictionary values. Currently, QPDF_Dictionary::getKeys() ignores
  338 + all keys with null values, and hasKey() returns false for keys that have null values. We would
  339 + probably want to make QPDF_Dictionary able to handle the special case of keys that are indirect
  340 + nulls and basically never have it drop any keys that are indirect objects. We also have to make
  341 + sure that the testing for this handles non-trivial cases of the targets of indirect nulls being
  342 + replaced by real objects in an update. Such indirect nulls should appear in tests as dictionary
  343 + values and as array values. In the distant past, qpdf used to replace indirect nulls with direct
  344 + nulls, but I think there are no longer any remnants of that behavior.
  345 +
  346 + I'm not sure how this plays with linearization, if at all. For cases where incremental updates are
  347 + not being preserved as incremental updates and where the data is being folded in (as is always the
  348 + case with qpdf now), none of this should make any difference in the actual semantics of the files.
  349 +
  350 + One thought about how to implement this would be to have a QPDF object that is an incremental
  351 + update to an underlying QPDF object. Objects would be resolved from the underlying QPDF if not
  352 + found in the main one. When you write this type of QPDF, it can either flatten or it can write as
  353 + incremental updates. Perhaps, in incremental mode, QPDF reads each increment as a separate QPDF
  354 + with this kind of layering.
  355 +
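The layering idea in the item above, together with its description of generations, can be illustrated with a small model. This is only a sketch under stated assumptions, not qpdf code: `ObjGen`, `Layer`, `LayeredDocument`, `resolve`, and `flatten` are hypothetical names, object values are reduced to strings, and a null object is represented by a null pointer. It assumes that a higher generation for an object number found in any layer means the number has been reused.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model, not qpdf's API: (object number, generation).
using ObjGen = std::pair<int, int>;

// One increment: the objects it defines or redefines.
struct Layer
{
    std::map<ObjGen, std::string> objects;
};

class LayeredDocument
{
  public:
    // increments[0] is the original file; each later entry is one update.
    explicit LayeredDocument(std::vector<Layer> increments) :
        layers(std::move(increments))
    {
        assert(!layers.empty());
    }

    // Returns nullptr for the null object.
    std::string const*
    resolve(ObjGen og) const
    {
        // A higher generation for the same object number means the number
        // was reused, so the older reference resolves to null.
        for (auto const& layer : layers) {
            auto it = layer.objects.upper_bound(og);
            if (it != layer.objects.end() && it->first.first == og.first) {
                return nullptr;
            }
        }
        // Same generation: the newest layer that defines the object wins.
        for (auto layer = layers.rbegin(); layer != layers.rend(); ++layer) {
            auto found = layer->objects.find(og);
            if (found != layer->objects.end()) {
                return &found->second;
            }
        }
        // Never defined: also null.
        return nullptr;
    }

    // Fold all layers into one table, as if writing without increments.
    std::map<ObjGen, std::string>
    flatten() const
    {
        std::map<ObjGen, std::string> merged;
        for (auto const& layer : layers) { // oldest to newest
            for (auto const& entry : layer.objects) {
                merged[entry.first] = entry.second;
            }
        }
        // Keys are ordered by (number, generation), so for each object
        // number only the last entry, the highest generation, is kept.
        std::map<ObjGen, std::string> live;
        for (auto const& entry : merged) {
            if (!live.empty() && live.rbegin()->first.first == entry.first.first) {
                live.erase(std::prev(live.end()));
            }
            live.insert(entry);
        }
        return live;
    }

  private:
    std::vector<Layer> layers;
};
```

In this model, an update that redefines `2 0` replaces the earlier value, while an update that introduces `3 1` makes any reference to `3 0` resolve to null. Writing in incremental mode would emit each `Layer` separately, and writing in flattened mode would emit the result of `flatten()`.
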
331 356 * Consider enabling code scanning on GitHub.
332 357  
333 358 * Add an option --ignore-encryption to ignore encryption information and treat encrypted files as if
... ... @@ -567,31 +592,6 @@ Rejected Ideas
567 592  
568 593 * Investigate whether there is a way to automate the memory checker tests for Windows.
569 594  
570   -* (This idea may be revised with alterations. Some of this is out of date.) Provide support in
571   - QPDFWriter for writing incremental updates. Provide support in qpdf for preserving incremental
572   - updates. The goal should be that QDF mode should be fully functional for files with incremental
573   - updates including fix_qdf.
574   -
575   - Note that there's nothing that says an indirect object in one update can't refer to an object that
576   - doesn't appear until a later update. This means that QPDF has to treat indirect null objects
577   - differently from how it does now. QPDF drops indirect null objects that appear as members of
578   - arrays or dictionaries. For arrays, it's handled in QPDFWriter where we make indirect nulls
579   - direct. This is in a single if block, and nothing else in the code cares about it. We could just
580   - remove that if block and not break anything except a few test cases that exercise the current
581   - behavior. For dictionaries, it's more complicated. In this case, QPDF_Dictionary::getKeys()
582   - ignores all keys with null values, and hasKey() returns false for keys that have null values. We
583   - would probably want to make QPDF_Dictionary able to handle the special case of keys that are
584   - indirect nulls and basically never have it drop any keys that are indirect objects.
585   -
586   - If we make a change to have qpdf preserve indirect references to null objects, we have to note
587   - this in ChangeLog and in the release notes since this will change output files. We did this before
588   - when we stopped flattening scalar references, so this is probably not a big deal. We also have to
589   - make sure that the testing for this handles non-trivial cases of the targets of indirect nulls
590   - being replaced by real objects in an update. I'm not sure how this plays with linearization, if at
591   - all. For cases where incremental updates are not being preserved as incremental updates and where
592   - the data is being folded in (as is always the case with qpdf now), none of this should make any
593   - difference in the actual semantics of the files.
594   -
595 595 * The second xref stream for linearized files has to be padded only because we need file_size as
596 596 computed in pass 1 to be accurate. If we were not allowing writing to a pipe, we could seek back
597 597 to the beginning and fill in the value of /L in the linearization dictionary as an optimization to
... ...
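
The dictionary behavior described in the incremental-updates item (keys whose values are null are hidden by `getKeys()` and `hasKey()`, with a proposal to never drop keys that are indirect objects) can be modeled as follows. This is a toy sketch, not qpdf's `QPDF_Dictionary`: the `Value` struct, the `Dictionary` class, and the `keep_indirect` switch are assumptions made for illustration, and the method names merely mirror the ones mentioned in the text.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a dictionary value; not a qpdf type.
struct Value
{
    bool is_null;     // true if the value, after resolving references, is null
    bool is_indirect; // true if the value is an indirect reference such as "7 0 R"
};

class Dictionary
{
  public:
    // keep_indirect == false models the behavior described as current;
    // keep_indirect == true models the proposed handling.
    explicit Dictionary(bool keep_indirect) :
        keep_indirect_nulls(keep_indirect)
    {
    }

    void
    replaceKey(std::string const& key, Value v)
    {
        items[key] = v;
    }

    bool
    hasKey(std::string const& key) const
    {
        auto it = items.find(key);
        return it != items.end() && visible(it->second);
    }

    std::set<std::string>
    getKeys() const
    {
        std::set<std::string> result;
        for (auto const& item : items) {
            if (visible(item.second)) {
                result.insert(item.first);
            }
        }
        return result;
    }

  private:
    // Direct nulls are always hidden. Indirect nulls are hidden only in
    // the current-behavior mode, because a later update may supply the
    // referenced object.
    bool
    visible(Value const& v) const
    {
        if (!v.is_null) {
            return true;
        }
        return keep_indirect_nulls && v.is_indirect;
    }

    bool keep_indirect_nulls;
    std::map<std::string, Value> items;
};
```

With `keep_indirect` set to true, a key whose value is an indirect reference to a not-yet-defined object stays visible, which is what preserving incremental updates would need. A key with a direct null value is hidden in both modes.
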