Commit 37bface8a2984e299f2a7d7105f6876f77185939

Authored by Jay Berkenbilt
1 parent 55b00248

Brush up roadmap in TODO.md

Showing 1 changed file with 38 additions and 38 deletions
@@ -31,23 +31,18 @@ Next
 
 * Spell check: Have the spell-check script synchronize cSpell.json with .idea/dictionaries/qpdf.xml,
   which should be set to the union of all the validated user dictionaries.
-* Fix #874 -- make args in --encrypt to match the json and make positional fill in the gaps
-* Maybe fix #553 -- use file times for attachments
+* Maybe fix #553 -- use file times for attachments (trivial with C++-20)
 * std::string_view transition -- work being done by m-holger
-* Break ground on "Document-level work" -- TODO-pages.md lives on a separate branch.
-* Standard for CLI and Job JSON support for JSON-based command-line arguments. Come up with a
-  standard way of supporting command-line arguments that take JSON specifications of things so that
-  * there is a predictable way to indicate whether an argument is a file or a JSON blob
-  * with QPDFJob JSON, make sure it is possible to directly include the JSON rather than having to
-    stringify a JSON blob
-  * One option might be to prepend file:// to a filename or otherwise to take a JSON blob. We could
-    have that as a particular type of argument that would behave properly for both job JSON and CLI.
-* Support digital signatures. This probably requires support for incremental updates. See
-  "incremental updates" in rejected ideas. That description is out of date but would need to be
-  cleaned up. See also issue #22.
+* Support incremental updates. See "incremental updates" in [General](#general). See also issue #22.
 * Make it possible to see incremental updates in qdf mode.
 * Make it possible to add incremental updates.
 * We may want a writing mode that preserves object IDs. See #339.
+* Support digital signatures. This probably requires support for incremental updates. First, add
+  support for verifying digital signatures. Then we can consider adding support for signing
+  documents, though the ability to sign documents is less useful without an interactive process of
+  filling in a field. We may want to support only a subset of digital signature with invisible
+  signature fields or with existing fields.
+* Support public key security handler (Section 7.6.5.)
 
 Possible future JSON enhancements
 =================================
@@ -328,6 +323,36 @@ NOTE: Some items in this list refer to files in my personal home directory or th
 publicly accessible. This includes things sent to me by email that are specifically not public. Even
 so, I find it useful to make reference to them in this list.
 
+* Provide support in QPDFWriter for writing incremental updates. Provide support in qpdf for
+  preserving incremental updates. The goal should be that QDF mode should be fully functional for
+  files with incremental updates including fix_qdf. This will work best if original object IDs are
+  preserved when a file is written. We will also have to preserve generations, which are, I believe,
+  completely ignored by QPDFWriter. If an update adds an object with a higher generation, any
+  reference to the object with a lower generation resolves to the null object. Increasing the
+  generation represents reusing an object number, while keeping the generation the same is updating
+  an object. I think qpdf must handle generations correctly, but make sure to test this carefully.
+
+  Note that there's nothing that says an indirect object in one update can't refer to an object that
+  doesn't appear until a later update. This means that QPDF has to hang onto indirect nulls,
+  including when they appear as dictionary values. In this case, QPDF_Dictionary::getKeys() ignores
+  all keys with null values, and hasKey() returns false for keys that have null values. We would
+  probably want to make QPDF_Dictionary able to handle the special case of keys that are indirect
+  nulls and basically never have it drop any keys that are indirect objects. We also have to make
+  sure that the testing for this handles non-trivial cases of the targets of indirect nulls being
+  replaced by real objects in an update. Such indirect nulls should appear in tests as dictionary
+  values and as array values. In the distant past, qpdf used to replace indirect nulls with direct
+  nulls, but I think there are no longer any remnants of that behavior.
+
+  I'm not sure how this plays with linearization, if at all. For cases where incremental updates are
+  not being preserved as incremental updates and where the data is being folded in (as is always the
+  case with qpdf now), none of this should make any difference in the actual semantics of the files.
+
+  One thought about how to implement this would be to have a QPDF object that is an incremental
+  update to an underlying QPDF object. Objects would be resolved from the underlying QPDF if not
+  found in the main one. When you write this type of QPDF, it can either flatten or it can write as
+  incremental updates. Perhaps, in incremental mode, QPDF reads each increment as a separate QPDF
+  with this kind of layering.
+
 * Consider enabling code scanning on GitHub.
 
 * Add an option --ignore-encryption to ignore encryption information and treat encrypted files as if
@@ -567,31 +592,6 @@ Rejected Ideas
 
 * Investigate whether there is a way to automate the memory checker tests for Windows.
 
-* (This idea may be revised with alterations. Some of this is out of date.) Provide support in
-  QPDFWriter for writing incremental updates. Provide support in qpdf for preserving incremental
-  updates. The goal should be that QDF mode should be fully functional for files with incremental
-  updates including fix_qdf.
-
-  Note that there's nothing that says an indirect object in one update can't refer to an object that
-  doesn't appear until a later update. This means that QPDF has to treat indirect null objects
-  differently from how it does now. QPDF drops indirect null objects that appear as members of
-  arrays or dictionaries. For arrays, it's handled in QPDFWriter where we make indirect nulls
-  direct. This is in a single if block, and nothing else in the code cares about it. We could just
-  remove that if block and not break anything except a few test cases that exercise the current
-  behavior. For dictionaries, it's more complicated. In this case, QPDF_Dictionary::getKeys()
-  ignores all keys with null values, and hasKey() returns false for keys that have null values. We
-  would probably want to make QPDF_Dictionary able to handle the special case of keys that are
-  indirect nulls and basically never have it drop any keys that are indirect objects.
-
-  If we make a change to have qpdf preserve indirect references to null objects, we have to note
-  this in ChangeLog and in the release notes since this will change output files. We did this before
-  when we stopped flattening scalar references, so this is probably not a big deal. We also have to
-  make sure that the testing for this handles non-trivial cases of the targets of indirect nulls
-  being replaced by real objects in an update. I'm not sure how this plays with linearization, if at
-  all. For cases where incremental updates are not being preserved as incremental updates and where
-  the data is being folded in (as is always the case with qpdf now), none of this should make any
-  difference in the actual semantics of the files.
-
 * The second xref stream for linearized files has to be padded only because we need file_size as
   computed in pass 1 to be accurate. If we were not allowing writing to a pipe, we could seek back
   to the beginning and fill in the value of /L in the linearization dictionary as an optimization to