Commit b802ca47e9ec854bc7cd11311ccce0ffdbf5165e

Authored by Jay Berkenbilt
1 parent cdff7a49

Comments about incremental update support

Also remove some trivial, non-functional code.
Showing 2 changed files with 35 additions and 8 deletions
... ... @@ -64,6 +64,40 @@
64 64 General
65 65 =======
66 66  
  67 + * Provide support in QPDFWriter for writing incremental updates.
  68 + Provide support in qpdf for preserving incremental updates. The
  69 + goal should be that QDF mode should be fully functional for files
  70 + with incremental updates including fix_qdf.
  71 +
  72 + Note that there's nothing that says an indirect object in one
  73 + update can't refer to an object that doesn't appear until a later
  74 + update. This means that QPDF has to treat indirect null objects
  75 + differently from how it does now. QPDF drops indirect null objects
  76 + that appear as members of arrays or dictionaries. For arrays, it's
  77 + handled in QPDFWriter where we make indirect nulls direct. This is
  78 + in a single if block, and nothing else in the code cares about it.
  79 + We could just remove that if block and not break anything except a
  80 + few test cases that exercise the current behavior. For
  81 + dictionaries, it's more complicated. In this case,
  82 + QPDF_Dictionary::getKeys() ignores all keys with null values, and
  83 + hasKey() returns false for keys that have null values. We would
  84 + probably want to make QPDF_Dictionary able to handle the special
  85 + case of keys that are indirect nulls and basically never have it
  86 + drop any keys that are indirect objects.
  87 +
  88 + If we make a change to have qpdf preserve indirect references to
  89 + null objects, we have to note this in ChangeLog and in the release
  90 + notes since this will change output files. We did this before when
  91 + we stopped flattening scalar references, so this is probably not a
  92 + big deal. We also have to make sure that the testing for this
  93 + handles non-trivial cases of the targets of indirect nulls being
  94 + replaced by real objects in an update. I'm not sure how this plays
  95 + with linearization, if at all. For cases where incremental updates
  96 + are not being preserved as incremental updates and where the data
  97 + is being folded in (as is always the case with qpdf now), none of
  98 + this should make any difference in the actual semantics of the
  99 + files.
  100 +
67 101 * When decrypting files with /R=6, hash_V5 is called more than once
68 102 with the same inputs. Caching the results or refactoring to reduce
69 103 the number of identical calls could improve performance for
... ...
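Reviewer sketch (not part of the commit): the dictionary behavior described in the TODO note above can be illustrated with a small toy model. Everything here is hypothetical and for illustration only: `Value`, `ObjectTable`, `Dict`, `keysDroppingNulls`, and `keysKeepingIndirect` are not qpdf classes or functions, and the real `QPDF_Dictionary` may differ in detail. The sketch assumes only what the note states: keys whose values are null are currently hidden, and the proposed change would never drop keys whose values are indirect objects.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model only; these types are hypothetical and are not qpdf's classes.
struct Value {
    enum Kind { Null, Int, Ref } kind;
    int n; // integer value for Int, object id for Ref, unused for Null
};

// An object id that is absent from the table resolves to null.
typedef std::map<int, Value> ObjectTable;
typedef std::map<std::string, Value> Dict;

// True for a direct null, or a reference whose target is missing or null.
// Only one level of indirection is followed in this sketch.
bool resolvesToNull(Value const& v, ObjectTable const& objects)
{
    if (v.kind == Value::Null) return true;
    if (v.kind == Value::Ref) {
        ObjectTable::const_iterator it = objects.find(v.n);
        return it == objects.end() || it->second.kind == Value::Null;
    }
    return false;
}

// Behavior described in the note: keys with null values are hidden,
// including keys whose value is an indirect reference to null.
std::set<std::string> keysDroppingNulls(Dict const& d, ObjectTable const& objects)
{
    std::set<std::string> result;
    for (Dict::const_iterator it = d.begin(); it != d.end(); ++it)
        if (!resolvesToNull(it->second, objects)) result.insert(it->first);
    return result;
}

// Proposed behavior: hide direct nulls only; never drop an indirect value.
std::set<std::string> keysKeepingIndirect(Dict const& d)
{
    std::set<std::string> result;
    for (Dict::const_iterator it = d.begin(); it != d.end(); ++it)
        if (it->second.kind != Value::Null) result.insert(it->first);
    return result;
}

// Scenario from the note: /C refers to object 7, which only appears in a
// later update.
void demo()
{
    ObjectTable objects; // object 7 is not defined yet
    Dict d;
    d["/A"] = Value{Value::Int, 1};
    d["/B"] = Value{Value::Null, 0};
    d["/C"] = Value{Value::Ref, 7};

    assert(keysDroppingNulls(d, objects).count("/C") == 0); // reference lost
    assert(keysKeepingIndirect(d).count("/C") == 1);        // reference kept

    objects[7] = Value{Value::Int, 42}; // a later update defines object 7
    assert(keysDroppingNulls(d, objects).count("/C") == 1);
}
```

The point of the sketch is the first assertion in `demo()`: if the key is dropped while object 7 is still null, a writer that preserves incremental updates has nothing left to point at once a later update supplies the object. Keeping indirect values regardless of what they currently resolve to avoids that.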
libqpdf/QPDFWriter.cc
... ... @@ -980,10 +980,6 @@ QPDFWriter::enqueueObject(QPDFObjectHandle object)
980 980 " another file.");
981 981 }
982 982  
983   - if (object.isNull())
984   - {
985   - // This is a place-holder object for an object stream
986   - }
987 983 QPDFObjGen og = object.getObjGen();
988 984  
989 985 if (obj_renumber.count(og) == 0)
... ... @@ -2014,10 +2010,7 @@ QPDFWriter::prepareFileForWrite()
2014 2010 // Do a traversal of the entire PDF file structure replacing all
2015 2011 // indirect objects that QPDFWriter wants to be direct. This
2016 2012 // includes stream lengths, stream filtering parameters, and
2017   - // document extension level information. Also replace all
2018   - // indirect null references with direct nulls. This way, the only
2019   - // indirect nulls queued for output will be object stream place
2020   - // holders.
  2013 + // document extension level information.
2021 2014  
2022 2015 std::list<QPDFObjectHandle> queue;
2023 2016 queue.push_back(getTrimmedTrailer());