Commit b802ca47e9ec854bc7cd11311ccce0ffdbf5165e

Authored by Jay Berkenbilt
1 parent cdff7a49

Comments about incremental update support

Also remove some trivial, non-functional code.
Showing 2 changed files with 35 additions and 8 deletions
@@ -64,6 +64,40 @@
64 64 General
65 65 =======
66 66
  67 + * Provide support in QPDFWriter for writing incremental updates.
  68 + Provide support in qpdf for preserving incremental updates. The
  69 + goal should be that QDF mode should be fully functional for files
  70 + with incremental updates including fix_qdf.
  71 +
  72 + Note that there's nothing that says an indirect object in one
  73 + update can't refer to an object that doesn't appear until a later
  74 + update. This means that QPDF has to treat indirect null objects
  75 + differently from how it does now. QPDF drops indirect null objects
  76 + that appear as members of arrays or dictionaries. For arrays, it's
  77 + handled in QPDFWriter where we make indirect nulls direct. This is
  78 + in a single if block, and nothing else in the code cares about it.
  79 + We could just remove that if block and not break anything except a
  80 + few test cases that exercise the current behavior. For
  81 + dictionaries, it's more complicated. In this case,
  82 + QPDF_Dictionary::getKeys() ignores all keys with null values, and
  83 + hasKey() returns false for keys that have null values. We would
  84 + probably want to make QPDF_Dictionary able to handle the special
  85 + case of keys that are indirect nulls and basically never have it
  86 + drop any keys that are indirect objects.
  87 +
  88 + If we make a change to have qpdf preserve indirect references to
  89 + null objects, we have to note this in ChangeLog and in the release
  90 + notes since this will change output files. We did this before when
  91 + we stopped flattening scalar references, so this is probably not a
  92 + big deal. We also have to make sure that the testing for this
  93 + handles non-trivial cases of the targets of indirect nulls being
  94 + replaced by real objects in an update. I'm not sure how this plays
  95 + with linearization, if at all. For cases where incremental updates
  96 + are not being preserved as incremental updates and where the data
  97 + is being folded in (as is always the case with qpdf now), none of
  98 + this should make any difference in the actual semantics of the
  99 + files.
  100 +
67 101 * When decrypting files with /R=6, hash_V5 is called more than once
68 102 with the same inputs. Caching the results or refactoring to reduce
69 103 the number of identical calls could improve performance for
libqpdf/QPDFWriter.cc
@@ -980,10 +980,6 @@ QPDFWriter::enqueueObject(QPDFObjectHandle object)
980 980 " another file.");
981 981 }
982 982
983 - if (object.isNull())  
984 - {  
985 - // This is a place-holder object for an object stream  
986 - }  
987 983 QPDFObjGen og = object.getObjGen();
988 984
989 985 if (obj_renumber.count(og) == 0)
@@ -2014,10 +2010,7 @@ QPDFWriter::prepareFileForWrite()
2014 2010 // Do a traversal of the entire PDF file structure replacing all
2015 2011 // indirect objects that QPDFWriter wants to be direct. This
2016 2012 // includes stream lengths, stream filtering parameters, and
2017 - // document extension level information. Also replace all  
2018 - // indirect null references with direct nulls. This way, the only  
2019 - // indirect nulls queued for output will be object stream place  
2020 - // holders.
  2013 + // document extension level information.
2021 2014
2022 2015 std::list<QPDFObjectHandle> queue;
2023 2016 queue.push_back(getTrimmedTrailer());
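
The dictionary behavior described in the note added above (getKeys() skipping keys with null values, hasKey() returning false for them) and the proposed change (never dropping keys whose values are indirect objects) can be sketched roughly as follows. This is only an illustration under stated assumptions: Value, SketchDictionary, and the preserve flag are hypothetical names invented for the sketch and are not part of the qpdf API or of this commit.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for an object handle; NOT the qpdf API.
struct Value
{
    bool is_indirect; // stored as a reference such as "5 0 R"
    bool is_null;     // the value, once resolved, is null
};

// Hypothetical model of a dictionary that can hide null-valued keys.
class SketchDictionary
{
  public:
    // preserve == false models the behavior the note describes as
    // current; preserve == true models the proposed behavior.
    explicit SketchDictionary(bool preserve) :
        preserve_indirect_nulls(preserve)
    {
    }

    void replaceKey(std::string const& key, Value const& value)
    {
        this->items[key] = value;
    }

    // A key is reported only if its value is considered visible.
    bool hasKey(std::string const& key) const
    {
        std::map<std::string, Value>::const_iterator iter =
            this->items.find(key);
        return (iter != this->items.end()) && isVisible(iter->second);
    }

    std::set<std::string> getKeys() const
    {
        std::set<std::string> result;
        for (std::map<std::string, Value>::const_iterator iter =
                 this->items.begin();
             iter != this->items.end(); ++iter)
        {
            if (isVisible(iter->second))
            {
                result.insert(iter->first);
            }
        }
        return result;
    }

  private:
    // Non-null values are always visible. Null values are visible
    // only when they are indirect and preservation is enabled.
    bool isVisible(Value const& v) const
    {
        if (! v.is_null)
        {
            return true;
        }
        return this->preserve_indirect_nulls && v.is_indirect;
    }

    bool preserve_indirect_nulls;
    std::map<std::string, Value> items;
};
```

With preserve set to false, a key whose value is an indirect null is hidden just like a direct null. With preserve set to true, the indirect null key survives, so a later update that supplies the referenced object would still find the reference in place.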