Commit ffe6af6f77036d9c725ce906df6020e4b5cac58d

Authored by Jay Berkenbilt
1 parent 96767fb1

Add comments explaining the foreign object copying code

These are the comments I would have liked to have been able to read
while fixing #449 and #478.
Showing 1 changed file with 44 additions and 3 deletions
libqpdf/QPDF.cc
... ... @@ -2253,9 +2253,50 @@ QPDF::replaceReserved(QPDFObjectHandle reserved,
2253 2253 QPDFObjectHandle
2254 2254 QPDF::copyForeignObject(QPDFObjectHandle foreign)
2255 2255 {
2256   - // Do not preclude use of copyForeignObject on page objects. It is
2257   - // a documented use case to copy pages this way if the intention
2258   - // is to not update the pages tree.
  2256 + // Here's an explanation of how foreign object copying works.
  2257 + //
  2258 + // A QPDFObjectHandle that is an indirect object has an owning
  2259 + // QPDF. The object ID and generation refer to an object in the
  2260 + // owning QPDF. When we copy the QPDFObjectHandle from a foreign
  2261 + // QPDF into the local QPDF, we have to replace all indirect
  2262 + // object references with references to the corresponding object
  2263 + // in the local file.
  2264 + //
  2265 + // To do this, we maintain mappings from foreign object IDs to
  2266 + // local object IDs for each foreign QPDF that we are copying
  2267 + // from. The mapping is stored in an ObjCopier, which contains a
  2268 + // mapping from the foreign ObjGen to the local QPDFObjectHandle.
  2269 + //
  2270 + // To copy, we do a deep traversal of the foreign object with loop
  2271 + // detection to discover all indirect objects that are
  2272 + // encountered, stopping at page boundaries. Whenever we encounter
  2273 + // an indirect object, we check to see if we have already created
  2274 + // a local copy of it. If not, we allocate a "reserved" object
  2275 + // (or, for a stream, just a new stream) and store in the map the
  2276 + // mapping from the foreign object ID to the new object. While we
  2277 + // do this, we keep a list of objects to copy.
  2278 + //
  2279 + // Once we are done with the traversal, we copy all the objects
  2280 + // that we need to copy. However, the copies will contain indirect
  2281 + // object IDs that refer to objects in the foreign file. We need
  2282 + // to replace them with references to objects in the local file.
  2283 + // This is what replaceForeignIndirectObjects does. Once we have
  2284 + // created a copy of the foreign object with all the indirect
  2285 + // references replaced with new ones in the local context, we can
  2286 + // replace the local reserved object with the copy. This mechanism
  2287 + // allows us to copy objects with circular references in any
  2288 + // order.
  2289 +
  2290 + // For streams, rather than copying the objects, we set up the
  2291 + // stream data to pull from the original stream by using a stream
  2292 + // data provider. This is done in a manner that doesn't require
  2293 + // the original QPDF object but may require the original source of
  2294 + // the stream data, with special handling for immediate_copy_from.
  2295 + // This logic is also in replaceForeignIndirectObjects.
  2296 +
  2297 + // Note that we explicitly allow use of copyForeignObject on page
  2298 + // objects. It is a documented use case to copy pages this way if
  2299 + // the intention is to not update the pages tree.
2259 2300 if (! foreign.isIndirect())
2260 2301 {
2261 2302 QTC::TC("qpdf", "QPDF copyForeign direct");
... ...
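
The reserve-then-replace scheme described in the new comments can be sketched with a toy model. This is not qpdf code: ToyObject, ToyFile, ToyCopier, reserve and copyForeign are hypothetical names invented for illustration, objects are reduced to a label plus a list of referenced IDs, and the sketch leaves out streams, generations, and stopping at page boundaries. It only aims to show why allocating placeholders first lets circular references be copied in any order.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy object: a label plus indirect references, stored as
// object IDs that are only meaningful inside the owning file.
struct ToyObject
{
    std::string label;
    std::vector<int> refs;
    bool reserved = false; // true while this is only a placeholder
};

// Hypothetical toy file: a table of objects keyed by object ID.
struct ToyFile
{
    std::map<int, ToyObject> objects;
    int next_id = 1;

    int add(ToyObject obj)
    {
        int id = next_id++;
        objects[id] = std::move(obj);
        return id;
    }
};

// Per-foreign-file state, loosely playing the role the comment assigns
// to ObjCopier: foreign ID -> local ID, plus the list of objects to copy.
struct ToyCopier
{
    std::map<int, int> foreign_to_local;
    std::vector<int> to_copy;
};

// Traversal pass: give every reachable foreign object a local placeholder.
// In this toy, the map doubles as loop detection, because an ID that
// already has a local counterpart is never visited again.
void reserve(ToyFile const& foreign, ToyFile& local, ToyCopier& copier, int foreign_id)
{
    if (copier.foreign_to_local.count(foreign_id)) {
        return;
    }
    ToyObject placeholder;
    placeholder.reserved = true;
    copier.foreign_to_local[foreign_id] = local.add(placeholder);
    copier.to_copy.push_back(foreign_id);
    for (int ref : foreign.objects.at(foreign_id).refs) {
        reserve(foreign, local, copier, ref);
    }
}

// Copy pass: copy each queued object, rewrite its references to local
// IDs, and overwrite the placeholder with the finished copy.
int copyForeign(ToyFile const& foreign, ToyFile& local, ToyCopier& copier, int foreign_id)
{
    reserve(foreign, local, copier, foreign_id);
    for (int fid : copier.to_copy) {
        ToyObject copy = foreign.objects.at(fid);
        for (int& ref : copy.refs) {
            ref = copier.foreign_to_local.at(ref);
        }
        local.objects[copier.foreign_to_local.at(fid)] = copy;
    }
    copier.to_copy.clear();
    return copier.foreign_to_local.at(foreign_id);
}
```

Because every reachable object already has a local ID before any contents are copied, a reference can be rewritten even when its target has not been filled in yet. Keeping the ToyCopier between calls means a later copy from the same foreign file reuses the earlier mappings instead of duplicating objects.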