Commit ffe6af6f77036d9c725ce906df6020e4b5cac58d
1 parent: 96767fb1
Add comments explaining the foreign object copying code
These are the comments I would have liked to have been able to read while fixing #449 and #478.
Showing 1 changed file with 44 additions and 3 deletions
libqpdf/QPDF.cc
| @@ -2253,9 +2253,50 @@ QPDF::replaceReserved(QPDFObjectHandle reserved, | @@ -2253,9 +2253,50 @@ QPDF::replaceReserved(QPDFObjectHandle reserved, | ||
| 2253 | QPDFObjectHandle | 2253 | QPDFObjectHandle |
| 2254 | QPDF::copyForeignObject(QPDFObjectHandle foreign) | 2254 | QPDF::copyForeignObject(QPDFObjectHandle foreign) |
| 2255 | { | 2255 | { |
| 2256 | - // Do not preclude use of copyForeignObject on page objects. It is | ||
| 2257 | - // a documented use case to copy pages this way if the intention | ||
| 2258 | - // is to not update the pages tree. | 2256 | + // Here's an explanation of what's going on here. |
| 2257 | + // | ||
| 2258 | + // A QPDFObjectHandle that is an indirect object has an owning | ||
| 2259 | + // QPDF. The object ID and generation refers to an object in the | ||
| 2260 | + // owning QPDF. When we copy the QPDFObjectHandle from a foreign | ||
| 2261 | + // QPDF into the local QPDF, we have to replace all indirect | ||
| 2262 | + // object references with references to the corresponding object | ||
| 2263 | + // in the local file. | ||
| 2264 | + // | ||
| 2265 | + // To do this, we maintain mappings from foreign object IDs to | ||
| 2266 | + // local object IDs for each foreign QPDF that we are copying | ||
| 2267 | + // from. The mapping is stored in an ObjCopier, which contains a | ||
| 2268 | + // mapping from the foreign ObjGen to the local QPDFObjectHandle. | ||
| 2269 | + // | ||
| 2270 | + // To copy, we do a deep traversal of the foreign object with loop | ||
| 2271 | + // detection to discover all indirect objects that are | ||
| 2272 | + // encountered, stopping at page boundaries. Whenever we encounter | ||
| 2273 | + // an indirect object, we check to see if we have already created | ||
| 2274 | + // a local copy of it. If not, we allocate a "reserved" object | ||
| 2275 | + // (or, for a stream, just a new stream) and store in the map the | ||
| 2276 | + // mapping from the foreign object ID to the new object. While we | ||
| 2277 | + // do this, we keep a list of objects to copy. | ||
| 2278 | + // | ||
| 2279 | + // Once we are done with the traversal, we copy all the objects | ||
| 2280 | + // that we need to copy. However, the copies will contain indirect | ||
| 2281 | + // object IDs that refer to objects in the foreign file. We need | ||
| 2282 | + // to replace them with references to objects in the local file. | ||
| 2283 | + // This is what replaceForeignIndirectObjects does. Once we have | ||
| 2284 | + // created a copy of the foreign object with all the indirect | ||
| 2285 | + // references replaced with new ones in the local context, we can | ||
| 2286 | + // replace the local reserved object with the copy. This mechanism | ||
| 2287 | + // allows us to copy objects with circular references in any | ||
| 2288 | + // order. | ||
| 2289 | + | ||
| 2290 | + // For streams, rather than copying the objects, we set up the | ||
| 2291 | + // stream data to pull from the original stream by using a stream | ||
| 2292 | + // data provider. This is done in a manner that doesn't require | ||
| 2293 | + // the original QPDF object but may require the original source of | ||
| 2294 | + // the stream data with special handling for immediate_copy_from. | ||
| 2295 | + // This logic is also in replaceForeignIndirectObjects. | ||
| 2296 | + | ||
| 2297 | + // Note that we explicitly allow use of copyForeignObject on page | ||
| 2298 | + // objects. It is a documented use case to copy pages this way if | ||
| 2299 | + // the intention is to not update the pages tree. | ||
| 2259 | if (! foreign.isIndirect()) | 2300 | if (! foreign.isIndirect()) |
| 2260 | { | 2301 | { |
| 2261 | QTC::TC("qpdf", "QPDF copyForeign direct"); | 2302 | QTC::TC("qpdf", "QPDF copyForeign direct"); |