Commit e7b8f297ba92f4cadf88efcb394830dc24d54738

Authored by Jay Berkenbilt
1 parent 8a217eb3

Support copying objects from another QPDF object

This includes QPDF::copyForeignObject and supporting foreign objects
as arguments to addPage*.
ChangeLog
  1 +2012-07-11 Jay Berkenbilt <ejb@ql.org>
  2 +
  3 + * Added new APIs to copy objects from one QPDF to another. This
  4 + includes letting QPDF::addPage() (and QPDF::addPageAt()) accept a
  5 + page object from another QPDF and adding
  6 + QPDF::copyForeignObject(). See QPDF.hh for details.
  7 +
  8 + * Add method QPDFObjectHandle::getOwningQPDF() to return the QPDF
  9 + object associated with an indirect QPDFObjectHandle.
  10 +
  11 + * Add convenience methods to QPDFObjectHandle: assertIndirect(),
  12 + isPageObject(), isPagesObject()
  13 +
  14 + * Cache when QPDF::pushInheritedAttributesToPage() has been called
  15 + to avoid traversing the pages trees multiple times. This state is
  16 + cleared by QPDF::updateAllPagesCache() and ignored by
  17 + QPDF::flattenPagesTree().
  18 +
1 2012-07-08 Jay Berkenbilt <ejb@ql.org> 19 2012-07-08 Jay Berkenbilt <ejb@ql.org>
2 20
3 * Add QPDFObjectHandle::newReserved to create a reserved object 21 * Add QPDFObjectHandle::newReserved to create a reserved object
@@ -28,76 +28,54 @@ Next @@ -28,76 +28,54 @@ Next
28 can only be used by one thread at a time, but multiple threads can 28 can only be used by one thread at a time, but multiple threads can
29 simultaneously use separate objects. 29 simultaneously use separate objects.
30 30
  31 + * Write some documentation about the design of copyForeignObject.
31 32
32 -Soon  
33 -==== 33 + * copyForeignObject still to do:
34 34
35 - * Provide an option to copy encryption parameters from another file.  
36 - This would make it possible to decrypt a file, manually work with  
37 - it, and then re-encrypt it using the original encryption parameters  
38 - including a possibly unknown owner password. 35 + - qpdf command
39 36
40 - * See if I can support the new encryption formats mentioned in the  
41 - open bug on sourceforge. Check other sourceforge bugs. 37 + Command line could be something like
42 38
43 - * Splitting/merging concepts 39 + --pages [ --new ] { file [password] numeric-range ... } ... --
44 40
45 - newPDF() could create a PDF with just a trailer, no pages, and a  
46 - minimal info. Then the page routines could be used to add pages to  
47 - it. 41 + The first file referenced would be the one whose other data would
  42 + be preserved (like trailer, info, encryption, outlines, etc.).
  43 + --new as first file would just use an empty file as the starting
  44 + point. Be explicit about whether outlines, etc., are handled.
  45 + They are not handled initially.
48 46
49 - Starting with any pdf, you should be able to copy objects from  
50 - another pdf. The copy should be smart about never traversing into  
51 - a /Page or /Pages. 47 + Example: to grab pages 1-5 from file1 and 11-15 from file2
52 48
53 - We could provide a method of copying objects from one PDF into  
54 - another. This would do whatever optimization is necessary (maybe  
55 - just optimizePagesTree) and then traverse the set of objects  
56 - specified to find all objects referenced by the set. Each of those  
57 - would be copied over with a table mapping old ID to new ID. This  
58 - would be done from bottom up most likely disallowing cycles or  
59 - handling them sanely. 49 + --pages file1.pdf 1-5 file2.pdf 11-15 --
60 50
61 - Command line could be something like 51 + To implement this, we would remove all pages from file1 except
  52 + pages 1 through 5. Then we would take pages 11 through 15 from
  53 + file2, copy them to the file, and add them as pages.
62 54
63 - --pages [ --new ] { file [password] numeric-range ... } ... -- 55 + - document that makeIndirectObject doesn't handle foreign objects
  56 + automatically because copying a foreign object is a big enough
  57 + deal that it should be explicit. However addPages* does handle
  58 + foreign page objects automatically.
64 59
65 - The first file referenced would be the one whose other data would  
66 - be preserved (like trailer, info, encryption, outlines, etc.).  
67 - --new as first file would just use an empty file as the starting  
68 - point. 60 + - Test /Outlines and see whether there's any point in handling
  61 + them in the API. Maybe just copying them over works. What
  62 + about command line tool? Also think about page labels.
69 63
70 - Example: to grab pages 1-5 from file1 and 11-15 from file2 64 + - Tests through qpdf command line: copy pages from multiple PDFs
  65 + starting with one PDF and also starting with empty.
71 66
72 - --pages file1.pdf 1-5 file2.pdf 11-15 -- 67 + * (Hopefully) Provide an option to copy encryption parameters from
  68 + another file. This would make it possible to decrypt a file,
  69 + manually work with it, and then re-encrypt it using the original
  70 + encryption parameters including a possibly unknown owner password.
73 71
74 - To implement this, we would remove all pages from file1 except  
75 - pages 1 through 5. Then we would take pages 11 through 15 from  
76 - file2 and add them to a set for transfer. This would end up  
77 - generating a list of indirect objects. We would copy those objects  
78 - shallowly to the new PDF keeping track of the mapping and replacing  
79 - any indirect object keys as appropriate, much like QPDFWriter does.  
80 72
81 - When all the objects are registered, we would add those pages to  
82 - the result.  
83 -  
84 - This approach could work for both splitting and merging. It's  
85 - possible it could be implemented now without any new APIs, but most  
86 - of the work should be doable by the library with only a small set  
87 - of additions. 73 +Soon
  74 +====
88 75
89 - newPDF()  
90 - QPDFObjectCopier c(qpdf1, qpdf2)  
91 - QPDFObjectHandle obj = c.copyObject(<object from qpdf1>)  
92 - Without traversing pages, copies all indirect objects referenced  
93 - by <object from qpdf1> preserving referential integrity and  
94 - returns an object handle in qpdf2 of the same object. If called  
95 - multiple times on the same object, retraverses in case there were  
96 - changes. 76 + * See if I can support the new encryption formats mentioned in the
  77 + open bug on sourceforge. Check other sourceforge bugs.
97 78
98 - QPDFObjectHandle obj = c.getMapping(<object from qpdf1>)  
99 - find the object in qpdf2 corresponding to the object from qpdf1.  
100 - Return the null object if none.  
101 79
102 General 80 General
103 ======= 81 =======
@@ -110,23 +88,11 @@ General @@ -110,23 +88,11 @@ General
110 * Update qpdf docs about non-ascii passwords. See thread from 88 * Update qpdf docs about non-ascii passwords. See thread from
111 2010-12-07,08 for details. 89 2010-12-07,08 for details.
112 90
113 - * Look at page splitting. Subramanyam provided a test file; see  
114 - ../misc/article-threads.pdf. Email Q-Count: 431864 from  
115 - 2009-11-03. See also "Splitting by Pages" below.  
116 -  
117 - * Consider writing a PDF merge utility. With 2.2, it would be  
118 - possible to have a StreamDataProvider that would allow stream data  
119 - to be directly copied from one PDF file to another. One possible  
120 - strategy would be to have a program that adds all the pages of one  
121 - file to the end of another file. The basic  
122 - strategy would be to create a table that adds new streams to the  
123 - original file, mapping the new streams' obj/gen to a stream in the  
124 - file whose pages are being appended. The StreamDataProvider, when  
125 - asked, could simply pipe the streams of the file being appended to  
126 - the provided pipeline and could copy the filter and decode  
127 - parameters from the original file. Being able to do this requires  
128 - a lot of the same logic as being able to do splitting, so a general  
129 - split/merge program would be a great addition. 91 + * Consider impact of article threads on page splitting/merging.
  92 + Subramanyam provided a test file; see ../misc/article-threads.pdf.
  93 + Email Q-Count: 431864 from 2009-11-03. Other things to consider:
  94 + outlines, page labels, thumbnails, zones. There are probably
  95 + others.
130 96
131 * See whether it's possible to remove the call to 97 * See whether it's possible to remove the call to
132 flattenScalarReferences. I can't easily figure out why I do it, 98 flattenScalarReferences. I can't easily figure out why I do it,
@@ -279,26 +245,3 @@ Index: QPDFWriter.cc @@ -279,26 +245,3 @@ Index: QPDFWriter.cc
279 245
280 * From a suggestion in bug 3152169, consider having an option to 246 * From a suggestion in bug 3152169, consider having an option to
281 re-encode inline images with an ASCII encoding. 247 re-encode inline images with an ASCII encoding.
282 -  
283 -  
284 -Splitting by Pages  
285 -==================  
286 -  
287 -Although qpdf does not currently support splitting a file into pages,  
288 -the work done for linearization covers almost all the work. To do  
289 -page splitting. If this functionality is needed, study  
290 -obj_user_to_objects and object_to_obj_users created in  
291 -QPDF_optimization for ideas. It's quite possible that the information  
292 -computed by calculateLinearizationData is actually sufficient to do  
293 -page splitting in many circumstances. That code knows which objects  
294 -are used by which pages, though it doesn't do anything page-specific  
295 -with outlines, thumbnails, page labels, or anything else.  
296 -  
297 -Another approach would be to traverse only pages that are being output  
298 -taking care not to traverse into the pages tree, and then to fabricate  
299 -a new pages tree.  
300 -  
301 -Either way, care must be taken to handle other things such as  
302 -outlines, page labels, thumbnails, threads, zones, etc. in a sensible  
303 -way. This may include simply omitting information other than page  
304 -content.  
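The `--pages` plan in the TODO hunk above (keep a range from the first file, then append ranges taken from other files) can be modelled without any PDF machinery. The following is a minimal sketch, assuming pages are plain strings; `PageList`, `selectRange`, and `mergeRanges` are hypothetical names for illustration and are not part of qpdf or its command line.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a PDF: just an ordered list of page labels.
typedef std::vector<std::string> PageList;

// Return pages first..last (1-based, inclusive) of doc.
PageList selectRange(PageList const& doc, std::size_t first, std::size_t last)
{
    if ((first < 1) || (last < first) || (last > doc.size()))
    {
        throw std::out_of_range("invalid page range");
    }
    return PageList(doc.begin() + (first - 1), doc.begin() + last);
}

// Model of "--pages file1 a-b file2 c-d --": keep a-b from the first
// document, then append c-d from the second.
PageList mergeRanges(PageList const& d1, std::size_t a, std::size_t b,
                     PageList const& d2, std::size_t c, std::size_t d)
{
    PageList result = selectRange(d1, a, b);
    PageList extra = selectRange(d2, c, d);
    result.insert(result.end(), extra.begin(), extra.end());
    return result;
}
```

In the real tool the first file would also contribute its trailer, info, and encryption data, which this model ignores.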
include/qpdf/QPDF.hh
@@ -190,6 +190,28 @@ class QPDF @@ -190,6 +190,28 @@ class QPDF
190 replaceReserved(QPDFObjectHandle reserved, 190 replaceReserved(QPDFObjectHandle reserved,
191 QPDFObjectHandle replacement); 191 QPDFObjectHandle replacement);
192 192
  193 + // Copy an object from another QPDF to this one. The return value
  194 + // is an indirect reference to the copied object in this file.
  195 + // This method is intended to be used to copy non-page objects and
  196 + // will not copy page objects. To copy page objects, pass the
  197 + // foreign page object directly to addPage (or addPageAt). If you
  198 + // copy objects that contain references to pages, you should copy
  199 + // the pages first using addPage(At). Otherwise references to the
  200 + // pages that have not been copied will be replaced with nulls.
  201 +
  202 + // When copying objects with this method, object structure will be
  203 + // preserved, so all indirectly referenced indirect objects will
  204 + // be copied as well. This includes any circular references that
  205 + // may exist. The QPDF object keeps a record of what has already
  206 + // been copied, so shared objects will not be copied multiple
  207 + // times. This also means that if you mutate an object that has
  208 + // already been copied and try to copy it again, it won't work
  209 + // since the modified object will not be recopied. Therefore, you
  210 + // should do all mutation on the original file that you are going
  211 + // to do before you start copying its objects to a new file.
  212 + QPDF_DLL
  213 + QPDFObjectHandle copyForeignObject(QPDFObjectHandle foreign);
  214 +
193 // Encryption support 215 // Encryption support
194 216
195 enum encryption_method_e { e_none, e_unknown, e_rc4, e_aes }; 217 enum encryption_method_e { e_none, e_unknown, e_rc4, e_aes };
@@ -380,7 +402,10 @@ class QPDF @@ -380,7 +402,10 @@ class QPDF
380 // modify /Pages structures directly, you must call this method 402 // modify /Pages structures directly, you must call this method
381 // afterwards. This method updates the internal list of pages, so 403 // afterwards. This method updates the internal list of pages, so
382 // after calling this method, any previous references returned by 404 // after calling this method, any previous references returned by
383 - // getAllPages() will be valid again. 405 + // getAllPages() will be valid again. It also resets any state
  406 + // about having pushed inherited attributes in /Pages objects down
  407 + // to the pages, so if you add any inheritable attributes to a
  408 + // /Pages object, you should also call this method.
384 QPDF_DLL 409 QPDF_DLL
385 void updateAllPagesCache(); 410 void updateAllPagesCache();
386 411
@@ -389,11 +414,19 @@ class QPDF @@ -389,11 +414,19 @@ class QPDF
389 // resolved by explicitly setting the values in each /Page. 414 // resolved by explicitly setting the values in each /Page.
390 void pushInheritedAttributesToPage(); 415 void pushInheritedAttributesToPage();
391 416
392 - // Add new page at the beginning or the end of the current pdf 417 + // Add new page at the beginning or the end of the current pdf.
  418 + // The newpage parameter may be either a direct object, an
  419 + // indirect object from this QPDF, or an indirect object from
  420 + // another QPDF. If it is a direct object, it will be made
  421 + // indirect. If it is an indirect object from another QPDF, this
  422 + // method will call pushInheritedAttributesToPage on the other
  423 + // file and then copy the page to this QPDF using the same
  424 + // underlying code as copyForeignObject.
393 QPDF_DLL 425 QPDF_DLL
394 void addPage(QPDFObjectHandle newpage, bool first); 426 void addPage(QPDFObjectHandle newpage, bool first);
395 427
396 - // Add new page before or after refpage 428 + // Add new page before or after refpage. See comments for addPage
  429 + // for details about what newpage should be.
397 QPDF_DLL 430 QPDF_DLL
398 void addPageAt(QPDFObjectHandle newpage, bool before, 431 void addPageAt(QPDFObjectHandle newpage, bool before,
399 QPDFObjectHandle refpage); 432 QPDFObjectHandle refpage);
@@ -542,6 +575,29 @@ class QPDF @@ -542,6 +575,29 @@ class QPDF
542 qpdf_offset_t end_after_space; 575 qpdf_offset_t end_after_space;
543 }; 576 };
544 577
  578 + class ObjCopier
  579 + {
  580 + public:
  581 + std::map<ObjGen, QPDFObjectHandle> object_map;
  582 + std::vector<QPDFObjectHandle> to_copy;
  583 + std::set<ObjGen> visiting;
  584 + };
  585 +
  586 + class CopiedStreamDataProvider: public QPDFObjectHandle::StreamDataProvider
  587 + {
  588 + public:
  589 + virtual ~CopiedStreamDataProvider()
  590 + {
  591 + }
  592 + virtual void provideStreamData(int objid, int generation,
  593 + Pipeline* pipeline);
  594 + void registerForeignStream(ObjGen const& local_og,
  595 + QPDFObjectHandle foreign_stream);
  596 +
  597 + private:
  598 + std::map<ObjGen, QPDFObjectHandle> foreign_streams;
  599 + };
  600 +
545 void parse(char const* password); 601 void parse(char const* password);
546 void warn(QPDFExc const& e); 602 void warn(QPDFExc const& e);
547 void setTrailer(QPDFObjectHandle obj); 603 void setTrailer(QPDFObjectHandle obj);
@@ -602,6 +658,14 @@ class QPDF @@ -602,6 +658,14 @@ class QPDF
602 QPDFObjectHandle& stream_dict, 658 QPDFObjectHandle& stream_dict,
603 std::vector<PointerHolder<Pipeline> >& heap); 659 std::vector<PointerHolder<Pipeline> >& heap);
604 660
  661 + // Methods to support object copying
  662 + QPDFObjectHandle copyForeignObject(
  663 + QPDFObjectHandle foreign, bool allow_page);
  664 + void reserveObjects(QPDFObjectHandle foreign, ObjCopier& obj_copier,
  665 + bool top);
  666 + QPDFObjectHandle replaceForeignIndirectObjects(
  667 + QPDFObjectHandle foreign, ObjCopier& obj_copier, bool top);
  668 +
605 // Linearization Hint table structures. 669 // Linearization Hint table structures.
606 // Naming conventions: 670 // Naming conventions:
607 671
@@ -960,7 +1024,12 @@ class QPDF @@ -960,7 +1024,12 @@ class QPDF
960 QPDFObjectHandle trailer; 1024 QPDFObjectHandle trailer;
961 std::vector<QPDFObjectHandle> all_pages; 1025 std::vector<QPDFObjectHandle> all_pages;
962 std::map<ObjGen, int> pageobj_to_pages_pos; 1026 std::map<ObjGen, int> pageobj_to_pages_pos;
  1027 + bool pushed_inherited_attributes_to_pages;
963 std::vector<QPDFExc> warnings; 1028 std::vector<QPDFExc> warnings;
  1029 + std::map<QPDF*, ObjCopier> object_copiers;
  1030 + PointerHolder<QPDFObjectHandle::StreamDataProvider> copied_streams;
  1031 + // copied_stream_data_provider is owned by copied_streams
  1032 + CopiedStreamDataProvider* copied_stream_data_provider;
964 1033
965 // Linearization data 1034 // Linearization data
966 qpdf_offset_t first_xref_item_offset; // actual value from file 1035 qpdf_offset_t first_xref_item_offset; // actual value from file
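The header comment on `copyForeignObject` warns that an object mutated after it has been copied will not be recopied, because the destination keeps a record of what it already copied. A minimal sketch of that caveat follows, assuming a toy "file" that maps integer ids to strings; `MemoFile` and `MemoCopier` are hypothetical names and do not reflect qpdf's actual types.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical model: a "file" maps object ids to values.
typedef std::map<int, std::string> MemoFile;

class MemoCopier
{
  public:
    // Copy object `id` from `src` into `dest` and return its id in dest.
    // An id that was copied before is answered from the cache without
    // looking at `src` again.
    int copy(MemoFile const& src, int id, MemoFile& dest)
    {
        std::map<int, int>::iterator cached = object_map.find(id);
        if (cached != object_map.end())
        {
            return cached->second;
        }
        MemoFile::const_iterator found = src.find(id);
        if (found == src.end())
        {
            throw std::out_of_range("no such object in source");
        }
        int new_id = dest.empty() ? 1 : dest.rbegin()->first + 1;
        dest[new_id] = found->second;
        object_map[id] = new_id;
        return new_id;
    }

  private:
    std::map<int, int> object_map;  // foreign id -> local id
};
```

Copying id 7, changing it in the source, and copying it again returns the same local id with the old value, which is why the comment recommends finishing all mutation of the original file first.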
include/qpdf/QPDFObjectHandle.hh
@@ -222,6 +222,11 @@ class QPDFObjectHandle @@ -222,6 +222,11 @@ class QPDFObjectHandle
222 QPDF_DLL 222 QPDF_DLL
223 bool isOrHasName(std::string const&); 223 bool isOrHasName(std::string const&);
224 224
  225 + // Return the QPDF object that owns an indirect object. Returns
  226 + // null for a direct object.
  227 + QPDF_DLL
  228 + QPDF* getOwningQPDF();
  229 +
225 // Create a shallow copy of an object as a direct object. Since 230 // Create a shallow copy of an object as a direct object. Since
226 // this is a shallow copy, for dictionaries and arrays, any keys 231 // this is a shallow copy, for dictionaries and arrays, any keys
227 // or items that were indirect objects will still be indirect 232 // or items that were indirect objects will still be indirect
@@ -454,9 +459,16 @@ class QPDFObjectHandle @@ -454,9 +459,16 @@ class QPDFObjectHandle
454 void assertReserved(); 459 void assertReserved();
455 460
456 QPDF_DLL 461 QPDF_DLL
  462 + void assertIndirect();
  463 + QPDF_DLL
457 void assertScalar(); 464 void assertScalar();
458 QPDF_DLL 465 QPDF_DLL
459 void assertNumber(); 466 void assertNumber();
  467 +
  468 + QPDF_DLL
  469 + bool isPageObject();
  470 + QPDF_DLL
  471 + bool isPagesObject();
460 QPDF_DLL 472 QPDF_DLL
461 void assertPageObject(); 473 void assertPageObject();
462 474
libqpdf/QPDF.cc
@@ -348,6 +348,23 @@ QPDF::ObjGen::operator<(ObjGen const& rhs) const @@ -348,6 +348,23 @@ QPDF::ObjGen::operator<(ObjGen const& rhs) const
348 ((this->obj == rhs.obj) && (this->gen < rhs.gen))); 348 ((this->obj == rhs.obj) && (this->gen < rhs.gen)));
349 } 349 }
350 350
  351 +void
  352 +QPDF::CopiedStreamDataProvider::provideStreamData(
  353 + int objid, int generation, Pipeline* pipeline)
  354 +{
  355 + QPDFObjectHandle foreign_stream =
  356 + this->foreign_streams[ObjGen(objid, generation)];
  357 + foreign_stream.pipeStreamData(pipeline, false, false, false);
  358 +}
  359 +
  360 +void
  361 +QPDF::CopiedStreamDataProvider::registerForeignStream(
  362 + ObjGen const& local_og, QPDFObjectHandle foreign_stream)
  363 +{
  364 + this->foreign_streams[local_og] = foreign_stream;
  365 +}
  366 +
  367 +
351 std::string const& 368 std::string const&
352 QPDF::QPDFVersion() 369 QPDF::QPDFVersion()
353 { 370 {
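The `CopiedStreamDataProvider` hunk above registers a foreign stream under the local object's id and only pipes its data when asked. A minimal sketch of that deferred lookup follows, assuming a string stands in for both the stream data and the pipeline; `ToyStream`, `ToyObjGen`, and `ToyCopiedStreamProvider` are hypothetical names. Unlike the hunk, which indexes the map with `operator[]`, this sketch throws when no stream is registered.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins: an object id/generation pair and a "stream"
// whose data is only read when it is asked for.
typedef std::pair<int, int> ToyObjGen;

struct ToyStream
{
    std::string data;
};

class ToyCopiedStreamProvider
{
  public:
    // Remember which foreign stream backs a given local stream.
    void registerForeignStream(ToyObjGen const& local_og,
                               ToyStream const* foreign)
    {
        foreign_streams[local_og] = foreign;
    }

    // Append the foreign stream's data to `out` on demand.
    void provideStreamData(int objid, int generation, std::string& out) const
    {
        std::map<ToyObjGen, ToyStream const*>::const_iterator it =
            foreign_streams.find(ToyObjGen(objid, generation));
        if (it == foreign_streams.end())
        {
            throw std::logic_error("no foreign stream registered");
        }
        out += it->second->data;
    }

  private:
    std::map<ToyObjGen, ToyStream const*> foreign_streams;
};
```

Because the data is fetched at provide time, the toy source stream has to stay alive and unchanged until the data is actually requested.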
@@ -369,6 +386,8 @@ QPDF::QPDF() : @@ -369,6 +386,8 @@ QPDF::QPDF() :
369 cf_file(e_none), 386 cf_file(e_none),
370 cached_key_objid(0), 387 cached_key_objid(0),
371 cached_key_generation(0), 388 cached_key_generation(0),
  389 + pushed_inherited_attributes_to_pages(false),
  390 + copied_stream_data_provider(0),
372 first_xref_item_offset(0), 391 first_xref_item_offset(0),
373 uncompressed_after_compressed(false) 392 uncompressed_after_compressed(false)
374 { 393 {
@@ -2067,6 +2086,244 @@ QPDF::replaceReserved(QPDFObjectHandle reserved, @@ -2067,6 +2086,244 @@ QPDF::replaceReserved(QPDFObjectHandle reserved,
2067 replacement); 2086 replacement);
2068 } 2087 }
2069 2088
  2089 +QPDFObjectHandle
  2090 +QPDF::copyForeignObject(QPDFObjectHandle foreign)
  2091 +{
  2092 + return copyForeignObject(foreign, false);
  2093 +}
  2094 +
  2095 +QPDFObjectHandle
  2096 +QPDF::copyForeignObject(QPDFObjectHandle foreign, bool allow_page)
  2097 +{
  2098 + if (! foreign.isIndirect())
  2099 + {
  2100 + QTC::TC("qpdf", "QPDF copyForeign direct");
  2101 + throw std::logic_error(
  2102 + "QPDF::copyForeign called with direct object handle");
  2103 + }
  2104 + QPDF* other = foreign.getOwningQPDF();
  2105 + if (other == this)
  2106 + {
  2107 + QTC::TC("qpdf", "QPDF copyForeign not foreign");
  2108 + throw std::logic_error(
  2109 + "QPDF::copyForeign called with object from this QPDF");
  2110 + }
  2111 +
  2112 + ObjCopier& obj_copier = this->object_copiers[other];
  2113 + if (! obj_copier.visiting.empty())
  2114 + {
  2115 + throw std::logic_error("obj_copier.visiting is not empty"
  2116 + " at the beginning of copyForeignObject");
  2117 + }
  2118 +
  2119 + // Make sure we have an object in this file for every referenced
  2120 + // object in the old file. obj_copier.object_map maps foreign
  2121 + // ObjGen to local objects. For everything new that we have to
  2122 + // copy, the local object will be a reservation, unless it is a
  2123 + // stream, in which case the local object will already be a
  2124 + // stream.
  2125 + reserveObjects(foreign, obj_copier, true);
  2126 +
  2127 + if (! obj_copier.visiting.empty())
  2128 + {
  2129 + throw std::logic_error("obj_copier.visiting is not empty"
  2130 + " after reserving objects");
  2131 + }
  2132 +
  2133 + // Copy any new objects and replace the reservations.
  2134 + for (std::vector<QPDFObjectHandle>::iterator iter =
  2135 + obj_copier.to_copy.begin();
  2136 + iter != obj_copier.to_copy.end(); ++iter)
  2137 + {
  2138 + QPDFObjectHandle& to_copy = *iter;
  2139 + QPDFObjectHandle copy =
  2140 + replaceForeignIndirectObjects(to_copy, obj_copier, true);
  2141 + if (! to_copy.isStream())
  2142 + {
  2143 + ObjGen og(to_copy.getObjectID(), to_copy.getGeneration());
  2144 + replaceReserved(obj_copier.object_map[og], copy);
  2145 + }
  2146 + }
  2147 + obj_copier.to_copy.clear();
  2148 +
  2149 + return obj_copier.object_map[ObjGen(foreign.getObjectID(),
  2150 + foreign.getGeneration())];
  2151 +}
  2152 +
  2153 +void
  2154 +QPDF::reserveObjects(QPDFObjectHandle foreign, ObjCopier& obj_copier,
  2155 + bool top)
  2156 +{
  2157 + if (foreign.isReserved())
  2158 + {
  2159 + throw std::logic_error(
  2160 + "QPDF: attempting to copy a foreign reserved object");
  2161 + }
  2162 +
  2163 + if (foreign.isPagesObject())
  2164 + {
  2165 + QTC::TC("qpdf", "QPDF not copying pages object");
  2166 + return;
  2167 + }
  2168 +
  2169 + if ((! top) && foreign.isPageObject())
  2170 + {
  2171 + QTC::TC("qpdf", "QPDF not crossing page boundary");
  2172 + return;
  2173 + }
  2174 +
  2175 + if (foreign.isIndirect())
  2176 + {
  2177 + ObjGen foreign_og(foreign.getObjectID(), foreign.getGeneration());
  2178 + if (obj_copier.visiting.find(foreign_og) != obj_copier.visiting.end())
  2179 + {
  2180 + QTC::TC("qpdf", "QPDF loop reserving objects");
  2181 + return;
  2182 + }
  2183 + QTC::TC("qpdf", "QPDF copy indirect");
  2184 + obj_copier.visiting.insert(foreign_og);
  2185 + std::map<ObjGen, QPDFObjectHandle>::iterator mapping =
  2186 + obj_copier.object_map.find(foreign_og);
  2187 + if (mapping == obj_copier.object_map.end())
  2188 + {
  2189 + obj_copier.to_copy.push_back(foreign);
  2190 + QPDFObjectHandle reservation;
  2191 + if (foreign.isStream())
  2192 + {
  2193 + reservation = QPDFObjectHandle::newStream(this);
  2194 + }
  2195 + else
  2196 + {
  2197 + reservation = QPDFObjectHandle::newReserved(this);
  2198 + }
  2199 + obj_copier.object_map[foreign_og] = reservation;
  2200 + }
  2201 + }
  2202 +
  2203 + if (foreign.isArray())
  2204 + {
  2205 + QTC::TC("qpdf", "QPDF reserve array");
  2206 + int n = foreign.getArrayNItems();
  2207 + for (int i = 0; i < n; ++i)
  2208 + {
  2209 + reserveObjects(foreign.getArrayItem(i), obj_copier, false);
  2210 + }
  2211 + }
  2212 + else if (foreign.isDictionary())
  2213 + {
  2214 + QTC::TC("qpdf", "QPDF reserve dictionary");
  2215 + std::set<std::string> keys = foreign.getKeys();
  2216 + for (std::set<std::string>::iterator iter = keys.begin();
  2217 + iter != keys.end(); ++iter)
  2218 + {
  2219 + reserveObjects(foreign.getKey(*iter), obj_copier, false);
  2220 + }
  2221 + }
  2222 + else if (foreign.isStream())
  2223 + {
  2224 + QTC::TC("qpdf", "QPDF reserve stream");
  2225 + reserveObjects(foreign.getDict(), obj_copier, false);
  2226 + }
  2227 +
  2228 + if (foreign.isIndirect())
  2229 + {
  2230 + ObjGen foreign_og(foreign.getObjectID(), foreign.getGeneration());
  2231 + obj_copier.visiting.erase(foreign_og);
  2232 + }
  2233 +}
  2234 +
  2235 +QPDFObjectHandle
  2236 +QPDF::replaceForeignIndirectObjects(
  2237 + QPDFObjectHandle foreign, ObjCopier& obj_copier, bool top)
  2238 +{
  2239 + QPDFObjectHandle result;
  2240 + if ((! top) && foreign.isIndirect())
  2241 + {
  2242 + QTC::TC("qpdf", "QPDF replace indirect");
  2243 + ObjGen foreign_og(foreign.getObjectID(), foreign.getGeneration());
  2244 + std::map<ObjGen, QPDFObjectHandle>::iterator mapping =
  2245 + obj_copier.object_map.find(foreign_og);
  2246 + if (mapping == obj_copier.object_map.end())
  2247 + {
  2248 + // This case would occur if this is a reference to a Page
  2249 + // or Pages object that we didn't traverse into.
  2250 + QTC::TC("qpdf", "QPDF replace foreign indirect with null");
  2251 + result = QPDFObjectHandle::newNull();
  2252 + }
  2253 + else
  2254 + {
  2255 + result = obj_copier.object_map[foreign_og];
  2256 + }
  2257 + }
  2258 + else if (foreign.isArray())
  2259 + {
  2260 + QTC::TC("qpdf", "QPDF replace array");
  2261 + result = QPDFObjectHandle::newArray();
  2262 + int n = foreign.getArrayNItems();
  2263 + for (int i = 0; i < n; ++i)
  2264 + {
  2265 + result.appendItem(
  2266 + replaceForeignIndirectObjects(
  2267 + foreign.getArrayItem(i), obj_copier, false));
  2268 + }
  2269 + }
  2270 + else if (foreign.isDictionary())
  2271 + {
  2272 + QTC::TC("qpdf", "QPDF replace dictionary");
  2273 + result = QPDFObjectHandle::newDictionary();
  2274 + std::set<std::string> keys = foreign.getKeys();
  2275 + for (std::set<std::string>::iterator iter = keys.begin();
  2276 + iter != keys.end(); ++iter)
  2277 + {
  2278 + result.replaceKey(
  2279 + *iter,
  2280 + replaceForeignIndirectObjects(
  2281 + foreign.getKey(*iter), obj_copier, false));
  2282 + }
  2283 + }
  2284 + else if (foreign.isStream())
  2285 + {
  2286 + QTC::TC("qpdf", "QPDF replace stream");
  2287 + ObjGen foreign_og(foreign.getObjectID(), foreign.getGeneration());
  2288 + result = obj_copier.object_map[foreign_og];
  2289 + result.assertStream();
  2290 + QPDFObjectHandle dict = result.getDict();
  2291 + QPDFObjectHandle old_dict = foreign.getDict();
  2292 + std::set<std::string> keys = old_dict.getKeys();
  2293 + for (std::set<std::string>::iterator iter = keys.begin();
  2294 + iter != keys.end(); ++iter)
  2295 + {
  2296 + dict.replaceKey(
  2297 + *iter,
  2298 + replaceForeignIndirectObjects(
  2299 + old_dict.getKey(*iter), obj_copier, false));
  2300 + }
  2301 + if (this->copied_stream_data_provider == 0)
  2302 + {
  2303 + this->copied_stream_data_provider = new CopiedStreamDataProvider();
  2304 + this->copied_streams = this->copied_stream_data_provider;
  2305 + }
  2306 + ObjGen local_og(result.getObjectID(), result.getGeneration());
  2307 + this->copied_stream_data_provider->registerForeignStream(
  2308 + local_og, foreign);
  2309 + result.replaceStreamData(this->copied_streams,
  2310 + dict.getKey("/Filter"),
  2311 + dict.getKey("/DecodeParms"));
  2312 + }
  2313 + else
  2314 + {
  2315 + foreign.assertScalar();
  2316 + result = foreign;
  2317 + result.makeDirect();
  2318 + }
  2319 +
  2320 + if (top && (! result.isStream()) && result.isIndirect())
  2321 + {
  2322 + throw std::logic_error("replacement for foreign object is indirect");
  2323 + }
  2324 +
  2325 + return result;
  2326 +}
2070 2327
2071 void 2328 void
2072 QPDF::swapObjects(int objid1, int generation1, int objid2, int generation2) 2329 QPDF::swapObjects(int objid1, int generation1, int objid2, int generation2)
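The QPDF.cc hunk above copies in two passes: `reserveObjects` allocates a local placeholder for every reachable foreign object, using a `visiting` set to stop at reference cycles, and `replaceForeignIndirectObjects` then fills the placeholders, rewriting references through `object_map`. A minimal sketch of the same idea on a toy object graph follows. `GraphObj`, `GraphFile`, `GraphCopier`, `reserve`, and `copyForeign` are hypothetical names, integer ids stand in for object handles, and id 0 stands in for the null object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical toy object: a page flag plus the ids it references.
struct GraphObj
{
    bool is_page;
    std::vector<int> refs;
};
typedef std::map<int, GraphObj> GraphFile;

struct GraphCopier
{
    std::map<int, int> object_map;  // foreign id -> local id
    std::vector<int> to_copy;       // reserved but not yet filled
    std::set<int> visiting;         // ids on the current traversal path
};

// Pass 1: reserve a local id for everything reachable from `id`,
// without descending into pages other than the top-level object.
void reserve(GraphFile const& src, int id, GraphFile& dest,
             GraphCopier& c, bool top)
{
    GraphFile::const_iterator found = src.find(id);
    if (found == src.end())
    {
        return;
    }
    GraphObj const& obj = found->second;
    if (((! top) && obj.is_page) || c.visiting.count(id))
    {
        return;  // page boundary or reference cycle
    }
    c.visiting.insert(id);
    if (c.object_map.count(id) == 0)
    {
        int local = dest.empty() ? 1 : dest.rbegin()->first + 1;
        dest[local] = GraphObj();  // placeholder, filled in pass 2
        c.object_map[id] = local;
        c.to_copy.push_back(id);
    }
    for (std::size_t i = 0; i < obj.refs.size(); ++i)
    {
        reserve(src, obj.refs[i], dest, c, false);
    }
    c.visiting.erase(id);
}

// Pass 2: fill each placeholder. References that were never reserved
// (untraversed pages) become 0. Returns 0 if `id` is not in `src`.
int copyForeign(GraphFile const& src, int id, GraphFile& dest, GraphCopier& c)
{
    assert(c.visiting.empty());
    reserve(src, id, dest, c, true);
    assert(c.visiting.empty());
    for (std::size_t i = 0; i < c.to_copy.size(); ++i)
    {
        GraphObj const& foreign = src.find(c.to_copy[i])->second;
        GraphObj& local = dest[c.object_map[c.to_copy[i]]];
        local.is_page = foreign.is_page;
        local.refs.clear();
        for (std::size_t j = 0; j < foreign.refs.size(); ++j)
        {
            std::map<int, int>::iterator m =
                c.object_map.find(foreign.refs[j]);
            local.refs.push_back(m == c.object_map.end() ? 0 : m->second);
        }
    }
    c.to_copy.clear();
    std::map<int, int>::iterator top = c.object_map.find(id);
    return (top == c.object_map.end()) ? 0 : top->second;
}
```

Reserving first means a reference can be rewritten to its final local id even when the target has not been filled yet, which is what lets cycles and shared objects survive the copy. The real code additionally creates streams up front rather than as reservations; the sketch leaves streams out.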
libqpdf/QPDFObjectHandle.cc
@@ -355,6 +355,14 @@ QPDFObjectHandle::isOrHasName(std::string const& value) @@ -355,6 +355,14 @@ QPDFObjectHandle::isOrHasName(std::string const& value)
355 return false; 355 return false;
356 } 356 }
357 357
  358 +// Indirect object accessors
  359 +QPDF*
  360 +QPDFObjectHandle::getOwningQPDF()
  361 +{
  362 + // Will be null for direct objects
  363 + return this->qpdf;
  364 +}
  365 +
358 // Dictionary mutators 366 // Dictionary mutators
359 367
360 void 368 void
@@ -784,6 +792,7 @@ QPDFObjectHandle::makeDirectInternal(std::set<int>& visited) @@ -784,6 +792,7 @@ QPDFObjectHandle::makeDirectInternal(std::set<int>& visited)
784 } 792 }
785 793
786 dereference(); 794 dereference();
  795 + this->qpdf = 0;
787 this->objid = 0; 796 this->objid = 0;
788 this->generation = 0; 797 this->generation = 0;
789 798
@@ -946,6 +955,16 @@ QPDFObjectHandle::assertReserved() @@ -946,6 +955,16 @@ QPDFObjectHandle::assertReserved()
946 } 955 }
947 956
948 void 957 void
  958 +QPDFObjectHandle::assertIndirect()
  959 +{
  960 + if (! isIndirect())
  961 + {
  962 + throw std::logic_error(
  963 + "operation for indirect object attempted on direct object");
  964 + }
  965 +}
  966 +
  967 +void
949 QPDFObjectHandle::assertScalar() 968 QPDFObjectHandle::assertScalar()
950 { 969 {
951 assertType("Scalar", isScalar()); 970 assertType("Scalar", isScalar());
@@ -957,11 +976,24 @@ QPDFObjectHandle::assertNumber() @@ -957,11 +976,24 @@ QPDFObjectHandle::assertNumber()
957 assertType("Number", isNumber()); 976 assertType("Number", isNumber());
958 } 977 }
959 978
  979 +bool
  980 +QPDFObjectHandle::isPageObject()
  981 +{
  982 + return (this->isDictionary() && this->hasKey("/Type") &&
  983 + (this->getKey("/Type").getName() == "/Page"));
  984 +}
  985 +
  986 +bool
  987 +QPDFObjectHandle::isPagesObject()
  988 +{
  989 + return (this->isDictionary() && this->hasKey("/Type") &&
  990 + (this->getKey("/Type").getName() == "/Pages"));
  991 +}
  992 +
960 void 993 void
961 QPDFObjectHandle::assertPageObject() 994 QPDFObjectHandle::assertPageObject()
962 { 995 {
963 - if (! (this->isDictionary() && this->hasKey("/Type") &&  
964 - (this->getKey("/Type").getName() == "/Page"))) 996 + if (! isPageObject())
965 { 997 {
966 throw std::logic_error("page operation called on non-Page object"); 998 throw std::logic_error("page operation called on non-Page object");
967 } 999 }
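The new `isPageObject` and `isPagesObject` helpers above test for a dictionary whose `/Type` is exactly `/Page` or `/Pages`. A minimal sketch of that check follows, assuming a dictionary is a map from key to name string; `TypeDict`, `hasType`, `isPageDict`, and `isPagesDict` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a PDF dictionary: key -> name value.
typedef std::map<std::string, std::string> TypeDict;

// True if the dictionary has a /Type key whose value is type_name.
bool hasType(TypeDict const& dict, std::string const& type_name)
{
    TypeDict::const_iterator it = dict.find("/Type");
    return (it != dict.end()) && (it->second == type_name);
}

bool isPageDict(TypeDict const& dict)
{
    return hasType(dict, "/Page");
}

bool isPagesDict(TypeDict const& dict)
{
    return hasType(dict, "/Pages");
}
```

The comparison is exact, so a `/Pages` node is never mistaken for a `/Page`, and a dictionary without `/Type` matches neither.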
libqpdf/QPDF_optimization.cc
@@ -232,6 +232,14 @@ QPDF::pushInheritedAttributesToPage(bool allow_changes, bool warn_skipped_keys) @@ -232,6 +232,14 @@ QPDF::pushInheritedAttributesToPage(bool allow_changes, bool warn_skipped_keys)
232 // Traverse pages tree pushing all inherited resources down to the 232 // Traverse pages tree pushing all inherited resources down to the
233 // page level. 233 // page level.
234 234
  235 + // The record of whether we've done this is cleared by
  236 + // updateAllPagesCache(). If we're warning for skipped keys,
  237 + // re-traverse unconditionally.
  238 + if (this->pushed_inherited_attributes_to_pages && (! warn_skipped_keys))
  239 + {
  240 + return;
  241 + }
  242 +
235 // key_ancestors is a mapping of page attribute keys to a stack of 243 // key_ancestors is a mapping of page attribute keys to a stack of
236 // Pages nodes that contain values for them. 244 // Pages nodes that contain values for them.
237 std::map<std::string, std::vector<QPDFObjectHandle> > key_ancestors; 245 std::map<std::string, std::vector<QPDFObjectHandle> > key_ancestors;
@@ -240,6 +248,7 @@ QPDF::pushInheritedAttributesToPage(bool allow_changes, bool warn_skipped_keys) @@ -240,6 +248,7 @@ QPDF::pushInheritedAttributesToPage(bool allow_changes, bool warn_skipped_keys)
240 this->trailer.getKey("/Root").getKey("/Pages"), 248 this->trailer.getKey("/Root").getKey("/Pages"),
241 key_ancestors, this->all_pages, allow_changes, warn_skipped_keys); 249 key_ancestors, this->all_pages, allow_changes, warn_skipped_keys);
242 assert(key_ancestors.empty()); 250 assert(key_ancestors.empty());
  251 + this->pushed_inherited_attributes_to_pages = true;
243 } 252 }
244 253
245 void 254 void
libqpdf/QPDF_pages.cc
@@ -89,6 +89,7 @@ QPDF::updateAllPagesCache() @@ -89,6 +89,7 @@ QPDF::updateAllPagesCache()
89 QTC::TC("qpdf", "QPDF updateAllPagesCache"); 89 QTC::TC("qpdf", "QPDF updateAllPagesCache");
90 this->all_pages.clear(); 90 this->all_pages.clear();
91 this->pageobj_to_pages_pos.clear(); 91 this->pageobj_to_pages_pos.clear();
  92 + this->pushed_inherited_attributes_to_pages = false;
92 getAllPages(); 93 getAllPages();
93 } 94 }
94 95
@@ -161,6 +162,12 @@ QPDF::insertPage(QPDFObjectHandle newpage, int pos) @@ -161,6 +162,12 @@ QPDF::insertPage(QPDFObjectHandle newpage, int pos)
161 QTC::TC("qpdf", "QPDF insert non-indirect page"); 162 QTC::TC("qpdf", "QPDF insert non-indirect page");
162 newpage = this->makeIndirectObject(newpage); 163 newpage = this->makeIndirectObject(newpage);
163 } 164 }
  165 + else if (newpage.getOwningQPDF() != this)
  166 + {
  167 + QTC::TC("qpdf", "QPDF insert foreign page");
  168 + newpage.getOwningQPDF()->pushInheritedAttributesToPage();
  169 + newpage = this->copyForeignObject(newpage, true);
  170 + }
164 else 171 else
165 { 172 {
166 QTC::TC("qpdf", "QPDF insert indirect page"); 173 QTC::TC("qpdf", "QPDF insert indirect page");
qpdf/qpdf.testcov
@@ -218,3 +218,18 @@ QPDF unknown key not inherited 0 @@ -218,3 +218,18 @@ QPDF unknown key not inherited 0
218 QPDF_Stream provider length not provided 0 218 QPDF_Stream provider length not provided 0
219 QPDF_Stream unknown stream length 0 219 QPDF_Stream unknown stream length 0
220 QPDF replaceReserved 0 220 QPDF replaceReserved 0
  221 +QPDF copyForeign direct 0
  222 +QPDF copyForeign not foreign 0
  223 +QPDF copy indirect 0
  224 +QPDF loop reserving objects 0
  225 +QPDF replace indirect 0
  226 +QPDF replace array 0
  227 +QPDF replace dictionary 0
  228 +QPDF replace stream 0
  229 +QPDF reserve array 0
  230 +QPDF reserve dictionary 0
  231 +QPDF reserve stream 0
  232 +QPDF not crossing page boundary 0
  233 +QPDF replace foreign indirect with null 0
  234 +QPDF not copying pages object 0
  235 +QPDF insert foreign page 0
qpdf/qtest/qpdf.test
@@ -379,6 +379,27 @@ $td->runtest("check output", @@ -379,6 +379,27 @@ $td->runtest("check output",
379 {$td->FILE => "a.pdf"}, 379 {$td->FILE => "a.pdf"},
380 {$td->FILE => "from-scratch-0.pdf"}); 380 {$td->FILE => "from-scratch-0.pdf"});
381 # ---------- 381 # ----------
  382 +$td->notify("--- Copy Foreign Objects ---");
  383 +$n_tests += 7;
  384 +
  385 +foreach my $d ([25, 1], [26, 2], [27, 3])
  386 +{
  387 + my ($testn, $outn) = @$d;
  388 + $td->runtest("copy objects $outn",
  389 + {$td->COMMAND => "test_driver $testn" .
  390 + " copy-foreign-objects-in.pdf"},
  391 + {$td->STRING => "test $testn done\n", $td->EXIT_STATUS => 0},
  392 + $td->NORMALIZE_NEWLINES);
  393 + $td->runtest("check output",
  394 + {$td->FILE => "a.pdf"},
  395 + {$td->FILE => "copy-foreign-objects-out$outn.pdf"});
  396 +}
  397 +$td->runtest("copy objects error",
  398 + {$td->COMMAND => "test_driver 28 copy-foreign-objects-in.pdf"},
  399 + {$td->FILE => "copy-foreign-objects-errors.out",
  400 + $td->EXIT_STATUS => 0},
  401 + $td->NORMALIZE_NEWLINES);
  402 +# ----------
382 $td->notify("--- Error Condition Tests ---"); 403 $td->notify("--- Error Condition Tests ---");
383 # $n_tests incremented after initialization of badfiles below. 404 # $n_tests incremented after initialization of badfiles below.
384 405
qpdf/qtest/qpdf/copy-foreign-objects-errors.out 0 → 100644
  1 +logic error: QPDF::copyForeign called with object from this QPDF
  2 +logic error: QPDF::copyForeign called with direct object handle
  3 +test 28 done
qpdf/qtest/qpdf/copy-foreign-objects-in.pdf 0 → 100644
  1 +%PDF-1.3
  2 +%¿÷¢þ
  3 +%QDF-1.0
  4 +
  5 +% This test file is specifically crafted for testing copyForeignObject
  6 +% and also for testing addPage when called with a page from another
  7 +% file.
  8 +
  9 +% The /QTest key in trailer has pointers to several indirect objects:
  10 +% O1, O2, O3 where O1 is an array that contains a dictionary that has
  11 +% a key that points to O2, O2 is a dictionary that contains an array
  12 +% that points to O1, and O3 is a page object that inherits some
  13 +% resource from its parent /Pages and also points to some other page.
  14 +% O1 also points to a stream whose dictionary has a key that points to
  15 +% another stream whose dictionary points back to the first stream.
  16 +
  17 +1 0 obj
  18 +<<
  19 + /Pages 2 0 R
  20 + /Type /Catalog
  21 +>>
  22 +endobj
  23 +
  24 +2 0 obj
  25 +<<
  26 + /Count 5
  27 + /Kids [
  28 + 3 0 R
  29 + 4 0 R
  30 + 5 0 R
  31 + 6 0 R
  32 + 7 0 R
  33 + ]
  34 + /Rotate 180
  35 + /Type /Pages
  36 +>>
  37 +endobj
  38 +
  39 +%% Page 1
  40 +3 0 obj
  41 +<<
  42 + /Contents 8 0 R
  43 + /MediaBox [
  44 + 0
  45 + 0
  46 + 612
  47 + 792
  48 + ]
  49 + /Parent 2 0 R
  50 + /Resources <<
  51 + /Font <<
  52 + /F1 10 0 R
  53 + >>
  54 + /ProcSet [
  55 + /PDF
  56 + /Text
  57 + ]
  58 + >>
  59 + /Type /Page
  60 +>>
  61 +endobj
  62 +
  63 +%% Page 2
  64 +4 0 obj
  65 +<<
  66 + /Contents 11 0 R
  67 + /MediaBox [
  68 + 0
  69 + 0
  70 + 612
  71 + 792
  72 + ]
  73 + /Parent 2 0 R
  74 + /Resources <<
  75 + /Font <<
  76 + /F1 10 0 R
  77 + >>
  78 + /ProcSet [
  79 + /PDF
  80 + /Text
  81 + ]
  82 + >>
  83 + /Type /Page
  84 +>>
  85 +endobj
  86 +
  87 +%% Page 3, object O3
  88 +5 0 obj
  89 +<<
  90 + /This-is-O3 true
  91 + /Contents 13 0 R
  92 + /MediaBox [
  93 + 0
  94 + 0
  95 + 612
  96 + 792
  97 + ]
  98 + /Parent 2 0 R
  99 + /Resources <<
  100 + /Font <<
  101 + /F1 10 0 R
  102 + >>
  103 + /ProcSet [
  104 + /PDF
  105 + /Text
  106 + ]
  107 + >>
  108 + /OtherPage 6 0 R
  109 + /Type /Page
  110 +>>
  111 +endobj
  112 +
  113 +%% Page 4
  114 +6 0 obj
  115 +<<
  116 + /This-is-O3-other-page true
  117 + /Contents 15 0 R
  118 + /MediaBox [
  119 + 0
  120 + 0
  121 + 612
  122 + 792
  123 + ]
  124 + /Parent 2 0 R
  125 + /Resources <<
  126 + /Font <<
  127 + /F1 10 0 R
  128 + >>
  129 + /ProcSet [
  130 + /PDF
  131 + /Text
  132 + ]
  133 + >>
  134 + /Type /Page
  135 +>>
  136 +endobj
  137 +
  138 +%% Page 5
  139 +7 0 obj
  140 +<<
  141 + /Contents 17 0 R
  142 + /MediaBox [
  143 + 0
  144 + 0
  145 + 612
  146 + 792
  147 + ]
  148 + /Parent 2 0 R
  149 + /Resources <<
  150 + /Font <<
  151 + /F1 10 0 R
  152 + >>
  153 + /ProcSet [
  154 + /PDF
  155 + /Text
  156 + ]
  157 + >>
  158 + /Type /Page
  159 +>>
  160 +endobj
  161 +
  162 +%% Contents for page 1
  163 +8 0 obj
  164 +<<
  165 + /Length 9 0 R
  166 +>>
  167 +stream
  168 +BT /F1 15 Tf 72 720 Td (Original page 0) Tj ET
  169 +endstream
  170 +endobj
  171 +
  172 +9 0 obj
  173 +47
  174 +endobj
  175 +
  176 +10 0 obj
  177 +<<
  178 + /BaseFont /Times-Roman
  179 + /Encoding /WinAnsiEncoding
  180 + /Subtype /Type1
  181 + /Type /Font
  182 +>>
  183 +endobj
  184 +
  185 +%% Contents for page 2
  186 +11 0 obj
  187 +<<
  188 + /Length 12 0 R
  189 +>>
  190 +stream
  191 +BT /F1 15 Tf 72 720 Td (Original page 1) Tj ET
  192 +endstream
  193 +endobj
  194 +
  195 +12 0 obj
  196 +47
  197 +endobj
  198 +
  199 +%% Contents for page 3
  200 +13 0 obj
  201 +<<
  202 + /Length 14 0 R
  203 +>>
  204 +stream
  205 +BT /F1 15 Tf 72 720 Td (Original page 2) Tj ET
  206 +endstream
  207 +endobj
  208 +
  209 +14 0 obj
  210 +47
  211 +endobj
  212 +
  213 +%% Contents for page 4
  214 +15 0 obj
  215 +<<
  216 + /Length 16 0 R
  217 +>>
  218 +stream
  219 +BT /F1 15 Tf 72 720 Td (Original page 3) Tj ET
  220 +endstream
  221 +endobj
  222 +
  223 +16 0 obj
  224 +47
  225 +endobj
  226 +
  227 +%% Contents for page 5
  228 +17 0 obj
  229 +<<
  230 + /Length 18 0 R
  231 +>>
  232 +stream
  233 +BT /F1 15 Tf 72 720 Td (Original page 4) Tj ET
  234 +endstream
  235 +endobj
  236 +
  237 +18 0 obj
  238 +47
  239 +endobj
  240 +
  241 +% O1
  242 +19 0 obj
  243 +[
  244 + /This-is-O1
  245 + /potato
  246 + << /O2 [3.14159 << /O2 20 0 R >> 2.17828 ] >>
  247 + /salad
  248 + /O2 20 0 R
  249 + /Stream1 21 0 R
  250 +]
  251 +endobj
  252 +
  253 +% O2
  254 +20 0 obj
  255 +<<
  256 + /This-is-O2 true
  257 + /K1 [2.236 /O1 19 0 R 1.732]
  258 + /O1 19 0 R
  259 +>>
  260 +endobj
  261 +
  262 +% stream1
  263 +21 0 obj
  264 +<<
  265 + /This-is-Stream1 true
  266 + /Length 22 0 R
  267 + /Stream2 23 0 R
  268 +>>
  269 +stream
  270 +This is stream 1.
  271 +endstream
  272 +endobj
  273 +
  274 +22 0 obj
  275 +18
  276 +endobj
  277 +
  278 +% stream2
  279 +23 0 obj
  280 +<<
  281 + /This-is-Stream2 true
  282 + /Length 24 0 R
  283 + /Stream1 21 0 R
  284 +>>
  285 +stream
  286 +This is stream 2.
  287 +endstream
  288 +endobj
  289 +
  290 +24 0 obj
  291 +18
  292 +endobj
  293 +
  294 +% QTest
  295 +25 0 obj
  296 +<< /This-is-QTest true /O1 19 0 R /O2 20 0 R /O3 5 0 R >>
  297 +endobj
  298 +
  299 +xref
  300 +0 26
  301 +0000000000 65535 f
  302 +0000000655 00000 n
  303 +0000000709 00000 n
  304 +0000000845 00000 n
  305 +0000001073 00000 n
  306 +0000001313 00000 n
  307 +0000001580 00000 n
  308 +0000001839 00000 n
  309 +0000002081 00000 n
  310 +0000002183 00000 n
  311 +0000002202 00000 n
  312 +0000002334 00000 n
  313 +0000002438 00000 n
  314 +0000002481 00000 n
  315 +0000002585 00000 n
  316 +0000002628 00000 n
  317 +0000002732 00000 n
  318 +0000002775 00000 n
  319 +0000002879 00000 n
  320 +0000002904 00000 n
  321 +0000003042 00000 n
  322 +0000003138 00000 n
  323 +0000003255 00000 n
  324 +0000003285 00000 n
  325 +0000003402 00000 n
  326 +0000003430 00000 n
  327 +trailer <<
  328 + /Root 1 0 R
  329 + /Size 26
  330 + /QTest 25 0 R
  331 + /ID [<d15f7aca3be584a96c1c94adb0931e71><9adb6b2fdb22e857340f7103917b16e4>]
  332 +>>
  333 +startxref
  334 +3505
  335 +%%EOF
qpdf/qtest/qpdf/copy-foreign-objects-out1.pdf 0 → 100644
  1 +%PDF-1.3
  2 +%¿÷¢þ
  3 +1 0 obj
  4 +<< /Pages 3 0 R /Type /Catalog >>
  5 +endobj
  6 +2 0 obj
  7 +<< /O1 4 0 R /O2 5 0 R /This-is-QTest true >>
  8 +endobj
  9 +3 0 obj
  10 +<< /Count 1 /Kids [ 6 0 R ] /Type /Pages >>
  11 +endobj
  12 +4 0 obj
  13 +[ /This-is-O1 /potato << /O2 [ 3.14159 << /O2 5 0 R >> 2.17828 ] >> /salad /O2 5 0 R /Stream1 7 0 R ]
  14 +endobj
  15 +5 0 obj
  16 +<< /K1 [ 2.236 /O1 4 0 R 1.732 ] /O1 4 0 R /This-is-O2 true >>
  17 +endobj
  18 +6 0 obj
  19 +<< /Contents 8 0 R /MediaBox [ 0 0 612 792 ] /Parent 3 0 R /Resources << /Font << /F1 9 0 R >> /ProcSet 10 0 R >> /Type /Page >>
  20 +endobj
  21 +7 0 obj
  22 +<< /Stream2 11 0 R /This-is-Stream1 true /Length 18 >>
  23 +stream
  24 +This is stream 1.
  25 +endstream
  26 +endobj
  27 +8 0 obj
  28 +<< /Length 44 >>
  29 +stream
  30 +BT
  31 + /F1 24 Tf
  32 + 72 720 Td
  33 + (Potato) Tj
  34 +ET
  35 +endstream
  36 +endobj
  37 +9 0 obj
  38 +<< /BaseFont /Helvetica /Encoding /WinAnsiEncoding /Name /F1 /Subtype /Type1 /Type /Font >>
  39 +endobj
  40 +10 0 obj
  41 +[ /PDF /Text ]
  42 +endobj
  43 +11 0 obj
  44 +<< /Stream1 7 0 R /This-is-Stream2 true /Length 18 >>
  45 +stream
  46 +This is stream 2.
  47 +endstream
  48 +endobj
  49 +xref
  50 +0 12
  51 +0000000000 65535 f
  52 +0000000015 00000 n
  53 +0000000064 00000 n
  54 +0000000125 00000 n
  55 +0000000184 00000 n
  56 +0000000301 00000 n
  57 +0000000379 00000 n
  58 +0000000523 00000 n
  59 +0000000628 00000 n
  60 +0000000721 00000 n
  61 +0000000828 00000 n
  62 +0000000859 00000 n
  63 +trailer << /QTest 2 0 R /Root 1 0 R /Size 12 /ID [<31415926535897932384626433832795><31415926535897932384626433832795>] >>
  64 +startxref
  65 +964
  66 +%%EOF
qpdf/qtest/qpdf/copy-foreign-objects-out2.pdf 0 → 100644
  1 +%PDF-1.3
  2 +%¿÷¢þ
  3 +1 0 obj
  4 +<< /Pages 3 0 R /Type /Catalog >>
  5 +endobj
  6 +2 0 obj
  7 +<< /O1 4 0 R /O2 5 0 R /O3 6 0 R /This-is-QTest true >>
  8 +endobj
  9 +3 0 obj
  10 +<< /Count 2 /Kids [ 7 0 R 6 0 R ] /Type /Pages >>
  11 +endobj
  12 +4 0 obj
  13 +[ /This-is-O1 /potato << /O2 [ 3.14159 << /O2 5 0 R >> 2.17828 ] >> /salad /O2 5 0 R /Stream1 8 0 R ]
  14 +endobj
  15 +5 0 obj
  16 +<< /K1 [ 2.236 /O1 4 0 R 1.732 ] /O1 4 0 R /This-is-O2 true >>
  17 +endobj
  18 +6 0 obj
  19 +<< /Contents 9 0 R /MediaBox [ 0 0 612 792 ] /Parent 3 0 R /Resources << /Font << /F1 10 0 R >> /ProcSet [ /PDF /Text ] >> /Rotate 180 /This-is-O3 true /Type /Page >>
  20 +endobj
  21 +7 0 obj
  22 +<< /Contents 11 0 R /MediaBox [ 0 0 612 792 ] /Parent 3 0 R /Resources << /Font << /F1 12 0 R >> /ProcSet 13 0 R >> /Type /Page >>
  23 +endobj
  24 +8 0 obj
  25 +<< /Stream2 14 0 R /This-is-Stream1 true /Length 18 >>
  26 +stream
  27 +This is stream 1.
  28 +endstream
  29 +endobj
  30 +9 0 obj
  31 +<< /Length 47 >>
  32 +stream
  33 +BT /F1 15 Tf 72 720 Td (Original page 2) Tj ET
  34 +endstream
  35 +endobj
  36 +10 0 obj
  37 +<< /BaseFont /Times-Roman /Encoding /WinAnsiEncoding /Subtype /Type1 /Type /Font >>
  38 +endobj
  39 +11 0 obj
  40 +<< /Length 44 >>
  41 +stream
  42 +BT
  43 + /F1 24 Tf
  44 + 72 720 Td
  45 + (Potato) Tj
  46 +ET
  47 +endstream
  48 +endobj
  49 +12 0 obj
  50 +<< /BaseFont /Helvetica /Encoding /WinAnsiEncoding /Name /F1 /Subtype /Type1 /Type /Font >>
  51 +endobj
  52 +13 0 obj
  53 +[ /PDF /Text ]
  54 +endobj
  55 +14 0 obj
  56 +<< /Stream1 8 0 R /This-is-Stream2 true /Length 18 >>
  57 +stream
  58 +This is stream 2.
  59 +endstream
  60 +endobj
  61 +xref
  62 +0 15
  63 +0000000000 65535 f
  64 +0000000015 00000 n
  65 +0000000064 00000 n
  66 +0000000135 00000 n
  67 +0000000200 00000 n
  68 +0000000317 00000 n
  69 +0000000395 00000 n
  70 +0000000577 00000 n
  71 +0000000723 00000 n
  72 +0000000828 00000 n
  73 +0000000924 00000 n
  74 +0000001024 00000 n
  75 +0000001118 00000 n
  76 +0000001226 00000 n
  77 +0000001257 00000 n
  78 +trailer << /QTest 2 0 R /Root 1 0 R /Size 15 /ID [<31415926535897932384626433832795><31415926535897932384626433832795>] >>
  79 +startxref
  80 +1362
  81 +%%EOF
qpdf/qtest/qpdf/copy-foreign-objects-out3.pdf 0 → 100644
  1 +%PDF-1.3
  2 +%¿÷¢þ
  3 +1 0 obj
  4 +<< /Pages 3 0 R /Type /Catalog >>
  5 +endobj
  6 +2 0 obj
  7 +<< /O1 4 0 R /O2 5 0 R /O3 6 0 R /This-is-QTest true >>
  8 +endobj
  9 +3 0 obj
  10 +<< /Count 3 /Kids [ 7 0 R 8 0 R 6 0 R ] /Type /Pages >>
  11 +endobj
  12 +4 0 obj
  13 +[ /This-is-O1 /potato << /O2 [ 3.14159 << /O2 5 0 R >> 2.17828 ] >> /salad /O2 5 0 R /Stream1 9 0 R ]
  14 +endobj
  15 +5 0 obj
  16 +<< /K1 [ 2.236 /O1 4 0 R 1.732 ] /O1 4 0 R /This-is-O2 true >>
  17 +endobj
  18 +6 0 obj
  19 +<< /Contents 10 0 R /MediaBox [ 0 0 612 792 ] /OtherPage 8 0 R /Parent 3 0 R /Resources << /Font << /F1 11 0 R >> /ProcSet [ /PDF /Text ] >> /Rotate 180 /This-is-O3 true /Type /Page >>
  20 +endobj
  21 +7 0 obj
  22 +<< /Contents 12 0 R /MediaBox [ 0 0 612 792 ] /Parent 3 0 R /Resources << /Font << /F1 13 0 R >> /ProcSet 14 0 R >> /Type /Page >>
  23 +endobj
  24 +8 0 obj
  25 +<< /Contents 15 0 R /MediaBox [ 0 0 612 792 ] /Parent 3 0 R /Resources << /Font << /F1 11 0 R >> /ProcSet [ /PDF /Text ] >> /Rotate 180 /This-is-O3-other-page true /Type /Page >>
  26 +endobj
  27 +9 0 obj
  28 +<< /Stream2 16 0 R /This-is-Stream1 true /Length 18 >>
  29 +stream
  30 +This is stream 1.
  31 +endstream
  32 +endobj
  33 +10 0 obj
  34 +<< /Length 47 >>
  35 +stream
  36 +BT /F1 15 Tf 72 720 Td (Original page 2) Tj ET
  37 +endstream
  38 +endobj
  39 +11 0 obj
  40 +<< /BaseFont /Times-Roman /Encoding /WinAnsiEncoding /Subtype /Type1 /Type /Font >>
  41 +endobj
  42 +12 0 obj
  43 +<< /Length 44 >>
  44 +stream
  45 +BT
  46 + /F1 24 Tf
  47 + 72 720 Td
  48 + (Potato) Tj
  49 +ET
  50 +endstream
  51 +endobj
  52 +13 0 obj
  53 +<< /BaseFont /Helvetica /Encoding /WinAnsiEncoding /Name /F1 /Subtype /Type1 /Type /Font >>
  54 +endobj
  55 +14 0 obj
  56 +[ /PDF /Text ]
  57 +endobj
  58 +15 0 obj
  59 +<< /Length 47 >>
  60 +stream
  61 +BT /F1 15 Tf 72 720 Td (Original page 3) Tj ET
  62 +endstream
  63 +endobj
  64 +16 0 obj
  65 +<< /Stream1 9 0 R /This-is-Stream2 true /Length 18 >>
  66 +stream
  67 +This is stream 2.
  68 +endstream
  69 +endobj
  70 +xref
  71 +0 17
  72 +0000000000 65535 f
  73 +0000000015 00000 n
  74 +0000000064 00000 n
  75 +0000000135 00000 n
  76 +0000000206 00000 n
  77 +0000000323 00000 n
  78 +0000000401 00000 n
  79 +0000000601 00000 n
  80 +0000000747 00000 n
  81 +0000000941 00000 n
  82 +0000001046 00000 n
  83 +0000001143 00000 n
  84 +0000001243 00000 n
  85 +0000001337 00000 n
  86 +0000001445 00000 n
  87 +0000001476 00000 n
  88 +0000001573 00000 n
  89 +trailer << /QTest 2 0 R /Root 1 0 R /Size 17 /ID [<31415926535897932384626433832795><31415926535897932384626433832795>] >>
  90 +startxref
  91 +1678
  92 +%%EOF
qpdf/test_driver.cc
@@ -916,6 +916,89 @@ void runtest(int n, char const* filename) @@ -916,6 +916,89 @@ void runtest(int n, char const* filename)
916 w.setStreamDataMode(qpdf_s_preserve); 916 w.setStreamDataMode(qpdf_s_preserve);
917 w.write(); 917 w.write();
918 } 918 }
  919 + else if (n == 25)
  920 + {
  921 + // The copy object tests are designed to work with a specific
  922 + // file. Look at the test suite for the file, and look at the
  923 + // file for comments about the file's structure.
  924 +
  925 + // Copy qtest without crossing page boundaries. Should get O1
  926 + // and O2 and their streams but not O3 or any other pages.
  927 +
  928 + QPDF newpdf;
  929 + newpdf.processFile("minimal.pdf");
  930 + QPDFObjectHandle qtest = pdf.getTrailer().getKey("/QTest");
  931 + newpdf.getTrailer().replaceKey(
  932 + "/QTest", newpdf.copyForeignObject(qtest));
  933 +
  934 + QPDFWriter w(newpdf, "a.pdf");
  935 + w.setStaticID(true);
  936 + w.setStreamDataMode(qpdf_s_preserve);
  937 + w.write();
  938 + }
  939 + else if (n == 26)
  940 + {
  941 + // Copy the O3 page using addPage. Copy qtest without
  942 + // crossing page boundaries. In addition to previous results,
  943 + // should get page O3 but no other pages including the page
  944 + // that O3 points to. Also, inherited object will have been
  945 + // pushed down and will be preserved.
  946 +
  947 + QPDF newpdf;
  948 + newpdf.processFile("minimal.pdf");
  949 + QPDFObjectHandle qtest = pdf.getTrailer().getKey("/QTest");
  950 + QPDFObjectHandle O3 = qtest.getKey("/O3");
  951 + newpdf.addPage(O3, false);
  952 + newpdf.getTrailer().replaceKey(
  953 + "/QTest", newpdf.copyForeignObject(qtest));
  954 +
  955 + QPDFWriter w(newpdf, "a.pdf");
  956 + w.setStaticID(true);
  957 + w.setStreamDataMode(qpdf_s_preserve);
  958 + w.write();
  959 + }
  960 + else if (n == 27)
  961 + {
  962 + // Copy O3 and the page O3 refers to before copying qtest.
  963 + // Should get qtest plus only the O3 page and the page that O3
  964 + // points to. Inherited objects should be preserved.
  965 +
  966 + QPDF newpdf;
  967 + newpdf.processFile("minimal.pdf");
  968 + QPDFObjectHandle qtest = pdf.getTrailer().getKey("/QTest");
  969 + QPDFObjectHandle O3 = qtest.getKey("/O3");
  970 + newpdf.addPage(O3.getKey("/OtherPage"), false);
  971 + newpdf.addPage(O3, false);
  972 + newpdf.getTrailer().replaceKey(
  973 + "/QTest", newpdf.copyForeignObject(qtest));
  974 +
  975 + QPDFWriter w(newpdf, "a.pdf");
  976 + w.setStaticID(true);
  977 + w.setStreamDataMode(qpdf_s_preserve);
  978 + w.write();
  979 + }
  980 + else if (n == 28)
  981 + {
  982 + // Copy foreign object errors
  983 + try
  984 + {
  985 + pdf.copyForeignObject(pdf.getTrailer().getKey("/QTest"));
  986 + std::cout << "oops -- didn't throw" << std::endl;
  987 + }
  988 + catch (std::logic_error e)
  989 + {
  990 + std::cout << "logic error: " << e.what() << std::endl;
  991 + }
  992 + try
  993 + {
  994 + pdf.copyForeignObject(QPDFObjectHandle::newInteger(1));
  995 + std::cout << "oops -- didn't throw" << std::endl;
  996 + }
  997 + catch (std::logic_error e)
  998 + {
  999 + std::cout << "logic error: " << e.what() << std::endl;
  1000 + }
  1001 + }
919 else 1002 else
920 { 1003 {
921 throw std::runtime_error(std::string("invalid test ") + 1004 throw std::runtime_error(std::string("invalid test ") +