Commit a5a016cdd26a8e5c99e5f019bc30d1bdf6c050a2

Authored by Jay Berkenbilt
1 parent 8ce3b53c

Revert preservations of outlines with --split-pages

The preservation of outlines didn't provide very useful behavior
anyway as it copied all outlines but most didn't work. This
implementation also caused a very significant performance hit and so
is being reverted until a proper solution can be coded. The eventual
solution will not be compatible with the reverted solution anyway, so
it's best not to leave this in.
ChangeLog
  1 +2019-04-20 Jay Berkenbilt <ejb@ql.org>
  2 +
  3 + * Revert change that included preservation of outlines (bookmarks)
  4 + in --split-pages. The way it was implemented caused a very
  5 + significant performance penalty when splitting pages with
  6 + outlines. We need a better solution that only copies the relevant
  7 + items, not the whole tree.
  8 +
1 9 2019-03-11 Jay Berkenbilt <ejb@ql.org>
2 10  
3 11 * JSON serialization: add missing leading 0 to decimal values
... ...
... ... @@ -77,6 +77,17 @@ Page splitting/merging
77 77 * make sure conflicting named destinations work possibly test by
78 78 including the same file by two paths in a merge
79 79  
  80 + Note: original implementation of bookmark preservation for split
  81 + pages caused a very high performance hit. The problem was
  82 + introduced in 313ba081265f69ac9a0324f9fe87087c72918191 and reverted
  83 + in the commit that adds this paragraph. The revert includes marking
  84 + a few test cases as $td->EXPECT_FAILURE. When properly coded, the
  85 + test cases will need to be adjusted to only include the parts of
  86 + the outlines that are actually copied. The tests in question are
  87 + "split page with outlines". When implementing properly, ensure that
  88 + the performance is not adversely affected by timing split-pages on
  89 + a large file with complex outlines such as the PDF specification.
  90 +
80 91 When pruning outlines, keep all outlines in the hierarchy that are
81 92 above an outline for a page we care about. If one of the ancestor
82 93 outlines points to a non-existent page, clear its dest. If an
... ...
qpdf/qpdf.cc
... ... @@ -4966,30 +4966,6 @@ static void write_outfile(QPDF& pdf, Options& o)
4966 4966 "/Nums", QPDFObjectHandle::newArray(labels));
4967 4967 outpdf.getRoot().replaceKey("/PageLabels", page_labels);
4968 4968 }
4969   - // Copying the outlines tree, names table, and any
4970   - // outdated Dests key from the original file will make
4971   - // some things work in the split files. It is not a
4972   - // complete solution, but at least outlines whose
4973   - // destinations are on pages that have been preserved will
4974   - // work normally. There are other top-level structures
4975   - // that should be copied as well. This will be improved in
4976   - // the future.
4977   - std::list<std::string> to_copy;
4978   - to_copy.push_back("/Names");
4979   - to_copy.push_back("/Dests");
4980   - to_copy.push_back("/Outlines");
4981   - for (std::list<std::string>::iterator iter = to_copy.begin();
4982   - iter != to_copy.end(); ++iter)
4983   - {
4984   - QPDFObjectHandle orig = pdf.getRoot().getKey(*iter);
4985   - if (! orig.isIndirect())
4986   - {
4987   - orig = pdf.makeIndirectObject(orig);
4988   - }
4989   - outpdf.getRoot().replaceKey(
4990   - *iter,
4991   - outpdf.copyForeignObject(orig));
4992   - }
4993 4969 std::string page_range = QUtil::int_to_string(first, pageno_len);
4994 4970 if (o.split_pages > 1)
4995 4971 {
... ...
qpdf/qtest/qpdf.test
... ... @@ -1594,6 +1594,8 @@ foreach my $i (qw(01-06 07-11))
1594 1594 {$td->FILE => "labels-split-$i.pdf"});
1595 1595 }
1596 1596  
  1597 +# See comments in TODO about these expected failures. Search for
  1598 +# "split page with outlines".
1597 1599 $td->runtest("split page with outlines",
1598 1600 {$td->COMMAND => "qpdf --qdf --static-id --split-pages=10".
1599 1601 " outlines-with-actions.pdf split-out-outlines.pdf"},
... ... @@ -1602,7 +1604,8 @@ foreach my $i (qw(01-10 11-20 21-30))
1602 1604 {
1603 1605 $td->runtest("check output ($i)",
1604 1606 {$td->FILE => "split-out-outlines-$i.pdf"},
1605   - {$td->FILE => "outlines-split-$i.pdf"});
  1607 + {$td->FILE => "outlines-split-$i.pdf"},
  1608 + $td->EXPECT_FAILURE)
1606 1609 }
1607 1610  
1608 1611 foreach my $d (@sp_cases)
... ...
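
The TODO note above describes the intended replacement only in words: keep every outline that sits above an outline for a preserved page, and clear the destination of any ancestor that points to a page that no longer exists. The following is a minimal sketch of that pruning rule, not the eventual qpdf implementation. OutlineNode, dest_page, and prune_outline are hypothetical names made up for illustration; they do not correspond to any qpdf type or API, and a real version would operate on QPDFObjectHandle dictionaries instead. It assumes C++17.

```cpp
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one outline (bookmark) item. Not a qpdf type.
struct OutlineNode
{
    std::string title;
    int dest_page;  // target page number, or -1 when there is no destination
    std::vector<OutlineNode> children;
};

// Return a pruned copy of node, or std::nullopt when neither the node
// nor any of its descendants targets a page in kept_pages.
std::optional<OutlineNode>
prune_outline(OutlineNode const& node, std::set<int> const& kept_pages)
{
    OutlineNode result{node.title, node.dest_page, {}};

    // Prune children first so we know whether this node has to survive
    // as an ancestor of something we care about.
    for (auto const& child : node.children)
    {
        if (auto kept = prune_outline(child, kept_pages))
        {
            result.children.push_back(std::move(*kept));
        }
    }

    bool const dest_kept = kept_pages.count(node.dest_page) > 0;
    if (!dest_kept && result.children.empty())
    {
        // Nothing at or below this node refers to a preserved page.
        return std::nullopt;
    }
    if (!dest_kept)
    {
        // Retained only as an ancestor; its own target page is gone,
        // so clear the destination.
        result.dest_page = -1;
    }
    return result;
}
```

Each node is visited once, so the work per output file is proportional to the size of the outline tree. Whether that is fast enough when repeated for every split file is exactly what the timing check mentioned in the TODO would need to confirm.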