Commit de3094124d5221039c1906fb5da79eabf4953560

Authored by Jay Berkenbilt
1 parent d8d70ecc

Include information from all `pages` issues in TODO-pages.md

Showing 1 changed file with 89 additions and 1 deletions
TODO-pages.md
... ... @@ -14,6 +14,93 @@ This file contains plans and notes regarding implementing of the "pages epic." T
14 14 * booklet generation ("signatures", as in what `psbook` does)
15 15 * Possibly others pending analysis of open issues and public discussion
16 16  
  17 +# Feature to Issue Mapping
  18 +
  19 +Last checked: 2023-12-29
  20 +
  21 +* Questions/ideas
  22 + * I have often wondered whether we need to be able to attach arbitrary metadata to a QPDFObjectHandle (or object or value) and to control whether it should be included in copies. For example, one could attach to a page which qpdf id and page number it came from, then carry that around as the page was converted to a form xobject, inserted into a foreign file, etc. It feels like something like that will be needed to support some of these features.
  23 +* Generate a mapping from source to destination for all destinations
  24 + * Issues: #1077
  25 + * Notes:
  26 + * Source can be an outline or link, either directly or via action. If link, it should include the page.
  27 + * Destination can be a structure destination, which should map to a regular destination.
  28 + * source: page X -> link -> action -> dest: page Y
  29 + * source: page X -> link -> action -> dest: structure -> page Y
  30 + * Consider something in json that dumps this.
  31 + * We will need to associate this with a QPDF. It would be great if remote or embedded go-to actions could be handled, but that's ambitious.
  32 + * It will be necessary to keep some global map that includes all QPDF objects that are part of the final file.
  33 + * An interesting use case to consider would be to create a QPDF object from an embedded file and append the embedded file and make the embedded actions work. This would probably require some way to tell qpdf that a particular external file came from an embedded file.
  34 +
  35 +* Control size of page and position/transformation of overlay/underlay
  36 + * Issues: #1031, #811, #740, #559
  37 + * Notes:
  38 + * It should be possible to define a destination page from scratch or in terms of other pages and then place page contents onto it with arbitrary transformations applied.
  39 + * It should be possible to compute the size of the destination page in terms of the source pages, e.g., to create one long or wide page from other pages.
  40 + * Also allow specification of which page box to use
  41 +* Preserve hyperlinks when doing any page operations
  42 + * See also "Generate a mapping from source to destination for all destinations"
  43 + * Issues: #1003, #797, #94
  44 + * Notes:
  45 + * A link annotation that points to a destination rather than an external URL should continue to work when files are split or merged.
  46 +* Awareness of structured and tagged PDF (14.7, 14.8)
  47 + * Issues: #957, #953, #490
  48 + * Notes:
  49 + * This looks complicated. It may not be possible to do this fully in the first increment, but we have to keep it in mind and warn if we can't handle it and we see /SD in an action.
  50 + * #490 has some good analysis
  51 +* Assign page labels
  52 + * Issues: #939
  53 + * Notes:
  54 + * #939 has a good proposal
  55 + * This could be applied to page groups, and we could have an option to keep the labels as they are in a given group, which is what qpdf does now.
  56 +* Interleave pages with ordering
  57 + * Issues: #921
  58 + * Notes:
  59 + * From #921: interleave odd pages and reversed even pages. This might require different handling for even/odd numbers of pages. Make sure it's natural for the cases of len(odd) == len(even) or len(odd) == 1+len(even)
  60 +* Preserve all attachments when merging files
  61 + * Issues: #856
  62 + * Notes:
  63 + * If all pages of a file are selected, keep all attachments
  64 + * If some pages of a file are selected
  65 + * Keep all attachments if there are any embedded file annotations
  66 + * Otherwise, what? Do we have a keep-attachments flag of some sort? Or do we just make the user copy attachments from one file to another?
  67 +* Create page group by excluding pages
  68 + * Issues: #790, #564
  69 + * Notes:
  70 + * Handle cases in `PageSelector` below
  71 +* Apply clipping to a page
  72 + * Issues: #771
  73 + * Notes:
  74 + * Create a form xobject from a page, then apply a specific clipping region expressed in coordinates or as a percentage
  75 +* Ability to create a blank page
  76 + * Issues: #753
  77 + * Notes:
  78 + * Create a blank page of a specific size or of the same size as another page
  79 +* Split groups with explicit boundaries
  80 + * Issues: #741, #616
  81 + * Notes:
  82 + * Example: --split-after a,b,c
  83 +* Handle Optional Content (8.11)
  84 + * Issues: #672, #9
  85 +* Scale a page up or down to fit to a size
  86 + * Issues: #611
  87 +* Place contents of pages adjacent horizontally or vertically on one page
  88 + * Issues: #1040, #546
  89 +* nup, booklet
  90 + * Issues: #493, #461, #152
  91 + * Notes:
  92 + * #461 may want the inverse of booklet and discusses reader and printer spreads
  93 +* Flexible multiplexing
  94 + * Issues: #505
  95 +* Split pages based on outlines
  96 + * Issues: #477
  97 +* Keep relevant parts of outline hierarchy
  98 + * Issues: #457, #356, #343, #323
  99 + * Notes:
  100 + * There is some helpful discussion in #343 including
  101 + * Preserving open/closed status
  102 + * Preserving javascript actions
  103 +
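Editorial sketch (not part of the commit): the "Interleave pages with ordering" item above describes merging odd pages with reversed even pages, where len(odd) is either len(even) or 1+len(even). A minimal Python illustration of that ordering follows. The function name `interleave_reversed` and the use of plain lists to stand in for page groups are assumptions made for illustration only; nothing here reflects an existing qpdf API.

```python
from itertools import zip_longest

# Sentinel so that None can still be used as a legitimate page value.
_MISSING = object()


def interleave_reversed(odd, even):
    """Interleave `odd` with `even` taken in reverse order.

    `odd` holds the odd-numbered pages in reading order; `even` holds the
    even-numbered pages in reverse order (as produced by flipping a stack
    and scanning the back sides). Hypothetical helper, for illustration.
    """
    # Only the two cases named in the notes are accepted:
    # len(odd) == len(even) or len(odd) == 1 + len(even).
    if len(odd) - len(even) not in (0, 1):
        raise ValueError("expected len(odd) == len(even) or len(even) + 1")

    result = []
    for o, e in zip_longest(odd, reversed(even), fillvalue=_MISSING):
        result.append(o)
        if e is not _MISSING:  # the last odd page may have no partner
            result.append(e)
    return result


if __name__ == "__main__":
    print(interleave_reversed([1, 3, 5], [6, 4, 2]))
    print(interleave_reversed([1, 3, 5], [4, 2]))
```

Both permitted length cases go through the same loop, so a trailing odd page needs no special branch; any other length mismatch is rejected rather than silently padded.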
17 104 # Architectural Thoughts
18 105  
19 106 I want to encapsulate various aspects of the logic into interfaces that can be implemented by developers to add their own logic. It should be easy to contribute these. Here are some rough ideas.
... ... @@ -52,6 +139,7 @@ Here are some examples.
52 139 * the left-front (left-back, right-front, right-back) pages of a booklet with signatures of n pages
53 140 * all pages reachable from a section of the outline hierarchy or something based on threads or other structure
54 141 * selection based on page labels
  142 + * pages in a group except pages in another group
55 143 * PageTransformer
56 144 * clip to media box (trim box, crop box, etc.)
57 145 * clip to specific absolute or relative size
... ... @@ -188,7 +276,7 @@ Document-level structures:
188 276 * AcroForm
189 277 * StructTreeRoot
190 278 * Page: StructParents
191   -* MarkInfo
  279 +* MarkInfo (see 14.7 - Logical Structure, 14.8 Tagged PDF)
192 280 * SpiderInfo
193 281 * Page: ID
194 282 * OutputIntents
... ...