Commit 122df44a78a58520fdbc04a3e60c10fea959a1ca

Authored by Jay Berkenbilt
1 parent bfc5fc51

TODO-pages: minor cleanup before rework

Showing 1 changed file with 52 additions and 25 deletions
TODO-pages.md
... ... @@ -73,6 +73,46 @@ Here are some examples of things that will become possible:
73 73 The rest of this document describes the details of how these features will work and what needs
74 74 to be done to make them possible to build.
75 75  
  76 +# QPDFJob Summary
  77 +
  78 +`QPDFJob` goes through the following stages:
  79 +
  80 +* create QPDF
  81 + * update from JSON
  82 + * page specs (`--pages`)
  83 + * Create a QPDF for each input source
  84 + * Figure out whether to keep files open
  85 + * Remove unreferenced resources if needed
  86 + * Remove pages from the pages tree
  87 + * Handle collation
  88 + * Copy or revive all final pages
  89 + * When copying foreign pages, possibly remove unreferenced resources
  90 + * Handle the same page copied more than once by doing a shallow copy
  91 + * Preserve form fields and page labels
  92 + * Delete pages from the primary input that were not used in the output
  93 + * Delete unreferenced form fields
  94 + * rotation
  95 + * underlay/overlay
  96 + * transformations
  97 + * disable signatures
  98 + * externalize images
  99 + * optimize images
  100 + * generate appearances
  101 + * flatten annotations
  102 + * coalesce contents
  103 + * flatten rotation
  104 + * remove page labels
  105 + * remove attachments
  106 + * add attachments
  107 + * copy attachments
  108 +* write QPDF
  109 + * One of:
  110 + * Do inspections
  111 + * Write single file
  112 + * Split pages
  113 + * Remove unreferenced resources if needed
  114 + * Preserve form fields and page labels
  115 +
76 116 # Architectural Thoughts
77 117  
78 118 Create a new top-level class called `QPDFAssembler` that will be used to perform page-level
... ... @@ -198,7 +238,7 @@ Here is a list of cases that need to be expressible.
198 238 * front: 4,9, back: 10,3
199 239 * front: 2,11, back: 12,1
200 240  
201   - This is the same as dupex 2-up with pages in order 6, 7, 8, 5, 4, 9, 10, 3, 2, 11, 12, 1
  241 + This is the same as duplex 2-up with pages in order 6, 7, 8, 5, 4, 9, 10, 3, 2, 11, 12, 1
202 242 * n-up:
203 243 * For 2-up, calculate new w and h such that w/h maintains a fixed ratio and w and h are the
204 244 largest values that can fit within 1/2 the page with specified margins.
... ... @@ -207,7 +247,6 @@ Here is a list of cases that need to be expressible.
207 247 * Maybe define n-up a scale and rotate followed by fitting the result into a specified rectangle.
208 248 I might already have this logic in QPDFAnnotationObjectHelper::getPageContentForAppearance.
209 249  
210   -
211 250 # Feature to Issue Mapping
212 251  
213 252 Last checked: 2023-12-29
... ... @@ -494,62 +533,47 @@ PDF document structure
494 533 The trailer contains the catalog and the Info dictionary. We probably need to do something
495 534 intelligent with the info dictionary.
496 535  
497   -
498 536 7.7.2 contains the list of all keys in the document catalog.
499 537  
500   -Document-level structures:
  538 +Document-level structures to merge:
501 539 * Extensions
502 540 * Must be combination of Extensions from all input files
503 541 * PageLabels
504 542 * Ensure each page has its original label
505 543 * Allow post-processing
506 544 * Names -- see below
507   - * Combined and disambiguated
  545 + * Combine per tree
  546 + * May require disambiguation
508 547 * Page: TemplateInstantiated
509   -ombine from all files
510 548 * Dests
511 549 * Keep referenced destinations across all files
512 550 * May need to disambiguate or "flatten" or convert to named dests with the names tree
513 551 * Outlines
514 552 * Threads (easy)
515 553 * Page: B
516   -* AA (Additional Actions)
517   - * Merge from different files if possible
518   - * If duplicate, first contributor wins
519 554 * AcroForm
520   - * Merge
521 555 * StructTreeRoot
522   - * Combine
523 556 * Page: StructParents
524 557 * MarkInfo (see 14.7 - Logical Structure, 14.8 Tagged PDF)
525   - * Combine
526 558 * SpiderInfo
527   - * Combine
528 559 * Page: ID
529 560 * OutputIntents
530   - * Combine
531 561 * Page: OutputIntents
532 562 * PieceInfo
533   - * Combine
534 563 * Page: PieceInfo
535 564 * OCProperties
536   - * Combine across documents
537 565 * Requirements
538   - * Combine
539 566 * AF (file specification dictionaries)
540   - * Combine
541 567 * Page: AF
542 568 * DPartRoot
543   - * Combine
544 569 * Page: DPart
545   -
546   -Things qpdf probably needs to drop
547 570 * Version
548   -* Perms
549   -* Legal
550   -* DSS
  571 + * Maximum
551 572  
552 573 Things that stay with the first document that has one and/or will not be supported
  574 +* AA (Additional Actions)
  575 + * Would be possible to combine and let the first contributor win, but it probably wouldn't usually
  576 + be what we want.
553 577 * Info (not part of document catalog)
554 578 * ViewerPreferences
555 579 * PageLayout
... ... @@ -560,10 +584,13 @@ Things that stay with the first document that has one and/or will not be support
560 584 * Lang
561 585 * NeedsRendering
562 586 * Collection
  587 +* Perms
  588 +* Legal
  589 +* DSS
563 590  
564 591 Name dictionary (7.7.4)
565 592 * Dests
566   -* AP (appearance strams)
  593 +* AP (appearance streams)
567 594 * JavaScript
568 595 * Pages (named pages)
569 596 * Templates
... ...
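
Two of the merge rules listed in this diff (`Version`: maximum; `Names`: combine per tree, may require disambiguation) can be illustrated with a minimal Python sketch. This is not qpdf code and not the planned `QPDFAssembler` design: the function names, the plain-dict representation of a name tree, and the `name.<input index>` renaming scheme are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List


def max_pdf_version(versions: Iterable[str]) -> str:
    """Return the highest version string, comparing components numerically.

    Hypothetical helper: models the "Version: Maximum" rule.
    """
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


def merge_name_tree(trees: List[Dict[str, str]]) -> Dict[str, str]:
    """Combine one kind of name tree (e.g. Dests) from several inputs.

    Each tree is modeled as a plain dict of name -> target. When two inputs
    use the same name for different targets, the later entry is renamed.
    The renaming scheme is an assumption, not something the source specifies.
    """
    merged: Dict[str, str] = {}
    for index, tree in enumerate(trees):
        for name, target in tree.items():
            key = name
            if key in merged and merged[key] != target:
                # Collision: disambiguate using the input's position.
                key = f"{name}.{index}"
                counter = 0
                while key in merged:
                    counter += 1
                    key = f"{name}.{index}.{counter}"
            merged[key] = target
    return merged
```

In a real merge, any references to a renamed entry would also have to be rewritten, which this sketch does not attempt.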