Commit 62d47bff523ec6b64161651b33bb1563b6a80776
1 parent: 05460d40
TODO: notes on QPDFPagesTree
Showing 1 changed file with 33 additions and 9 deletions
TODO
| ... | ... | @@ -11,6 +11,7 @@ In order: |
| 11 | 11 | |
| 12 | 12 | Other (do in any order): |
| 13 | 13 | |
| 14 | +* QPDFPagesTree -- avoid ever flattening the pages tree. | |
| 14 | 15 | * Check about runpath in the linux-bin distribution. I think the |
| 15 | 16 | appimage build specifically is setting the runpath, which is |
| 16 | 17 | actually desirable in this case. Make sure to understand and |
| ... | ... | @@ -56,17 +57,8 @@ Output JSON v2 |
| 56 | 57 | |
| 57 | 58 | Some of this documentation has drifted from the actual implementation. |
| 58 | 59 | |
| 59 | -Make sure pages tree repair generates warnings. | |
| 60 | - | |
| 61 | 60 | * Document that /Length is ignored in stream dictionary replacements |
| 62 | 61 | |
| 63 | -Try to never flatten pages tree. Make sure we do something reasonable | |
| 64 | -with pages tree repair. The problem is that if pages tree repair is | |
| 65 | -done as a side effect of running --json, the qpdf part of the json may | |
| 66 | -contain object numbers that aren't there. Maybe we need to indicate | |
| 67 | -whether pages tree repair has been done in the json, but this would | |
| 68 | -have to be known early in parsing, which is a problem. | |
| 69 | - | |
| 70 | 62 | General things to remember: |
| 71 | 63 | |
| 72 | 64 | * Make sure all the information from --check and other informational |
| ... | ... | @@ -240,6 +232,38 @@ Additionally, using "n n R" as a key in "objects" and "objectinfo" |
| 240 | 232 | messes up searching for things. |
| 241 | 233 | |
| 242 | 234 | |
| 235 | +QPDFPagesTree | |
| 236 | +============= | |
| 237 | + | |
| 238 | +Partial work is on qpdf-pages-tree branch. QPDFPagesTree is mostly | |
| 239 | +implemented and mostly tested. There are not enough test cases of | |
| 240 | +different kinds of operations (pclm, linearize, json, etc.) with | |
| 241 | +non-flat pages trees. Insertion is not implemented. | |
| 242 | + | |
| 243 | +Page tree repair is silent (no warnings) and has a comment saying that | |
| 244 | +we don't need warnings, but I think we should have warnings now that | |
| 245 | +we have json v2. The reason is that page tree repair will change | |
| 246 | +object numbers, and it's useful to know that. | |
| 247 | + | |
| 248 | +I'm thinking we will want to keep a pages cache for efficient | |
| 249 | +insertion. There's no reason we can't keep a vector of page objects up | |
| 250 | +to date and just do a traversal the first time we do getAllPages just | |
| 251 | +like we do now. The difference is that we would not flatten the pages | |
| 252 | +tree. It would be useful to go through QPDF_pages and reimplement | |
| 253 | +everything without calling flattenPagesTree. Then we can remove | |
| 254 | +flattenPagesTree, which is private. | |
| 255 | + | |
| 256 | +In its current state, QPDFPagesTree does not proactively fix /Type or | |
| 257 | +correct page objects that are used multiple times. You have to | |
| 258 | +traverse the pages tree to trigger this operation. It would be nice to | |
| 259 | +do that somewhere, but no more often than necessary, so that | |
| 260 | +isPagesObject and isPageObject are reliable. Maybe add a validate or | |
| 261 | +repair function? It should also make sure /Count and /Parent are | |
| 262 | +correct. | |
| 263 | + | |
| 264 | +refs/attic/QPDFPagesTree-old -- original, abandoned branch -- clean up | |
| 265 | +when done. | |
| 266 | + | |
| 243 | 267 | QPDFJob |
| 244 | 268 | ======= |
| 245 | 269 | ... | ... |
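
The QPDFPagesTree notes added in this commit describe two ideas: keeping a vector of page objects as a cache without flattening the pages tree, and a validate or repair step that checks /Count and /Parent. The following is only a rough sketch of how those two ideas might look on a toy tree. It does not use qpdf at all: `Node`, `add_kid`, `collect_pages`, `repair_counts`, and `make_sample_tree` are hypothetical names invented for illustration, and the `fixes` counter merely stands in for the warnings the notes suggest emitting.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pages tree node; this is not a qpdf type.
struct Node {
    bool is_page = false;   // true for a leaf page, false for an interior /Pages node
    std::string label;      // only meaningful for leaf pages
    int count = 0;          // stored /Count, possibly wrong in a damaged file
    Node* parent = nullptr; // stored /Parent
    std::vector<std::unique_ptr<Node>> kids;
};

// Append a child to an interior node and return a pointer to it.
Node* add_kid(Node& parent, bool is_page, std::string label = "") {
    assert(!parent.is_page); // a leaf page cannot have kids
    auto kid = std::make_unique<Node>();
    kid->is_page = is_page;
    kid->label = std::move(label);
    kid->parent = &parent;
    parent.kids.push_back(std::move(kid));
    return parent.kids.back().get();
}

// Depth-first traversal that appends leaf pages to `cache` in document
// order. The tree itself is not modified, so it stays non-flat.
void collect_pages(Node& node, std::vector<Node*>& cache) {
    if (node.is_page) {
        cache.push_back(&node);
        return;
    }
    for (auto& kid : node.kids) {
        collect_pages(*kid, cache);
    }
}

// Recompute /Count and /Parent bottom-up. Returns the number of leaf
// pages at or below `node` and increments `fixes` once per corrected
// field, which is where a real implementation might issue a warning.
int repair_counts(Node& node, Node* expected_parent, int& fixes) {
    if (node.parent != expected_parent) {
        node.parent = expected_parent;
        ++fixes;
    }
    if (node.is_page) {
        return 1;
    }
    int total = 0;
    for (auto& kid : node.kids) {
        total += repair_counts(*kid, &node, fixes);
    }
    if (node.count != total) {
        node.count = total;
        ++fixes;
    }
    return total;
}

// Build a small non-flat tree: root -> [ left -> [p1, p2], p3 ].
// The root's /Count is deliberately wrong so the repair has work to do.
std::unique_ptr<Node> make_sample_tree() {
    auto root = std::make_unique<Node>();
    Node* left = add_kid(*root, false);
    add_kid(*left, true, "p1");
    add_kid(*left, true, "p2");
    add_kid(*root, true, "p3");
    left->count = 2;
    root->count = 5;
    return root;
}
```

In this sketch the cache holds pointers into the existing tree, so insertion would need to update both the tree and the vector; how that should work in qpdf itself is left open by the notes.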