Commit a5367003c3eaf7f21d369c1c6d11338564cf04f2
1 parent 2118eeca
TODO: add analytics ideas
Showing 1 changed file with 20 additions and 1 deletion
TODO
| ... | ... | @@ -58,7 +58,6 @@ C++-11 |
| 58 | 58 | definitely break binary compatibility as the PointerHolder<Members> |
| 59 | 59 | pattern is part of the ABI for almost every class. |
| 60 | 60 | |
| 61 | - | |
| 62 | 61 | Page splitting/merging |
| 63 | 62 | ====================== |
| 64 | 63 | |
| ... | ... | @@ -208,6 +207,26 @@ Future ideas: |
| 208 | 207 | Also, it turns out that PointerHolder is more performant than |
| 209 | 208 | std::shared_ptr. |
| 210 | 209 | |
| 210 | +Analytics | |
| 211 | +========= | |
| 212 | + | |
| 213 | +Consider features that make it easier to detect certain patterns in | |
| 214 | +PDF files. The information below could be computed using an external | |
| 215 | +program that reads the existing json, but if it's useful enough, we | |
| 216 | +could add it directly to the json output. | |
| 217 | + | |
| 218 | + * Add to "pages" in the json: | |
| 219 | + * "inheritsresources": bool; whether there are any inherited | |
| 220 | + attributes from ancestor page tree nodes | |
| 221 | + * "sharedresources": a list of indirect objects that are | |
| 222 | + "/Resources" dictionaries or "XObject" resource dictionary subkeys | |
| 223 | + of either the page itself or of any form XObject referenced by the | |
| 224 | + page. | |
| 225 | + | |
| 226 | + * Add to "objectinfo" in json: "directpagerefcount": the number of | |
| 227 | + pages that directly reference this object (i.e., you can find an | |
| 228 | + indirect reference to the object in the page dictionary without | |
| 229 | + traversing over any indirect objects) | |
| 211 | 230 | |
| 212 | 231 | General |
| 213 | 232 | ======= | ... | ... |
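
The "directpagerefcount" idea in the added Analytics section could be prototyped in an external program before anything is added to the JSON output. The sketch below is only an illustration under stated assumptions: it takes page dictionaries that are already loaded as Python dicts, and it assumes indirect references are represented as strings of the form "12 0 R". The function names and the sample data are hypothetical and are not part of qpdf.

```python
import re
from collections import Counter

# Assumption: an indirect reference is a string such as "12 0 R".
INDIRECT_REF = re.compile(r"^\d+ \d+ R$")


def direct_refs(value):
    """Return the indirect references found in value without
    following any of them into other indirect objects."""
    refs = set()
    if isinstance(value, str):
        if INDIRECT_REF.match(value):
            refs.add(value)
    elif isinstance(value, dict):
        for child in value.values():
            refs |= direct_refs(child)
    elif isinstance(value, list):
        for child in value:
            refs |= direct_refs(child)
    return refs


def direct_page_ref_counts(page_dicts):
    """Count, for each object, how many pages reference it directly."""
    counts = Counter()
    for page in page_dicts:
        # A set is used so that each page counts at most once per object.
        counts.update(direct_refs(page))
    return counts


# Hypothetical sample data: two pages that share one content stream.
pages = [
    {"/Type": "/Page", "/Parent": "2 0 R", "/Contents": "5 0 R",
     "/Resources": {"/XObject": {"/Im1": "7 0 R"}}},
    {"/Type": "/Page", "/Parent": "2 0 R",
     "/Contents": ["6 0 R", "5 0 R"], "/Resources": "8 0 R"},
]
counts = direct_page_ref_counts(pages)
```

With this sample, "5 0 R" is referenced directly by both pages, while "7 0 R" is referenced by only the first. Note that this reading of the definition also counts "/Parent"; whether that is desirable is an open design question.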