Commit 512a518dd9327147444d4207cc395bff967d1079

Authored by Jay Berkenbilt
1 parent f34af6b8

Update TODO

Showing 1 changed file with 40 additions and 1 deletions
1 1 Soon
2 2 ====
3 3
  4 + * Take changes on encryption-keys branch and make them usable.
  5 + Replace the hex encoding and decoding piece, and come up with a
  6 + more robust way of specifying the key.
  7 +
4 8 * Consider whether there should be a mode in which QPDFObjectHandle
5 9 returns nulls for operations on the wrong type instead of asserting
6 10 the type. The way things are wired up now, this would have to be a
@@ -19,7 +23,7 @@ Soon
19 23
20 24 * Support user-pluggable stream filters. This would enable external
21 25 code to provide interpretation for filters that are missing from
22 - qpdf. Make it possible for user-provided fitlers to override
  26 + qpdf. Make it possible for user-provided filters to override
23 27 built-in filters. Make sure that the pluggable filters can be
24 28 prioritized so that we can poll all registered filters to see
25 29 whether they are capable of filtering a particular stream.
@@ -37,6 +41,41 @@ Soon
37 41 - See ../misc/broken-files
38 42
39 43
  44 +Lexical
  45 +=======
  46 +
  47 +Consider rewriting the tokenizer. These are rough ideas at this point.
  48 +I may or may not do this as described.
  49 +
  50 + * Use flex. Generate them from ./autogen.sh and include them in the
  51 + source package, but do not commit them.
  52 +
  53 + * Make it possible to run the lexer (tokenizer) over a whole file
  54 + such that the following things would be possible:
  55 +
  56 + * Rewrite fix-qdf in C++ so that there is no longer a runtime perl
  57 + dependency
  58 +
  59 + * Create a way to filter content streams that could be used to
  60 + preserve the content stream exactly including spaces but also to
  61 + do things like replace everything between a detected set of
  62 + markers. This is to support form flattening. Ideally, it should
  63 + be possible to use this programmatically on broken files.
  64 +
  65 + * Make it possible to replace all strings in a file lexically even
  66 + on badly broken files. Ideally this should work on files that are
  67 + lacking xref, have broken links, etc., and ideally it should work
  68 + with encrypted files if possible. This should go through the
  69 + streams and strings and replace them with fixed or random
  70 + characters, preferably, but not necessarily, in a manner that
  71 + works with fonts. One possibility would be to detect whether a
  72 + string contains characters with normal encoding, and if so, use
  73 + 0x41. If the string uses character maps, use 0x01. The output
  74 + should otherwise be unrelated to the input. This could be built
  75 + after the filtering and tokenizer rewrite and should be done in a
  76 + manner that takes advantage of the other lexical features. This
  77 + sanitizer should also clear metadata and replace images.
  78 +
40 79 General
41 80 =======
42 81
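
The string-replacement idea in the Lexical section above (0x41 for strings with normal encoding, 0x01 for strings that use character maps) can be illustrated with a minimal sketch. This is not qpdf code: the function name `sanitize_string` and the `uses_character_map` flag are hypothetical, and the TODO does not say how the encoding would be detected, so the sketch assumes the caller already knows.

```python
def sanitize_string(data: bytes, uses_character_map: bool) -> bytes:
    """Return a same-length replacement whose content is unrelated to the input.

    Hypothetical helper: uses_character_map is assumed to be supplied by the
    caller, since the TODO leaves the detection method open.
    """
    # 0x41 ('A') for normally encoded strings, 0x01 for character-map strings,
    # as suggested in the TODO entry.
    filler = 0x01 if uses_character_map else 0x41
    # Every byte becomes the filler, so only the length of the input survives.
    return bytes([filler]) * len(data)
```

Keeping the length unchanged is a choice made for this sketch only; the TODO requires just that the output be otherwise unrelated to the input.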