Commit c8729398ddb9ac82b00bbafaf24e8d37543e5b9e

Authored by Jay Berkenbilt
1 parent b4bd124b

Generate help content from manual

This is a massive rewrite of the help text and cli.rst section of the
manual. All command-line flags now have their own help and are
specifically indexed. qpdf --help is completely redone.
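
The generator changes in this commit enforce two things: every option must have help text, and every --help=topic reference inside help text must name a real topic or option. A minimal sketch of that kind of check follows. It is not the actual generate_auto_job code: the function name check_help and its parameters are hypothetical, and it assumes the option names, topic names, and help texts have already been collected. Only the --help= pattern is taken from the diff.

```python
import re


def check_help(options, topics, help_texts):
    """Return (options lacking help, dangling --help= references).

    options:    set of option names such as '--linearize' (hypothetical input)
    topics:     set of help topic names such as 'usage' (hypothetical input)
    help_texts: dict mapping a topic or option name to its long help text
    """
    # Options that never received a help entry.
    missing = sorted(options - set(help_texts))

    # Collect every name mentioned as --help=name in any help text.
    # The pattern stops at whitespace or a period, as in the diff.
    referenced = set()
    for text in help_texts.values():
        for m in re.finditer(r'--help=([^\.\s]+)', text):
            referenced.add(m.group(1))

    # A reference is dangling if it names neither a topic nor an option.
    dangling = sorted(referenced - (options | topics))
    return missing, dangling
```

A caller could raise an exception when either list is non-empty, which is how the real script treats these conditions.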
Makefile
... ... @@ -127,7 +127,9 @@ check: $(TEST_TARGETS)
127 127 .PHONY: spell
128 128 # npm install -g cspell; add exceptions to cSpell.json
129 129 spell:
130   - cspell **/*.hh include/qpdf/*.h **/*.cc manual/* ChangeLog README* TODO
  130 + cspell **/*.hh include/qpdf/*.h **/*.cc \
  131 + manual/*.rst manual/*.in manual/_ext/*.py \
  132 + ChangeLog README* TODO
131 133  
132 134 # Install targets are in the make directory in the rules-specific make
133 135 # fragments.
... ...
... ... @@ -34,16 +34,14 @@ Documentation
34 34 * Consider which parts might be good candidates for moving to the
35 35 wiki.
36 36  
37   -* See #530 -- add an appendix explaining PDF encryption in general
38   - plus how it's handled by qpdf. Or maybe this should go on the wiki.
39   -
40 37 Document-level work
41 38 ===================
42 39  
43 40 * Ideas here may by superseded by #593.
44 41  
45 42 * QPDFPageCopier -- object for moving pages around within files or
46   - between files and performing various transformations
  43 + between files and performing various transformations. Reread/rewrite
  44 + _page-selection in the manual if needed.
47 45  
48 46 * Handle all the stuff of pages and split-pages
49 47 * Do n-up, booklet, collation
... ...
cSpell.json
... ... @@ -54,6 +54,7 @@
54 54 "cerr",
55 55 "cfis",
56 56 "cflags",
  57 + "ciphertext",
57 58 "classname",
58 59 "clearsign",
59 60 "cleartext",
... ... @@ -149,6 +150,7 @@
149 150 "hosoda",
150 151 "htcondor",
151 152 "htdocs",
  153 + "idempotency",
152 154 "ifdefs",
153 155 "ifeq",
154 156 "ifstream",
... ...
generate_auto_job
... ... @@ -19,10 +19,16 @@ def warn(*args, **kwargs):
19 19  
20 20  
21 21 class Main:
22   - SOURCES = [whoami, 'job.yml', 'manual/cli.rst']
  22 + SOURCES = [
  23 + whoami,
  24 + 'manual/_ext/qpdf.py',
  25 + 'job.yml',
  26 + 'manual/cli.rst',
  27 + ]
23 28 DESTS = {
24 29 'decl': 'libqpdf/qpdf/auto_job_decl.hh',
25 30 'init': 'libqpdf/qpdf/auto_job_init.hh',
  31 + 'help': 'libqpdf/qpdf/auto_job_help.hh',
26 32 }
27 33 SUMS = 'job.sums'
28 34  
... ... @@ -100,14 +106,22 @@ class Main:
100 106 short_text = None
101 107 long_text = None
102 108  
103   - print('this->ap.addHelpFooter("For detailed help, visit'
104   - ' the qpdf manual: https://qpdf.readthedocs.io\\n");', file=f)
  109 + # Generate a bunch of short static functions rather than a big
  110 + # member function for help. Some compilers have problems with
  111 + # very large member functions in classes in anonymous
  112 + # namespaces.
  113 +
  114 + help_files = 0
  115 + help_lines = 0
  116 +
  117 + self.all_topics = set(self.options_without_help)
  118 + self.referenced_topics = set()
105 119  
106 120 def set_indent(x):
107 121 nonlocal indent
108 122 indent = ' ' * len(x)
109 123  
110   - def append_long_text(line):
  124 + def append_long_text(line, topic):
111 125 nonlocal indent, long_text
112 126 if line == '\n':
113 127 long_text += '\n'
... ... @@ -115,13 +129,23 @@ class Main:
115 129 long_text += line[len(indent):]
116 130 else:
117 131 long_text = long_text.strip()
118   - if long_text != '':
119   - long_text += '\n'
  132 + if long_text == '':
  133 + raise Exception(f'missing long text for {topic}')
  134 + long_text += '\n'
  135 + for i in re.finditer(r'--help=([^\.\s]+)', long_text):
  136 + self.referenced_topics.add(i.group(1))
120 137 return True
121 138 return False
122 139  
123 140 lineno = 0
124 141 for line in df.readlines():
  142 + if help_lines == 0:
  143 + if help_files > 0:
  144 + print('}', file=f)
  145 + help_files += 1
  146 + help_lines += 1
  147 + print(f'static void add_help_{help_files}(QPDFArgParser& ap)\n'
  148 + '{', file=f)
125 149 lineno += 1
126 150 if state == st_top:
127 151 m = re.match(r'^(\s*\.\. )help-topic (\S+): (.*)$', line)
... ... @@ -132,8 +156,9 @@ class Main:
132 156 long_text = ''
133 157 state = st_topic
134 158 continue
135   - m = re.match(r'^(\s*\.\. )qpdf:option:: (([^=\s]+)(=(\S+))?)$',
136   - line)
  159 + m = re.match(
  160 + r'^(\s*\.\. )qpdf:option:: (([^=\s]+)([= ](.+))?)$',
  161 + line)
137 162 if m:
138 163 if topic is None:
139 164 raise Exception('option seen before topic')
... ... @@ -150,9 +175,11 @@ class Main:
150 175 state = st_option
151 176 continue
152 177 elif state == st_topic:
153   - if append_long_text(line):
154   - print(f'this->ap.addHelpTopic("{topic}", "{short_text}",'
  178 + if append_long_text(line, topic):
  179 + self.all_topics.add(topic)
  180 + print(f'ap.addHelpTopic("{topic}", "{short_text}",'
155 181 f' R"({long_text})");', file=f)
  182 + help_lines += 1
156 183 state = st_top
157 184 elif state == st_option:
158 185 if line == '\n' or line.startswith(indent):
... ... @@ -162,12 +189,36 @@ class Main:
162 189 short_text = m.group(2)
163 190 state = st_option_help
164 191 else:
  192 + raise Exception('option without help text')
165 193 state = st_top
166 194 elif state == st_option_help:
167   - if append_long_text(line):
168   - print(f'this->ap.addOptionHelp("{option}", "{topic}",'
  195 + if append_long_text(line, option):
  196 + if option in self.options_without_help:
  197 + self.options_without_help.remove(option)
  198 + else:
  199 + raise Exception(
  200 + f'help for unknown option {option},'
  201 + f' lineno={lineno}')
  202 + print(f'ap.addOptionHelp("{option}", "{topic}",'
169 203 f' "{short_text}", R"({long_text})");', file=f)
  204 + help_lines += 1
170 205 state = st_top
  206 + if help_lines == 20:
  207 + help_lines = 0
  208 + print('}', file=f)
  209 + print('static void add_help(QPDFArgParser& ap)\n{', file=f)
  210 + for i in range(help_files):
  211 + print(f' add_help_{i+1}(ap);', file=f)
  212 + print('ap.addHelpFooter("For detailed help, visit'
  213 + ' the qpdf manual: https://qpdf.readthedocs.io\\n");', file=f)
  214 + print('}\n', file=f)
  215 + for i in self.referenced_topics:
  216 + if i not in self.all_topics:
  217 + raise Exception(f'help text referenced --help={i}')
  218 + for i in self.options_without_help:
  219 + raise Exception(
  220 + 'Options without help: ' +
  221 + ', '.join(self.options_without_help))
171 222  
172 223 def generate(self):
173 224 warn(f'{whoami}: regenerating auto job files')
... ... @@ -175,12 +226,19 @@ class Main:
175 226 with open('job.yml', 'r') as f:
176 227 data = yaml.safe_load(f.read())
177 228 self.validate(data)
  229 + self.options_without_help = set(
  230 + ['--completion-bash', '--completion-zsh', '--help']
  231 + )
178 232 with open(self.DESTS['decl'], 'w') as f:
179 233 print(BANNER, file=f)
180 234 self.generate_decl(data, f)
181 235 with open(self.DESTS['init'], 'w') as f:
182 236 print(BANNER, file=f)
183 237 self.generate_init(data, f)
  238 + with open(self.DESTS['help'], 'w') as f:
  239 + with open('manual/cli.rst', 'r') as df:
  240 + print(BANNER, file=f)
  241 + self.generate_doc(df, f)
184 242  
185 243 # Update hashes last to ensure that this will be rerun in the
186 244 # event of a failure.
... ... @@ -275,24 +333,29 @@ class Main:
275 333 print('this->ap.addPositional('
276 334 f'p(&ArgParser::{prefix}Positional));', file=f)
277 335 for i in o.get('bare', []):
  336 + self.options_without_help.add(f'--{i}')
278 337 identifier = self.to_identifier(i, prefix, False)
279 338 print(f'this->ap.addBare("{i}", '
280 339 f'b(&ArgParser::{identifier}));', file=f)
281 340 for i in o.get('optional_parameter', []):
  341 + self.options_without_help.add(f'--{i}')
282 342 identifier = self.to_identifier(i, prefix, False)
283 343 print(f'this->ap.addOptionalParameter("{i}", '
284 344 f'p(&ArgParser::{identifier}));', file=f)
285 345 for k, v in o.get('required_parameter', {}).items():
  346 + self.options_without_help.add(f'--{k}')
286 347 identifier = self.to_identifier(k, prefix, False)
287 348 print(f'this->ap.addRequiredParameter("{k}", '
288 349 f'p(&ArgParser::{identifier})'
289 350 f', "{v}");', file=f)
290 351 for k, v in o.get('required_choices', {}).items():
  352 + self.options_without_help.add(f'--{k}')
291 353 identifier = self.to_identifier(k, prefix, False)
292 354 print(f'this->ap.addChoices("{k}", '
293 355 f'p(&ArgParser::{identifier})'
294 356 f', true, {v}_choices);', file=f)
295 357 for k, v in o.get('optional_choices', {}).items():
  358 + self.options_without_help.add(f'--{k}')
296 359 identifier = self.to_identifier(k, prefix, False)
297 360 print(f'this->ap.addChoices("{k}", '
298 361 f'p(&ArgParser::{identifier})'
... ... @@ -312,8 +375,6 @@ class Main:
312 375 for j in ft['options']:
313 376 print('this->ap.copyFromOtherTable'
314 377 f'("{j}", "{other_table}");', file=f)
315   - with open('manual/cli.rst', 'r') as df:
316   - self.generate_doc(df, f)
317 378  
318 379  
319 380 if __name__ == '__main__':
... ...
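
The comment added to generate_auto_job says the help registration is emitted as many short static functions because some compilers have trouble with very large functions. A simplified sketch of that chunking idea is shown here. The names emit_help_functions and chunk_size are hypothetical, the input is assumed to be a list of already formatted C++ statements, and the number of calls per function in the real script may differ.

```python
def emit_help_functions(calls, chunk_size=20):
    """Wrap C++ statements in numbered static functions plus a driver.

    calls: list of C++ statement strings, e.g. 'ap.addHelpTopic(...);'
    Returns the generated C++ source as a single string.
    """
    lines = []
    n_funcs = 0
    # Emit one small static function per fixed-size group of statements.
    for start in range(0, len(calls), chunk_size):
        n_funcs += 1
        lines.append(f'static void add_help_{n_funcs}(QPDFArgParser& ap)')
        lines.append('{')
        lines.extend(calls[start:start + chunk_size])
        lines.append('}')
    # Emit a driver that calls each numbered function in order.
    lines.append('static void add_help(QPDFArgParser& ap)')
    lines.append('{')
    for i in range(n_funcs):
        lines.append(f'    add_help_{i + 1}(ap);')
    lines.append('}')
    return '\n'.join(lines)
```

The caller then only needs to invoke add_help once, which matches the single add_help(this->ap) call added to QPDFJob_argv.cc.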
job.sums
1 1 # Generated by generate_auto_job
2   -generate_auto_job 1f42fc554778d95210d11c44e858214b4854ead907d1c9ea84fe37f993ea1a23
  2 +generate_auto_job 466aa9211549cebeb3fedc6413108981aeeddd89936621095f5f5223cee9880b
3 3 job.yml 25c85cba1ae01dac9cd0f9cb7b734e7e3e531c0023ea2b892dc0d40bda1c1146
4 4 libqpdf/qpdf/auto_job_decl.hh 97395ecbe590b23ae04d6cce2080dbd0e998917ff5eeaa5c6aafa91041d3cd6a
5   -libqpdf/qpdf/auto_job_init.hh 2afffb5002ff28a3909f709709f65d77bf2289dd72d5ea3d1598a36664a49c73
6   -manual/cli.rst f0109cca3366a9da4b0a05e3cce996ece2d776321a3f689aeaa2d6af599eee88
  5 +libqpdf/qpdf/auto_job_help.hh fa7ff1d1f6289881ac3a485107d15240c4992c59cff506be425354557108d184
  6 +libqpdf/qpdf/auto_job_init.hh 465bf46769559ceb77110d1b9d3293ba9b3595850b49848c31aeabd10aadb4ad
  7 +manual/_ext/qpdf.py 855fe12de5af7a10bb24be6ecc4d5dff4c84ac58cf388a13be6bbb394346a67d
  8 +manual/cli.rst c26e877d2065ac917edffdd6a037d2191b64d7c25beb4e8df1acc174b20b3ff4
... ...
libqpdf/QPDFArgParser.cc
... ... @@ -967,21 +967,20 @@ void
967 967 QPDFArgParser::getAllHelp(std::ostringstream& msg)
968 968 {
969 969 getTopHelp(msg);
970   - auto show = [this, &msg](std::map<std::string, HelpTopic>& topics,
971   - std::string const& label) {
  970 + auto show = [this, &msg](std::map<std::string, HelpTopic>& topics) {
972 971 for (auto const& i: topics)
973 972 {
974 973 auto const& topic = i.first;
975 974 msg << std::endl
976   - << "== " << label << " " << topic
  975 + << "== " << topic
977 976 << " (" << i.second.short_text << ") =="
978 977 << std::endl
979 978 << std::endl;
980 979 getTopicHelp(topic, i.second, msg);
981 980 }
982 981 };
983   - show(this->m->help_topics, "topic");
984   - show(this->m->option_help, "option");
  982 + show(this->m->help_topics);
  983 + show(this->m->option_help);
985 984 msg << std::endl << "====" << std::endl;
986 985 }
987 986  
... ...
libqpdf/QPDFJob_argv.cc
... ... @@ -48,6 +48,8 @@ ArgParser::ArgParser(QPDFArgParser& ap, QPDFJob& o) :
48 48 initOptionTables();
49 49 }
50 50  
  51 +#include <qpdf/auto_job_help.hh>
  52 +
51 53 void
52 54 ArgParser::initOptionTables()
53 55 {
... ... @@ -55,6 +57,8 @@ ArgParser::initOptionTables()
55 57 # include <qpdf/auto_job_init.hh>
56 58 this->ap.addFinalCheck(
57 59 QPDFArgParser::bindBare(&ArgParser::doFinalChecks, this));
  60 + // add_help is defined in auto_job_help.hh
  61 + add_help(this->ap);
58 62 }
59 63  
60 64 void
... ... @@ -127,513 +131,6 @@ ArgParser::argCopyright()
127 131 << std::endl;
128 132 }
129 133  
130   -#if 0
131   -void
132   -ArgParser::argHelp()
133   -{
134   - // QXXXQ
135   - std::cout
136   - // 12345678901234567890123456789012345678901234567890123456789012345678901234567890
137   - << "Usage: qpdf [options] {infile | --empty} [page_selection_options] outfile\n"
138   - << "\n"
139   - << "An option summary appears below. Please see the documentation for details.\n"
140   - << "\n"
141   - << "If @filename appears anywhere in the command-line, each line of filename\n"
142   - << "will be interpreted as an argument. No interpolation is done. Line\n"
143   - << "terminators are stripped, but leading and trailing whitespace is\n"
144   - << "intentionally preserved. @- can be specified to read from standard input.\n"
145   - << "\n"
146   - << "The output file can be - to indicate writing to standard output, or it can\n"
147   - << "be --replace-input to cause qpdf to replace the input file with the output.\n"
148   - << "\n"
149   - << "Note that when contradictory options are provided, whichever options are\n"
150   - << "provided last take precedence.\n"
151   - << "\n"
152   - << "\n"
153   - << "Basic Options\n"
154   - << "-------------\n"
155   - << "\n"
156   - << "--version show version of qpdf\n"
157   - << "--copyright show qpdf's copyright and license information\n"
158   - << "--help show command-line argument help\n"
159   - << "--show-crypto show supported crypto providers; default is first\n"
160   - << "--completion-bash output a bash complete command you can eval\n"
161   - << "--completion-zsh output a zsh complete command you can eval\n"
162   - << "--password=password specify a password for accessing encrypted files\n"
163   - << "--password-file=file get the password the first line \"file\"; use \"-\"\n"
164   - << " to read the password from stdin (without prompt or\n"
165   - << " disabling echo, so use with caution)\n"
166   - << "--is-encrypted silently exit 0 if the file is encrypted or 2\n"
167   - << " if not; useful for shell scripts\n"
168   - << "--requires-password silently exit 0 if a password (other than as\n"
169   - << " supplied) is required, 2 if the file is not\n"
170   - << " encrypted, or 3 if the file is encrypted\n"
171   - << " but requires no password or the supplied password\n"
172   - << " is correct; useful for shell scripts\n"
173   - << "--verbose provide additional informational output\n"
174   - << "--progress give progress indicators while writing output\n"
175   - << "--no-warn suppress warnings\n"
176   - << "--warning-exit-0 exit with code 0 instead of 3 if there are warnings\n"
177   - << "--linearize generated a linearized (web optimized) file\n"
178   - << "--replace-input use in place of specifying an output file; qpdf will\n"
179   - << " replace the input file with the output\n"
180   - << "--copy-encryption=file copy encryption parameters from specified file\n"
181   - << "--encryption-file-password=password\n"
182   - << " password used to open the file from which encryption\n"
183   - << " parameters are being copied\n"
184   - << "--allow-weak-crypto allow creation of files using weak cryptographic\n"
185   - << " algorithms\n"
186   - << "--encrypt options -- generate an encrypted file\n"
187   - << "--decrypt remove any encryption on the file\n"
188   - << "--password-is-hex-key treat primary password option as a hex-encoded key\n"
189   - << "--suppress-password-recovery\n"
190   - << " do not attempt recovering from password string\n"
191   - << " encoding errors\n"
192   - << "--password-mode=mode control qpdf's encoding of passwords\n"
193   - << "--pages options -- select specific pages from one or more files\n"
194   - << "--collate=n causes files specified in --pages to be collated\n"
195   - << " in groups of n pages (default 1) rather than\n"
196   - << " concatenated\n"
197   - << "--flatten-rotation move page rotation from /Rotate key to content\n"
198   - << "--rotate=[+|-]angle[:page-range]\n"
199   - << " rotate each specified page 0, 90, 180, or 270\n"
200   - << " degrees; rotate all pages if no page range is given\n"
201   - << "--split-pages=[n] write each output page to a separate file\n"
202   - << "--overlay options -- overlay pages from another file\n"
203   - << "--underlay options -- underlay pages from another file\n"
204   - << "\n"
205   - << "Note that you can use the @filename or @- syntax for any argument at any\n"
206   - << "point in the command. This provides a good way to specify a password without\n"
207   - << "having to explicitly put it on the command line. @filename or @- must be a\n"
208   - << "word by itself. Syntax such as --arg=@filename doesn't work.\n"
209   - << "\n"
210   - << "If none of --copy-encryption, --encrypt or --decrypt are given, qpdf will\n"
211   - << "preserve any encryption data associated with a file.\n"
212   - << "\n"
213   - << "Note that when copying encryption parameters from another file, all\n"
214   - << "parameters will be copied, including both user and owner passwords, even\n"
215   - << "if the user password is used to open the other file. This works even if\n"
216   - << "the owner password is not known.\n"
217   - << "\n"
218   - << "The --password-is-hex-key option overrides the normal computation of\n"
219   - << "encryption keys. It only applies to the password used to open the main\n"
220   - << "file. This option is not ordinarily useful but can be helpful for forensic\n"
221   - << "or investigatory purposes. See manual for further discussion.\n"
222   - << "\n"
223   - << "The --rotate flag can be used to specify pages to rotate pages either\n"
224   - << "0, 90, 180, or 270 degrees. The page range is specified in the same\n"
225   - << "format as with the --pages option, described below. Repeat the option\n"
226   - << "to rotate multiple groups of pages. If the angle is preceded by + or -,\n"
227   - << "it is added to or subtracted from the original rotation. Otherwise, the\n"
228   - << "rotation angle is set explicitly to the given value. You almost always\n"
229   - << "want to use + or - unless you are certain about the internals of the PDF\n"
230   - << "you are working with.\n"
231   - << "\n"
232   - << "If --split-pages is specified, each page is written to a separate output\n"
233   - << "file. File names are generated as follows:\n"
234   - << "* If the string %d appears in the output file name, it is replaced with a\n"
235   - << " zero-padded page range starting from 1\n"
236   - << "* Otherwise, if the output file name ends in .pdf (case insensitive), a\n"
237   - << " zero-padded page range, preceded by a dash, is inserted before the file\n"
238   - << " extension\n"
239   - << "* Otherwise, the file name is appended with a zero-padded page range\n"
240   - << " preceded by a dash.\n"
241   - << "Page ranges are single page numbers for single-page groups or first-last\n"
242   - << "for multipage groups.\n"
243   - << "\n"
244   - << "\n"
245   - << "Encryption Options\n"
246   - << "------------------\n"
247   - << "\n"
248   - << " --encrypt user-password owner-password key-length flags --\n"
249   - << "\n"
250   - << "Note that -- terminates parsing of encryption flags.\n"
251   - << "\n"
252   - << "Either or both of the user password and the owner password may be\n"
253   - << "empty strings.\n"
254   - << "\n"
255   - << "key-length may be 40, 128, or 256\n"
256   - << "\n"
257   - << "Additional flags are dependent upon key length.\n"
258   - << "\n"
259   - << " If 40:\n"
260   - << "\n"
261   - << " --print=[yn] allow printing\n"
262   - << " --modify=[yn] allow document modification\n"
263   - << " --extract=[yn] allow text/graphic extraction\n"
264   - << " --annotate=[yn] allow comments and form fill-in and signing\n"
265   - << "\n"
266   - << " If 128:\n"
267   - << "\n"
268   - << " --accessibility=[yn] allow accessibility to visually impaired\n"
269   - << " --extract=[yn] allow other text/graphic extraction\n"
270   - << " --print=print-opt control printing access\n"
271   - << " --assemble=[yn] allow document assembly\n"
272   - << " --annotate=[yn] allow commenting/filling form fields\n"
273   - << " --form=[yn] allow filling form fields\n"
274   - << " --modify-other=[yn] allow other modifications\n"
275   - << " --modify=modify-opt control modify access (old way)\n"
276   - << " --cleartext-metadata prevents encryption of metadata\n"
277   - << " --use-aes=[yn] indicates whether to use AES encryption\n"
278   - << " --force-V4 forces use of V=4 encryption handler\n"
279   - << "\n"
280   - << " If 256, options are the same as 128 with these exceptions:\n"
281   - << " --force-V4 this option is not available with 256-bit keys\n"
282   - << " --use-aes this option is always on with 256-bit keys\n"
283   - << " --force-R5 forces use of deprecated R=5 encryption\n"
284   - << " --allow-insecure allow the owner password to be empty when the\n"
285   - << " user password is not empty\n"
286   - << "\n"
287   - << " print-opt may be:\n"
288   - << "\n"
289   - << " full allow full printing\n"
290   - << " low allow only low-resolution printing\n"
291   - << " none disallow printing\n"
292   - << "\n"
293   - << " modify-opt may be:\n"
294   - << "\n"
295   - << " all allow full document modification\n"
296   - << " annotate allow comment authoring and form operations\n"
297   - << " form allow form field fill-in and signing\n"
298   - << " assembly allow document assembly only\n"
299   - << " none allow no modifications\n"
300   - << "\n"
301   - << "The default for each permission option is to be fully permissive. Please\n"
302   - << "refer to the manual for more details on the modify options.\n"
303   - << "\n"
304   - << "Specifying cleartext-metadata forces the PDF version to at least 1.5.\n"
305   - << "Specifying use of AES forces the PDF version to at least 1.6. These\n"
306   - << "options are both off by default.\n"
307   - << "\n"
308   - << "The --force-V4 flag forces the V=4 encryption handler introduced in PDF 1.5\n"
309   - << "to be used even if not otherwise needed. This option is primarily useful\n"
310   - << "for testing qpdf and has no other practical use.\n"
311   - << "\n"
312   - << "A warning will be issued if you attempt to encrypt a file with a format that\n"
313   - << "uses a weak cryptographic algorithm such as RC4. To suppress the warning,\n"
314   - << "specify the option --allow-weak-crypto. This option is outside of encryption\n"
315   - << "options (e.g. --allow-week-crypto --encrypt u o 128 --)\n"
316   - << "\n"
317   - << "\n"
318   - << "Password Modes\n"
319   - << "--------------\n"
320   - << "\n"
321   - << "The --password-mode controls how qpdf interprets passwords supplied\n"
322   - << "on the command-line. qpdf's default behavior is correct in almost all\n"
323   - << "cases, but you can fine-tune with this option.\n"
324   - << "\n"
325   - << " bytes: use the password literally as supplied\n"
326   - << " hex-bytes: interpret the password as a hex-encoded byte string\n"
327   - << " unicode: interpret the password as a UTF-8 encoded string\n"
328   - << " auto: attempt to infer the encoding and adjust as needed\n"
329   - << "\n"
330   - << "This is a complex topic. See the manual for a complete discussion.\n"
331   - << "\n"
332   - << "\n"
333   - << "Page Selection Options\n"
334   - << "----------------------\n"
335   - << "\n"
336   - << "These options allow pages to be selected from one or more PDF files.\n"
337   - << "Whatever file is given as the primary input file is used as the\n"
338   - << "starting point, but its pages are replaced with pages as specified.\n"
339   - << "\n"
340   - << "--keep-files-open=[yn]\n"
341   - << "--keep-files-open-threshold=count\n"
342   - << "--pages file [ --password=password ] [ page-range ] ... --\n"
343   - << "\n"
344   - << "For each file that pages should be taken from, specify the file, a\n"
345   - << "password needed to open the file (if any), and a page range. The\n"
346   - << "password needs to be given only once per file. If any of the input\n"
347   - << "files are the same as the primary input file or the file used to copy\n"
348   - << "encryption parameters (if specified), you do not need to repeat the\n"
349   - << "password here. The same file can be repeated multiple times. The\n"
350   - << "filename \".\" may be used to refer to the current input file. All\n"
351   - << "non-page data (info, outlines, page numbers, etc. are taken from the\n"
352   - << "primary input file. To discard this, use --empty as the primary\n"
353   - << "input.\n"
354   - << "\n"
355   - << "By default, when more than 200 distinct files are specified, qpdf will\n"
356   - << "close each file when not being referenced. With 200 files or fewer, all\n"
357   - << "files will be kept open at the same time. This behavior can be overridden\n"
358   - << "by specifying --keep-files-open=[yn]. Closing and opening files can have\n"
359   - << "very high overhead on certain file systems, especially networked file\n"
360   - << "systems. The threshold of 200 can be modified with\n"
361   - << "--keep-files-open-threshold\n"
362   - << "\n"
363   - << "The page range is a set of numbers separated by commas, ranges of\n"
364   - << "numbers separated dashes, or combinations of those. The character\n"
365   - << "\"z\" represents the last page. A number preceded by an \"r\" indicates\n"
366   - << "to count from the end, so \"r3-r1\" would be the last three pages of the\n"
367   - << "document. Pages can appear in any order. Ranges can appear with a\n"
368   - << "high number followed by a low number, which causes the pages to appear in\n"
369   - << "reverse. Numbers may be repeated. A page range may be appended with :odd\n"
370   - << "to indicate odd pages in the selected range or :even to indicate even\n"
371   - << "pages.\n"
372   - << "\n"
373   - << "If the page range is omitted, the range of 1-z is assumed. qpdf decides\n"
374   - << "that the page range is omitted if the range argument is either -- or a\n"
375   - << "valid file name and not a valid range.\n"
376   - << "\n"
377   - << "The usual behavior of --pages is to add all pages from the first file,\n"
378   - << "then all pages from the second file, and so on. If the --collate option\n"
379   - << "is specified, then pages are collated instead. In other words, qpdf takes\n"
380   - << "the first page from the first file, the first page from the second file,\n"
381   - << "and so on until it runs out of files; then it takes the second page from\n"
382   - << "each file, etc. When a file runs out of pages, it is skipped until all\n"
383   - << "specified pages are taken from all files.\n"
384   - << "\n"
385   - << "See the manual for examples and a discussion of additional subtleties.\n"
386   - << "\n"
387   - << "\n"
388   - << "Overlay and Underlay Options\n"
389   - << "----------------------------\n"
390   - << "\n"
391   - << "These options allow pages from another file to be overlaid or underlaid\n"
392   - << "on the primary output. Overlaid pages are drawn on top of the destination\n"
393   - << "page and may obscure the page. Underlaid pages are drawn below the\n"
394   - << "destination page.\n"
395   - << "\n"
396   - << "{--overlay | --underlay } file\n"
397   - " [ --password=password ]\n"
398   - " [ --to=page-range ]\n"
399   - " [ --from=[page-range] ]\n"
400   - " [ --repeat=page-range ]\n"
401   - " --\n"
402   - << "\n"
403   - << "For overlay and underlay, a file and optional password are specified, along\n"
404   - << "with a series of optional page ranges. The default behavior is that each\n"
405   - << "page of the overlay or underlay file is imposed on the corresponding page\n"
406   - << "of the primary output until it runs out of pages, and any extra pages are\n"
407   - << "ignored. The page range options all take page ranges in the same form as\n"
408   - << "the --pages option. They have the following meanings:\n"
409   - << "\n"
410   - << " --to: the pages in the primary output to which overlay/underlay is\n"
411   - << " applied\n"
412   - << " --from: the pages from the overlay/underlay file that are used\n"
413   - << " --repeat: pages from the overlay/underlay that are repeated after\n"
414   - << " any \"from\" pages have been exhausted\n"
415   - << "\n"
416   - << "\n"
417   - << "Embedded Files/Attachments Options\n"
418   - << "----------------------------------\n"
419   - << "\n"
420   - << "These options can be used to work with embedded files, also known as\n"
421   - << "attachments.\n"
422   - << "\n"
423   - << "--list-attachments show key and stream number for embedded files;\n"
424   - << " combine with --verbose for more detailed information\n"
425   - << "--show-attachment=key write the contents of the specified attachment to\n"
426   - << " standard output as binary data\n"
427   - << "--add-attachment file options --\n"
428   - << " add or replace an attachment\n"
429   - << "--remove-attachment=key remove the specified attachment; repeatable\n"
430   - << "--copy-attachments-from file options --\n"
431   - << " copy attachments from another file\n"
432   - << "\n"
433   - << "The \"key\" option is the unique name under which the attachment is registered\n"
434   - << "within the PDF file. You can get this using the --list-attachments option. This\n"
435   - << "is usually the same as the filename, but it doesn't have to be.\n"
436   - << "\n"
437   - << "Options for adding attachments:\n"
438   - << "\n"
439   - << " file path to the file to attach\n"
440   - << " --key=key the name of this in the embedded files table;\n"
441   - << " defaults to the last path element of file\n"
442   - << " --filename=name the file name of the attachment; this is what is\n"
443   - << " usually displayed to the user; defaults to the\n"
444   - << " last path element of file\n"
445   - << " --creationdate=date creation date in PDF format; defaults to the\n"
446   - << " current time\n"
447   - << " --moddate=date modification date in PDF format; defaults to the\n"
448   - << " current time\n"
449   - << " --mimetype=type/subtype mime type of attachment (e.g. application/pdf)\n"
450   - << " --description=\"text\" attachment description\n"
451   - << " --replace replace any existing attachment with the same key\n"
452   - << "\n"
453   - << "Options for copying attachments:\n"
454   - << "\n"
455   - << " file file whose attachments should be copied\n"
456   - << " --password=password password to open the other file, if needed\n"
457   - << " --prefix=prefix a prefix to insert in front of each key;\n"
458   - << " required if needed to ensure each attachment\n"
459   - << " has a unique key\n"
460   - << "\n"
461   - << "Date format: D:yyyymmddhhmmss<z> where <z> is either Z for UTC or a timezone\n"
462   - << "offset in the form -hh'mm' or +hh'mm'.\n"
463   - << "Examples: D:20210207161528-05'00', D:20210207211528Z\n"
464   - << "\n"
465   - << "\n"
466   - << "Advanced Parsing Options\n"
467   - << "------------------------\n"
468   - << "\n"
469   - << "These options control aspects of how qpdf reads PDF files. Mostly these are\n"
470   - << "of use to people who are working with damaged files. There is little reason\n"
471   - << "to use these options unless you are trying to solve specific problems.\n"
472   - << "\n"
473   - << "--suppress-recovery prevents qpdf from attempting to recover damaged files\n"
474   - << "--ignore-xref-streams tells qpdf to ignore any cross-reference streams\n"
475   - << "\n"
476   - << "\n"
477   - << "Advanced Transformation Options\n"
478   - << "-------------------------------\n"
479   - << "\n"
480   - << "These transformation options control fine points of how qpdf creates\n"
481   - << "the output file. Mostly these are of use only to people who are very\n"
482   - << "familiar with the PDF file format or who are PDF developers.\n"
483   - << "\n"
484   - << "--stream-data=option controls transformation of stream data (below)\n"
485   - << "--compress-streams=[yn] controls whether to compress streams on output\n"
486   - << "--decode-level=option controls how to filter streams from the input\n"
487   - << "--recompress-flate recompress streams already compressed with Flate\n"
488   - << "--compression-level=n set zlib compression level; most effective with\n"
489   - << " --recompress-flate --object-streams=generate\n"
490   - << "--normalize-content=[yn] enables or disables normalization of content streams\n"
491   - << "--object-streams=mode controls handling of object streams\n"
492   - << "--preserve-unreferenced preserve unreferenced objects\n"
493   - << "--remove-unreferenced-resources={auto,yes,no}\n"
494   - << " whether to remove unreferenced page resources\n"
495   - << "--preserve-unreferenced-resources\n"
496   - << " synonym for --remove-unreferenced-resources=no\n"
497   - << "--newline-before-endstream always put a newline before endstream\n"
498   - << "--coalesce-contents force all pages' content to be a single stream\n"
499   - << "--flatten-annotations=option\n"
500   - << " incorporate rendering of annotations into page\n"
501   - << " contents including those for interactive form\n"
502   - << " fields; may also want --generate-appearances\n"
503   - << "--generate-appearances generate appearance streams for form fields\n"
504   - << "--optimize-images compress images with DCT (JPEG) when advantageous\n"
505   - << "--oi-min-width=w do not optimize images whose width is below w;\n"
506   - << " default is 128. Use 0 to mean no minimum\n"
507   - << "--oi-min-height=h do not optimize images whose height is below h\n"
508   - << " default is 128. Use 0 to mean no minimum\n"
509   - << "--oi-min-area=a do not optimize images whose pixel count is below a\n"
510   - << " default is 16,384. Use 0 to mean no minimum\n"
511   - << "--externalize-inline-images convert inline images to regular images; by\n"
512   - << " default, images of at least 1,024 bytes are\n"
513   - << " externalized\n"
514   - << "--ii-min-bytes=bytes specify minimum size of inline images to be\n"
515   - << " converted to regular images\n"
516   - << "--keep-inline-images exclude inline images from image optimization\n"
517   - << "--remove-page-labels remove any page labels present in the output file\n"
518   - << "--qdf turns on \"QDF mode\" (below)\n"
519   - << "--linearize-pass1=file write intermediate pass of linearized file\n"
520   - << " for debugging\n"
521   - << "--min-version=version sets the minimum PDF version of the output file\n"
522   - << "--force-version=version forces this to be the PDF version of the output file\n"
523   - << "\n"
524   - << "Options for --flatten-annotations are all, print, or screen. If the option\n"
525   - << "is print, only annotations marked as print are included. If the option is\n"
526   - << "screen, annotations marked as \"no view\" are excluded. Otherwise, annotations\n"
527   - << "are flattened regardless of the presence of print or NoView flags. It is\n"
528   - << "common for PDF files to have a flag set that appearance streams need to be\n"
529   - << "regenerated. This happens when someone changes a form value with software\n"
530   - << "that does not know how to render the new value. qpdf will not flatten form\n"
531   - << "fields in files like this. If you get this warning, you have two choices:\n"
532   - << "either use qpdf's --generate-appearances flag to tell qpdf to go ahead and\n"
533   - << "regenerate appearances, or use some other tool to generate the appearances.\n"
534   - << "qpdf does a pretty good job with most forms when only ASCII and \"Windows\n"
535   - << "ANSI\" characters are used in form field values, but if your form fields\n"
536   - << "contain other characters, rich text, or are other than left justified, you\n"
537   - << "will get better results first saving with other software.\n"
538   - << "\n"
539   - << "Version numbers may be expressed as major.minor.extension-level, so 1.7.3\n"
540   - << "means PDF version 1.7 at extension level 3.\n"
541   - << "\n"
542   - << "Values for stream data options:\n"
543   - << "\n"
544   - << " compress recompress stream data when possible (default)\n"
545   - << " preserve leave all stream data as is\n"
546   - << " uncompress uncompress stream data when possible\n"
547   - << "\n"
548   - << "Values for object stream mode:\n"
549   - << "\n"
550   - << " preserve preserve original object streams (default)\n"
551   - << " disable don't write any object streams\n"
552   - << " generate use object streams wherever possible\n"
553   - << "\n"
554   - << "When --compress-streams=n is specified, this overrides the default behavior\n"
555   - << "of qpdf, which is to attempt to compress uncompressed streams. Setting\n"
556   - << "stream data mode to uncompress or preserve has the same effect.\n"
557   - << "\n"
558   - << "The --decode-level parameter may be set to one of the following values:\n"
559   - << " none do not decode streams\n"
560   - << " generalized decode streams compressed with generalized filters\n"
561   - << " including LZW, Flate, and the ASCII encoding filters.\n"
562   - << " specialized additionally decode streams with non-lossy specialized\n"
563   - << " filters including RunLength\n"
564   - << " all additionally decode streams with lossy filters\n"
565   - << " including DCT (JPEG)\n"
566   - << "\n"
567   - << "In qdf mode, by default, content normalization is turned on, and the\n"
568   - << "stream data mode is set to uncompress. QDF mode does not support\n"
569   - << "linearized files. The --linearize flag disables qdf mode.\n"
570   - << "\n"
571   - << "Setting the minimum PDF version of the output file may raise the version\n"
572   - << "but will never lower it. Forcing the PDF version of the output file may\n"
573   - << "set the PDF version to a lower value than actually allowed by the file's\n"
574   - << "contents. You should only do this if you have no other possible way to\n"
575   - << "open the file or if you know that the file definitely doesn't include\n"
576   - << "features not supported later versions.\n"
577   - << "\n"
578   - << "Testing, Inspection, and Debugging Options\n"
579   - << "------------------------------------------\n"
580   - << "\n"
581   - << "These options can be useful for digging into PDF files or for use in\n"
582   - << "automated test suites for software that uses the qpdf library.\n"
583   - << "\n"
584   - << "--deterministic-id generate deterministic /ID\n"
585   - << "--static-id generate static /ID: FOR TESTING ONLY!\n"
586   - << "--static-aes-iv use a static initialization vector for AES-CBC\n"
587   - << " This option is not secure! FOR TESTING ONLY!\n"
588   - << "--no-original-object-ids suppress original object ID comments in qdf mode\n"
589   - << "--show-encryption quickly show encryption parameters\n"
590   - << "--show-encryption-key when showing encryption, reveal the actual key\n"
591   - << "--check-linearization check file integrity and linearization status\n"
592   - << "--show-linearization check and show all linearization data\n"
593   - << "--show-xref show the contents of the cross-reference table\n"
594   - << "--show-object=trailer|obj[,gen]\n"
595   - << " show the contents of the given object\n"
596   - << " --raw-stream-data show raw stream data instead of object contents\n"
597   - << " --filtered-stream-data show filtered stream data instead of object contents\n"
598   - << "--show-npages print the number of pages in the file\n"
599   - << "--show-pages shows the object/generation number for each page\n"
600   - << " --with-images also shows the object IDs for images on each page\n"
601   - << "--check check file structure + encryption, linearization\n"
602   - << "--json generate a json representation of the file\n"
603   - << "--json-help describe the format of the json representation\n"
604   - << "--json-key=key repeatable; prune json structure to include only\n"
605   - << " specified keys. If absent, all keys are shown\n"
606   - << "--json-object=trailer|[obj,gen]\n"
607   - << " repeatable; include only specified objects in the\n"
608   - << " \"objects\" section of the json. If absent, all\n"
609   - << " objects are shown\n"
610   - << "\n"
611   - << "The json representation generated by qpdf is designed to facilitate\n"
612   - << "processing of qpdf from other programming languages that have a hard\n"
613   - << "time calling C++ APIs. Run qpdf --json-help for details on the format.\n"
614   - << "The manual has more in-depth information about the json representation\n"
615   - << "and certain compatibility guarantees that qpdf provides.\n"
616   - << "\n"
617   - << "The --raw-stream-data and --filtered-stream-data options are ignored\n"
618   - << "unless --show-object is given. Either of these options will cause the\n"
619   - << "stream data to be written to standard output.\n"
620   - << "\n"
621   - << "If --filtered-stream-data is given and --normalize-content=y is also\n"
622   - << "given, qpdf will attempt to normalize the stream data as if it is a\n"
623   - << "page content stream. This attempt will be made even if it is not a\n"
624   - << "page content stream, in which case it will produce unusable results.\n"
625   - << "\n"
626   - << "Ordinarily, qpdf exits with a status of 0 on success or a status of 2\n"
627   - << "if any errors occurred. If there were warnings but not errors, qpdf\n"
628   - << "exits with a status of 3. If warnings would have been issued but --no-warn\n"
629   - << "was given, an exit status of 3 is still used. If you want qpdf to exit\n"
630   - << "with status 0 when there are warnings, use the --warning-exit-0 flag.\n"
631   - << "When --no-warn and --warning-exit-0 are used together, the effect is for\n"
632   - << "qpdf to completely ignore warnings. qpdf does not use exit status 1,\n"
633   - << "since that is used by the shell if it can't execute qpdf.\n";
634   -}
635   -#endif
636   -
637 134 void
638 135 ArgParser::argJsonHelp()
639 136 {
... ...
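The exit-status convention described in the help text above (0 for success, 2 for errors, 3 for warnings, 1 reserved for the shell) can be sketched as a small classifier for programs that invoke qpdf. This is only an illustration: `QpdfOutcome`, `classify_exit_status`, and `is_acceptable` are hypothetical names that are not part of qpdf, and the status passed in is assumed to be the already-decoded process exit code.

```cpp
// Hypothetical helper, not part of qpdf: interpret qpdf's documented exit codes.
enum class QpdfOutcome { success, not_invoked, errors, warnings, unknown };

// Map an already-decoded exit code to its documented meaning.
QpdfOutcome classify_exit_status(int status)
{
    switch (status) {
    case 0:
        return QpdfOutcome::success;     // no errors or warnings
    case 1:
        return QpdfOutcome::not_invoked; // not used by qpdf; the shell could not run it
    case 2:
        return QpdfOutcome::errors;      // errors detected
    case 3:
        return QpdfOutcome::warnings;    // warnings detected (without --warning-exit-0)
    default:
        return QpdfOutcome::unknown;     // outside the documented set
    }
}

// Decide whether a caller should accept the result.
// tolerate_warnings lets the caller treat status 3 like success.
bool is_acceptable(int status, bool tolerate_warnings)
{
    QpdfOutcome outcome = classify_exit_status(status);
    return outcome == QpdfOutcome::success ||
        (tolerate_warnings && outcome == QpdfOutcome::warnings);
}
```

A caller that passes `--warning-exit-0` to qpdf would see 0 instead of 3, so `tolerate_warnings` is only relevant when that flag is not used.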
libqpdf/qpdf/auto_job_help.hh 0 → 100644
  1 +//
  2 +// This file is automatically generated by generate_auto_job.
  3 +// Edits will be automatically overwritten if the build is
  4 +// run in maintainer mode.
  5 +//
  6 +static void add_help_1(QPDFArgParser& ap)
  7 +{
  8 +ap.addHelpTopic("usage", "basic invocation", R"(Read a PDF file, apply transformations or modifications, and write
  9 +a new PDF file.
  10 +
  11 +Usage: qpdf infile [options] [outfile]
  12 + OR qpdf help-option
  13 +
  14 +- infile, options, and outfile may be in any order as long as infile
  15 + precedes outfile.
  16 +- Use --empty in place of an input file for a zero-page, empty input
  17 +- Use --replace-input in place of an output file to overwrite the
  18 + input file with the output
  19 +- outfile may be - to write to stdout; reading from stdin is not supported
  20 +- @filename is an argument file; each line is treated as a separate
  21 + command-line argument
  22 +- @- may be used to read arguments from stdin
  23 +- Later options may override earlier options if contradictory
  24 +)");
  25 +ap.addOptionHelp("--empty", "usage", "empty input file", R"(Use in place of infile for an empty input. Especially useful
  26 +with --pages.
  27 +)");
  28 +ap.addOptionHelp("--replace-input", "usage", "replace input with output", R"(Use in place of outfile to overwrite the input file with the output.
  29 +)");
  30 +ap.addHelpTopic("exit-status", "meanings of qpdf's exit codes", R"(Meaning of exit codes:
  31 +
  32 +0: no errors or warnings
  33 +1: not used by qpdf but may be used by the shell if unable to invoke qpdf
  34 +2: errors detected
  35 +3: warnings detected, unless --warning-exit-0 is given
  36 +)");
  37 +ap.addOptionHelp("--warning-exit-0", "exit-status", "exit 0 even with warnings", R"(Use exit status 0 instead of 3 when warnings are present. When
  38 +combined with --no-warn, warnings are completely ignored.
  39 +)");
  40 +ap.addHelpTopic("completion", "shell completion", R"(Shell completion is supported with bash and zsh. Use
  41 +eval $(qpdf --completion-bash) or eval $(qpdf --completion-zsh)
  42 +to enable. The QPDF_EXECUTABLE environment variable overrides the
  43 +path to qpdf that these commands output.
  44 +)");
  45 +ap.addOptionHelp("--completion-bash", "completion", "enable bash completion", R"(Output a command that enables bash completion.
  46 +)");
  47 +ap.addOptionHelp("--completion-zsh", "completion", "enable zsh completion", R"(Output a command that enables zsh completion.
  48 +)");
  49 +ap.addHelpTopic("help", "information about qpdf", R"(Help options provide some information about qpdf itself. Help
  50 +options are only valid as the first and only command-line argument.
  51 +)");
  52 +ap.addOptionHelp("--help", "help", "provide help", R"(Display help information. Run qpdf --help for information about
  53 +how to get help on various topics.
  54 +)");
  55 +ap.addOptionHelp("--version", "help", "show qpdf version", R"(Display the version of qpdf.
  56 +)");
  57 +ap.addOptionHelp("--copyright", "help", "show copyright information", R"(Display copyright and license information.
  58 +)");
  59 +ap.addOptionHelp("--show-crypto", "help", "show available crypto providers", R"(Show a list of available crypto providers, one per line. The
  60 +default provider is shown first.
  61 +)");
  62 +ap.addHelpTopic("general", "general options", R"(General options control qpdf's behavior in ways that are not
  63 +directly related to the operation it is performing.
  64 +)");
  65 +ap.addOptionHelp("--password", "general", "specify password", R"(--password=password
  66 +
  67 +Specify a password for an encrypted, password-protected file.
  68 +Not needed for encrypted files with no password.
  69 +)");
  70 +ap.addOptionHelp("--password-file", "general", "read password from a file", R"(--password-file=filename
  71 +
  72 +The first line of the specified file is used as the password.
  73 +This is used in place of the --password option.
  74 +)");
  75 +ap.addOptionHelp("--verbose", "general", "print additional information", R"(Output additional information about various things qpdf is
  76 +doing, including information about files created and operations
  77 +performed.
  78 +)");
  79 +ap.addOptionHelp("--progress", "general", "show progress when writing", R"(Indicate progress when writing files.
  80 +)");
  81 +ap.addOptionHelp("--no-warn", "general", "suppress printing warning messages", R"(Suppress printing warning messages. If warnings were
  82 +encountered, qpdf still exits with exit status 3.
  83 +Use --warning-exit-0 with --no-warn to completely ignore
  84 +warnings.
  85 +)");
  86 +}
  87 +static void add_help_2(QPDFArgParser& ap)
  88 +{
  89 +ap.addOptionHelp("--deterministic-id", "general", "generate ID deterministically", R"(Generate a secure, random document ID only using static
  90 +information, such as the page contents. Does not use the file's
  91 +name or attributes or the current time.
  92 +)");
  93 +ap.addOptionHelp("--allow-weak-crypto", "general", "allow insecure cryptographic algorithms", R"(Allow creation of files with weak cryptographic algorithms. This
  94 +option is necessary to create 40-bit files or 128-bit files that
  95 +use RC4 encryption.
  96 +)");
  97 +ap.addOptionHelp("--keep-files-open", "general", "manage keeping multiple files open", R"(--keep-files-open=[yn]
  98 +
  99 +When qpdf needs to work with many files, as when merging large
  100 +numbers of files, explicitly indicate whether files should be
  101 +kept open. The default behavior is to determine this based on
  102 +the number of files.
  103 +)");
  104 +ap.addOptionHelp("--keep-files-open-threshold", "general", "set threshold for --keep-files-open", R"(--keep-files-open-threshold=count
  105 +
  106 +Set the threshold used by --keep-files-open, overriding the
  107 +default value of 200.
  108 +)");
  109 +ap.addHelpTopic("advanced-control", "tweak qpdf's behavior", R"(Advanced control options control qpdf's behavior in ways that would
  110 +normally never be needed by a user but that may be useful to
  111 +developers or people investigating problems with specific files.
  112 +)");
  113 +ap.addOptionHelp("--password-is-hex-key", "advanced-control", "provide hex-encoded encryption key", R"(Provide the underlying file encryption key as a hex-encoded
  114 +string rather than supplying a password. This is an expert
  115 +option.
  116 +)");
  117 +ap.addOptionHelp("--suppress-password-recovery", "advanced-control", "don't try different password encodings", R"(Suppress qpdf's behavior of attempting different encodings of a
  118 +password that contains non-ASCII Unicode characters if the first
  119 +attempt doesn't succeed.
  120 +)");
  121 +ap.addOptionHelp("--password-mode", "advanced-control", "tweak how qpdf encodes passwords", R"(--password-mode={mode}
  122 +
  123 +Fine-tune how qpdf controls encoding of Unicode passwords. Valid
  124 +options are auto, bytes, hex-bytes, and unicode.
  125 +)");
  126 +ap.addOptionHelp("--suppress-recovery", "advanced-control", "suppress error recovery", R"(Avoid attempting to recover when errors are found in a file's
  127 +cross reference table or stream lengths.
  128 +)");
  129 +ap.addOptionHelp("--ignore-xref-streams", "advanced-control", "use xref tables rather than streams", R"(Ignore any cross-reference streams in the file, falling back to
  130 +cross-reference tables or triggering document recovery.
  131 +)");
  132 +ap.addHelpTopic("transformation", "make structural PDF changes", R"(The options below tell qpdf to apply transformations that change
  133 +the structure without changing the content.
  134 +)");
  135 +ap.addOptionHelp("--linearize", "transformation", "linearize (web-optimize) output", R"(Create linearized (web-optimized) output files.
  136 +)");
  137 +ap.addOptionHelp("--encrypt", "transformation", "start encryption options", R"(--encrypt user owner key-length [ options ] --
  138 +
  139 +Run qpdf --help=encryption for details.
  140 +)");
  141 +ap.addOptionHelp("--decrypt", "transformation", "remove encryption from input file", R"(Create an unencrypted output file even if the input file was
  142 +encrypted. Normally qpdf preserves whatever encryption was
  143 +present on the input file. This option overrides that behavior.
  144 +)");
  145 +ap.addOptionHelp("--copy-encryption", "transformation", "copy another file's encryption details", R"(--copy-encryption=file
  146 +
  147 +Copy encryption details from the specified file instead of
  148 +preserving the input file's encryption. Use --encryption-file-password
  149 +to specify the encryption file's password.
  150 +)");
  151 +ap.addOptionHelp("--encryption-file-password", "transformation", "supply password for --copy-encryption", R"(--encryption-file-password=password
  152 +
  153 +If the file named in --copy-encryption requires a password, use
  154 +this option to specify the password.
  155 +)");
  156 +ap.addOptionHelp("--qdf", "transformation", "enable viewing PDF code in a text editor", R"(Create a PDF file suitable for viewing in a text editor and even
  157 +editing. This is to edit the PDF code, not the page contents.
  158 +All streams that can be uncompressed are uncompressed, and
  159 +content streams are normalized, among other changes. The
  160 +companion tool "fix-qdf" can be used to repair hand-edited QDF
  161 +files. QDF is a feature specific to the qpdf tool. There is a
  162 +chapter about it in the manual.
  163 +)");
  164 +ap.addOptionHelp("--no-original-object-ids", "transformation", "omit original object ID in qdf", R"(Omit comments in a QDF file indicating the object ID an object
  165 +had in the original file.
  166 +)");
  167 +ap.addOptionHelp("--compress-streams", "transformation", "compress uncompressed streams", R"(--compress-streams=[yn]
  168 +
  169 +Setting --compress-streams=n prevents qpdf from compressing
  170 +uncompressed streams. This can be useful if you are leaving some
  171 +streams uncompressed intentionally.
  172 +)");
  173 +}
  174 +static void add_help_3(QPDFArgParser& ap)
  175 +{
  176 +ap.addOptionHelp("--decode-level", "transformation", "control which streams to uncompress", R"(--decode-level=option
  177 +
  178 +When uncompressing streams, control which types of compression
  179 +schemes should be uncompressed:
  180 +- none: don't uncompress anything
  181 +- generalized: uncompress streams compressed with a
  182 + general-purpose compression algorithm. This is the default.
  183 +- specialized: in addition to generalized, also uncompress
  184 + streams compressed with a special-purpose but non-lossy
  185 + compression scheme
  186 +- all: in addition to specialized, uncompress streams compressed
  187 + with lossy compression schemes like JPEG (DCT)
  188 +qpdf does not know how to uncompress all compression schemes.
  189 +)");
  190 +ap.addOptionHelp("--stream-data", "transformation", "control stream compression", R"(--stream-data=option
  191 +
  192 +This option controls how streams are compressed in the output.
  193 +It is less granular than the newer options, --compress-streams
  194 +and --decode-level.
  195 +
  196 +Options:
  197 +- compress: same as --compress-streams=y --decode-level=generalized
  198 +- preserve: same as --compress-streams=n --decode-level=none
  199 +- uncompress: same as --compress-streams=n --decode-level=generalized
  200 +)");
  201 +ap.addOptionHelp("--recompress-flate", "transformation", "uncompress and recompress flate", R"(The default generalized compression scheme used by PDF is flate,
  202 +which is the same as used by zip and gzip. Usually qpdf just
  203 +leaves these alone. This option tells qpdf to uncompress and
  204 +recompress streams compressed with flate. This can be useful
  205 +when combined with --compression-level.
  206 +)");
  207 +ap.addOptionHelp("--compression-level", "transformation", "set compression level for flate", R"(--compression-level=level
  208 +
  209 +Set a compression level from 1 (least, fastest) to 9 (most,
  210 +slowest) when compressing files with flate (used in zip and
  211 +gzip), which is the default compression for most PDF files.
  212 +You need --recompress-flate with this option if you want to
  213 +change already compressed streams.
  214 +)");
  215 +ap.addOptionHelp("--normalize-content", "transformation", "fix newlines in content streams", R"(--normalize-content=[yn]
  216 +
  217 +Normalize newlines to UNIX-style newlines in PDF content
  218 +streams, which is useful for viewing them in a programmer's text
  219 +editor across multiple platforms. This is also turned on by
  220 +--qdf.
  221 +)");
  222 +ap.addOptionHelp("--object-streams", "transformation", "control use of object streams", R"(--object-streams=mode
  223 +
  224 +Control what qpdf does regarding object streams. Options:
  225 +- preserve: preserve original object streams, if any (the default)
  226 +- disable: create output files with no object streams
  227 +- generate: create object streams, and compress objects when possible
  228 +)");
  229 +ap.addOptionHelp("--preserve-unreferenced", "transformation", "preserve unreferenced objects", R"(Preserve all objects from the input even if not referenced.
  230 +)");
  231 +ap.addOptionHelp("--remove-unreferenced-resources", "transformation", "remove unreferenced page resources", R"(--remove-unreferenced-resources=option
  232 +
  233 +Remove from a page's resource dictionary any resources that are
  234 +not referenced in the page's contents. Options: "auto"
  235 +(default), "yes", "no".
  236 +)");
  237 +ap.addOptionHelp("--preserve-unreferenced-resources", "transformation", "use --remove-unreferenced-resources=no", R"(Synonym for --remove-unreferenced-resources=no. Use that instead.
  238 +)");
  239 +ap.addOptionHelp("--newline-before-endstream", "transformation", "force a newline before endstream", R"(Force an extra newline before endstream. Using this option enables
  240 +qpdf to preserve PDF/A when rewriting such files.
  241 +)");
  242 +ap.addOptionHelp("--coalesce-contents", "transformation", "combine content streams", R"(If a page has an array of content streams, concatenate them into
  243 +a single content stream.
  244 +)");
  245 +ap.addOptionHelp("--externalize-inline-images", "transformation", "convert inline to regular images", R"(Convert inline images to regular images.
  246 +)");
  247 +ap.addOptionHelp("--ii-min-bytes", "transformation", "set minimum size for --externalize-inline-images", R"(--ii-min-bytes=size-in-bytes
  248 +
  249 +Don't externalize inline images smaller than this size. The
  250 +default is 1,024. Use 0 for no minimum.
  251 +)");
  252 +ap.addOptionHelp("--min-version", "transformation", "set minimum PDF version", R"(--min-version=version
  253 +
  254 +Force the PDF version of the output to be at least the
  255 +specified version.
  256 +)");
  257 +ap.addOptionHelp("--force-version", "transformation", "set output PDF version", R"(--force-version=version
  258 +
  259 +Force the output PDF file's PDF version header to be the specified
  260 +value, even if the file uses features that may not be available
  261 +in that version.
  262 +)");
  263 +ap.addHelpTopic("page-ranges", "page range syntax", R"(A full description of the page range syntax, with examples, can be
  264 +found in the manual. Summary:
  265 +
  266 +- a,b,c pages a, b, and c
  267 +- a-b pages a through b inclusive; if a > b, this counts down
  268 +- r<n> where <n> represents a number is the <n>th page from the end
  269 +- z the last page, same as r1
  270 +
  271 +You can append :even or :odd to select every other page from the
  272 +resulting set of pages, where :odd starts with the first page and
  273 +:even starts with the second page. These are odd and even pages
  274 +from the resulting set, not based on the original page numbers.
  275 +)");
  276 +ap.addHelpTopic("modification", "change parts of the PDF", R"(Modification options make systematic changes to certain parts of
  277 +the PDF, causing the PDF to render differently from the original.
  278 +)");
  279 +ap.addOptionHelp("--pages", "modification", "begin page selection", R"(--pages file [ --password=password ] [ page-range ] [ ... ] --
  280 +
  281 +Run qpdf --help=page-selection for details.
  282 +)");
  283 +ap.addOptionHelp("--collate", "modification", "collate with --pages", R"(--collate=n
  284 +
  285 +Collate rather than concatenate pages specified with --pages.
  286 +With a numeric argument, collate in groups of n. The default
  287 +is 1. Run qpdf --help=page-selection for additional details.
  288 +)");
  289 +}
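The page range summary in the "page-ranges" help topic above can be illustrated with a small expander. This is a simplified sketch of the four documented forms plus the `:even` and `:odd` suffixes, not qpdf's actual parser; `expand_page_range` and its helpers are hypothetical names, and the sketch assumes the suffix applies to the whole comma-separated range and that out-of-bounds pages are rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Parse a non-empty string of decimal digits.
static int parse_positive(const std::string& s)
{
    if (s.empty()) {
        throw std::invalid_argument("expected a number");
    }
    for (char c : s) {
        if (!std::isdigit(static_cast<unsigned char>(c))) {
            throw std::invalid_argument("bad number: " + s);
        }
    }
    return std::stoi(s);
}

// Resolve one token ("5", "r2", or "z") to a 1-based page number.
static int resolve_page(const std::string& tok, int npages)
{
    if (tok.empty()) {
        throw std::invalid_argument("empty page token");
    }
    int page = 0;
    if (tok == "z") {
        page = npages; // last page, same as r1
    } else if (tok[0] == 'r') {
        page = npages + 1 - parse_positive(tok.substr(1)); // nth page from the end
    } else {
        page = parse_positive(tok);
    }
    if (page < 1 || page > npages) {
        throw std::out_of_range("page out of range: " + tok);
    }
    return page;
}

// Expand a range such as "1,3,r2-z" or "1-6:even" into page numbers.
std::vector<int> expand_page_range(std::string range, int npages)
{
    // Split off an optional :odd / :even suffix.
    std::size_t first_kept = 0;
    std::size_t stride = 1;
    std::size_t colon = range.find(':');
    if (colon != std::string::npos) {
        std::string suffix = range.substr(colon + 1);
        if (suffix == "odd") {
            first_kept = 0; // starts with the first page of the resulting set
        } else if (suffix == "even") {
            first_kept = 1; // starts with the second page of the resulting set
        } else {
            throw std::invalid_argument("unknown suffix: " + suffix);
        }
        stride = 2;
        range.erase(colon);
    }

    // Expand each comma-separated item; "a-b" counts down when a > b.
    std::vector<int> pages;
    std::size_t start = 0;
    while (true) {
        std::size_t comma = range.find(',', start);
        std::string item = range.substr(
            start, comma == std::string::npos ? std::string::npos : comma - start);
        std::size_t dash = item.find('-');
        if (dash == std::string::npos) {
            pages.push_back(resolve_page(item, npages));
        } else {
            int first = resolve_page(item.substr(0, dash), npages);
            int last = resolve_page(item.substr(dash + 1), npages);
            int step = (first <= last) ? 1 : -1;
            for (int p = first; p != last + step; p += step) {
                pages.push_back(p);
            }
        }
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }

    // Apply the suffix to positions in the resulting set, not to page numbers.
    std::vector<int> selected;
    for (std::size_t i = first_kept; i < pages.size(); i += stride) {
        selected.push_back(pages[i]);
    }
    return selected;
}
```

For a 10-page file, `expand_page_range("2-7:odd", 10)` selects pages 2, 4, and 6. The suffix picks the first, third, and fifth entries of the expanded set, which is why the selected page numbers are even.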
  290 +static void add_help_4(QPDFArgParser& ap)
  291 +{
  292 +ap.addOptionHelp("--split-pages", "modification", "write pages to separate files", R"(--split-pages=[n]
  293 +
  294 +This option causes qpdf to create separate output files for each
  295 +page or group of pages rather than a single output file.
  296 +
  297 +File names are generated from the specified output file as follows:
  298 +
  299 +- If the string %d appears in the output file name, it is replaced with a
  300 + zero-padded page range starting from 1
  301 +- Otherwise, if the output file name ends in .pdf (case insensitive), a
  302 + zero-padded page range, preceded by a dash, is inserted before the file
  303 + extension
  304 +- Otherwise, the file name is appended with a zero-padded page range
  305 + preceded by a dash.
  306 +
  307 +Page ranges are single page numbers for single-page groups or first-last
  308 +for multi-page groups.
  309 +)");
  310 +ap.addOptionHelp("--overlay", "modification", "begin overlay options", R"(--overlay file [ options ] --
  311 +
  312 +Overlay pages from another file on the output.
  313 +Run qpdf --help=overlay-underlay for details.
  314 +)");
  315 +ap.addOptionHelp("--underlay", "modification", "begin underlay options", R"(--underlay file [ options ] --
  316 +
  317 +Underlay pages from another file on the output.
  318 +Run qpdf --help=overlay-underlay for details.
  319 +)");
  320 +ap.addOptionHelp("--flatten-rotation", "modification", "remove rotation from page dictionary", R"(Rotate a page using content commands instead of page-level
  321 +metadata. This can be useful if a broken PDF viewer fails to
  322 +properly consider page rotation metadata.
  323 +)");
  324 +ap.addOptionHelp("--flatten-annotations", "modification", "push annotations into content", R"(--flatten-annotations=option
  325 +
  326 +Push page annotations into the content streams. This may be
  327 +necessary in some cases when printing or splitting files.
  328 +Options: "all", "print", "screen".
  329 +)");
  330 +ap.addOptionHelp("--rotate", "modification", "rotate pages", R"(--rotate=[+|-]angle[:page-range]
  331 +
  332 +Rotate specified pages by multiples of 90 degrees specifying
  333 +either absolute or relative angles. "angle" may be 0, 90, 180,
  334 +or 270. You almost always want to use +angle or -angle rather
  335 +than just angle, as discussed in the manual. Run
  336 +qpdf --help=page-ranges for help with page ranges.
  337 +)");
  338 +ap.addOptionHelp("--generate-appearances", "modification", "generate appearances for form fields", R"(PDF form fields consist of values and appearances, which may be
  339 +inconsistent with each other if a form field value has been
  340 +modified without updating its appearance. This option tells qpdf
  341 +to generate new appearance streams. There are some limitations,
  342 +which are discussed in the manual.
  343 +)");
  344 +ap.addOptionHelp("--optimize-images", "modification", "use efficient compression for images", R"(Attempt to use DCT (JPEG) compression for images that fall
  345 +within certain constraints as long as doing so decreases the
  346 +size in bytes of the image. See also help for the following
  347 +options:
  348 + --oi-min-width
  349 + --oi-min-height
  350 + --oi-min-area
  351 + --keep-inline-images
  352 +
  353 +The --verbose flag is useful with this option.
  354 +)");
  355 +ap.addOptionHelp("--oi-min-width", "modification", "minimum width for --optimize-images", R"(--oi-min-width=width
  356 +
  357 +Don't optimize images whose width is below the specified value.
  358 +)");
  359 +ap.addOptionHelp("--oi-min-height", "modification", "minimum height for --optimize-images", R"(--oi-min-height=height
  360 +
  361 +Don't optimize images whose height is below the specified value.
  362 +)");
  363 +ap.addOptionHelp("--oi-min-area", "modification", "minimum area for --optimize-images", R"(--oi-min-area=area-in-pixels
  364 +
  365 +Don't optimize images whose area in pixels is below the specified value.
  366 +)");
  367 +ap.addOptionHelp("--keep-inline-images", "modification", "exclude inline images from optimization", R"(Prevent inline images from being considered by --optimize-images.
  368 +)");
  369 +ap.addOptionHelp("--remove-page-labels", "modification", "remove page labels (numbers)", R"(Exclude page labels (explicit page numbers) from the output file.
  370 +)");
  371 +ap.addHelpTopic("encryption", "create encrypted files", R"(Create encrypted files. Usage:
  372 +
  373 +--encrypt user-password owner-password key-length [ options ] --
  374 +
  375 +Either or both of user-password and owner-password may be empty
  376 +strings. key-length may be 40, 128, or 256. Encryption options are
  377 +terminated by "--" by itself.
  378 +
  379 +40-bit encryption is insecure, as is 128-bit encryption without
  380 +AES. Use 256-bit encryption unless you have a specific reason to
  381 +use an insecure format, such as testing or compatibility with very
  382 +old viewers. You must use the --allow-weak-crypto flag to create
  383 +encrypted files that use insecure cryptographic algorithms. The
  384 +--allow-weak-crypto flag appears outside of --encrypt ... --
  385 +(before --encrypt or after --).
  386 +
  387 +Available options vary by key length. Not all readers respect all
  388 +restrictions. Different PDF readers respond differently to various
  389 +combinations of options. Sometimes a PDF viewer may show you
  390 +restrictions that differ from what you selected. This is probably
  391 +not a bug in qpdf.
  392 +
  393 +Options for 40-bit only:
  394 + --annotate=[yn] restrict comments, filling forms, and signing
  395 + --extract=[yn] restrict text/graphic extraction
  396 + --modify=[yn] restrict document modification
  397 + --print=[yn] restrict printing
  398 +
  399 +Options for 128-bit or 256-bit:
  400 + --accessibility=[yn] restrict accessibility (usually ignored)
  401 + --annotate=[yn] restrict commenting/filling form fields
  402 + --assemble=[yn] restrict document assembly
  403 + --extract=[yn] restrict text/graphic extraction
  404 + --form=[yn] restrict filling form fields
  405 + --modify-other=[yn] restrict other modifications
  406 + --modify=modify-opt control modify access by level
  407 + --print=print-opt control printing access
  408 + --cleartext-metadata prevent encryption of metadata
  409 +
  410 +For 128-bit only:
  411 + --use-aes=[yn] indicates whether to use AES encryption
  412 + --force-V4 forces use of V=4 encryption handler
  413 +
  414 +For 256-bit only:
  415 + --force-R5 forces use of deprecated R=5 encryption
  416 + --allow-insecure allow user password with empty owner password
  417 +
  418 +Values for print-opt:
  419 + none disallow printing
  420 + low allow only low-resolution printing
  421 + full allow full printing
  422 +
  423 +Values for modify-opt:
  424 + none allow no modifications
  425 + assembly allow document assembly only
  426 + form assembly + filling in form fields and signing
  427 + annotate form + commenting and modifying forms
  428 + all allow full document modification
  429 +)");
  430 +ap.addOptionHelp("--accessibility", "encryption", "restrict document accessibility", R"(--accessibility=[yn]
  431 +
  432 +This option is ignored except with very old encryption formats.
  433 +The current PDF specification does not allow restriction of
  434 +document accessibility. This option is not available with 40-bit
  435 +encryption.
  436 +)");
  437 +ap.addOptionHelp("--annotate", "encryption", "restrict document annotation", R"(--annotate=[yn]
  438 +
  439 +Enable/disable modifying annotations including making comments
  440 +and filling in form fields. For 128-bit and 256-bit encryption,
  441 +this also enables editing, creating, and deleting form fields
  442 +unless --modify-other=n or --modify=none is also specified.
  443 +)");
  444 +ap.addOptionHelp("--assemble", "encryption", "restrict document assembly", R"(--assemble=[yn]
  445 +
  446 +Enable/disable document assembly (rotation and reordering of
  447 +pages). This option is not available with 40-bit encryption.
  448 +)");
  449 +ap.addOptionHelp("--extract", "encryption", "restrict text/graphic extraction", R"(--extract=[yn]
  450 +
  451 +Enable/disable text/graphic extraction for purposes other than
  452 +accessibility.
  453 +)");
  454 +ap.addOptionHelp("--form", "encryption", "restrict form filling", R"(--form=[yn]
  455 +
  456 +Enable/disable whether filling form fields is allowed even if
  457 +modification of annotations is disabled. This option is not
  458 +available with 40-bit encryption.
  459 +)");
  460 +}
  461 +static void add_help_5(QPDFArgParser& ap)
  462 +{
  463 +ap.addOptionHelp("--modify-other", "encryption", "restrict other modifications", R"(--modify-other=[yn]
  464 +
  465 +Enable/disable modifications not controlled by --assemble,
  466 +--annotate, or --form. --modify-other=n is implied by any of the
  467 +other --modify options. This option is not available with 40-bit
  468 +encryption.
  469 +)");
  470 +ap.addOptionHelp("--modify", "encryption", "restrict document modification", R"(--modify=modify-opt
  471 +
  472 +For 40-bit files, modify-opt may only be y or n and controls all
  473 +aspects of document modification.
  474 +
  475 +For 128-bit and 256-bit encryption, modify-opt values allow
  476 +enabling and disabling levels of restriction in a manner similar
  477 +to how some PDF creation tools do it. modify-opt values map to
  478 +other combinations of options as follows:
  479 +
  480 +all: allow full modification (the default)
  481 +annotate: --modify-other=n
  482 +form: --modify-other=n --annotate=n
  483 +assembly: --modify-other=n --annotate=n --form=n
  484 +none: --modify-other=n --annotate=n --form=n --assemble=n
  485 +)");
  486 +ap.addOptionHelp("--print", "encryption", "restrict printing", R"(--print=print-opt
  487 +
  488 +Control what kind of printing is allowed. For 40-bit encryption,
  489 +print-opt may only be y or n and enables or disables all
  490 +printing. For 128-bit and 256-bit encryption, print-opt may have
  491 +the following values:
  492 +
  493 +none: disallow printing
  494 +low: allow low-resolution printing only
  495 +full: allow full printing (the default)
  496 +)");
  497 +ap.addOptionHelp("--cleartext-metadata", "encryption", "don't encrypt metadata", R"(If specified, don't encrypt document metadata even when
  498 +encrypting the rest of the document. This option is not
  499 +available with 40-bit encryption.
  500 +)");
  501 +ap.addOptionHelp("--use-aes", "encryption", "use AES with 128-bit encryption", R"(--use-aes=[yn]
  502 +
  503 +Enables/disables use of the more secure AES encryption with
  504 +128-bit encryption. Specifying --use-aes=y forces the PDF
  505 +version to be at least 1.6. This option is only available with
  506 +128-bit encryption. The default is "n" for compatibility
  507 +reasons. Use 256-bit encryption instead.
  508 +)");
  509 +ap.addOptionHelp("--allow-insecure", "encryption", "allow empty owner passwords", R"(Allow creation of PDF files with empty owner passwords and
  510 +non-empty user passwords when using 256-bit encryption.
  511 +)");
  512 +ap.addOptionHelp("--force-V4", "encryption", "force V=4 in encryption dictionary", R"(This option is for testing and is never needed in practice since
  513 +qpdf does this automatically when needed.
  514 +)");
  515 +ap.addOptionHelp("--force-R5", "encryption", "use unsupported R=5 encryption", R"(Use an undocumented, unsupported, deprecated encryption
  516 +algorithm that existed only in Acrobat version IX. This option
  517 +should not be used except for compatibility testing.
  518 +)");
  519 +ap.addHelpTopic("page-selection", "select pages from one or more files", R"(Use the --pages option to select pages from multiple files. Usage:
  520 +
  521 +qpdf in.pdf --pages input-file [ --password=password ] [ page-range ] \
  522 + [ ... ] -- out.pdf
  523 +
  524 +Between --pages and the -- that terminates the --pages option, repeat
  525 +the following:
  526 +
  527 +filename [ --password=password ] [ page-range ]
  528 +
  529 +Document-level information, such as outlines, tags, etc., is taken
  530 +from in.pdf and preserved in out.pdf. You can use --empty in place
  531 +of an input file to start from an empty file and just copy pages
  532 +equally from all files. You can use "." as a shorthand for the
  533 +primary input file (if not --empty). In the above example, "."
  534 +would refer to in.pdf.
  535 +
  536 +Use --password=password to specify the password for a
  537 +password-protected input file. If the same input file is used more
  538 +than once, you only need to supply the password the first time. If
  539 +the page range is omitted, all pages are selected.
  540 +
  541 +Run qpdf --help=page-ranges for help with page ranges.
  542 +
  543 +Use --collate=n to cause pages to be collated in groups of n pages
  544 +(default 1) instead of concatenating the input.
  545 +
  546 +Examples:
  547 +
  548 +- Start with in.pdf and append all pages from a.pdf and the even
  549 + pages from b.pdf, and write the output to out.pdf. Document-level
  550 + information from in.pdf is retained. Note the use of "." to refer
  551 + to in.pdf.
  552 +
  553 + qpdf in.pdf --pages . a.pdf b.pdf 1-z:even -- out.pdf
  554 +
  555 +- Take all the pages from a.pdf, all the pages from b.pdf in
  556 + reverse, and only pages 3 and 6 from c.pdf and write the result
  557 + to out.pdf. Use password "x" to open b.pdf:
  558 +
  559 + qpdf --empty --pages a.pdf b.pdf --password=x z-1 c.pdf 3,6 -- out.pdf
  560 +
  561 +More examples are in the manual.
  562 +)");
  563 +ap.addHelpTopic("overlay-underlay", "overlay/underlay pages from other files", R"(These options allow pages from another file to be overlaid or
  564 +underlaid on the primary output. Overlaid pages are drawn on top of
  565 +the destination page and may obscure the page. Underlaid pages are
  566 +drawn below the destination page. Usage:
  567 +
  568 +{--overlay | --underlay } file
  569 + [ --password=password ]
  570 + [ --to=page-range ]
  571 + [ --from=[page-range] ]
  572 + [ --repeat=page-range ]
  573 + --
  574 +
  575 +Note the use of "--" by itself to terminate overlay/underlay options.
  576 +
  577 +For overlay and underlay, a file and optional password are specified, along
  578 +with a series of optional page ranges. The default behavior is that each
  579 +page of the overlay or underlay file is imposed on the corresponding page
  580 +of the primary output until it runs out of pages, and any extra pages are
  581 +ignored. You can also give a page range with --repeat to cause
  582 +those pages to be repeated after the original pages are exhausted.
  583 +
  584 +Run qpdf --help=page-ranges for help with page ranges.
  585 +)");
  586 +ap.addOptionHelp("--to", "overlay-underlay", "destination pages for underlay/overlay", R"(--to=page-range
  587 +
  588 +Specify the range of pages in the primary output to apply
  589 +overlay/underlay to. See qpdf --help=page-ranges for help with
  590 +the page range syntax.
  591 +)");
  592 +ap.addOptionHelp("--from", "overlay-underlay", "source pages for underlay/overlay", R"(--from=[page-range]
  593 +
  594 +Specify pages from the overlay/underlay file that are applied to
  595 +the destination pages. See qpdf --help=page-ranges for help
  596 +with the page range syntax. The page range may be omitted
  597 +if --repeat is used.
  598 +)");
  599 +ap.addOptionHelp("--repeat", "overlay-underlay", "overlay/underlay pages to repeat", R"(--repeat=page-range
  600 +
  601 +Specify pages from the overlay/underlay that are repeated after
  602 +"from" pages have been exhausted. See qpdf --help=page-ranges
  603 +for help with the page range syntax.
  604 +)");
  605 +ap.addHelpTopic("attachments", "work with embedded files", R"(It is possible to list, add, or delete embedded files (also known
  606 +as attachments) and to copy attachments from other files. See help
  607 +on individual options for details. Run qpdf --help=add-attachment
  608 +for additional details about adding attachments.
  609 +)");
  610 +ap.addOptionHelp("--list-attachments", "attachments", "list embedded files", R"(Show the key and stream number for each embedded file. Combine
  611 +with --verbose for more detailed information.
  612 +)");
  613 +ap.addOptionHelp("--show-attachment", "attachments", "export an embedded file", R"(--show-attachment=key
  614 +
  615 +Write the contents of the specified attachment to standard
  616 +output as binary data. Get the key with --list-attachments.
  617 +)");
  618 +ap.addOptionHelp("--add-attachment", "attachments", "start add attachment options", R"(--add-attachment file options --
  619 +
  620 +The --add-attachment flag and its options may be repeated to add
  621 +multiple attachments. Run qpdf --help=add-attachment for details.
  622 +)");
  623 +ap.addOptionHelp("--remove-attachment", "attachments", "remove an embedded file", R"(--remove-attachment=key
  624 +
  625 +Remove an embedded file using its key. Get the key with
  626 +--list-attachments.
  627 +)");
  628 +ap.addOptionHelp("--copy-attachments-from", "attachments", "start copy attachment options", R"(--copy-attachments-from file options --
  629 +
  630 +The --copy-attachments-from flag and its options may be repeated
  631 +to copy attachments from multiple files. Run
  632 +qpdf --help=copy-attachments for details.
  633 +)");
  634 +}
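The --modify help text above describes each modify-opt level (for 128-bit and 256-bit encryption) as shorthand for a combination of the finer-grained flags. A minimal Python sketch of that table follows, for illustration only. The names `MODIFY_LEVELS` and `expand_modify` are hypothetical and are not part of qpdf, and the sketch does not reflect how qpdf implements the mapping internally.

```python
# Illustrative only: mirrors the modify-opt table from the --modify help text.
# MODIFY_LEVELS and expand_modify are hypothetical names, not qpdf APIs.
MODIFY_LEVELS = {
    "all": [],  # full modification allowed (the default)
    "annotate": ["--modify-other=n"],
    "form": ["--modify-other=n", "--annotate=n"],
    "assembly": ["--modify-other=n", "--annotate=n", "--form=n"],
    "none": ["--modify-other=n", "--annotate=n", "--form=n", "--assemble=n"],
}


def expand_modify(level):
    """Return the fine-grained flags that --modify=<level> stands for."""
    try:
        return list(MODIFY_LEVELS[level])
    except KeyError:
        raise ValueError(f"unknown modify-opt: {level!r}") from None
```

Each level adds one more restriction to the level listed before it, which is why the lists grow by one flag per step.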
  635 +static void add_help_6(QPDFArgParser& ap)
  636 +{
  637 +ap.addHelpTopic("pdf-dates", "PDF date format", R"(When a date is required, the date should conform to the PDF date
  638 +format specification, which is "D:yyyymmddhhmmssz" where "z" is
  639 +either literally upper case "Z" for UTC or a timezone offset in
  640 +the form "-hh'mm'" or "+hh'mm'". Negative timezone offsets indicate
  641 +time before UTC. Positive offsets indicate how far after. For
  642 +example, US Eastern Standard Time (America/New_York) is "-05'00'",
  643 +and Indian Standard Time (Asia/Calcutta) is "+05'30'".
  644 +
  645 +Examples:
  646 +- D:20210207161528-05'00' February 7, 2021 at 4:15:28 p.m.
  647 +- D:20210207211528Z February 7, 2021 at 21:15:28 UTC
  648 +)");
  649 +ap.addHelpTopic("add-attachment", "attach (embed) files", R"(The options listed below appear between --add-attachment and its
  650 +terminating "--".
  651 +)");
  652 +ap.addOptionHelp("--key", "add-attachment", "specify attachment key", R"(--key=key
  653 +
  654 +Specify the key to use for the attachment in the embedded files
  655 +table. It defaults to the last element of the attached file's
  656 +filename.
  657 +)");
  658 +ap.addOptionHelp("--filename", "add-attachment", "set attachment's displayed filename", R"(--filename=name
  659 +
  660 +Specify the filename to be used for the attachment. This is what
  661 +is usually displayed to the user and is the name most graphical
  662 +PDF viewers will use when saving a file. It defaults to the last
  663 +element of the attached file's filename.
  664 +)");
  665 +ap.addOptionHelp("--creationdate", "add-attachment", "set attachment's creation date", R"(--creationdate=date
  666 +
  667 +Specify the attachment's creation date in PDF format; defaults
  668 +to the current time. Run qpdf --help=pdf-dates for information
  669 +about the date format.
  670 +)");
  671 +ap.addOptionHelp("--moddate", "add-attachment", "set attachment's modification date", R"(--moddate=date
  672 +
  673 +Specify the attachment's modification date in PDF format;
  674 +defaults to the current time. Run qpdf --help=pdf-dates for
  675 +information about the date format.
  676 +)");
  677 +ap.addOptionHelp("--mimetype", "add-attachment", "attachment mime type (e.g. application/pdf)", R"(--mimetype=type/subtype
  678 +
  679 +Specify the mime type for the attachment, such as text/plain,
  680 +application/pdf, image/png, etc.
  681 +)");
  682 +ap.addOptionHelp("--description", "add-attachment", "set attachment's description", R"(--description="text"
  683 +
  684 +Supply descriptive text for the attachment, displayed by some
  685 +PDF viewers.
  686 +)");
  687 +ap.addOptionHelp("--replace", "add-attachment", "replace attachment with same key", R"(Indicate that any existing attachment with the same key should
  688 +be replaced by the new attachment. Otherwise, qpdf gives an
  689 +error if an attachment with that key is already present.
  690 +)");
  691 +ap.addHelpTopic("copy-attachments", "copy attachments from another file", R"(The options listed below appear between --copy-attachments-from and
  692 +its terminating "--".
  693 +
  694 +To copy attachments from a password-protected file, use
  695 +the --password option after the file name.
  696 +)");
  697 +ap.addOptionHelp("--prefix", "copy-attachments", "key prefix for copying attachments", R"(--prefix=prefix
  698 +
  699 +Prepend a prefix to each key; may be needed if there are
  700 +duplicate attachment keys. This affects the key only, not the
  701 +file name.
  702 +)");
  703 +ap.addOptionHelp("--is-encrypted", "copy-attachments", "silently test whether a file is encrypted", R"(Silently exit with a code indicating the file's encryption status:
  704 +
  705 +0: the file is encrypted
  706 +1: not used
  707 +2: the file is not encrypted
  708 +
  709 +This can be used with password-protected files even if you don't
  710 +know the password.
  711 +)");
  712 +ap.addOptionHelp("--requires-password", "copy-attachments", "silently test a file's password", R"(Silently exit with a code indicating the file's password status:
  713 +
  714 +0: a password, other than as supplied, is required
  715 +1: not used
  716 +2: the file is not encrypted
  717 +3: the file is encrypted, and the correct password (if any) has been supplied
  718 +)");
  719 +ap.addOptionHelp("--check", "copy-attachments", "partially check whether PDF is valid", R"(Check the structure of the PDF file as well as a number of other
  720 +aspects of the file, and write information about the file to
  721 +standard output. Note that qpdf does not perform any validation
  722 +of the actual PDF page content or semantic correctness of the
  723 +PDF file. It merely checks that the PDF file is syntactically
  724 +valid.
  725 +)");
  726 +ap.addOptionHelp("--show-encryption", "copy-attachments", "information about encrypted files", R"(Show document encryption parameters. Also show the document's
  727 +user password if the owner password is given and the file was
  728 +encrypted using older encryption formats that allow user
  729 +password recovery.
  730 +)");
  731 +ap.addOptionHelp("--show-encryption-key", "copy-attachments", "show key with --show-encryption", R"(When used with --show-encryption, causes the underlying
  732 +encryption key to be displayed.
  733 +)");
  734 +ap.addOptionHelp("--check-linearization", "copy-attachments", "check linearization tables", R"(Check to see whether a file is linearized and, if so, whether
  735 +the linearization hint tables are correct.
  736 +)");
  737 +ap.addOptionHelp("--show-linearization", "copy-attachments", "show linearization hint tables", R"(Check and display all data in the linearization hint tables.
  738 +)");
  739 +ap.addOptionHelp("--show-xref", "copy-attachments", "show cross reference data", R"(Show the contents of the cross-reference table or stream (object
  740 +locations in the file) in a human-readable form. This is
  741 +especially useful for files with cross-reference streams, which
  742 +are stored in a binary format.
  743 +)");
  744 +}
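The --requires-password help text above defines its result purely through exit codes, which makes it convenient for scripts. The following Python sketch shows one way a caller might interpret those codes. The helper names are hypothetical, and the `subprocess` call assumes a `qpdf` executable on the PATH that accepts the options exactly as described in the help text.

```python
import subprocess

# Exit codes as listed in the --requires-password help text (1 is not used).
REQUIRES_PASSWORD_STATUS = {
    0: "a password, other than as supplied, is required",
    2: "the file is not encrypted",
    3: "the file is encrypted, and the correct password (if any) was supplied",
}


def describe_requires_password(exit_code):
    """Translate a --requires-password exit code into a description."""
    return REQUIRES_PASSWORD_STATUS.get(
        exit_code, f"unexpected exit code {exit_code}")


def check_password(path, password=None):
    """Hypothetical wrapper: run qpdf --requires-password on a file."""
    cmd = ["qpdf", "--requires-password"]
    if password is not None:
        cmd.append(f"--password={password}")
    cmd.append(path)
    return describe_requires_password(subprocess.run(cmd).returncode)
```

Keeping the code-to-meaning table separate from the process call lets the interpretation be tested without qpdf installed.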
  745 +static void add_help_7(QPDFArgParser& ap)
  746 +{
  747 +ap.addOptionHelp("--show-object", "copy-attachments", "show contents of an object", R"(--show-object=trailer|obj[,gen]
  748 +
  749 +Show the contents of the given object. This is especially useful
  750 +for inspecting objects that are inside of object streams (also
  751 +known as "compressed objects").
  752 +)");
  753 +ap.addOptionHelp("--raw-stream-data", "copy-attachments", "show raw stream data", R"(When used with --show-object, if the object is a stream, write
  754 +the raw (compressed) binary stream data to standard output
  755 +instead of the object's contents. See also
  756 +--filtered-stream-data.
  757 +)");
  758 +ap.addOptionHelp("--filtered-stream-data", "copy-attachments", "show filtered stream data", R"(When used with --show-object, if the object is a stream, write
  759 +the filtered (uncompressed, potentially binary) stream data to
  760 +standard output instead of the object's contents. See also
  761 +--raw-stream-data.
  762 +)");
  763 +ap.addOptionHelp("--show-npages", "copy-attachments", "show number of pages", R"(Print the number of pages in the input file on a line by itself.
  764 +Useful for scripts.
  765 +)");
  766 +ap.addOptionHelp("--show-pages", "copy-attachments", "display page dictionary information", R"(Show the object and generation number for each page dictionary
  767 +object and for each content stream associated with the page.
  768 +)");
  769 +ap.addOptionHelp("--with-images", "copy-attachments", "include image details with --show-pages", R"(When used with --show-pages, also shows the object and
  770 +generation numbers for the image objects on each page.
  771 +)");
  772 +ap.addHelpTopic("json", "JSON output for PDF information", R"(Show information about the PDF file in JSON format. Please see the
  773 +JSON chapter in the qpdf manual for details.
  774 +)");
  775 +ap.addOptionHelp("--json", "json", "show file in json format", R"(Generate a JSON representation of the file. This is described in
  776 +depth in the JSON section of the manual.
  777 +)");
  778 +ap.addOptionHelp("--json-help", "json", "show format of json output", R"(Describe the format of the JSON output.
  779 +)");
  780 +ap.addOptionHelp("--json-key", "json", "restrict which keys are in json output", R"(--json-key=key
  781 +
  782 +This option is repeatable. If given, only the specified
  783 +top-level keys will be included in the JSON output. Otherwise,
  784 +all keys will be included.
  785 +)");
  786 +ap.addOptionHelp("--json-object", "json", "restrict which objects are in JSON", R"(--json-object=trailer|obj[,gen]
  787 +
  788 +This option is repeatable. If given, only specified objects will
  789 +be shown in the "objects" key of the JSON output. Otherwise, all
  790 +objects will be shown.
  791 +)");
  792 +ap.addHelpTopic("testing", "options for testing or debugging", R"(The options below are useful when writing automated test code that
  793 +includes files created by qpdf or when testing qpdf itself.
  794 +)");
  795 +ap.addOptionHelp("--static-id", "testing", "use a fixed document ID", R"(Use a fixed value for the document ID. This is intended for
  796 +testing only. Never use it for production files. See also
  797 +qpdf --help=--deterministic-id.
  798 +)");
  799 +ap.addOptionHelp("--static-aes-iv", "testing", "use a fixed AES vector", R"(Use a static initialization vector for AES-CBC. This is intended
  800 +for testing only so that output files can be reproducible. Never
  801 +use it for production files. This option is not secure since it
  802 +significantly weakens the encryption.
  803 +)");
  804 +ap.addOptionHelp("--linearize-pass1", "testing", "save pass 1 of linearization", R"(--linearize-pass1=file
  805 +
  806 +Write the first pass of linearization to the named file. The
  807 +resulting file is not a valid PDF file. This option is useful only
  808 +for debugging qpdf.
  809 +)");
  810 +}
  811 +static void add_help(QPDFArgParser& ap)
  812 +{
  813 + add_help_1(ap);
  814 + add_help_2(ap);
  815 + add_help_3(ap);
  816 + add_help_4(ap);
  817 + add_help_5(ap);
  818 + add_help_6(ap);
  819 + add_help_7(ap);
  820 +ap.addHelpFooter("For detailed help, visit the qpdf manual: https://qpdf.readthedocs.io\n");
  821 +}
  822 +
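The pdf-dates help topic in this file describes the "D:yyyymmddhhmmssz" format expected by --creationdate and --moddate. As a hedged illustration, the Python sketch below formats a timezone-aware `datetime` in that layout. The function name `to_pdf_date` is an assumption made for this example and is not part of qpdf.

```python
from datetime import datetime, timedelta, timezone


def to_pdf_date(dt):
    """Format an aware datetime as D:yyyymmddhhmmssz (pdf-dates help topic)."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("datetime must be timezone-aware")
    stamp = dt.strftime("D:%Y%m%d%H%M%S")
    minutes = int(offset.total_seconds()) // 60
    if minutes == 0:
        return stamp + "Z"  # literal upper-case Z for UTC
    sign = "+" if minutes > 0 else "-"
    hours, mins = divmod(abs(minutes), 60)
    return f"{stamp}{sign}{hours:02d}'{mins:02d}'"


eastern = timezone(timedelta(hours=-5))
print(to_pdf_date(datetime(2021, 2, 7, 16, 15, 28, tzinfo=eastern)))
print(to_pdf_date(datetime(2021, 2, 7, 21, 15, 28, tzinfo=timezone.utc)))
```

The two printed values correspond to the two examples given in the help topic, which describe the same instant in US Eastern Standard Time and in UTC.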
... ...
libqpdf/qpdf/auto_job_init.hh
... ... @@ -162,4 +162,3 @@ this->ap.copyFromOtherTable("annotate", "128-bit encryption");
162 162 this->ap.copyFromOtherTable("form", "128-bit encryption");
163 163 this->ap.copyFromOtherTable("modify-other", "128-bit encryption");
164 164 this->ap.copyFromOtherTable("modify", "128-bit encryption");
165   -this->ap.addHelpFooter("For detailed help, visit the qpdf manual: https://qpdf.readthedocs.io\n");
... ...
libtests/qtest/arg_parser/help-all.out
... ... @@ -6,7 +6,7 @@ Topics:
6 6 baaa: Baaa Options
7 7 quack: Quack Options
8 8  
9   -== topic baaa (Baaa Options) ==
  9 +== baaa (Baaa Options) ==
10 10  
11 11 Ewe can do sheepish things.
12 12 For example, ewe can add more ram to your computer.
... ... @@ -15,15 +15,15 @@ Related options:
15 15 --ewe: just for ewe
16 16 --ram: curly horns
17 17  
18   -== topic quack (Quack Options) ==
  18 +== quack (Quack Options) ==
19 19  
20 20 Just put stuff after quack to get a count at the end.
21 21  
22   -== option --ewe (just for ewe) ==
  22 +== --ewe (just for ewe) ==
23 23  
24 24 You are not a ewe.
25 25  
26   -== option --ram (curly horns) ==
  26 +== --ram (curly horns) ==
27 27  
28 28 curly horns
29 29  
... ...
manual/.gitignore 0 → 100644
  1 +__pycache__
... ...
manual/_ext/qpdf.py 0 → 100644
  1 +from collections import defaultdict
  2 +from operator import itemgetter
  3 +import re
  4 +
  5 +from sphinx import addnodes
  6 +from sphinx.directives import ObjectDescription
  7 +from sphinx.domains import Domain, Index
  8 +from sphinx.roles import XRefRole
  9 +from sphinx.util.nodes import make_refnode
  10 +
  11 +# Reference:
  12 +# https://www.sphinx-doc.org/en/master/development/tutorials/todo.html
  13 +# https://www.sphinx-doc.org/en/master/development/tutorials/recipe.html
  14 +
  15 +
  16 +class OptionDirective(ObjectDescription):
  17 +    has_content = True
  18 +
  19 +    def handle_signature(self, sig, signode):
  20 +        signode += addnodes.desc_name(text=sig)
  21 +        return sig
  22 +
  23 +    def add_target_and_index(self, name_cls, sig, signode):
  24 +        m = re.match(r'^--([^= ]+)', sig)
  25 +        if not m:
  26 +            raise Exception('option must start with --')
  27 +        option_name = m.group(1)
  28 +        signode['ids'].append(f'option-{option_name}')
  29 +        qpdf = self.env.get_domain('qpdf')
  30 +        qpdf.add_option(sig, option_name)
  31 +
  32 +
  33 +class OptionIndex(Index):
  34 +    name = 'options'
  35 +    localname = 'qpdf Command-line Options'
  36 +    shortname = 'Options'
  37 +
  38 +    def generate(self, docnames=None):
  39 +        content = defaultdict(list)
  40 +        options = self.domain.get_objects()
  41 +        options = sorted(options, key=itemgetter(0))
  42 +
  43 +        # name, subtype, docname, anchor, extra, qualifier, description
  44 +        for name, display_name, typ, docname, anchor, _ in options:
  45 +            m = re.match(r'^(--([^= ]+))', display_name)
  46 +            if not m:
  47 +                raise Exception(
  48 +                    'OptionIndex.generate: display name not as expected')
  49 +            content[m.group(2)[0].lower()].append(
  50 +                (m.group(1), 0, docname, anchor, '', '', typ))
  51 +
  52 +        content = sorted(content.items())
  53 +        return content, True
  54 +
  55 +
  56 +class QpdfDomain(Domain):
  57 +    name = 'qpdf'
  58 +    label = 'qpdf documentation domain'
  59 +    roles = {
  60 +        'ref': XRefRole()
  61 +    }
  62 +    directives = {
  63 +        'option': OptionDirective,
  64 +    }
  65 +    indices = {
  66 +        OptionIndex,
  67 +    }
  68 +    initial_data = {
  69 +        'options': [],  # object list
  70 +    }
  71 +
  72 +    def get_full_qualified_name(self, node):
  73 +        return '{}.{}'.format('option', node.arguments[0])
  74 +
  75 +    def get_objects(self):
  76 +        for obj in self.data['options']:
  77 +            yield(obj)
  78 +
  79 +    def resolve_xref(self, env, from_doc_name, builder, typ, target, node,
  80 +                     contnode):
  81 +        match = [(docname, anchor)
  82 +                 for name, sig, typ, docname, anchor, priority
  83 +                 in self.get_objects() if name == f'option.{target[2:]}']
  84 +
  85 +        if len(match) > 0:
  86 +            to_doc_name = match[0][0]
  87 +            match_target = match[0][1]
  88 +            return make_refnode(builder, from_doc_name, to_doc_name,
  89 +                                match_target, contnode, match_target)
  90 +        else:
  91 +            raise Exception(f'invalid option xref ({target})')
  92 +
  93 +    def add_option(self, signature, option_name):
  94 +        if self.env.docname != 'cli':
  95 +            raise Exception(
  96 +                'qpdf:option directives don\'t work outside of cli.rst')
  97 +
  98 +        name = f'option.{option_name}'
  99 +        anchor = f'option-{option_name}'
  100 +
  101 +        # name, display_name, type, docname, anchor, priority
  102 +        self.data['options'].append(
  103 +            (name, signature, '', self.env.docname, anchor, 0))
  104 +
  105 +    def purge_options(self, docname):
  106 +        self.data['options'] = list([
  107 +            x for x in self.data['options']
  108 +            if x[3] != docname
  109 +        ])
  110 +
  111 +
  112 +def purge_options(app, env, docname):
  113 +    option = env.get_domain('qpdf')
  114 +    option.purge_options(docname)
  115 +
  116 +
  117 +def setup(app):
  118 +    app.add_domain(QpdfDomain)
  119 +    app.connect('env-purge-doc', purge_options)
  120 +
  121 +    return {
  122 +        'version': '0.1',
  123 +        'parallel_read_safe': True,
  124 +        'parallel_write_safe': True,
  125 +    }
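The extension above derives each option's name and HTML anchor from the directive signature with a single regular expression in `add_target_and_index`. The standalone sketch below isolates that step so it can be tried without Sphinx. The function name `option_anchor` is hypothetical; only the regular expression and the `option-` anchor prefix are taken from the extension.

```python
import re


def option_anchor(signature):
    """Derive (option_name, anchor) from a signature such as '--key=key'."""
    # Same pattern as the extension: text after "--" up to "=" or a space.
    m = re.match(r'^--([^= ]+)', signature)
    if not m:
        raise ValueError('option must start with --')
    option_name = m.group(1)
    return option_name, f'option-{option_name}'


print(option_anchor('--oi-min-width=width'))
print(option_anchor('--add-attachment file options --'))
```

Because the pattern stops at the first `=` or space, signatures that carry a parameter or trailing arguments still map to an anchor based only on the option name.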
... ...
manual/build.mk
... ... @@ -17,7 +17,7 @@ ifeq ($(BUILD_PDF),1)
17 17 TARGETS_manual += $(PDF_TARGET)
18 18 endif
19 19  
20   -MANUAL_DEPS = $(wildcard manual/*.rst) manual/conf.py
  20 +MANUAL_DEPS = $(wildcard manual/*.rst) manual/conf.py manual/_ext/qpdf.py
21 21  
22 22 # Prevent targets that run $(SPHINX) from running in parallel by using
23 23 # order-only dependencies (the dependencies listed after the |) to
... ...
manual/cli.rst
  1 +.. NOTES
  2 +
  3 + This file contains text that is used for help file generation.
  4 + Lines that start with the magic comment ".. help-topic x: y"
  5 + introduce a help topic called "x" with short text "y". The contents
  6 + of the comment are the long text.
  7 +
  8 + The ".. qpdf:option:: option" directive introduces a command-line
  9 + option. The next ".. help: short_text" comment's contents are the
  10 + long text of the help. Search for ".. help-topic" and "qpdf:option"
  11 + for additional help. Command line arguments can be referenced using
  12 + :qpdf:ref:`--option`. They also appear in an index.
  13 +
  14 + In this text, :samp:`...` and ``...`` are used somewhat
  15 + interchangeably. :samp: should be used when there is replaceable
  16 + text enclosed in curly braces. Otherwise, either is fine. Ideally
  17 + there should be a stricter editorial convention, but they render
  18 + the same, so I have not gone to the trouble of making it consistent.
  19 +
1 20 .. _using:
2 21  
3   -Running QPDF
  22 +Running qpdf
4 23 ============
5 24  
6 25 This chapter describes how to run the qpdf program from the command
... ... @@ -11,1187 +30,1356 @@ line.
11 30 Basic Invocation
12 31 ----------------
13 32  
14   -When running qpdf, the basic invocation is as follows:
  33 +.. help-topic usage: basic invocation
  34 +
  35 + Read a PDF file, apply transformations or modifications, and write
  36 + a new PDF file.
  37 +
  38 + Usage: qpdf infile [options] [outfile]
  39 + OR qpdf help-option
  40 +
  41 + - infile, options, and outfile may be in any order as long as infile
  42 + precedes outfile.
  43 + - Use --empty in place of an input file for a zero-page, empty input
  44 + - Use --replace-input in place of an output file to overwrite the
  45 + input file with the output
  46 + - outfile may be - to write to stdout; reading from stdin is not supported
  47 + - @filename is an argument file; each line is treated as a separate
  48 + command-line argument
  49 + - @- may be used to read arguments from stdin
  50 + - Later options may override earlier options if contradictory
15 51  
16 52 ::
17 53  
18   - qpdf [ options ] { infilename | --empty } outfilename
19   -
20   -This converts PDF file :samp:`infilename` to PDF file
21   -:samp:`outfilename`. The output file is functionally
22   -identical to the input file but may have been structurally reorganized.
23   -Also, orphaned objects will be removed from the file. Many
24   -transformations are available as controlled by the options below. In
25   -place of :samp:`infilename`, the parameter
26   -:samp:`--empty` may be specified. This causes qpdf to
27   -use a dummy input file that contains zero pages. The only normal use
28   -case for using :samp:`--empty` would be if you were
29   -going to add pages from another source, as discussed in :ref:`page-selection`.
30   -
31   -If :samp:`@filename` appears as a word anywhere in the
32   -command-line, it will be read line by line, and each line will be
33   -treated as a command-line argument. Leading and trailing whitespace is
34   -intentionally not removed from lines, which makes it possible to handle
35   -arguments that start or end with spaces. The :samp:`@-`
36   -option allows arguments to be read from standard input. This allows qpdf
37   -to be invoked with an arbitrary number of arbitrarily long arguments. It
38   -is also very useful for avoiding having to pass passwords on the command
39   -line. Note that the :samp:`@filename` can't appear in
40   -the middle of an argument, so constructs such as
41   -:samp:`--arg=@option` will not work. You would have to
42   -include the argument and its options together in the arguments file.
43   -
44   -:samp:`outfilename` does not have to be seekable, even
45   -when generating linearized files. Specifying ":samp:`-`"
46   -as :samp:`outfilename` means to write to standard
47   -output. If you want to overwrite the input file with the output, use the
48   -option :samp:`--replace-input` and omit the output file
49   -name. You can't specify the same file as both the input and the output.
50   -If you do this, qpdf will tell you about the
51   -:samp:`--replace-input` option.
52   -
53   -Most options require an output file, but some testing or inspection
54   -commands do not. These are specifically noted.
  54 + Usage: qpdf infile [ options ] [ outfile ]
  55 +
  56 +The :command:`qpdf` command reads the PDF file :samp:`{infile}`,
  57 +applies various transformations or modifications to the file in
  58 +memory, and writes the results to :samp:`{outfile}`. When run with no
  59 +options, the output file is functionally identical to the input file
  60 +but may be structurally reorganized, and orphaned objects are removed
  61 +from the file. Many options are available for applying transformations
  62 +or modifications to the file.
  63 +
  64 +:samp:`{infile}` can be a regular file, or it can be
  65 +:qpdf:ref:`--empty` to start with an empty PDF file. :samp:`{outfile}`
  66 +can be a regular file, ``-`` to represent standard output, or
  67 +:qpdf:ref:`--replace-input` to indicate that the input file should be
  68 +overwritten. The output file does not have to be seekable, even when
  69 +generating linearized files. The input file *does* have to be
  70 +seekable. You can't read from standard input or a pipe. You can also
  71 +use :qpdf:ref:`--split-pages` to create separate output files for each
  72 +page (or group of pages) instead of a single output file.
  73 +Password-protected files may be opened by specifying a password with
  74 +:qpdf:ref:`--password`. These and many other options are discussed in
  75 +the remaining sections of this chapter.
  76 +
  77 +All options other than help options (see :ref:`help-options`) require
  78 +an input file. If inspection options (see :ref:`inspection-options`)
  79 +are given, an output file must not be given. Otherwise, an output file
  80 +is required.
  81 +
  82 +If :samp:`@filename` appears as a word anywhere in the command-line,
  83 +it will be read line by line, and each line will be treated as a
  84 +command-line argument. Leading and trailing whitespace is
  85 +intentionally not removed from lines, which makes it possible to
  86 +handle arguments that start or end with spaces. The :samp:`@-` option
  87 +allows arguments to be read from standard input. This allows qpdf to
  88 +be invoked with an arbitrary number of arbitrarily long arguments. It
  89 +is also very useful for avoiding having to pass passwords on the
  90 +command line, though see also :qpdf:ref:`--password-file`. Note that
  91 +the :samp:`@filename` can't appear in the middle of an argument, so
  92 +constructs such as :samp:`--arg=@filename` will not work. Instead, you
  93 +would have to include the argument and its parameter (e.g.,
  94 +:samp:`--arg=parameter`) as a line in the :file:`filename` file and
  95 +just pass :samp:`@filename` on the command line.
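
As a rough illustration of the argument-file mechanism described above, the following Python sketch writes one argument per line and then builds a command line that passes only :samp:`@filename`. The helper names ``write_arg_file`` and ``qpdf_command_with_arg_file`` and the file names are hypothetical and are not part of qpdf.

```python
import os
import tempfile


def write_arg_file(path, args):
    """Write one qpdf command-line argument per line.

    Whitespace is written verbatim because qpdf does not strip leading
    or trailing whitespace from lines in an argument file.
    """
    for arg in args:
        if "\n" in arg or "\r" in arg:
            raise ValueError("an argument cannot contain a line break")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for arg in args:
            f.write(arg + "\n")


def qpdf_command_with_arg_file(arg_file):
    """Return an argv list that passes every argument via @filename."""
    return ["qpdf", "@" + arg_file]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        arg_file = os.path.join(tmp, "args.txt")
        # The whole --password=... option goes on one line; a construct
        # such as --password=@args.txt would not be expanded.
        write_arg_file(arg_file, ["in.pdf", "--password=secret phrase ", "out.pdf"])
        print(qpdf_command_with_arg_file(arg_file))
```

The sketch rejects arguments that contain line breaks because each line of the file is treated as exactly one argument.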
  96 +
  97 +Related Options
  98 +~~~~~~~~~~~~~~~
  99 +
  100 +.. qpdf:option:: --empty
  101 +
  102 + .. help: empty input file
  103 +
  104 + Use in place of infile for an empty input. Especially useful
  105 + with --pages.
  106 +
  107 + This option may be given in place of :samp:`{infile}`. This causes
  108 + qpdf to use a dummy input file that contains zero pages. This
  109 + option is useful in conjunction with :qpdf:ref:`--pages`. See
  110 + :ref:`page-selection` for details.
  111 +
  112 +.. qpdf:option:: --replace-input
  113 +
  114 + .. help: replace input with output
  115 +
  116 + Use in place of outfile to overwrite the input file with the output.
  117 +
  118 + This option may be given in place of :samp:`{outfile}`. This causes
  119 + qpdf to replace the input file with the output. It does this by
  120 + writing to :file:`{infilename}.~qpdf-temp#` and, when done,
  121 + overwriting the input file with the temporary file. If there were
  122 + any warnings, the original input is saved as
  123 + :file:`{infilename}.~qpdf-orig`. If there are errors, the input
  124 + file is left untouched.
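
A script that uses ``--replace-input`` may want to check whether qpdf kept a copy of the original file. The Python sketch below only derives the backup name described above and checks for its existence. The function names are hypothetical, and the naming convention is taken from this text rather than from any qpdf API.

```python
import os


def replace_input_backup_path(infile):
    """Path where the original input is saved if there were warnings."""
    return infile + ".~qpdf-orig"


def original_was_preserved(infile):
    """True if a saved copy of the original input exists next to infile."""
    return os.path.exists(replace_input_backup_path(infile))
```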
55 125  
56 126 .. _exit-status:
57 127  
58 128 Exit Status
59   -~~~~~~~~~~~
  129 +-----------
60 130  
61   -The exit status of :command:`qpdf` may be interpreted as
62   -follows:
  131 +.. help-topic exit-status: meanings of qpdf's exit codes
  132 +
  133 + Meaning of exit codes:
  134 +
  135 + 0: no errors or warnings
  136 + 1: not used by qpdf but may be used by the shell if unable to invoke qpdf
  137 + 2: errors detected
  138 + 3: warnings detected, unless --warning-exit-0 is given
  139 +
  140 +The exit status of :command:`qpdf` may be interpreted as follows:
63 141  
64 142 - ``0``: no errors or warnings were found. The file may still have
65   - problems qpdf can't detect. If
66   - :samp:`--warning-exit-0` was specified, exit status 0
67   - is used even if there are warnings.
  143 + problems qpdf can't detect. If :qpdf:ref:`--warning-exit-0` was
  144 + specified, exit status ``0`` is used even if there are warnings.
  145 +
  146 +- ``1``: :command:`qpdf` does not exit with status ``1`` since the
  147 + shell uses this exit code if it is unable to invoke the command.
68 148  
69 149 - ``2``: errors were found. qpdf was not able to fully process the
70 150 file.
71 151  
72   -- ``3``: qpdf encountered problems that it was able to recover from. In
73   - some cases, the resulting file may still be damaged. Note that qpdf
74   - still exits with status ``3`` if it finds warnings even when
75   - :samp:`--no-warn` is specified. With
76   - :samp:`--warning-exit-0`, warnings without errors
77   - exit with status 0 instead of 3.
  152 +- ``3``: qpdf encountered problems that it was able to recover from.
  153 + In some cases, the resulting file may still be damaged. Note that
  154 + qpdf still exits with status ``3`` if it finds warnings even when
  155 + :qpdf:ref:`--no-warn` is specified. With
  156 + :qpdf:ref:`--warning-exit-0`, warnings without errors exit with
  157 + status ``0`` instead of ``3``.
  158 +
  159 +The :qpdf:ref:`--is-encrypted` and :qpdf:ref:`--requires-password`
  160 +options use different exit codes. See their help for details.
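
The exit codes listed above could be interpreted by a calling script roughly as follows. This is a minimal Python sketch: the function names are hypothetical, and it deliberately does not cover ``--is-encrypted`` or ``--requires-password``, which use different exit codes.

```python
def describe_exit_status(status):
    """Map a qpdf exit status to the meaning documented above."""
    if status == 1:
        # qpdf itself does not exit with 1; the shell may use it when
        # it is unable to invoke the command.
        return "not produced by qpdf; the command may not have been invoked"
    meanings = {
        0: "no errors or warnings (or warnings with --warning-exit-0)",
        2: "errors detected",
        3: "warnings detected, no errors",
    }
    return meanings.get(status, "unexpected exit status")


def output_is_usable(status, accept_warnings=False):
    """Decide whether to trust the output file.

    Status 3 means qpdf recovered from problems, but the resulting file
    may still be damaged, so accepting it is left to the caller.
    """
    return status == 0 or (accept_warnings and status == 3)
```

A caller would typically pass the ``returncode`` of a completed ``subprocess.run`` call to these helpers.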
  161 +
  162 +Related Options
  163 +~~~~~~~~~~~~~~~
  164 +
  165 +.. qpdf:option:: --warning-exit-0
  166 +
  167 + .. help: exit 0 even with warnings
78 168  
79   -Note that :command:`qpdf` never exists with status ``1``.
80   -If you get an exit status of ``1``, it was something else, like the
81   -shell not being able to find or execute :command:`qpdf`.
  169 + Use exit status 0 instead of 3 when warnings are present. When
  170 + combined with --no-warn, warnings are completely ignored.
  171 +
  172 + If there were warnings only and no errors, exit with exit code
  173 + ``0`` instead of ``3``. When combined with :qpdf:ref:`--no-warn`,
  174 + the effect is for :command:`qpdf` to completely ignore warnings.
82 175  
83 176 .. _shell-completion:
84 177  
85 178 Shell Completion
86 179 ----------------
87 180  
88   -Starting in qpdf version 8.3.0, qpdf provides its own completion support
89   -for zsh and bash. You can enable bash completion with :command:`eval
90   -$(qpdf --completion-bash)` and zsh completion with
91   -:command:`eval $(qpdf --completion-zsh)`. If
92   -:command:`qpdf` is not in your path, you should invoke it
93   -above with an absolute path. If you invoke it with a relative path, it
94   -will warn you, and the completion won't work if you're in a different
95   -directory.
96   -
97   -qpdf will use ``argv[0]`` to figure out where its executable is. This
98   -may produce unwanted results in some cases, especially if you are trying
99   -to use completion with copy of qpdf that is built from source. You can
100   -specify a full path to the qpdf you want to use for completion in the
101   -``QPDF_EXECUTABLE`` environment variable.
  181 +.. help-topic completion: shell completion
102 182  
103   -.. _basic-options:
  183 + Shell completion is supported with bash and zsh. Use
  184 + eval $(qpdf --completion-bash) or eval $(qpdf --completion-zsh)
  185 + to enable. The QPDF_EXECUTABLE environment variable overrides the
  186 + path to qpdf that these commands output.
104 187  
105   -Basic Options
106   --------------
  188 +:command:`qpdf` provides its own completion support for zsh and bash.
  189 +You can enable bash completion with :command:`eval $(qpdf
  190 +--completion-bash)` and zsh completion with :command:`eval $(qpdf
  191 +--completion-zsh)`. If :command:`qpdf` is not in your path, you should
  192 +invoke it above with an absolute path. If you invoke it with a
  193 +relative path, it will warn you, and the completion won't work if
  194 +you're in a different directory.
107 195  
108   -The following options are the most common ones and perform commonly
109   -needed transformations.
  196 +:command:`qpdf` will use ``argv[0]`` to figure out where its
  197 +executable is. This may produce unwanted results in some cases,
  198 +especially if you are trying to use completion with a copy of qpdf that
  199 +is run directly out of the source tree or that is invoked with a
  200 +wrapper script. You can specify a full path to the qpdf you want to
  201 +use for completion in the ``QPDF_EXECUTABLE`` environment variable.
110 202  
111   -:samp:`--help`
112   - Display command-line invocation help.
  203 +Related Options
  204 +~~~~~~~~~~~~~~~
113 205  
114   -:samp:`--version`
115   - Display the current version of qpdf.
  206 +.. qpdf:option:: --completion-bash
116 207  
117   -:samp:`--copyright`
118   - Show detailed copyright information.
  208 + .. help: enable bash completion
119 209  
120   -:samp:`--show-crypto`
121   - Show a list of available crypto providers, each on a line by itself.
122   - The default provider is always listed first. See :ref:`crypto` for more information about crypto
123   - providers.
  210 + Output a command that enables bash completion
124 211  
125   -:samp:`--completion-bash`
126 212 Output a completion command you can eval to enable shell completion
127 213 from bash.
128 214  
129   -:samp:`--completion-zsh`
  215 +.. qpdf:option:: --completion-zsh
  216 +
  217 + .. help: enable zsh completion
  218 +
  219 + Output a command that enables zsh completion
  220 +
130 221 Output a completion command you can eval to enable shell completion
131 222 from zsh.
132 223  
133   -:samp:`--password={password}`
134   - Specifies a password for accessing encrypted files. To read the
135   - password from a file or standard input, you can use
136   - :samp:`--password-file`, added in qpdf 10.2. Note
137   - that you can also use :samp:`@filename` or
138   - :samp:`@-` as described above to put the password in
139   - a file or pass it via standard input, but you would do so by
140   - specifying the entire
141   - :samp:`--password={password}`
142   - option in the file. Syntax such as
143   - :samp:`--password=@filename` won't work since
144   - :samp:`@filename` is not recognized in the middle of
145   - an argument.
146   -
147   -:samp:`--password-file={filename}`
  224 +.. _help-options:
  225 +
  226 +Help/Information
  227 +----------------
  228 +
  229 +.. help-topic help: information about qpdf
  230 +
  231 + Help options provide some information about qpdf itself. Help
  232 + options are only valid as the first and only command-line argument.
  233 +
  234 +Help options provide some information about qpdf itself. Help options
  235 +are only valid as the first and only command-line argument.
  236 +
  237 +Related Options
  238 +~~~~~~~~~~~~~~~
  239 +
  240 +.. qpdf:option:: --help
  241 +
  242 + .. help: provide help
  243 +
  244 + Display help information. Run qpdf --help for information about
  245 + how to get help on various topics.
  246 +
  247 + Display command-line invocation help.
  248 +
  249 +.. qpdf:option:: --version
  250 +
  251 + .. help: show qpdf version
  252 +
  253 + Display the version of qpdf.
  254 +
  255 + Display the version of qpdf. The version number displayed is the
  256 + one that is compiled into the qpdf library. If you don't see the
  257 + version number you expect, you may have more than one version of
  258 + :command:`qpdf` installed and may not have your library path set up
  259 + correctly.
  260 +
  261 +.. qpdf:option:: --copyright
  262 +
  263 + .. help: show copyright information
  264 +
  265 + Display copyright and license information.
  266 +
  267 + Display copyright and license information.
  268 +
  269 +.. qpdf:option:: --show-crypto
  270 +
  271 + .. help: show available crypto providers
  272 +
  273 + Show a list of available crypto providers, one per line. The
  274 + default provider is shown first.
  275 +
  276 + Show a list of available crypto providers, each on a line by
  277 + itself. The default provider is always listed first. See
  278 + :ref:`crypto` for more information about crypto providers.
  279 +
  280 +.. _general-options:
  281 +
  282 +General Options
  283 +---------------
  284 +
  285 +.. help-topic general: general options
  286 +
  287 + General options control qpdf's behavior in ways that are not
  288 + directly related to the operation it is performing.
  289 +
  290 +This section describes general options that control :command:`qpdf`'s
  291 +behavior. They are not necessarily related to the specific operation
  292 +that is being performed and may be used whether or not an output file
  293 +is being created.
  294 +
  295 +Related Options
  296 +~~~~~~~~~~~~~~~
  297 +
  298 +.. qpdf:option:: --password=password
  299 +
  300 + .. help: specify password
  301 +
  302 + Specify a password for an encrypted, password-protected file.
  303 + Not needed for encrypted files with no password.
  304 +
  305 + Specifies a password for accessing encrypted, password-protected
  306 + files. To read the password from a file or standard input, you can
  307 + use :qpdf:ref:`--password-file`. You can also use :samp:`@filename`
  308 + or :samp:`@-` (see :ref:`invocation`) to put the password in a file
  309 + or pass it via standard input, but you would do so by specifying
  310 + the entire :samp:`--password={password}` option in the file. Syntax
  311 + such as :samp:`--password=@filename` won't work since
  312 + :samp:`@filename` is not recognized in the middle of an argument.
  313 +
  314 + Prior to 8.4.0, in the case of passwords that contain characters that
  315 + fall outside of 7-bit US-ASCII, qpdf left the burden of supplying
  316 + properly encoded encryption and decryption passwords to the user.
  317 + Starting in qpdf 8.4.0, qpdf does this automatically in most cases.
  318 + For an in-depth discussion, please see :ref:`unicode-passwords`.
  319 + Previous versions of this manual described workarounds using the
  320 + :command:`iconv` command. Such workarounds are no longer required or
  321 + recommended with qpdf 8.4.0. However, for backward compatibility, qpdf
  322 + attempts to detect those workarounds and do the right thing in most
  323 + cases.
  324 +
  325 +.. qpdf:option:: --password-file=filename
  326 +
  327 + .. help: read password from a file
  328 +
  329 + The first line of the specified file is used as the password.
  330 + This is used in place of the --password option.
  331 +
148 332 Reads the first line from the specified file and uses it as the
149   - password for accessing encrypted files.
150   - :samp:`{filename}`
151   - may be ``-`` to read the password from standard input. Note that, in
152   - this case, the password is echoed and there is no prompt, so use with
  333 + password for accessing encrypted files. :samp:`{filename}` may be
  334 + ``-`` to read the password from standard input, but if you do that
  335 + the password is echoed and there is no prompt, so use ``-`` with
153 336 caution.
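
One way to prepare a file for ``--password-file`` is sketched below in Python. The helper name ``write_password_file`` is hypothetical, and the restrictive file mode is an extra precaution assumed here, not something qpdf requires.

```python
import os


def write_password_file(path, password):
    """Write the password as the first line of path and return the option.

    The file is created with mode 0o600 where the platform honors it,
    so that other users cannot read the password.
    """
    if "\n" in password or "\r" in password:
        raise ValueError("the password must fit on a single line")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as f:
        f.write(password + "\n")
    return "--password-file=" + path
```

The returned string can be added to the argument list in place of ``--password``.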
154 337  
155   -:samp:`--is-encrypted`
156   - Silently exit with status 0 if the file is encrypted or status 2 if
157   - the file is not encrypted. This is useful for shell scripts. Other
158   - options are ignored if this is given. This option is mutually
159   - exclusive with :samp:`--requires-password`. Both this
160   - option and :samp:`--requires-password` exit with
161   - status 2 for non-encrypted files.
162   -
163   -:samp:`--requires-password`
164   - Silently exit with status 0 if a password (other than as supplied) is
165   - required. Exit with status 2 if the file is not encrypted. Exit with
166   - status 3 if the file is encrypted but requires no password or the
167   - correct password has been supplied. This is useful for shell scripts.
168   - Note that any supplied password is used when opening the file. When
169   - used with a :samp:`--password` option, this option
170   - can be used to check the correctness of the password. In that case,
171   - an exit status of 3 means the file works with the supplied password.
172   - This option is mutually exclusive with
173   - :samp:`--is-encrypted`. Both this option and
174   - :samp:`--is-encrypted` exit with status 2 for
175   - non-encrypted files.
  338 +.. qpdf:option:: --verbose
  339 +
  340 + .. help: print additional information
  341 +
  342 + Output additional information about various things qpdf is
  343 + doing, including information about files created and operations
  344 + performed.
  345 +
  346 + Increase verbosity of output. This includes information about files
  347 + created, image optimization, and several other operations. In some
  348 + cases, it also displays additional information when inspection
  349 + options (see :ref:`inspection-options`) are used.
  350 +
  351 +.. qpdf:option:: --progress
  352 +
  353 + .. help: show progress when writing
  354 +
  355 + Indicate progress when writing files.
  356 +
  357 + Indicate progress while writing output files. Progress indication
  358 + does not start until writing starts, so if complicated
  359 + transformations are being applied before the write progress begins,
  360 + there may be a delay before progress indicators are seen.
  361 +
  362 +.. qpdf:option:: --no-warn
  363 +
  364 + .. help: suppress printing warning messages
  365 +
  366 + Suppress printing warning messages. If warnings were
  367 + encountered, qpdf still exits with exit status 3.
  368 + Use --warning-exit-0 with --no-warn to completely ignore
  369 + warnings.
  370 +
  371 + Suppress writing of warnings to stderr. If warnings were detected
  372 + and suppressed, :command:`qpdf` will still exit with exit code 3.
  373 + To completely ignore warnings, also specify
  374 + :qpdf:ref:`--warning-exit-0`. Use with caution as qpdf is not
  375 + always successful in recovering from situations that cause warnings
  376 + to be issued.
  377 +
  378 +.. qpdf:option:: --deterministic-id
  379 +
  380 + .. help: generate ID deterministically
  381 +
  382 + Generate a secure, random document ID only using static
  383 + information, such as the page contents. Does not use the file's
  384 + name or attributes or the current time.
  385 +
  386 + Generate a secure, random document ID using deterministic
  387 + values. This prevents use of timestamp and output file name
  388 + information in the ID generation. Instead, at some slight
  389 + additional runtime cost, the ID field is generated to include a
  390 + digest of the significant parts of the content of the output PDF
  391 + file. This means that a given qpdf operation should generate the
  392 + same ID each time it is run, which can be useful when caching
  393 + results or for generation of some test data. Use of this flag is
  394 + not compatible with creation of encrypted files.
  395 +
  396 + Note that there is *no guarantee* that different versions of qpdf
  397 + will generate the same deterministic ID given the same generation
  398 + code and input. While care is taken to avoid gratuitous changes,
  399 + new versions of qpdf may include changes that affect the output.
  400 + This option can be useful for testing. See also
  401 + :qpdf:ref:`--static-id`.
  402 +
  403 +.. qpdf:option:: --allow-weak-crypto
  404 +
  405 + .. help: allow insecure cryptographic algorithms
  406 +
  407 + Allow creation of files with weak cryptographic algorithms. This
  408 + option is necessary to create 40-bit files or 128-bit files that
  409 + use RC4 encryption.
176 410  
177   -:samp:`--verbose`
178   - Increase verbosity of output. For now, this just prints some
179   - indication of any file that it creates.
180   -
181   -:samp:`--progress`
182   - Indicate progress while writing files.
183   -
184   -:samp:`--no-warn`
185   - Suppress writing of warnings to stderr. If warnings were detected and
186   - suppressed, :command:`qpdf` will still exit with exit
187   - code 3. See also :samp:`--warning-exit-0`.
188   -
189   -:samp:`--warning-exit-0`
190   - If warnings are found but no errors, exit with exit code 0 instead 3.
191   - When combined with :samp:`--no-warn`, the effect is
192   - for :command:`qpdf` to completely ignore warnings.
193   -
194   -:samp:`--linearize`
195   - Causes generation of a linearized (web-optimized) output file.
196   -
197   -:samp:`--replace-input`
198   - If specified, the output file name should be omitted. This option
199   - tells qpdf to replace the input file with the output. It does this by
200   - writing to
201   - :file:`{infilename}.~qpdf-temp#`
202   - and, when done, overwriting the input file with the temporary file.
203   - If there were any warnings, the original input is saved as
204   - :file:`{infilename}.~qpdf-orig`.
205   -
206   -:samp:`--copy-encryption=file`
207   - Encrypt the file using the same encryption parameters, including user
208   - and owner password, as the specified file. Use
209   - :samp:`--encryption-file-password` to specify a
210   - password if one is needed to open this file. Note that copying the
211   - encryption parameters from a file also copies the first half of
212   - ``/ID`` from the file since this is part of the encryption
213   - parameters.
214   -
215   -:samp:`--encryption-file-password=password`
216   - If the file specified with :samp:`--copy-encryption`
217   - requires a password, specify the password using this option. Note
218   - that only one of the user or owner password is required. Both
219   - passwords will be preserved since QPDF does not distinguish between
220   - the two passwords. It is possible to preserve encryption parameters,
221   - including the owner password, from a file even if you don't know the
222   - file's owner password.
223   -
224   -:samp:`--allow-weak-crypto`
225 411 Starting with version 10.4, qpdf issues warnings when requested to
226 412 create files using RC4 encryption. This option suppresses those
227 413 warnings. In future versions of qpdf, qpdf will refuse to create
228   - files with weak cryptography when this flag is not given. See :ref:`weak-crypto` for additional details.
  414 + files with weak cryptography when this flag is not given. See
  415 + :ref:`weak-crypto` for additional details.
  416 +
  417 +.. qpdf:option:: --keep-files-open=[yn]
  418 +
  419 + .. help: manage keeping multiple files open
  420 +
  421 + When qpdf needs to work with many files, as when merging large
  422 + numbers of files, explicitly indicate whether files should be
  423 + kept open. The default behavior is to determine this based on
  424 + the number of files.
  425 +
  426 + This option controls whether qpdf keeps individual files open while
  427 + merging. By default, qpdf keeps files open when merging unless more
  428 + than 200 files are specified, in which case files are opened as
  429 + needed and closed when finished. The behavior of repeatedly opening
  430 + and closing files may impose a large performance penalty with some
  431 + file systems, especially networked file systems. If you know that
  432 + you have a large enough open file limit and are suffering from
  433 + performance problems, or if you have an open file limit smaller
  434 + than 200, you can use this option to override the default behavior
  435 + by specifying :samp:`--keep-files-open=y` to force :command:`qpdf`
  436 + to keep files open or :samp:`--keep-files-open=n` to force it to
  437 + only open files as needed. See also
  438 + :qpdf:ref:`--keep-files-open-threshold`.
  439 +
  440 + Historical note: prior to version 8.1.0, qpdf always kept all files
  441 + open, but this meant that the number of files that could be merged
  442 + was limited by the operating system's open file limit. Version
  443 + 8.1.0 opened files as they were referenced and closed them after
  444 + each read, but this caused a major performance impact. Version
  445 + 8.2.0 optimized the performance but did so in a way that, for local
  446 + file systems, there was a small but unavoidable performance hit,
  447 + but for networked file systems, the performance impact could be
  448 + very high. The current behavior was introduced in qpdf version
  449 + 8.2.1.
  450 +
  451 +.. qpdf:option:: --keep-files-open-threshold=count
  452 +
  453 + .. help: set threshold for --keep-files-open
  454 +
  455 + Set the threshold used by --keep-files-open, overriding the
  456 + default value of 200.
  457 +
  458 + If specified, overrides the default value of 200 used as the
  459 + threshold for qpdf deciding whether or not to keep files open. See
  460 + :qpdf:ref:`--keep-files-open` for details.
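
The interaction between ``--keep-files-open`` and ``--keep-files-open-threshold`` described above can be modeled as a small decision function. This Python sketch mirrors the documented behavior only. It is not qpdf's implementation, and the names are hypothetical.

```python
DEFAULT_THRESHOLD = 200  # documented default for --keep-files-open-threshold


def keeps_files_open(file_count, keep_files_open=None, threshold=DEFAULT_THRESHOLD):
    """Model whether files stay open while merging.

    keep_files_open is True for --keep-files-open=y, False for
    --keep-files-open=n, and None when the option is not given.
    """
    if keep_files_open is not None:
        # An explicit y/n overrides the threshold-based default.
        return keep_files_open
    # Default: keep files open unless more than `threshold` files are given.
    return file_count <= threshold
```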
  461 +
  462 +.. _advanced-control-options:
  463 +
  464 +Advanced Control Options
  465 +------------------------
  466 +
  467 +.. help-topic advanced-control: tweak qpdf's behavior
  468 +
  469 + Advanced control options control qpdf's behavior in ways that would
  470 + normally never be needed by a user but that may be useful to
  471 + developers or people investigating problems with specific files.
  472 +
  473 +Advanced control options control qpdf's behavior in ways that would
  474 +normally never be needed by a user but that may be useful to
  475 +developers or people investigating problems with specific files.
  476 +
  477 +Related Options
  478 +~~~~~~~~~~~~~~~
  479 +
  480 +.. qpdf:option:: --password-is-hex-key
229 481  
230   -:samp:`--encrypt options --`
231   - Causes generation an encrypted output file. Please see :ref:`encryption-options` for details on how to specify
232   - encryption parameters.
  482 + .. help: provide hex-encoded encryption key
233 483  
234   -:samp:`--decrypt`
235   - Removes any encryption on the file. A password must be supplied if
236   - the file is password protected.
  484 + Provide the underlying file encryption key as a hex-encoded
  485 + string rather than supplying a password. This is an expert
  486 + option.
237 487  
238   -:samp:`--password-is-hex-key`
239 488 Overrides the usual computation/retrieval of the PDF file's
240 489 encryption key from user/owner password with an explicit
241 490 specification of the encryption key. When this option is specified,
242   - the argument to the :samp:`--password` option is
243   - interpreted as a hexadecimal-encoded key value. This only applies to
244   - the password used to open the main input file. It does not apply to
245   - other files opened by :samp:`--pages` or other
246   - options or to files being written.
  491 + the argument to the :qpdf:ref:`--password` option is interpreted as
  492 + a hexadecimal-encoded key value. This only applies to the password
  493 + used to open the main input file. It does not apply to other files
  494 + opened by :qpdf:ref:`--pages` or other options or to files being
  495 + written.
247 496  
248 497 Most users will never have a need for this option, and no standard
249 498 viewers support this mode of operation, but it can be useful for
250 499 forensic or investigatory purposes. For example, if a PDF file is
251 500 encrypted with an unknown password, a brute-force attack using the
252   - key directly is sometimes more efficient than one using the password.
253   - Also, if a file is heavily damaged, it may be possible to derive the
254   - encryption key and recover parts of the file using it directly. To
255   - expose the encryption key used by an encrypted file that you can open
256   - normally, use the :samp:`--show-encryption-key`
257   - option.
258   -
259   -:samp:`--suppress-password-recovery`
  501 + key directly is sometimes more efficient than one using the
  502 + password. Also, if a file is heavily damaged, it may be possible to
  503 + derive the encryption key and recover parts of the file using it
  504 + directly. To expose the encryption key used by an encrypted file
  505 + that you can open normally, use the
  506 + :qpdf:ref:`--show-encryption-key` option.
  507 +
  508 +.. qpdf:option:: --suppress-password-recovery
  509 +
  510 + .. help: don't try different password encodings
  511 +
  512 + Suppress qpdf's behavior of attempting different encodings of a
  513 + password that contains non-ASCII Unicode characters if the first
  514 + attempt doesn't succeed.
  515 +
260 516 Ordinarily, qpdf attempts to automatically compensate for passwords
261 517 specified in the wrong character encoding. This option suppresses
262 518 that behavior. Under normal conditions, there are no reasons to use
263   - this option. See :ref:`unicode-passwords` for a
264   - discussion
  519 + this option. See :ref:`unicode-passwords` for a discussion.
  520 +
  521 +.. qpdf:option:: --password-mode={mode}
  522 +
  523 + .. help: tweak how qpdf encodes passwords
  524 +
  525 + Fine-tune how qpdf controls encoding of Unicode passwords. Valid
  526 + options are auto, bytes, hex-bytes, and unicode.
265 527  
266   -:samp:`--password-mode={mode}`
267 528 This option can be used to fine-tune how qpdf interprets Unicode
268 529 (non-ASCII) password strings passed on the command line. With the
269   - exception of the :samp:`hex-bytes` mode, these only
270   - apply to passwords provided when encrypting files. The
271   - :samp:`hex-bytes` mode also applies to passwords
272   - specified for reading files. For additional discussion of the
273   - supported password modes and when you might want to use them, see
274   - :ref:`unicode-passwords`. The following modes
275   - are supported:
276   -
277   - - :samp:`auto`: Automatically determine whether the
278   - specified password is a properly encoded Unicode (UTF-8) string,
279   - and transcode it as required by the PDF spec based on the type
  530 + exception of the :samp:`hex-bytes` mode, these only apply to
  531 + passwords provided when encrypting files. The :samp:`hex-bytes`
  532 + mode also applies to passwords specified for reading files. For
  533 + additional discussion of the supported password modes and when you
  534 + might want to use them, see :ref:`unicode-passwords`. The following
  535 + modes are supported:
  536 +
  537 + - :samp:`auto`: Automatically determine whether the specified
  538 + password is a properly encoded Unicode (UTF-8) string, and
  539 + transcode it as required by the PDF spec based on the type of
280 540 encryption being applied. On Windows starting with version 8.4.0,
281 541 and on almost all other modern platforms, incoming passwords will
282 542 be properly encoded in UTF-8, so this is almost always what you
283 543 want.
284 544  
285   - - :samp:`unicode`: Tells qpdf that the incoming
286   - password is UTF-8, overriding whatever its automatic detection
287   - determines. The only difference between this mode and
288   - :samp:`auto` is that qpdf will fail with an error
289   - message if the password is not valid UTF-8 instead of falling back
290   - to :samp:`bytes` mode with a warning.
291   -
292   - - :samp:`bytes`: Interpret the password as a literal
293   - byte string. For non-Windows platforms, this is what versions of
294   - qpdf prior to 8.4.0 did. For Windows platforms, there is no way to
295   - specify strings of binary data on the command line directly, but
296   - you can use the :samp:`@filename` option to do it,
297   - in which case this option forces qpdf to respect the string of
298   - bytes as provided. This option will allow you to encrypt PDF files
299   - with passwords that will not be usable by other readers.
300   -
301   - - :samp:`hex-bytes`: Interpret the password as a
302   - hex-encoded string. This provides a way to pass binary data as a
303   - password on all platforms including Windows. As with
304   - :samp:`bytes`, this option may allow creation of
305   - files that can't be opened by other readers. This mode affects
306   - qpdf's interpretation of passwords specified for decrypting files
307   - as well as for encrypting them. It makes it possible to specify
308   - strings that are encoded in some manner other than the system's
309   - default encoding.
310   -
311   -:samp:`--rotate=[+|-]angle[:page-range]`
312   - Apply rotation to specified pages. The
313   - :samp:`page-range` portion of the option value has
314   - the same format as page ranges in :ref:`page-selection`. If the page range is omitted, the
315   - rotation is applied to all pages. The :samp:`angle`
316   - portion of the parameter may be either 0, 90, 180, or 270. If
317   - preceded by :samp:`+` or :samp:`-`,
318   - the angle is added to or subtracted from the specified pages'
319   - original rotations. This is almost always what you want. Otherwise
320   - the pages' rotations are set to the exact value, which may cause the
321   - appearances of the pages to be inconsistent, especially for scans.
322   - For example, the command :command:`qpdf in.pdf out.pdf
323   - --rotate=+90:2,4,6 --rotate=180:7-8` would rotate pages
324   - 2, 4, and 6 90 degrees clockwise from their original rotation and
325   - force the rotation of pages 7 through 8 to 180 degrees regardless of
326   - their original rotation, and the command :command:`qpdf in.pdf
327   - out.pdf --rotate=+180` would rotate all pages by 180
328   - degrees.
329   -
330   -:samp:`--keep-files-open={[yn]}`
331   - This option controls whether qpdf keeps individual files open while
332   - merging. Prior to version 8.1.0, qpdf always kept all files open, but
333   - this meant that the number of files that could be merged was limited
334   - by the operating system's open file limit. Version 8.1.0 opened files
335   - as they were referenced and closed them after each read, but this
336   - caused a major performance impact. Version 8.2.0 optimized the
337   - performance but did so in a way that, for local file systems, there
338   - was a small but unavoidable performance hit, but for networked file
339   - systems, the performance impact could be very high. Starting with
340   - version 8.2.1, the default behavior is that files are kept open if no
341   - more than 200 files are specified, but this default behavior can be
342   - explicitly overridden with the
343   - :samp:`--keep-files-open` flag. If you are merging
344   - more than 200 files but less than the operating system's max open
345   - files limit, you may want to use
346   - :samp:`--keep-files-open=y`, especially if working
347   - over a networked file system. If you are using a local file system
348   - where the overhead is low and you might sometimes merge more than the
349   - OS limit's number of files from a script and are not worried about a
350   - few seconds additional processing time, you may want to specify
351   - :samp:`--keep-files-open=n`. The threshold for
352   - switching may be changed from the default 200 with the
353   - :samp:`--keep-files-open-threshold` option.
354   -
355   -:samp:`--keep-files-open-threshold={count}`
356   - If specified, overrides the default value of 200 used as the
357   - threshold for qpdf deciding whether or not to keep files open. See
358   - :samp:`--keep-files-open` for details.
359   -
360   -:samp:`--pages options --`
361   - Select specific pages from one or more input files. See :ref:`page-selection` for details on how to do
362   - page selection (splitting and merging).
363   -
364   -:samp:`--collate={n}`
365   - When specified, collate rather than concatenate pages from files
366   - specified with :samp:`--pages`. With a numeric
367   - argument, collate in groups of :samp:`{n}`.
368   - The default is 1. See :ref:`page-selection` for additional details.
369   -
370   -:samp:`--flatten-rotation`
371   - For each page that is rotated using the ``/Rotate`` key in the page's
372   - dictionary, remove the ``/Rotate`` key and implement the identical
373   - rotation semantics by modifying the page's contents. This option can
374   - be useful to prepare files for buggy PDF applications that don't
375   - properly handle rotated pages.
376   -
377   -:samp:`--split-pages=[n]`
378   - Write each group of :samp:`n` pages to a separate
379   - output file. If :samp:`n` is not specified, create
380   - single pages. Output file names are generated as follows:
  545 + - :samp:`unicode`: Tells qpdf that the incoming password is UTF-8,
  546 + overriding whatever its automatic detection determines. The only
  547 + difference between this mode and :samp:`auto` is that qpdf will
  548 + fail with an error message if the password is not valid UTF-8
  549 + instead of falling back to :samp:`bytes` mode with a warning.
  550 +
  551 + - :samp:`bytes`: Interpret the password as a literal byte string.
  552 + For non-Windows platforms, this is what versions of qpdf prior to
  553 + 8.4.0 did. For Windows platforms, there is no way to specify
  554 + strings of binary data on the command line directly, but you can
  555 + use a :samp:`@filename` option or :qpdf:ref:`--password-file` to
  556 + do it, in which case this option forces qpdf to respect the
  557 + string of bytes as provided. Note that this option may cause you
  558 + to encrypt PDF files with passwords that will not be usable by
  559 + other readers.
  560 +
  561 + - :samp:`hex-bytes`: Interpret the password as a hex-encoded
  562 + string. This provides a way to pass binary data as a password on
  563 + all platforms including Windows. As with :samp:`bytes`, this
  564 + option may allow creation of files that can't be opened by other
  565 + readers. This mode affects qpdf's interpretation of passwords
  566 + specified for decrypting files as well as for encrypting them. It
  567 + makes it possible to specify strings that are encoded in some
  568 + manner other than the system's default encoding.
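
As an illustration of the :samp:`hex-bytes` mode described above, the following minimal sketch hex-encodes a password in an explicitly chosen encoding and builds a matching argument list. The helper name, the sample password, and the file names are hypothetical; the sketch only constructs the arguments and does not run :command:`qpdf`.

```python
def to_hex_bytes_password(password: str, encoding: str) -> str:
    """Encode the password in the given encoding and return its bytes as hex digits."""
    return password.encode(encoding).hex()


# Hypothetical case: the file was encrypted with a Latin-1 encoded password,
# which differs from the UTF-8 bytes a modern shell would normally pass.
hex_password = to_hex_bytes_password("café", "latin-1")

# Argument list only; in.pdf and out.pdf are placeholder names.
args = [
    "qpdf",
    "--password-mode=hex-bytes",
    f"--password={hex_password}",
    "in.pdf",
    "out.pdf",
]
print(args)
```

The same password yields different hex strings under different encodings, which is why this mode is useful when the original encoding is not the system default.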
  569 +
  570 +.. qpdf:option:: --suppress-recovery
  571 +
  572 + .. help: suppress error recovery
  573 +
  574 + Avoid attempting to recover when errors are found in a file's
  575 + cross reference table or stream lengths.
  576 +
  577 + Prevents qpdf from attempting to reconstruct a file's cross
  578 + reference table when there are errors reading objects from the
  579 + file. Recovery is triggered by a variety of situations. While
  580 + usually successful, it uses heuristics that don't work on all
  581 + files. If this option is given, :command:`qpdf` fails on the first
  582 + error it encounters.
  583 +
  584 +.. qpdf:option:: --ignore-xref-streams
  585 +
  586 + .. help: use xref tables rather than streams
  587 +
  588 + Ignore any cross-reference streams in the file, falling back to
  589 + cross-reference tables or triggering document recovery.
  590 +
  591 + Tells qpdf to ignore any cross-reference streams, falling back to
  592 + any embedded cross-reference tables or triggering document
  593 + recovery. Ordinarily, qpdf reads cross-reference streams when they
  594 + are present in a PDF file. If this option is specified,
  595 + qpdf will ignore any cross-reference streams for hybrid PDF files.
  596 + The purpose of hybrid files is to make some content available to
  597 + viewers that are not aware of cross-reference streams. It is almost
  598 + never desirable to ignore them. The only time when you might want
  599 + to use this feature is if you are testing creation of hybrid PDF
  600 + files and wish to see how a PDF consumer that doesn't understand
  601 + object and cross-reference streams would interpret such a file.
  602 +
  603 +.. _transformation-options:
  604 +
  605 +PDF Transformation
  606 +------------------
381 607  
382   - - If the string ``%d`` appears in the output file name, it is
383   - replaced with a range of zero-padded page numbers starting from 1.
  608 +.. help-topic transformation: make structural PDF changes
384 609  
385   - - Otherwise, if the output file name ends in
386   - :file:`.pdf` (case insensitive), a zero-padded
387   - page range, preceded by a dash, is inserted before the file
388   - extension.
  610 + The options below tell qpdf to apply transformations that change
  611 + the structure without changing the content.
389 612  
390   - - Otherwise, the file name is appended with a zero-padded page range
391   - preceded by a dash.
  613 +The options discussed in this section tell qpdf to apply
  614 +transformations that change the structure of a PDF file without
  615 +changing the content. Examples include creating linearized
  616 +(web-optimized) files, adding or removing encryption, restructuring
  617 +files for older viewers, and rewriting files for human inspection,
  618 +among others. See also :ref:`modification-options`.
392 619  
393   - Page ranges are a single number in the case of single-page groups or
394   - two numbers separated by a dash otherwise. For example, if
395   - :file:`infile.pdf` has 12 pages
396   -
397   - - :command:`qpdf --split-pages infile.pdf %d-out`
398   - would generate files :file:`01-out` through
399   - :file:`12-out`
400   -
401   - - :command:`qpdf --split-pages=2 infile.pdf
402   - outfile.pdf` would generate files
403   - :file:`outfile-01-02.pdf` through
404   - :file:`outfile-11-12.pdf`
405   -
406   - - :command:`qpdf --split-pages infile.pdf
407   - something.else` would generate files
408   - :file:`something.else-01` through
409   - :file:`something.else-12`
410   -
411   - Note that outlines, threads, and other global features of the
412   - original PDF file are not preserved. For each page of output, this
413   - option creates an empty PDF and copies a single page from the output
414   - into it. If you require the global data, you will have to run
415   - :command:`qpdf` with the
416   - :samp:`--pages` option once for each file. Using
417   - :samp:`--split-pages` is much faster if you don't
418   - require the global data.
419   -
420   -:samp:`--overlay options --`
421   - Overlay pages from another file onto the output pages. See :ref:`overlay-underlay` for details on
422   - overlay/underlay.
423   -
424   -:samp:`--underlay options --`
425   - Overlay pages from another file onto the output pages. See :ref:`overlay-underlay` for details on
426   - overlay/underlay.
427   -
428   -Password-protected files may be opened by specifying a password. By
429   -default, qpdf will preserve any encryption data associated with a file.
430   -If :samp:`--decrypt` is specified, qpdf will attempt to
431   -remove any encryption information. If :samp:`--encrypt`
432   -is specified, qpdf will replace the document's encryption parameters
433   -with whatever is specified.
434   -
435   -Note that qpdf does not obey encryption restrictions already imposed on
436   -the file. Doing so would be meaningless since qpdf can be used to remove
437   -encryption from the file entirely. This functionality is not intended to
438   -be used for bypassing copyright restrictions or other restrictions
439   -placed on files by their producers.
440   -
441   -Prior to 8.4.0, in the case of passwords that contain characters that
442   -fall outside of 7-bit US-ASCII, qpdf left the burden of supplying
443   -properly encoded encryption and decryption passwords to the user.
444   -Starting in qpdf 8.4.0, qpdf does this automatically in most cases. For
445   -an in-depth discussion, please see :ref:`unicode-passwords`. Previous versions of this manual
446   -described workarounds using the :command:`iconv` command.
447   -Such workarounds are no longer required or recommended with qpdf 8.4.0.
448   -However, for backward compatibility, qpdf attempts to detect those
449   -workarounds and do the right thing in most cases.
  620 +Related Options
  621 +~~~~~~~~~~~~~~~
450 622  
451   -.. _encryption-options:
  623 +.. qpdf:option:: --linearize
452 624  
453   -Encryption Options
454   -------------------
  625 + .. help: linearize (web-optimize) output
455 626  
456   -To change the encryption parameters of a file, use the --encrypt flag.
457   -The syntax is
  627 + Create linearized (web-optimized) output files.
458 628  
459   -::
  629 + Create linearized (web-optimized) output files. Linearized files
  630 + are formatted in a way that allows compliant readers to begin
  631 + displaying a PDF file before it is fully downloaded. Ordinarily,
  632 + the entire file must be present before it can be rendered because
  633 + important cross-reference information typically appears at the end
  634 + of the file.
460 635  
461   - --encrypt user-password owner-password key-length [ restrictions ] --
  636 +.. qpdf:option:: --encrypt user owner key-length [ options ] --
462 637  
463   -Note that ":samp:`--`" terminates parsing of encryption
464   -flags and must be present even if no restrictions are present.
  638 + .. help: start encryption options
465 639  
466   -Either or both of the user password and the owner password may be empty
467   -strings. Starting in qpdf 10.2, qpdf defaults to not allowing creation
468   -of PDF files with a non-empty user password, an empty owner password,
469   -and a 256-bit key since such files can be opened with no password. If
470   -you want to create such files, specify the encryption option
471   -:samp:`--allow-insecure`, as described below.
  640 + Run qpdf --help=encryption for details.
472 641  
473   -The value for
474   -:samp:`{key-length}` may
475   -be 40, 128, or 256. The restriction flags are dependent upon key length.
476   -When no additional restrictions are given, the default is to be fully
477   -permissive.
  642 + This flag starts encryption options, used to create encrypted
  643 + files.
478 644  
479   -If :samp:`{key-length}`
480   -is 40, the following restriction options are available:
  645 + Please see :ref:`encryption-options` for details about creating
  646 + encrypted files.
  647 +
  648 +.. qpdf:option:: --decrypt
  649 +
  650 + .. help: remove encryption from input file
  651 +
  652 + Create an unencrypted output file even if the input file was
  653 + encrypted. Normally qpdf preserves whatever encryption was
  654 + present on the input file. This option overrides that behavior.
  655 +
  656 + Create an output file with no encryption even if the input file is
  657 + encrypted. This option overrides the default behavior of preserving
  658 + whatever encryption was present on the input file. This
  659 + functionality is not intended to be used for bypassing copyright
  660 + restrictions or other restrictions placed on files by their
  661 + producers. See also :qpdf:ref:`--copy-encryption`.
  662 +
  663 +.. qpdf:option:: --copy-encryption=file
  664 +
  665 + .. help: copy another file's encryption details
  666 +
  667 + Copy encryption details from the specified file instead of
  668 + preserving the input file's encryption. Use --encryption-file-password
  669 + to specify the encryption file's password.
  670 +
  671 + Copy encryption parameters, including the user password, the owner
  672 + password, and all security restrictions, from the specified file
  673 + instead of preserving encryption details from the input file. This
  674 + works even if only one of the user password or owner password is
  675 + known. If the encryption file requires a password, use the
  676 + :qpdf:ref:`--encryption-file-password` option to set it. Note that
  677 + copying the encryption parameters from a file also copies the first
  678 + half of ``/ID`` from the file since this is part of the encryption
  679 + parameters. This option can be useful if you need to decrypt a file
  680 + to make manual changes to it or to change it outside of qpdf, and
  681 + then you want to restore the original encryption on the file
  682 + without having to manually specify all the individual settings. See
  683 + also :qpdf:ref:`--decrypt`.
  684 +
  685 +.. qpdf:option:: --encryption-file-password=password
  686 +
  687 + .. help: supply password for --copy-encryption
  688 +
  689 + If the file named in --copy-encryption requires a password, use
  690 + this option to specify the password.
  691 +
  692 + If the file specified with :qpdf:ref:`--copy-encryption`
  693 + requires a password, specify the password using this option. This
  694 + option is necessary because the :qpdf:ref:`--password` option
  695 + applies to the input file, not the file from which encryption is
  696 + being copied.
  697 +
  698 +.. qpdf:option:: --qdf
  699 +
  700 + .. help: enable viewing PDF code in a text editor
  701 +
  702 + Create a PDF file suitable for viewing in a text editor and even
  703 + editing. This is to edit the PDF code, not the page contents.
  704 + All streams that can be uncompressed are uncompressed, and
  705 + content streams are normalized, among other changes. The
  706 + companion tool "fix-qdf" can be used to repair hand-edited QDF
  707 + files. QDF is a feature specific to the qpdf tool. There is a
  708 + chapter about it in the manual.
  709 +
  710 + Create a PDF file suitable for viewing and editing in a text
  711 + editor. This is to edit the PDF code, not the page contents. To
  712 + edit a QDF file, your text editor must preserve binary data. In a
  713 + QDF file, all streams that can be uncompressed are uncompressed,
  714 + and content streams are normalized, among other changes. The
  715 + companion tool :command:`fix-qdf` can be used to repair hand-edited
  716 + QDF files. QDF is a feature specific to the qpdf tool. For
  717 + additional information about QDF mode, see :ref:`qdf`. Note that
  718 + :qpdf:ref:`--linearize` disables QDF mode.
  719 +
  720 + QDF mode has full support for object streams, but sometimes it's
  721 + easier to locate a specific object if object streams are disabled.
  722 + When trying to understand some PDF construct by inspecting an
  723 + existing file, it can often be useful to combine :samp:`--qdf` with
  724 + :samp:`--object-streams=disable`.
  725 +
  726 + This flag changes some of the defaults of other options: stream
  727 + data is uncompressed, content streams are normalized, and
  728 + encryption is removed. These defaults can still be overridden by
  729 + specifying the appropriate options with :samp:`--qdf`.
  730 + Additionally, in QDF mode, stream lengths are stored as indirect
  731 + objects, objects are formatted in a less efficient but more
  732 + readable fashion, and the documents are interspersed with comments
  733 + that make it easier for the user to find things and also make it
  734 + possible for :command:`fix-qdf` to work properly. When editing QDF
  735 + files, it is not necessary to maintain the object formatting.
  736 +
  737 + When normalizing content, if qpdf runs into any lexical errors, it
  738 + will print a warning indicating that content may be damaged. If you
  739 + want to create QDF files without content normalization, you can run
  740 + with :samp:`--qdf --normalize-content=n`. You can also create a
  741 + non-QDF file with uncompressed streams using
  742 + :samp:`--stream-data=uncompress`. These will both uncompress all
  743 + the streams but will not attempt to normalize content. Please note
  744 + that if you are using content normalization or QDF mode for the
  745 + purpose of manually inspecting files, you don't have to care about
  746 + this.
  747 +
  748 + See also :qpdf:ref:`--no-original-object-ids`.
  749 +
  750 +.. qpdf:option:: --no-original-object-ids
  751 +
  752 + .. help: omit original object ID in qdf
  753 +
  754 + Omit comments in a QDF file indicating the object ID an object
  755 + had in the original file.
481 756  
482   -:samp:`--print=[yn]`
483   - Determines whether or not to allow printing.
  757 + Suppresses inclusion of original object ID comments in QDF files.
  758 + This can be useful when generating QDF files for test purposes,
  759 + particularly when comparing them to determine whether two PDF files
  760 + have identical content. The original object ID comment is there by
  761 + default because it makes it easier to trace objects back to the
  762 + original file.
  763 +
  764 +.. qpdf:option:: --compress-streams=[yn]
  765 +
  766 + .. help: compress uncompressed streams
  767 +
  768 + Setting --compress-streams=n prevents qpdf from compressing
  769 + uncompressed streams. This can be useful if you are leaving some
  770 + streams uncompressed intentionally.
  771 +
  772 + By default, or with :samp:`--compress-streams=y`, qpdf will
  773 + compress streams using the flate compression algorithm (used by zip
  774 + and gzip) unless those streams are compressed in some other way.
  775 + This analysis is made after qpdf attempts to uncompress streams and
  776 + is therefore closely related to :qpdf:ref:`--decode-level`. To
  777 + suppress this behavior and leave streams uncompressed, use
  778 + :samp:`--compress-streams=n`. In QDF mode (see :ref:`qdf` and
  779 + :qpdf:ref:`--qdf`), the default is to leave streams uncompressed.
  780 +
  781 +.. qpdf:option:: --decode-level=option
  782 +
  783 + .. help: control which streams to uncompress
  784 +
  785 + When uncompressing streams, control which types of compression
  786 + schemes should be uncompressed:
  787 + - none: don't uncompress anything
  788 + - generalized: uncompress streams compressed with a
  789 + general-purpose compression algorithm. This is the default.
  790 + - specialized: in addition to generalized, also uncompress
  791 + streams compressed with a special-purpose but non-lossy
  792 + compression scheme
  793 + - all: in addition to specialized, uncompress streams compressed
  794 + with lossy compression schemes like JPEG (DCT)
  795 + qpdf does not know how to uncompress all compression schemes.
484 796  
485   -:samp:`--modify=[yn]`
486   - Determines whether or not to allow document modification.
  797 + Controls which streams qpdf tries to decode. The default is
  798 + :samp:`generalized`.
487 799  
488   -:samp:`--extract=[yn]`
489   - Determines whether or not to allow text/image extraction.
  800 + The following options are available:
490 801  
491   -:samp:`--annotate=[yn]`
492   - Determines whether or not to allow comments and form fill-in and
493   - signing.
  802 + - :samp:`none`: do not attempt to decode any streams
494 803  
495   -If :samp:`{key-length}`
496   -is 128, the following restriction options are available:
  804 + - :samp:`generalized`: decode streams filtered with supported
  805 + generalized filters: ``/LZWDecode``, ``/FlateDecode``,
  806 + ``/ASCII85Decode``, and ``/ASCIIHexDecode``. We define
  807 + generalized filters as those to be used for general-purpose
  808 + compression or encoding, as opposed to filters specifically
  809 + designed for image data.
497 810  
498   -:samp:`--accessibility=[yn]`
499   - Determines whether or not to allow accessibility to visually
500   - impaired. The qpdf library disregards this field when AES is used or
501   - when 256-bit encryption is used. You should really never disable
502   - accessibility, but qpdf lets you do it in case you need to configure
503   - a file this way for testing purposes. The PDF spec says that
504   - conforming readers should disregard this permission and always allow
505   - accessibility.
  811 + - :samp:`specialized`: in addition to generalized, decode streams
  812 + with supported non-lossy specialized filters; currently this is
  813 + just ``/RunLengthDecode``
506 814  
507   -:samp:`--extract=[yn]`
508   - Determines whether or not to allow text/graphic extraction.
  815 + - :samp:`all`: in addition to generalized and specialized, decode
  816 + streams with supported lossy filters; currently this is just
  817 + ``/DCTDecode`` (JPEG)
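
The cumulative nature of the decode levels listed above can be summarized as a small lookup table. This is a sketch of the list as written, not of qpdf's internals: the names ``DECODE_LEVELS`` and ``is_decoded`` are made up for illustration, and the table does not model the special handling of streams already compressed with ``/FlateDecode``.

```python
# Filter groups exactly as listed for each --decode-level value.
GENERALIZED = {"/LZWDecode", "/FlateDecode", "/ASCII85Decode", "/ASCIIHexDecode"}
SPECIALIZED = {"/RunLengthDecode"}
LOSSY = {"/DCTDecode"}

# Each level includes everything from the levels before it.
DECODE_LEVELS = {
    "none": set(),
    "generalized": GENERALIZED,
    "specialized": GENERALIZED | SPECIALIZED,
    "all": GENERALIZED | SPECIALIZED | LOSSY,
}


def is_decoded(filter_name: str, level: str = "generalized") -> bool:
    """Return True if the given level covers the named filter."""
    return filter_name in DECODE_LEVELS[level]


print(is_decoded("/RunLengthDecode"))                 # default level: generalized
print(is_decoded("/RunLengthDecode", "specialized"))
```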
509 818  
510   -:samp:`--assemble=[yn]`
511   - Determines whether document assembly (rotation and reordering of
512   - pages) is allowed.
  819 + There are several filters that :command:`qpdf` does not support.
  820 + These are left untouched regardless of the option. Future versions
  821 + of qpdf may support additional filters.
513 822  
514   -:samp:`--annotate=[yn]`
515   - Determines whether modifying annotations is allowed. This includes
516   - adding comments and filling in form fields. Also allows editing of
517   - form fields if :samp:`--modify-other=y` is given.
  823 + Because the default value is ``generalized``, by default, when a stream
  824 + is encoded using non-lossy filters that qpdf understands and is not
  825 + already compressed using a good compression scheme, qpdf will
  826 + uncompress the stream. If ``--compress-streams=y`` is also in
  827 + effect, which is the default (see :qpdf:ref:`--compress-streams`),
  828 + the overall effect is that qpdf will recompress streams with
  829 + generalized filters using flate compression, effectively
  830 + eliminating LZW and ASCII-based filters. This is usually desirable
  831 + behavior but can be disabled with ``--decode-level=none``.
518 832  
519   -:samp:`--form=[yn]`
520   - Determines whether filling form fields is allowed.
  833 + As a special case, streams already compressed with ``/FlateDecode``
  834 + are not uncompressed and recompressed. You can change this behavior
  835 + with :qpdf:ref:`--recompress-flate`.
521 836  
522   -:samp:`--modify-other=[yn]`
523   - Allow all document editing except those controlled separately by the
524   - :samp:`--assemble`,
525   - :samp:`--annotate`, and
526   - :samp:`--form` options.
  837 +.. qpdf:option:: --stream-data=option
527 838  
528   -:samp:`--print={print-opt}`
529   - Controls printing access.
530   - :samp:`{print-opt}`
531   - may be one of the following:
  839 + .. help: control stream compression
532 840  
533   - - :samp:`full`: allow full printing
  841 + This option controls how streams are compressed in the output.
  842 + It is less granular than the newer options, --compress-streams
  843 + and --decode-level.
534 844  
535   - - :samp:`low`: allow low-resolution printing only
  845 + Options:
  846 + - compress: same as --compress-streams=y --decode-level=generalized
  847 + - preserve: same as --compress-streams=n --decode-level=none
  848 + - uncompress: same as --compress-streams=n --decode-level=generalized
536 849  
537   - - :samp:`none`: disallow printing
  850 + Controls transformation of stream data. This option predates the
  851 + :qpdf:ref:`--compress-streams` and :qpdf:ref:`--decode-level`
  852 + options. Those options can be used to achieve the same effect with
  853 + more control. The value of :samp:`{option}` may be one of the
  854 + following:
  855 +
  856 + - :samp:`compress`: recompress stream data when possible (default);
  857 + equivalent to :samp:`--compress-streams=y`
  858 + :samp:`--decode-level=generalized`. Does not recompress streams
  859 + already compressed with ``/FlateDecode`` unless
  860 + :qpdf:ref:`--recompress-flate` is also specified.
  861 +
  862 + - :samp:`preserve`: leave all stream data as is; equivalent to
  863 + :samp:`--compress-streams=n` :samp:`--decode-level=none`
  864 +
  865 + - :samp:`uncompress`: uncompress stream data compressed with
  866 + generalized filters when possible; equivalent to
  867 + :samp:`--compress-streams=n` :samp:`--decode-level=generalized`
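
The equivalences listed above translate directly into a lookup table. The following sketch expands a ``--stream-data`` value into the newer option pair; the function name ``expand_stream_data`` is hypothetical and is not part of qpdf.

```python
# Mapping taken from the equivalences stated for each --stream-data value.
STREAM_DATA_EQUIVALENTS = {
    "compress": ["--compress-streams=y", "--decode-level=generalized"],
    "preserve": ["--compress-streams=n", "--decode-level=none"],
    "uncompress": ["--compress-streams=n", "--decode-level=generalized"],
}


def expand_stream_data(option: str) -> list[str]:
    """Return the --compress-streams/--decode-level pair for a --stream-data value."""
    try:
        return list(STREAM_DATA_EQUIVALENTS[option])
    except KeyError:
        raise ValueError(f"unknown --stream-data value: {option!r}") from None


print(expand_stream_data("uncompress"))
```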
  868 +
  869 +.. qpdf:option:: --recompress-flate
  870 +
  871 + .. help: uncompress and recompress flate
538 872  
539   -:samp:`--modify={modify-opt}`
540   - Controls modify access. This way of controlling modify access has
541   - less granularity than new options added in qpdf 8.4.
542   - :samp:`{modify-opt}`
  873 + The default generalized compression scheme used by PDF is flate,
  874 + which is the same as used by zip and gzip. Usually qpdf just
  875 + leaves these alone. This option tells qpdf to uncompress and
  876 + recompress streams compressed with flate. This can be useful
  877 + when combined with --compression-level.
  878 +
  879 + The default generalized compression scheme used by PDF is flate
  880 + (``/FlateDecode``), which is the same as used by :command:`zip` and
  881 + :command:`gzip`. Usually qpdf just leaves these alone. This option
  882 + tells :command:`qpdf` to uncompress and recompress streams
  883 + compressed with flate. This can be useful when combined with
  884 + :qpdf:ref:`--compression-level`. Using this option may make
  885 + :command:`qpdf` much slower when writing output files.
  886 +
  887 +.. qpdf:option:: --compression-level=level
  888 +
  889 + .. help: set compression level for flate
  890 +
  891 + Set a compression level from 1 (least, fastest) to 9 (most,
  892 + slowest) when compressing files with flate (used in zip and
  893 + gzip), which is the default compression for most PDF files.
  894 + You need --recompress-flate with this option if you want to
  895 + change already compressed streams.
  896 +
  897 + When writing new streams that are compressed with ``/FlateDecode``,
  898 + use the specified compression level. The value of :samp:`level`
  899 + should be a number from 1 to 9 and is passed directly to zlib,
  900 + which implements deflate compression. Lower numbers compress less
  901 + and are faster; higher numbers compress more and are slower. Note
  902 + that :command:`qpdf` doesn't uncompress and recompress streams
  903 + compressed with flate by default. To have this option apply to
  904 + already compressed streams, you should also specify
  905 + :qpdf:ref:`--recompress-flate`. If your goal is to shrink the size
  906 + of PDF files, you should also use
  907 + :samp:`--object-streams=generate`.
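
Since the level is handed to zlib, its effect can be tried outside qpdf with Python's ``zlib`` module. This is only a rough illustration under that assumption: the sample data is an invented, highly repetitive content-stream fragment, and the helper ``deflate`` is hypothetical.

```python
import zlib


def deflate(data: bytes, level: int) -> bytes:
    """Compress data with zlib at a level in the 1..9 range described above."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    return zlib.compress(data, level)


# Invented sample: a short page-content fragment repeated many times.
sample = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 200

fast = deflate(sample, 1)   # least compression, fastest
small = deflate(sample, 9)  # most compression, slowest

# Both results decompress to the same bytes; only size and speed differ.
print(len(sample), len(fast), len(small))
```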
  908 +
  909 +.. qpdf:option:: --normalize-content=[yn]
  910 +
  911 + .. help: fix newlines in content streams
  912 +
  913 + Normalize newlines to UNIX-style newlines in PDF content
  914 + streams, which is useful for viewing them in a programmer's text
  915 + editor across multiple platforms. This is also turned on by
  916 + --qdf.
  917 +
  918 + Enables or disables normalization of newlines in PDF content
  919 + streams to UNIX-style newlines, which is useful for viewing files
  920 + in a programmer-friendly text editor across multiple platforms.
  921 + Content normalization is off by default, but is automatically
  922 + enabled by :qpdf:ref:`--qdf` (see also :ref:`qdf`). It is not
  923 + recommended to use this option for production use. If qpdf runs
  924 + into any lexical errors while normalizing content, it will print a
  925 + warning indicating that content may be damaged.
  926 +
  927 +.. qpdf:option:: --object-streams=mode
  928 +
  929 + .. help: control use of object streams
  930 +
  931 + Control what qpdf does regarding object streams. Options:
  932 + - preserve: preserve original object streams, if any (the default)
  933 + - disable: create output files with no object streams
  934 + - generate: create object streams, and compress objects when possible
  935 +
  936 + Controls handling of object streams. The value of :samp:`{mode}`
543 937 may be one of the following:
544 938  
545   - - :samp:`all`: allow full document modification
  939 + - :samp:`preserve`: preserve original object streams, if any (the
  940 + default)
546 941  
547   - - :samp:`annotate`: allow comment authoring, form
548   - operations, and document assembly
  942 + - :samp:`disable`: create output files with no object streams
549 943  
550   - - :samp:`form`: allow form field fill-in and signing
551   - and document assembly
  944 + - :samp:`generate`: create object streams, and compress objects
  945 + when possible
552 946  
553   - - :samp:`assembly`: allow document assembly only
  947 + Object streams are PDF streams that contain other objects. Putting
  948 + objects in object streams allows the PDF objects themselves to be
  949 + compressed, which can result in much smaller PDF files. Combining
  950 + this option with :qpdf:ref:`--compression-level` and
  951 + :qpdf:ref:`--recompress-flate` can often result in creation of
  952 + smaller PDF files.
554 953  
555   - - :samp:`none`: allow no modifications
  954 + Object streams, also known as compressed objects, were introduced
  955 + into the PDF specification at version 1.5 around 2003. Some ancient
  956 + PDF viewers may not support files with object streams. qpdf can be
  957 + used to transform files with object streams to files without object
  958 + streams or vice versa.
556 959  
557   - Using the :samp:`--modify` option does not allow you
558   - to create certain combinations of permissions such as allowing form
559   - filling but not allowing document assembly. Starting with qpdf 8.4,
560   - you can either just use the other options to control fields
561   - individually, or you can use something like :samp:`--modify=form
562   - --assembly=n` to fine tune.
  960 + In :samp:`preserve` mode, the relationship between objects and the
  961 + streams that contain them is preserved from the original file. If
  962 + the file has no object streams, qpdf will not add any. In
  963 + :samp:`disable` mode, all objects are written as regular,
  964 + uncompressed objects. The resulting file should be structurally
  965 + readable by older PDF viewers, though there is still a chance that
  966 + the file may contain other content that the older reader can't
  967 + support. In :samp:`generate` mode, qpdf will create its own object
  968 + streams. This will usually result in more compact PDF files. In
  969 + this mode, qpdf will also make sure the PDF version number in the
  970 + header is at least 1.5.
563 971  
564   -:samp:`--cleartext-metadata`
565   - If specified, any metadata stream in the document will be left
566   - unencrypted even if the rest of the document is encrypted. This also
567   - forces the PDF version to be at least 1.5.
  972 +.. qpdf:option:: --preserve-unreferenced
568 973  
569   -:samp:`--use-aes=[yn]`
570   - If :samp:`--use-aes=y` is specified, AES encryption
571   - will be used instead of RC4 encryption. This forces the PDF version
572   - to be at least 1.6.
573   -
574   -:samp:`--allow-insecure`
575   - From qpdf 10.2, qpdf defaults to not allowing creation of PDF files
576   - where the user password is non-empty, the owner password is empty,
577   - and a 256-bit key is in use. Files created in this way are insecure
578   - since they can be opened without a password. Users would ordinarily
579   - never want to create such files. If you are using qpdf to
580   - intentionally created strange files for testing (a definite valid use
581   - of qpdf!), this option allows you to create such insecure files.
582   -
583   -:samp:`--force-V4`
584   - Use of this option forces the ``/V`` and ``/R`` parameters in the
585   - document's encryption dictionary to be set to the value ``4``. As
586   - qpdf will automatically do this when required, there is no reason to
587   - ever use this option. It exists primarily for use in testing qpdf
588   - itself. This option also forces the PDF version to be at least 1.5.
  974 + .. help: preserve unreferenced objects
589 975  
590   -If :samp:`{key-length}`
591   -is 256, the minimum PDF version is 1.7 with extension level 8, and the
592   -AES-based encryption format used is the PDF 2.0 encryption method
593   -supported by Acrobat X. the same options are available as with 128 bits
594   -with the following exceptions:
  976 + Preserve all objects from the input even if not referenced.
595 977  
596   -:samp:`--use-aes`
597   - This option is not available with 256-bit keys. AES is always used
598   - with 256-bit encryption keys.
  978 + Tells qpdf to preserve objects that are not referenced when writing
  979 + the file. Ordinarily any object that is not referenced in a
  980 + traversal of the document from the trailer dictionary will be
  981 + discarded. This may be useful in working with some damaged files or
  982 + inspecting files with known unreferenced objects.
599 983  
600   -:samp:`--force-V4`
601   - This option is not available with 256 keys.
  984 + This flag is ignored for linearized files and has the effect of
  985 + causing objects in the new file to be written in order by object ID
  986 + from the original file. This does not mean that object numbers will
  987 + be the same since qpdf may create stream lengths as direct or
  988 + indirect differently from the original file, and the original file
  989 + may have gaps in its numbering.
602 990  
603   -:samp:`--force-R5`
604   - If specified, qpdf sets the minimum version to 1.7 at extension level
605   - 3 and writes the deprecated encryption format used by Acrobat version
606   - IX. This option should not be used in practice to generate PDF files
607   - that will be in general use, but it can be useful to generate files
608   - if you are trying to test proper support in another application for
609   - PDF files encrypted in this way.
  991 + See also :qpdf:ref:`--preserve-unreferenced-resources`, which does
  992 + something completely different.
610 993  
611   -The default for each permission option is to be fully permissive.
  994 +.. qpdf:option:: --remove-unreferenced-resources=option
  995 +
  996 + .. help: remove unreferenced page resources
  997 +
  998 + Remove from a page's resource dictionary any resources that are
  999 + not referenced in the page's contents. Options: "auto"
  1000 + (default), "yes", "no".
  1001 +
  1002 + Options: ``auto`` (the default), ``yes``, or ``no``.
  1003 +
  1004 + Starting with qpdf 8.1, when splitting pages, qpdf is able to
  1005 + attempt to remove images and fonts that are not used by a page even
  1006 + if they are referenced in the page's resources dictionary. When
  1007 + shared resources are in use, this behavior can greatly reduce the
  1008 + file sizes of split pages, but the analysis is very slow. In
  1009 + versions from 8.1 through 9.1.1, qpdf did this analysis by default.
  1010 + Starting in qpdf 10.0.0, if ``auto`` is used, qpdf does a quick
  1011 + analysis of the file to determine whether the file is likely to
  1012 + have unreferenced objects on pages, a pattern that frequently
  1013 + occurs when resource dictionaries are shared across multiple pages
  1014 + and rarely occurs otherwise. If it discovers this pattern, then it
  1015 + will attempt to remove unreferenced resources. Usually this means
  1016 + you get the slower splitting speed only when it's actually going to
  1017 + create smaller files. You can suppress removal of unreferenced
  1018 + resources altogether by specifying ``no`` or force it to do the
  1019 + full algorithm by specifying ``yes``.
612 1020  
613   -.. _page-selection:
  1021 + Other than cases in which you don't care about file size and care a
  1022 + lot about runtime, there are few reasons to use this option,
  1023 + especially now that ``auto`` mode is supported. One reason to use
  1024 + this is if you suspect that qpdf is removing resources it shouldn't
  1025 + be removing. If you encounter that case, please report it as a bug at
  1026 + https://github.com/qpdf/qpdf/issues/.
614 1027  
615   -Page Selection Options
616   -----------------------
  1028 +.. qpdf:option:: --preserve-unreferenced-resources
617 1029  
618   -Starting with qpdf 3.0, it is possible to split and merge PDF files by
619   -selecting pages from one or more input files. Whatever file is given as
620   -the primary input file is used as the starting point, but its pages are
621   -replaced with pages as specified.
  1030 + .. help: use --remove-unreferenced-resources=no
622 1031  
623   -::
  1032 + Synonym for --remove-unreferenced-resources=no. Use that instead.
624 1033  
625   - --pages input-file [ --password=password ] [ page-range ] [ ... ] --
626   -
627   -Multiple input files may be specified. Each one is given as the name of
628   -the input file, an optional password (if required to open the file), and
629   -the range of pages. Note that ":samp:`--`" terminates
630   -parsing of page selection flags.
631   -
632   -Starting with qpf 8.4, the special input file name
633   -":file:`.`" can be used as a shortcut for the
634   -primary input filename.
635   -
636   -For each file that pages should be taken from, specify the file, a
637   -password needed to open the file (if any), and a page range. The
638   -password needs to be given only once per file. If any of the input files
639   -are the same as the primary input file or the file used to copy
640   -encryption parameters (if specified), you do not need to repeat the
641   -password here. The same file can be repeated multiple times. If a file
642   -that is repeated has a password, the password only has to be given the
643   -first time. All non-page data (info, outlines, page numbers, etc.) are
644   -taken from the primary input file. To discard these, use
645   -:samp:`--empty` as the primary input.
646   -
647   -Starting with qpdf 5.0.0, it is possible to omit the page range. If qpdf
648   -sees a value in the place where it expects a page range and that value
649   -is not a valid range but is a valid file name, qpdf will implicitly use
650   -the range ``1-z``, meaning that it will include all pages in the file.
651   -This makes it possible to easily combine all pages in a set of files
652   -with a command like :command:`qpdf --empty out.pdf --pages \*.pdf
653   ---`.
654   -
655   -The page range is a set of numbers separated by commas, ranges of
656   -numbers separated dashes, or combinations of those. The character "z"
657   -represents the last page. A number preceded by an "r" indicates to count
658   -from the end, so ``r3-r1`` would be the last three pages of the
659   -document. Pages can appear in any order. Ranges can appear with a high
660   -number followed by a low number, which causes the pages to appear in
661   -reverse. Numbers may be repeated in a page range. A page range may be
662   -optionally appended with ``:even`` or ``:odd`` to indicate only the even
663   -or odd pages in the given range. Note that even and odd refer to the
664   -positions within the specified, range, not whether the original number
665   -is even or odd.
  1034 + This is a synonym for :samp:`--remove-unreferenced-resources=no`.
  1035 + See :qpdf:ref:`--remove-unreferenced-resources`.
666 1036  
667   -Example page ranges:
  1037 + See also :qpdf:ref:`--preserve-unreferenced`, which does something
  1038 + completely different. To reduce confusion, you should use
  1039 + :samp:`--remove-unreferenced-resources=no` instead.
668 1040  
669   -- ``1,3,5-9,15-12``: pages 1, 3, 5, 6, 7, 8, 9, 15, 14, 13, and 12 in
670   - that order.
  1041 +.. qpdf:option:: --newline-before-endstream
671 1042  
672   -- ``z-1``: all pages in the document in reverse
  1043 + .. help: force a newline before endstream
673 1044  
674   -- ``r3-r1``: the last three pages of the document
  1045 + Force an extra newline before endstream. Using this option enables
  1046 + qpdf to preserve PDF/A when rewriting such files.
675 1047  
676   -- ``r1-r3``: the last three pages of the document in reverse order
  1048 + Tell qpdf to insert a newline before the ``endstream`` keyword,
  1049 + not counted in the length, after any stream content even if the
  1050 + last character of the stream was a newline. This may result in two
  1051 + newlines in some cases. This is a requirement of PDF/A. While qpdf
  1052 + doesn't specifically know how to generate PDF/A-compliant PDFs,
  1053 + this at least prevents it from removing compliance on already
  1054 + compliant files.
677 1055  
678   -- ``1-20:even``: even pages from 2 to 20
  1056 +.. qpdf:option:: --coalesce-contents
679 1057  
680   -- ``5,7-9,12:odd``: pages 5, 8, and, 12, which are the pages in odd
681   - positions from among the original range, which represents pages 5, 7,
682   - 8, 9, and 12.
  1058 + .. help: combine content streams
683 1059  
684   -Starting in qpdf version 8.3, you can specify the
685   -:samp:`--collate` option. Note that this option is
686   - specified outside of :samp:`--pages ... --`. When
687   -:samp:`--collate` is specified, it changes the meaning
688   -of :samp:`--pages` so that the specified files, as
689   -modified by page ranges, are collated rather than concatenated. For
690   -example, if you add the files :file:`odd.pdf` and
691   -:file:`even.pdf` containing odd and even pages of a
692   -document respectively, you could run :command:`qpdf --collate odd.pdf
693   ---pages odd.pdf even.pdf -- all.pdf` to collate the pages.
694   -This would pick page 1 from odd, page 1 from even, page 2 from odd, page
695   -2 from even, etc. until all pages have been included. Any number of
696   -files and page ranges can be specified. If any file has fewer pages,
697   -that file is just skipped when its pages have all been included. For
698   -example, if you ran :command:`qpdf --collate --empty --pages a.pdf
699   -1-5 b.pdf 6-4 c.pdf r1 -- out.pdf`, you would get the
700   -following pages in this order:
  1060 + If a page has an array of content streams, concatenate them into
  1061 + a single content stream.
701 1062  
702   -- a.pdf page 1
  1063 + When a page's contents are split across multiple streams, this
  1064 + option causes qpdf to combine them into a single stream. Use of
  1065 + this option is never necessary for ordinary usage, but it can help
  1066 + when working with some files in some cases. For example, this can
  1067 + also be combined with QDF mode or content normalization to make it
  1068 + easier to look at all of a page's contents at once. It is common
  1069 + for PDF writers to create multiple content streams for a variety of
  1070 + reasons such as making it easier to modify page contents and
  1071 + splitting very large content streams so PDF viewers may be able to
  1072 + use less memory.
703 1073  
704   -- b.pdf page 6
  1074 +.. qpdf:option:: --externalize-inline-images
705 1075  
706   -- c.pdf last page
  1076 + .. help: convert inline to regular images
707 1077  
708   -- a.pdf page 2
  1078 + Convert inline images to regular images.
709 1079  
710   -- b.pdf page 5
  1080 + Convert inline images to regular images. By default, images whose
  1081 + data is at least 1,024 bytes are converted when this option is
  1082 + selected. Use :qpdf:ref:`--ii-min-bytes` to change the size
  1083 + threshold. This option is implicitly selected when
  1084 + :qpdf:ref:`--optimize-images` is selected unless
  1085 + :qpdf:ref:`--keep-inline-images` is also specified.
711 1086  
712   -- a.pdf page 3
  1087 +.. qpdf:option:: --ii-min-bytes=size-in-bytes
713 1088  
714   -- b.pdf page 4
  1089 + .. help: set minimum size for --externalize-inline-images
715 1090  
716   -- a.pdf page 4
  1091 + Don't externalize inline images smaller than this size. The
  1092 + default is 1,024. Use 0 for no minimum.
717 1093  
718   -- a.pdf page 5
  1094 + Avoid converting inline images whose size is below the specified
  1095 + minimum size to regular images. The default is 1,024 bytes. Use 0
  1096 + for no minimum.
719 1097  
720   -Starting in qpdf version 10.2, you may specify a numeric argument to
721   -:samp:`--collate`. With
722   -:samp:`--collate={n}`,
723   -pull groups of :samp:`{n}` pages from each file,
724   -again, stopping when there are no more pages. For example, if you ran
725   -:command:`qpdf --collate=2 --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf
726   -r1 -- out.pdf`, you would get the following pages in this
727   -order:
  1098 +.. qpdf:option:: --min-version=version
728 1099  
729   -- a.pdf page 1
  1100 + .. help: set minimum PDF version
730 1101  
731   -- a.pdf page 2
  1102 + Force the PDF version of the output to be at least the
  1103 + specified version.
732 1104  
733   -- b.pdf page 6
  1105 + Force the PDF version of the output file to be at least
  1106 + :samp:`{version}`. In other words, if the input file has a lower
  1107 + version than the specified version, the specified version will be
  1108 + used. If the input file has a higher version, the input file's
  1109 + original version will be used. It is seldom necessary to use this
  1110 + option since qpdf will automatically increase the version as needed
  1111 + when adding features that require newer PDF readers.
734 1112  
735   -- b.pdf page 5
  1113 + The version number may be expressed in the form
  1114 + :samp:`{major.minor.extension-level}`, in which case the version is
  1115 + interpreted as :samp:`{major.minor}` at extension level
  1116 + :samp:`{extension-level}`. For example, version ``1.7.8``
  1117 + represents version 1.7 at extension level 8. Note that minimal
  1118 + syntax checking is done on the command line. :command:`qpdf` does
  1119 + not check whether the specified version is actually required.
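
The :samp:`{major.minor.extension-level}` form described above can be sketched in a few lines of Python. This is only an illustration of the documented semantics, not qpdf's own parser: the names ``parse_pdf_version`` and ``effective_version`` are hypothetical, and treating a missing extension level as ``0`` is an assumption made for comparison purposes.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_pdf_version(text: str) -> Version:
    """Split "major.minor" or "major.minor.extension-level" into integers.

    A missing extension level is treated as 0 (an assumption of this sketch).
    """
    parts = text.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor = int(parts[0]), int(parts[1])
    extension_level = int(parts[2]) if len(parts) == 3 else 0
    return major, minor, extension_level


def effective_version(input_version: str, min_version: str) -> Version:
    """Model "at least" semantics: the higher of the two versions wins."""
    return max(parse_pdf_version(input_version), parse_pdf_version(min_version))
```

With these definitions, an input at version 1.4 combined with a minimum of ``1.7.8`` yields version 1.7 at extension level 8, while an input that is already at 2.0 keeps its own version.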
  1120 +
  1121 +.. qpdf:option:: --force-version=version
  1122 +
  1123 + .. help: set output PDF version
  1124 +
  1125 + Force the output PDF file's PDF version header to be the specified
  1126 + value, even if the file uses features that may not be available
  1127 + in that version.
  1128 +
  1129 + This option forces the PDF version to be the exact version
  1130 + specified *even when the file may have content that is not
  1131 + supported in that version*. The version number is interpreted in
  1132 + the same way as with :qpdf:ref:`--min-version` so that extension
  1133 + levels can be set. In some cases, forcing the output file's PDF
  1134 + version to be lower than that of the input file will cause qpdf to
  1135 + disable certain features of the document. Specifically, 256-bit
  1136 + keys are disabled if the version is less than 1.7 with extension
  1137 + level 8 (except R5 is disabled if less than 1.7 with extension
  1138 + level 3), AES encryption is disabled if the version is less than
  1139 + 1.6, cleartext metadata and object streams are disabled if less
  1140 + than 1.5, 128-bit encryption keys are disabled if less than 1.4,
  1141 + and all encryption is disabled if less than 1.3. Even with these
  1142 + precautions, qpdf won't be able to do things like eliminate use of
  1143 + newer image compression schemes, transparency groups, or other
  1144 + features that may have been added in more recent versions of PDF.
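
The version thresholds listed in the paragraph above are easier to scan as a table. The following Python sketch restates them as data. It is a reading aid under stated assumptions, not qpdf's implementation: ``FEATURE_MINIMUMS`` and ``disabled_features`` are hypothetical names, and versions are modeled as ``(major, minor, extension_level)`` tuples with ``0`` meaning no extension level.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]

# (minimum version, feature) pairs transcribed from the paragraph above.
FEATURE_MINIMUMS: List[Tuple[Version, str]] = [
    ((1, 7, 8), "256-bit keys"),
    ((1, 7, 3), "256-bit keys (R5 format)"),
    ((1, 6, 0), "AES encryption"),
    ((1, 5, 0), "cleartext metadata"),
    ((1, 5, 0), "object streams"),
    ((1, 4, 0), "128-bit encryption keys"),
    ((1, 3, 0), "all encryption"),
]


def disabled_features(forced: Version) -> List[str]:
    """Return the features whose minimum version exceeds the forced version."""
    return [name for minimum, name in FEATURE_MINIMUMS if forced < minimum]
```

For example, ``disabled_features((1, 4, 0))`` lists both 256-bit entries, AES encryption, cleartext metadata, and object streams, while 128-bit keys remain available.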
736 1145  
737   -- c.pdf last page
  1146 + As a general rule, with the exception of big structural things like
  1147 + the use of object streams or AES encryption, PDF viewers are
  1148 + supposed to ignore features in files that they don't support from
  1149 + newer versions. This means that forcing the version to a lower
  1150 + version may make it possible to open your PDF file with an older
  1151 + version, though bear in mind that some of the original document's
  1152 + functionality may be lost.
738 1153  
739   -- a.pdf page 3
  1154 +.. _page-ranges:
740 1155  
741   -- a.pdf page 4
  1156 +Page Ranges
  1157 +-----------
742 1158  
743   -- b.pdf page 4
  1159 +.. help-topic page-ranges: page range syntax
744 1160  
745   -- a.pdf page 5
  1161 + A full description of the page range syntax, with examples, can be
  1162 + found in the manual. Summary:
746 1163  
747   -Starting in qpdf version 8.3, when you split and merge files, any page
748   -labels (page numbers) are preserved in the final file. It is expected
749   -that more document features will be preserved by splitting and merging.
750   -In the mean time, semantics of splitting and merging vary across
751   -features. For example, the document's outlines (bookmarks) point to
752   -actual page objects, so if you select some pages and not others,
753   -bookmarks that point to pages that are in the output file will work, and
754   -remaining bookmarks will not work. A future version of
755   -:command:`qpdf` may do a better job at handling these
756   -issues. (Note that the qpdf library already contains all of the APIs
757   -required in order to implement this in your own application if you need
758   -it.) In the mean time, you can always use
759   -:samp:`--empty` as the primary input file to avoid
760   -copying all of that from the first file. For example, to take pages 1
761   -through 5 from a :file:`infile.pdf` while preserving
762   -all metadata associated with that file, you could use
  1164 + - a,b,c pages a, b, and c
  1165 + - a-b pages a through b inclusive; if a > b, this counts down
  1166 + r<n> the <n>th page from the end, where <n> is a number
  1167 + - z the last page, same as r1
763 1168  
764   -::
  1169 + You can append :even or :odd to select every other page from the
  1170 + resulting set of pages, where :odd starts with the first page and
  1171 + :even starts with the second page. These are odd and even pages
  1172 + from the resulting set, not based on the original page numbers.
765 1173  
766   - qpdf infile.pdf --pages . 1-5 -- outfile.pdf
  1174 +Several :command:`qpdf` command-line arguments accept page ranges as
  1175 +options. This section describes the syntax of a page range.
767 1176  
768   -If you wanted pages 1 through 5 from
769   -:file:`infile.pdf` but you wanted the rest of the
770   -metadata to be dropped, you could instead run
  1177 +- A plain number indicates a page numbered from ``1``, so ``1``
  1178 + represents the first page.
771 1179  
772   -::
  1180 +- A number preceded by ``r`` counts from the end, so ``r1`` is the
  1181 + last page, ``r2`` is the second-to-last page, etc.
773 1182  
774   - qpdf --empty --pages infile.pdf 1-5 -- outfile.pdf
  1183 +- The letter ``z`` represents the last page and is the same as ``r1``.
775 1184  
776   -If you wanted to take pages 1 through 5 from
777   -:file:`file1.pdf` and pages 11 through 15 from
778   -:file:`file2.pdf` in reverse, taking document-level
779   -metadata from :file:`file2.pdf`, you would run
  1185 +- Page numbers may appear in any order separated by commas.
780 1186  
781   -::
  1187 +- Two page numbers separated by a dash represent the inclusive range
  1188 + of pages from the first to the second. If the first number is higher
  1189 + than the second number, it is the range of pages in reverse.
782 1190  
783   - qpdf file2.pdf --pages file1.pdf 1-5 . 15-11 -- outfile.pdf
  1191 +- The range may be appended with ``:odd`` or ``:even`` to select only
  1192 + pages from the resulting range in odd or even positions. In this
  1193 + case, odd and even refer to positions in the final range, not
  1194 + whether the original page number is odd or even.
784 1195  
785   -If, for some reason, you wanted to take the first page of an encrypted
786   -file called :file:`encrypted.pdf` with password
787   -``pass`` and repeat it twice in an output file, and if you wanted to
788   -drop document-level metadata but preserve encryption, you would use
  1196 +Example page ranges:
789 1197  
790   -::
  1198 +- ``1,6,4``: pages 1, 6, and 4 in that order
791 1199  
792   - qpdf --empty --copy-encryption=encrypted.pdf \
793   - --encryption-file-password=pass \
794   - --pages encrypted.pdf --password=pass 1 \
795   - ./encrypted.pdf --password=pass 1 -- \
796   - outfile.pdf
797   -
798   -Note that we had to specify the password all three times because giving
799   -a password as :samp:`--encryption-file-password` doesn't
800   -count for page selection, and as far as qpdf is concerned,
801   -:file:`encrypted.pdf` and
802   -:file:`./encrypted.pdf` are separated files. These
803   -are all corner cases that most users should hopefully never have to be
804   -bothered with.
805   -
806   -Prior to version 8.4, it was not possible to specify the same page from
807   -the same file directly more than once, and the workaround of specifying
808   -the same file in more than one way was required. Version 8.4 removes
809   -this limitation, but there is still a valid use case. When you specify
810   -the same page from the same file more than once, qpdf will share objects
811   -between the pages. If you are going to do further manipulation on the
812   -file and need the two instances of the same original page to be deep
813   -copies, then you can specify the file in two different ways. For example
814   -:command:`qpdf in.pdf --pages . 1 ./in.pdf 1 -- out.pdf`
815   -would create a file with two copies of the first page of the input, and
816   -the two copies would share any objects in common. This includes fonts,
817   -images, and anything else the page references.
  1200 +- ``3-7``: pages 3 through 7 inclusive in increasing order
818 1201  
819   -.. _overlay-underlay:
  1202 +- ``7-3``: pages 7, 6, 5, 4, and 3 in that order
820 1203  
821   -Overlay and Underlay Options
822   -----------------------------
  1204 +- ``1-z``: all pages in order
823 1205  
824   -Starting with qpdf 8.4, it is possible to overlay or underlay pages from
825   -other files onto the output generated by qpdf. Specify overlay or
826   -underlay as follows:
  1206 +- ``z-1``: all pages in reverse order
827 1207  
828   -::
  1208 +- ``1,3,5-9,15-12``: pages 1, 3, 5, 6, 7, 8, 9, 15, 14, 13, and 12 in
  1209 + that order
829 1210  
830   - { --overlay | --underlay } file [ options ] --
  1211 +- ``r3-r1``: the last three pages of the document
831 1212  
832   -Overlay and underlay options are processed late, so they can be combined
833   -with other like merging and will apply to the final output. The
834   -:samp:`--overlay` and :samp:`--underlay`
835   -options work the same way, except underlay pages are drawn underneath
836   -the page to which they are applied, possibly obscured by the original
837   -page, and overlay files are drawn on top of the page to which they are
838   -applied, possibly obscuring the page. You can combine overlay and
839   -underlay.
  1213 +- ``r1-r3``: the last three pages of the document in reverse order
840 1214  
841   -The default behavior of overlay and underlay is that pages are taken
842   -from the overlay/underlay file in sequence and applied to corresponding
843   -pages in the output until there are no more output pages. If the overlay
844   -or underlay file runs out of pages, remaining output pages are left
845   -alone. This behavior can be modified by options, which are provided
846   -between the :samp:`--overlay` or
847   -:samp:`--underlay` flag and the
848   -:samp:`--` option. The following options are supported:
849   -
850   -- :samp:`--password=password`: supply a password if the
851   - overlay/underlay file is encrypted.
852   -
853   -- :samp:`--to=page-range`: a range of pages in the same
854   - form at described in :ref:`page-selection`
855   - indicates which pages in the output should have the overlay/underlay
856   - applied. If not specified, overlay/underlay are applied to all pages.
857   -
858   -- :samp:`--from=[page-range]`: a range of pages that
859   - specifies which pages in the overlay/underlay file will be used for
860   - overlay or underlay. If not specified, all pages will be used. This
861   - can be explicitly specified to be empty if
862   - :samp:`--repeat` is used.
863   -
864   -- :samp:`--repeat=page-range`: an optional range of
865   - pages that specifies which pages in the overlay/underlay file will be
866   - repeated after the "from" pages are used up. If you want to repeat a
867   - range of pages starting at the beginning, you can explicitly use
868   - :samp:`--from=`.
869   -
870   -Here are some examples.
871   -
872   -- :command:`--overlay o.pdf --to=1-5 --from=1-3 --repeat=4
873   - --`: overlay the first three pages from file
874   - :file:`o.pdf` onto the first three pages of the
875   - output, then overlay page 4 from :file:`o.pdf`
876   - onto pages 4 and 5 of the output. Leave remaining output pages
877   - untouched.
  1215 +- ``1-20:even``: even pages from 2 to 20
878 1216  
879   -- :command:`--underlay footer.pdf --from= --repeat=1,2
880   - --`: Underlay page 1 of
881   - :file:`footer.pdf` on all odd output pages, and
882   - underlay page 2 of :file:`footer.pdf` on all even
883   - output pages.
  1217 +- ``5,7-9,12``: pages 5, 7, 8, 9, and 12
884 1218  
885   -.. _attachments:
  1219 +- ``5,7-9,12:odd``: pages 5, 8, and 12, which are the pages in odd
  1220 + positions from the original set of 5, 7, 8, 9, 12
886 1221  
887   -Embedded Files/Attachments Options
888   -----------------------------------
889   -
890   -Starting with qpdf 10.2, you can work with file attachments in PDF files
891   -from the command line. The following options are available:
892   -
893   -:samp:`--list-attachments`
894   - Show the "key" and stream number for embedded files. With
895   - :samp:`--verbose`, additional information, including
896   - preferred file name, description, dates, and more are also displayed.
897   - The key is usually but not always equal to the file name, and is
898   - needed by some of the other options.
899   -
900   -:samp:`--show-attachment={key}`
901   - Write the contents of the specified attachment to standard output as
902   - binary data. The key should match one of the keys shown by
903   - :samp:`--list-attachments`. If specified multiple
904   - times, only the last attachment will be shown.
905   -
906   -:samp:`--add-attachment {file} {options} --`
907   - Add or replace an attachment with the contents of
908   - :samp:`{file}`. This may be specified more
909   - than once. The following additional options may appear before the
910   - ``--`` that ends this option:
911   -
912   - :samp:`--key={key}`
913   - The key to use to register the attachment in the embedded files
914   - table. Defaults to the last path element of
915   - :samp:`{file}`.
916   -
917   - :samp:`--filename={name}`
918   - The file name to be used for the attachment. This is what is
919   - usually displayed to the user and is the name most graphical PDF
920   - viewers will use when saving a file. It defaults to the last path
921   - element of :samp:`{file}`.
922   -
923   - :samp:`--creationdate={date}`
924   - The attachment's creation date in PDF format; defaults to the
925   - current time. The date format is explained below.
926   -
927   - :samp:`--moddate={date}`
928   - The attachment's modification date in PDF format; defaults to the
929   - current time. The date format is explained below.
930   -
931   - :samp:`--mimetype={type/subtype}`
932   - The mime type for the attachment, e.g. ``text/plain`` or
933   - ``application/pdf``. Note that the mimetype appears in a field
934   - called ``/Subtype`` in the PDF but actually includes the full type
935   - and subtype of the mime type.
936   -
937   - :samp:`--description={"text"}`
938   - Descriptive text for the attachment, displayed by some PDF
939   - viewers.
940   -
941   - :samp:`--replace`
942   - Indicates that any existing attachment with the same key should be
943   - replaced by the new attachment. Otherwise,
944   - :command:`qpdf` gives an error if an attachment
945   - with that key is already present.
946   -
947   -:samp:`--remove-attachment={key}`
948   - Remove the specified attachment. This doesn't only remove the
949   - attachment from the embedded files table but also clears out the file
950   - specification. That means that any potential internal links to the
951   - attachment will be broken. This option may be specified multiple
952   - times. Run with :samp:`--verbose` to see status of
953   - the removal.
954   -
955   -:samp:`--copy-attachments-from {file} {options} --`
956   - Copy attachments from another file. This may be specified more than
957   - once. The following additional options may appear before the ``--``
958   - that ends this option:
959   -
960   - :samp:`--password={password}`
961   - If required, the password needed to open
962   - :samp:`{file}`
963   -
964   - :samp:`--prefix={prefix}`
965   - Only required if the file from which attachments are being copied
966   - has attachments with keys that conflict with attachments already
967   - in the file. In this case, the specified prefix will be prepended
968   - to each key. This affects only the key in the embedded files
969   - table, not the file name. The PDF specification doesn't preclude
970   - multiple attachments having the same file name.
971   -
972   -When a date is required, the date should conform to the PDF date format
973   -specification, which is
974   -``D:``\ :samp:`{yyyymmddhhmmss<z>}`, where
975   -:samp:`{<z>}` is either ``Z`` for UTC or a
976   -timezone offset in the form :samp:`{-hh'mm'}` or
977   -:samp:`{+hh'mm'}`. Examples:
978   -``D:20210207161528-05'00'``, ``D:20210207211528Z``.
979   -
980   -.. _advanced-parsing:
981   -
982   -Advanced Parsing Options
983   -------------------------
  1222 +- ``5,7-9,12:even``: pages 7 and 9, which are the pages in even
  1223 + positions from the original set of 5, 7, 8, 9, 12
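The ``:even`` example above can be checked with a small sketch. This is illustrative only and is not how qpdf itself parses ranges: the function name ``expand_page_range`` is hypothetical, and the sketch handles only comma-separated numbers and ascending ``first-last`` ranges with an optional ``:even`` or ``:odd`` suffix.

```python
def expand_page_range(spec: str) -> list[int]:
    """Expand a simplified page range such as '5,7-9,12:even'.

    Hypothetical helper for illustration; it does not cover the full
    page-range syntax accepted by qpdf.
    """
    body, _, modifier = spec.partition(":")
    pages: list[int] = []
    for part in body.split(","):
        first, dash, last = part.partition("-")
        if dash:
            # Ascending range such as 7-9, inclusive at both ends.
            pages.extend(range(int(first), int(last) + 1))
        else:
            pages.append(int(first))
    # The modifier selects by position within the expanded list
    # (counting from 1), not by the page number itself.
    if modifier == "even":
        return pages[1::2]
    if modifier == "odd":
        return pages[0::2]
    return pages


print(expand_page_range("5,7-9,12:even"))  # [7, 9]
```

The point to notice is that ``:even`` filters by position in the selected set, which is why pages 7 and 9 are kept even though both are odd page numbers.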
984 1224  
985   -These options control aspects of how qpdf reads PDF files. Mostly these
986   -are of use to people who are working with damaged files. There is little
987   -reason to use these options unless you are trying to solve specific
988   -problems. The following options are available:
989   -
990   -:samp:`--suppress-recovery`
991   - Prevents qpdf from attempting to recover damaged files.
992   -
993   -:samp:`--ignore-xref-streams`
994   - Tells qpdf to ignore any cross-reference streams.
995   -
996   -Ordinarily, qpdf will attempt to recover from certain types of errors in
997   -PDF files. These include errors in the cross-reference table, certain
998   -types of object numbering errors, and certain types of stream length
999   -errors. Sometimes, qpdf may think it has recovered but may not have
1000   -actually recovered, so care should be taken when using this option as
1001   -some data loss is possible. The
1002   -:samp:`--suppress-recovery` option will prevent qpdf
1003   -from attempting recovery. In this case, it will fail on the first error
1004   -that it encounters.
1005   -
1006   -Ordinarily, qpdf reads cross-reference streams when they are present in
1007   -a PDF file. If :samp:`--ignore-xref-streams` is
1008   -specified, qpdf will ignore any cross-reference streams for hybrid PDF
1009   -files. The purpose of hybrid files is to make some content available to
1010   -viewers that are not aware of cross-reference streams. It is almost
1011   -never desirable to ignore them. The only time when you might want to use
1012   -this feature is if you are testing creation of hybrid PDF files and wish
1013   -to see how a PDF consumer that doesn't understand object and
1014   -cross-reference streams would interpret such a file.
1015   -
1016   -.. _advanced-transformation:
1017   -
1018   -Advanced Transformation Options
1019   --------------------------------
1020   -
1021   -These transformation options control fine points of how qpdf creates the
1022   -output file. Mostly these are of use only to people who are very
1023   -familiar with the PDF file format or who are PDF developers. The
1024   -following options are available:
1025   -
1026   -:samp:`--compress-streams={[yn]}`
1027   - By default, or with :samp:`--compress-streams=y`,
1028   - qpdf will compress any stream with no other filters applied to it
1029   - with the ``/FlateDecode`` filter when it writes it. To suppress this
1030   - behavior and preserve uncompressed streams as uncompressed, use
1031   - :samp:`--compress-streams=n`.
1032   -
1033   -:samp:`--decode-level={option}`
1034   - Controls which streams qpdf tries to decode. The default is
1035   - :samp:`generalized`. The following options are
1036   - available:
  1225 +.. _modification-options:
1037 1226  
1038   - - :samp:`none`: do not attempt to decode any streams
  1227 +PDF Modification
  1228 +----------------
1039 1229  
1040   - - :samp:`generalized`: decode streams filtered with
1041   - supported generalized filters: ``/LZWDecode``, ``/FlateDecode``,
1042   - ``/ASCII85Decode``, and ``/ASCIIHexDecode``. We define generalized
1043   - filters as those to be used for general-purpose compression or
1044   - encoding, as opposed to filters specifically designed for image
1045   - data. Note that, by default, streams already compressed with
1046   - ``/FlateDecode`` are not uncompressed and recompressed unless you
1047   - also specify :samp:`--recompress-flate`.
  1230 +.. help-topic modification: change parts of the PDF
1048 1231  
1049   - - :samp:`specialized`: in addition to generalized,
1050   - decode streams with supported non-lossy specialized filters;
1051   - currently this is just ``/RunLengthDecode``
  1232 + Modification options make systematic changes to certain parts of
  1233 + the PDF, causing the PDF to render differently from the original.
1052 1234  
1053   - - :samp:`all`: in addition to generalized and
1054   - specialized, decode streams with supported lossy filters;
1055   - currently this is just ``/DCTDecode`` (JPEG)
  1235 +Modification options make systematic changes to certain parts of the
  1236 +PDF, causing the PDF to render differently from the original. See also
  1237 +:ref:`transformation-options`.
1056 1238  
1057   -:samp:`--stream-data={option}`
1058   - Controls transformation of stream data. This option predates the
1059   - :samp:`--compress-streams` and
1060   - :samp:`--decode-level` options. Those options can be
1061   - used to achieve the same affect with more control. The value of
1062   - :samp:`{option}` may
1063   - be one of the following:
1064   -
1065   - - :samp:`compress`: recompress stream data when
1066   - possible (default); equivalent to
1067   - :samp:`--compress-streams=y`
1068   - :samp:`--decode-level=generalized`. Does not
1069   - recompress streams already compressed with ``/FlateDecode`` unless
1070   - :samp:`--recompress-flate` is also specified.
1071   -
1072   - - :samp:`preserve`: leave all stream data as is;
1073   - equivalent to :samp:`--compress-streams=n`
1074   - :samp:`--decode-level=none`
1075   -
1076   - - :samp:`uncompress`: uncompress stream data
1077   - compressed with generalized filters when possible; equivalent to
1078   - :samp:`--compress-streams=n`
1079   - :samp:`--decode-level=generalized`
1080   -
1081   -:samp:`--recompress-flate`
1082   - By default, streams already compressed with ``/FlateDecode`` are left
1083   - alone rather than being uncompressed and recompressed. This option
1084   - causes qpdf to uncompress and recompress the streams. There is a
1085   - significant performance cost to using this option, but you probably
1086   - want to use it if you specify
1087   - :samp:`--compression-level`.
1088   -
1089   -:samp:`--compression-level={level}`
1090   - When writing new streams that are compressed with ``/FlateDecode``,
1091   - use the specified compression level. The value of
1092   - :samp:`level` should be a number from 1 to 9 and is
1093   - passed directly to zlib, which implements deflate compression. Note
1094   - that qpdf doesn't uncompress and recompress streams by default. To
1095   - have this option apply to already compressed streams, you should also
1096   - specify :samp:`--recompress-flate`. If your goal is
1097   - to shrink the size of PDF files, you should also use
1098   - :samp:`--object-streams=generate`.
  1239 +Related Options
  1240 +~~~~~~~~~~~~~~~
1099 1241  
1100   -:samp:`--normalize-content=[yn]`
1101   - Enables or disables normalization of content streams. Content
1102   - normalization is enabled by default in QDF mode. Please see :ref:`qdf` for additional discussion of QDF mode.
  1242 +.. qpdf:option:: --pages file [ --password=password ] [ page-range ] [ ... ] --
1103 1243  
1104   -:samp:`--object-streams={mode}`
1105   - Controls handling of object streams. The value of
1106   - :samp:`{mode}` may be
1107   - one of the following:
  1244 + .. help: begin page selection
1108 1245  
1109   - - :samp:`preserve`: preserve original object streams
1110   - (default)
  1246 + Run qpdf --help=page-selection for details.
1111 1247  
1112   - - :samp:`disable`: don't write any object streams
  1248 + This flag starts page selection options, which are used to select
  1249 + pages from one or more input files to perform operations such as
  1250 + splitting, merging, and collating files.
1113 1251  
1114   - - :samp:`generate`: use object streams wherever
1115   - possible
  1252 + Please see :ref:`page-selection` for details about selecting pages.
1116 1253  
1117   -:samp:`--preserve-unreferenced`
1118   - Tells qpdf to preserve objects that are not referenced when writing
1119   - the file. Ordinarily any object that is not referenced in a traversal
1120   - of the document from the trailer dictionary will be discarded. This
1121   - may be useful in working with some damaged files or inspecting files
1122   - with known unreferenced objects.
  1254 + See also :qpdf:ref:`--split-pages`, :qpdf:ref:`--collate`,
  1255 + :ref:`page-ranges`.
1123 1256  
1124   - This flag is ignored for linearized files and has the effect of
1125   - causing objects in the new file to be written in order by object ID
1126   - from the original file. This does not mean that object numbers will
1127   - be the same since qpdf may create stream lengths as direct or
1128   - indirect differently from the original file, and the original file
1129   - may have gaps in its numbering.
  1257 +.. qpdf:option:: --collate=n
1130 1258  
1131   - See also :samp:`--preserve-unreferenced-resources`,
1132   - which does something completely different.
1133   -
1134   -:samp:`--remove-unreferenced-resources={option}`
1135   - The :samp:`{option}` may be ``auto``,
1136   - ``yes``, or ``no``. The default is ``auto``.
1137   -
1138   - Starting with qpdf 8.1, when splitting pages, qpdf is able to attempt
1139   - to remove images and fonts that are not used by a page even if they
1140   - are referenced in the page's resources dictionary. When shared
1141   - resources are in use, this behavior can greatly reduce the file sizes
1142   - of split pages, but the analysis is very slow. In versions from 8.1
1143   - through 9.1.1, qpdf did this analysis by default. Starting in qpdf
1144   - 10.0.0, if ``auto`` is used, qpdf does a quick analysis of the file
1145   - to determine whether the file is likely to have unreferenced objects
1146   - on pages, a pattern that frequently occurs when resource dictionaries
1147   - are shared across multiple pages and rarely occurs otherwise. If it
1148   - discovers this pattern, then it will attempt to remove unreferenced
1149   - resources. Usually this means you get the slower splitting speed only
1150   - when it's actually going to create smaller files. You can suppress
1151   - removal of unreferenced resources altogether by specifying ``no`` or
1152   - force it to do the full algorithm by specifying ``yes``.
  1259 + .. help: collate with --pages
1153 1260  
1154   - Other than cases in which you don't care about file size and care a
1155   - lot about runtime, there are few reasons to use this option,
1156   - especially now that ``auto`` mode is supported. One reason to use
1157   - this is if you suspect that qpdf is removing resources it shouldn't
1158   - be removing. If you encounter that case, please report it as bug at
1159   - https://github.com/qpdf/qpdf/issues/.
  1261 + Collate rather than concatenate pages specified with --pages.
  1262 + With a numeric argument, collate in groups of n. The default
  1263 + is 1. Run qpdf --help=page-selection for additional details.
1160 1264  
1161   -:samp:`--preserve-unreferenced-resources`
1162   - This is a synonym for
1163   - :samp:`--remove-unreferenced-resources=no`.
  1265 + This option causes :command:`qpdf` to collate rather than
  1266 + concatenate pages specified with :qpdf:ref:`--pages`. With a
  1267 + numeric argument, collate in groups of :samp:`{n}`. The default
  1268 + is 1.
1164 1269  
1165   - See also :samp:`--preserve-unreferenced`, which does
1166   - something completely different.
  1270 + Please see :ref:`page-selection` for additional details.
1167 1271  
1168   -:samp:`--newline-before-endstream`
1169   - Tells qpdf to insert a newline before the ``endstream`` keyword, not
1170   - counted in the length, after any stream content even if the last
1171   - character of the stream was a newline. This may result in two
1172   - newlines in some cases. This is a requirement of PDF/A. While qpdf
1173   - doesn't specifically know how to generate PDF/A-compliant PDFs, this
1174   - at least prevents it from removing compliance on already compliant
1175   - files.
  1272 +.. qpdf:option:: --split-pages=[n]
  1273 +
  1274 + .. help: write pages to separate files
  1275 +
  1276 + This option causes qpdf to create separate output files for each
  1277 + page or group of pages rather than a single output file.
  1278 +
  1279 + File names are generated from the specified output file as follows:
  1280 +
  1281 + - If the string %d appears in the output file name, it is replaced with a
  1282 + zero-padded page range starting from 1
  1283 + - Otherwise, if the output file name ends in .pdf (case insensitive), a
  1284 + zero-padded page range, preceded by a dash, is inserted before the file
  1285 + extension
  1286 + - Otherwise, the file name is appended with a zero-padded page range
  1287 + preceded by a dash.
  1288 +
  1289 + Page ranges are single page numbers for single-page groups or first-last
  1290 + for multi-page groups.
  1291 +
  1292 + Write each group of :samp:`{n}` pages to a separate output file. If
  1293 + :samp:`{n}` is not specified, create single pages. Output file
  1294 + names are generated as follows:
  1295 +
  1296 + - If the string ``%d`` appears in the output file name, it is
  1297 + replaced with a range of zero-padded page numbers starting
  1298 + from 1.
  1299 +
  1300 + - Otherwise, if the output file name ends in :file:`.pdf` (case
  1301 + insensitive), a zero-padded page range, preceded by a dash, is
  1302 + inserted before the file extension.
  1303 +
  1304 + - Otherwise, the file name is appended with a zero-padded page
  1305 + range preceded by a dash.
  1306 +
  1307 + Zero padding is added to all page numbers in file names so that all
  1308 + the numbers are the same length, which causes the output filenames
  1309 + to sort lexically in numerical order.
  1310 +
  1311 + Page ranges are a single number in the case of single-page groups or
  1312 + two numbers separated by a dash otherwise.
  1313 +
  1314 + Here are some examples. In these examples, :file:`infile.pdf` has
  1315 + 12 pages.
  1316 +
  1317 + - ``qpdf --split-pages infile.pdf %d-out``: output files are
  1318 + :file:`01-out` through :file:`12-out` with no extension.
  1319 +
  1320 + - ``qpdf --split-pages=2 infile.pdf outfile.pdf``: output files are
  1321 + :file:`outfile-01-02.pdf` through :file:`outfile-11-12.pdf`
  1322 +
  1323 + - ``qpdf --split-pages infile.pdf something.else`` would generate
  1324 + files :file:`something.else-01` through
  1325 + :file:`something.else-12`. The extension ``.else`` is not treated
  1326 + in any special way regarding the placement of the number.
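The naming rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not qpdf's implementation: the function name ``split_page_names`` is hypothetical, the padding width is assumed to be the number of digits in the total page count, and a short final group is assumed to still be written as ``first-last``.

```python
def split_page_names(outfile: str, total_pages: int, group: int = 1) -> list[str]:
    """Return the output file names the rules above would produce.

    Hypothetical helper for illustration only.
    """
    # Assumption: pad every number to the width of the largest page number.
    width = len(str(total_pages))
    names = []
    for first in range(1, total_pages + 1, group):
        last = min(first + group - 1, total_pages)
        page_range = f"{first:0{width}d}"
        if group > 1:
            # Multi-page groups use first-last.
            page_range += f"-{last:0{width}d}"
        if "%d" in outfile:
            # Rule 1: %d is replaced by the page range.
            names.append(outfile.replace("%d", page_range))
        elif outfile.lower().endswith(".pdf"):
            # Rule 2: insert -range before a .pdf extension (any case).
            names.append(f"{outfile[:-4]}-{page_range}{outfile[-4:]}")
        else:
            # Rule 3: append -range to the whole name.
            names.append(f"{outfile}-{page_range}")
    return names


print(split_page_names("outfile.pdf", 12, 2)[0])    # outfile-01-02.pdf
print(split_page_names("something.else", 12)[-1])   # something.else-12
```

Run against the three examples above with a 12-page input, the sketch reproduces the listed file names.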
  1327 +
  1328 + Note that outlines, threads, and other document-level features of
  1329 + the original PDF file are not preserved. For each page of output,
  1330 + this option creates an empty PDF and copies a single page from the
  1331 + output into it. If you require the document-level data, you will
  1332 + have to run :command:`qpdf` with the :qpdf:ref:`--pages` option
  1333 + once for each page. Using :qpdf:ref:`--split-pages` is much faster
  1334 + if you don't require the document-level data. A future version of
  1335 + qpdf may support preservation of some document-level information.
  1336 +
  1337 +.. qpdf:option:: --overlay file [ options ] --
  1338 +
  1339 + .. help: begin overlay options
  1340 +
  1341 + Overlay pages from another file on the output.
  1342 + Run qpdf --help=overlay-underlay for details.
  1343 +
  1344 + Overlay pages from another file on the output.
  1345 +
  1346 + See :ref:`overlay-underlay` for details.
  1347 +
  1348 +.. qpdf:option:: --underlay file [ options ] --
  1349 +
  1350 + .. help: begin underlay options
  1351 +
  1352 + Underlay pages from another file on the output.
  1353 + Run qpdf --help=overlay-underlay for details.
  1354 +
  1355 + Underlay pages from another file on the output.
  1356 +
  1357 + See :ref:`overlay-underlay` for details.
  1358 +
  1359 +.. qpdf:option:: --flatten-rotation
  1360 +
  1361 + .. help: remove rotation from page dictionary
  1362 +
  1363 + Rotate a page using content commands instead of page-level
  1364 + metadata. This can be useful if a broken PDF viewer fails to
  1365 + properly consider page rotation metadata.
  1366 +
  1367 + For each page that is rotated using the ``/Rotate`` key in the
  1368 + page's dictionary, remove the ``/Rotate`` key and implement the
  1369 + identical rotation semantics by modifying the page's contents. This
  1370 + option can be useful to prepare files for buggy PDF applications
  1371 + that don't properly handle rotated pages. There is usually no
  1372 + reason to use this option unless you are working around a specific
  1373 + problem.
  1374 +
  1375 +.. qpdf:option:: --flatten-annotations=option
  1376 +
  1377 + .. help: push annotations into content
  1378 +
  1379 + Push page annotations into the content streams. This may be
  1380 + necessary in some case when printing or splitting files.
  1381 + Options: "all", "print", "screen".
1176 1382  
1177   -:samp:`--linearize-pass1={file}`
1178   - Write the first pass of linearization to the named file. The
1179   - resulting file is not a valid PDF file. This option is useful only
1180   - for debugging ``QPDFWriter``'s linearization code. When qpdf
1181   - linearizes files, it writes the file in two passes, using the first
1182   - pass to calculate sizes and offsets that are required for hint tables
1183   - and the linearization dictionary. Ordinarily, the first pass is
1184   - discarded. This option enables it to be captured.
1185   -
1186   -:samp:`--coalesce-contents`
1187   - When a page's contents are split across multiple streams, this option
1188   - causes qpdf to combine them into a single stream. Use of this option
1189   - is never necessary for ordinary usage, but it can help when working
1190   - with some files in some cases. For example, this can also be combined
1191   - with QDF mode or content normalization to make it easier to look at
1192   - all of a page's contents at once.
1193   -
1194   -:samp:`--flatten-annotations={option}`
1195 1383 This option collapses annotations into the pages' contents with
1196 1384 special handling for form fields. Ordinarily, an annotation is
1197 1385 rendered separately and on top of the page. Combining annotations
... ... @@ -1200,48 +1388,113 @@ following options are available:
1200 1388 transformations. The library functionality backing this option was
1201 1389 added for the benefit of programs that want to create *n-up* page
1202 1390 layouts and other similar things that don't work well with
1203   - annotations. The :samp:`{option}` parameter
1204   - may be any of the following:
1205   -
1206   - - :samp:`all`: include all annotations that are not
1207   - marked invisible or hidden
1208   -
1209   - - :samp:`print`: only include annotations that
1210   - indicate that they should appear when the page is printed
1211   -
1212   - - :samp:`screen`: omit annotations that indicate
1213   - they should not appear on the screen
1214   -
1215   - Note that form fields are special because the annotations that are
1216   - used to render filled-in form fields may become out of date from the
1217   - fields' values if the form is filled in by a program that doesn't
1218   - know how to update the appearances. If qpdf detects this case, its
1219   - default behavior is not to flatten those annotations because doing so
1220   - would cause the value of the form field to be lost. This gives you a
1221   - chance to go back and resave the form with a program that knows how
1222   - to generate appearances. QPDF itself can generate appearances with
1223   - some limitations. See the
1224   - :samp:`--generate-appearances` option below.
1225   -
1226   -:samp:`--generate-appearances`
  1391 + annotations. The :samp:`{option}` parameter may be any of the
  1392 + following:
  1393 +
  1394 + - :samp:`all`: include all annotations that are not marked
  1395 + invisible or hidden
  1396 +
  1397 + - :samp:`print`: only include annotations that indicate that they
  1398 + should appear when the page is printed
  1399 +
  1400 + - :samp:`screen`: omit annotations that indicate they should not
  1401 + appear on the screen
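One way to read the three values is as a filter on each annotation's flags. The sketch below is an interpretation, not qpdf's code: the bit values are the annotation flag bits defined by the PDF specification, the function name ``should_flatten`` is hypothetical, and it is an assumption that invisible and hidden annotations are skipped for ``print`` and ``screen`` as well as for ``all``.

```python
# Annotation flag bits from the PDF specification (bit 1 is the low bit).
INVISIBLE = 1 << 0  # bit 1
HIDDEN = 1 << 1     # bit 2
PRINT = 1 << 2      # bit 3
NO_VIEW = 1 << 5    # bit 6


def should_flatten(flags: int, option: str) -> bool:
    """Decide whether an annotation with these flags would be flattened.

    Hypothetical helper; the handling of invisible/hidden annotations
    for 'print' and 'screen' is an assumption.
    """
    if flags & (INVISIBLE | HIDDEN):
        return False
    if option == "all":
        return True
    if option == "print":
        # Only annotations that say they should appear when printed.
        return bool(flags & PRINT)
    if option == "screen":
        # Omit annotations that say they should not appear on screen.
        return not (flags & NO_VIEW)
    raise ValueError(f"unknown option: {option}")


print(should_flatten(PRINT, "print"))             # True
print(should_flatten(PRINT | NO_VIEW, "screen"))  # False
```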
  1402 +
  1403 + In a PDF file, interactive form fields have a value and,
  1404 + independently, a set of instructions, called an appearance, to
  1405 + render the filled-in field. If a form is filled in by a program
  1406 + that doesn't know how to update the appearances, they may become
  1407 + inconsistent with the fields' values. If qpdf detects this case,
  1408 + its default behavior is not to flatten those annotations because
  1409 + doing so would cause the value of the form field to be lost. This
  1410 + gives you a chance to go back and resave the form with a program
  1411 + that knows how to generate appearances. qpdf itself can generate
  1412 + appearances with some limitations. See the
  1413 + :qpdf:ref:`--generate-appearances` option for details.
  1414 +
  1415 +.. qpdf:option:: --rotate=[+|-]angle[:page-range]
  1416 +
  1417 + .. help: rotate pages
  1418 +
  1419 + Rotate specified pages by multiples of 90 degrees specifying
  1420 + either absolute or relative angles. "angle" may be 0, 90, 180,
  1421 + or 270. You almost always want to use +angle or -angle rather
  1422 + than just angle, as discussed in the manual. Run
  1423 + qpdf --help=page-ranges for help with page ranges.
  1424 +
  1425 + Rotate the specified range of pages by the specified angle, which
  1426 + must be a multiple of 90 degrees.
  1427 +
  1428 + The value of :samp:`{angle}` may be ``0``, ``90``, ``180``, or ``270``.
  1429 +
  1430 + For a description of the syntax of :samp:`{page-range}`, see
  1431 + :ref:`page-ranges`. If the page range is omitted, the rotation is
  1432 + applied to all pages.
  1433 +
  1434 + If ``+`` is prepended to :samp:`{angle}`, the angle is added, so an
  1435 + angle of ``+90`` indicates a 90-degree clockwise rotation. If ``-``
  1436 + is prepended, the angle is subtracted, so ``-90`` is a 90-degree
  1437 + counterclockwise rotation and is exactly the same as ``+270``.
  1438 +
  1439 + If neither ``+`` nor ``-`` is prepended, the rotation angle is set
  1440 + exactly. You almost always want ``+`` or ``-`` since, without
  1441 + inspecting the actual PDF code, it is impossible to know whether a
  1442 + page that appears to be rotated is rotated "naturally" or has been
  1443 + rotated by specifying rotation. For example, if a page appears to
  1444 + contain a portrait-mode image rotated by 90 degrees so that the top
  1445 + of the image is on the right edge of the page, there is no way to
  1446 + tell by visual inspection whether the literal top of the image is
  1447 + the top of the page or whether the literal top of the image is the
  1448 + right edge and the page is already rotated in the PDF. Specifying a
  1449 + rotation angle of ``-90`` will produce an image that appears
  1450 + upright in either case. Use of absolute rotation angles should be
  1451 + reserved for cases in which you have specific knowledge about the
  1452 + way the PDF file is constructed.
  1453 +
  1454 + Examples:
  1455 +
  1456 + - ``qpdf in.pdf out.pdf --rotate=+90:2,4,6 --rotate=+180:7-8``:
  1457 + rotate pages 2, 4, and 6 by 90 degrees clockwise from their
  1458 + original rotation and pages 7 and 8 by 180 degrees
  1459 +
  1460 + - ``qpdf in.pdf out.pdf --rotate=+180``: rotate all pages by 180
  1461 + degrees
  1462 +
  1463 + - ``qpdf in.pdf out.pdf --rotate=0``: force each page to be displayed
  1464 + in its natural orientation, which would undo the effect of any
  1465 + rotations previously applied in page metadata.
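The difference between relative and absolute angles can be shown with a little arithmetic. This is a sketch of the semantics described above, not qpdf's code: the function name ``apply_rotation`` is hypothetical, and ``current`` stands for the rotation already recorded for the page, in degrees.

```python
def apply_rotation(current: int, spec: str) -> int:
    """Return the page rotation after applying a --rotate angle.

    Hypothetical helper: 'spec' is the angle part of the option,
    for example '+90', '-90', or '0'.
    """
    sign = spec[0] if spec[:1] in ("+", "-") else ""
    angle = int(spec[len(sign):])
    if angle not in (0, 90, 180, 270):
        raise ValueError("angle must be 0, 90, 180, or 270")
    if sign == "+":
        # Relative: add to the existing rotation.
        return (current + angle) % 360
    if sign == "-":
        # Relative: subtract from the existing rotation.
        return (current - angle) % 360
    # Absolute: replace the existing rotation.
    return angle


print(apply_rotation(0, "-90"))   # 270
print(apply_rotation(0, "+270"))  # 270
print(apply_rotation(90, "0"))    # 0
```

The first two calls give the same result, which matches the statement that ``-90`` is exactly the same as ``+270``. The last call shows why an absolute angle discards whatever rotation the page already had.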
  1466 +
  1467 + See also :qpdf:ref:`--flatten-rotation`.
  1468 +
  1469 +.. qpdf:option:: --generate-appearances
  1470 +
  1471 + .. help: generate appearances for form fields
  1472 +
  1473 + PDF form fields consist of values and appearances, which may be
  1474 + inconsistent with each other if a form field value has been
  1475 + modified without updating its appearance. This option tells qpdf
  1476 + to generate new appearance streams. There are some limitations,
  1477 + which are discussed in the manual.
  1478 +
1227 1479 If a file contains interactive form fields and indicates that the
1228 1480 appearances are out of date with the values of the form, this flag
1229   - will regenerate appearances, subject to a few limitations. Note that
1230   - there is not usually a reason to do this, but it can be necessary
1231   - before using the :samp:`--flatten-annotations`
1232   - option. Most of these are not a problem with well-behaved PDF files.
1233   - The limitations are as follows:
  1481 + will regenerate appearances, subject to a few limitations. Note
  1482 + that there is not usually a reason to do this, but it can be
  1483 + necessary before using the :qpdf:ref:`--flatten-annotations`
  1484 + option. Here is a summary of the limitations.
1234 1485  
1235 1486 - Radio button and checkbox appearances use the pre-set values in
1236   - the PDF file. QPDF just makes sure that the correct appearance is
1237   - displayed based on the value of the field. This is fine for PDF
1238   - files that create their forms properly. Some PDF writers save
1239   - appearances for fields when they change, which could cause some
1240   - controls to have inconsistent appearances.
  1487 + the PDF file. :command:`qpdf` just makes sure that the correct
  1488 + appearance is displayed based on the value of the field. This is
  1489 + fine for PDF files that create their forms properly. Some PDF
  1490 + writers save appearances for fields when they change, which could
  1491 + cause some controls to have inconsistent appearances.
1241 1492  
1242 1493 - For text fields and list boxes, any characters that fall outside
1243 1494 of US-ASCII or, if detected, "Windows ANSI" or "Mac Roman"
1244 1495 encoding, will be replaced by the ``?`` character.
  1496 + :command:`qpdf` does not know enough about fonts and encodings to
  1497 + correctly represent characters that fall outside of this range.
1245 1498  
1246 1499 - Quadding is ignored. Quadding is used to specify whether the
1247 1500 contents of a field should be left, center, or right aligned with
... ... @@ -1252,326 +1505,1583 @@ following options are available:
1252 1505  
1253 1506 - There is no support for multi-select fields or signature fields.
1254 1507  
1255   - If qpdf doesn't do a good enough job with your form, use an external
1256   - application to save your filled-in form before processing it with
1257   - qpdf.
  1508 + Appearances generated by :command:`qpdf` should be good enough for
  1509 + simple forms consisting of ASCII characters where the original file
  1510 + followed the PDF specification and provided template information
  1511 + for text field appearances. If :command:`qpdf` doesn't do a good
  1512 + enough job with your form, use an external application to save your
  1513 + filled-in form before processing it with :command:`qpdf`. Most PDF
  1514 + viewers that support filling in of forms will generate appearance
  1515 + streams. Some of them will even do it for forms filled in with
  1516 + characters outside the original font's character range by embedding
  1517 + additional fonts as needed.
  1518 +
  1519 +.. qpdf:option:: --optimize-images
  1520 +
  1521 + .. help: use efficient compression for images
  1522 +
  1523 + Attempt to use DCT (JPEG) compression for images that fall
  1524 + within certain constraints as long as doing so decreases the
  1525 + size in bytes of the image. See also help for the following
  1526 + options:
  1527 + --oi-min-width
  1528 + --oi-min-height
  1529 + --oi-min-area
  1530 + --keep-inline-images
  1531 +
  1532 + The --verbose flag is useful with this option.
  1533 +
  1534 + This flag causes qpdf to recompress all images that are not
  1535 + compressed with DCT (JPEG) using DCT compression as long as doing
  1536 + so decreases the size in bytes of the image data and the image does
  1537 + not fall below minimum specified dimensions. Useful information is
  1538 + provided when used in combination with :qpdf:ref:`--verbose`. See
  1539 + also the :qpdf:ref:`--oi-min-width`, :qpdf:ref:`--oi-min-height`,
  1540 + and :qpdf:ref:`--oi-min-area` options. By default, inline images
  1541 + are converted to regular images and optimized as well. Use
  1542 + :qpdf:ref:`--keep-inline-images` to prevent inline images from
  1543 + being included.
  1544 +
  1545 +.. qpdf:option:: --oi-min-width=width
  1546 +
  1547 + .. help: minimum width for --optimize-images
  1548 +
  1549 + Don't optimize images whose width is below the specified value.
1258 1550  
1259   -:samp:`--optimize-images`
1260   - This flag causes qpdf to recompress all images that are not
1261   - compressed with DCT (JPEG) using DCT compression as long as doing so
1262   - decreases the size in bytes of the image data and the image does not
1263   - fall below minimum specified dimensions. Useful information is
1264   - provided when used in combination with
1265   - :samp:`--verbose`. See also the
1266   - :samp:`--oi-min-width`,
1267   - :samp:`--oi-min-height`, and
1268   - :samp:`--oi-min-area` options. By default, starting
1269   - in qpdf 8.4, inline images are converted to regular images and
1270   - optimized as well. Use :samp:`--keep-inline-images`
1271   - to prevent inline images from being included.
1272   -
1273   -:samp:`--oi-min-width={width}`
1274 1551 Avoid optimizing images whose width is below the specified amount. If
1275 1552 omitted, the default is 128 pixels. Use 0 for no minimum.
1276 1553  
1277   -:samp:`--oi-min-height={height}`
  1554 +.. qpdf:option:: --oi-min-height=height
  1555 +
  1556 + .. help: minimum height for --optimize-images
  1557 +
  1558 + Don't optimize images whose height is below the specified value.
  1559 +
1278 1560 Avoid optimizing images whose height is below the specified amount.
1279 1561 If omitted, the default is 128 pixels. Use 0 for no minimum.
1280 1562  
1281   -:samp:`--oi-min-area={area-in-pixels}`
1282   - Avoid optimizing images whose pixel count (width × height) is below
1283   - the specified amount. If omitted, the default is 16,384 pixels. Use 0
1284   - for no minimum.
  1563 +.. qpdf:option:: --oi-min-area=area-in-pixels
1285 1564  
1286   -:samp:`--externalize-inline-images`
1287   - Convert inline images to regular images. By default, images whose
1288   - data is at least 1,024 bytes are converted when this option is
1289   - selected. Use :samp:`--ii-min-bytes` to change the
1290   - size threshold. This option is implicitly selected when
1291   - :samp:`--optimize-images` is selected. Use
1292   - :samp:`--keep-inline-images` to exclude inline images
1293   - from image optimization.
  1565 + .. help: minimum area for --optimize-images
1294 1566  
1295   -:samp:`--ii-min-bytes={bytes}`
1296   - Avoid converting inline images whose size is below the specified
1297   - minimum size to regular images. If omitted, the default is 1,024
1298   - bytes. Use 0 for no minimum.
1299   -
1300   -:samp:`--keep-inline-images`
1301   - Prevent inline images from being included in image optimization. This
1302   - option has no affect when :samp:`--optimize-images`
1303   - is not specified.
1304   -
1305   -:samp:`--remove-page-labels`
1306   - Remove page labels from the output file.
1307   -
1308   -:samp:`--qdf`
1309   - Turns on QDF mode. For additional information on QDF, please see :ref:`qdf`. Note that :samp:`--linearize`
1310   - disables QDF mode.
1311   -
1312   -:samp:`--min-version={version}`
1313   - Forces the PDF version of the output file to be at least
1314   - :samp:`{version}`. In other words, if the
1315   - input file has a lower version than the specified version, the
1316   - specified version will be used. If the input file has a higher
1317   - version, the input file's original version will be used. It is seldom
1318   - necessary to use this option since qpdf will automatically increase
1319   - the version as needed when adding features that require newer PDF
1320   - readers.
  1567 + Don't optimize images whose area in pixels is below the specified value.
1321 1568  
1322   - The version number may be expressed in the form
1323   - :samp:`{major.minor.extension-level}`, in
1324   - which case the version is interpreted as
1325   - :samp:`{major.minor}` at extension level
1326   - :samp:`{extension-level}`. For example,
1327   - version ``1.7.8`` represents version 1.7 at extension level 8. Note
1328   - that minimal syntax checking is done on the command line.
1329   -
1330   -:samp:`--force-version={version}`
1331   - This option forces the PDF version to be the exact version specified
1332   - *even when the file may have content that is not supported in that
1333   - version*. The version number is interpreted in the same way as with
1334   - :samp:`--min-version` so that extension levels can be
1335   - set. In some cases, forcing the output file's PDF version to be lower
1336   - than that of the input file will cause qpdf to disable certain
1337   - features of the document. Specifically, 256-bit keys are disabled if
1338   - the version is less than 1.7 with extension level 8 (except R5 is
1339   - disabled if less than 1.7 with extension level 3), AES encryption is
1340   - disabled if the version is less than 1.6, cleartext metadata and
1341   - object streams are disabled if less than 1.5, 128-bit encryption keys
1342   - are disabled if less than 1.4, and all encryption is disabled if less
1343   - than 1.3. Even with these precautions, qpdf won't be able to do
1344   - things like eliminate use of newer image compression schemes,
1345   - transparency groups, or other features that may have been added in
1346   - more recent versions of PDF.
  1569 + Avoid optimizing images whose pixel count
  1570 + (:samp:`{width}` × :samp:`{height}`) is below the specified amount.
  1571 + If omitted, the default is 16,384 pixels. Use 0 for no minimum.
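
The three size thresholds above can be read as a single predicate. The following is a minimal Python sketch, not qpdf's implementation: the function name ``meets_size_minimums`` and its parameters are hypothetical, and only the default values (128, 128, and 16,384) and the "0 means no minimum" rule come from the text. It checks dimensions only; the other conditions for optimization, such as the recompressed image actually being smaller, are not modeled.

```python
def meets_size_minimums(width, height,
                        min_width=128, min_height=128, min_area=16384):
    """Return True if an image is large enough to be considered.

    Hypothetical helper mirroring --oi-min-width, --oi-min-height, and
    --oi-min-area. A minimum of 0 disables that particular check.
    """
    if min_width and width < min_width:
        return False
    if min_height and height < min_height:
        return False
    if min_area and width * height < min_area:
        return False
    return True
```

With the defaults, a 128 by 128 image is exactly at the limit for all three checks, since 128 times 128 is 16,384.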
1347 1572  
1348   - As a general rule, with the exception of big structural things like
1349   - the use of object streams or AES encryption, PDF viewers are supposed
1350   - to ignore features in files that they don't support from newer
1351   - versions. This means that forcing the version to a lower version may
1352   - make it possible to open your PDF file with an older version, though
1353   - bear in mind that some of the original document's functionality may
1354   - be lost.
1355   -
1356   -By default, when a stream is encoded using non-lossy filters that qpdf
1357   -understands and is not already compressed using a good compression
1358   -scheme, qpdf will uncompress and recompress streams. Assuming proper
1359   -filter implements, this is safe and generally results in smaller files.
1360   -This behavior may also be explicitly requested with
1361   -:samp:`--stream-data=compress`.
1362   -
1363   -When :samp:`--normalize-content=y` is specified, qpdf
1364   -will attempt to normalize whitespace and newlines in page content
1365   -streams. This is generally safe but could, in some cases, cause damage
1366   -to the content streams. This option is intended for people who wish to
1367   -study PDF content streams or to debug PDF content. You should not use
1368   -this for "production" PDF files.
1369   -
1370   -When normalizing content, if qpdf runs into any lexical errors, it will
1371   -print a warning indicating that content may be damaged. The only
1372   -situation in which qpdf is known to cause damage during content
1373   -normalization is when a page's contents are split across multiple
1374   -streams and streams are split in the middle of a lexical token such as a
1375   -string, name, or inline image. Note that files that do this are invalid
1376   -since the PDF specification states that content streams are not to be
1377   -split in the middle of a token. If you want to inspect the original
1378   -content streams in an uncompressed format, you can always run with
1379   -:samp:`--qdf --normalize-content=n` for a QDF file
1380   -without content normalization, or alternatively
1381   -:samp:`--stream-data=uncompress` for a regular non-QDF
1382   -mode file with uncompressed streams. These will both uncompress all the
1383   -streams but will not attempt to normalize content. Please note that if
1384   -you are using content normalization or QDF mode for the purpose of
1385   -manually inspecting files, you don't have to care about this.
1386   -
1387   -Object streams, also known as compressed objects, were introduced into
1388   -the PDF specification at version 1.5, corresponding to Acrobat 6. Some
1389   -older PDF viewers may not support files with object streams. qpdf can be
1390   -used to transform files with object streams to files without object
1391   -streams or vice versa. As mentioned above, there are three object stream
1392   -modes: :samp:`preserve`,
1393   -:samp:`disable`, and :samp:`generate`.
1394   -
1395   -In :samp:`preserve` mode, the relationship to objects
1396   -and the streams that contain them is preserved from the original file.
1397   -In :samp:`disable` mode, all objects are written as
1398   -regular, uncompressed objects. The resulting file should be readable by
1399   -older PDF viewers. (Of course, the content of the files may include
1400   -features not supported by older viewers, but at least the structure will
1401   -be supported.) In :samp:`generate` mode, qpdf will
1402   -create its own object streams. This will usually result in more compact
1403   -PDF files, though they may not be readable by older viewers. In this
1404   -mode, qpdf will also make sure the PDF version number in the header is
1405   -at least 1.5.
1406   -
1407   -The :samp:`--qdf` flag turns on QDF mode, which changes
1408   -some of the defaults described above. Specifically, in QDF mode, by
1409   -default, stream data is uncompressed, content streams are normalized,
1410   -and encryption is removed. These defaults can still be overridden by
1411   -specifying the appropriate options as described above. Additionally, in
1412   -QDF mode, stream lengths are stored as indirect objects, objects are
1413   -laid out in a less efficient but more readable fashion, and the
1414   -documents are interspersed with comments that make it easier for the
1415   -user to find things and also make it possible for
1416   -:command:`fix-qdf` to work properly. QDF mode is intended
1417   -for people, mostly developers, who wish to inspect or modify PDF files
1418   -in a text editor. For details, please see :ref:`qdf`.
1419   -
1420   -.. _testing-options:
1421   -
1422   -Testing, Inspection, and Debugging Options
1423   -------------------------------------------
1424   -
1425   -These options can be useful for digging into PDF files or for use in
1426   -automated test suites for software that uses the qpdf library. When any
1427   -of the options in this section are specified, no output file should be
1428   -given. The following options are available:
1429   -
1430   -:samp:`--deterministic-id`
1431   - Causes generation of a deterministic value for /ID. This prevents use
1432   - of timestamp and output file name information in the /ID generation.
1433   - Instead, at some slight additional runtime cost, the /ID field is
1434   - generated to include a digest of the significant parts of the content
1435   - of the output PDF file. This means that a given qpdf operation should
1436   - generate the same /ID each time it is run, which can be useful when
1437   - caching results or for generation of some test data. Use of this flag
1438   - is not compatible with creation of encrypted files.
1439   -
1440   -:samp:`--static-id`
1441   - Causes generation of a fixed value for /ID. This is intended for
1442   - testing only. Never use it for production files. If you are trying to
1443   - get the same /ID each time for a given file and you are not
1444   - generating encrypted files, consider using the
1445   - :samp:`--deterministic-id` option.
1446   -
1447   -:samp:`--static-aes-iv`
1448   - Causes use of a static initialization vector for AES-CBC. This is
1449   - intended for testing only so that output files can be reproducible.
1450   - Never use it for production files. This option in particular is not
1451   - secure since it significantly weakens the encryption.
1452   -
1453   -:samp:`--no-original-object-ids`
1454   - Suppresses inclusion of original object ID comments in QDF files.
1455   - This can be useful when generating QDF files for test purposes,
1456   - particularly when comparing them to determine whether two PDF files
1457   - have identical content.
  1573 +.. qpdf:option:: --keep-inline-images
  1574 +
  1575 + .. help: exclude inline images from optimization
  1576 +
  1577 + Prevent inline images from being considered by --optimize-images.
  1578 +
  1579 + Prevent inline images from being included in image optimization.
  1580 + This option has no effect when :qpdf:ref:`--optimize-images` is not
  1581 + specified.
  1582 +
  1583 +.. qpdf:option:: --remove-page-labels
  1584 +
  1585 + .. help: remove page labels (numbers)
  1586 +
  1587 + Exclude page labels (explicit page numbers) from the output file.
  1588 +
  1589 + Exclude page labels (explicit page numbers) from the output file.
  1590 +
  1591 +.. _encryption-options:
  1592 +
  1593 +Encryption
  1594 +----------
  1595 +
  1596 +.. help-topic encryption: create encrypted files
  1597 +
  1598 + Create encrypted files. Usage:
  1599 +
  1600 + --encrypt user-password owner-password key-length [ options ] --
  1601 +
  1602 + Either or both of user-password and owner-password may be empty
  1603 + strings. key-length may be 40, 128, or 256. Encryption options are
  1604 + terminated by "--" by itself.
  1605 +
  1606 + 40-bit encryption is insecure, as is 128-bit encryption without
  1607 + AES. Use 256-bit encryption unless you have a specific reason to
  1608 + use an insecure format, such as testing or compatibility with very
  1609 + old viewers. You must use the --allow-weak-crypto flag to create
  1610 + encrypted files that use insecure cryptographic algorithms. The
  1611 + --allow-weak-crypto flag appears outside of --encrypt ... --
  1612 + (before --encrypt or after --).
  1613 +
  1614 + Available options vary by key length. Not all readers respect all
  1615 + restrictions. Different PDF readers respond differently to various
  1616 + combinations of options. Sometimes a PDF viewer may show you
  1617 + restrictions that differ from what you selected. This is probably
  1618 + not a bug in qpdf.
  1619 +
  1620 + Options for 40-bit only:
  1621 + --annotate=[yn] restrict comments, filling forms, and signing
  1622 + --extract=[yn] restrict text/graphic extraction
  1623 + --modify=[yn] restrict document modification
  1624 + --print=[yn] restrict printing
  1625 +
  1626 + Options for 128-bit or 256-bit:
  1627 + --accessibility=[yn] restrict accessibility (usually ignored)
  1628 + --annotate=[yn] restrict commenting/filling form fields
  1629 + --assemble=[yn] restrict document assembly
  1630 + --extract=[yn] restrict text/graphic extraction
  1631 + --form=[yn] restrict filling form fields
  1632 + --modify-other=[yn] restrict other modifications
  1633 + --modify=modify-opt control modify access by level
  1634 + --print=print-opt control printing access
  1635 + --cleartext-metadata prevent encryption of metadata
  1636 +
  1637 + For 128-bit only:
  1638 + --use-aes=[yn] indicates whether to use AES encryption
  1639 + --force-V4 forces use of V=4 encryption handler
  1640 +
  1641 + For 256-bit only:
  1642 + --force-R5 forces use of deprecated R=5 encryption
  1643 + --allow-insecure allow user password with empty owner password
  1644 +
  1645 + Values for print-opt:
  1646 + none disallow printing
  1647 + low allow only low-resolution printing
  1648 + full allow full printing
  1649 +
  1650 + Values for modify-opt:
  1651 + none allow no modifications
  1652 + assembly allow document assembly only
  1653 + form assembly + filling in form fields and signing
  1654 + annotate form + commenting and modifying forms
  1655 + all allow full document modification
  1656 +
  1657 +This section describes the options used to create encrypted files. For
  1658 +other options related to encryption, see also :qpdf:ref:`--decrypt`
  1659 +and :qpdf:ref:`--copy-encryption`. For a more in-depth technical
  1660 +discussion of how PDF encryption works internally, see
  1661 +:ref:`pdf-encryption`.
  1662 +
  1663 +To create an encrypted file, use
  1664 +
  1665 +::
  1666 +
  1667 + --encrypt user-password owner-password key-length [ options ] --
  1668 +
  1669 +Either or both of :samp:`{user-password}` and :samp:`{owner-password}`
  1670 +may be empty strings. :samp:`{key-length}` may be ``40``, ``128``, or
  1671 +``256``. Encryption options are terminated by ``--`` by itself.
  1672 +
  1673 +40-bit encryption is insecure, as is 128-bit encryption without AES.
  1674 +Use 256-bit encryption unless you have a specific reason to use an
  1675 +insecure format, such as testing or compatibility with very old
  1676 +viewers. You must use the :qpdf:ref:`--allow-weak-crypto` flag to
  1677 +create encrypted files that use insecure cryptographic algorithms. The
  1678 +:qpdf:ref:`--allow-weak-crypto` flag appears outside of ``--encrypt
  1679 +... --`` (before ``--encrypt`` or after ``--``).
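
As an illustration of where the flags go, here is a small Python sketch that assembles the argument list described above. The helper name ``encrypt_args`` is hypothetical and this is not part of qpdf; it only encodes two rules stated in the text: ``--allow-weak-crypto`` sits outside ``--encrypt ... --``, and 40-bit or non-AES 128-bit encryption counts as insecure.

```python
def encrypt_args(user_password, owner_password, key_length, options=()):
    """Build the encryption portion of a qpdf command line (sketch).

    options is a sequence of encryption options such as "--use-aes=y"
    that belong between the key length and the terminating "--".
    """
    options = list(options)
    # Per the text: 40-bit is insecure, as is 128-bit without AES.
    weak = key_length == 40 or (
        key_length == 128 and "--use-aes=y" not in options
    )
    args = []
    if weak:
        # Placed before --encrypt, i.e. outside of --encrypt ... --
        args.append("--allow-weak-crypto")
    args += ["--encrypt", user_password, owner_password, str(key_length)]
    args += options
    args.append("--")
    return args
```

The returned list would be spliced between the input and output file names of a ``qpdf`` invocation.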
  1680 +
  1681 +If :samp:`{key-length}` is 256, the minimum PDF version is 1.7 with
  1682 +extension level 8, and the AES-based encryption format used is the one
  1683 +described in the PDF 2.0 specification. Using 128-bit encryption
  1684 +forces the PDF version to be at least 1.4, or if AES is used, 1.6.
  1685 +Using 40-bit encryption forces the PDF version to be at least 1.3.
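
The version rules in the paragraph above can be summarized in a few lines of Python. This is only a sketch of the stated rules; the function name ``minimum_pdf_version`` and the tuple representation ``(major, minor, extension_level)`` are assumptions made for illustration.

```python
def minimum_pdf_version(key_length, use_aes=False):
    """Return (major, minor, extension_level) implied by the key length.

    Encodes only the rules stated in the text; use_aes matters only
    for 128-bit encryption.
    """
    if key_length == 256:
        return (1, 7, 8)
    if key_length == 128:
        return (1, 6, 0) if use_aes else (1, 4, 0)
    if key_length == 40:
        return (1, 3, 0)
    raise ValueError("key length must be 40, 128, or 256")
```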
  1686 +
  1687 +When 256-bit encryption is used, PDF files with empty owner
  1688 +passwords are insecure. To create such files, you must specify the
  1689 +:qpdf:ref:`--allow-insecure` option.
  1690 +
  1691 +Available options vary by key length. Not all readers respect all
  1692 +restrictions. The default for each permission option is to be fully
  1693 +permissive. These restrictions may or may not be enforced by any
  1694 +particular reader. :command:`qpdf` allows very granular setting of
  1695 +restrictions. Some readers may not recognize the combination of
  1696 +options you specify. If you specify certain combinations of
  1697 +restrictions and find a reader that doesn't seem to honor them as you
  1698 +expect, it is most likely not a bug in :command:`qpdf`. qpdf itself
  1699 +does not obey encryption restrictions already imposed on the file.
  1700 +Doing so would be meaningless since qpdf can be used to remove
  1701 +encryption from the file entirely.
  1702 +
  1703 +Here is a summary of encryption options. Details are provided below.
  1704 +
  1705 +Options for 40-bit only
  1706 + - ``--annotate=[yn]``: restrict comments, filling forms, and signing
  1707 +
  1708 + - ``--extract=[yn]``: restrict text/graphic extraction
  1709 +
  1710 + - ``--modify=[yn]``: restrict document modification
  1711 +
  1712 + - ``--print=[yn]``: restrict printing
  1713 +
  1714 +Options for 128-bit or 256-bit
  1715 + - ``--accessibility=[yn]``: restrict accessibility (usually ignored)
  1716 +
  1717 + - ``--annotate=[yn]``: restrict commenting/filling form fields
  1718 +
  1719 + - ``--assemble=[yn]``: restrict document assembly
  1720 +
  1721 + - ``--extract=[yn]``: restrict text/graphic extraction
  1722 +
  1723 + - ``--form=[yn]``: restrict filling form fields
  1724 +
  1725 + - ``--modify-other=[yn]``: restrict other modifications
  1726 +
  1727 + - ``--modify=modify-opt``: control modify access by level
  1728 +
  1729 + - ``--print=print-opt``: control printing access
  1730 +
  1731 + - ``--cleartext-metadata``: prevent encryption of metadata
  1732 +
  1733 +For 128-bit only
  1734 + - ``--use-aes=[yn]``: indicates whether to use AES encryption
  1735 +
  1736 + - ``--force-V4``: forces use of V=4 encryption handler
  1737 +
  1738 +For 256-bit only
  1739 + - ``--force-R5``: forces use of deprecated ``R=5`` encryption algorithm
  1740 +
  1741 + - ``--allow-insecure``: allow user password with empty owner password
  1742 +
  1743 +Values for :samp:`{print-opt}`
  1744 + - ``none``: disallow printing
  1745 +
  1746 + - ``low``: allow only low-resolution printing
  1747 +
  1748 + - ``full``: allow full printing
  1749 +
  1750 +Values for :samp:`{modify-opt}`
  1751 + - ``none``: allow no modifications
  1752 +
  1753 + - ``assembly``: allow document assembly only
  1754 +
  1755 + - ``form``: ``assembly`` permissions plus filling in form fields and signing
  1756 +
  1757 + - ``annotate``: ``form`` permissions plus commenting and modifying forms
  1758 +
  1759 + - ``all``: allow full document modification
  1760 +
  1761 +Related Options
  1762 +~~~~~~~~~~~~~~~
  1763 +
  1764 +.. qpdf:option:: --accessibility=[yn]
  1765 +
  1766 + .. help: restrict document accessibility
  1767 +
  1768 + This option is ignored except with very old encryption formats.
  1769 + The current PDF specification does not allow restriction of
  1770 + document accessibility. This option is not available with 40-bit
  1771 + encryption.
  1772 +
  1773 + Enable/disable extraction of text for accessibility to visually
  1774 + impaired. The qpdf library disregards this field when AES is used
  1775 + with 128-bit encryption or when 256-bit encryption is used. You
  1776 + should never disable accessibility unless you are explicitly doing
  1777 + so for creating test files. The PDF spec says that conforming
  1778 + readers should disregard this permission and always allow
  1779 + accessibility.
  1780 +
  1781 + This option is not available with 40-bit encryption.
  1782 +
  1783 +.. qpdf:option:: --annotate=[yn]
  1784 +
  1785 + .. help: restrict document annotation
  1786 +
  1787 + Enable/disable modifying annotations including making comments
  1788 + and filling in form fields. For 128-bit and 256-bit encryption,
  1789 + this also enables editing, creating, and deleting form fields
  1790 + unless --modify-other=n or --modify=none is also specified.
  1791 +
  1792 + Enable/disable modifying annotations including making comments and
  1793 + filling in form fields. For 128-bit and 256-bit encryption, this
  1794 + also enables editing, creating, and deleting form fields unless
  1795 + :samp:`--modify-other=n` or :samp:`--modify=none` is also
  1796 + specified.
  1797 +
  1798 +.. qpdf:option:: --assemble=[yn]
  1799 +
  1800 + .. help: restrict document assembly
  1801 +
  1802 + Enable/disable document assembly (rotation and reordering of
  1803 + pages). This option is not available with 40-bit encryption.
  1804 +
  1805 + Enable/disable document assembly (rotation and reordering of
  1806 + pages).
  1807 +
  1808 + This option is not available with 40-bit encryption.
  1809 +
  1810 +.. qpdf:option:: --extract=[yn]
  1811 +
  1812 + .. help: restrict text/graphic extraction
  1813 +
  1814 + Enable/disable text/graphic extraction for purposes other than
  1815 + accessibility.
  1816 +
  1817 + Enable/disable text/graphic extraction for purposes other than
  1818 + accessibility.
  1819 +
  1820 +.. qpdf:option:: --form=[yn]
  1821 +
  1822 + .. help: restrict form filling
  1823 +
  1824 + Enable/disable whether filling form fields is allowed even if
  1825 + modification of annotations is disabled. This option is not
  1826 + available with 40-bit encryption.
  1827 +
  1828 + Enable/disable whether filling form fields is allowed even if
  1829 + modification of annotations is disabled.
  1830 +
  1831 + This option is not available with 40-bit encryption.
  1832 +
  1833 +.. qpdf:option:: --modify-other=[yn]
  1834 +
  1835 + .. help: restrict other modifications
  1836 +
  1837 + Enable/disable modifications not controlled by --assemble,
  1838 + --annotate, or --form. --modify-other=n is implied by any of the
  1839 + other --modify options except --modify=all. This option is not
  1840 + available with 40-bit encryption.
  1841 +
  1842 + Enable/disable modifications not controlled by
  1843 + :qpdf:ref:`--assemble`, :qpdf:ref:`--annotate`, or
  1844 + :qpdf:ref:`--form`. ``--modify-other=n`` is implied by any of the
  1845 + other :qpdf:ref:`--modify` options except for ``--modify=all``.
  1846 +
  1847 + This option is not available with 40-bit encryption.
  1848 +
  1849 +.. qpdf:option:: --modify=modify-opt
  1850 +
  1851 + .. help: restrict document modification
  1852 +
  1853 + For 40-bit files, modify-opt may only be y or n and controls all
  1854 + aspects of document modification.
  1855 +
  1856 + For 128-bit and 256-bit encryption, modify-opt values allow
  1857 + enabling and disabling levels of restriction in a manner similar
  1858 + to how some PDF creation tools do it. modify-opt values map to
  1859 + other combinations of options as follows:
  1860 +
  1861 + all: allow full modification (the default)
  1862 + annotate: --modify-other=n
  1863 + form: --modify-other=n --annotate=n
  1864 + assembly: --modify-other=n --annotate=n --form=n
  1865 + none: --modify-other=n --annotate=n --form=n --assemble=n
  1866 +
  1867 + For 40-bit files, :samp:`{modify-opt}` may only be ``y`` or ``n``
  1868 + and controls all aspects of document modification.
  1869 +
  1870 + For 128-bit and 256-bit encryption, :samp:`{modify-opt}` values
  1871 + allow enabling and disabling levels of restriction in a manner
  1872 + similar to how some PDF creation tools do it:
  1873 +
  1874 + - ``none``: allow no modifications
  1875 +
  1876 + - ``assembly``: allow document assembly only
  1877 +
  1878 + - ``form``: ``assembly`` permissions plus filling in form fields
  1879 + and signing
  1880 +
  1881 + - ``annotate``: ``form`` permissions plus commenting and modifying
  1882 + forms
  1883 +
  1884 + - ``all``: allow full document modification (the default)
  1885 +
  1886 + :samp:`{modify-opt}` values map to other combinations of options as
  1887 + follows:
  1888 +
  1889 + - ``none``: same as ``--modify-other=n --annotate=n --form=n --assemble=n``
  1890 +
  1891 + - ``assembly``: same as ``--modify-other=n --annotate=n --form=n``
  1892 +
  1893 + - ``form``: same as ``--modify-other=n --annotate=n``
  1894 +
  1895 + - ``annotate``: same as ``--modify-other=n``
  1896 +
  1897 + - ``all``: the default
  1898 +
  1899 + You can combine this option with the options listed above. If you
  1900 + do, later options override earlier options.
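
The mapping from :samp:`{modify-opt}` values to individual options can be written down as a lookup table. The Python below is an illustrative sketch for 128-bit and 256-bit encryption only; the names ``MODIFY_LEVELS`` and ``expand_modify`` are hypothetical, and the table simply restates the list above.

```python
# Each level maps to the equivalent individual restrictions, as listed
# in the documentation above. "all" is the default and adds nothing.
MODIFY_LEVELS = {
    "all": [],
    "annotate": ["--modify-other=n"],
    "form": ["--modify-other=n", "--annotate=n"],
    "assembly": ["--modify-other=n", "--annotate=n", "--form=n"],
    "none": ["--modify-other=n", "--annotate=n", "--form=n",
             "--assemble=n"],
}


def expand_modify(modify_opt):
    """Return the individual options equivalent to --modify=modify_opt."""
    if modify_opt not in MODIFY_LEVELS:
        raise ValueError("unknown modify-opt: " + modify_opt)
    return list(MODIFY_LEVELS[modify_opt])
```

Each level restricts strictly more than the one before it, which is why every entry extends the previous one by a single option.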
  1901 +
  1902 +.. qpdf:option:: --print=print-opt
  1903 +
  1904 + .. help: restrict printing
  1905 +
  1906 + Control what kind of printing is allowed. For 40-bit encryption,
  1907 + print-opt may only be y or n and enables or disables all
  1908 + printing. For 128-bit and 256-bit encryption, print-opt may have
  1909 + the following values:
  1910 +
  1911 + none: disallow printing
  1912 + low: allow low-resolution printing only
  1913 + full: allow full printing (the default)
  1914 +
  1915 + Control what kind of printing is allowed. For 40-bit encryption,
  1916 + :samp:`{print-opt}` may be ``y`` or ``n`` and enables or disables all
  1917 + printing. For 128-bit and 256-bit encryption, :samp:`{print-opt}`
  1918 + may have the following values:
  1919 +
  1920 + - :samp:`none`: disallow printing
  1921 +
  1922 + - :samp:`low`: allow low-resolution printing only
  1923 +
  1924 + - :samp:`full`: allow full printing (the default)
  1925 +
  1926 +.. qpdf:option:: --cleartext-metadata
  1927 +
  1928 + .. help: don't encrypt metadata
  1929 +
  1930 + If specified, don't encrypt document metadata even when
  1931 + encrypting the rest of the document. This option is not
  1932 + available with 40-bit encryption.
  1933 +
  1934 + If specified, any metadata stream in the document will be left
  1935 + unencrypted even if the rest of the document is encrypted. This also
  1936 + forces the PDF version to be at least 1.5.
  1937 +
  1938 + This option is not available with 40-bit encryption.
  1939 +
  1940 +.. qpdf:option:: --use-aes=[yn]
  1941 +
  1942 + .. help: use AES with 128-bit encryption
  1943 +
  1944 + Enables/disables use of the more secure AES encryption with
  1945 + 128-bit encryption. Specifying --use-aes=y forces the PDF
  1946 + version to be at least 1.6. This option is only available with
  1947 + 128-bit encryption. The default is "n" for compatibility
  1948 + reasons. Use 256-bit encryption instead.
  1949 +
  1950 + Enables/disables use of the more secure AES encryption with 128-bit
  1951 + encryption. Specifying ``--use-aes=y`` forces the PDF version to be
  1952 + at least 1.6. This option is only available with 128-bit
  1953 + encryption. The default is ``n`` for compatibility reasons. Use
  1954 + 256-bit encryption instead.
  1955 +
  1956 +.. qpdf:option:: --allow-insecure
  1957 +
  1958 + .. help: allow empty owner passwords
  1959 +
  1960 + Allow creation of PDF files with empty owner passwords and
  1961 + non-empty user passwords when using 256-bit encryption.
  1962 +
  1963 + Allow creation of PDF files with 256-bit keys where the user
  1964 + password is non-empty and the owner password is empty. Files
  1965 + created in this way are insecure since they can be opened without a
  1966 + password, and restrictions will not be enforced. Users would
  1967 + ordinarily never want to create such files. If you are using qpdf
  1968 + to intentionally create strange files for testing (a definitely
  1969 + valid use of qpdf!), this option allows you to create such insecure
  1970 + files. This option is only available with 256-bit encryption.
  1971 +
  1972 + See :ref:`pdf-passwords` for a more technical discussion of this
  1973 + issue.
  1974 +
  1975 +.. qpdf:option:: --force-V4
  1976 +
  1977 + .. help: force V=4 in encryption dictionary
  1978 +
  1979 + This option is for testing and is never needed in practice since
  1980 + qpdf does this automatically when needed.
  1981 +
  1982 + Use of this option forces the ``V`` and ``R`` parameters in the
  1983 + document's encryption dictionary to be set to the value ``4``. As
  1984 + qpdf will automatically do this when required, there is no reason
  1985 + to ever use this option. It exists primarily for use in testing
  1986 + qpdf itself. This option also forces the PDF version to be at least
  1987 + 1.5.
  1988 +
  1989 +.. qpdf:option:: --force-R5
  1990 +
  1991 + .. help: use unsupported R=5 encryption
  1992 +
  1993 + Use an undocumented, unsupported, deprecated encryption
  1994 + algorithm that existed only in Acrobat version IX. This option
  1995 + should not be used except for compatibility testing.
  1996 +
  1997 + Use an undocumented, unsupported, deprecated encryption algorithm
  1998 + that existed only in Acrobat version IX. This option should not be
  1999 + used except for compatibility testing. If specified, qpdf sets the
  2000 + minimum version to 1.7 at extension level 3.
  2001 +
  2002 +.. _page-selection:
  2003 +
  2004 +Page Selection
  2005 +--------------
  2006 +
  2007 +.. help-topic page-selection: select pages from one or more files
  2008 +
  2009 + Use the --pages option to select pages from multiple files. Usage:
  2010 +
  2011 + qpdf in.pdf --pages input-file [ --password=password ] [ page-range ] \
  2012 + [ ... ] -- out.pdf
  2013 +
  2014 + Between --pages and the -- that terminates the pages option, repeat
  2015 + the following:
  2016 +
  2017 + filename [ --password=password ] [ page-range ]
  2018 +
  2019 + Document-level information, such as outlines, tags, etc., is taken
  2020 + from in.pdf and is preserved in out.pdf. You can use --empty in place
  2021 + of an input file to start from an empty file and just copy pages
  2022 + equally from all files. You can use "." as a shorthand for the
  2023 + primary input file (if not --empty). In the above example, "."
  2024 + would refer to in.pdf.
  2025 +
  2026 + Use --password=password to specify the password for a
  2027 + password-protected input file. If the same input file is used more
  2028 + than once, you only need to supply the password the first time. If
  2029 + the page range is omitted, all pages are selected.
  2030 +
  2031 + Run qpdf --help=page-ranges for help with page ranges.
  2032 +
  2033 + Use --collate=n to cause pages to be collated in groups of n pages
  2034 + (default 1) instead of concatenating the input.
  2035 +
  2036 + Examples:
  2037 +
  2038 + - Start with in.pdf and append all pages from a.pdf and the even
  2039 + pages from b.pdf, and write the output to out.pdf. Document-level
  2040 + information from in.pdf is retained. Note the use of "." to refer
  2041 + to in.pdf.
  2042 +
  2043 + qpdf in.pdf --pages . a.pdf b.pdf 1-z:even -- out.pdf
  2044 +
  2045 + - Take all the pages from a.pdf, all the pages from b.pdf in
  2046 + reverse, and only pages 3 and 6 from c.pdf and write the result
  2047 + to out.pdf. Use password "x" to open b.pdf:
  2048 +
  2049 + qpdf --empty --pages a.pdf b.pdf --password=x z-1 c.pdf 3,6 -- out.pdf
  2050 +
  2051 + More examples are in the manual.
  2052 +
  2053 +:command:`qpdf` allows you to use the :qpdf:ref:`--pages` option to
  2054 +split and merge PDF files by selecting pages from one or more input
  2055 +files.
  2056 +
  2057 +Usage: :samp:`qpdf {in.pdf} --pages input-file [ --password={password} ] [ {page-range} ] [ ... ] -- {out.pdf}`
  2058 +
  2059 +Between ``--pages`` and the ``--`` that terminates the pages option,
  2060 +repeat the following:
  2061 +
  2062 +:samp:`{filename} [ --password={password} ] [ {page-range} ]`
  2063 +
  2064 +Notes:
  2065 + - The password argument is needed only for password-protected files.
  2066 + If you specify the same file more than once, you only need to supply
  2067 + the password the first time.
  2068 +
  2069 + - The page range may be omitted. If omitted, all pages are included.
  2070 +
  2071 + - Document-level information, such as outlines, tags, etc., is taken
  2072 + from the primary input file (in the above example, :file:`in.pdf`)
  2073 + and is preserved in :file:`out.pdf`. You can use
  2074 + :qpdf:ref:`--empty` in place of an input file to start from an
  2075 + empty file and just copy pages equally from all files.
  2076 +
  2077 + - You can use ``.`` as a shorthand for the primary input file, if not
  2078 + empty.
  2079 +
  2080 +See :ref:`page-ranges` for help on specifying a page range.
  2081 +
  2082 +Use :samp:`--collate={n}` to cause pages to be collated in groups of
  2083 +:samp:`{n}` pages (default 1) instead of concatenating the input. Note
  2084 +that the :qpdf:ref:`--collate` option appears outside of ``--pages ... --``
  2085 +(before ``--pages`` or after ``--``). Pages are pulled from each
  2086 +document in turn. When a document is out of pages, it is skipped. See
  2087 +examples below.
  2088 +
  2089 +Examples
  2090 +~~~~~~~~
  2091 +
  2092 +- Start with :file:`in.pdf` and append all pages from :file:`a.pdf`
  2093 + and the even pages from :file:`b.pdf`, and write the output to
  2094 + :file:`out.pdf`. Document-level information from :file:`in.pdf` is
  2095 + retained. Note the use of ``.`` to refer to :file:`in.pdf`.
  2096 +
  2097 + ::
  2098 +
  2099 + qpdf in.pdf --pages . a.pdf b.pdf 1-z:even -- out.pdf
  2100 +
  2101 +
  2102 +- Take all the pages from :file:`a.pdf`, all the pages from
  2103 + :file:`b.pdf` in reverse, and only pages 3 and 6 from :file:`c.pdf`
  2104 + and write the result to :file:`out.pdf`. Document-level metadata is
  2105 + discarded from all input files. The password ``x`` is used to open
  2106 + :file:`b.pdf`.
  2107 +
  2108 + ::
  2109 +
  2110 + qpdf --empty --pages a.pdf b.pdf --password=x z-1 c.pdf 3,6 -- out.pdf
  2111 +
  2112 +- Scan a document with printing on both sides by scanning the fronts
  2113 + into :file:`odd.pdf` and the backs into :file:`even.pdf`. Collate
  2114 + the results into :file:`all.pdf`. This takes the first page of
  2115 + :file:`odd.pdf`, the first page of :file:`even.pdf`, the second page
  2116 + of :file:`odd.pdf`, the second page of :file:`even.pdf`, etc.
  2117 +
  2118 + ::
  2119 +
  2120 + qpdf --collate odd.pdf --pages . even.pdf -- all.pdf
  2121 + OR
  2122 + qpdf --collate --empty --pages odd.pdf even.pdf -- all.pdf
  2123 +
  2124 +- When collating, any number of files and page ranges can be
  2125 + specified. If any file has fewer pages, that file is just skipped
  2126 + when its pages have all been included. For example, if you ran
  2127 +
  2128 + ::
  2129 +
  2130 + qpdf --collate --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf r1 -- out.pdf
  2131 +
  2132 + you would get the following pages in this order:
  2133 +
  2134 + - a.pdf page 1
  2135 +
  2136 + - b.pdf page 6
  2137 +
  2138 + - c.pdf last page
  2139 +
  2140 + - a.pdf page 2
  2141 +
  2142 + - b.pdf page 5
  2143 +
  2144 + - a.pdf page 3
  2145 +
  2146 + - b.pdf page 4
  2147 +
  2148 + - a.pdf page 4
  2149 +
  2150 + - a.pdf page 5
  2151 +
  2152 +- You can specify a numeric argument to :qpdf:ref:`--collate`. With
  2153 + :samp:`--collate={n}`, pull groups of :samp:`{n}` pages from each
  2154 + file, as always, stopping when there are no more pages. For example,
  2155 + if you ran
  2156 +
  2157 + ::
  2158 +
  2159 + qpdf --collate=2 --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf r1 -- out.pdf
  2160 +
  2161 + you would get the following pages in this order:
  2162 +
  2163 + - a.pdf page 1
  2164 +
  2165 + - a.pdf page 2
  2166 +
  2167 + - b.pdf page 6
  2168 +
  2169 + - b.pdf page 5
  2170 +
  2171 + - c.pdf last page
  2172 +
  2173 + - a.pdf page 3
  2174 +
  2175 + - a.pdf page 4
  2176 +
  2177 + - b.pdf page 4
  2178 +
  2179 + - a.pdf page 5
  2180 +
  2181 +- Take pages 1 through 5 from :file:`file1.pdf` and pages 11 through
  2182 + 15 in reverse from :file:`file2.pdf`, taking document-level metadata
  2183 + from :file:`file2.pdf`.
  2184 +
  2185 + ::
  2186 +
  2187 + qpdf file2.pdf --pages file1.pdf 1-5 . 15-11 -- outfile.pdf
  2188 +
  2189 +- Here's a more contrived example. If, for some reason, you wanted to
  2190 + take the first page of an encrypted file called
  2191 + :file:`encrypted.pdf` with password ``pass`` and repeat it twice in
  2192 + an output file without any shared data between the two copies of
  2193 + page 1, and if you wanted to drop document-level metadata but
  2194 + preserve encryption, you could run
  2195 +
  2196 + ::
  2197 +
  2198 + qpdf --empty --copy-encryption=encrypted.pdf \
  2199 + --encryption-file-password=pass \
  2200 + --pages encrypted.pdf --password=pass 1 \
  2201 + ./encrypted.pdf --password=pass 1 -- \
  2202 + outfile.pdf
  2203 +
  2204 + Note that we had to specify the password all three times because
  2205 + giving a password as :qpdf:ref:`--encryption-file-password` doesn't
  2206 + count for page selection, and as far as qpdf is concerned,
  2207 + :file:`encrypted.pdf` and :file:`./encrypted.pdf` are separate
  2208 + files. (This is by design. See :ref:`page-limitations` for a
  2209 + discussion.) These are all corner cases that most users should
  2210 + hopefully never have to be bothered with.
  2211 +
  2212 +.. _page-limitations:
  2213 +
  2214 +Limitations
  2215 +~~~~~~~~~~~
  2216 +
  2217 +With the exception of page labels (page numbers), :command:`qpdf`
  2218 +doesn't yet have full support for handling document-level data as it
  2219 +relates to pages. Certain document-level features such as form fields,
  2220 +outlines (bookmarks), and article tags among others, are copied in
  2221 +their entirety from the primary input file. Starting with qpdf version
  2222 +8.3, page labels are preserved from all files unless
  2223 +:qpdf:ref:`--remove-page-labels` is specified.
  2224 +
  2225 +.. If updating this after limitations are removed or reduced,
  2226 + recheck --split-pages as well.
  2227 +
  2228 +It is expected that a future version of :command:`qpdf` will have more
  2229 +complete and configurable behavior regarding document-level metadata.
  2230 +In the meantime, semantics of splitting and merging vary across
  2231 +features. For example, the document's outlines (bookmarks) point to
  2232 +actual page objects, so if you select some pages and not others,
  2233 +bookmarks that point to pages that are in the output file will work,
  2234 +and remaining bookmarks will not work. If you don't want to preserve
  2235 +the primary file's metadata, use :qpdf:ref:`--empty` as the primary
  2236 +input file.
  2237 +
  2238 +Visit `qpdf issues labeled with "pages"
  2239 +<https://github.com/qpdf/qpdf/issues?q=is%3Aopen+is%3Aissue+label%3Apages>`__
  2240 +or look at the :file:`TODO` file in the qpdf source distribution for
  2241 +some of the ideas.
  2242 +
  2243 +.. NOTE:
  2244 +
  2245 + The workaround described in the following paragraph is mentioned in
  2246 + the documentation in more than one place. Searching for ./ should
  2247 + help find them. It is also in the test suite. I believe there are
  2248 + several valid use cases for doing this, and so it is my intention
  2249 + to leave the behavior of treating different paths to the same file
  2250 + as separate even if the above limitations are removed. See also
  2251 + https://github.com/qpdf/qpdf/issues/399
  2252 +
  2253 +Prior to :command:`qpdf` version 8.4, it was not possible to specify
  2254 +the same page from the same file directly more than once, and a
  2255 +workaround of specifying the same file in more than one way was
  2256 +required. Version 8.4 removes this limitation, but when the same page
  2257 +is copied more than once, all its data is shared between the pages.
  2258 +Sometimes this is fine, but sometimes it may not work correctly,
  2259 +particularly if there are form fields or you intend to perform other
  2260 +modifications on one of the pages. A future version of qpdf should
  2261 +address this more completely. You can work around this by specifying
  2262 +the same file in two different ways. For example, :command:`qpdf
  2263 +in.pdf --pages . 1 ./in.pdf 1 -- out.pdf` would create a file with two
  2264 +copies of the first page of the input, and the two copies would not
  2265 +share any objects in common. This includes fonts, images, and anything
  2266 +else the page references.
  2267 +
  2268 +.. _overlay-underlay:
  2269 +
  2270 +Overlay and Underlay
  2271 +--------------------
1458 2272  
1459   -:samp:`--show-encryption`
1460   - Shows document encryption parameters. Also shows the document's user
1461   - password if the owner password is given.
  2273 +.. help-topic overlay-underlay: overlay/underlay pages from other files
  2274 +
  2275 + These options allow pages from another file to be overlaid or
  2276 + underlaid on the primary output. Overlaid pages are drawn on top of
  2277 + the destination page and may obscure the page. Underlaid pages are
  2278 + drawn below the destination page. Usage:
  2279 +
  2280 + {--overlay | --underlay } file
  2281 + [ --password=password ]
  2282 + [ --to=page-range ]
  2283 + [ --from=[page-range] ]
  2284 + [ --repeat=page-range ]
  2285 + --
  2286 +
  2287 + Note the use of "--" by itself to terminate overlay/underlay options.
  2288 +
  2289 + For overlay and underlay, a file and optional password are specified, along
  2290 + with a series of optional page ranges. The default behavior is that each
  2291 + page of the overlay or underlay file is imposed on the corresponding page
  2292 + of the primary output until it runs out of pages, and any extra pages are
  2293 + ignored. You can also give a page range with --repeat to cause
  2294 + those pages to be repeated after the original pages are exhausted.
  2295 +
  2296 + Run qpdf --help=page-ranges for help with page ranges.
  2297 +
  2298 +You can use :command:`qpdf` to overlay or underlay pages from other
  2299 +files onto the output generated by qpdf. Specify overlay or underlay
  2300 +as follows:
  2301 +
  2302 +::
  2303 +
  2304 + { --overlay | --underlay } file [ options ] --
  2305 +
  2306 +Overlay and underlay options are processed late, so they can be
  2307 +combined with other options like merging and will apply to the final
  2308 +output. The ``--overlay`` and ``--underlay`` options work the same
  2309 +way, except underlay pages are drawn underneath the page to which they
  2310 +are applied, possibly obscured by the original page, and overlay pages
  2311 +are drawn on top of the page to which they are applied, possibly
  2312 +obscuring the page. You can combine overlay and underlay.
  2313 +
  2314 +The default behavior of overlay and underlay is that pages are taken
  2315 +from the overlay/underlay file in sequence and applied to
  2316 +corresponding pages in the output until there are no more output
  2317 +pages. If the overlay or underlay file runs out of pages, remaining
  2318 +output pages are left alone. This behavior can be modified by options,
  2319 +which are provided between the ``--overlay`` or ``--underlay`` flag
  2320 +and the ``--`` option. The following options are supported:
  2321 +
  2322 +.. qpdf:option:: --to=page-range
  2323 +
  2324 + .. help: destination pages for underlay/overlay
  2325 +
  2326 + Specify the range of pages in the primary output to apply
  2327 + overlay/underlay to. See qpdf --help=page-ranges for help with
  2328 + the page range syntax.
  2329 +
  2330 + Specify a page range (see :ref:`page-ranges`) that indicates which
  2331 + pages in the output should have the overlay/underlay applied. If not
  2332 + specified, overlay/underlay are applied to all pages.
  2333 +
  2334 +.. qpdf:option:: --from=[page-range]
  2335 +
  2336 + .. help: source pages for underlay/overlay
  2337 +
  2338 + Specify pages from the overlay/underlay file that are applied to
  2339 + the destination pages. See qpdf --help=page-ranges for help
  2340 + with the page range syntax. The page range may be omitted
  2341 + if --repeat is used.
  2342 +
  2343 +Specify a page range that indicates which pages in the
  2344 +overlay/underlay file will be used for overlay or underlay. If not
  2345 +specified, all pages will be used. This can be left empty by omitting
  2346 +:samp:`{page-range}` if :qpdf:ref:`--repeat` is used.
  2347 +
  2348 +.. qpdf:option:: --repeat=page-range
  2349 +
  2350 + .. help: overlay/underlay pages to repeat
  2351 +
  2352 + Specify pages from the overlay/underlay that are repeated after
  2353 + "from" pages have been exhausted. See qpdf --help=page-ranges
  2354 + for help with the page range syntax.
  2355 +
  2356 +Specify an optional page range that indicates which pages in the
  2357 +overlay/underlay file will be repeated after the "from" pages are used
  2358 +up. If you want to repeat a range of pages starting with the
  2359 +first page of output, you can explicitly use ``--from=``.
  2360 +
  2361 +Examples
  2362 +~~~~~~~~
  2363 +
  2364 +- Overlay the first three pages from file :file:`o.pdf` onto the first
  2365 + three pages of the output, then overlay page 4 from :file:`o.pdf`
  2366 + onto pages 4 and 5 of the output. Leave remaining output pages
  2367 + untouched.
  2368 +
  2369 + ::
  2370 +
  2371 + qpdf in.pdf --overlay o.pdf --to=1-5 --from=1-3 --repeat=4 -- out.pdf
  2372 +
  2373 +
  2374 +- Underlay page 1 of :file:`footer.pdf` on all odd output pages, and
  2375 + underlay page 2 of :file:`footer.pdf` on all even output pages.
  2376 +
  2377 + ::
  2378 +
  2379 + qpdf in.pdf --underlay footer.pdf --from= --repeat=1,2 -- out.pdf
  2380 +
  2381 +- Combine two files and overlay the single page from watermark.pdf on
  2382 + the result.
  2383 +
  2384 + ::
  2385 +
  2386 + qpdf --empty --pages a.pdf b.pdf -- \
  2387 + --overlay watermark.pdf --from= --repeat=1 -- out.pdf
  2388 +
  2389 +.. _attachments:
  2390 +
  2391 +Embedded Files/Attachments
  2392 +--------------------------
  2393 +
  2394 +.. help-topic attachments: work with embedded files
  2395 +
  2396 + It is possible to list, add, or delete embedded files (also known
  2397 + as attachments) and to copy attachments from other files. See help
  2398 + on individual options for details. Run qpdf --help=add-attachment
  2399 + for additional details about adding attachments.
  2400 +
  2401 +It is possible to list, add, or delete embedded files (also known as
  2402 +attachments) and to copy attachments from other files.
  2403 +
  2404 +Related Options
  2405 +~~~~~~~~~~~~~~~
  2406 +
  2407 +.. qpdf:option:: --list-attachments
  2408 +
  2409 + .. help: list embedded files
  2410 +
  2411 + Show the key and stream number for each embedded file. Combine
  2412 + with --verbose for more detailed information.
  2413 +
  2414 + Show the *key* and stream number for each embedded file. With
  2415 + :qpdf:ref:`--verbose`, additional information, including preferred
  2416 + file name, description, dates, and more are also displayed. The key
  2417 + is usually but not always equal to the file name and is needed by
  2418 + some of the other options.
  2419 +
  2420 +.. qpdf:option:: --show-attachment=key
  2421 +
  2422 + .. help: export an embedded file
  2423 +
  2424 + Write the contents of the specified attachment to standard
  2425 + output as binary data. Get the key with --list-attachments.
  2426 +
  2427 + Write the contents of the specified attachment to standard output
  2428 + as binary data. The key should match one of the keys shown by
  2429 + :qpdf:ref:`--list-attachments`. If this option is given more than
  2430 + once, only the last attachment will be shown.
  2431 +
  2432 +.. qpdf:option:: --add-attachment file options --
  2433 +
  2434 + .. help: start add attachment options
  2435 +
  2436 + The --add-attachment flag and its options may be repeated to add
  2437 + multiple attachments. Run qpdf --help=add-attachment for details.
  2438 +
  2439 + This flag starts add attachment options, which are used to add
  2440 + attachments to a file.
  2441 +
  2442 + The ``--add-attachment`` flag and its options may be repeated to
  2443 + add multiple attachments. Please see :ref:`add-attachment` for
  2444 + additional details.
  2445 +
  2446 +.. qpdf:option:: --remove-attachment=key
  2447 +
  2448 + .. help: remove an embedded file
  2449 +
  2450 + Remove an embedded file using its key. Get the key with
  2451 + --list-attachments.
  2452 +
  2453 + Remove the specified attachment. This doesn't only remove the
  2454 + attachment from the embedded files table but also clears out the
  2455 + file specification to ensure that the attachment is actually not
  2456 + present in the output file. That means that any potential internal
  2457 + links to the attachment will be broken. Run with
  2458 + :qpdf:ref:`--verbose` to see the status of the removal. Use
  2459 + :qpdf:ref:`--list-attachments` to find the attachment key. This
  2460 + option may be repeated to remove multiple attachments.
  2462 +
  2463 +.. qpdf:option:: --copy-attachments-from file options --
  2464 +
  2465 + .. help: start copy attachment options
  2466 +
  2467 + The --copy-attachments-from flag and its options may be repeated
  2468 + to copy attachments from multiple files. Run
  2469 + qpdf --help=copy-attachments for details.
  2470 +
  2471 + This flag starts copy attachment options, which are used to copy
  2472 + attachments from other files.
  2473 +
  2474 + The ``--copy-attachments-from`` flag and its options may be
  2475 + repeated to copy attachments from multiple files. Please see
  2476 + :ref:`copy-attachments` for additional details.
  2477 +
  2478 +.. _pdf-dates:
  2479 +
  2480 +PDF Date Format
  2481 +~~~~~~~~~~~~~~~
  2482 +
  2483 +.. help-topic pdf-dates: PDF date format
  2484 +
  2485 + When a date is required, the date should conform to the PDF date
  2486 + format specification, which is "D:yyyymmddhhmmssz" where "z" is
  2487 + either literally upper case "Z" for UTC or a timezone offset in
  2488 + the form "-hh'mm'" or "+hh'mm'". Negative timezone offsets indicate
  2489 + time before UTC. Positive offsets indicate how far after. For
  2490 + example, US Eastern Standard Time (America/New_York) is "-05'00'",
  2491 + and Indian Standard Time (Asia/Calcutta) is "+05'30'".
  2492 +
  2493 + Examples:
  2494 + - D:20210207161528-05'00' February 7, 2021 at 4:15:28 p.m.
  2495 + - D:20210207211528Z February 7, 2021 at 21:15:28 UTC
  2496 +
  2497 +When a date is required, the date should conform to the PDF date
  2498 +format specification, which is :samp:`D:{yyyymmddhhmmssz}` where
  2499 +:samp:`{z}` is either literally upper case ``Z`` for UTC or a
  2500 +timezone offset in the form :samp:`{-hh'mm'}` or :samp:`{+hh'mm'}`.
  2501 +Negative timezone offsets indicate time before UTC. Positive offsets
  2502 +indicate how far after. For example, US Eastern Standard Time
  2503 +(America/New_York) is ``-05'00'``, and Indian Standard Time
  2504 +(Asia/Calcutta) is ``+05'30'``.
  2505 +
  2506 +Examples:
  2507 + - ``D:20210207161528-05'00'``: February 7, 2021 at 4:15:28 p.m.
  2508 +
  2509 + - ``D:20210207211528Z``: February 7, 2021 at 21:15:28 UTC
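
As a rough illustration of the format described above, the following shell sketch builds PDF date strings from the system clock. The function names ``pdf_date_utc`` and ``pdf_date_local`` are hypothetical helpers, not part of qpdf, and the sketch assumes a :command:`date` command that supports the ``%z`` conversion (as GNU and BSD :command:`date` do).

```shell
# Hypothetical helpers (not part of qpdf) that print the current time
# in PDF date format.

# UTC form, e.g. D:20210207211528Z
pdf_date_utc() {
    date -u '+D:%Y%m%d%H%M%SZ'
}

# Local form with a timezone offset, e.g. D:20210207161528-05'00'
pdf_date_local() {
    now=$(date '+%Y%m%d%H%M%S %z')   # e.g. "20210207161528 -0500"
    stamp=${now% *}                  # timestamp part before the space
    zone=${now#* }                   # numeric offset such as -0500
    hours=${zone%??}                 # sign and hours, e.g. -05
    minutes=${zone#???}              # minutes, e.g. 00
    printf "D:%s%s'%s'\n" "$stamp" "$hours" "$minutes"
}
```

The result could then be passed to an option that expects a date, for example ``--creationdate="$(pdf_date_utc)"`` between ``--add-attachment`` and its terminating ``--``.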
  2510 +
  2511 +.. _add-attachment:
  2512 +
  2513 +Options for Adding Attachments
  2514 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2515 +
  2516 +.. help-topic add-attachment: attach (embed) files
  2517 +
  2518 + The options listed below appear between --add-attachment and its
  2519 + terminating "--".
  2520 +
  2521 +These options are valid between :qpdf:ref:`--add-attachment` and ``--``.
  2522 +
  2523 +.. qpdf:option:: --key=key
  2524 +
  2525 + .. help: specify attachment key
  2526 +
  2527 + Specify the key to use for the attachment in the embedded files
  2528 + table. It defaults to the last element of the attached file's
  2529 + filename.
  2530 +
  2531 + Specify the key to use for the attachment in the embedded files
  2532 + table. It defaults to the last element of the attached file's
  2533 + filename.
  2534 +
  2535 +.. qpdf:option:: --filename=name
  2536 +
  2537 + .. help: set attachment's displayed filename
  2538 +
  2539 + Specify the filename to be used for the attachment. This is what
  2540 + is usually displayed to the user and is the name most graphical
  2541 + PDF viewers will use when saving a file. It defaults to the last
  2542 + element of the attached file's filename.
  2543 +
  2544 + Specify the filename to be used for the attachment. This is what is
  2545 + usually displayed to the user and is the name most graphical PDF
  2546 + viewers will use when saving a file. It defaults to the last
  2547 + element of the attached file's filename.
  2548 +
  2549 +.. qpdf:option:: --creationdate=date
  2550 +
  2551 + .. help: set attachment's creation date
  2552 +
  2553 + Specify the attachment's creation date in PDF format; defaults
  2554 + to the current time. Run qpdf --help=pdf-dates for information
  2555 + about the date format.
  2556 +
  2557 + Specify the attachment's creation date in PDF format; defaults to
  2558 + the current time. See :ref:`pdf-dates` for information about the
  2559 + date format.
  2560 +
  2561 +.. qpdf:option:: --moddate=date
  2562 +
  2563 + .. help: set attachment's modification date
  2564 +
  2565 + Specify the attachment's modification date in PDF format;
  2566 + defaults to the current time. Run qpdf --help=pdf-dates for
  2567 + information about the date format.
  2568 +
  2569 + Specify the attachment's modification date in PDF format; defaults
  2570 + to the current time. See :ref:`pdf-dates` for information about the
  2571 + date format.
  2572 +
  2573 +.. qpdf:option:: --mimetype=type/subtype
  2574 +
  2575 + .. help: attachment mime type (e.g. application/pdf)
  2576 +
  2577 + Specify the mime type for the attachment, such as text/plain,
  2578 + application/pdf, image/png, etc.
  2579 +
  2580 + Specify the mime type for the attachment, such as ``text/plain``,
  2581 + ``application/pdf``, ``image/png``, etc. The qpdf library does not
  2582 + automatically determine the mime type. In a UNIX-like environment,
  2583 + the :command:`file` command can often provide this information. On
  2584 + macOS, you can use :samp:`file -I {filename}`. On Linux, it's
  2585 + :samp:`file -i {filename}`.
  2586 +
  2587 + Implementation note: the mime type appears in a field called
  2588 + ``/Subtype`` in the PDF file, but that field actually includes the
  2589 + full type and subtype of the mime type. This is because ``/Type``
  2590 + already means something else in PDF.
  2591 +
  2592 +.. qpdf:option:: --description="text"
  2593 +
  2594 + .. help: set attachment's description
  2595 +
  2596 + Supply descriptive text for the attachment, displayed by some
  2597 + PDF viewers.
  2598 +
  2599 + Supply descriptive text for the attachment, displayed by some PDF
  2600 + viewers.
  2601 +
  2602 +.. qpdf:option:: --replace
  2603 +
  2604 + .. help: replace attachment with same key
  2605 +
  2606 + Indicate that any existing attachment with the same key should
  2607 + be replaced by the new attachment. Otherwise, qpdf gives an
  2608 + error if an attachment with that key is already present.
  2609 +
  2610 + Indicate that any existing attachment with the same key should be
  2611 + replaced by the new attachment. Otherwise, :command:`qpdf` gives an
  2612 + error if an attachment with that key is already present.
  2613 +
  2614 +.. _copy-attachments:
  2615 +
  2616 +Options for Copying Attachments
  2617 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2618 +
  2619 +.. help-topic copy-attachments: copy attachments from another file
  2620 +
  2621 + The options listed below appear between --copy-attachments-from and
  2622 + its terminating "--".
  2623 +
  2624 + To copy attachments from a password-protected file, use
  2625 + the --password option after the file name.
  2626 +
  2627 +Options in this section are valid between
  2628 +:qpdf:ref:`--copy-attachments-from` and ``--``.
  2629 +
  2630 +.. qpdf:option:: --prefix=prefix
  2631 +
  2632 + .. help: key prefix for copying attachments
  2633 +
  2634 + Prepend a prefix to each key; may be needed if there are
  2635 + duplicate attachment keys. This affects the key only, not the
  2636 + file name.
  2637 +
  2638 + Only required if the file from which attachments are being copied
  2639 + has attachments with keys that conflict with attachments already
  2640 + in the file. In this case, the specified prefix will be prepended
  2641 + to each key. This affects only the key in the embedded files
  2642 + table, not the file name. The PDF specification doesn't preclude
  2643 + multiple attachments having the same file name.
  2644 +
  2645 +.. _inspection-options:
  2646 +
  2647 +PDF Inspection
  2648 +--------------
  2649 +
  2650 +These options provide tools for inspecting PDF files. When any of the
  2651 +options in this section are specified, no output file should be given.
  2652 +
  2653 +Related Options
  2654 +~~~~~~~~~~~~~~~
  2655 +
  2656 +.. qpdf:option:: --is-encrypted
  2657 +
  2658 + .. help: silently test whether a file is encrypted
  2659 +
  2660 + Silently exit with a code indicating the file's encryption status:
  2661 +
  2662 + 0: the file is encrypted
  2663 + 1: not used
  2664 + 2: the file is not encrypted
  2665 +
  2666 + This can be used with password-protected files even if you don't
  2667 + know the password.
  2668 +
  2669 + Silently exit with a code indicating the file's encryption status:
  2670 +
  2671 + - ``0``: the file is encrypted
  2672 +
  2673 + - ``1``: not used
  2674 +
  2675 + - ``2``: the file is not encrypted
  2676 +
  2677 + This option can be used for password-protected files even if you
  2678 + don't know the password.
  2679 +
  2680 + This option is useful for shell scripts. Other options are ignored
  2681 + if this is given. This option is mutually exclusive with
  2682 + :qpdf:ref:`--requires-password`. Both this option and
  2683 + :qpdf:ref:`--requires-password` exit with status ``2`` for
  2684 + non-encrypted files.
  2685 +
  2686 +.. qpdf:option:: --requires-password
  2687 +
  2688 + .. help: silently test a file's password
  2689 +
  2690 + Silently exit with a code indicating the file's password status:
  2691 +
  2692 + 0: a password, other than as supplied, is required
  2693 + 1: not used
  2694 + 2: the file is not encrypted
  2695 + 3: the file is encrypted, and correct password (if any) has been supplied
  2696 +
  2697 + Silently exit with a code indicating the file's password status:
  2698 +
  2699 + - ``0``: a password, other than as supplied, is required
  2700 +
  2701 + - ``1``: not used
  2702 +
  2703 + - ``2``: the file is not encrypted
  2704 +
  2705 + - ``3``: the file is encrypted, and correct password (if any) has
  2706 + been supplied
  2707 +
  2708 + Use with the :qpdf:ref:`--password` option to specify the password
  2709 + to test.
  2710 +
  2711 + The choice of exit status ``0`` to mean that a password is required
  2712 + is to enable code like
  2713 +
  2714 + .. code-block:: bash
  2715 +
  2716 + if qpdf --requires-password file.pdf; then
  2717 + # prompt for password
  2718 + fi
  2719 +
  2720 + If a password is supplied with :qpdf:ref:`--password`, that
  2721 + password is used to open the file just as with any normal
  2722 + invocation of :command:`qpdf`. That means that this option can be
  2723 + combined with :qpdf:ref:`--password` to check the
  2724 + correctness of the password. In that case, an exit status of ``3``
  2725 + means the file works with the supplied password. This option is
  2726 + mutually exclusive with :qpdf:ref:`--is-encrypted`. Both this
  2727 + option and :qpdf:ref:`--is-encrypted` exit with status ``2`` for
  2728 + non-encrypted files.
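
A script can branch on the exit statuses listed above. The following is a minimal sketch; ``describe_password_status`` is a hypothetical helper name, and the messages are illustrative only.

```shell
# Hypothetical helper (not part of qpdf): translate the exit status of
# "qpdf --requires-password" into a short message.
describe_password_status() {
    case "$1" in
        0) echo "a password other than the one supplied is required" ;;
        2) echo "the file is not encrypted" ;;
        3) echo "the file is encrypted and the supplied password (if any) works" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# Example use (assumes file.pdf exists and $password holds a candidate):
#   qpdf --requires-password --password="$password" file.pdf
#   describe_password_status $?
```

Passing the status as an argument keeps the helper independent of how :command:`qpdf` was invoked, so the same function works with or without :qpdf:ref:`--password`.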
  2729 +
  2730 +.. qpdf:option:: --check
  2731 +
  2732 + .. help: partially check whether PDF is valid
  2733 +
  2734 + Check the structure of the PDF file as well as a number of other
  2735 + aspects of the file, and write information about the file to
  2736 + standard output. Note that qpdf does not perform any validation
  2737 + of the actual PDF page content or semantic correctness of the
  2738 + PDF file. It merely checks that the PDF file is syntactically
  2739 + valid.
  2740 +
  2741 + Check the file's structure as well as encryption, linearization,
  2742 + and encoding of stream data, and write information about the file
  2743 + to standard output. An exit status of ``0`` indicates syntactic
  2744 + correctness of the PDF file. Note that :samp:`--check` writes
  2745 + nothing to standard error when everything is valid, so if you are
  2746 + using this to programmatically validate files in bulk, it is safe
  2747 + to run without output redirected to :file:`/dev/null` and just
  2748 + check for a ``0`` exit code.
  2749 +
  2750 + A file for which :samp:`--check` reports no errors may still have
  2751 + errors in stream data content or may contain constructs that don't
  2752 + conform to the PDF specification, but it should be syntactically
  2753 + valid. If :samp:`--check` reports any errors, qpdf will exit with a
  2754 + status of ``2``. There are some recoverable conditions that
  2755 + :samp:`--check` detects. These are issued as warnings instead of
  2756 + errors. If qpdf finds no errors but finds warnings, it will exit
  2757 + with a status of ``3``. When :samp:`--check` is combined with other
  2758 + options, checks are always performed before any other options are
  2759 + processed. For erroneous files, :samp:`--check` will cause qpdf to
  2760 + attempt to recover, after which other options are effectively
  2761 + operating on the recovered file. Combining :samp:`--check` with
  2762 + other options in this way can be useful for manually recovering
  2763 + severely damaged files.
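
For bulk validation, the exit statuses described above can be mapped to a per-file result. This is a sketch under the assumption that the files to check are passed as arguments; ``classify_check_status`` and ``check_all`` are hypothetical names, not part of qpdf.

```shell
# Hypothetical helper: map the exit status of "qpdf --check" to a label.
classify_check_status() {
    case "$1" in
        0) echo "ok" ;;
        2) echo "errors" ;;
        3) echo "warnings" ;;
        *) echo "unknown" ;;
    esac
}

# Check each file given as an argument and print "label: file".
# Output from qpdf itself is discarded; only the exit status is used.
check_all() {
    for f in "$@"; do
        qpdf --check "$f" > /dev/null 2>&1
        status=$?
        printf '%s: %s\n' "$(classify_check_status "$status")" "$f"
    done
}
```

For example, ``check_all *.pdf`` would print one line per file, which can then be filtered for ``errors`` or ``warnings``.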
  2764 +
  2765 +.. qpdf:option:: --show-encryption
  2766 +
  2767 + .. help: information about encrypted files
  2768 +
  2769 + Show document encryption parameters. Also show the document's
  2770 + user password if the owner password is given and the file was
  2771 + encrypted using older encryption formats that allow user
  2772 + password recovery.
  2773 +
  2774 + This option shows document encryption parameters. It also shows the
  2775 + document's user password if the owner password is given and the
  2776 + file was encrypted using older encryption formats that allow user
  2777 + password recovery. (See :ref:`pdf-encryption` for a technical
  2778 + discussion of this feature.) The output of ``--show-encryption`` is
  2779 + included in the output of :qpdf:ref:`--check`.
  2780 +
  2781 +.. qpdf:option:: --show-encryption-key
  2782 +
  2783 + .. help: show key with --show-encryption
  2784 +
  2785 + When used with --show-encryption, causes the underlying
  2786 + encryption key to be displayed.
1462 2787  
1463   -:samp:`--show-encryption-key`
1464 2788 When encryption information is being displayed, as when
1465   - :samp:`--check` or
1466   - :samp:`--show-encryption` is given, display the
  2789 + :qpdf:ref:`--check` or :qpdf:ref:`--show-encryption` is given, display the
1467 2790 computed or retrieved encryption key as a hexadecimal string. This
1468 2791 value is not ordinarily useful to users, but it can be used as the
1469   - argument to :samp:`--password` if the
1470   - :samp:`--password-is-hex-key` is specified. Note
1471   - that, when PDF files are encrypted, passwords and other metadata are
1472   - used only to compute an encryption key, and the encryption key is
1473   - what is actually used for encryption. This enables retrieval of that
1474   - key.
1475   -
1476   -:samp:`--check-linearization`
1477   - Checks file integrity and linearization status.
1478   -
1479   -:samp:`--show-linearization`
1480   - Checks and displays all data in the linearization hint tables.
1481   -
1482   -:samp:`--show-xref`
1483   - Shows the contents of the cross-reference table in a human-readable
1484   - form. This is especially useful for files with cross-reference
1485   - streams which are stored in a binary format.
1486   -
1487   -:samp:`--show-object=trailer|obj[,gen]`
1488   - Show the contents of the given object. This is especially useful for
1489   - inspecting objects that are inside of object streams (also known as
1490   - "compressed objects").
1491   -
1492   -:samp:`--raw-stream-data`
1493   - When used along with the :samp:`--show-object`
1494   - option, if the object is a stream, shows the raw stream data instead
1495   - of object's contents.
1496   -
1497   -:samp:`--filtered-stream-data`
1498   - When used along with the :samp:`--show-object`
1499   - option, if the object is a stream, shows the filtered stream data
1500   - instead of object's contents. If the stream is filtered using filters
1501   - that qpdf does not support, an error will be issued.
1502   -
1503   -:samp:`--show-npages`
1504   - Prints the number of pages in the input file on a line by itself.
  2792 + argument to :qpdf:ref:`--password` if the :qpdf:ref:`--password-is-hex-key`
  2793 + is specified. Note that, when PDF files are encrypted, passwords
  2794 + and other metadata are used only to compute an encryption key, and
  2795 + the encryption key is what is actually used for encryption. This
  2796 + enables retrieval of that key. See :ref:`pdf-encryption` for a
  2797 + technical discussion.
  2798 +
  2799 +.. qpdf:option:: --check-linearization
  2800 +
  2801 + .. help: check linearization tables
  2802 +
  2803 + Check to see whether a file is linearized and, if so, whether
  2804 + the linearization hint tables are correct.
  2805 +
  2806 + Check to see whether a file is linearized and, if so, whether the
  2807 + linearization hint tables are correct. qpdf does not check all
  2808 + aspects of linearization. A linearized PDF file with linearization
  2809 + errors that is otherwise correct is almost always readable by a PDF
  2810 + viewer. As such, "errors" in PDF linearization are treated by
  2811 + :command:`qpdf` as warnings.
  2812 +
  2813 +.. qpdf:option:: --show-linearization
  2814 +
  2815 + .. help: show linearization hint tables
  2816 +
  2817 + Check and display all data in the linearization hint tables.
  2818 +
  2819 + Check and display all data in the linearization hint tables.
  2820 +
  2821 +.. qpdf:option:: --show-xref
  2822 +
  2823 + .. help: show cross reference data
  2824 +
  2825 + Show the contents of the cross-reference table or stream (object
  2826 + locations in the file) in a human-readable form. This is
  2827 + especially useful for files with cross-reference streams, which
  2828 + are stored in a binary format.
  2829 +
  2830 + Show the contents of the cross-reference table or stream in a
  2831 + human-readable form. The cross-reference data gives the offset of
  2832 + regular objects and the object stream ID and 0-based index within
  2833 + the object stream for compressed objects. This is especially useful
  2834 + for files with cross-reference streams, which are stored in a
  2835 + binary format. If the file is invalid and cross reference table
  2836 + reconstruction is performed, this option will show the information
  2837 + in the reconstructed table.
  2838 +
  2839 +.. qpdf:option:: --show-object=trailer|obj[,gen]
  2840 +
  2841 + .. help: show contents of an object
  2842 +
  2843 + Show the contents of the given object. This is especially useful
  2844 + for inspecting objects that are inside of object streams (also
  2845 + known as "compressed objects").
  2846 +
  2847 + Show the contents of the given object. This is especially useful
  2848 + for inspecting objects that are inside of object streams (also
  2849 + known as "compressed objects").
  2850 +
  2851 +.. qpdf:option:: --raw-stream-data
  2852 +
  2853 + .. help: show raw stream data
  2854 +
  2855 + When used with --show-object, if the object is a stream, write
  2856 + the raw (compressed) binary stream data to standard output
  2857 + instead of the object's contents. See also
  2858 + --filtered-stream-data.
  2859 +
  2860 + When used with :qpdf:ref:`--show-object`, if the object is a
  2861 + stream, write the raw (compressed) binary stream data to standard
  2862 + output instead of the object's contents. Avoid combining this with
  2863 + other inspection options to avoid commingling the stream data with
  2864 + other output. See also :qpdf:ref:`--filtered-stream-data`.
  2865 +
  2866 +.. qpdf:option:: --filtered-stream-data
  2867 +
  2868 + .. help: show filtered stream data
  2869 +
  2870 + When used with --show-object, if the object is a stream, write
  2871 + the filtered (uncompressed, potentially binary) stream data to
  2872 + standard output instead of the object's contents. See also
  2873 + --raw-stream-data.
  2874 +
  2875 + When used with :qpdf:ref:`--show-object`, if the object is a stream,
  2876 + write the filtered (uncompressed, potentially binary) stream data
  2877 + to standard output instead of the object's contents. If the stream
  2878 + is filtered using filters that qpdf does not support, an error will
  2879 + be issued. This option acts as if ``--decode-level=all`` was
  2880 + specified (see :qpdf:ref:`--decode-level`), so it will uncompress
  2881 + images compressed with supported lossy compression schemes. Avoid
  2882 + combining this with other inspection options to avoid commingling
  2883 + the stream data with other output.
  2884 +
  2885 + This option may be combined with :qpdf:ref:`--normalize-content`.
  2886 + If you do this, qpdf will attempt to run content normalization even
  2887 + if the stream is not a content stream, which will probably produce
  2888 + unusable results.
  2889 +
  2890 + See also :qpdf:ref:`--raw-stream-data`.
  2891 +
  2892 +.. qpdf:option:: --show-npages
  2893 +
  2894 + .. help: show number of pages
  2895 +
  2896 + Print the number of pages in the input file on a line by itself.
  2897 + Useful for scripts.
  2898 +
  2899 + Print the number of pages in the input file on a line by itself.
1505 2900 Since the number of pages appears by itself on a line, this option
1506 2901 can be useful for scripting if you need to know the number of pages
1507 2902 in a file.
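As a scripting illustration of the `--show-npages` text above, here is a minimal Python sketch. It assumes `qpdf` is on `PATH`; the helper names `parse_npages` and `count_pages` are hypothetical and not part of qpdf.

```python
import subprocess


def parse_npages(output: str) -> int:
    """Parse the output of `qpdf --show-npages`: one integer on a line by itself."""
    return int(output.strip())


def count_pages(path: str) -> int:
    """Return the page count of `path` (assumes qpdf is installed and on PATH)."""
    result = subprocess.run(
        ["qpdf", "--show-npages", path],
        check=True,  # raises CalledProcessError on any nonzero exit status
        capture_output=True,
        text=True,
    )
    return parse_npages(result.stdout)
```

Because `check=True` treats every nonzero exit status as a failure, a caller that wants to tolerate warnings would need to inspect the return code itself.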
1508 2903  
1509   -:samp:`--show-pages`
1510   - Shows the object and generation number for each page dictionary
  2904 +.. qpdf:option:: --show-pages
  2905 +
  2906 + .. help: display page dictionary information
  2907 +
  2908 + Show the object and generation number for each page dictionary
  2909 + object and for each content stream associated with the page.
  2910 +
  2911 + Show the object and generation number for each page dictionary
1511 2912 object and for each content stream associated with the page. Having
1512 2913 this information makes it more convenient to inspect objects from a
1513   - particular page.
  2914 + particular page. See also :qpdf:ref:`--with-images`.
  2915 +
  2916 +.. qpdf:option:: --with-images
  2917 +
  2918 + .. help: include image details with --show-pages
  2919 +
  2920 + When used with --show-pages, also shows the object and
  2921 + generation numbers for the image objects on each page.
  2922 +
  2923 + When used with :qpdf:ref:`--show-pages`, also shows the object and
  2924 + generation numbers for the image objects on each page.
  2925 +
  2926 +.. _json-options:
1514 2927  
1515   -:samp:`--with-images`
1516   - When used along with :samp:`--show-pages`, also shows
1517   - the object and generation numbers for the image objects on each page.
1518   - (At present, information about images in shared resource dictionaries
1519   - are not output by this command. This is discussed in a comment in the
1520   - source code.)
  2928 +JSON Options
  2929 +------------
  2930 +
  2931 +.. help-topic json: JSON output for PDF information
  2932 +
  2933 + Show information about the PDF file in JSON format. Please see the
  2934 + JSON chapter in the qpdf manual for details.
  2935 +
  2936 +It is possible to view information about PDF files in a JSON format.
  2937 +See :ref:`json` for details about the qpdf JSON format.
  2938 +
  2939 +Related Options
  2940 +~~~~~~~~~~~~~~~
  2941 +
  2942 +.. qpdf:option:: --json
  2943 +
  2944 + .. help: show file in json format
  2945 +
  2946 + Generate a JSON representation of the file. This is described in
  2947 + depth in the JSON section of the manual.
1521 2948  
1522   -:samp:`--json`
1523 2949 Generate a JSON representation of the file. This is described in
1524   - depth in :ref:`json`
  2950 + depth in :ref:`json`.
  2951 +
  2952 +.. qpdf:option:: --json-help
  2953 +
  2954 + .. help: show format of json output
  2955 +
  2956 + Describe the format of the JSON output.
1525 2957  
1526   -:samp:`--json-help`
1527 2958 Describe the format of the JSON output.
1528 2959  
1529   -:samp:`--json-key=key`
1530   - This option is repeatable. If specified, only top-level keys
1531   - specified will be included in the JSON output. If not specified, all
1532   - keys will be shown.
  2960 +.. qpdf:option:: --json-key=key
  2961 +
  2962 + .. help: restrict which keys are in json output
  2963 +
  2964 + This option is repeatable. If given, only the specified
  2965 + top-level keys will be included in the JSON output. Otherwise,
  2966 + all keys will be included.
  2967 +
  2968 + This option is repeatable. If given, only the specified top-level
  2969 + keys will be included in the JSON output. Otherwise, all keys will
  2970 + be included.
  2971 +
  2972 +.. qpdf:option:: --json-object=trailer|obj[,gen]
1533 2973  
1534   -:samp:`--json-object=trailer|obj[,gen]`
1535   - This option is repeatable. If specified, only specified objects will
1536   - be shown in the "``objects``" key of the JSON output. If absent, all
  2974 + .. help: restrict which objects are in JSON
  2975 +
  2976 + This option is repeatable. If given, only specified objects will
  2977 + be shown in the "objects" key of the JSON output. Otherwise, all
  2978 + objects will be shown.
  2979 +
  2980 + This option is repeatable. If given, only specified objects will
  2981 + be shown in the "``objects``" key of the JSON output. Otherwise, all
1537 2982 objects will be shown.
1538 2983  
1539   -:samp:`--check`
1540   - Checks file structure and well as encryption, linearization, and
1541   - encoding of stream data. A file for which
1542   - :samp:`--check` reports no errors may still have
1543   - errors in stream data content but should otherwise be structurally
1544   - sound. If :samp:`--check` any errors, qpdf will exit
1545   - with a status of 2. There are some recoverable conditions that
1546   - :samp:`--check` detects. These are issued as warnings
1547   - instead of errors. If qpdf finds no errors but finds warnings, it
1548   - will exit with a status of 3 (as of version 2.0.4). When
1549   - :samp:`--check` is combined with other options,
1550   - checks are always performed before any other options are processed.
1551   - For erroneous files, :samp:`--check` will cause qpdf
1552   - to attempt to recover, after which other options are effectively
1553   - operating on the recovered file. Combining
1554   - :samp:`--check` with other options in this way can be
1555   - useful for manually recovering severely damaged files. Note that
1556   - :samp:`--check` produces no output to standard output
1557   - when everything is valid, so if you are using this to
1558   - programmatically validate files in bulk, it is safe to run without
1559   - output redirected to :file:`/dev/null` and just
1560   - check for a 0 exit code.
1561   -
1562   -The :samp:`--raw-stream-data` and
1563   -:samp:`--filtered-stream-data` options are ignored
1564   -unless :samp:`--show-object` is given. Either of these
1565   -options will cause the stream data to be written to standard output. In
1566   -order to avoid commingling of stream data with other output, it is
1567   -recommend that these objects not be combined with other test/inspection
1568   -options.
1569   -
1570   -If :samp:`--filtered-stream-data` is given and
1571   -:samp:`--normalize-content=y` is also given, qpdf will
1572   -attempt to normalize the stream data as if it is a page content stream.
1573   -This attempt will be made even if it is not a page content stream, in
1574   -which case it will produce unusable results.
  2984 +.. _test-options:
  2985 +
  2986 +Options for Testing or Debugging
  2987 +--------------------------------
  2988 +
  2989 +.. help-topic testing: options for testing or debugging
  2990 +
  2991 + The options below are useful when writing automated test code that
  2992 + includes files created by qpdf or when testing qpdf itself.
  2993 +
  2994 +The options below are useful when writing automated test code that
  2995 +includes files created by qpdf or when testing qpdf itself. When
  2996 +changes are made to qpdf, care is taken to avoid gratuitously changing
  2997 +the output of PDF files. This is to make it easier to do direct
  2998 +comparisons in test suites with files created by qpdf. However, there
  2999 +are no guarantees that the PDF output won't change such as in the
  3000 +event of a bug fix or feature enhancement to some aspect of the output
  3001 +that qpdf creates.
  3002 +
  3003 +.. _idempotency:
  3004 +
  3005 +Idempotency
  3006 +~~~~~~~~~~~
  3007 +
  3008 +Note about idempotency of byte-for-byte content: there is no
  3009 +expectation that qpdf is idempotent in the general case. In other
  3010 +words, there is no expectation that, when qpdf is run on its own
  3011 +output, it will create *byte-for-byte* identical output, even though
  3012 +it will create semantically identical files. There are a variety of
  3013 +reasons for this including document ID generation, which includes a
  3014 +random element, as well as the interaction of stream length encoding
  3015 +with dictionary key sorting.
  3016 +
  3017 +It is possible to get idempotent behavior by using the
  3018 +:qpdf:ref:`--static-id` or :qpdf:ref:`--deterministic-id` option with
  3019 +qpdf and running it *three* times so that you are processing the
  3020 +output of qpdf on its own previous output. For example, in this
  3021 +sequence of commands:
  3022 +
  3023 +::
  3024 +
  3025 + qpdf any-file.pdf 1.pdf
  3026 + qpdf --static-id 1.pdf 2.pdf
  3027 + qpdf --static-id 2.pdf 3.pdf
  3028 +
  3029 +the files :file:`2.pdf` and :file:`3.pdf` should be *byte-for-byte*
  3030 +identical. The qpdf test suite relies on this behavior. See also
  3031 +:qpdf:ref:`--static-aes-iv`.
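The three-step sequence above could be automated along these lines. This is a sketch under stated assumptions: `qpdf` is on `PATH`, the working directory is writable, and the function names are hypothetical rather than anything the qpdf test suite actually uses.

```python
import subprocess
from pathlib import Path


def idempotency_commands(src: str, workdir: str) -> list[list[str]]:
    """Build the three qpdf invocations from the sequence shown above."""
    w = Path(workdir)
    one, two, three = (str(w / f"{n}.pdf") for n in (1, 2, 3))
    return [
        ["qpdf", src, one],
        ["qpdf", "--static-id", one, two],
        ["qpdf", "--static-id", two, three],
    ]


def is_idempotent(src: str, workdir: str) -> bool:
    """Run the sequence, then compare 2.pdf and 3.pdf byte for byte."""
    for cmd in idempotency_commands(src, workdir):
        subprocess.run(cmd, check=True)  # assumes qpdf is installed
    w = Path(workdir)
    return (w / "2.pdf").read_bytes() == (w / "3.pdf").read_bytes()
```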
  3032 +
  3033 +Related Options
  3034 +~~~~~~~~~~~~~~~
  3035 +
  3036 +.. qpdf:option:: --static-id
  3037 +
  3038 + .. help: use a fixed document ID
  3039 +
  3040 + Use a fixed value for the document ID. This is intended for
  3041 + testing only. Never use it for production files. See also
  3042 + qpdf --help=--deterministic-id.
  3043 +
  3044 + Use a fixed value for the document ID (``/ID`` in the trailer).
  3045 + **This is intended for testing only. Never use it for production
  3046 + files.** If you are trying to get the same ID each time for a given
  3047 + file and you are not generating encrypted files, consider using the
  3048 + :qpdf:ref:`--deterministic-id` option.
  3049 +
  3050 +.. qpdf:option:: --static-aes-iv
  3051 +
  3052 + .. help: use a fixed AES vector
  3053 +
  3054 + Use a static initialization vector for AES-CBC. This is intended
  3055 + for testing only so that output files can be reproducible. Never
  3056 + use it for production files. This option is not secure since it
  3057 + significantly weakens the encryption.
  3058 +
  3059 + Use a static initialization vector for AES-CBC. This is intended
  3060 + for testing only so that output files can be reproducible. Never
  3061 + use it for production files. **This option in particular is not
  3062 + secure since it significantly weakens the encryption.** When
  3063 + combined with :qpdf:ref:`--static-id` and using the three-step
  3064 + process described in :ref:`idempotency`, it is possible to create
  3065 + byte-for-byte idempotent output with PDF files that use 256-bit
  3066 + encryption to assist with creating reproducible test suites.
  3067 +
  3068 +.. qpdf:option:: --linearize-pass1=file
  3069 +
  3070 + .. help: save pass 1 of linearization
  3071 +
  3072 + Write the first pass of linearization to the named file. The
  3073 + resulting file is not a valid PDF file. This option is useful only
  3074 + for debugging qpdf.
  3075 +
  3076 + Write the first pass of linearization to the named file. *The
  3077 + resulting file is not a valid PDF file.* This option is useful only
  3078 + for debugging ``QPDFWriter``'s linearization code. When qpdf
  3079 + linearizes files, it writes the file in two passes, using the first
  3080 + pass to calculate sizes and offsets that are required for hint
  3081 + tables and the linearization dictionary. Ordinarily, the first pass
  3082 + is discarded. This option enables it to be captured, allowing
  3083 + inspection of the file before values calculated in pass 1 are
  3084 + inserted into the file for pass 2.
1575 3085  
1576 3086 .. _unicode-passwords:
1577 3087  
... ... @@ -1623,7 +3133,7 @@ you by interpreting them as UTF-8, you can use
1623 3133 :samp:`--password-mode=bytes` to suppress qpdf's
1624 3134 automatic behavior.
1625 3135  
1626   -The :samp:`--password-mode` option, as described earlier
  3136 +The :qpdf:ref:`--password-mode` option, as described earlier
1627 3137 in this chapter, can be used to change qpdf's interpretation of supplied
1628 3138 passwords. There are very few reasons to use this option. One would be
1629 3139 the unlikely case described in the previous paragraph in which the
... ... @@ -1654,7 +3164,7 @@ recovery methods should make qpdf transparently open most encrypted
1654 3164 files with the password supplied correctly but in the wrong coding
1655 3165 system. There are no real downsides to this behavior, but if you don't
1656 3166 want qpdf to do this, you can use the
1657   -:samp:`--suppress-password-recovery` option. One reason
  3167 +:qpdf:ref:`--suppress-password-recovery` option. One reason
1658 3168 to do that is to ensure that you know the exact password that was used
1659 3169 to encrypt the file.
1660 3170  
... ... @@ -1670,8 +3180,9 @@ will be addressed in a future version of qpdf. The ``QPDFWriter``
1670 3180 methods that enable encryption on the output file accept passwords as
1671 3181 strings of bytes.
1672 3182  
1673   -Please note that the :samp:`--password-is-hex-key`
1674   -option is unrelated to all this. This flag bypasses the normal process
1675   -of going from password to encryption string entirely, allowing the raw
1676   -encryption key to be specified directly. This is useful for forensic
1677   -purposes or for brute-force recovery of files with unknown passwords.
  3183 +Please note that the :qpdf:ref:`--password-is-hex-key` option is
  3184 +unrelated to all this. That flag bypasses the normal process of going
  3185 +from password to encryption string entirely, allowing the raw
  3186 +encryption key to be specified directly. That behavior is useful for
  3187 +forensic purposes or for brute-force recovery of files with unknown
  3188 +passwords and has nothing to do with the document's actual passwords.
... ...
manual/conf.py
... ... @@ -7,6 +7,10 @@
7 7 # To see the default sample conf.py, run sphinx-quickstart in an empty
8 8 # directory. Most of the original comments and options were removed.
9 9 import sphinx_rtd_theme # noQA F401
  10 +import os
  11 +import sys
  12 +
  13 +sys.path.append(os.path.abspath("./_ext"))
10 14  
11 15 project = 'QPDF'
12 16 copyright = '2005-2021, Jay Berkenbilt'
... ... @@ -16,6 +20,7 @@ release = '10.5.0'
16 20 version = release
17 21 extensions = [
18 22 'sphinx_rtd_theme',
  23 + 'qpdf',
19 24 ]
20 25 html_theme = 'sphinx_rtd_theme'
21 26 html_theme_options = {
... ...
manual/encryption.rst 0 → 100644
  1 +.. _pdf-encryption:
  2 +
  3 +PDF Encryption
  4 +==============
  5 +
  6 +This chapter discusses PDF encryption in a general way with an angle
  7 +toward how it works in :command:`qpdf`. This chapter is not intended
  8 +to replace the PDF specification. Please consult the spec for full
  9 +details.
  10 +
  11 +PDF Encryption Concepts
  12 +-----------------------
  13 +
  14 +Encryption
  15 + Encryption is the replacement of *clear text* with encrypted text,
  16 + also known as *ciphertext*. The clear text may be retrieved from the
  17 + ciphertext if the encryption key is known.
  18 +
  19 + PDF files consist of an object structure. PDF objects may be of a
  20 + variety of types including (among others) numbers, boolean values,
  21 + names, arrays, dictionaries, strings, and streams. In a PDF file,
  22 + only strings and streams are encrypted.
  23 +
  24 +Security Handler
  25 + Since the inception of PDF, there have been several modifications to
  26 + the way files are encrypted. Encryption is handled by a *security
  27 + handler*. The *standard security handler* is password-based. This is
  28 + the only security handler implemented by qpdf, and this material is
  29 + all focused on the standard security handler. There are various
  30 + flags that control the specific details of encryption with the
  31 + standard security handler. These are discussed below.
  32 +
  33 +Encryption Key
  34 + This refers to the actual key used by the encryption and decryption
  35 + algorithms. It is distinct from the password. The main encryption
  36 + key is generated at random and stored encrypted in the PDF file. The
  37 + passwords used to protect a PDF file, if any, are used to protect
  38 + the encryption key. This design makes it possible to use different
  39 + passwords (e.g., user and owner passwords) to retrieve the
  40 + encryption key or even to change the password on a file without
  41 + changing the encryption key. qpdf can expose the encryption key when
  42 + run with the :qpdf:ref:`--show-encryption-key` option and can accept
  43 + a hex-encoded encryption key in place of a password when run with
  44 + the :qpdf:ref:`--password-is-hex-key` option.
  45 +
  46 +Password Protection
  47 + Password protection is distinct from encryption. This point is often
  48 + misunderstood. A PDF file can be encrypted without being
  49 + password-protected. The intent of PDF encryption was that there
  50 + would be two passwords: a *user password* and an *owner password*.
  51 + Either password can be used to retrieve the encryption key. A
  52 + conforming reader is supposed to obey the security restrictions
  53 + if the file is opened using the user password but not if the file is
  54 + opened with the owner password. :command:`qpdf` makes no distinction
  55 + between which password is used to open the file. The distinction
  56 + made by conforming readers between the user and owner password is
  57 + what makes it common to create encrypted files with no password
  58 + protection. This is done by using the empty string as the user
  59 + password and some secret string as the owner password. When a user
  60 + opens the PDF file, the empty string is used to retrieve the
  61 + encryption key, making the file usable, but a conforming reader
  62 + restricts certain operations from the user.
  63 +
  64 +What does all this mean? Here are a few things to realize.
  65 +
  66 +- Since the user password and the owner password are both used to
  67 + recover the single encryption key, there is *fundamentally no way*
  68 + to prevent an application from disregarding the security
  69 + restrictions on a file. Any software that can read the encrypted
  70 + file at all has the encryption key. Therefore, the security of the
  71 + restrictions placed on PDF files is solely enforced by the software.
  72 + Any open source PDF reader could be trivially modified to ignore the
  73 + security restrictions on a file. The PDF specification is clear
  74 + about this point. This means that PDF restrictions on
  75 + non-password-protected files only restrict users who don't know how
  76 + to circumvent them.
  77 +
  78 +- If a file is password-protected, you have to know at least one of
  79 + the user or owner password to retrieve the encryption key. However,
  80 + in the case of 40-bit encryption, the actual encryption key is only
  81 + 5 bytes long and can be easily brute-forced. As such, files
  82 + encrypted with 40-bit encryption are not secure regardless of how
  83 + strong the password is. With 128-bit encryption, the default
  84 + security handler uses RC4 encryption, which is also known to be
  85 + insecure. As such, the only way to securely encrypt a PDF file using
  86 + the standard security handler (as of the last review of this chapter
  87 + in 2022) is to use AES encryption. This is the only supported
  88 + algorithm with 256-bit encryption, and it can be selected to be used
  89 + with 128-bit encryption as well. However there is no reason to use
  90 + 128-bit encryption with AES. If you are going to use AES, just use
  91 + 256-bit encryption instead. The security of a 256-bit AES-encrypted
  92 + PDF file with a strong password is comparable to using a
  93 + general-purpose encryption tool like :command:`gpg` or
  94 + :command:`openssl` to encrypt the PDF file with the same password,
  95 + but the advantage of using PDF encryption is that no software is
  96 + required beyond a regular PDF viewer.
  97 +
  98 +PDF Encryption Details
  99 +----------------------
  100 +
  101 +This section describes a few details about PDF encryption. It does not
  102 +describe all the details. For that, read the PDF specification. The
  103 +details presented here, however, should go a long way toward helping a
  104 +casual user/developer understand what's going on with encrypted PDF
  105 +files.
  106 +
  107 +Here are more concepts to understand.
  108 +
  109 +Algorithm parameters ``V`` and ``R``
  110 + There are two parameters that control the details of encryption
  111 + using the standard security handler: ``V`` and ``R``.
  112 +
  113 + ``V`` is a code specifying the algorithms that are used for
  114 + encrypting the file, handling keys, etc. It may have any of the
  115 + following values:
  116 +
  117 + - 1: The original algorithm, which encrypted files using 40-bit keys.
  118 +
  119 + - 2: An extension of the original algorithm allowing longer keys.
  120 + Introduced in PDF 1.4.
  121 +
  122 + - 3: An unpublished algorithm that permits file encryption key
  123 + lengths ranging from 40 to 128 bits. Introduced in PDF 1.4. qpdf
  124 + is believed to be able to read files with ``V`` = 3 but does not
  125 + write such files.
  126 +
  127 + - 4: An extension of the algorithm that allows it to be
  128 + parameterized by additional rules for handling strings and
  129 + streams. Introduced in PDF 1.5.
  130 +
  131 + - 5: An algorithm that allows specification of separate security
  132 + handlers for strings and streams as well as embedded files, and
  133 + which supports 256-bit keys. Introduced in PDF 1.7 extension level
  134 + 3 and later extended in extension level 8. This is the encryption
  135 + system in the PDF 2.0 specification, ISO-32000.
  136 +
  137 + ``R`` is a code specifying the revision of the standard handler. It
  138 + is tightly coupled with the value of ``V``. ``R`` may have any of
  139 + the following values:
  140 +
  141 + - 2: ``V`` must be 1
  142 +
  143 + - 3: ``V`` must be 2 or 3
  144 +
  145 + - 4: ``V`` must be 4
  146 +
  147 + - 5: ``V`` must be 5; this extension was never fully specified and
  148 + existed for a short time in some versions of Acrobat.
  149 + :command:`qpdf` is able to read and write this format, but it
  150 + should not be used for any purpose other than testing
  151 + compatibility with the format.
  152 +
  153 + - 6: ``V`` must be 5. This is the only value that is not deprecated
  154 + in the PDF 2.0 specification, ISO-32000.
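The coupling between ``R`` and ``V`` listed above can be captured in a small lookup. The following Python sketch transcribes that list only; the names `ALLOWED_V_FOR_R` and `is_valid_v_r` are hypothetical and do not correspond to any qpdf API.

```python
# Maps each R value to the V values it may be paired with,
# transcribed from the list above.
ALLOWED_V_FOR_R = {
    2: {1},
    3: {2, 3},
    4: {4},
    5: {5},  # never fully specified; intended for compatibility testing only
    6: {5},
}


def is_valid_v_r(v: int, r: int) -> bool:
    """Return True if the (V, R) pair is one of the combinations listed above."""
    return v in ALLOWED_V_FOR_R.get(r, set())
```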
  155 +
  156 +Encryption Dictionary
  157 + Encrypted PDF files have an encryption dictionary. There are several
  158 + fields, but these are the important ones for our purposes:
  159 +
  160 + - ``V`` and ``R`` as described above
  161 +
  162 + - ``O``, ``U``, ``OE``, ``UE``: values used by the algorithms that
  163 + recover the encryption key from the user and owner password. Which
  164 + of these are defined and how they are used vary based on the value
  165 + of ``R``.
  166 +
  167 + - ``P``: a bit field that describes which restrictions are in place.
  168 + This is discussed below in :ref:`security-restrictions`.
  169 +
  170 +Encryption Algorithms
  171 + PDF files may be encrypted with the obsolete, insecure RC4 algorithm
  172 + or the more secure AES algorithm. See also :ref:`weak-crypto` for a
  173 + discussion. 40-bit encryption always uses RC4. 128-bit can use
  174 + either RC4 (the default for compatibility reasons) or, starting with
  175 + PDF 1.6, AES. 256-bit encryption always uses AES.
  176 +
  177 +.. _security-restrictions:
  178 +
  179 +PDF Security Restrictions
  180 +-------------------------
  181 +
  182 +PDF security restrictions are described by a bit field whose value is
  183 +stored in the ``P`` field in the encryption dictionary. The value of
  184 +``P`` is used by the algorithms to recover the encryption key given
  185 +the password, which makes the value of ``P`` tamper-resistant.
  186 +
  187 +``P`` is a 32-bit integer, treated as a signed twos-complement number.
  188 +A 1 in any bit position means the permission is granted. The PDF
  189 +specification numbers the bits from 1 (least significant bit) to 32
  190 +(most significant bit) rather than the more customary 0 to 31. For
  191 +consistency with the spec, the remainder of this section uses the
  192 +1-based numbering.
  193 +
  194 +Only bits 3, 4, 5, 6, 9, 10, 11, and 12 are used. All other bits are
  195 +set to 1. Since bit 32 is always set to 1, the value of ``P`` is
  196 +always a negative number. (:command:`qpdf` recognizes a positive
  197 +number on behalf of buggy writers that treat ``P`` as unsigned. Such
  198 +files have been seen in the wild.)
  199 +
  200 +Here are the meanings of the bit positions. All bits not listed must
  201 +have the value 1 except bits 1 and 2, which must have the value 0.
  202 +However, the values of bits other than those in the table are ignored,
  203 +so having incorrect values probably doesn't break anything in most
  204 +cases. A value of 1 indicates that the permission is granted.
  205 +
  206 +- 3: for ``R`` = 2 printing; for ``R`` >= 3, printing at low
  207 + resolution
  208 +
  209 +- 4: modifying the document except as controlled by bits 6,
  210 + 9, and 11
  211 +
  212 +- 5: extracting text and graphics for purposes other than
  213 + accessibility to visually impaired users
  214 +
  215 +- 6: add or modify annotations, fill in interactive form fields;
  216 + if bit 4 is also set, create or modify interactive form fields
  217 +
  218 +- 9: for ``R`` >= 3, fill in interactive form fields even if bit 6 is
  219 + clear
  220 +
  221 +- 10: not used; formerly granted permission to extract material
  222 + for accessibility, but the specification now disallows restriction
  223 + of accessibility, and conforming readers are to treat this bit as if
  224 + it is set regardless of its value
  225 +
  226 +- 11: for ``R`` >= 3, assemble document including inserting, rotating,
  227 + or deleting pages or creating document outlines or thumbnail images
  228 +
  229 +- 12: for ``R`` >= 3, allow printing at full resolution
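To make the 1-based bit numbering concrete, the sketch below decodes a ``P`` value into the permission bits that are set. It is an illustration in Python, not qpdf's implementation; `PERMISSION_BITS` and `granted_bits` are hypothetical names, and the descriptions are abbreviated from the list above.

```python
# Bit number (1-based, as in the PDF specification) -> short description.
PERMISSION_BITS = {
    3: "print (low resolution only when R >= 3)",
    4: "modify, except as controlled by bits 6, 9, and 11",
    5: "extract text and graphics",
    6: "add or modify annotations, fill in form fields",
    9: "fill in form fields (R >= 3)",
    10: "accessibility (not used; readers treat it as set)",
    11: "assemble document (R >= 3)",
    12: "print at full resolution (R >= 3)",
}


def granted_bits(p: int) -> list[int]:
    """Return the permission bit numbers that are set to 1 in P."""
    # Masking yields the 32-bit pattern, so both the signed form and the
    # unsigned form written by some tools are accepted.
    unsigned = p & 0xFFFFFFFF
    return [bit for bit in PERMISSION_BITS if unsigned & (1 << (bit - 1))]
```

With this numbering, a ``P`` of -4 has every permission bit set, and -3904 has all eight permission bits cleared.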
  230 +
  231 +.. _qpdf-P:
  232 +
  233 +How qpdf handles security restrictions
  234 +--------------------------------------
  235 +
  236 +This section describes exactly what the qpdf library does with regard
  237 +to ``P`` based on the various settings of different security options.
  238 +
  239 +- Start with all bits set except bits 1 and 2, which are cleared
  240 +
  241 +- For ``R`` = 2:
  242 +
  243 + - ``--print=n``: clear bit 3
  244 +
  245 + - ``--modify=n``: clear bit 4
  246 +
  247 + - ``--extract=n``: clear bit 5
  248 +
  249 + - ``--annotate=n``: clear bit 6
  250 +
  251 +- For ``R`` >= 3:
  252 +
  253 + - ``--accessibility=n``: for ``R`` = 3, clear bit 10; otherwise,
  254 + ignore so bit 10 is always set if ``R`` >= 4. qpdf allows
  255 + creating files with bit 10 clear so that it can be used to create
  256 + test files to ensure that a conforming reader ignores the value of
  257 + the bit. You should never intentionally clear accessibility.
  258 +
  259 + - ``--extract=n``: clear bit 5
  260 +
  261 + - ``--print=none``: clear bits 3 and 12
  262 +
  263 + - ``--print=low``: clear bit 12
  264 +
  265 + - ``--modify=none``: clear bits 4, 6, 9, and 11
  266 +
  267 + - ``--modify=assembly``: clear bits 4, 6, and 9
  268 +
  269 + - ``--modify=form``: clear bits 4 and 6
  270 +
  271 + - ``--modify=annotate``: clear bit 4
  272 +
  273 + - ``--assemble=n``: clear bit 11
  274 +
  275 + - ``--annotate=n``: clear bit 6
  276 +
  277 + - ``--form=n``: clear bit 9
  278 +
  279 + - ``--modify-other=n``: clear bit 4
  280 +
  281 +Options to :command:`qpdf`, both at the CLI and library level, allow
  282 +more granular clearing of permission bits than do most tools,
  283 +including Adobe Acrobat. As such, PDF viewers may respond in
  284 +surprising ways based on options passed to qpdf. If you observe this,
  285 +it is probably not because of a bug in qpdf.
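
The rules listed in this section can be sketched in Python as follows. This is only a model of the list as written, not qpdf's actual code; ``R2_RULES``, ``R3_RULES``, and ``compute_p`` are hypothetical names, and option names are given without their leading dashes.

```python
# Bits cleared by each option value for R = 2, copied from the list above.
R2_RULES = {
    ("print", "n"): {3},
    ("modify", "n"): {4},
    ("extract", "n"): {5},
    ("annotate", "n"): {6},
}

# Bits cleared by each option value for R >= 3, copied from the list above.
R3_RULES = {
    ("extract", "n"): {5},
    ("print", "none"): {3, 12},
    ("print", "low"): {12},
    ("modify", "none"): {4, 6, 9, 11},
    ("modify", "assembly"): {4, 6, 9},
    ("modify", "form"): {4, 6},
    ("modify", "annotate"): {4},
    ("assemble", "n"): {11},
    ("annotate", "n"): {6},
    ("form", "n"): {9},
    ("modify-other", "n"): {4},
}


def compute_p(r, options):
    """Return P as a signed 32-bit integer for revision r (assumed >= 2).

    options maps option names to values, for example
    {"print": "low", "extract": "n"}.
    """
    cleared = {1, 2}  # bits 1 and 2 always start cleared
    rules = R2_RULES if r == 2 else R3_RULES
    for item in options.items():
        cleared |= rules.get(item, set())
    # --accessibility=n only has an effect for R = 3; it is ignored otherwise.
    if r == 3 and options.get("accessibility") == "n":
        cleared.add(10)
    p = 0xFFFFFFFF
    for bit in cleared:
        p &= ~(1 << (bit - 1))
    # Bit 32 is never cleared, so the signed value is always negative.
    return (p & 0xFFFFFFFF) - (1 << 32)
```

For example, ``compute_p(3, {})`` gives the starting value with only bits 1 and 2 cleared, and each option then clears further bits independently of the others.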
  286 +
  287 +.. _pdf-passwords:
  288 +
  289 +User and Owner Passwords
  290 +------------------------
  291 +
  292 +When you use qpdf to show encryption parameters and you open a file
  293 +with the owner password, sometimes qpdf reveals the user password, and
  294 +sometimes it doesn't. Here's why.
  295 +
  296 +For ``V`` < 5, the user password is actually stored in the PDF file
  297 +encrypted with a key that is derived from the owner password, and the
  298 +main encryption key is encrypted using a key derived from the user
  299 +password. When you open a PDF file, the reader first tries to treat
  300 +the given password as the user password, using it to recover the
  301 +encryption key. If that works, you're in with restrictions (assuming
  302 +the reader chooses to enforce them). If it doesn't work, then the
  303 +reader treats the password as the owner password, using it to recover
  304 +the user password, and then uses the user password to retrieve the
  305 +encryption key. This is why creating a file with the same user
  306 +password and owner password with ``V`` < 5 results in a file that some
  307 +readers will never allow you to open as the owner. Typically when a
  308 +reader encounters a file with ``V`` < 5, it will first attempt to
  309 +treat the empty string as a user password. If that works, the file is
  310 +encrypted but not password-protected. If it doesn't work, then a
  311 +password prompt is given. Creating a file with an empty owner password
  312 +is like creating a file with the same owner and user password: there
  313 +is no way to open the file as an owner.
  314 +
  315 +For ``V`` >= 5, the main encryption key is independently encrypted
  316 +using the user password and the owner password. There is no way to
  317 +recover the user password from the owner password. Restrictions are
  318 +imposed or not depending on which password was used. In this case, the
  319 +password supplied, if any, is tried both as the user password and the
  320 +owner password, and whichever works is used. Typically the password is
  321 +tried as the owner password first. (This is what the PDF specification
  322 +says to do.) As such, specifying a user password and leaving the owner
  323 +password blank results in a file that is opened as owner with no
  324 +password, effectively rendering the security restrictions useless.
  325 +This is why :command:`qpdf` requires you to pass
  326 +:qpdf:ref:`--allow-insecure` to create a file with an empty owner
  327 +password when 256-bit encryption is in use.
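
The order in which passwords are tried can be modeled with a toy Python function. This is a deliberately simplified sketch: plain string comparison stands in for the real key derivation, ``open_as`` is a hypothetical name, and the substitution of the user password for an empty owner password with ``V`` < 5 is taken from the PDF specification's algorithm for computing the owner entry rather than from the text above.

```python
def open_as(v, supplied, user_pw, owner_pw):
    """Return "user", "owner", or None for a toy encrypted file.

    v is the encryption V value, supplied is the password given to the
    reader, and user_pw and owner_pw are the passwords the file was
    created with.
    """
    if v < 5:
        # With V < 5, an empty owner password is replaced by the user
        # password when the file is created.
        effective_owner = owner_pw if owner_pw else user_pw
        # The reader tries the password as the user password first.
        if supplied == user_pw:
            return "user"
        if supplied == effective_owner:
            return "owner"
        return None
    # With V >= 5, the password is typically tried as the owner password first.
    if supplied == owner_pw:
        return "owner"
    if supplied == user_pw:
        return "user"
    return None
```

In this model, identical user and owner passwords with ``V`` < 5 always open as user, while an empty owner password with ``V`` >= 5 lets the empty string open the file as owner.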
... ...
manual/index.rst
... ... @@ -30,5 +30,11 @@ documentation, please visit `https://qpdf.readthedocs.io
30 30 design
31 31 linearization
32 32 object-streams
  33 + encryption
33 34 release-notes
34 35 acknowledgement
  36 +
  37 +Indices
  38 +=======
  39 +
  40 +* :ref:`qpdf-options`
... ...
manual/installation.rst
... ... @@ -208,7 +208,7 @@ files you need to build.
208 208 Runtime Crypto Provider Selection
209 209 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
210 210  
211   -You can use the :samp:`--show-crypto` option to
  211 +You can use the :qpdf:ref:`--show-crypto` option to
212 212 :command:`qpdf` to get a list of available crypto
213 213 providers. The default provider is always listed first, and the rest are
214 214 listed in lexical order. Each crypto provider is listed on a line by
... ...
manual/json.rst
... ... @@ -51,7 +51,7 @@ Compatibility
51 51  
52 52 Documentation
53 53 The :command:`qpdf` command can be invoked with the
54   - :samp:`--json-help` option. This will output a JSON
  54 + :qpdf:ref:`--json-help` option. This will output a JSON
55 55 structure that has the same structure as the JSON output that qpdf
56 56 generates, except that each field in the help output is a description
57 57 of the corresponding field in the JSON output. The specific
... ... @@ -134,7 +134,7 @@ There are a few limitations to be aware of with the JSON structure:
134 134 encoding. In other words, it's best if you don't try to use the JSON
135 135 format to extract binary strings from the PDF file, but if you really
136 136 had to, it could be done. Note that qpdf's
137   - :samp:`--show-object` option does not have this
  137 + :qpdf:ref:`--show-object` option does not have this
138 138 limitation and will reveal the string as encoded in the original
139 139 file.
140 140  
... ... @@ -150,9 +150,9 @@ be aware of:
150 150 - While qpdf guarantees that keys present in the help will be present
151 151 in the output, those fields may be null or empty if the information
152 152 is not known or absent in the file. Also, if you specify
153   - :samp:`--json-keys`, the keys that are not listed
  153 + :qpdf:ref:`--json-key`, the keys that are not listed
154 154 will be excluded entirely except for those that
155   - :samp:`--json-help` says are always present.
  155 + :qpdf:ref:`--json-help` says are always present.
156 156  
157 157 - In a few places, there are keys with names containing
158 158 ``pageposfrom1``. The values of these keys are null or an integer. If
... ... @@ -168,7 +168,7 @@ be aware of:
168 168  
169 169 - The image information included in the ``page`` section of the JSON
170 170 output includes the key "``filterable``". Note that the value of this
171   - field may depend on the :samp:`--decode-level` that
  171 + field may depend on the :qpdf:ref:`--decode-level` that
172 172 you invoke qpdf with. The JSON output includes a top-level key
173 173 "``parameters``" that indicates the decode level used for computing
174 174 whether a stream was filterable. For example, jpeg images will be
... ...
manual/release-notes.rst
... ... @@ -90,7 +90,7 @@ For a detailed list of changes, please see the file
90 90 - Handling of Weak Cryptography Algorithms
91 91  
92 92 - From the qpdf CLI, the
93   - :samp:`--allow-weak-crypto` is now required to
  93 + :qpdf:ref:`--allow-weak-crypto` is now required to
94 94 suppress a warning when explicitly creating PDF files using RC4
95 95 encryption. While qpdf will always retain the ability to read
96 96 and write such files, doing so will require explicit
... ... @@ -108,7 +108,7 @@ For a detailed list of changes, please see the file
108 108 (with no resource dictionary).
109 109  
110 110 - Fix crash that could occur under certain conditions when using
111   - :samp:`--pages` with files that had form
  111 + :qpdf:ref:`--pages` with files that had form
112 112 fields.
113 113  
114 114 - Library Enhancements
... ... @@ -127,7 +127,7 @@ For a detailed list of changes, please see the file
127 127 - CLI Enhancements
128 128  
129 129 - Improve diagnostics around parsing
130   - :samp:`--pages` command-line options
  130 + :qpdf:ref:`--pages` command-line options
131 131  
132 132 - Packaging Changes
133 133  
... ... @@ -139,7 +139,7 @@ For a detailed list of changes, please see the file
139 139  
140 140 - When generating a file while preserving object streams,
141 141 unreferenced objects are correctly removed unless
142   - :samp:`--preserve-unreferenced` is specified.
  142 + :qpdf:ref:`--preserve-unreferenced` is specified.
143 143  
144 144 - Library Enhancements
145 145  
... ... @@ -202,19 +202,19 @@ For a detailed list of changes, please see the file
202 202  
203 203 - Operations that work on combining pages are much better about
204 204 protecting form fields. In particular,
205   - :samp:`--split-pages` and
206   - :samp:`--pages` now preserve interaction form
  205 + :qpdf:ref:`--split-pages` and
  206 + :qpdf:ref:`--pages` now preserve interactive form
207 207 functionality by copying the relevant form field information
208 208 from the original files. Additionally, if you use
209   - :samp:`--pages` to select only some pages from
  209 + :qpdf:ref:`--pages` to select only some pages from
210 210 the original input file, unused form fields are removed, which
211 211 prevents lots of unused annotations from being retained.
212 212  
213 213 - By default, :command:`qpdf` no longer allows
214 214 creation of encrypted PDF files whose user password is
215 215 non-empty and owner password is empty when a 256-bit key is in
216   - use. The :samp:`--allow-insecure` option,
217   - specified inside the :samp:`--encrypt` options,
  216 + use. The :qpdf:ref:`--allow-insecure` option,
  217 + specified inside the :qpdf:ref:`--encrypt` options,
218 218 allows creation of such files. Behavior changes in the CLI are
219 219 avoided when possible, but an exception was made here because
220 220 this is security-related. qpdf must always allow creation of
... ... @@ -255,7 +255,7 @@ For a detailed list of changes, please see the file
255 255 removing, and and copying file attachments. See :ref:`attachments` for details.
256 256  
257 257 - Page splitting and merging operations, as well as
258   - :samp:`--flatten-rotation`, are better behaved
  258 + :qpdf:ref:`--flatten-rotation`, are better behaved
259 259 with respect to annotations and interactive form fields. In
260 260 most cases, interactive form field functionality and proper
261 261 formatting and functionality of annotations is preserved by
... ... @@ -284,7 +284,7 @@ For a detailed list of changes, please see the file
284 284 extraction of attachments. More detailed information can be
285 285 obtained by following the reference to the file spec object.
286 286  
287   - - Add numeric option to :samp:`--collate`. If
  287 + - Add numeric option to :qpdf:ref:`--collate`. If
288 288 :samp:`--collate={n}`
289 289 is given, take pages in groups of
290 290 :samp:`{n}` from the given files.
... ... @@ -367,7 +367,7 @@ For a detailed list of changes, please see the file
367 367  
368 368 - Bug Fixes
369 369  
370   - - The :samp:`--flatten-rotation` option applies
  370 + - The :qpdf:ref:`--flatten-rotation` option applies
371 371 transformations to any annotations that may be on the page.
372 372  
373 373 - If a form XObject lacks a resources dictionary, consider any
... ... @@ -390,7 +390,7 @@ For a detailed list of changes, please see the file
390 390 10.1.0: January 5, 2021
391 391 - CLI Enhancements
392 392  
393   - - Add :samp:`--flatten-rotation` command-line
  393 + - Add :qpdf:ref:`--flatten-rotation` command-line
394 394 option, which causes all pages that are rotated using
395 395 parameters in the page's dictionary to instead be identically
396 396 rotated in the page's contents. The change is not user-visible
... ... @@ -510,7 +510,7 @@ For a detailed list of changes, please see the file
510 510 - Bug Fixes
511 511  
512 512 - When concatenating content streams, as with
513   - :samp:`--coalesce-contents`, there were cases
  513 + :qpdf:ref:`--coalesce-contents`, there were cases
514 514 in which qpdf would merge two lexical tokens together, creating
515 515 invalid results. A newline is now inserted between merged
516 516 content streams if one is not already present.
... ... @@ -527,7 +527,7 @@ For a detailed list of changes, please see the file
527 527 already ignored the user's locale for numeric conversion.
528 528  
529 529 - Fix several instances in which warnings were not suppressed in
530   - spite of :samp:`--no-warn` and/or errors or
  530 + spite of :qpdf:ref:`--no-warn` and/or errors or
531 531 warnings were written to standard output rather than standard
532 532 error.
533 533  
... ... @@ -540,10 +540,10 @@ For a detailed list of changes, please see the file
540 540  
541 541 - Enhancements
542 542  
543   - - New option :samp:`--warning-exit-0` causes qpdf
  543 + - New option :qpdf:ref:`--warning-exit-0` causes qpdf
544 544 to exit with a status of ``0`` rather than ``3`` if there are
545 545 warnings but no errors. Combine with
546   - :samp:`--no-warn` to completely ignore
  546 + :qpdf:ref:`--no-warn` to completely ignore
547 547 warnings.
548 548  
549 549 - Performance improvements have been made to
... ... @@ -656,17 +656,16 @@ For a detailed list of changes, please see the file
656 656 :command:`qpdf --json-help` for details.
657 657  
658 658 - Add new option
659   - :samp:`--remove-unreferenced-resources` which
  659 + :qpdf:ref:`--remove-unreferenced-resources` which
660 660 takes ``auto``, ``yes``, or ``no`` as arguments. The new
661 661 ``auto`` mode, which is the default, performs a fast heuristic
662 662 over a PDF file when splitting pages to determine whether the
663 663 expensive process of finding and removing unreferenced
664 664 resources is likely to be of benefit. For most files, this new
665 665 default will result in a significant performance improvement
666   - for splitting pages. See :ref:`advanced-transformation` for a more detailed
667   - discussion.
  666 + for splitting pages.
668 667  
669   - - The :samp:`--preserve-unreferenced-resources`
  668 + - The :qpdf:ref:`--preserve-unreferenced-resources`
670 669 is now just a synonym for
671 670 :samp:`--remove-unreferenced-resources=no`.
672 671  
... ... @@ -760,8 +759,8 @@ For a detailed list of changes, please see the file
760 759  
761 760 - CLI Enhancements
762 761  
763   - - Added options :samp:`--is-encrypted` and
764   - :samp:`--requires-password` for testing whether
  762 + - Added options :qpdf:ref:`--is-encrypted` and
  763 + :qpdf:ref:`--requires-password` for testing whether
765 764 a file is encrypted or requires a password other than the
766 765 supplied (or empty) password. These communicate via exit
767 766 status, making them useful for shell scripts. They also work on
... ... @@ -770,7 +769,7 @@ For a detailed list of changes, please see the file
770 769 - Added ``encrypt`` key to JSON options. With the exception of
771 770 the reconstructed user password for older encryption formats,
772 771 this provides the same information as
773   - :samp:`--show-encryption` but in a consistent,
  772 + :qpdf:ref:`--show-encryption` but in a consistent,
774 773 parseable format. See output of :command:`qpdf
775 774 --json-help` for details.
776 775  
... ... @@ -778,7 +777,7 @@ For a detailed list of changes, please see the file
778 777  
779 778 - In QDF mode, be sure not to write more than one XRef stream to
780 779 a file, even when
781   - :samp:`--preserve-unreferenced` is used.
  780 + :qpdf:ref:`--preserve-unreferenced` is used.
782 781 :command:`fix-qdf` assumes that there is only
783 782 one XRef stream, and that it appears at the end of the file.
784 783  
... ... @@ -824,7 +823,7 @@ For a detailed list of changes, please see the file
824 823  
825 824 - CLI Enhancements
826 825  
827   - - Addition of the :samp:`--show-crypto` option in
  826 + - Addition of the :qpdf:ref:`--show-crypto` option in
828 827 support of selectable crypto providers, as described in :ref:`crypto`.
829 828  
830 829 - Allow ``:even`` or ``:odd`` to be appended to numeric ranges
... ... @@ -838,7 +837,7 @@ For a detailed list of changes, please see the file
838 837 - Bug Fix
839 838  
840 839 - Fix the name of the temporary file used by
841   - :samp:`--replace-input` so that it doesn't
  840 + :qpdf:ref:`--replace-input` so that it doesn't
841 840 require path splitting and works with paths that include
843 842 directories.
844 843  
... ... @@ -891,21 +890,21 @@ For a detailed list of changes, please see the file
891 890  
892 891 - CLI Enhancements
893 892  
894   - - The :samp:`--replace-input` option may be given
  893 + - The :qpdf:ref:`--replace-input` option may be given
895 894 in place of an output file name. This causes qpdf to overwrite
896 895 the input file with the output. See the description of
897   - :samp:`--replace-input` in :ref:`basic-options` for more details.
  896 + :qpdf:ref:`--replace-input` for more details.
898 897  
899   - - The :samp:`--recompress-flate` instructs
  898 + - The :qpdf:ref:`--recompress-flate` instructs
900 899 :command:`qpdf` to recompress streams that are
901 900 already compressed with ``/FlateDecode``. Useful with
902   - :samp:`--compression-level`.
  901 + :qpdf:ref:`--compression-level`.
903 902  
904 903 - The
905 904 :samp:`--compression-level={level}`
906 905 sets the zlib compression level used for any streams compressed
907 906 by ``/FlateDecode``. Most effective when combined with
908   - :samp:`--recompress-flate`.
  907 + :qpdf:ref:`--recompress-flate`.
909 908  
910 909 - Library Enhancements
911 910  
... ... @@ -998,8 +997,8 @@ For a detailed list of changes, please see the file
998 997 a file with linearization warnings but not errors, it now
999 998 properly exits with exit code 3 instead of 2.
1000 999  
1001   - - The :samp:`--completion-bash` and
1002   - :samp:`--completion-zsh` options now work
  1000 + - The :qpdf:ref:`--completion-bash` and
  1001 + :qpdf:ref:`--completion-zsh` options now work
1003 1002 properly when qpdf is invoked as an AppImage.
1004 1003  
1005 1004 - Calling ``QPDFWriter::set*EncryptionParameters`` on a
... ... @@ -1063,7 +1062,7 @@ For a detailed list of changes, please see the file
1063 1062 qpdf than the library, which may indicate a problem with the
1064 1063 installation.
1065 1064  
1066   - - New option :samp:`--remove-page-labels` will
  1065 + - New option :qpdf:ref:`--remove-page-labels` will
1067 1066 remove page labels before generating output. This used to
1068 1067 happen if you ran :command:`qpdf --empty --pages ..
1069 1068 --`, but the behavior changed in qpdf 8.3.0. This
... ... @@ -1090,7 +1089,7 @@ For a detailed list of changes, please see the file
1090 1089 during page splitting operations.
1091 1090  
1092 1091 - Revert change that included preservation of outlines
1093   - (bookmarks) in :samp:`--split-pages`. The way
  1092 + (bookmarks) in :qpdf:ref:`--split-pages`. The way
1094 1093 it was implemented in 8.3.0 and 8.4.0 caused a very significant
1095 1094 degradation of performance for splitting certain files. A
1096 1095 future release of qpdf may re-introduce the behavior in a more
... ... @@ -1143,16 +1142,16 @@ For a detailed list of changes, please see the file
1143 1142 depth in :ref:`unicode-passwords`.
1144 1143  
1145 1144 - New options
1146   - :samp:`--externalize-inline-images`,
1147   - :samp:`--ii-min-bytes`, and
1148   - :samp:`--keep-inline-images` control qpdf's
  1145 + :qpdf:ref:`--externalize-inline-images`,
  1146 + :qpdf:ref:`--ii-min-bytes`, and
  1147 + :qpdf:ref:`--keep-inline-images` control qpdf's
1149 1148 handling of inline images and possible conversion of them to
1150 1149 regular images. By default,
1151   - :samp:`--optimize-images` now also applies to
1152   - inline images. These options are discussed in :ref:`advanced-transformation`.
  1150 + :qpdf:ref:`--optimize-images` now also applies to
  1151 + inline images.
1153 1152  
1154   - - Add options :samp:`--overlay` and
1155   - :samp:`--underlay` for overlaying or
  1153 + - Add options :qpdf:ref:`--overlay` and
  1154 + :qpdf:ref:`--underlay` for overlaying or
1156 1155 underlaying pages of other files onto output pages. See
1157 1156 :ref:`overlay-underlay` for
1158 1157 details.
... ... @@ -1162,32 +1161,32 @@ For a detailed list of changes, please see the file
1162 1161 non-ASCII characters, qpdf will try a number of alternative
1163 1162 passwords to try to compensate for possible character encoding
1164 1163 errors. This behavior can be suppressed with the
1165   - :samp:`--suppress-password-recovery` option.
  1164 + :qpdf:ref:`--suppress-password-recovery` option.
1166 1165 See :ref:`unicode-passwords` for a full
1167 1166 discussion.
1168 1167  
1169   - - Add the :samp:`--password-mode` option to
  1168 + - Add the :qpdf:ref:`--password-mode` option to
1170 1169 fine-tune how qpdf interprets password arguments, especially
1171 1170 when they contain non-ASCII characters. See :ref:`unicode-passwords` for more information.
1172 1171  
1173   - - In the :samp:`--pages` option, it is now
  1172 + - In the :qpdf:ref:`--pages` option, it is now
1174 1173 possible to copy the same page more than once from the same
1175 1174 file without using the previous workaround of specifying two
1176 1175 different paths to the same file.
1177 1176  
1178   - - In the :samp:`--pages` option, allow use of "."
  1177 + - In the :qpdf:ref:`--pages` option, allow use of "."
1179 1178 as a shortcut for the primary input file. That way, you can do
1180 1179 :command:`qpdf in.pdf --pages . 1-2 -- out.pdf`
1181 1180 instead of having to repeat :file:`in.pdf`
1182 1181 in the command.
1183 1182  
1184 1183 - When encrypting with 128-bit and 256-bit encryption, new
1185   - encryption options :samp:`--assemble`,
1186   - :samp:`--annotate`,
1187   - :samp:`--form`, and
1188   - :samp:`--modify-other` allow more fine-grained
  1184 + encryption options :qpdf:ref:`--assemble`,
  1185 + :qpdf:ref:`--annotate`,
  1186 + :qpdf:ref:`--form`, and
  1187 + :qpdf:ref:`--modify-other` allow more fine-grained
1189 1188 granularity in configuring options. Before, the
1190   - :samp:`--modify` option only configured certain
  1189 + :qpdf:ref:`--modify` option only configured certain
1191 1190 predefined groups of permissions.
1192 1191  
1193 1192 - Bug Fixes and Enhancements
... ... @@ -1198,7 +1197,7 @@ For a detailed list of changes, please see the file
1198 1197 file's internal structure shared these resource lists across
1199 1198 pages and if some but not all of the pages in the output did
1200 1199 not reference all the fonts and images. Using the
1201   - :samp:`--preserve-unreferenced-resources`
  1200 + :qpdf:ref:`--preserve-unreferenced-resources`
1202 1201 option would work around the incorrect behavior. This bug was
1203 1202 the result of a typo in the code and a deficiency in the test
1204 1203 suite. The case that triggered the error was known, just not
... ... @@ -1326,11 +1325,11 @@ For a detailed list of changes, please see the file
1326 1325  
1327 1326 - Page numbers (also known as page labels) are now preserved when
1328 1327 merging and splitting files with the
1329   - :samp:`--pages` and
1330   - :samp:`--split-pages` options.
  1328 + :qpdf:ref:`--pages` and
  1329 + :qpdf:ref:`--split-pages` options.
1331 1330  
1332 1331 - Bookmarks are partially preserved when splitting pages with the
1333   - :samp:`--split-pages` option. Specifically, the
  1332 + :qpdf:ref:`--split-pages` option. Specifically, the
1334 1333 outlines dictionary and some supporting metadata are copied
1335 1334 into the split files. The result is that all bookmarks from the
1336 1335 original file appear, those that point to pages that are
... ... @@ -1340,48 +1339,48 @@ For a detailed list of changes, please see the file
1340 1339 operations.
1341 1340  
1342 1341 - Page collation: add new option
1343   - :samp:`--collate`. When specified, the
1344   - semantics of :samp:`--pages` change from
  1342 + :qpdf:ref:`--collate`. When specified, the
  1343 + semantics of :qpdf:ref:`--pages` change from
1345 1344 concatenation to collation. See :ref:`page-selection` for examples and discussion.
1346 1345  
1347 1346 - Generation of information in JSON format, primarily to
1348 1347 facilitate use of qpdf from languages other than C++. Add new
1349   - options :samp:`--json`,
1350   - :samp:`--json-key`, and
1351   - :samp:`--json-object` to generate a JSON
  1348 + options :qpdf:ref:`--json`,
  1349 + :qpdf:ref:`--json-key`, and
  1350 + :qpdf:ref:`--json-object` to generate a JSON
1352 1351 representation of the PDF file. Run :command:`qpdf
1353 1352 --json-help` to get a description of the JSON
1354 1353 format. For more information, see :ref:`json`.
1355 1354  
1356   - - The :samp:`--generate-appearances` flag will
  1355 + - The :qpdf:ref:`--generate-appearances` flag will
1357 1356 cause qpdf to generate appearances for form fields if the PDF
1358 1357 file indicates that form field appearances are out of date.
1359 1358 This can happen when PDF forms are filled in by a program that
1360 1359 doesn't know how to regenerate the appearances of the filled-in
1361 1360 fields.
1362 1361  
1363   - - The :samp:`--flatten-annotations` flag can be
  1362 + - The :qpdf:ref:`--flatten-annotations` flag can be
1364 1363 used to *flatten* annotations, including form fields.
1365 1364 Ordinarily, annotations are drawn separately from the page.
1366 1365 Flattening annotations is the process of combining their
1367 1366 appearances into the page's contents. You might want to do this
1368 1367 if you are going to rotate or combine pages using a tool that
1369 1368 doesn't understand about annotations. You may also want to use
1370   - :samp:`--generate-appearances` when using this
  1369 + :qpdf:ref:`--generate-appearances` when using this
1371 1370 flag since annotations for outdated form fields are not
1372 1371 flattened as that would cause loss of information.
1373 1372  
1374   - - The :samp:`--optimize-images` flag tells qpdf
  1373 + - The :qpdf:ref:`--optimize-images` flag tells qpdf
1375 1374 to recompress every image using DCT (JPEG) compression as
1376 1375 long as the image is not already compressed with lossy
1377 1376 compression and recompressing the image reduces its size. The
1378   - additional options :samp:`--oi-min-width`,
1379   - :samp:`--oi-min-height`, and
1380   - :samp:`--oi-min-area` prevent recompression of
  1377 + additional options :qpdf:ref:`--oi-min-width`,
  1378 + :qpdf:ref:`--oi-min-height`, and
  1379 + :qpdf:ref:`--oi-min-area` prevent recompression of
1381 1380 images whose width, height, or pixel area (widthย ร—ย height) are
1382 1381 below a specified threshold.
1383 1382  
1384   - - The :samp:`--show-object` option can now be
  1383 + - The :qpdf:ref:`--show-object` option can now be
1385 1384 given as :samp:`--show-object=trailer` to show
1386 1385 the trailer dictionary.
1387 1386  
... ... @@ -1531,12 +1530,12 @@ For a detailed list of changes, please see the file
1531 1530 :samp:`--keep-files-open={[yn]}`
1532 1531 to override default determination of whether to keep files open
1533 1532 when merging. Please see the discussion of
1534   - :samp:`--keep-files-open` in :ref:`basic-options` for additional details.
  1533 + :qpdf:ref:`--keep-files-open` for additional details.
1535 1534  
1536 1535 8.2.0: August 16, 2018
1537 1536 - Command-line Enhancements
1538 1537  
1539   - - Add :samp:`--no-warn` option to suppress
  1538 + - Add :qpdf:ref:`--no-warn` option to suppress
1540 1539 issuing warning messages. If there are any conditions that
1541 1540 would have caused warnings to be issued, the exit status is
1542 1541 still 3.
... ... @@ -1556,7 +1555,7 @@ For a detailed list of changes, please see the file
1556 1555 - Bug fix: end of line characters were not properly handled
1557 1556 inside strings in some cases.
1558 1557  
1559   - - Bug fix: using :samp:`--progress` on very small
  1558 + - Bug fix: using :qpdf:ref:`--progress` on very small
1560 1559 files could cause an infinite loop.
1561 1560  
1562 1561 - API enhancements
... ... @@ -1596,15 +1595,14 @@ For a detailed list of changes, please see the file
1596 1595 old behavior should be desired, or if you have a case where
1597 1596 page splitting is very slow, the old behavior (and speed) can
1598 1597 be enabled by specifying
1599   - :samp:`--preserve-unreferenced-resources`. For
1600   - additional details, please see :ref:`advanced-transformation`.
  1598 + :qpdf:ref:`--preserve-unreferenced-resources`.
1601 1599  
1602 1600 - When merging multiple PDF files, qpdf no longer leaves all the
1603 1601 files open. This makes it possible to merge numbers of files
1604 1602 that may exceed the operating system's limit for the maximum
1605 1603 number of open files.
1606 1604  
1607   - - The :samp:`--rotate` option's syntax has been
  1605 + - The :qpdf:ref:`--rotate` option's syntax has been
1608 1606 extended to make the page range optional. If you specify
1609 1607 :samp:`--rotate={angle}`
1610 1608 without specifying a page range, the rotation will be applied
... ... @@ -1613,10 +1611,10 @@ For a detailed list of changes, please see the file
1613 1611 down.
1614 1612  
1615 1613 - When merging multiple files, the
1616   - :samp:`--verbose` option now prints information
  1614 + :qpdf:ref:`--verbose` option now prints information
1617 1615 about each file as it operates on that file.
1618 1616  
1619   - - When the :samp:`--progress` option is
  1617 + - When the :qpdf:ref:`--progress` option is
1620 1618 specified, qpdf will print a running indicator of its best
1621 1619 guess at how far through the writing process it is. Note that,
1622 1620 as with all progress meters, it's an approximation. This option
... ... @@ -1672,7 +1670,7 @@ For a detailed list of changes, please see the file
1672 1670 it thinks it is through writing its output. Client programs can
1673 1671 use this to implement reasonably accurate progress meters. The
1674 1672 :command:`qpdf` command line tool uses this to
1675   - implement its :samp:`--progress` option.
  1673 + implement its :qpdf:ref:`--progress` option.
1676 1674  
1677 1675 - New methods ``QPDFObjectHandle::newUnicodeString`` and
1678 1676 ``QPDFObject::unparseBinary`` have been added to allow for more
... ... @@ -1733,7 +1731,7 @@ For a detailed list of changes, please see the file
1733 1731 :samp:`--linearize-pass1={file}`
1734 1732 has been added for debugging qpdf's linearization code.
1735 1733  
1736   - - The option :samp:`--coalesce-contents` can be
  1734 + - The option :qpdf:ref:`--coalesce-contents` can be
1737 1735 used to combine content streams of a page whose contents are an
1738 1736 array of streams into a single stream.
1739 1737  
... ... @@ -1782,8 +1780,7 @@ For a detailed list of changes, please see the file
1782 1780 password when opening encrypted files, and will optionally display
1783 1781 the encryption key used by a file. This is a non-standard
1784 1782 operation, but it can be useful in certain situations. Please see
1785   - the discussion of :samp:`--password-is-hex-key` in
1786   - :ref:`basic-options` or the comments around
  1783 + the discussion of :qpdf:ref:`--password-is-hex-key` or the comments around
1787 1784 ``QPDF::setPasswordIsHexKey`` in
1788 1785 :file:`QPDF.hh` for additional details.
1789 1786  
... ... @@ -1820,8 +1817,8 @@ For a detailed list of changes, please see the file
1820 1817 or RunLength encoding. Library API enhancements and
1821 1818 command-line options have been added to control this behavior.
1822 1819 See command-line options
1823   - :samp:`--compress-streams` and
1824   - :samp:`--decode-level` and methods
  1820 + :qpdf:ref:`--compress-streams` and
  1821 + :qpdf:ref:`--decode-level` and methods
1825 1822 ``QPDFWriter::setCompressStreams`` and
1826 1823 ``QPDFWriter::setDecodeLevel``.
1827 1824  
... ... @@ -1846,27 +1843,27 @@ For a detailed list of changes, please see the file
1846 1843 - Command-line arguments can now be read from files or standard
1847 1844 input using ``@file`` or ``@-`` syntax. Please see :ref:`invocation`.
1848 1845  
1849   - - :samp:`--rotate`: request page rotation
  1846 + - :qpdf:ref:`--rotate`: request page rotation
1850 1847  
1851   - - :samp:`--newline-before-endstream`: ensure that
  1848 + - :qpdf:ref:`--newline-before-endstream`: ensure that
1852 1849 a newline appears before every ``endstream`` keyword in the
1853 1850 file; used to prevent qpdf from breaking PDF/A compliance on
1854 1851 already compliant files.
1855 1852  
1856   - - :samp:`--preserve-unreferenced`: preserve
  1853 + - :qpdf:ref:`--preserve-unreferenced`: preserve
1857 1854 unreferenced objects in the input PDF
1858 1855  
1859   - - :samp:`--split-pages`: break output into chunks
  1856 + - :qpdf:ref:`--split-pages`: break output into chunks
1860 1857 with fixed numbers of pages
1861 1858  
1862   - - :samp:`--verbose`: print the name of each
  1859 + - :qpdf:ref:`--verbose`: print the name of each
1863 1860 output file that is created
1864 1861  
1865   - - :samp:`--compress-streams` and
1866   - :samp:`--decode-level` replace
1867   - :samp:`--stream-data` for improving granularity
  1862 + - :qpdf:ref:`--compress-streams` and
  1863 + :qpdf:ref:`--decode-level` replace
  1864 + :qpdf:ref:`--stream-data` for improving granularity
1868 1865 of controlling compression and decompression of stream data.
1869   - The :samp:`--stream-data` option will remain
  1866 + The :qpdf:ref:`--stream-data` option will remain
1870 1867 available.
1871 1868  
1872 1869 - When running :command:`qpdf --check` with other
... ... @@ -1877,8 +1874,8 @@ For a detailed list of changes, please see the file
1877 1874 reference table, or other similar operations.
1878 1875  
1879 1876 - Process :command:`--pages` earlier so that other
1880   - options like :samp:`--show-pages` or
1881   - :samp:`--split-pages` can operate on the file
  1877 + options like :qpdf:ref:`--show-pages` or
  1878 + :qpdf:ref:`--split-pages` can operate on the file
1882 1879 after page splitting/merging has occurred.
1883 1880  
1884 1881 - API Changes. All new API calls are documented in their respective
... ... @@ -1911,7 +1908,7 @@ For a detailed list of changes, please see the file
1911 1908 ``QPDFWriter`` methods.
1912 1909  
1913 1910 6.0.0: November 10, 2015
1914   - - Implement :samp:`--deterministic-id` command-line
  1911 + - Implement :qpdf:ref:`--deterministic-id` command-line
1915 1912 option and ``QPDFWriter::setDeterministicID`` as well as C API
1916 1913 function ``qpdf_set_deterministic_ID`` for generating a
1917 1914 deterministic ID for non-encrypted files. When this option is
... ... @@ -2024,12 +2021,12 @@ For a detailed list of changes, please see the file
2024 2021 :file:`QPDFObjectHandle.hh` for additional
2025 2022 notes.
2026 2023  
2027   - - Add :samp:`--show-npages` command-line option to
  2024 + - Add :qpdf:ref:`--show-npages` command-line option to
2028 2025 the :command:`qpdf` command to show the number of
2029 2026 pages in a file.
2030 2027  
2031 2028 - Allow omission of the page range within
2032   - :samp:`--pages` for the
  2029 + :qpdf:ref:`--pages` for the
2033 2030 :command:`qpdf` command. When omitted, the page
2034 2031 range is implicitly taken to be all the pages in the file.
2035 2032  
... ... @@ -2156,8 +2153,9 @@ For a detailed list of changes, please see the file
2156 2153 ``QPDFWriter::setMinimumPDFVersion`` and
2157 2154 ``QPDFWriter::forcePDFVersion`` that accept an extension level,
2158 2155 and extended syntax for specifying forced and minimum versions on
2159   - the command line as described in :ref:`advanced-transformation`. Corresponding functions
2160   - have been added to the C API as well.
  2156 + the command line as described in :qpdf:ref:`--force-version` and
  2157 + :qpdf:ref:`--min-version`. Corresponding functions have been added
  2158 + to the C API as well.
2161 2159  
2162 2160 - Minor fixes to prevent qpdf from referencing objects in the file
2163 2161 that are not referenced in the file's overall structure. Most
... ... @@ -2213,12 +2211,12 @@ For a detailed list of changes, please see the file
2213 2211 ``QPDFWriter``.
2214 2212  
2215 2213 - Removed the method ``decodeStreams``. This method was used by
2216   - the :samp:`--check` option of the
  2214 + the :qpdf:ref:`--check` option of the
2217 2215 :command:`qpdf` command-line tool to force all
2218 2216 streams in the file to be decoded, but it also suffered from
2219 2217 the problem of opening otherwise unreferenced streams and thus
2220 2218 could report false positive. The
2221   - :samp:`--check` option now causes qpdf to go
  2219 + :qpdf:ref:`--check` option now causes qpdf to go
2222 2220 through all the motions of writing a new file based on the
2223 2221 original one, so it will always reference and check exactly
2224 2222 those parts of a file that any ordinary viewer would check.
... ... @@ -2307,7 +2305,7 @@ For a detailed list of changes, please see the file
2307 2305  
2308 2306 - Options have been added to the :command:`qpdf`
2309 2307 command-line tool for copying encryption parameters from another
2310   - file. See :ref:`basic-options`.
  2308 + file. (QXXXQ Link)
2311 2309  
2312 2310 - New methods have been added to the ``QPDF`` object for adding and
2313 2311 removing pages. See :ref:`adding-and-remove-pages`.
... ... @@ -2571,7 +2569,7 @@ For a detailed list of changes, please see the file
2571 2569 permissions, it does make them available so that applications that
2572 2570 use qpdf can enforce permissions.
2573 2571  
2574   - - The :samp:`--check` option to
  2572 + - The :qpdf:ref:`--check` option to
2575 2573 :command:`qpdf` has been extended to include some
2576 2574 additional information.
2577 2575  
... ...
manual/weak-crypto.rst
... ... @@ -13,12 +13,12 @@ cryptography algorithm, and MD5, which is a weak hashing algorithm. In
13 13 version 10.4, qpdf generates warnings for some (but not all) cases of
14 14 writing files with weak cryptography when invoked from the command-line.
15 15 These warnings can be suppressed using the
16   -:samp:`--allow-weak-crypto` option.
  16 +:qpdf:ref:`--allow-weak-crypto` option.
17 17  
18 18 It is planned for qpdf version 11 to be stricter, making it an error to
19 19 write files with insecure cryptography from the command-line tool in
20 20 most cases without specifying the
21   -:samp:`--allow-weak-crypto` flag and also to require
  21 +:qpdf:ref:`--allow-weak-crypto` flag and also to require
22 22 explicit steps when using the C++ library to enable use of insecure
23 23 cryptography.
24 24  
... ...