Commit 25988e8d108ac036329c2a5e8b628d9ac4920d0a

Authored by Jay Berkenbilt
1 parent cc108a7f

Bug fix: content normalizer should not add trailing newline

Adding a trailing newline in content normalization damages files whose
contents are split across streams in the middle of tokens. Let
QPDFWriter add the newline with the indicator to ignore the newline,
which it already does. This changes the way some qdf files look.
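
The failure mode described above can be illustrated with a minimal, self-contained C++ sketch. It does not use the qpdf API: `normalize_with_forced_newline`, `normalize_without_newline`, and `join_pieces` are hypothetical names, and the sketch assumes that a page's contents are simply the concatenation of its normalized streams.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for the old behavior: make sure every normalized
// piece ends with a newline, appending one if necessary.
std::string normalize_with_forced_newline(std::string piece)
{
    if (piece.empty() || piece.back() != '\n')
    {
        piece += '\n';
    }
    return piece;
}

// Illustrative stand-in for the new behavior: leave the end of the piece
// untouched, so any cosmetic newline is the writer's job and is not
// counted as stream data.
std::string normalize_without_newline(std::string piece)
{
    return piece;
}

// Concatenate the pieces of a page's contents after normalizing each one
// (an assumption of this sketch about how the pieces are read back).
template <typename Normalizer>
std::string join_pieces(
    std::vector<std::string> const& pieces, Normalizer normalize)
{
    std::string result;
    for (auto const& piece : pieces)
    {
        result += normalize(piece);
    }
    return result;
}
```

With the pieces `"BT (Pot"` and `"ato) Tj ET"`, the forced-newline variant puts a newline inside the string literal `(Potato)`, while the other variant reproduces the original bytes.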
ChangeLog
@@ -62,6 +62,22 @@
 QPDFObjectHandle::parsePageContents() method in favor of the older
 QPDFObjectHandle::parseContentStream() method.
 
+* Bug fix: change where the trailing newline is added to a stream
+in QDF mode when content normalization is enabled (the default for
+QDF mode). Before, the content normalizer ensured that the output
+ended with a trailing newline, but this had the undesired side
+effect of including the newline in the stream data for purposes of
+length computation. QPDFWriter already appends a newline without
+counting in length for better readability. Ordinarily this makes
+no difference, but in the rare case of a page's contents being
+split in the middle of a token, the old behavior could cause the
+extra newline to be interpreted as part of the token. This bug
+could only be triggered in qdf mode, which is a mode intended for
+manual inspection of PDF files' contents, so it is very unlikely
+to have caused any actual problems for people using qpdf for
+production use. Even if it did, it would be very unusual for a PDF
+file to actually be adversely affected by this issue.
+
 2018-02-04 Jay Berkenbilt <ejb@ql.org>
 
 * Add QPDFWriter::setLinearizationPass1Filename method and
examples/qtest/mod-info/files/2.qdf
No preview for this file type
libqpdf/Pl_QPDFTokenizer.cc
@@ -149,10 +149,6 @@ Pl_QPDFTokenizer::finish()
             writeNext(&this->char_to_unread, 1);
         }
     }
-    if (! this->just_wrote_nl)
-    {
-        writeNext("\n", 1);
-    }
 
     getNext()->finish();
 }
qpdf/qtest/qpdf/good14.out
@@ -13,8 +13,7 @@ three lines
 <8a8b>
 (ab)
 <8c><dd> ) >
-<610062> (MOO)
--- stream 1 --
+<610062> (MOO)-- stream 1 --
 This stream does end with a newline.
 // tests:
 // bad tokens preserved
@@ -35,16 +34,14 @@ This stream does end with a newline.
 -- stream 2 --
 (This stream ends with a \001 bad token
 -- stream 3 --
-<AB X
--- stream 4 --
+<AB X-- stream 4 --
 (ends with a name)
-/ThisMustBeLast
--- stream 5 --
+/ThisMustBeLast-- stream 5 --
 % This stream has an inline image marker that is not terminated
 (Potato)
 
 BI
 ID
-<506f7
+<506f7
 461746f>
 test 3 done
qpdf/qtest/qpdf/newline-before-endstream-nl-qdf.pdf
@@ -144,13 +144,12 @@ BT
 72 720 Td
 (Potato) Tj
 ET
-
 endstream
 endobj
 
 %QDF: ignore_newline
 11 0 obj
-128
+127
 endobj
 
 %% Original object ID: 10 0
@@ -186,13 +185,12 @@ BT
 72 720 Td
 (Potato) Tj
 ET
-
 endstream
 endobj
 
 %QDF: ignore_newline
 15 0 obj
-132
+131
 endobj
 
 %% Original object ID: 13 0
@@ -227,18 +225,18 @@ xref
 0000001185 00000 n
 0000001330 00000 n
 0000001415 00000 n
-0000001622 00000 n
-0000001671 00000 n
-0000001818 00000 n
-0000001905 00000 n
-0000002116 00000 n
-0000002165 00000 n
-0000002312 00000 n
+0000001621 00000 n
+0000001670 00000 n
+0000001817 00000 n
+0000001904 00000 n
+0000002114 00000 n
+0000002163 00000 n
+0000002310 00000 n
 trailer <<
 /Root 1 0 R
 /Size 18
 /ID [<ff82013f9cede898ae8db2f2f177aa1d><31415926535897932384626433832795>]
 >>
 startxref
-2348
+2346
 %%EOF
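
The offset changes in this file are consistent with each of the two shortened streams removing one byte ahead of everything that follows it: the first four changed xref entries move back by 1, and the last three, along with startxref, move back by 2. A minimal C++ sketch of that arithmetic follows; `shift_offsets` is a hypothetical helper for illustration, not part of qpdf.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of qpdf): subtract from each old offset the
// number of bytes that were removed earlier in the file. Both vectors are
// assumed to have the same length.
std::vector<long> shift_offsets(
    std::vector<long> const& old_offsets,
    std::vector<long> const& bytes_removed_before)
{
    std::vector<long> result;
    for (std::size_t i = 0; i < old_offsets.size(); ++i)
    {
        result.push_back(old_offsets.at(i) - bytes_removed_before.at(i));
    }
    return result;
}
```

Feeding it the old values 1622, 1671, 1818, 1905, 2116, 2165, 2312, 2348 with removals of 1, 1, 1, 1, 2, 2, 2, 2 gives 1621, 1670, 1817, 1904, 2114, 2163, 2310, 2346, which matches the new xref entries and startxref shown for this file.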
qpdf/qtest/qpdf/newline-before-endstream-qdf.pdf
@@ -145,8 +145,9 @@ ET
 endstream
 endobj
 
+%QDF: ignore_newline
 11 0 obj
-128
+127
 endobj
 
 %% Original object ID: 10 0
@@ -185,8 +186,9 @@ ET
 endstream
 endobj
 
+%QDF: ignore_newline
 15 0 obj
-132
+131
 endobj
 
 %% Original object ID: 13 0
@@ -221,18 +223,18 @@ xref
 0000001163 00000 n
 0000001308 00000 n
 0000001393 00000 n
-0000001578 00000 n
-0000001627 00000 n
-0000001774 00000 n
-0000001861 00000 n
-0000002050 00000 n
-0000002099 00000 n
-0000002246 00000 n
+0000001599 00000 n
+0000001648 00000 n
+0000001795 00000 n
+0000001882 00000 n
+0000002092 00000 n
+0000002141 00000 n
+0000002288 00000 n
 trailer <<
 /Root 1 0 R
 /Size 18
 /ID [<ff82013f9cede898ae8db2f2f177aa1d><31415926535897932384626433832795>]
 >>
 startxref
-2282
+2324
 %%EOF
qpdf/qtest/qpdf/stream-line-enders.qdf
@@ -106,8 +106,9 @@ ET
 endstream
 endobj
 
+%QDF: ignore_newline
 11 0 obj
-12
+11
 endobj
 
 %% Original object ID: 8 0
@@ -142,14 +143,14 @@ xref
 0000000799 00000 n
 0000000866 00000 n
 0000000935 00000 n
-0000001004 00000 n
-0000001051 00000 n
-0000001197 00000 n
+0000001025 00000 n
+0000001072 00000 n
+0000001218 00000 n
 trailer <<
 /Root 1 0 R
 /Size 14
 /ID [<08aa98c73f8a7262d77c8328772c3989><31415926535897932384626433832795>]
 >>
 startxref
-1233
+1254
 %%EOF