OpenSystemsDevelopment / qpdf

26 Mar, 2025

1 commit

Add new CLI option --remove-structure ...
464d94af
```
... to remove the /Root /StructTreeRoot and /MarkInfo entries.
```
m-holger authored
2025-03-26 23:30:44 +0000

25 Mar, 2025

1 commit

Fix parsing of object streams ...

... containing objects with no white-space between them.

To enforce the rule that objects end at the start-offset of the next
object, each object is parsed in its own object stream.

To facilitate this, a new private API input source is::OffsetBuffer has
been added which only contains the object but reports offsets relative to
the start of the object stream. This is adapted from OffsetInputSource by
changing the direction of the offset, endowing it with its own
BufferInputSource and stripping out checks duplicated in BufferInputSource.

Fixes the expected failure in the test case added in #1266.
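
The offset handling described in this commit can be sketched as follows. This is only an illustration under assumptions: `OffsetView` and its members are hypothetical names, not qpdf's `is::OffsetBuffer`. The view holds the bytes of a single object, reports positions relative to the start of the enclosing object stream, and signals end of input at the object's last byte so a parser cannot run on into the next object.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical sketch, not qpdf's actual class: a read-only view over the
// bytes of ONE object that reports offsets relative to the object stream.
class OffsetView
{
  public:
    // object_bytes: the bytes of this object only.
    // stream_offset: where the object starts within the object stream.
    OffsetView(std::string_view object_bytes, std::size_t stream_offset) :
        data_(object_bytes),
        stream_offset_(stream_offset)
    {
    }

    // Current position, expressed as an offset into the object stream.
    std::size_t tell() const { return stream_offset_ + pos_; }

    // Seek using a stream-relative offset; fails if it lies outside the object.
    bool seek(std::size_t target)
    {
        if (target < stream_offset_ || target - stream_offset_ > data_.size()) {
            return false;
        }
        pos_ = target - stream_offset_;
        return true;
    }

    // Next byte, or -1 once the end of this object has been reached.
    int get()
    {
        if (pos_ >= data_.size()) {
            return -1;
        }
        return static_cast<unsigned char>(data_[pos_++]);
    }

  private:
    std::string_view data_;
    std::size_t stream_offset_;
    std::size_t pos_ = 0;
};
```

With a stream `(a)42`, a view over the last two bytes constructed with offset 3 starts at position 3 and returns end of input after `2`, even though the two objects have no white-space between them.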

authored

2025-03-25 10:40:01 +0000


24 Mar, 2025

1 commit

Fix object stream error/warning messages reporting wrong object id ...

1bce5c4f

This was due to the use of last_object_description, which is not set for
the object stream itself.

Also, modify the messages introduced in #1391 and #1392 to report the supposed
offset of the objects.

authored

2025-03-24 21:57:16 +0000


10 Mar, 2025

2 commits

Refine recovery from missing startxref (fixes #1335) ...
7927241d
```
If startxref cannot be found in the last 1024 bytes, try finding it in the
whole file and check whether it is valid.
```
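
A minimal sketch of the two-step search described above, assuming the file is already in memory. The function name `find_startxref` and the `tail` parameter are hypothetical and are not taken from qpdf. Validating the offset that follows the keyword is left to the caller.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: locate the last "startxref" keyword, preferring the
// final `tail` bytes and falling back to a search of the whole file.
std::optional<std::size_t>
find_startxref(std::string_view file, std::size_t tail = 1024)
{
    constexpr std::string_view token = "startxref";
    const std::size_t tail_start = file.size() > tail ? file.size() - tail : 0;

    // Step 1: look only in the tail of the file.
    const std::size_t in_tail = file.substr(tail_start).rfind(token);
    if (in_tail != std::string_view::npos) {
        return tail_start + in_tail;
    }

    // Step 2: fall back to the whole file. The result is only a candidate;
    // the caller still has to check that it points at a usable xref.
    const std::size_t anywhere = file.rfind(token);
    if (anywhere != std::string_view::npos) {
        return anywhere;
    }
    return std::nullopt;
}
```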
m-holger authored
2025-03-10 18:26:14 +0000

Refactor xref table reconstruction (Fixes #1362) ...

649709a8

Split reconstruction into three passes - scanning of input for objects and
trailer, insertion of objects into the xref table, and loading the trailer.

This allows insertion to take place in the usual reverse order and removes
the need for a separate insertReconstructedXrefEntry method.

It also allows trailers to be tried from most recent to oldest.

Ignore any found trailers without /Root entry.
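
The pass structure can be illustrated roughly as below. All type and function names (`FoundObject`, `FoundTrailer`, `rebuild`) are hypothetical, and the scanning pass is assumed to have already produced its results in file order. The sketch also assumes that the first entry inserted for an object id wins, which is what makes reverse-order insertion prefer the definition closest to the end of the file.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

// Hypothetical results of pass 1 (scanning), listed in file order.
struct FoundObject
{
    int id;
    int gen;
    std::size_t offset;
};

struct FoundTrailer
{
    std::size_t offset;
    bool has_root; // whether the trailer dictionary has a /Root entry
};

struct Reconstructed
{
    std::map<int, FoundObject> xref;
    std::optional<FoundTrailer> trailer;
};

Reconstructed
rebuild(std::vector<FoundObject> const& objects, std::vector<FoundTrailer> const& trailers)
{
    Reconstructed result;

    // Pass 2: insert in reverse file order. emplace() keeps an existing
    // entry, so the last definition in the file is the one that is kept.
    for (auto it = objects.rbegin(); it != objects.rend(); ++it) {
        result.xref.emplace(it->id, *it);
    }

    // Pass 3: try trailers from most recent to oldest, skipping any
    // trailer that has no /Root entry.
    for (auto it = trailers.rbegin(); it != trailers.rend(); ++it) {
        if (it->has_root) {
            result.trailer = *it;
            break;
        }
    }
    return result;
}
```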

authored

2025-03-10 15:12:28 +0000


07 Mar, 2025

1 commit

Enhance error checking of object stream object ids and offsets ...

f06209ca

The original test file contains multiple entries with id 0 and offset 0.
One entry has been modified such that the id is valid (6).

Object streams with invalid offsets are a source of unreproducible
oss-fuzz time-outs.

authored

2025-03-07 20:27:54 +0000


15 Feb, 2025

1 commit

Exclude cygwin from fix-qdf pipe test ...
da42078d
```
Also add debugging information so we can save time if $^O used in
GitHub Actions changes again.
```
Jay Berkenbilt authored
2025-02-15 10:52:32 -0500

07 Feb, 2025

1 commit

Refine QPDFParser error handling ...

8df3de5c

Reduce the container size for which a single bad token will cause a failure
from 100,000 to 5,000.

Count missing dictionary keys as errors.

authored

2025-02-07 23:41:56 +0000


04 Feb, 2025

1 commit

Add zopfli support (fixes #1323) ...
133da3b6
```
This requires a special build option.
```
Jay Berkenbilt authored
2025-02-04 06:17:34 -0500

03 Feb, 2025

1 commit

fix-qdf: accept optional output file (fixes #1330)
a2fc5b52

Jay Berkenbilt authored
2025-02-03 06:42:22 -0500

02 Feb, 2025

2 commits

Refine xref reconstruction (fixes #1335) ...

ca3ea2e3

When recovering XRef streams, start with the stream with the largest
/Size rather than the largest offset.

Also, if reconstruction fails to find a trailer with a valid /Root entry,
search for a root object.

authored

2025-02-02 21:14:08 +0000


Merge pull request #1340 from m-holger/i1286 ...
aa583f29
```
Change QPDFWriter stream_decode_level default to qpdf_dl_generalized (fixes #1286)
```
m-holger authored
2025-02-02 21:03:04 +0000

01 Feb, 2025

1 commit

CLI reject flags with parameters (fixes #1329)
985cdf91

m-holger authored
2025-02-01 12:34:57 +0000

31 Jan, 2025

3 commits

Add new object stream test case ...
c026b511
```
Exercise stream containing objects with no white-space between them.
```
m-holger authored
2025-01-31 19:22:06 +0000
Change QPDFWriter stream_decode_level default to qpdf_dl_generalized ...
718b1400
```
Also, fix disabling of preserve_encryption to ignore
stream_decode_level, but disable preserve_encryption if compress_streams is
false.

Fixes #1286
```
m-holger authored
2025-01-31 16:09:07 +0000
In QPDFWriter::willFilterStream on runtime error on first attempt retry ...
ff0affd8
```
without filtering
```
m-holger authored
2025-01-31 15:34:02 +0000

28 Jan, 2025

4 commits

Add copy annotation test ...
73af7567
```
Test fixing /P entry.
```
m-holger authored
2025-01-28 16:26:21 +0000
Merge pull request #1307 from m-holger/pages ...
bde5a446
```
Fix QPDF::getAllPagesInternal warning
```
m-holger authored
2025-01-28 15:59:52 +0000
Fix QPDF::copyForeignObject warning ...
cc95f473
```
Provide correct obj_gen and offset.
```
m-holger authored
2025-01-28 11:01:18 +0000
Fix QPDF::getAllPagesInternal warning ...
b7bf9f3d
```
Provide correct obj_gen.
```
m-holger authored
2025-01-28 10:15:15 +0000

16 Jan, 2025

2 commits

Revert "Merge pull request #1272 from m-holger/xref_table" ...
0d5c57c1
```
This reverts commit ff2a78f579ebdd06b417e34260a17dba06e71137, reversing
changes made to 8f54319f7a6514110f4b05cbbf1cb1c9fc8cb6a0.
```
m-holger authored
2025-01-16 16:40:08 +0000
Revert "Merge pull request #1289 from m-holger/fuzz" ...
f1800410
```
This reverts commit 0e92cf6bf399249c603c3d0212e898fd29e71fcd, reversing
changes made to 7d34b89a69e8e89c098dd373442f7df809c28eff.
```
m-holger authored
2025-01-16 16:36:48 +0000

05 Jan, 2025

1 commit

Tweak test files to work around fixed ghostscript bug ...

531f6877

Ghostscript 10.0.2 failed to handle the files changed in this commit,
but ghostscript 10.0.4 handles them fine as do earlier versions. These
files all have hybrid xref in the form of a file with an xref table
appended with a section that has an xref stream. They all have
/PageLabels pointing to 107 0 R in the original file, with 107 higher
than the highest object. The spec says that this should be treated as
null, which results in /PageLabels null, which results in ghostscript
errors in that version. While ghostscript 10.0.2 may be handling the
file incorrectly, the file does something that's not really kosher,
and it's easier to fix the files, which had not been changed since the
very first open source release of qpdf, than to try to work around the
issue.

This was discovered when the GitHub Actions runner was bumped to
Ubuntu 24.04, which contains the buggy version of ghostscript. I was
not able to find a specific ghostscript issue that addressed this, but
the problem went away in either 10.0.3 or 10.0.4.

Commenting out /PageLabels without changing offsets was a pragmatic
move to avoid having to regenerate the xref tables manually. I just
had to manually edit the binary xref stream to change the offset of
one item (the new object 1), which I put at the end to avoid breaking
other things.

authored

2025-01-05 17:29:03 -0500


27 Sep, 2024

1 commit

Merge pull request #1274 from m-holger/meta ...
50d385c8
```
Add new commands --remove-metadata and --remove-info
```
m-holger authored
2024-09-27 11:26:34 +0100

20 Sep, 2024

1 commit

Add sanity check on trailer /Size entry
21f176d3

m-holger authored
2024-09-20 15:28:49 +0100

18 Sep, 2024

6 commits

Refactor Xref_table::subsections ...

28c13f54

Optimistically read subsection headers without reading individual object
entries, assuming that they are 20 bytes long as per the PDF spec. If
problems are encountered, fall back to calling bad_subsections.
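
The optimistic read can be sketched as follows. This is an assumption-laden illustration: `Subsection` and `read_subsections` are hypothetical names, the input is assumed to start at the first subsection header (just after the `xref` keyword line), and a `std::nullopt` result stands in for falling back to the slower, more tolerant path. Each header line is parsed, then `count * 20` bytes are skipped without looking at the individual entries.

```cpp
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical record of one xref subsection.
struct Subsection
{
    int first;                  // id of the first object in the subsection
    int count;                  // number of entries
    std::size_t entries_offset; // where the 20-byte entries begin
};

// Read only the subsection headers, assuming every entry is exactly 20 bytes.
// Returns std::nullopt if the layout does not match that assumption.
std::optional<std::vector<Subsection>>
read_subsections(std::string_view xref)
{
    std::vector<Subsection> result;
    std::size_t pos = 0;
    while (pos < xref.size() && xref.compare(pos, 7, "trailer") != 0) {
        const std::size_t eol = xref.find('\n', pos);
        if (eol == std::string_view::npos) {
            return std::nullopt;
        }
        const std::string header(xref.substr(pos, eol - pos));
        int first = 0;
        int count = 0;
        if (std::sscanf(header.c_str(), "%d %d", &first, &count) != 2 || first < 0 ||
            count < 0) {
            return std::nullopt; // not a header: entries were not 20 bytes long
        }
        const std::size_t entries = eol + 1;
        const std::size_t next = entries + static_cast<std::size_t>(count) * 20;
        if (next > xref.size()) {
            return std::nullopt;
        }
        result.push_back({first, count, entries});
        pos = next; // skip the entries without parsing them
    }
    if (pos >= xref.size()) {
        return std::nullopt; // ran off the end without reaching "trailer"
    }
    return result;
}
```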

authored

2024-09-18 10:25:38 +0100


Change Xref_table::table to std::vector ...
c0020cb1
```
Temporarily disable 3 specific-bugs tests. Remove 'xref size mismatch'
test.
```
m-holger authored
2024-09-18 10:25:38 +0100

Refactor Xref_table::reconstruct ...

91822ae6

Split reconstruction into two passes - scanning of input for objects and
insertion of objects into the xref table. This allows insertion to take
place in the usual reverse order and removes the need for a separate
insert_reconstructed method.

authored

2024-09-18 10:25:38 +0100


Add new methods Xref_table::subsections ...
ed656194
```
Calculate all subsections before reading subsection entries.

Duplicates some warnings for the time being.
```
m-holger authored
2024-09-18 10:25:38 +0100
Move QPDF::reconstruct_xref to QPDF::Xref_table ...
3fbff845
```
Also, when recovering trailer from xref streams, pick the last valid
trailer encountered rather than the first.
```
m-holger authored
2024-09-18 10:25:37 +0100
Amend "recover file with xref stream" test ...
d42fda60
```
Change first xref stream dictionary to point to an invalid root in order
to detect failure to recover the last valid trailer.
```
m-holger authored
2024-09-18 10:25:37 +0100

25 Aug, 2024

1 commit

Add new commands --remove-metadata and --remove-info
8cb9bce7

m-holger authored
2024-08-25 13:10:11 +0100

20 Aug, 2024

1 commit

Fix QPDF::recoverStreamLength ...

c02cb9a7

Ensure that the recovered stream end is not part of a different object.

Test file is bad24.pdf with stream 4 'endstream' corrupted.

authored

2024-08-20 15:14:01 +0100


06 Aug, 2024

1 commit

Refactor the creation of unresolved objects ...

06001ed2

Create unresolved objects only for objects in the xref table (except during
parsing of the xref table). Do not add indirect nulls into the object
cache as the result of a cache miss during a call to getObject except
during parsing or creation/updating from JSON. To support this behaviour,
add new private methods getObjectForParser and getObjectForJSON.

As a result of this change, dangling references are treated as direct nulls
rather than indirect nulls.

authored

2024-08-06 12:22:09 +0100


14 Jul, 2024

1 commit

Fix QPDFOutlineDocumentHelper::resolveNamedDest (fixes #1238) ...
f3cbaafc
```
Handle case where named destination is a dictionary with /D entry.

Test case is hand-edited outlines-with-old-root-dests.pdf with modified
object 107.
```
m-holger authored
2024-07-14 12:15:45 +0100

11 Jul, 2024

1 commit

Add further sanity check to QPDF::reconstruct_xref ...
e914bbbb
```
If reconstruct_xref generates more than 1000 warnings give up because the
file is so severely damaged that there is very little point continuing.
```
m-holger authored
2024-07-11 13:25:07 +0100

09 Jul, 2024

1 commit

Add additional sanity check during xref reconstruction ...
2e378d92
```
Check that xref table is not empty after recovery. Empty xref tables
disable other sanity checks.
```
m-holger authored
2024-07-09 17:01:44 +0100

04 Jul, 2024

1 commit

Add test case for self-referential object streams ...
f0ded6bc
```
Previous test case was lost in #1221. Test file was created from
object-stream.pdf by adding a reference to itself into object stream 1 0.
```
m-holger authored
2024-07-04 20:40:47 +0100

02 Jul, 2024

1 commit

Add additional object id sanity checks ...
6d640c56
```
Ensure objects with impossibly large ids are ignored.
```
m-holger authored
2024-07-02 01:16:23 +0100

25 Jun, 2024

1 commit

In QPDF::reconstruct_xref add sanity check for object ids
4a8c821e

m-holger authored
2024-06-25 15:46:47 +0100