OpenSystemsDevelopment / qpdf

20 May, 2022

7 commits

Major rework -- see long comments ...

* Replace --create-from-json=file with --json-input, which causes the
  regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
  data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
  stream data.

authored

2022-05-20 09:16:25 -0400


Add QUtil::FileCloser to the public API
23fc6756

Jay Berkenbilt authored
2022-05-20 09:16:25 -0400
Support stream data -- not tested ...
0fe8d447
```
There are no automated tests yet, but committing work so far in
preparation for some refactoring.
```
Jay Berkenbilt authored
2022-05-20 09:16:25 -0400

replaceStreamData: accept uninitialized filter/decode_parms ...

63c7eefe

Uninitialized filter/decode_parms values mean "leave the original values
alone". This is needed for reconstructing streams from JSON, given that
the stream data and stream dictionary may appear in any order in the JSON.

authored

2022-05-20 09:16:25 -0400
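
The "leave the original values alone" behavior described in this commit message can be pictured with a sentinel default. The following is only a minimal Python sketch of that idea, not qpdf's C++ implementation: the class `StreamSketch`, the helper `build_stream`, and the `"dict"`/`"data"` keys are all hypothetical names chosen for illustration.

```python
# Illustrative only: these names are hypothetical and are not qpdf's C++ API.
_UNSET = object()  # sentinel meaning "leave the existing value alone"


class StreamSketch:
    def __init__(self):
        self.data = b""
        self.filter_name = None
        self.decode_parms = None

    def replace_stream_data(self, data, filter_name=_UNSET, decode_parms=_UNSET):
        """Replace the data; touch filter/decode_parms only if given."""
        self.data = data
        if filter_name is not _UNSET:
            self.filter_name = filter_name
        if decode_parms is not _UNSET:
            self.decode_parms = decode_parms


def build_stream(json_items):
    """Apply (key, value) pairs in whatever order they appear."""
    stream = StreamSketch()
    for key, value in json_items:
        if key == "dict":
            stream.filter_name = value.get("filter")
            stream.decode_parms = value.get("decode_parms")
        elif key == "data":
            # No filter arguments: values set by an earlier "dict" survive.
            stream.replace_stream_data(value)
    return stream


dict_first = build_stream([("dict", {"filter": "/FlateDecode"}), ("data", b"abc")])
data_first = build_stream([("data", b"abc"), ("dict", {"filter": "/FlateDecode"})])
```

In this sketch both orderings end with the same filter and the same data, which is the property the commit message says is needed when the dictionary and the data can arrive in either order.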


Back out fluent QPDFObjectHandle methods. Keep the andGet methods. ...
56f1b411
```
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
```
Jay Berkenbilt authored
2022-05-20 09:16:25 -0400
Parse objects; stream data is not yet handled
7e7a9c43

Jay Berkenbilt authored
2022-05-20 09:16:25 -0400
Add private methods for reserving specific objects
9064542b

Jay Berkenbilt authored
2022-05-20 07:54:09 -0400

16 May, 2022

6 commits

Implement top-level qpdf json parsing
7fa5d177

Jay Berkenbilt authored
2022-05-16 13:41:40 -0400
Add scaffolding for QPDF JSON reactor
8d42eb26

Jay Berkenbilt authored
2022-05-16 13:41:40 -0400
Add --create-from-json and --update-from-json arguments ...
4fe2e06b
```
Also add stubs for top-level QPDF methods (createFromJSON,
updateFromJSON)
```
Jay Berkenbilt authored
2022-05-16 13:41:40 -0400
Remove offset from missing /Root error ...
9a0e9a1a
```
The last offset is irrelevant to not being able to find /Root.
```
Jay Berkenbilt authored
2022-05-16 13:39:26 -0400

Improve handling of replacing stream data with empty strings ...

051ae7c2

When an empty string was passed to replaceStreamData, the code was
passing a null pointer to memcpy. Since a 0 size was also passed, this
was harmless, but it triggers sanitizer errors. The code properly
handles a null pointer as the buffer in other places.

authored

2022-05-16 13:39:26 -0400


Add QUtil::is_long_long
60ec94a7

Jay Berkenbilt authored
2022-05-16 13:39:26 -0400

14 May, 2022

1 commit

JSON reactor: improve handling of nested containers ...

4c7cfd5c

Call the parent container's item method before calling the child
item's start method so we can easily know the current nesting level
when nested items are added.

authored

2022-05-14 17:35:06 -0400
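
The callback ordering described in this commit message can be sketched as follows. This is a hedged Python illustration of the general idea only: the class `DepthTrackingReactor`, its method names, and the `walk` function are assumptions made for this example and do not mirror qpdf's actual C++ reactor interface.

```python
# Illustrative only: hypothetical names, not qpdf's JSON reactor API.
class DepthTrackingReactor:
    def __init__(self):
        self.depth = 0
        self.events = []

    def container_start(self, value):
        self.depth += 1
        self.events.append(("start", self.depth, type(value).__name__))

    def container_end(self, value):
        self.events.append(("end", self.depth, type(value).__name__))
        self.depth -= 1

    def item(self, key, value):
        # Called while the parent is still the innermost open container,
        # so self.depth is the nesting level the item belongs to.
        self.events.append(("item", self.depth, key))


def walk(value, reactor):
    """Walk parsed JSON, reporting each item to its parent first."""
    if not isinstance(value, (dict, list)):
        return
    reactor.container_start(value)
    items = value.items() if isinstance(value, dict) else enumerate(value)
    for key, child in items:
        reactor.item(key, child)  # parent's item callback comes first
        walk(child, reactor)      # only then does the child container start
    reactor.container_end(value)


reactor = DepthTrackingReactor()
walk({"a": [1, {"b": 2}]}, reactor)
```

Because `item` fires before the nested container's `container_start`, the key `"a"` is recorded at depth 1 even though its value opens a container at depth 2.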


08 May, 2022

6 commits

Add maxobjectid to JSON
2a2f7f1b

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400
Add --to-json option
e9390aea

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400
Implement JSON v2 output
c76536dd

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Fix typo in json output key name ...

15272662

moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.

authored

2022-05-08 13:45:20 -0400


Implement JSON v2 for Stream ...
1bc8abfd
```
Not fully exercised in this commit
```
Jay Berkenbilt authored
2022-05-08 13:45:20 -0400
Implement JSON v2 for String ...
3246923c
```
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
```
Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

07 May, 2022

9 commits

Prepare code for JSON v2 ...
16f4f94c
```
Update getJSON() methods and calls to them
```
Jay Berkenbilt authored
2022-05-07 11:12:01 -0400

Objectinfo json: write incrementally and in numeric order ...

a9fbbd5d

This script was used on test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    # Print inside the loop so every processed file is emitted, not just the last.
    print(json_dumps(data))
----------

authored

2022-05-07 08:26:31 -0400


Objects json: write incrementally and in numeric order ...

948de609

The following script was used to adjust test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    # Print inside the loop so every processed file is emitted, not just the last.
    print(json_dumps(data))
----------

authored

2022-05-07 08:26:31 -0400


Pages json: write each page incrementally
f50274ef

Jay Berkenbilt authored
2022-05-07 08:26:31 -0400

Top-level json: write incrementally ...

dc9b7287

This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    # Print inside the loop so every processed file is emitted, not just the last.
    print(json_dumps(newdata))
----------

authored

2022-05-07 08:26:31 -0400


Test json against schema only on demand ...
7f65a5c2
```
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
```
Jay Berkenbilt authored
2022-05-07 08:26:31 -0400
Add next to Pl_String and fix comments
a3c99803

Jay Berkenbilt authored
2022-05-07 08:26:31 -0400
Add --test-json-schema command-line option
b361c5ce

Jay Berkenbilt authored
2022-05-07 08:26:31 -0400
QPDFJob: have doJSON write to a pipeline
7604ac5c

Jay Berkenbilt authored
2022-05-07 08:26:31 -0400

06 May, 2022

1 commit

JSON: add blob type that generates base64-encoded binary data
0500d434

Jay Berkenbilt authored
2022-05-06 19:14:52 -0400
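
As a rough picture of what a "blob" value that serializes to base64 might look like, here is a small Python sketch using only the standard library. It is an analogy rather than qpdf's C++ JSON class: `BlobEncoder` and `dumps_with_blobs` are hypothetical names, and treating `bytes` as the blob type is an assumption made for this example.

```python
import base64
import json


# Illustrative only: hypothetical names, not qpdf's JSON class.
class BlobEncoder(json.JSONEncoder):
    """Serialize bytes-like values as base64-encoded JSON strings."""

    def default(self, o):
        if isinstance(o, (bytes, bytearray)):
            return base64.b64encode(bytes(o)).decode("ascii")
        return super().default(o)


def dumps_with_blobs(value):
    """Dump a structure that may contain binary blobs."""
    return json.dumps(value, cls=BlobEncoder)


encoded = dumps_with_blobs({"data": b"\x00\xff"})
decoded = base64.b64decode(json.loads(encoded)["data"])
```

The binary payload travels as an ordinary JSON string and can be recovered by base64-decoding it on the reading side.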

04 May, 2022

5 commits

Change JSON parser to parse from an InputSource
05fda4af

Jay Berkenbilt authored
2022-05-04 12:07:11 -0400
Add new FileInputSource constructors
e5f3910c

Jay Berkenbilt authored
2022-05-04 12:07:11 -0400
JSON: add write methods and implement unparse() in terms of those
e2596359

Jay Berkenbilt authored
2022-05-04 12:07:11 -0400
Make "objects" and "pages" consistent in JSON output
8b25de24

Jay Berkenbilt authored
2022-05-04 08:32:44 -0400
Don't call pushInheritedAttributesToPage in json mode ...
6b576797
```
We used to have to do that, but for quite some time, the code that
gets images has no longer required it.
```
Jay Berkenbilt authored
2022-05-04 07:11:13 -0400

03 May, 2022

5 commits

Add new Pl_String Pipeline
f4206a09

Jay Berkenbilt authored
2022-05-03 18:54:51 -0400
Add new Pl_OStream Pipeline
16139d97

Jay Berkenbilt authored
2022-05-03 18:54:51 -0400
Make use of the new Pipeline methods in some places
21d6e323

Jay Berkenbilt authored
2022-05-03 18:31:23 -0400
Add new Pipeline convenience methods
f1c6bb97

Jay Berkenbilt authored
2022-05-03 18:31:22 -0400
Make Pipeline::write take an unsigned char const* (API change)
59f3e09e

Jay Berkenbilt authored
2022-05-03 18:31:22 -0400