OpenSystemsDevelopment / qpdf

20 May, 2022

3 commits

Rename all test files: _ to -
6c5e5906

Jay Berkenbilt authored
2022-05-20 09:16:25 -0400

Major rework -- see long comments ...

* Replace --create-from-json=file with --json-input, which causes the
  regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
  data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
  stream data.

authored

2022-05-20 09:16:25 -0400


Parse objects; stream data is not yet handled
7e7a9c43

Jay Berkenbilt authored
2022-05-20 09:16:25 -0400

16 May, 2022

2 commits

Implement top-level qpdf json parsing
7fa5d177

Jay Berkenbilt authored
2022-05-16 13:41:40 -0400

Remove offset from missing /Root error ...
9a0e9a1a
```
The last offset is irrelevant to not being able to find /Root.
```
Jay Berkenbilt authored
2022-05-16 13:39:26 -0400

14 May, 2022

1 commit

Split qpdf.test into multiple test suites ...
173b944e
```
This makes it a lot easier to run parts of the test suite.
```
Jay Berkenbilt authored
2022-05-14 17:35:06 -0400

08 May, 2022

10 commits

Add maxobjectid to JSON
2a2f7f1b

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Add --to-json option
e9390aea

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Test inline stream data with different decode levels
2e87d593

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Test json v2 with invalid stream data
f08f3989

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Implement JSON v2 output
c76536dd

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Apply script across future v2 test files ...

bdfc4da5

There is one unexpected pass in this commit. This script was applied
to the files changed in this commit:

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    data['version'] = 2
    objectinfo = {}
    if 'objectinfo' in data:
        objectinfo = data['objectinfo']
        del data['objectinfo']
    if 'objects' not in data:
        continue
    qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}}
    for k, v in data['objects'].items():
        is_stream = objectinfo.get(k, {}).get('stream', {}).get('is', False)
        if k.endswith(' R'):
            k = 'obj:' + k
        if is_stream:
            v = {'stream': {'dict': v}}
        else:
            v = {'value': v}
        qpdf['objects'][k] = v
    data['qpdf'] = qpdf
    del data['objects']
print(json_dumps(data))
----------
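
To make the reshaping easier to follow, here is a minimal sketch that applies the same steps to a single in-memory dictionary. The function name `to_v2` and the sample data are hypothetical and exist only for illustration; they are not part of the commit.

```python
import json


def to_v2(data):
    """Reshape one parsed test file the same way the script above does."""
    data = dict(data)  # shallow copy so the caller's dict is left alone
    data['version'] = 2
    objectinfo = data.pop('objectinfo', {})
    if 'objects' not in data:
        return data
    qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}}
    for k, v in data.pop('objects').items():
        # Look up stream status under the old key, before adding the prefix.
        is_stream = objectinfo.get(k, {}).get('stream', {}).get('is', False)
        if k.endswith(' R'):
            k = 'obj:' + k
        qpdf['objects'][k] = {'stream': {'dict': v}} if is_stream else {'value': v}
    data['qpdf'] = qpdf
    return data


# Hypothetical sample input, not taken from the qpdf test suite.
sample = {
    'version': 1,
    'objects': {
        '1 0 R': {'/Type': '/Catalog'},
        '2 0 R': {'/Length': 10},
        'trailer': {'/Root': '1 0 R'},
    },
    'objectinfo': {
        '1 0 R': {'stream': {'is': False}},
        '2 0 R': {'stream': {'is': True}},
    },
}

converted = to_v2(sample)
print(json.dumps(converted, indent=2))
```

In this sketch, object keys ending in ` R` gain an `obj:` prefix, stream objects are wrapped as `{"stream": {"dict": ...}}`, and everything else, including `trailer`, is wrapped as `{"value": ...}`.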

authored

2022-05-08 13:45:20 -0400


Prepare test suite for json v2
8d348974

Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Fix typo in json output key name ...

15272662

moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
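
As a rough illustration of the first step of such a check, the sketch below gathers every distinct key from a parsed JSON document so the keys can be reviewed or passed to a spell checker. The helper `collect_keys` and the sample data are hypothetical; the commit does not say how the keys were actually extracted or split into words.

```python
def collect_keys(node, found=None):
    """Recursively collect every dictionary key in a parsed JSON value."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


# Hypothetical sample containing one misspelled key.
sample = {
    'parameters': {'decodelevel': 'generalized'},
    'pages': [{'object': '1 0 R', 'moddify': True}],
}

for key in sorted(collect_keys(sample)):
    print(key)
```

Splitting run-together keys into words and checking them against a dictionary would still be a separate step.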

authored

2022-05-08 13:45:20 -0400


Implement JSON v2 for Stream ...
1bc8abfd
```
Not fully exercised in this commit
```
Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

Implement JSON v2 for String ...
3246923c
```
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
```
Jay Berkenbilt authored
2022-05-08 13:45:20 -0400

07 May, 2022

5 commits

Prepare code for JSON v2 ...
16f4f94c
```
Update getJSON() methods and calls to them
```
Jay Berkenbilt authored
2022-05-07 11:12:01 -0400

Objectinfo json: write incrementally and in numeric order ...

a9fbbd5d

This script was used on test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
print(json_dumps(data))
----------
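
The reason the script parses the object number as an integer can be seen in a small sketch: plain string sorting places `10 0 R` before `2 0 R`. The helper name `numeric_order` and the sample keys are hypothetical; the regular expression is the one used in the script above.

```python
import re


def numeric_order(keys):
    """Order 'N G R' keys by object number N, as the script above does."""
    pairs = []
    for k in keys:
        m = re.match(r'^(\d+) \d+ R', k)
        if m:
            pairs.append((int(m.group(1)), k))
    return [k for _, k in sorted(pairs)]


keys = ['10 0 R', '2 0 R', '1 0 R']
print(sorted(keys))         # lexicographic: ['1 0 R', '10 0 R', '2 0 R']
print(numeric_order(keys))  # numeric: ['1 0 R', '2 0 R', '10 0 R']
```

As in the script, keys that do not match the pattern are dropped by this helper; the script handles `trailer` separately and appends it last.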

authored

2022-05-07 08:26:31 -0400


Objects json: write incrementally and in numeric order ...

948de609

The following script was used to adjust test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
print(json_dumps(data))
----------

authored

2022-05-07 08:26:31 -0400


Top-level json: write incrementally ...

dc9b7287

This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
print(json_dumps(newdata))
----------
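
One way to confirm that a change affects only field order is sketched below. It relies on Python dictionaries comparing equal regardless of key order while still remembering insertion order. The function name `compare` and the sample strings are hypothetical and are not part of the commit.

```python
import json


def compare(before_text, after_text):
    """Return (same_content, same_key_order) for two JSON documents."""
    before = json.loads(before_text)
    after = json.loads(after_text)
    same_content = before == after                # dict equality ignores key order
    same_key_order = list(before) == list(after)  # top-level key order only
    return same_content, same_key_order


# Hypothetical before/after outputs that differ only in top-level order.
old = '{"objects": {}, "version": 1, "parameters": {"decodelevel": "generalized"}}'
new = '{"version": 1, "parameters": {"decodelevel": "generalized"}, "objects": {}}'

print(compare(old, new))  # (True, False)
```

A result of `(True, False)` means the two documents carry the same content with a different top-level field order.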

authored

2022-05-07 08:26:31 -0400


Test json against schema only on demand ...
7f65a5c2
```
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
```
Jay Berkenbilt authored
2022-05-07 08:26:31 -0400