Copilot Coding Agent Instructions for qpdf
Repository Summary
qpdf is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption, page splitting/merging, and PDF file inspection. Version: 12.3.0.
Project Type: C++ library and CLI tool (C++20 standard)
Build System: CMake 3.16+ with Ninja generator
External Dependencies: zlib, libjpeg, and the crypto providers OpenSSL and GnuTLS
Build Instructions
Quick Build (Recommended)
# Install dependencies (Ubuntu/Debian)
sudo apt-get install build-essential cmake ninja-build zlib1g-dev libjpeg-dev libgnutls28-dev libssl-dev
# Configure and build
cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build -j$(nproc)
# Run tests
cd build && ctest --output-on-failure
Using CMake Presets (Maintainer Mode)
cmake --preset maintainer # Configure
cmake --build --preset maintainer # Build
ctest --preset maintainer # Test
Available presets: maintainer, maintainer-debug, maintainer-coverage, maintainer-profile, debug, release, sanitizers, msvc, msvc-release. Use cmake --list-presets to see all options.
Build Notes
- Always build out-of-source in a subdirectory (e.g., `build/`). In-source builds are explicitly blocked.
- Build time: approximately 2-3 minutes on typical CI runners.
- Test suite time: approximately 1 minute for all 7 test groups.
- The `MAINTAINER_MODE` cmake option enables stricter checks and auto-generation of job files.
Running Tests
cd build
# Run all tests
ctest --output-on-failure
# Run specific test groups
ctest -R qpdf # Main qpdf CLI tests (~43 seconds)
ctest -R libtests # Library unit tests (~8 seconds)
ctest -R examples # Example code tests
ctest -R fuzz # Fuzzer tests
# Run with verbose output
ctest --verbose
Test Framework: Tests use qtest (a Perl-based test framework). Tests are invoked via ctest and compare outputs against expected files. Test coverage uses QTC::TC macros.
Code Formatting
./format-code # Formats all C/C++ files with clang-format
- Requires clang-format version 20 or higher.
- Configuration: `.clang-format` in the repository root.
- Always run before committing changes to C/C++ files.
Project Layout
Key Directories
| Directory | Purpose |
|---|---|
| `libqpdf/` | Core library implementation (`*.cc` files) |
| `include/qpdf/` | Public headers (`QPDF.hh`, `QPDFObjectHandle.hh`, `QPDFWriter.hh`) |
| `qpdf/` | CLI executable and main test driver |
| `libtests/` | Library unit tests |
| `examples/` | Example programs demonstrating API usage |
| `fuzz/` | Fuzzer test programs for oss-fuzz |
| `manual/` | Documentation (reStructuredText for Sphinx) |
| `build-scripts/` | CI and build automation scripts |
Important Files
| File | Purpose |
|---|---|
| `CMakeLists.txt` | Main build configuration |
| `CMakePresets.json` | Predefined build configurations |
| `job.yml` | Command-line argument definitions (auto-generates code) |
| `generate_auto_job` | Python script that generates argument parsing code |
| `.clang-format` | Code formatting rules |
| `README-maintainer.md` | Detailed maintainer and coding guidelines |
Auto-Generated Files
When modifying job.yml or CLI options, regenerate with:
./generate_auto_job --generate
# Or build with: cmake -DGENERATE_AUTO_JOB=ON
CI Workflows (.github/workflows/)
main.yml (Primary CI)
- Prebuild: Documentation and external libs preparation
- Linux: Full build and test with image comparison
- Windows: MSVC and MinGW builds (32/64-bit)
- macOS: macOS build
- AppImage: Linux AppImage generation
- Sanitizers: AddressSanitizer and UndefinedBehaviorSanitizer tests
- CodeCov: Coverage reporting
- pikepdf: Compatibility testing with pikepdf Python library
Coding Conventions
Must Follow
- Assertions: Test code should include `qpdf/assert_test.h` first. Debug code should include `qpdf/assert_debug.h` and use `qpdf_assert_debug` instead of `assert`. Use `qpdf_expect`, `qpdf_ensures`, `qpdf_invariant` for pre/post-conditions. Never use raw `assert()`. The `check-assert` test enforces this.
- Use `QIntC` for type conversions - Required for safe integer casting.
- Avoid `operator[]` - Use `.at()` for `std::string` and `std::vector` (see README-hardening.md).
- Include order: Include the class's own header first, then a blank line, then other includes.
- Use `std::to_string` instead of `QUtil::int_to_string`.
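The checked-conversion and bounds-checked-access rules above can be illustrated with a minimal standard-library sketch. The helper names here (`to_int_checked`, `element_or_default`, `describe_count`) are hypothetical and are not qpdf's `QIntC` API; they only show the kind of failure these conventions are meant to surface.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a range-checked conversion. This is NOT qpdf's QIntC;
// it only illustrates why a checked conversion is preferred over a bare static_cast.
int to_int_checked(std::size_t value)
{
    if (value > static_cast<std::size_t>(std::numeric_limits<int>::max())) {
        throw std::range_error("value does not fit in int: " + std::to_string(value));
    }
    return static_cast<int>(value);
}

// Bounds-checked element access: .at() throws std::out_of_range on a bad index,
// whereas operator[] with a bad index is undefined behavior.
int element_or_default(std::vector<int> const& items, std::size_t index, int fallback)
{
    try {
        return items.at(index);
    } catch (std::out_of_range const&) {
        return fallback;
    }
}

// std::to_string builds the text form of the converted integer.
std::string describe_count(std::size_t count)
{
    return "count=" + std::to_string(to_int_checked(count));
}
```

In real qpdf code the conversion would go through `QIntC` rather than a local helper; the sketch only shows that an out-of-range value becomes an exception instead of a silently truncated integer.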
New Code Style (See libqpdf/qpdf/AcroForm.hh FormNode class for examples)
- PIMPL Pattern: New public classes should use the PIMPL (Pointer to Implementation) pattern with a full implementation class. See `QPDFAcroFormDocumentHelper::Members` as an example.
- Avoid `this->`: Do not use `this->` and remove it when updating existing code.
- QTC::TC Calls: Remove simple `QTC::TC` calls (those with 2 parameters) unless they are the only executable statement in a branch.
  - When removing a `QTC::TC` call:
    - Use the first parameter to find the corresponding `.testcov` file.
    - Remove the line in the `.testcov` (or related coverage file) that includes the second parameter.
- Doxygen Comments: Use `///` style comments with appropriate tags (`@brief`, `@param`, `@return`, `@tparam`, `@since`).

      /// @brief Retrieves the field value.
      ///
      /// @param inherit If true, traverse parent hierarchy.
      /// @return The field value or empty string if not found.
      std::string value() const;

- Member Variables: Use trailing underscores for member variables (e.g., `cache_valid_`, `fields_`).
- Naming Conventions:
  - Use `snake_case` for new function and variable names (e.g., `fully_qualified_name()`, `root_field()`).
  - Exception: PDF dictionary entry accessors and variables use the exact capitalization from the PDF spec (e.g., `FT()`, `TU()`, `DV()` for `/FT`, `/TU`, `/DV`).
- Getters/Setters: Simple getters/setters use the attribute name without "get" or "set" prefixes:

      String TU() const { return {get("/TU")}; }

  Note: Names like `setFieldAttribute()` are legacy naming; new code should use `snake_case` (e.g., `set_field_attribute()`).
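As a rough illustration of how several of these conventions combine, the following sketch shows a small class using PIMPL, trailing-underscore members, prefix-free getters/setters, `///` comments, and no `this->`. The class `FieldLabel` and the pointer name `m_` are hypothetical and do not exist in qpdf; the real reference remains the `FormNode` class mentioned above.

```cpp
#include <memory>
#include <string>
#include <utility>

/// @brief Hypothetical example class used only to illustrate the style rules.
class FieldLabel
{
  public:
    explicit FieldLabel(std::string name);
    ~FieldLabel();

    /// @brief Returns the stored name.
    ///
    /// @return The name given at construction or by the last call to the setter.
    std::string const& name() const;

    /// @brief Replaces the stored name.
    ///
    /// @param value The new name.
    void name(std::string value);

  private:
    class Members; // full implementation class, defined out of the public interface
    std::unique_ptr<Members> m_;
};

class FieldLabel::Members
{
  public:
    explicit Members(std::string name) :
        name_(std::move(name))
    {
    }

    std::string name_;
};

FieldLabel::FieldLabel(std::string name) :
    m_(std::make_unique<Members>(std::move(name)))
{
}

// Defined where Members is complete so std::unique_ptr can destroy it.
FieldLabel::~FieldLabel() = default;

std::string const& FieldLabel::name() const
{
    return m_->name_;
}

void FieldLabel::name(std::string value)
{
    m_->name_ = std::move(value);
}
```

A caller would write `FieldLabel label("a"); label.name("b");`, reading and writing through the same attribute name rather than `getName()`/`setName()`.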
The qpdf API is being actively updated. Prefer the new internal APIs in code in the libqpdf and libtests directories:
- New APIs are initially private - New API additions are for internal qpdf use only initially. Do not use in code in other directories, e.g. examples
- Prefer typed handles - Use `BaseHandle` methods and typed object handles (`Integer`, `Array`, `Dictionary`, `String`) over generic `QPDFObjectHandle`
- Use PIMPL pattern - Prefer private implementation classes (`Members` classes) for internal use
- Array semantics - Array methods treat scalars as single-element arrays and null as empty array (per PDF spec)
- Map semantics - Map methods treat null values as missing entries (per PDF spec)
- Object references - Methods often return references; avoid unnecessary copying but copy if reference may become stale
- Thread safety - Object handles cannot be shared across threads
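The array and map semantics listed above can be modeled with a toy object type. This is only a sketch of the described behavior: `ToyNull`, `ToyValue`, `ToyDict`, `array_size`, and `has_entry` are hypothetical names and are not qpdf's typed handles or their methods.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Toy object model for illustration only; these are NOT qpdf's typed handles.
struct ToyNull
{
};
using ToyArray = std::vector<int>;
using ToyValue = std::variant<ToyNull, int, ToyArray>;
using ToyDict = std::map<std::string, ToyValue>;

// Array view: null acts as an empty array, a scalar acts as a one-element array.
std::size_t array_size(ToyValue const& value)
{
    if (std::holds_alternative<ToyNull>(value)) {
        return 0;
    }
    if (auto const* items = std::get_if<ToyArray>(&value)) {
        return items->size();
    }
    return 1;
}

// Map view: a key whose value is null is reported as missing.
bool has_entry(ToyDict const& dict, std::string const& key)
{
    auto it = dict.find(key);
    return it != dict.end() && !std::holds_alternative<ToyNull>(it->second);
}
```

Under this model a caller can iterate "the array" without first checking whether the object is null, a scalar, or a real array, which is the convenience the semantics above are meant to provide.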
Style
- Column limit: 100 characters
- Braces on their own lines for classes/functions
- Use `// line-break` comment to prevent clang-format from joining lines
- Use `// clang-format off/on` for blocks that shouldn't be formatted
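A short sketch of the two formatter-control comments in use. The data here (`identity_matrix`, `preset_names`) is made up for illustration; the point is only where the comments go. The trailing `// line-break` works because clang-format does not join a line that ends in a line comment.

```cpp
#include <string>
#include <vector>

// clang-format off
// Hand-aligned table: formatting is disabled so the layout is left as written.
int const identity_matrix[3][3] = {
    {1, 0, 0},
    {0, 1, 0},
    {0, 0, 1},
};
// clang-format on

std::vector<std::string> preset_names()
{
    return {
        "maintainer", // line-break
        "debug",      // line-break
        "release",
    };
}
```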
Adding Command-Line Arguments
- Add option to `job.yml` (top half for CLI, bottom half for JSON schema)
- Add documentation in `manual/cli.rst` with `.. qpdf:option::` directive
- Implement the Config method in `libqpdf/QPDFJob_config.cc`
- Build with `-DGENERATE_AUTO_JOB=1` or run `./generate_auto_job --generate`
Adding Global Options and Limits
Global options and limits are qpdf-wide settings in the qpdf::global namespace that affect behavior across all operations. See README-maintainer.md section "HOW TO ADD A GLOBAL OPTION OR LIMIT" for complete details.
Quick Reference for Global Options
Global options are boolean settings (e.g., inspection_mode, preserve_invalid_attributes):
- Add enum: Add `qpdf_p_option_name` to `qpdf_param_e` enum in `include/qpdf/Constants.h` (use `0x11xxx` range)
- Add members: Add `bool option_name_{false};` and optionally `bool option_name_set_{false};` to `Options` class in `libqpdf/qpdf/global_private.hh`
- Add methods: Add static getter/setter to `Options` class in same file
- Add cases: Add cases to `qpdf_global_get_uint32()` and `qpdf_global_set_uint32()` in `libqpdf/global.cc`
- Add public API: Add inline getter/setter with Doxygen docs in `include/qpdf/global.hh` under `namespace options`
- Add tests: Add tests in `libtests/objects.cc`
- CLI integration (optional): Add to `job.yml` global section, regenerate, implement in `QPDFJob_config.cc`, document in `manual/cli.rst`
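The shape of these steps can be sketched in a self-contained form. Everything below uses hypothetical names (`toy_param_e`, `ToyOptions`, `toy_global_get_uint32`, `toy_global_set_uint32`) and a guessed singleton layout; it is an approximation of the pattern described, not a copy of qpdf's `global_private.hh` or `global.cc`.

```cpp
#include <cstdint>

// Hypothetical enum standing in for a new entry in the 0x11xxx option range.
enum toy_param_e : std::uint32_t {
    toy_p_example_option = 0x11000,
};

// Hypothetical options holder: a value member, an optional "was set" flag,
// and static getter/setter methods.
class ToyOptions
{
  public:
    static bool example_option()
    {
        return instance().example_option_;
    }

    static void example_option(bool value)
    {
        instance().example_option_ = value;
        instance().example_option_set_ = true;
    }

    static bool example_option_set()
    {
        return instance().example_option_set_;
    }

  private:
    static ToyOptions& instance()
    {
        static ToyOptions options;
        return options;
    }

    bool example_option_{false};
    bool example_option_set_{false};
};

// Accessors that dispatch on the parameter enum; false means "unknown parameter".
bool toy_global_get_uint32(toy_param_e param, std::uint32_t* value)
{
    switch (param) {
    case toy_p_example_option:
        *value = ToyOptions::example_option() ? 1 : 0;
        return true;
    }
    return false;
}

bool toy_global_set_uint32(toy_param_e param, std::uint32_t value)
{
    switch (param) {
    case toy_p_example_option:
        ToyOptions::example_option(value != 0);
        return true;
    }
    return false;
}
```

The separate `example_option_set_` flag lets other code distinguish "left at default" from "explicitly set to the default value", which is presumably why the steps above describe it as optional.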
Quick Reference for Global Limits
Global limits are uint32_t values (e.g., parser_max_nesting, parser_max_errors):
- Similar steps to options, but use `Limits` class instead of `Options` class
- Place enum in `0x13xxx` (parser) or `0x14xxx` (stream) range
- Add to `namespace limits` in `global.hh`
- Consider interaction with `disable_defaults()` and add `_set_` flag if needed
Quick Reference for Global State
Global state items are read-only values (e.g., version_major, invalid_attribute_errors):
- Add enum: Add `qpdf_p_state_item` to enum in `Constants.h` (use `0x10xxx` range for global state)
- Add member: Add `uint32_t state_item_{initial_value};` to `State` class in `global_private.hh`
- Add getter: Add `static uint32_t const& state_item()` getter in `State` class
- For error counters: Also add `static void error_type()` incrementer method
- Add public API: Add read-only getter at top level of `qpdf::global` namespace in `global.hh`
- Add case: Add case to `qpdf_global_get_uint32()` in `global.cc` (read-only, no setter)
- Add tests: Add tests in `libtests/objects.cc`
- For error counters: Add warning in `QPDFJob.cc` and call `global::State::error_type()` where errors occur
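A minimal sketch of the read-only counter pattern these steps describe. The class `ToyState` and its members are hypothetical, and the singleton layout is an assumption; only the getter and incrementer signatures follow the description above.

```cpp
#include <cstdint>

// Hypothetical state holder; a sketch of the pattern, not qpdf's State class.
class ToyState
{
  public:
    // Read-only getter: callers can observe the counter but cannot assign to it.
    static std::uint32_t const& example_errors()
    {
        return instance().example_errors_;
    }

    // Incrementer, called at the point where the error is detected.
    static void example_error()
    {
        ++instance().example_errors_;
    }

  private:
    static ToyState& instance()
    {
        static ToyState state;
        return state;
    }

    std::uint32_t example_errors_{0};
};
```

Code that detects the condition would call `ToyState::example_error()`, while reporting code would read `ToyState::example_errors()` and emit a warning if it is non-zero.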
Example
The preserve_invalid_attributes feature demonstrates all patterns:
- Commit 1: Global option (C++ API)
- Commit 2: CLI integration
- Commit 3: Error tracking (`invalid_attribute_errors` counter in State class)
Pull Request Review Guidelines
When reviewing pull requests and providing feedback with recommended changes:
Open a new pull request with your comments and recommended changes - Do not comment on the existing PR. Create a new PR that:
- Forks from the PR branch being reviewed
- Includes your recommended changes as commits
- Links back to the original PR in the description
- Explains each change clearly in commit messages
This approach allows:
- The original author to review, discuss, and merge your suggestions
- Changes to be tested in CI before being accepted
- A clear history of who made which changes
- Easy cherry-picking of specific suggestions
Validation Checklist
Before submitting changes:
- [ ] `cmake --build build` succeeds without warnings (WERROR is ON in maintainer mode)
- [ ] `ctest --output-on-failure` - all tests pass
- [ ] `./format-code` - code is properly formatted
- [ ] `./spell-check` - no spelling errors (requires cspell: `npm install -g cspell`)
Troubleshooting
Common Issues
- "clang-format version >= 20 is required": The `format-code` script automatically tries `clang-format-20` if available. Install clang-format 20 or newer via your package manager.
- Build fails in source directory: Always use out-of-source builds (`cmake -B build`).
- Tests fail with file comparison errors: May be due to zlib version differences. Use `qpdf-test-compare` for comparisons.
- generate_auto_job errors: Ensure Python 3 and PyYAML are installed.
Environment Variables for Extended Tests
- `QPDF_TEST_COMPARE_IMAGES=1`: Enable image comparison tests
- `QPDF_LARGE_FILE_TEST_PATH=/path`: Enable large file tests (needs 11GB free)
Trust These Instructions
These instructions have been validated against the actual repository. Only search for additional information if:
- Instructions appear outdated or incomplete
- Build commands fail unexpectedly
- Test patterns don't match current code structure