From 5e8cfc2d2ea2e1b1027ac9496e6fcc4af7be344f Mon Sep 17 00:00:00 2001 From: m-holger Date: Wed, 10 Dec 2025 16:34:04 +0000 Subject: [PATCH] Add copilot-instructions.md --- .github/copilot-instructions.md | 256 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 256 insertions(+), 0 deletions(-) create mode 100644 .github/copilot-instructions.md diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..86d0ae0 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,256 @@ +# Copilot Coding Agent Instructions for qpdf + +## Repository Summary +qpdf is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption, page splitting/merging, and PDF file inspection. Version: 12.3.0. + +**Project Type:** C++ library and CLI tool (C++20 standard) +**Build System:** CMake 3.16+ with Ninja generator +**External Dependencies:** zlib, libjpeg, OpenSSL, GnuTLS (crypto providers) + +## Build Instructions + +### Quick Build (Recommended) +```bash +# Install dependencies (Ubuntu/Debian) +sudo apt-get install build-essential cmake ninja-build zlib1g-dev libjpeg-dev libgnutls28-dev libssl-dev + +# Configure and build +cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo +cmake --build build -j$(nproc) + +# Run tests +cd build && ctest --output-on-failure +``` + +### Using CMake Presets (Maintainer Mode) +```bash +cmake --preset maintainer # Configure +cmake --build --preset maintainer # Build +ctest --preset maintainer # Test +``` + +Available presets: `maintainer`, `maintainer-debug`, `maintainer-coverage`, `maintainer-profile`, `debug`, `release`, `sanitizers`, `msvc`, `msvc-release`. Use `cmake --list-presets` to see all options. + +### Build Notes +- **Always build out-of-source** in a subdirectory (e.g., `build/`). In-source builds are explicitly blocked. +- Build time: approximately 2-3 minutes on typical CI runners. +- Test suite time: approximately 1 minute for all 7 test groups. +- The `MAINTAINER_MODE` cmake option enables stricter checks and auto-generation of job files. + +## Running Tests +```bash +cd build + +# Run all tests +ctest --output-on-failure + +# Run specific test groups +ctest -R qpdf # Main qpdf CLI tests (~43 seconds) +ctest -R libtests # Library unit tests (~8 seconds) +ctest -R examples # Example code tests +ctest -R fuzz # Fuzzer tests + +# Run with verbose output +ctest --verbose +``` + +**Test Framework:** Tests use `qtest` (a Perl-based test framework). Tests are invoked via `ctest` and compare outputs against expected files. Test coverage uses `QTC::TC` macros. + +## Code Formatting +```bash +./format-code # Formats all C/C++ files with clang-format +``` +- Requires **clang-format version 20 or higher**. +- Configuration: `.clang-format` in the repository root. +- Always run before committing changes to C/C++ files. + +## Project Layout + +### Key Directories +| Directory | Purpose | +|-----------|---------| +| `libqpdf/` | Core library implementation (*.cc files) | +| `include/qpdf/` | Public headers (QPDF.hh, QPDFObjectHandle.hh, QPDFWriter.hh) | +| `qpdf/` | CLI executable and main test driver | +| `libtests/` | Library unit tests | +| `examples/` | Example programs demonstrating API usage | +| `fuzz/` | Fuzzer test programs for oss-fuzz | +| `manual/` | Documentation (reStructuredText for Sphinx) | +| `build-scripts/` | CI and build automation scripts | + +### Important Files +| File | Purpose | +|------|---------| +| `CMakeLists.txt` | Main build configuration | +| `CMakePresets.json` | Predefined build configurations | +| `job.yml` | Command-line argument definitions (auto-generates code) | +| `generate_auto_job` | Python script that generates argument parsing code | +| `.clang-format` | Code formatting rules | +| `README-maintainer.md` | Detailed maintainer and coding guidelines | + +### Auto-Generated Files +When modifying `job.yml` or CLI options, regenerate with: +```bash +./generate_auto_job --generate +# Or build with: cmake -DGENERATE_AUTO_JOB=ON +``` + +## CI Workflows (.github/workflows/) + +### main.yml (Primary CI) +- **Prebuild**: Documentation and external libs preparation +- **Linux**: Full build and test with image comparison +- **Windows**: MSVC and MinGW builds (32/64-bit) +- **macOS**: macOS build +- **AppImage**: Linux AppImage generation +- **Sanitizers**: AddressSanitizer and UndefinedBehaviorSanitizer tests +- **CodeCov**: Coverage reporting +- **pikepdf**: Compatibility testing with pikepdf Python library + +## Coding Conventions + +### Must Follow +1. **Assertions**: Test code should include `qpdf/assert_test.h` first. Debug code should include `qpdf/assert_debug.h` and use `qpdf_assert_debug` instead of `assert`. Use `qpdf_expect`, `qpdf_ensures`, `qpdf_invariant` for pre/post-conditions. Never use raw `assert()`. The `check-assert` test enforces this. +2. **Use `QIntC` for type conversions** - Required for safe integer casting. +3. **Avoid `operator[]`** - Use `.at()` for std::string and std::vector (see README-hardening.md). +4. **Include order**: Include the class's own header first, then a blank line, then other includes. +5. **Use `std::to_string`** instead of QUtil::int_to_string. + +### New Code Style (See `libqpdf/qpdf/AcroForm.hh` FormNode class for examples) +1. **PIMPL Pattern**: New public classes should use the PIMPL (Pointer to Implementation) pattern with a full implementation class. See `QPDFAcroFormDocumentHelper::Members` as an example. +2. **Avoid `this->`**: Do not use `this->` and remove it when updating existing code. +3. **QTC::TC Calls**: Remove simple `QTC::TC` calls (those with 2 parameters) unless they are the only executable statement in a branch. + - When removing a `QTC::TC` call: + - Use the first parameter to find the corresponding `.testcov` file. + - Remove the line in the `.testcov` (or related coverage file) that includes the second parameter. +4. **Doxygen Comments**: Use `///` style comments with appropriate tags (`@brief`, `@param`, `@return`, `@tparam`, `@since`). + ```cpp + /// @brief Retrieves the field value. + /// + /// @param inherit If true, traverse parent hierarchy. + /// @return The field value or empty string if not found. + std::string value() const; + ``` +5. **Member Variables**: Use trailing underscores for member variables (e.g., `cache_valid_`, `fields_`). +6. **Naming Conventions**: + - Use `snake_case` for new function and variable names (e.g., `fully_qualified_name()`, `root_field()`). + - **Exception**: PDF dictionary entry accessors and variables use the exact capitalization from the PDF spec (e.g., `FT()`, `TU()`, `DV()` for `/FT`, `/TU`, `/DV`). +7. **Getters/Setters**: Simple getters/setters use the attribute name without "get" or "set" prefixes: + ```cpp + String TU() const { return {get("/TU")}; } + ``` + Note: Names like `setFieldAttribute()` are legacy naming; new code should use `snake_case` (e.g., `set_field_attribute()`). + +The qpdf API is being actively updated. Prefer the new internal APIs in code in the libqpdf and +libtests directories: + +8. **New APIs are initially private** - New API additions are for internal qpdf use only + initially. Do not use in code in other directories, e.g. examples +9. **Prefer typed handles** - Use `BaseHandle` methods and typed object handles (`Integer`, + `Array`, `Dictionary`, `String`) over generic `QPDFObjectHandle` +10. **Use PIMPL pattern** - Prefer private implementation classes (`Members` classes) for + internal use +11. **Array semantics** - Array methods treat scalars as single-element arrays and null as empty + array (per PDF spec) +12. **Map semantics** - Map methods treat null values as missing entries (per PDF spec) +13. **Object references** - Methods often return references; avoid unnecessary copying but copy + if reference may become stale +14. **Thread safety** - Object handles cannot be shared across threads + +### Style +- Column limit: 100 characters +- Braces on their own lines for classes/functions +- Use `// line-break` comment to prevent clang-format from joining lines +- Use `// clang-format off/on` for blocks that shouldn't be formatted + +## Adding Command-Line Arguments +1. Add option to `job.yml` (top half for CLI, bottom half for JSON schema) +2. Add documentation in `manual/cli.rst` with `.. qpdf:option::` directive +3. Implement the Config method in `libqpdf/QPDFJob_config.cc` +4. Build with `-DGENERATE_AUTO_JOB=1` or run `./generate_auto_job --generate` + +## Adding Global Options and Limits + +Global options and limits are qpdf-wide settings in the `qpdf::global` namespace that affect behavior across all operations. See `README-maintainer.md` section "HOW TO ADD A GLOBAL OPTION OR LIMIT" for complete details. + +### Quick Reference for Global Options + +Global options are boolean settings (e.g., `inspection_mode`, `preserve_invalid_attributes`): + +1. **Add enum**: Add `qpdf_p_option_name` to `qpdf_param_e` enum in `include/qpdf/Constants.h` (use `0x11xxx` range) +2. **Add members**: Add `bool option_name_{false};` and optionally `bool option_name_set_{false};` to `Options` class in `libqpdf/qpdf/global_private.hh` +3. **Add methods**: Add static getter/setter to `Options` class in same file +4. **Add cases**: Add cases to `qpdf_global_get_uint32()` and `qpdf_global_set_uint32()` in `libqpdf/global.cc` +5. **Add public API**: Add inline getter/setter with Doxygen docs in `include/qpdf/global.hh` under `namespace options` +6. **Add tests**: Add tests in `libtests/objects.cc` +7. **CLI integration** (optional): Add to `job.yml` global section, regenerate, implement in `QPDFJob_config.cc`, document in `manual/cli.rst` + +### Quick Reference for Global Limits + +Global limits are uint32_t values (e.g., `parser_max_nesting`, `parser_max_errors`): + +- Similar steps to options, but use `Limits` class instead of `Options` class +- Place enum in `0x13xxx` (parser) or `0x14xxx` (stream) range +- Add to `namespace limits` in `global.hh` +- Consider interaction with `disable_defaults()` and add `_set_` flag if needed + +### Quick Reference for Global State + +Global state items are read-only values (e.g., `version_major`, `invalid_attribute_errors`): + +1. **Add enum**: Add `qpdf_p_state_item` to enum in Constants.h (use `0x10xxx` range for global state) +2. **Add member**: Add `uint32_t state_item_{initial_value};` to `State` class in `global_private.hh` +3. **Add getter**: Add `static uint32_t const& state_item()` getter in `State` class +4. **For error counters**: Also add `static void error_type()` incrementer method +5. **Add public API**: Add read-only getter at top level of `qpdf::global` namespace in `global.hh` +6. **Add case**: Add case to `qpdf_global_get_uint32()` in `global.cc` (read-only, no setter) +7. **Add tests**: Add tests in `libtests/objects.cc` +8. **For error counters**: Add warning in `QPDFJob.cc` and call `global::State::error_type()` where errors occur + +### Example + +The `preserve_invalid_attributes` feature demonstrates all patterns: +- Commit 1: Global option (C++ API) +- Commit 2: CLI integration +- Commit 3: Error tracking (`invalid_attribute_errors` counter in State class) + +## Pull Request Review Guidelines + +When reviewing pull requests and providing feedback with recommended changes: + +1. **Open a new pull request with your comments and recommended changes** - Do not comment on the existing PR. Create a new PR that: + - Forks from the PR branch being reviewed + - Includes your recommended changes as commits + - Links back to the original PR in the description + - Explains each change clearly in commit messages + +2. This approach allows: + - The original author to review, discuss, and merge your suggestions + - Changes to be tested in CI before being accepted + - A clear history of who made which changes + - Easy cherry-picking of specific suggestions + +## Validation Checklist +Before submitting changes: +- [ ] `cmake --build build` succeeds without warnings (WERROR is ON in maintainer mode) +- [ ] `ctest --output-on-failure` - all tests pass +- [ ] `./format-code` - code is properly formatted +- [ ] `./spell-check` - no spelling errors (requires cspell: `npm install -g cspell`) + +## Troubleshooting + +### Common Issues +1. **"clang-format version >= 20 is required"**: The `format-code` script automatically tries `clang-format-20` if available. Install clang-format 20 or newer via your package manager. +2. **Build fails in source directory**: Always use out-of-source builds (`cmake -B build`). +3. **Tests fail with file comparison errors**: May be due to zlib version differences. Use `qpdf-test-compare` for comparisons. +4. **generate_auto_job errors**: Ensure Python 3 and PyYAML are installed. + +### Environment Variables for Extended Tests +- `QPDF_TEST_COMPARE_IMAGES=1`: Enable image comparison tests +- `QPDF_LARGE_FILE_TEST_PATH=/path`: Enable large file tests (needs 11GB free) + +## Trust These Instructions +These instructions have been validated against the actual repository. Only search for additional information if: +- Instructions appear outdated or incomplete +- Build commands fail unexpectedly +- Test patterns don't match current code structure -- libgit2 0.21.4