Commit 5e8cfc2d2ea2e1b1027ac9496e6fcc4af7be344f
1 parent
35ca7efe
Add copilot-instructions.md
Showing
1 changed file
with
256 additions
and
0 deletions
.github/copilot-instructions.md
0 → 100644
| 1 | +# Copilot Coding Agent Instructions for qpdf | |
| 2 | + | |
| 3 | +## Repository Summary | |
| 4 | +qpdf is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption, page splitting/merging, and PDF file inspection. Version: 12.3.0. | |
| 5 | + | |
| 6 | +**Project Type:** C++ library and CLI tool (C++20 standard) | |
| 7 | +**Build System:** CMake 3.16+ with Ninja generator | |
| 8 | +**External Dependencies:** zlib, libjpeg, OpenSSL, GnuTLS (crypto providers) | |
| 9 | + | |
| 10 | +## Build Instructions | |
| 11 | + | |
| 12 | +### Quick Build (Recommended) | |
| 13 | +```bash | |
| 14 | +# Install dependencies (Ubuntu/Debian) | |
| 15 | +sudo apt-get install build-essential cmake ninja-build zlib1g-dev libjpeg-dev libgnutls28-dev libssl-dev | |
| 16 | + | |
| 17 | +# Configure and build | |
| 18 | +cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo | |
| 19 | +cmake --build build -j$(nproc) | |
| 20 | + | |
| 21 | +# Run tests | |
| 22 | +cd build && ctest --output-on-failure | |
| 23 | +``` | |
| 24 | + | |
| 25 | +### Using CMake Presets (Maintainer Mode) | |
| 26 | +```bash | |
| 27 | +cmake --preset maintainer # Configure | |
| 28 | +cmake --build --preset maintainer # Build | |
| 29 | +ctest --preset maintainer # Test | |
| 30 | +``` | |
| 31 | + | |
| 32 | +Available presets: `maintainer`, `maintainer-debug`, `maintainer-coverage`, `maintainer-profile`, `debug`, `release`, `sanitizers`, `msvc`, `msvc-release`. Use `cmake --list-presets` to see all options. | |
| 33 | + | |
| 34 | +### Build Notes | |
| 35 | +- **Always build out-of-source** in a subdirectory (e.g., `build/`). In-source builds are explicitly blocked. | |
| 36 | +- Build time: approximately 2-3 minutes on typical CI runners. | |
| 37 | +- Test suite time: approximately 1 minute for all 7 test groups. | |
| 38 | +- The `MAINTAINER_MODE` cmake option enables stricter checks and auto-generation of job files. | |
| 39 | + | |
| 40 | +## Running Tests | |
| 41 | +```bash | |
| 42 | +cd build | |
| 43 | + | |
| 44 | +# Run all tests | |
| 45 | +ctest --output-on-failure | |
| 46 | + | |
| 47 | +# Run specific test groups | |
| 48 | +ctest -R qpdf # Main qpdf CLI tests (~43 seconds) | |
| 49 | +ctest -R libtests # Library unit tests (~8 seconds) | |
| 50 | +ctest -R examples # Example code tests | |
| 51 | +ctest -R fuzz # Fuzzer tests | |
| 52 | + | |
| 53 | +# Run with verbose output | |
| 54 | +ctest --verbose | |
| 55 | +``` | |
| 56 | + | |
| 57 | +**Test Framework:** Tests use `qtest` (a Perl-based test framework). Tests are invoked via `ctest` and compare outputs against expected files. Test coverage uses `QTC::TC` macros. | |
| 58 | + | |
| 59 | +## Code Formatting | |
| 60 | +```bash | |
| 61 | +./format-code # Formats all C/C++ files with clang-format | |
| 62 | +``` | |
| 63 | +- Requires **clang-format version 20 or higher**. | |
| 64 | +- Configuration: `.clang-format` in the repository root. | |
| 65 | +- Always run before committing changes to C/C++ files. | |
| 66 | + | |
| 67 | +## Project Layout | |
| 68 | + | |
| 69 | +### Key Directories | |
| 70 | +| Directory | Purpose | | |
| 71 | +|-----------|---------| | |
| 72 | +| `libqpdf/` | Core library implementation (*.cc files) | | |
| 73 | +| `include/qpdf/` | Public headers (QPDF.hh, QPDFObjectHandle.hh, QPDFWriter.hh) | | |
| 74 | +| `qpdf/` | CLI executable and main test driver | | |
| 75 | +| `libtests/` | Library unit tests | | |
| 76 | +| `examples/` | Example programs demonstrating API usage | | |
| 77 | +| `fuzz/` | Fuzzer test programs for oss-fuzz | | |
| 78 | +| `manual/` | Documentation (reStructuredText for Sphinx) | | |
| 79 | +| `build-scripts/` | CI and build automation scripts | | |
| 80 | + | |
| 81 | +### Important Files | |
| 82 | +| File | Purpose | | |
| 83 | +|------|---------| | |
| 84 | +| `CMakeLists.txt` | Main build configuration | | |
| 85 | +| `CMakePresets.json` | Predefined build configurations | | |
| 86 | +| `job.yml` | Command-line argument definitions (auto-generates code) | | |
| 87 | +| `generate_auto_job` | Python script that generates argument parsing code | | |
| 88 | +| `.clang-format` | Code formatting rules | | |
| 89 | +| `README-maintainer.md` | Detailed maintainer and coding guidelines | | |
| 90 | + | |
| 91 | +### Auto-Generated Files | |
| 92 | +When modifying `job.yml` or CLI options, regenerate with: | |
| 93 | +```bash | |
| 94 | +./generate_auto_job --generate | |
| 95 | +# Or build with: cmake -DGENERATE_AUTO_JOB=ON | |
| 96 | +``` | |
| 97 | + | |
| 98 | +## CI Workflows (.github/workflows/) | |
| 99 | + | |
| 100 | +### main.yml (Primary CI) | |
| 101 | +- **Prebuild**: Documentation and external libs preparation | |
| 102 | +- **Linux**: Full build and test with image comparison | |
| 103 | +- **Windows**: MSVC and MinGW builds (32/64-bit) | |
| 104 | +- **macOS**: macOS build | |
| 105 | +- **AppImage**: Linux AppImage generation | |
| 106 | +- **Sanitizers**: AddressSanitizer and UndefinedBehaviorSanitizer tests | |
| 107 | +- **CodeCov**: Coverage reporting | |
| 108 | +- **pikepdf**: Compatibility testing with pikepdf Python library | |
| 109 | + | |
| 110 | +## Coding Conventions | |
| 111 | + | |
| 112 | +### Must Follow | |
| 113 | +1. **Assertions**: Test code should include `qpdf/assert_test.h` first. Debug code should include `qpdf/assert_debug.h` and use `qpdf_assert_debug` instead of `assert`. Use `qpdf_expect`, `qpdf_ensures`, `qpdf_invariant` for pre/post-conditions. Never use raw `assert()`. The `check-assert` test enforces this. | |
| 114 | +2. **Use `QIntC` for type conversions** - Required for safe integer casting. | |
| 115 | +3. **Avoid `operator[]`** - Use `.at()` for std::string and std::vector (see README-hardening.md). | |
| 116 | +4. **Include order**: Include the class's own header first, then a blank line, then other includes. | |
| 117 | +5. **Use `std::to_string`** instead of QUtil::int_to_string. | |
| 118 | + | |
| 119 | +### New Code Style (See `libqpdf/qpdf/AcroForm.hh` FormNode class for examples) | |
| 120 | +1. **PIMPL Pattern**: New public classes should use the PIMPL (Pointer to Implementation) pattern with a full implementation class. See `QPDFAcroFormDocumentHelper::Members` as an example. | |
| 121 | +2. **Avoid `this->`**: Do not use `this->` and remove it when updating existing code. | |
| 122 | +3. **QTC::TC Calls**: Remove simple `QTC::TC` calls (those with 2 parameters) unless they are the only executable statement in a branch. | |
| 123 | + - When removing a `QTC::TC` call: | |
| 124 | + - Use the first parameter to find the corresponding `.testcov` file. | |
| 125 | + - Remove the line in the `.testcov` (or related coverage file) that includes the second parameter. | |
| 126 | +4. **Doxygen Comments**: Use `///` style comments with appropriate tags (`@brief`, `@param`, `@return`, `@tparam`, `@since`). | |
| 127 | + ```cpp | |
| 128 | + /// @brief Retrieves the field value. | |
| 129 | + /// | |
| 130 | + /// @param inherit If true, traverse parent hierarchy. | |
| 131 | + /// @return The field value or empty string if not found. | |
| 132 | + std::string value() const; | |
| 133 | + ``` | |
| 134 | +5. **Member Variables**: Use trailing underscores for member variables (e.g., `cache_valid_`, `fields_`). | |
| 135 | +6. **Naming Conventions**: | |
| 136 | + - Use `snake_case` for new function and variable names (e.g., `fully_qualified_name()`, `root_field()`). | |
| 137 | + - **Exception**: PDF dictionary entry accessors and variables use the exact capitalization from the PDF spec (e.g., `FT()`, `TU()`, `DV()` for `/FT`, `/TU`, `/DV`). | |
| 138 | +7. **Getters/Setters**: Simple getters/setters use the attribute name without "get" or "set" prefixes: | |
| 139 | + ```cpp | |
| 140 | + String TU() const { return {get("/TU")}; } | |
| 141 | + ``` | |
| 142 | + Note: Names like `setFieldAttribute()` are legacy naming; new code should use `snake_case` (e.g., `set_field_attribute()`). | |
| 143 | + | |
| 144 | +The qpdf API is being actively updated. Prefer the new internal APIs in code in the libqpdf and | |
| 145 | +libtests directories: | |
| 146 | + | |
| 147 | +8. **New APIs are initially private** - New API additions are for internal qpdf use only | |
| 148 | + initially. Do not use in code in other directories, e.g. examples | |
| 149 | +9. **Prefer typed handles** - Use `BaseHandle` methods and typed object handles (`Integer`, | |
| 150 | + `Array`, `Dictionary`, `String`) over generic `QPDFObjectHandle` | |
| 151 | +10. **Use PIMPL pattern** - Prefer private implementation classes (`Members` classes) for | |
| 152 | + internal use | |
| 153 | +11. **Array semantics** - Array methods treat scalars as single-element arrays and null as empty | |
| 154 | + array (per PDF spec) | |
| 155 | +12. **Map semantics** - Map methods treat null values as missing entries (per PDF spec) | |
| 156 | +13. **Object references** - Methods often return references; avoid unnecessary copying but copy | |
| 157 | + if reference may become stale | |
| 158 | +14. **Thread safety** - Object handles cannot be shared across threads | |
| 159 | + | |
| 160 | +### Style | |
| 161 | +- Column limit: 100 characters | |
| 162 | +- Braces on their own lines for classes/functions | |
| 163 | +- Use `// line-break` comment to prevent clang-format from joining lines | |
| 164 | +- Use `// clang-format off/on` for blocks that shouldn't be formatted | |
| 165 | + | |
| 166 | +## Adding Command-Line Arguments | |
| 167 | +1. Add option to `job.yml` (top half for CLI, bottom half for JSON schema) | |
| 168 | +2. Add documentation in `manual/cli.rst` with `.. qpdf:option::` directive | |
| 169 | +3. Implement the Config method in `libqpdf/QPDFJob_config.cc` | |
| 170 | +4. Build with `-DGENERATE_AUTO_JOB=1` or run `./generate_auto_job --generate` | |
| 171 | + | |
| 172 | +## Adding Global Options and Limits | |
| 173 | + | |
| 174 | +Global options and limits are qpdf-wide settings in the `qpdf::global` namespace that affect behavior across all operations. See `README-maintainer.md` section "HOW TO ADD A GLOBAL OPTION OR LIMIT" for complete details. | |
| 175 | + | |
| 176 | +### Quick Reference for Global Options | |
| 177 | + | |
| 178 | +Global options are boolean settings (e.g., `inspection_mode`, `preserve_invalid_attributes`): | |
| 179 | + | |
| 180 | +1. **Add enum**: Add `qpdf_p_option_name` to `qpdf_param_e` enum in `include/qpdf/Constants.h` (use `0x11xxx` range) | |
| 181 | +2. **Add members**: Add `bool option_name_{false};` and optionally `bool option_name_set_{false};` to `Options` class in `libqpdf/qpdf/global_private.hh` | |
| 182 | +3. **Add methods**: Add static getter/setter to `Options` class in same file | |
| 183 | +4. **Add cases**: Add cases to `qpdf_global_get_uint32()` and `qpdf_global_set_uint32()` in `libqpdf/global.cc` | |
| 184 | +5. **Add public API**: Add inline getter/setter with Doxygen docs in `include/qpdf/global.hh` under `namespace options` | |
| 185 | +6. **Add tests**: Add tests in `libtests/objects.cc` | |
| 186 | +7. **CLI integration** (optional): Add to `job.yml` global section, regenerate, implement in `QPDFJob_config.cc`, document in `manual/cli.rst` | |
| 187 | + | |
| 188 | +### Quick Reference for Global Limits | |
| 189 | + | |
| 190 | +Global limits are uint32_t values (e.g., `parser_max_nesting`, `parser_max_errors`): | |
| 191 | + | |
| 192 | +- Similar steps to options, but use `Limits` class instead of `Options` class | |
| 193 | +- Place enum in `0x13xxx` (parser) or `0x14xxx` (stream) range | |
| 194 | +- Add to `namespace limits` in `global.hh` | |
| 195 | +- Consider interaction with `disable_defaults()` and add `_set_` flag if needed | |
| 196 | + | |
| 197 | +### Quick Reference for Global State | |
| 198 | + | |
| 199 | +Global state items are read-only values (e.g., `version_major`, `invalid_attribute_errors`): | |
| 200 | + | |
| 201 | +1. **Add enum**: Add `qpdf_p_state_item` to enum in Constants.h (use `0x10xxx` range for global state) | |
| 202 | +2. **Add member**: Add `uint32_t state_item_{initial_value};` to `State` class in `global_private.hh` | |
| 203 | +3. **Add getter**: Add `static uint32_t const& state_item()` getter in `State` class | |
| 204 | +4. **For error counters**: Also add `static void error_type()` incrementer method | |
| 205 | +5. **Add public API**: Add read-only getter at top level of `qpdf::global` namespace in `global.hh` | |
| 206 | +6. **Add case**: Add case to `qpdf_global_get_uint32()` in `global.cc` (read-only, no setter) | |
| 207 | +7. **Add tests**: Add tests in `libtests/objects.cc` | |
| 208 | +8. **For error counters**: Add warning in `QPDFJob.cc` and call `global::State::error_type()` where errors occur | |
| 209 | + | |
| 210 | +### Example | |
| 211 | + | |
| 212 | +The `preserve_invalid_attributes` feature demonstrates all patterns: | |
| 213 | +- Commit 1: Global option (C++ API) | |
| 214 | +- Commit 2: CLI integration | |
| 215 | +- Commit 3: Error tracking (`invalid_attribute_errors` counter in State class) | |
| 216 | + | |
| 217 | +## Pull Request Review Guidelines | |
| 218 | + | |
| 219 | +When reviewing pull requests and providing feedback with recommended changes: | |
| 220 | + | |
| 221 | +1. **Open a new pull request with your comments and recommended changes** - Do not comment on the existing PR. Create a new PR that: | |
| 222 | + - Forks from the PR branch being reviewed | |
| 223 | + - Includes your recommended changes as commits | |
| 224 | + - Links back to the original PR in the description | |
| 225 | + - Explains each change clearly in commit messages | |
| 226 | + | |
| 227 | +2. This approach allows: | |
| 228 | + - The original author to review, discuss, and merge your suggestions | |
| 229 | + - Changes to be tested in CI before being accepted | |
| 230 | + - A clear history of who made which changes | |
| 231 | + - Easy cherry-picking of specific suggestions | |
| 232 | + | |
| 233 | +## Validation Checklist | |
| 234 | +Before submitting changes: | |
| 235 | +- [ ] `cmake --build build` succeeds without warnings (WERROR is ON in maintainer mode) | |
| 236 | +- [ ] `ctest --output-on-failure` - all tests pass | |
| 237 | +- [ ] `./format-code` - code is properly formatted | |
| 238 | +- [ ] `./spell-check` - no spelling errors (requires cspell: `npm install -g cspell`) | |
| 239 | + | |
| 240 | +## Troubleshooting | |
| 241 | + | |
| 242 | +### Common Issues | |
| 243 | +1. **"clang-format version >= 20 is required"**: The `format-code` script automatically tries `clang-format-20` if available. Install clang-format 20 or newer via your package manager. | |
| 244 | +2. **Build fails in source directory**: Always use out-of-source builds (`cmake -B build`). | |
| 245 | +3. **Tests fail with file comparison errors**: May be due to zlib version differences. Use `qpdf-test-compare` for comparisons. | |
| 246 | +4. **generate_auto_job errors**: Ensure Python 3 and PyYAML are installed. | |
| 247 | + | |
| 248 | +### Environment Variables for Extended Tests | |
| 249 | +- `QPDF_TEST_COMPARE_IMAGES=1`: Enable image comparison tests | |
| 250 | +- `QPDF_LARGE_FILE_TEST_PATH=/path`: Enable large file tests (needs 11GB free) | |
| 251 | + | |
| 252 | +## Trust These Instructions | |
| 253 | +These instructions have been validated against the actual repository. Only search for additional information if: | |
| 254 | +- Instructions appear outdated or incomplete | |
| 255 | +- Build commands fail unexpectedly | |
| 256 | +- Test patterns don't match current code structure | ... | ... |