Commit ddbe59179e64c45a375d6886f892059b49fd81b2

Authored by Jay Berkenbilt
1 parent 59a64115

C API: simplify new error handling and improve documentation

ChangeLog
1 1 2021-12-17 Jay Berkenbilt <ejb@ql.org>
2 2  
  3 + * C API: simplify error handling for uncaught errors (never in a
  4 + released version) and clarify documentation in qpdf-c.h around
  5 + error handling. See qpdf-c.h for details, including how to check
  6 + for errors and the new function qpdf_silence_errors.
  7 +
3 8 * C API: expose getTypeCode and getTypeName from QPDFObjectHandle.
4 9 Fixes #597.
5 10  
... ... @@ -40,10 +45,7 @@
40 45 interfaces. Clarify documentation regarding object accessors and
41 46 how type errors and warnings are handled. Many cases that used to
42 47 crash code that used the C API can now be trapped and will be
43   - written to stderr if not trapped. The new method
44   - qpdf_register_oh_error_handler can be used to specifically handle
45   - errors that occur when accessing object handles. See qpdf-c.h for
46   - details.
  48 + written to stderr if not trapped. See qpdf-c.h for details.
47 49  
48 50 * C API: Add qpdf_oh_new_uninitialized to explicitly create
49 51 uninitialized object handles.
... ...
include/qpdf/qpdf-c.h
... ... @@ -33,6 +33,10 @@
33 33 *
34 34 * There are several things to keep in mind when using the C API.
35 35 *
  36 + * Error handling is tricky because the underlying C++ API uses
  37 + * exception handling. See "ERROR HANDLING" below for a detailed
  38 + * explanation.
  39 + *
36 40 * The C API is not as rich as the C++ API. For any operations
37 41 * that involve actually manipulating PDF objects, you must use
38 42 * the C++ API. The C API is primarily useful for doing basic
... ... @@ -46,8 +50,9 @@
46 50 * multiple threads.
47 51 *
48 52 * All dynamic memory, except for that of the qpdf_data object
49   - * itself, is managed by the library. You must create a qpdf_data
50   - * object using qpdf_init and free it using qpdf_cleanup.
  53 + * itself, is managed by the library unless otherwise noted. You
  54 + * must create a qpdf_data object using qpdf_init and free it
  55 + * using qpdf_cleanup.
51 56 *
52 57 * Many functions return char*. In all cases, the char* values
53 58 * returned are pointers to data inside the qpdf_data object. As
... ... @@ -61,29 +66,6 @@
61 66 * qpdf_get_last_string_length() to get the length of whatever
62 67 * string was just returned.
63 68 *
64   - * Many functions defined here merely set parameters and therefore
65   - * never return error conditions. Functions that access or return
66   - * qpdf_oh object handles may generate warnings but have no way to
67   - * return errors, but the errors may be checked afterwards or
68   - * handled using a registered handler. This is discussed in more
69   - * detail in the section on object handling. Functions that may
70   - * cause PDF files to be read or written may return error
71   - * conditions. Such functions return an error code. If there were
72   - * no errors or warnings, they return QPDF_SUCCESS. If there were
73   - * warnings, the return value has the QPDF_WARNINGS bit set. If
74   - * there were errors, the QPDF_ERRORS bit is set. In other words,
75   - * if there are both warnings and errors, then the return status
76   - * will be QPDF_WARNINGS | QPDF_ERRORS. You may also call the
77   - * qpdf_more_warnings and qpdf_more_errors functions to test
78   - * whether there are unseen warning or error conditions. By
79   - * default, warnings are written to stderr when detected, but this
80   - * behavior can be suppressed. In all cases, errors and warnings
81   - * may be retrieved by calling qpdf_next_warning and
82   - * qpdf_get_error. All exceptions thrown by the C++ interface are
83   - * caught and converted into error messages by the C interface.
84   - * Any exceptions to this are qpdf bugs and should be reported at
85   - * https://github.com/qpdf/qpdf/issues/new.
86   - *
87 69 * Most functions defined here have obvious counterparts that are
88 70 * methods to either QPDF or QPDFWriter. Please see comments in
89 71 * QPDF.hh and QPDFWriter.hh for details on their use. In order
... ... @@ -91,6 +73,72 @@
91 73 * primarily on differences between the C and C++ API.
92 74 */
93 75  
  76 +/* ERROR HANDLING -- changed in qpdf 10.5 */
  77 +
  78 +/* SUMMARY: The only way to know whether a function that does not
  79 + * return an error code has encountered an error is to call
  80 + * qpdf_has_error after each function. You can do this even for
  81 + * functions that do return error codes. You can also call
  82 + * qpdf_silence_errors to prevent qpdf from writing these errors to
  83 + * stderr.
  84 + *
  85 + * DETAILS:
  86 + *
  87 + * There is some complexity in this API's error handling as it tries
  88 + * to adapt C++-style exception handling to work with languages that
  89 + * don't support exceptions, such as C.
  90 + *
  91 + * The data type underlying qpdf_data maintains a list of warnings and
  92 + * a single error. To retrieve warnings, call qpdf_next_warning while
  93 + * qpdf_more_warnings is true. To retrieve the error, call
  94 + * qpdf_get_error when qpdf_has_error is true.
  95 + *
  96 + * There are several things that are important to understand.
  97 + *
  98 + * Some functions return an error code. The value of the error code is
  99 + * made up of a bitwise-OR of QPDF_WARNINGS and QPDF_ERRORS. The
  100 + * QPDF_ERRORS bit is set if there was an error during the *most
  101 + * recent call* to the API. The QPDF_WARNINGS bit is set if there are
  102 + * any warnings that have not yet been retrieved by calling
  103 + * qpdf_next_warning. It is possible for both bits or neither bit to
  104 + * be set.
  105 + *
  106 + * The expected mode of operation is to go through a series of
  107 + * operations, checking for errors after each call, but only checking
  108 + * for warnings at the end. This is similar to how it works in the C++
  109 + * API where warnings are handled in exactly this way but errors
  110 + * result in exceptions being thrown. However, in both the C and C++
  111 + * API, it is possible to check for and handle warnings as they arise.
  112 + *
  113 + * Some functions return values (or void) rather than an error code.
  114 + * This is especially true with the object handling functions. Those
  115 + * functions can still generate errors. To handle errors in those
  116 + * cases, you should explicitly call qpdf_has_error(). Note that, if
  117 + * you want to avoid the inconsistencies in the interface, you can
  118 + * always check for error conditions in this way rather than looking
  119 + * at status return codes.
  120 + *
  121 + * Prior to qpdf 10.5, if one of the functions that does not return an
  122 + * error code encountered an exception, it would cause the entire
  123 + * program to crash. Starting in qpdf 10.5, the default response to an
  124 + * error condition in these situations is to print the error to
  125 + * standard error, issue exactly one warning indicating that such an
  126 + * error occurred, and return a sensible fallback value (0 for
  127 + * numbers, QPDF_FALSE for booleans, "" for strings, or a null or
  128 + * uninitialized object handle). This is better than the old behavior
  129 + * but still undesirable as the best option is to explicitly check for
  130 + * error conditions.
  131 + *
  132 + * To prevent qpdf from writing error messages to stderr in this way,
  133 + * you can call qpdf_silence_errors(). This signals to the qpdf
  134 + * library that you intend to check the error codes yourself.
  135 + *
  136 + * If you encounter a situation where an exception from the C++ code
  137 + * is not properly converted to an error as described above, it is a
  138 + * bug in qpdf, which should be reported at
  139 + * https://github.com/qpdf/qpdf/issues/new.
  140 + */
  141 +
94 142 #include <qpdf/DLL.h>
95 143 #include <qpdf/Types.h>
96 144 #include <qpdf/Constants.h>
... ... @@ -116,6 +164,15 @@ extern "C" {
116 164 # define QPDF_TRUE 1
117 165 # define QPDF_FALSE 0
118 166  
  167 + /* From qpdf 10.5: call this method to signal to the library that
  168 + * you are explicitly handling errors from functions that don't
  169 + * return error codes. Otherwise, the library will print these
  170 + * error conditions to stderr and issue a warning. Prior to 10.5,
  171 + * the program would have crashed from an unhandled exception.
  172 + */
  173 + QPDF_DLL
  174 + void qpdf_silence_errors(qpdf_data qpdf);
  175 +
119 176 /* Returns the version of the qpdf software */
120 177 QPDF_DLL
121 178 char const* qpdf_get_qpdf_version();
... ... @@ -560,48 +617,14 @@ extern "C" {
560 617 * same as letting a QPDFObjectHandle go out of scope in the C++
561 618 * API.
562 619 *
563   - * Important note about error handling:
564   - *
565   - * While many of the functions that operate on the QPDF object
566   - * return error codes, the qpdf_oh functions return values such as
567   - * object handles or data. They have no way to return error codes.
568   - * If they generate warnings, the warnings are handled using the
569   - * error/warning handling functions described above. If the
570   - * underlying C++ call throws an exception, the error handler
571   - * registered with qpdf_register_oh_error_handler() will be
572   - * called. If no handler is registered, the exception is written
573   - * to STDERR. In either case, a sensible fallback value is
574   - * returned (0 for numbers, QPDF_FALSE for booleans, "" for
575   - * strings, or a null object). It is sensible for a C program to
576   - * use setjmp and longjmp with this error handler since the C++
577   - * code has raised an exception, but you can also just set a flag
578   - * and check it after each call.
579   - *
580   - * All conditions under which exceptions would be thrown by object
581   - * accessors are caused by programmer error or major problems such
582   - * as running out of memory or not being able to read the input
583   - * file. If they are ever caused by invalid data in the PDF file,
584   - * it is a bug in qpdf, which should be reported at
585   - * https://github.com/qpdf/qpdf/issues/new.
  620 + * Please see "ERROR HANDLING" above for details on how error
  621 + * conditions are handled.
586 622 */
587 623  
588 624 /* For examples of using this API, see examples/pdf-c-objects.c */
589 625  
590 626 typedef unsigned int qpdf_oh;
591 627  
592   - /* If an exception is thrown by the C++ code when any of the
593   - * qpdf_oh functions are called, the registered handle_error
594   - * function will be called. The value passed to data will be
595   - * passed along to the error handler function. If any errors occur
596   - * and no error handler is accessed, a single warning will be
597   - * issued, and the error will be written to stderr.
598   - */
599   - QPDF_DLL
600   - void qpdf_register_oh_error_handler(
601   - qpdf_data qpdf,
602   - void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data),
603   - void* data);
604   -
605 628 /* Releasing objects -- see comments above. These functions have no
606 629 * equivalent in the C++ API.
607 630 */
... ...
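The ERROR HANDLING comment in the header hunk above describes status codes as a bitwise-OR of a warnings bit and an errors bit, where the errors bit reflects only the most recent call and the warnings bit reflects warnings not yet retrieved. The following is a minimal sketch of that bit logic only. It does not use libqpdf: the `MODEL_*` constants and `model_*` functions are hypothetical names invented for illustration, and real code should use `QPDF_WARNINGS` and `QPDF_ERRORS` from qpdf's own headers.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only. Real code should use
 * QPDF_SUCCESS, QPDF_WARNINGS, and QPDF_ERRORS from qpdf's headers. */
enum {
    MODEL_SUCCESS = 0,
    MODEL_WARNINGS = 1 << 0,
    MODEL_ERRORS = 1 << 1
};

/* Build a status code the way the header comment describes it:
 * the warnings bit tracks warnings that have not been retrieved yet,
 * the errors bit tracks only the most recent call. */
int model_status(bool pending_warnings, bool error_in_last_call)
{
    int status = MODEL_SUCCESS;
    if (pending_warnings) {
        status |= MODEL_WARNINGS;
    }
    if (error_in_last_call) {
        status |= MODEL_ERRORS;
    }
    return status;
}

/* Test each bit separately; comparing the whole status against one
 * constant would miss the case where both bits are set. */
bool model_call_failed(int status)
{
    return (status & MODEL_ERRORS) != 0;
}

bool model_has_pending_warnings(int status)
{
    return (status & MODEL_WARNINGS) != 0;
}
```

A caller following the expected mode of operation would test the errors bit after each call and look at the warnings bit only at the end. Both bits, or neither, may be set.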
libqpdf/qpdf-c.cc
... ... @@ -42,9 +42,8 @@ struct _qpdf_data
42 42 PointerHolder<Buffer> output_buffer;
43 43  
44 44 // QPDFObjectHandle support
45   - void (*oh_error_handler)(qpdf_data, qpdf_error, void*);
46   - void* oh_error_handler_data;
47   - bool default_oh_error_handler_called;
  45 + bool silence_errors;
  46 + bool oh_error_occurred;
48 47 std::map<qpdf_oh, PointerHolder<QPDFObjectHandle>> oh_cache;
49 48 qpdf_oh next_oh;
50 49 std::set<std::string> cur_iter_dict_keys;
... ... @@ -52,32 +51,10 @@ struct _qpdf_data
52 51 std::string cur_dict_key;
53 52 };
54 53  
55   -static void default_oh_error_handler(qpdf_data qpdf, qpdf_error e, void* data)
56   -{
57   - bool* called = reinterpret_cast<bool*>(data);
58   - if (called != nullptr)
59   - {
60   - QTC::TC("qpdf", "qpdf-c warn about oh error", *called ? 0 : 1);
61   - if (! *called)
62   - {
63   - qpdf->warnings.push_back(
64   - QPDFExc(
65   - qpdf_e_internal,
66   - qpdf->qpdf->getFilename(),
67   - "", 0,
68   - "C API object handle accessor errors occurred,"
69   - " and the application did not define an error handler"));
70   - *called = true;
71   - }
72   - }
73   - std::cerr << e->exc->what() << std::endl;
74   -}
75   -
76 54 _qpdf_data::_qpdf_data() :
77 55 write_memory(false),
78   - oh_error_handler(default_oh_error_handler),
79   - oh_error_handler_data(&this->default_oh_error_handler_called),
80   - default_oh_error_handler_called(false),
  56 + silence_errors(false),
  57 + oh_error_occurred(false),
81 58 next_oh(0)
82 59 {
83 60 }
... ... @@ -876,14 +853,10 @@ QPDF_ERROR_CODE qpdf_write(qpdf_data qpdf)
876 853 return status;
877 854 }
878 855  
879   -void qpdf_register_oh_error_handler(
880   - qpdf_data qpdf,
881   - void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data),
882   - void* data)
  856 +void qpdf_silence_errors(qpdf_data qpdf)
883 857 {
884   - QTC::TC("qpdf", "qpdf-c registered oh error handler");
885   - qpdf->oh_error_handler = handle_error;
886   - qpdf->oh_error_handler_data = data;
  858 + QTC::TC("qpdf", "qpdf-c silence oh errors");
  859 + qpdf->silence_errors = true;
887 860 }
888 861  
889 862 template<class RET>
... ... @@ -901,8 +874,24 @@ static RET trap_oh_errors(
901 874 });
902 875 if (status & QPDF_ERRORS)
903 876 {
904   - (*qpdf->oh_error_handler)(
905   - qpdf, qpdf_get_error(qpdf), qpdf->oh_error_handler_data);
  877 + if (! qpdf->silence_errors)
  878 + {
  879 + QTC::TC("qpdf", "qpdf-c warn about oh error",
  880 + qpdf->oh_error_occurred ? 0 : 1);
  881 + if (! qpdf->oh_error_occurred)
  882 + {
  883 + qpdf->warnings.push_back(
  884 + QPDFExc(
  885 + qpdf_e_internal,
  886 + qpdf->qpdf->getFilename(),
  887 + "", 0,
  888 + "C API function caught an exception that it isn't"
  889 + " returning; please point the application developer"
  890 + " to ERROR HANDLING in qpdf-c.h"));
  891 + qpdf->oh_error_occurred = true;
  892 + }
  893 + std::cerr << qpdf->error->what() << std::endl;
  894 + }
906 895 return fallback();
907 896 }
908 897 return ret;
... ...
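The `trap_oh_errors` hunk above replaces the registered handler with two flags: `silence_errors` decides whether anything is reported, and `oh_error_occurred` ensures the explanatory warning is queued only once. The sketch below models that control flow in plain C so it can be read in isolation. It is not qpdf code: `struct model_state`, `model_trap_error`, and `model_warnings_after_two_errors` are hypothetical stand-ins, and setting `has_error` inside the function is a simplification of how the library records the error.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant fields of _qpdf_data. */
struct model_state {
    bool silence_errors;    /* set by the analogue of qpdf_silence_errors */
    bool oh_error_occurred; /* true once the one-time warning is queued */
    bool has_error;         /* what the analogue of qpdf_has_error reports */
    int warning_count;      /* number of queued warnings */
};

/* Analogue of the error branch in trap_oh_errors: record the error,
 * then, unless silenced, queue a single warning on the first error
 * and print every error to stderr. Always return the fallback. */
int model_trap_error(struct model_state* s, char const* what, int fallback)
{
    s->has_error = true;
    if (!s->silence_errors) {
        if (!s->oh_error_occurred) {
            s->warning_count++;
            s->oh_error_occurred = true;
        }
        fprintf(stderr, "%s\n", what);
    }
    return fallback;
}

/* Trigger two errors and report how many warnings were queued. */
int model_warnings_after_two_errors(bool silence)
{
    struct model_state s = { silence, false, false, 0 };
    model_trap_error(&s, "first error", 0);
    model_trap_error(&s, "second error", 0);
    return s.warning_count;
}
```

In this model, two unsilenced errors queue one warning and print two messages, while silencing suppresses both the warning and the stderr output but still leaves the error available to check.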
manual/index.rst
... ... @@ -3632,14 +3632,16 @@ For a detailed list of changes, please see the file
3632 3632  
3633 3633 - C API Enhancements
3634 3634  
3635   - - Overhaul error handling for the object handle functions
3636   - C API. See comments in the "Object handling" section of
3637   - :file:`include/qpdf/qpdf-c.h` for details.
3638   - In particular, exceptions thrown by the underlying C++ code
3639   - when calling object accessors are caught and converted into
3640   - errors. The errors can be trapped by registering an error
3641   - handler with ``qpdf_register_oh_error_handler`` or will be
3642   - written to stderr if no handler is registered.
  3635 + - Overhaul error handling for the object handle functions C API.
  3636 + Some rare error conditions that would previously have caused a
  3637 + crash are now trapped and reported, and the functions that
  3638 + generate them return fallback values. See comments in the
  3639 + ``ERROR HANDLING`` section of :file:`include/qpdf/qpdf-c.h` for
  3640 + details. In particular, exceptions thrown by the underlying C++
  3641 + code when calling object accessors are caught and converted into
  3642 + errors. The errors can be checked by calling ``qpdf_has_error``.
  3643 + Use ``qpdf_silence_errors`` to prevent the error from being
  3644 + written to stderr.
3643 3645  
3644 3646 - Add ``qpdf_get_last_string_length`` to the C API to get the
3645 3647 length of the last string that was returned. This is needed to
... ...
qpdf/qpdf-ctest.c
... ... @@ -73,14 +73,12 @@ static void report_errors()
73 73 }
74 74 }
75 75  
76   -static void handle_oh_error(qpdf_data qpdf, qpdf_error error, void* data)
  76 +static void handle_oh_error(qpdf_data qpdf, char const* label)
77 77 {
78   - char const* label = "oh error";
79   - if (data)
  78 + if (qpdf_has_error(qpdf))
80 79 {
81   - label = *((char const**)data);
  80 + print_error(label, qpdf, qpdf_get_error(qpdf));
82 81 }
83   - print_error(label, qpdf, error);
84 82 }
85 83  
86 84 static void read_file_into_memory(char const* filename,
... ... @@ -837,42 +835,44 @@ static void test29(char const* infile,
837 835 * errors rather than warnings when they don't have an owning QPDF
838 836 * object.
839 837 */
840   - char const* label = "oh error";
841   - qpdf_register_oh_error_handler(qpdf, handle_oh_error, (void*)&label);
  838 + qpdf_silence_errors(qpdf);
842 839  
843 840 /* get_root fails when we have no trailer */
844   - label = "get root";
845 841 qpdf_oh root = qpdf_get_root(qpdf);
  842 + handle_oh_error(qpdf, "get root");
846 843 assert(root != 0);
847 844 assert(! qpdf_oh_is_initialized(qpdf, root));
848 845  
849   - label = "bad parse";
850 846 assert(! qpdf_oh_is_initialized(qpdf, qpdf_oh_parse(qpdf, "[oops")));
  847 + handle_oh_error(qpdf, "bad parse");
851 848 report_errors();
852 849  
853   - label = "type mismatch";
854 850 assert(qpdf_oh_get_int_value_as_int(
855 851 qpdf, qpdf_oh_new_string(qpdf, "x")) == 0);
  852 + handle_oh_error(qpdf, "type mismatch (int operation on string)");
856 853 qpdf_oh int_oh = qpdf_oh_new_integer(qpdf, 12);
857 854 assert(strlen(qpdf_oh_get_string_value(qpdf, int_oh)) == 0);
  855 + handle_oh_error(qpdf, "type mismatch (string operation on int)");
858 856  
859 857 // This doesn't test every possible error flow, but it tests each
860 858 // way of handling errors in the library code.
861   - label = "array type mismatch";
862 859 assert(qpdf_oh_get_array_n_items(qpdf, int_oh) == 0);
  860 + handle_oh_error(qpdf, "array type mismatch - n_items");
863 861 assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, int_oh, 3)));
864   - label = "append to non-array";
  862 + handle_oh_error(qpdf, "array type mismatch - item");
865 863 qpdf_oh_append_item(qpdf, int_oh, qpdf_oh_new_null(qpdf));
  864 + handle_oh_error(qpdf, "append to non-array");
866 865 qpdf_oh array = qpdf_oh_new_array(qpdf);
867   - label = "array bounds";
868 866 assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, array, 3)));
  867 + handle_oh_error(qpdf, "array bounds");
869 868  
870   - label = "dictionary iter type mismatch";
871 869 qpdf_oh_begin_dict_key_iter(qpdf, int_oh);
872 870 assert(qpdf_oh_dict_more_keys(qpdf) == QPDF_FALSE);
873   - label = "dictionary type mismatch";
  871 + handle_oh_error(qpdf, "dictionary iter type mismatch");
874 872 assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_key(qpdf, int_oh, "potato")));
  873 + handle_oh_error(qpdf, "dictionary type mismatch");
875 874 assert(qpdf_oh_has_key(qpdf, int_oh, "potato") == QPDF_FALSE);
  875 + handle_oh_error(qpdf, "dictionary type mismatch");
876 876  
877 877 report_errors();
878 878 }
... ... @@ -1030,6 +1030,9 @@ static void test35(char const* infile,
1030 1030 assert(qpdf_more_warnings(qpdf));
1031 1031 e = qpdf_next_warning(qpdf);
1032 1032 assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS);
  1033 + assert(qpdf_has_error(qpdf));
  1034 + e = qpdf_get_error(qpdf);
  1035 + assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS);
1033 1036 assert(! qpdf_has_error(qpdf));
1034 1037  
1035 1038 assert(qpdf_find_page_by_id(qpdf, 100, 0) == -1);
... ...
qpdf/qpdf.testcov
... ... @@ -606,7 +606,6 @@ qpdf-c called qpdf_oh_is_initialized 0
606 606 qpdf-c registered progress reporter 0
607 607 qpdf-c called qpdf_oh_new_uninitialized 0
608 608 qpdf-c warn about oh error 1
609   -qpdf-c registered oh error handler 0
610 609 qpdf-c cleanup warned about unhandled error 0
611 610 qpdf-c called qpdf_get_object_by_id 0
612 611 qpdf-c called qpdf_replace_object 0
... ... @@ -625,3 +624,4 @@ qpdf-c stream data filtered set 1
625 624 qpdf-c stream data buf set 1
626 625 qpdf-c called qpdf_oh_get_page_content_data 0
627 626 qpdf-c called qpdf_oh_replace_stream_data 0
  627 +qpdf-c silence oh errors 0
... ...
qpdf/qtest/qpdf/c-object-handles.out
... ... @@ -10,9 +10,14 @@ item 3: 792 792.00
10 10 minimal.pdf (C API object handle 6): attempted access to unknown object handle
11 11 minimal.pdf (C API object handle 9): attempted access to unknown object handle
12 12 minimal.pdf (C API object handle 9): attempted access to unknown object handle
13   -warning: minimal.pdf: C API object handle accessor errors occurred, and the application did not define an error handler
  13 +warning: minimal.pdf: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h
14 14 code: 1
15 15 file: minimal.pdf
16 16 pos : 0
17   - text: C API object handle accessor errors occurred, and the application did not define an error handler
  17 + text: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h
  18 +error: minimal.pdf (C API object handle 9): attempted access to unknown object handle
  19 + code: 1
  20 + file: minimal.pdf
  21 + pos : 0
  22 + text: attempted access to unknown object handle
18 23 C test 24 done
... ...
qpdf/qtest/qpdf/c-oh-errors.out
... ... @@ -8,22 +8,22 @@ bad parse: parsed object (offset 1): unknown token while reading object; treatin
8 8 file: parsed object
9 9 pos : 1
10 10 text: unknown token while reading object; treating as string
11   -type mismatch: operation for integer attempted on object of type string: returning 0
  11 +type mismatch (int operation on string): operation for integer attempted on object of type string: returning 0
12 12 code: 7
13 13 file:
14 14 pos : 0
15 15 text: operation for integer attempted on object of type string: returning 0
16   -type mismatch: operation for string attempted on object of type integer: returning empty string
  16 +type mismatch (string operation on int): operation for string attempted on object of type integer: returning empty string
17 17 code: 7
18 18 file:
19 19 pos : 0
20 20 text: operation for string attempted on object of type integer: returning empty string
21   -array type mismatch: operation for array attempted on object of type integer: treating as empty
  21 +array type mismatch - n_items: operation for array attempted on object of type integer: treating as empty
22 22 code: 7
23 23 file:
24 24 pos : 0
25 25 text: operation for array attempted on object of type integer: treating as empty
26   -array type mismatch: operation for array attempted on object of type integer: returning null
  26 +array type mismatch - item: operation for array attempted on object of type integer: returning null
27 27 code: 7
28 28 file:
29 29 pos : 0
... ...
qpdf/qtest/qpdf/c-oh-uninitialized-objects.out
1 1 closed input source (C API object handle 1): attempted access to unknown object handle
  2 +WARNING: application did not handle error: closed input source (C API object handle 1): attempted access to unknown object handle
2 3 C test 26 done
... ...
qpdf/qtest/qpdf/c-page-errors.out
... ... @@ -6,4 +6,9 @@ warning: object 27 0: operation for dictionary attempted on object of type null:
6 6 file:
7 7 pos : 0
8 8 text: operation for dictionary attempted on object of type null: ignoring key replacement request
  9 +error: 11-pages.pdf (C API object handle 1000): attempted access to unknown object handle
  10 + code: 1
  11 + file: 11-pages.pdf
  12 + pos : 0
  13 + text: attempted access to unknown object handle
9 14 C test 35 done
... ...