Commit 3d0de5b92471a280bc8baf467159561d79428ebc

Authored by Jay Berkenbilt
1 parent 98174373

Fixes to ChangeLog and manual for 10.0.0 changes

ChangeLog
@@ -8,6 +8,12 @@ @@ -8,6 +8,12 @@
8 recovery when objects are copied from other files and when 8 recovery when objects are copied from other files and when
9 "immediate copy from" is enabled. 9 "immediate copy from" is enabled.
10 10
  11 + * When copying foreign streams with immediateCopyFrom set, the
  12 + same type of recovery from streams with filtering errors is
  13 + performed as when dealing with streams in the original input. This
  14 + could happen, for example, if you are using the --pages option to
  15 + take pages from another file and that file has errors in it.
  16 +
11 * Add a new version of QPDFObjectHandle::pipeStreamData whose 17 * Add a new version of QPDFObjectHandle::pipeStreamData whose
12 return value indicates overall success or failure rather than 18 return value indicates overall success or failure rather than
13 whether or not filtering was attempted. It should have always 19 whether or not filtering was attempted. It should have always
@@ -36,6 +42,12 @@ @@ -36,6 +42,12 @@
36 --preserve-unreferenced-resources is now a synonym for 42 --preserve-unreferenced-resources is now a synonym for
37 --remove-unreferenced-resources=no. 43 --remove-unreferenced-resources=no.
38 44
  45 + * Use std::atomic for unique ID generation internally within the
  46 + library. This eliminates the already extremely low chance of a
  47 + collision, improves thread safety, and removes a dependency on a
  48 + random number generator. Thanks to Dean Scarff for the
  49 + contribution.
  50 +
39 2020-04-03 Jay Berkenbilt <ejb@ql.org> 51 2020-04-03 Jay Berkenbilt <ejb@ql.org>
40 52
41 * Allow qpdf to be built on systems without wchar_t. All "normal" 53 * Allow qpdf to be built on systems without wchar_t. All "normal"
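
Note on the std::atomic ChangeLog entry in the hunk above: the following is a minimal sketch of counter-based unique ID generation, not qpdf's internal code. The function name next_unique_id and the choice of a 64-bit counter are assumptions made for illustration. It shows why an atomic counter needs no random number generator and cannot hand out the same value twice within one process.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of qpdf): returns an ID that is unique
// within the current process, even when called from several threads.
inline std::uint64_t next_unique_id()
{
    // A function-local static is initialized exactly once, and that
    // initialization is thread-safe since C++11.
    static std::atomic<std::uint64_t> counter{0};

    // fetch_add atomically increments the counter and returns the value
    // it held before the increment, so no two callers see the same value.
    // Adding 1 makes the first ID 1 rather than 0.
    return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}
```

Relaxed memory ordering is enough in this sketch because the only requirement is that each increment is atomic; the IDs are not used to synchronize other data.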
@@ -50,6 +62,10 @@ @@ -50,6 +62,10 @@
50 maximally fill the destination rectangle. Prior to this change, 62 maximally fill the destination rectangle. Prior to this change,
51 placeFormXObject might shrink it but would never expand it. 63 placeFormXObject might shrink it but would never expand it.
52 64
  65 + * When calling the C API, accept any non-zero value as TRUE rather
  66 + than just 1. This appears to resolve issues on Windows when
  67 + calling some versions of the DLL directly from other languages.
  68 +
53 2020-04-02 Jay Berkenbilt <ejb@ql.org> 69 2020-04-02 Jay Berkenbilt <ejb@ql.org>
54 70
55 * Add method QPDFObjectHandle::unsafeShallowCopy for copying only 71 * Add method QPDFObjectHandle::unsafeShallowCopy for copying only
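
Note on the C API entry in the hunk above: the following sketch contrasts a strict "equals 1" check with a "non-zero" check. The names DEMO_BOOL, strict_is_true, and lenient_is_true are hypothetical and do not appear in qpdf's C API; the value -1 is used only as an example of a non-zero "true" that a caller in another language might pass.

```cpp
#include <cassert>

// Hypothetical stand-in for a C-style boolean type (illustration only).
typedef int DEMO_BOOL;

// Strict check: only the exact value 1 counts as true.
inline bool strict_is_true(DEMO_BOOL value)
{
    return value == 1;
}

// Lenient check: any non-zero value counts as true, which matches how
// C itself treats integers in a condition.
inline bool lenient_is_true(DEMO_BOOL value)
{
    return value != 0;
}
```

With the strict check, a caller that passes -1 or 255 to mean "true" is silently treated as passing "false"; the lenient check avoids that mismatch.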
manual/qpdf-manual.xml
@@ -1945,20 +1945,50 @@ outfile.pdf&lt;/option&gt; @@ -1945,20 +1945,50 @@ outfile.pdf&lt;/option&gt;
1945 </listitem> 1945 </listitem>
1946 </varlistentry> 1946 </varlistentry>
1947 <varlistentry> 1947 <varlistentry>
  1948 + <term><option>--remove-unreferenced-resources=<replaceable>option</replaceable></option></term>
  1949 + <listitem>
  1950 + <para>
  1951 + The <replaceable>option</replaceable> may be
  1952 + <literal>auto</literal>, <literal>yes</literal>, or
  1953 + <literal>no</literal>. The default is <literal>auto</literal>.
  1954 + </para>
  1955 + <para>
  1956 + Starting with qpdf 8.1, when splitting pages, qpdf is able to
  1957 + attempt to remove images and fonts that are not used by a page
  1958 + even if they are referenced in the page's resources
  1959 + dictionary. When shared resources are in use, this behavior
  1960 + can greatly reduce the file sizes of split pages, but the
  1961 + analysis is very slow. In versions from 8.1 through 9.1.1,
  1962 + qpdf did this analysis by default. Starting in qpdf 10.0.0, if
  1963 + <literal>auto</literal> is used, qpdf does a quick analysis of
  1964 + the file to determine whether the file is likely to have
  1965 + unreferenced objects on pages, a pattern that frequently
  1966 + occurs when resource dictionaries are shared across multiple
  1967 + pages and rarely occurs otherwise. If it discovers this
  1968 + pattern, then it will attempt to remove unreferenced
  1969 + resources. Usually this means you get the slower splitting
  1970 + speed only when it's actually going to create smaller files.
  1971 + You can suppress removal of unreferenced resources altogether
  1972 + by specifying <literal>no</literal> or force it to do the full
  1973 + algorithm by specifying <literal>yes</literal>.
  1974 + </para>
  1975 + <para>
  1976 + Other than cases in which you don't care about file size and
  1977 + care a lot about runtime, there are few reasons to use this
  1978 + option, especially now that <literal>auto</literal> mode is
  1979 + supported. One reason to use this is if you suspect that qpdf
  1980 + is removing resources it shouldn't be removing. If you
  1981 + encounter that case, please report it as a bug at <ulink
  1982 + url="https://github.com/qpdf/qpdf/issues/">https://github.com/qpdf/qpdf/issues/</ulink>.
  1983 + </para>
  1984 + </listitem>
  1985 + </varlistentry>
  1986 + <varlistentry>
1948 <term><option>--preserve-unreferenced-resources</option></term> 1987 <term><option>--preserve-unreferenced-resources</option></term>
1949 <listitem> 1988 <listitem>
1950 <para> 1989 <para>
1951 - Starting with qpdf 8.1, when splitting pages, qpdf ordinarily  
1952 - attempts to remove images and fonts that are not used by a  
1953 - page even if they are referenced in the page's resources  
1954 - dictionary. This option suppresses that behavior. There are  
1955 - few reasons to use this option. One reason to use this is if  
1956 - you suspect that qpdf is removing resources it shouldn't be  
1957 - removing. If you encounter that case, please report it as a  
1958 - bug. Another reason is that the new behavior can be much  
1959 - slower for files that include a very large number of images or  
1960 - other XObjects on a page. In that case, using this option will  
1961 - return qpdf to the old behavior and speed. 1990 + This is a synonym for
  1991 + <option>--remove-unreferenced-resources=no</option>.
1962 </para> 1992 </para>
1963 <para> 1993 <para>
1964 See also <option>--preserve-unreferenced</option>, which does 1994 See also <option>--preserve-unreferenced</option>, which does
@@ -4700,6 +4730,239 @@ print &quot;\n&quot;; @@ -4700,6 +4730,239 @@ print &quot;\n&quot;;
4700 <filename>ChangeLog</filename> in the source distribution. 4730 <filename>ChangeLog</filename> in the source distribution.
4701 </para> 4731 </para>
4702 <variablelist> 4732 <variablelist>
  4733 +<!--
  4734 + <varlistentry>
  4735 + <term>x.y.z: Month dd, YYYY</term>
  4736 + <listitem>
  4737 + <itemizedlist>
  4738 + <listitem>
  4739 + <para>
  4740 + Category
  4741 + </para>
  4742 + <itemizedlist>
  4743 + <listitem>
  4744 + <para>
  4745 + Item
  4746 + </para>
  4747 + </listitem>
  4748 + <listitem>
  4749 + <para>
  4750 + Item
  4751 + </para>
  4752 + </listitem>
  4753 + </itemizedlist>
  4754 + </listitem>
  4755 + <listitem>
  4756 + <para>
  4757 + Category
  4758 + </para>
  4759 + <itemizedlist>
  4760 + <listitem>
  4761 + <para>
  4762 + Item
  4763 + </para>
  4764 + </listitem>
  4765 + <listitem>
  4766 + <para>
  4767 + Item
  4768 + </para>
  4769 + </listitem>
  4770 + </itemizedlist>
  4771 + </listitem>
  4772 + </itemizedlist>
  4773 + </listitem>
  4774 + </varlistentry>
  4775 +-->
  4776 + <varlistentry>
  4777 + <term>10.0.0: April 6, 2020</term>
  4778 + <listitem>
  4779 + <itemizedlist>
  4780 + <listitem>
  4781 + <para>
  4782 + Performance Enhancements
  4783 + </para>
  4784 + <itemizedlist>
  4785 + <listitem>
  4786 + <para>
  4787 + The qpdf library and executable should run much faster in
  4788 + this version than in the last several releases. Several
  4789 + internal library optimizations have been made, and there has
  4790 + been improved behavior on page splitting as well. This
  4791 + version of qpdf should outperform any of the 8.x or 9.x
  4792 + versions.
  4793 + </para>
  4794 + </listitem>
  4795 + </itemizedlist>
  4796 + </listitem>
  4797 + <listitem>
  4798 + <para>
  4799 + CLI Enhancements
  4800 + </para>
  4801 + <itemizedlist>
  4802 + <listitem>
  4803 + <para>
  4804 + Add <literal>objectinfo</literal> key to the JSON output.
  4805 + This will be a place to put computed metadata or other
  4806 + information about PDF objects that are not immediately
  4807 + evident in other ways or that seem useful for some other
  4808 + reason. In this version, information is provided about each
  4809 + object indicating whether it is a stream and, if so, what
  4810 + its length and filters are. Without this, it was not
  4811 + possible to tell conclusively from the JSON output alone
  4812 + whether or not an object was a stream. Run <command>qpdf
  4813 + --json-help</command> for details.
  4814 + </para>
  4815 + </listitem>
  4816 + <listitem>
  4817 + <para>
  4818 + Add new option
  4819 + <option>--remove-unreferenced-resources</option> which takes
  4820 + <literal>auto</literal>, <literal>yes</literal>, or
  4821 + <literal>no</literal> as arguments. The new
  4822 + <literal>auto</literal> mode, which is the default, performs
  4823 + a fast heuristic over a PDF file when splitting pages to
  4824 + determine whether the expensive process of finding and
  4825 + removing unreferenced resources is likely to be of benefit.
  4826 + For most files, this new default will result in a
  4827 + significant performance improvement for splitting pages. See
  4828 + <xref linkend="ref.advanced-transformation"/> for a more
  4829 + detailed discussion.
  4830 + </para>
  4831 + </listitem>
  4832 + <listitem>
  4833 + <para>
  4834 + The <option>--preserve-unreferenced-resources</option> is
  4835 + now just a synonym for
  4836 + <option>--remove-unreferenced-resources=no</option>.
  4837 + </para>
  4838 + </listitem>
  4839 + <listitem>
  4840 + <para>
  4841 + If the <literal>QPDF_EXECUTABLE</literal> environment
  4842 + variable is set when invoking <command>qpdf
  4843 + --bash-completion</command> or <command>qpdf
  4844 + --zsh-completion</command>, the completion command that it
  4845 + outputs will refer to qpdf using the value of that variable
  4846 + rather than what <command>qpdf</command> determines its
  4847 + executable path to be. This can be useful when wrapping
  4848 + <command>qpdf</command> with a script, working with a
  4849 + version in the source tree, using an AppImage, or other
  4850 + situations where there is some indirection.
  4851 + </para>
  4852 + </listitem>
  4853 + </itemizedlist>
  4854 + </listitem>
  4855 + <listitem>
  4856 + <para>
  4857 + Library Enhancements
  4858 + </para>
  4859 + <itemizedlist>
  4860 + <listitem>
  4861 + <para>
  4862 + Add a new version of
  4863 + <function>QPDFObjectHandle::StreamDataProvider::provideStreamData</function>
  4864 + that accepts the <function>suppress_warnings</function> and
  4865 + <function>will_retry</function> options and allows a success
  4866 + code to be returned. This makes it possible to implement a
  4867 + <classname>StreamDataProvider</classname> that calls
  4868 + <function>pipeStreamData</function> on another stream and to
  4869 + pass the response back to the caller, which enables better
  4870 + error handling on those proxied streams.
  4871 + </para>
  4872 + </listitem>
  4873 + <listitem>
  4874 + <para>
  4875 + Update <function>QPDFObjectHandle::pipeStreamData</function>
  4876 + to return an overall success code that goes beyond whether
  4877 + or not filtered data was written successfully. This allows
  4878 + better error handling of cases that were not filtering
  4879 + errors. You have to call this explicitly. Methods in
  4880 + previously existing APIs have the same semantics as before.
  4881 + </para>
  4882 + </listitem>
  4883 + <listitem>
  4884 + <para>
  4885 + The
  4886 + <function>QPDFPageObjectHelper::placeFormXObject</function>
  4887 + method now allows separate control over whether it should be
  4888 + willing to shrink or expand objects to fit them better into
  4889 + the destination rectangle. The previous behavior was that
  4890 + shrinking was allowed but expansion was not. The previous
  4891 + behavior is still the default.
  4892 + </para>
  4893 + </listitem>
  4894 + <listitem>
  4895 + <para>
  4896 + When calling the C API, any non-zero value passed to a
  4897 + boolean parameter is treated as <literal>TRUE</literal>.
  4898 + Previously only the value <literal>1</literal> was accepted.
  4899 + This makes the C API behave more like most C interfaces and
  4900 + is known to improve compatibility with some Windows
  4901 + environments that dynamically load the DLL and call
  4902 + functions from it.
  4903 + </para>
  4904 + </listitem>
  4905 + <listitem>
  4906 + <para>
  4907 + Add <function>QPDFObjectHandle::unsafeShallowCopy</function>
  4908 + for copying only top-level dictionary keys or array items.
  4909 + This is unsafe because it creates a situation in which
  4910 + changing a lower-level item in one object may also change it
  4911 + in another object, but for cases in which you
  4912 + <emphasis>know</emphasis> you are only inserting or
  4913 + replacing top-level items, it is much faster than
  4914 + <function>QPDFObjectHandle::shallowCopy</function>.
  4915 + </para>
  4916 + </listitem>
  4917 + <listitem>
  4918 + <para>
  4919 + Add <function>QPDFObjectHandle::filterAsContents</function>,
  4920 + which filters a stream's data as a content stream. This is
  4921 + useful for parsing the contents for form XObjects in the
  4922 + same way as parsing page content streams.
  4923 + </para>
  4924 + </listitem>
  4925 + </itemizedlist>
  4926 + </listitem>
  4927 + <listitem>
  4928 + <para>
  4929 + Bug Fixes
  4930 + </para>
  4931 + <itemizedlist>
  4932 + <listitem>
  4933 + <para>
  4934 + When detecting and removing unreferenced resources during
  4935 + page splitting, traverse into form XObjects and handle their
  4936 + resources dictionaries as well.
  4937 + </para>
  4938 + </listitem>
  4939 + <listitem>
  4940 + <para>
  4941 + The same error recovery is applied to streams in files other
  4942 + than the primary input file when merging or splitting pages.
  4943 + </para>
  4944 + </listitem>
  4945 + </itemizedlist>
  4946 + </listitem>
  4947 + <listitem>
  4948 + <para>
  4949 + Build Changes
  4950 + </para>
  4951 + <itemizedlist>
  4952 + <listitem>
  4953 + <para>
  4954 + Allow qpdf to be built on stripped-down systems whose C/C++
  4955 + libraries lack the <classname>wchar_t</classname> type.
  4956 + Search for <classname>wchar_t</classname> in qpdf's
  4957 + README.md for details. This should be very rare, but it is
  4958 + known to be helpful in some embedded environments.
  4959 + </para>
  4960 + </listitem>
  4961 + </itemizedlist>
  4962 + </listitem>
  4963 + </itemizedlist>
  4964 + </listitem>
  4965 + </varlistentry>
4703 <varlistentry> 4966 <varlistentry>
4704 <term>9.1.1: January 26, 2020</term> 4967 <term>9.1.1: January 26, 2020</term>
4705 <listitem> 4968 <listitem>
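
Note on the placeFormXObject release note in the hunk above: the following is a rough sketch of how separate shrink and expand controls could affect a uniform scale factor. It is not qpdf's implementation or signature. The function fit_scale and its parameters are hypothetical, and the defaults (shrink allowed, expand not allowed) mirror the default behavior described in the release note.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of qpdf): computes a uniform scale factor
// for placing an object of size obj_w x obj_h into a rectangle of size
// rect_w x rect_h. All dimensions are assumed to be positive.
inline double fit_scale(double obj_w, double obj_h,
                        double rect_w, double rect_h,
                        bool allow_shrink = true,
                        bool allow_expand = false)
{
    // Largest uniform scale at which the object still fits inside the
    // rectangle in both dimensions.
    double scale = std::min(rect_w / obj_w, rect_h / obj_h);

    // The object is too big but shrinking is not permitted: leave it alone.
    if (scale < 1.0 && !allow_shrink) {
        return 1.0;
    }
    // The object is smaller than the rectangle but expansion is not
    // permitted: leave it alone.
    if (scale > 1.0 && !allow_expand) {
        return 1.0;
    }
    return scale;
}
```

With the defaults, a 200x100 object placed in a 100x100 rectangle is scaled by 0.5, while a 50x50 object keeps its size; passing true for allow_expand lets the 50x50 object grow by a factor of 2 to fill the rectangle.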
@@ -4804,8 +5067,6 @@ print &quot;\n&quot;; @@ -4804,8 +5067,6 @@ print &quot;\n&quot;;
4804 </itemizedlist> 5067 </itemizedlist>
4805 </listitem> 5068 </listitem>
4806 </varlistentry> 5069 </varlistentry>
4807 - </variablelist>  
4808 - <variablelist>  
4809 <varlistentry> 5070 <varlistentry>
4810 <term>9.1.0: November 17, 2019</term> 5071 <term>9.1.0: November 17, 2019</term>
4811 <listitem> 5072 <listitem>
@@ -4905,8 +5166,6 @@ print &quot;\n&quot;; @@ -4905,8 +5166,6 @@ print &quot;\n&quot;;
4905 </itemizedlist> 5166 </itemizedlist>
4906 </listitem> 5167 </listitem>
4907 </varlistentry> 5168 </varlistentry>
4908 - </variablelist>  
4909 - <variablelist>  
4910 <varlistentry> 5169 <varlistentry>
4911 <term>9.0.2: October 12, 2019</term> 5170 <term>9.0.2: October 12, 2019</term>
4912 <listitem> 5171 <listitem>
@@ -5272,7 +5531,7 @@ print &quot;\n&quot;; @@ -5272,7 +5531,7 @@ print &quot;\n&quot;;
5272 in dynamically linked code catching exceptions or 5531 in dynamically linked code catching exceptions or
5273 subclassing, this could be the reason. If you see this, 5532 subclassing, this could be the reason. If you see this,
5274 please report a bug at <ulink 5533 please report a bug at <ulink
5275 - url="https://github.com/qpdf/qpdf/issues/">pikepdf</ulink>. 5534 + url="https://github.com/qpdf/qpdf/issues/">https://github.com/qpdf/qpdf/issues/</ulink>.
5276 </para> 5535 </para>
5277 </listitem> 5536 </listitem>
5278 <listitem> 5537 <listitem>
qpdf/qpdf.cc
@@ -1483,10 +1483,10 @@ ArgParser::argHelp() @@ -1483,10 +1483,10 @@ ArgParser::argHelp()
1483 << "--normalize-content=[yn] enables or disables normalization of content streams\n" 1483 << "--normalize-content=[yn] enables or disables normalization of content streams\n"
1484 << "--object-streams=mode controls handling of object streams\n" 1484 << "--object-streams=mode controls handling of object streams\n"
1485 << "--preserve-unreferenced preserve unreferenced objects\n" 1485 << "--preserve-unreferenced preserve unreferenced objects\n"
1486 - << "--preserve-unreferenced-resources\n"  
1487 - << " synonym for --remove-unreferenced-resources=no\n"  
1488 << "--remove-unreferenced-resources={auto,yes,no}\n" 1486 << "--remove-unreferenced-resources={auto,yes,no}\n"
1489 << " whether to remove unreferenced page resources\n" 1487 << " whether to remove unreferenced page resources\n"
  1488 + << "--preserve-unreferenced-resources\n"
  1489 + << " synonym for --remove-unreferenced-resources=no\n"
1490 << "--newline-before-endstream always put a newline before endstream\n" 1490 << "--newline-before-endstream always put a newline before endstream\n"
1491 << "--coalesce-contents force all pages' content to be a single stream\n" 1491 << "--coalesce-contents force all pages' content to be a single stream\n"
1492 << "--flatten-annotations=option\n" 1492 << "--flatten-annotations=option\n"