diff --git a/oletools/thirdparty/olefile/LICENSE.txt b/oletools/thirdparty/olefile/LICENSE.txt
index 418c1da..506a3d7 100644
--- a/oletools/thirdparty/olefile/LICENSE.txt
+++ b/oletools/thirdparty/olefile/LICENSE.txt
@@ -1,6 +1,6 @@
LICENSE for the olefile package:
-olefile (formerly OleFileIO_PL) is copyright (c) 2005-2015 Philippe Lagadec
+olefile (formerly OleFileIO_PL) is copyright (c) 2005-2016 Philippe Lagadec
(http://www.decalage.info)
All rights reserved.
diff --git a/oletools/thirdparty/olefile/README.html b/oletools/thirdparty/olefile/README.html
index 65b590d..74d95ac 100644
--- a/oletools/thirdparty/olefile/README.html
+++ b/oletools/thirdparty/olefile/README.html
@@ -1,91 +1,81 @@
-
-
-
-
-
-
-
-
-
-
-olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.
-Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the author - Repository - Updates on Twitter
-News
-Follow all updates and news on Twitter: https://twitter.com/decalage2
-
-- 2015-01-25 v0.42: improved handling of special characters in stream/storage names on Python 2.x (using UTF-8 instead of Latin-1), fixed bug in listdir with empty storages.
-- 2014-11-25 v0.41: OleFileIO.open and isOleFile now support OLE files stored in byte strings, fixed installer for python 3, added support for Jython (Niko Ehrenfeuchter)
-- 2014-10-01 v0.40: renamed OleFileIO_PL to olefile, added initial write support for streams >4K, updated doc and license, improved the setup script.
-- 2014-07-27 v0.31: fixed support for large files with 4K sectors, thanks to Niko Ehrenfeuchter, Martijn Berger and Dave Jones. Added test scripts from Pillow (by hugovk). Fixed setup for Python 3 (Martin Panter)
-- 2014-02-04 v0.30: now compatible with Python 3.x, thanks to Martin Panter who did most of the hard work.
-- 2013-07-24 v0.26: added methods to parse stream/storage timestamps, improved listdir to include storages, fixed parsing of direntry timestamps
-- 2013-05-27 v0.25: improved metadata extraction, properties parsing and exception handling, fixed issue #12
-- 2013-05-07 v0.24: new features to extract metadata (get_metadata method and OleMetadata class), improved getproperties to convert timestamps to Python datetime
-- 2012-10-09: published python-oletools, a package of analysis tools based on OleFileIO_PL
-- 2012-09-11 v0.23: added support for file-like objects, fixed issue #8
-- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2 (added close method)
-- 2011-10-20: code hosted on bitbucket to ease contributions and bug tracking
-- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC Macs.
-- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not plain str.
-- 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben G. and Martijn for reporting the bug)
-- see changelog in source code for more info.
-
-Download/Install
-If you have pip or setuptools installed (pip is included in Python 2.7.9+), you may simply run pip install olefile or easy_install olefile for the first installation.
-To update olefile, run pip install -U olefile.
-Otherwise, see https://bitbucket.org/decalage/olefileio_pl/wiki/Install
-Features
-
-- Parse, read and write any OLE file such as Microsoft Office 97-2003 legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt, Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView OIB files, etc
-- List all the streams and storages contained in an OLE file
-- Open streams as files
-- Parse and read property streams, containing metadata of the file
-- Portable, pure Python module, no dependency
-
-olefile can be used as an independent package or with PIL/Pillow.
-olefile is mostly meant for developers. If you are looking for tools to analyze OLE files or to extract data (especially for security purposes such as malware analysis and forensics), then please also check my python-oletools, which are built upon olefile and provide a higher-level interface.
-History
-olefile is based on the OleFileIO module from PIL, the excellent Python Imaging Library, created and maintained by Fredrik Lundh. The olefile API is still compatible with PIL, but since 2005 I have improved the internal implementation significantly, with new features, bugfixes and a more robust design. From 2005 to 2014 the project was called OleFileIO_PL, and in 2014 I changed its name to olefile to celebrate its 9 years and its new write features.
-As far as I know, olefile is the most complete and robust Python implementation to read MS OLE2 files, portable on several operating systems. (please tell me if you know other similar Python modules)
-Since 2014 olefile/OleFileIO_PL has been integrated into Pillow, the friendly fork of PIL. olefile will continue to be improved as a separate project, and new versions will be merged into Pillow regularly.
-Main improvements over the original version of OleFileIO in PIL:
-
-- Compatible with Python 3.x and 2.6+
-- Many bug fixes
-- Support for files larger than 6.8MB
-- Support for 64 bits platforms and big-endian CPUs
-- Robust: many checks to detect malformed files
-- Runtime option to choose if malformed files should be parsed or raise exceptions
-- Improved API
-- Metadata extraction, stream/storage timestamps (e.g. for document forensics)
-- Can open file-like objects
-- Added setup.py and install.bat to ease installation
-- More convenient slash-based syntax for stream paths
-- Write features
-
-Documentation
-Please see the online documentation for more information, especially the OLE overview and the API page which describe how to use olefile in Python applications. A copy of the same documentation is also provided in the doc subfolder of the olefile package.
-Real-life examples
-A real-life example: using OleFileIO_PL for malware analysis and forensics.
-See also this paper about python tools for forensics, which features olefile.
-License
-olefile (formerly OleFileIO_PL) is copyright (c) 2005-2015 Philippe Lagadec (http://www.decalage.info)
-All rights reserved.
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
-
-- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
-- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-olefile is based on source code from the OleFileIO module of the Python Imaging Library (PIL) published by Fredrik Lundh under the following license:
-The Python Imaging Library (PIL) is
-
-- Copyright (c) 1997-2005 by Secret Labs AB
-- Copyright (c) 1995-2005 by Fredrik Lundh
-
-By obtaining, using, and/or copying this software and/or its associated documentation, you agree that you have read, understood, and will comply with the following terms and conditions:
-Permission to use, copy, modify, and distribute this software and its associated documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Secret Labs AB or the author not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.
-SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
-
-
+
+olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.
+Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the author - Repository - Updates on Twitter
+News
+Follow all updates and news on Twitter: https://twitter.com/decalage2
+
+- 2016-02-02 v0.43: fixed issues #26 and #27, better handling of malformed files, use python logging.
+- 2015-01-25 v0.42: improved handling of special characters in stream/storage names on Python 2.x (using UTF-8 instead of Latin-1), fixed bug in listdir with empty storages.
+- 2014-11-25 v0.41: OleFileIO.open and isOleFile now support OLE files stored in byte strings, fixed installer for python 3, added support for Jython (Niko Ehrenfeuchter)
+- 2014-10-01 v0.40: renamed OleFileIO_PL to olefile, added initial write support for streams >4K, updated doc and license, improved the setup script.
+- 2014-07-27 v0.31: fixed support for large files with 4K sectors, thanks to Niko Ehrenfeuchter, Martijn Berger and Dave Jones. Added test scripts from Pillow (by hugovk). Fixed setup for Python 3 (Martin Panter)
+- 2014-02-04 v0.30: now compatible with Python 3.x, thanks to Martin Panter who did most of the hard work.
+- 2013-07-24 v0.26: added methods to parse stream/storage timestamps, improved listdir to include storages, fixed parsing of direntry timestamps
+- 2013-05-27 v0.25: improved metadata extraction, properties parsing and exception handling, fixed issue #12
+- 2013-05-07 v0.24: new features to extract metadata (get_metadata method and OleMetadata class), improved getproperties to convert timestamps to Python datetime
+- 2012-10-09: published python-oletools, a package of analysis tools based on OleFileIO_PL
+- 2012-09-11 v0.23: added support for file-like objects, fixed issue #8
+- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2 (added close method)
+- 2011-10-20: code hosted on bitbucket to ease contributions and bug tracking
+- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC Macs.
+- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not plain str.
+- 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben G. and Martijn for reporting the bug)
+- see changelog in source code for more info.
+
+Download/Install
+If you have pip or setuptools installed (pip is included in Python 2.7.9+), you may simply run pip install olefile or easy_install olefile for the first installation.
+To update olefile, run pip install -U olefile.
+Otherwise, see https://bitbucket.org/decalage/olefileio_pl/wiki/Install
+Features
+
+- Parse, read and write any OLE file such as Microsoft Office 97-2003 legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt, Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView OIB files, etc
+- List all the streams and storages contained in an OLE file
+- Open streams as files
+- Parse and read property streams, containing metadata of the file
+- Portable, pure Python module, no dependency
+
+olefile can be used as an independent package or with PIL/Pillow.
+olefile is mostly meant for developers. If you are looking for tools to analyze OLE files or to extract data (especially for security purposes such as malware analysis and forensics), then please also check my python-oletools, which are built upon olefile and provide a higher-level interface.
+History
+olefile is based on the OleFileIO module from PIL, the excellent Python Imaging Library, created and maintained by Fredrik Lundh. The olefile API is still compatible with PIL, but since 2005 I have improved the internal implementation significantly, with new features, bugfixes and a more robust design. From 2005 to 2014 the project was called OleFileIO_PL, and in 2014 I changed its name to olefile to celebrate its 9 years and its new write features.
+As far as I know, olefile is the most complete and robust Python implementation to read MS OLE2 files, portable on several operating systems. (please tell me if you know other similar Python modules)
+Since 2014 olefile/OleFileIO_PL has been integrated into Pillow, the friendly fork of PIL. olefile will continue to be improved as a separate project, and new versions will be merged into Pillow regularly.
+Main improvements over the original version of OleFileIO in PIL:
+
+- Compatible with Python 3.x and 2.6+
+- Many bug fixes
+- Support for files larger than 6.8MB
+- Support for 64 bits platforms and big-endian CPUs
+- Robust: many checks to detect malformed files
+- Runtime option to choose if malformed files should be parsed or raise exceptions
+- Improved API
+- Metadata extraction, stream/storage timestamps (e.g. for document forensics)
+- Can open file-like objects
+- Added setup.py and install.bat to ease installation
+- More convenient slash-based syntax for stream paths
+- Write features
+
+Documentation
+Please see the online documentation for more information, especially the OLE overview and the API page which describe how to use olefile in Python applications. A copy of the same documentation is also provided in the doc subfolder of the olefile package.
+Real-life examples
+A real-life example: using OleFileIO_PL for malware analysis and forensics.
+See also this paper about python tools for forensics, which features olefile.
+License
+olefile (formerly OleFileIO_PL) is copyright (c) 2005-2016 Philippe Lagadec (http://www.decalage.info)
+All rights reserved.
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+olefile is based on source code from the OleFileIO module of the Python Imaging Library (PIL) published by Fredrik Lundh under the following license:
+The Python Imaging Library (PIL) is
+
+- Copyright (c) 1997-2005 by Secret Labs AB
+- Copyright (c) 1995-2005 by Fredrik Lundh
+
+By obtaining, using, and/or copying this software and/or its associated documentation, you agree that you have read, understood, and will comply with the following terms and conditions:
+Permission to use, copy, modify, and distribute this software and its associated documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Secret Labs AB or the author not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.
+SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
diff --git a/oletools/thirdparty/olefile/README.rst b/oletools/thirdparty/olefile/README.rst
index e37a110..04d7775 100644
--- a/oletools/thirdparty/olefile/README.rst
+++ b/oletools/thirdparty/olefile/README.rst
@@ -1,221 +1,226 @@
-olefile (formerly OleFileIO\_PL)
-================================
-
-`olefile `_ is a Python package to
-parse, read and write `Microsoft OLE2
-files `_ (also
-called Structured Storage, Compound File Binary Format or Compound
-Document File Format), such as Microsoft Office 97-2003 documents,
-vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix
-files, Outlook messages, StickyNotes, several Microscopy file formats,
-McAfee antivirus quarantine files, etc.
-
-**Quick links:** `Home page `_ -
-`Download/Install `_
-- `Documentation `_ -
-`Report
-Issues/Suggestions/Questions `_
-- `Contact the author `_ -
-`Repository `_ - `Updates
-on Twitter `_
-
-News
-----
-
-Follow all updates and news on Twitter: https://twitter.com/decalage2
-
-- **2015-01-25 v0.42**: improved handling of special characters in
- stream/storage names on Python 2.x (using UTF-8 instead of Latin-1),
- fixed bug in listdir with empty storages.
-- 2014-11-25 v0.41: OleFileIO.open and isOleFile now support OLE files
- stored in byte strings, fixed installer for python 3, added support
- for Jython (Niko Ehrenfeuchter)
-- 2014-10-01 v0.40: renamed OleFileIO\_PL to olefile, added initial
- write support for streams >4K, updated doc and license, improved the
- setup script.
-- 2014-07-27 v0.31: fixed support for large files with 4K sectors,
- thanks to Niko Ehrenfeuchter, Martijn Berger and Dave Jones. Added
- test scripts from Pillow (by hugovk). Fixed setup for Python 3
- (Martin Panter)
-- 2014-02-04 v0.30: now compatible with Python 3.x, thanks to Martin
- Panter who did most of the hard work.
-- 2013-07-24 v0.26: added methods to parse stream/storage timestamps,
- improved listdir to include storages, fixed parsing of direntry
- timestamps
-- 2013-05-27 v0.25: improved metadata extraction, properties parsing
- and exception handling, fixed `issue
- #12 `_
-- 2013-05-07 v0.24: new features to extract metadata (get\_metadata
- method and OleMetadata class), improved getproperties to convert
- timestamps to Python datetime
-- 2012-10-09: published
- `python-oletools `_, a
- package of analysis tools based on OleFileIO\_PL
-- 2012-09-11 v0.23: added support for file-like objects, fixed `issue
- #8 `_
-- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2
- (added close method)
-- 2011-10-20: code hosted on bitbucket to ease contributions and bug
- tracking
-- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC
- Macs.
-- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not
- plain str.
-- 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben
- G. and Martijn for reporting the bug)
-- see changelog in source code for more info.
-
-Download/Install
-----------------
-
-If you have pip or setuptools installed (pip is included in Python
-2.7.9+), you may simply run **pip install olefile** or **easy\_install
-olefile** for the first installation.
-
-To update olefile, run **pip install -U olefile**.
-
-Otherwise, see https://bitbucket.org/decalage/olefileio\_pl/wiki/Install
-
-Features
---------
-
-- Parse, read and write any OLE file such as Microsoft Office 97-2003
- legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt,
- Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook
- messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView
- OIB files, etc
-- List all the streams and storages contained in an OLE file
-- Open streams as files
-- Parse and read property streams, containing metadata of the file
-- Portable, pure Python module, no dependency
-
-olefile can be used as an independent package or with PIL/Pillow.
-
-olefile is mostly meant for developers. If you are looking for tools to
-analyze OLE files or to extract data (especially for security purposes
-such as malware analysis and forensics), then please also check my
-`python-oletools `_, which are
-built upon olefile and provide a higher-level interface.
-
-History
--------
-
-olefile is based on the OleFileIO module from
-`PIL `_, the excellent
-Python Imaging Library, created and maintained by Fredrik Lundh. The
-olefile API is still compatible with PIL, but since 2005 I have improved
-the internal implementation significantly, with new features, bugfixes
-and a more robust design. From 2005 to 2014 the project was called
-OleFileIO\_PL, and in 2014 I changed its name to olefile to celebrate
-its 9 years and its new write features.
-
-As far as I know, olefile is the most complete and robust Python
-implementation to read MS OLE2 files, portable on several operating
-systems. (please tell me if you know other similar Python modules)
-
-Since 2014 olefile/OleFileIO\_PL has been integrated into
-`Pillow `_, the friendly fork of PIL.
-olefile will continue to be improved as a separate project, and new
-versions will be merged into Pillow regularly.
-
-Main improvements over the original version of OleFileIO in PIL:
-----------------------------------------------------------------
-
-- Compatible with Python 3.x and 2.6+
-- Many bug fixes
-- Support for files larger than 6.8MB
-- Support for 64 bits platforms and big-endian CPUs
-- Robust: many checks to detect malformed files
-- Runtime option to choose if malformed files should be parsed or raise
- exceptions
-- Improved API
-- Metadata extraction, stream/storage timestamps (e.g. for document
- forensics)
-- Can open file-like objects
-- Added setup.py and install.bat to ease installation
-- More convenient slash-based syntax for stream paths
-- Write features
-
-Documentation
--------------
-
-Please see the `online
-documentation `_ for
-more information, especially the `OLE
-overview `_
-and the `API
-page `_ which
-describe how to use olefile in Python applications. A copy of the same
-documentation is also provided in the doc subfolder of the olefile
-package.
-
-Real-life examples
-------------------
-
-A real-life example: `using OleFileIO\_PL for malware analysis and
-forensics `_.
-
-See also `this
-paper `_
-about python tools for forensics, which features olefile.
-
-License
--------
-
-olefile (formerly OleFileIO\_PL) is copyright (c) 2005-2015 Philippe
-Lagadec (`http://www.decalage.info `_)
-
-All rights reserved.
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are
-met:
-
-- Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
-- Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in the
- documentation and/or other materials provided with the distribution.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
-IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
-TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
-PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
-TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
-PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
-LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
-NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
-SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
---------------
-
-olefile is based on source code from the OleFileIO module of the Python
-Imaging Library (PIL) published by Fredrik Lundh under the following
-license:
-
-The Python Imaging Library (PIL) is
-
-- Copyright (c) 1997-2005 by Secret Labs AB
-- Copyright (c) 1995-2005 by Fredrik Lundh
-
-By obtaining, using, and/or copying this software and/or its associated
-documentation, you agree that you have read, understood, and will comply
-with the following terms and conditions:
-
-Permission to use, copy, modify, and distribute this software and its
-associated documentation for any purpose and without fee is hereby
-granted, provided that the above copyright notice appears in all copies,
-and that both that copyright notice and this permission notice appear in
-supporting documentation, and that the name of Secret Labs AB or the
-author not be used in advertising or publicity pertaining to
-distribution of the software without specific, written prior permission.
-
-SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
-THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
-FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
-ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
-RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
-CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
-CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+olefile (formerly OleFileIO\_PL)
+================================
+
+`olefile `__ is a Python package to
+parse, read and write `Microsoft OLE2
+files `__
+(also called Structured Storage, Compound File Binary Format or Compound
+Document File Format), such as Microsoft Office 97-2003 documents,
+vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix
+files, Outlook messages, StickyNotes, several Microscopy file formats,
+McAfee antivirus quarantine files, etc.
+
+**Quick links:** `Home page `__ -
+`Download/Install `__
+- `Documentation `__ -
+`Report
+Issues/Suggestions/Questions `__
+- `Contact the author `__ -
+`Repository `__ - `Updates
+on Twitter `__
+
+News
+----
+
+Follow all updates and news on Twitter: https://twitter.com/decalage2
+
+- **2016-02-02 v0.43**: fixed issues
+ `#26 `__
+ and
+ `#27 `__,
+ better handling of malformed files, use python logging.
+- 2015-01-25 v0.42: improved handling of special characters in
+ stream/storage names on Python 2.x (using UTF-8 instead of Latin-1),
+ fixed bug in listdir with empty storages.
+- 2014-11-25 v0.41: OleFileIO.open and isOleFile now support OLE files
+ stored in byte strings, fixed installer for python 3, added support
+ for Jython (Niko Ehrenfeuchter)
+- 2014-10-01 v0.40: renamed OleFileIO\_PL to olefile, added initial
+ write support for streams >4K, updated doc and license, improved the
+ setup script.
+- 2014-07-27 v0.31: fixed support for large files with 4K sectors,
+ thanks to Niko Ehrenfeuchter, Martijn Berger and Dave Jones. Added
+ test scripts from Pillow (by hugovk). Fixed setup for Python 3
+ (Martin Panter)
+- 2014-02-04 v0.30: now compatible with Python 3.x, thanks to Martin
+ Panter who did most of the hard work.
+- 2013-07-24 v0.26: added methods to parse stream/storage timestamps,
+ improved listdir to include storages, fixed parsing of direntry
+ timestamps
+- 2013-05-27 v0.25: improved metadata extraction, properties parsing
+ and exception handling, fixed `issue
+ #12 `__
+- 2013-05-07 v0.24: new features to extract metadata (get\_metadata
+ method and OleMetadata class), improved getproperties to convert
+ timestamps to Python datetime
+- 2012-10-09: published
+ `python-oletools `__, a
+ package of analysis tools based on OleFileIO\_PL
+- 2012-09-11 v0.23: added support for file-like objects, fixed `issue
+ #8 `__
+- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2
+ (added close method)
+- 2011-10-20: code hosted on bitbucket to ease contributions and bug
+ tracking
+- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC
+ Macs.
+- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not
+ plain str.
+- 2009-12-10 v0.19: fixed support for 64 bits platforms (thanks to Ben
+ G. and Martijn for reporting the bug)
+- see changelog in source code for more info.
+
+Download/Install
+----------------
+
+If you have pip or setuptools installed (pip is included in Python
+2.7.9+), you may simply run **pip install olefile** or **easy\_install
+olefile** for the first installation.
+
+To update olefile, run **pip install -U olefile**.
+
+Otherwise, see https://bitbucket.org/decalage/olefileio\_pl/wiki/Install
+
+Features
+--------
+
+- Parse, read and write any OLE file such as Microsoft Office 97-2003
+ legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt,
+ Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook
+ messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView
+ OIB files, etc
+- List all the streams and storages contained in an OLE file
+- Open streams as files
+- Parse and read property streams, containing metadata of the file
+- Portable, pure Python module, no dependency
+
+olefile can be used as an independent package or with PIL/Pillow.
+
+olefile is mostly meant for developers. If you are looking for tools to
+analyze OLE files or to extract data (especially for security purposes
+such as malware analysis and forensics), then please also check my
+`python-oletools `__, which
+are built upon olefile and provide a higher-level interface.
+
+History
+-------
+
+olefile is based on the OleFileIO module from
+`PIL `__, the
+excellent Python Imaging Library, created and maintained by Fredrik
+Lundh. The olefile API is still compatible with PIL, but since 2005 I
+have improved the internal implementation significantly, with new
+features, bugfixes and a more robust design. From 2005 to 2014 the
+project was called OleFileIO\_PL, and in 2014 I changed its name to
+olefile to celebrate its 9 years and its new write features.
+
+As far as I know, olefile is the most complete and robust Python
+implementation to read MS OLE2 files, portable on several operating
+systems. (please tell me if you know other similar Python modules)
+
+Since 2014 olefile/OleFileIO\_PL has been integrated into
+`Pillow `__, the friendly fork of PIL.
+olefile will continue to be improved as a separate project, and new
+versions will be merged into Pillow regularly.
+
+Main improvements over the original version of OleFileIO in PIL:
+----------------------------------------------------------------
+
+- Compatible with Python 3.x and 2.6+
+- Many bug fixes
+- Support for files larger than 6.8MB
+- Support for 64-bit platforms and big-endian CPUs
+- Robust: many checks to detect malformed files
+- Runtime option to choose whether malformed files should be parsed or
+ raise exceptions
+- Improved API
+- Metadata extraction, stream/storage timestamps (e.g. for document
+ forensics)
+- Can open file-like objects
+- Added setup.py and install.bat to ease installation
+- More convenient slash-based syntax for stream paths
+- Write features
+
+Documentation
+-------------
+
+Please see the `online
+documentation `__ for
+more information, especially the `OLE
+overview `__
+and the `API
+page `__ which
+describe how to use olefile in Python applications. A copy of the same
+documentation is also provided in the doc subfolder of the olefile
+package.
+
+Real-life examples
+------------------
+
+A real-life example: `using OleFileIO\_PL for malware analysis and
+forensics `__.
+
+See also `this
+paper `__
+about Python tools for forensics, which features olefile.
+
+License
+-------
+
+olefile (formerly OleFileIO\_PL) is copyright (c) 2005-2016 Philippe
+Lagadec (http://www.decalage.info)
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+- Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+- Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
+IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
+TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+--------------
+
+olefile is based on source code from the OleFileIO module of the Python
+Imaging Library (PIL) published by Fredrik Lundh under the following
+license:
+
+The Python Imaging Library (PIL) is
+
+- Copyright (c) 1997-2005 by Secret Labs AB
+- Copyright (c) 1995-2005 by Fredrik Lundh
+
+By obtaining, using, and/or copying this software and/or its associated
+documentation, you agree that you have read, understood, and will comply
+with the following terms and conditions:
+
+Permission to use, copy, modify, and distribute this software and its
+associated documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appears in all copies,
+and that both that copyright notice and this permission notice appear in
+supporting documentation, and that the name of Secret Labs AB or the
+author not be used in advertising or publicity pertaining to
+distribution of the software without specific, written prior permission.
+
+SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
+THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
+ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
+RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
+CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
+CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
diff --git a/oletools/thirdparty/olefile/olefile.py b/oletools/thirdparty/olefile/olefile.py
index 8918cbf..b861e3e 100644
--- a/oletools/thirdparty/olefile/olefile.py
+++ b/oletools/thirdparty/olefile/olefile.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python
-# olefile (formerly OleFileIO_PL) version 0.43 2015-04-17
+# olefile (formerly OleFileIO_PL)
#
# Module to read/write Microsoft OLE2 files (also called Structured Storage or
# Microsoft Compound Document File Format), such as Microsoft Office 97-2003
@@ -9,7 +9,7 @@
#
# Project website: http://www.decalage.info/olefile
#
-# olefile is copyright (c) 2005-2015 Philippe Lagadec (http://www.decalage.info)
+# olefile is copyright (c) 2005-2016 Philippe Lagadec (http://www.decalage.info)
#
# olefile is based on the OleFileIO module from the PIL library v1.1.6
# See: http://www.pythonware.com/products/pil/index.htm
@@ -29,12 +29,12 @@ from __future__ import print_function # This version of olefile requires Pytho
__author__ = "Philippe Lagadec"
-__date__ = "2015-04-17"
-__version__ = '0.43'
+__date__ = "2016-02-02"
+__version__ = '0.44'
#--- LICENSE ------------------------------------------------------------------
-# olefile (formerly OleFileIO_PL) is copyright (c) 2005-2015 Philippe Lagadec
+# olefile (formerly OleFileIO_PL) is copyright (c) 2005-2016 Philippe Lagadec
# (http://www.decalage.info)
#
# All rights reserved.
@@ -182,6 +182,14 @@ __version__ = '0.43'
# - added path_encoding option to override the default
# - fixed a bug in _list when a storage is empty
# 2015-04-17 v0.43 PL: - slight changes in _OleDirectoryEntry
+# 2015-10-19 - fixed issue #26 in OleFileIO.getproperties
+# (using id and type as local variable names)
+# 2015-10-29 - replaced debug() with proper logging
+# - use optparse to handle command line options
+# - improved attribute names in OleFileIO class
+# 2015-11-05 - fixed issue #27 by correcting the MiniFAT sector
+# cutoff size if invalid.
+# 2016-02-02 - logging is disabled by default
#-----------------------------------------------------------------------------
# TODO (for version 1.0):
@@ -257,7 +265,7 @@ __version__ = '0.43'
import io
import sys
-import struct, array, os.path, datetime
+import struct, array, os.path, datetime, logging
#=== COMPATIBILITY WORKAROUNDS ================================================
@@ -327,30 +335,46 @@ else:
DEFAULT_PATH_ENCODING = None
-#=== DEBUGGING ===============================================================
+# === LOGGING =================================================================
-#TODO: replace this by proper logging
-
-#[PL] DEBUG display mode: False by default, use set_debug_mode() or "-d" on
-# command line to change it.
-DEBUG_MODE = False
-def debug_print(msg):
- print(msg)
-def debug_pass(msg):
- pass
-debug = debug_pass
+class NullHandler(logging.Handler):
+ """
+ Log Handler without output, to avoid printing messages if logging is not
+ configured by the main application.
+ Python 2.7 has logging.NullHandler, but this is necessary for 2.6:
+ see https://docs.python.org/2.6/library/logging.html#configuring-logging-for-a-library
+ """
+ def emit(self, record):
+ pass
-def set_debug_mode(debug_mode):
+def get_logger(name, level=logging.CRITICAL+1):
"""
- Set debug mode on or off, to control display of debugging messages.
- :param mode: True or False
+ Create a suitable logger object for this module.
+ The goal is not to change settings of the root logger, to avoid getting
+ other modules' logs on the screen.
+ If a logger with the same name already exists, reuse it. (Otherwise it would
+ get duplicate handlers and messages would be doubled.)
+ The level is set to CRITICAL+1 by default, to avoid any logging.
"""
- global DEBUG_MODE, debug
- DEBUG_MODE = debug_mode
- if debug_mode:
- debug = debug_print
- else:
- debug = debug_pass
+ # First, test if there is already a logger with the same name, else it
+ # will generate duplicate messages (due to duplicate handlers):
+ if name in logging.Logger.manager.loggerDict:
+ #NOTE: another less intrusive but more "hackish" solution would be to
+ # use getLogger then test if its effective level is not default.
+ logger = logging.getLogger(name)
+ # make sure level is OK:
+ logger.setLevel(level)
+ return logger
+ # get a new logger:
+ logger = logging.getLogger(name)
+ # only add a NullHandler for this logger, it is up to the application
+ # to configure its own logging:
+ logger.addHandler(NullHandler())
+ logger.setLevel(level)
+ return logger
+
+# a global logger object used for debugging:
+log = get_logger('olefile')
#=== CONSTANTS ===============================================================
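The logging hunk above keeps the library silent unless the application opts in. A minimal sketch of that pattern, using only the standard library, might look like the following. The helper name `get_quiet_logger` and the logger name `demo_lib` are illustrative stand-ins rather than olefile names, and the built-in `logging.NullHandler` is assumed to be available (Python 2.7 or 3.x) instead of the hand-written class used in the patch.

```python
import logging

def get_quiet_logger(name, level=logging.CRITICAL + 1):
    """Return a library logger that stays silent until the app opts in."""
    logger = logging.getLogger(name)
    # Attach a NullHandler only once, so repeated calls do not stack handlers.
    if not any(isinstance(h, logging.NullHandler) for h in logger.handlers):
        logger.addHandler(logging.NullHandler())
    # CRITICAL+1 is above every standard level, so no record passes the filter.
    logger.setLevel(level)
    return logger

log = get_quiet_logger('demo_lib')
silent_by_default = not log.isEnabledFor(logging.CRITICAL)

# An application that wants the messages lowers the level itself
# (and would normally also configure a real handler):
log.setLevel(logging.DEBUG)
enabled_after_opt_in = log.isEnabledFor(logging.DEBUG)
```

Checking for an existing handler is one way to avoid the duplicate-message problem the docstring mentions; the patch instead looks the name up in `logging.Logger.manager.loggerDict`.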
@@ -518,7 +542,7 @@ def filetime2datetime(filetime):
# TODO: manage exception when microseconds is too large
# inspired from http://code.activestate.com/recipes/511425-filetime-to-datetime/
_FILETIME_null_date = datetime.datetime(1601, 1, 1, 0, 0, 0)
- #debug('timedelta days=%d' % (filetime//(10*1000000*3600*24)))
+ #log.debug('timedelta days=%d' % (filetime//(10*1000000*3600*24)))
return _FILETIME_null_date + datetime.timedelta(microseconds=filetime//10)
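For readers unfamiliar with the FILETIME arithmetic in this hunk, here is a small standalone sketch of the same conversion. The function name `filetime_to_datetime` is hypothetical; the computation mirrors the `return` line above (100-nanosecond ticks since 1601-01-01, divided by 10 to get microseconds).

```python
import datetime

FILETIME_EPOCH = datetime.datetime(1601, 1, 1, 0, 0, 0)

def filetime_to_datetime(filetime):
    """Convert a count of 100 ns ticks since 1601-01-01 into a datetime."""
    # 10 ticks of 100 ns make one microsecond.
    return FILETIME_EPOCH + datetime.timedelta(microseconds=filetime // 10)

# 11644473600 seconds separate 1601-01-01 from the Unix epoch.
unix_epoch = filetime_to_datetime(116444736000000000)
```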
@@ -695,6 +719,7 @@ class _OleStream(io.BytesIO):
- size: actual size of data stream, after it was opened.
"""
+ #TODO: use _raise_defect instead of exceptions
# FIXME: should store the list of sects obtained by following
# the fat chain, and load new sectors on demand instead of
@@ -713,8 +738,8 @@ class _OleStream(io.BytesIO):
:param filesize: size of OLE file (for debugging)
:returns: a BytesIO instance containing the OLE stream
"""
- debug('_OleStream.__init__:')
- debug(' sect=%d (%X), size=%d, offset=%d, sectorsize=%d, len(fat)=%d, fp=%s'
+ log.debug('_OleStream.__init__:')
+ log.debug(' sect=%d (%X), size=%d, offset=%d, sectorsize=%d, len(fat)=%d, fp=%s'
%(sect,sect,size,offset,sectorsize,len(fat), repr(fp)))
#[PL] To detect malformed documents with FAT loops, we compute the
# expected number of sectors in the stream:
@@ -726,9 +751,9 @@ class _OleStream(io.BytesIO):
size = len(fat)*sectorsize
# and we keep a record that size was unknown:
unknown_size = True
- debug(' stream with UNKNOWN SIZE')
+ log.debug(' stream with UNKNOWN SIZE')
nb_sectors = (size + (sectorsize-1)) // sectorsize
- debug('nb_sectors = %d' % nb_sectors)
+ log.debug('nb_sectors = %d' % nb_sectors)
# This number should (at least) be less than the total number of
# sectors in the given FAT:
if nb_sectors > len(fat):
@@ -739,7 +764,7 @@ class _OleStream(io.BytesIO):
data = []
# if size is zero, then first sector index should be ENDOFCHAIN:
if size == 0 and sect != ENDOFCHAIN:
- debug('size == 0 and sect != ENDOFCHAIN:')
+ log.debug('size == 0 and sect != ENDOFCHAIN:')
raise IOError('incorrect OLE sector index for empty stream')
#[PL] A fixed-length for loop is used instead of an undefined while
# loop to avoid DoS attacks:
@@ -750,24 +775,24 @@ class _OleStream(io.BytesIO):
break
else:
# else this means that the stream is smaller than declared:
- debug('sect=ENDOFCHAIN before expected size')
+ log.debug('sect=ENDOFCHAIN before expected size')
raise IOError('incomplete OLE stream')
# sector index should be within FAT:
if sect<0 or sect>=len(fat):
- debug('sect=%d (%X) / len(fat)=%d' % (sect, sect, len(fat)))
- debug('i=%d / nb_sectors=%d' %(i, nb_sectors))
+ log.debug('sect=%d (%X) / len(fat)=%d' % (sect, sect, len(fat)))
+ log.debug('i=%d / nb_sectors=%d' %(i, nb_sectors))
## tmp_data = b"".join(data)
## f = open('test_debug.bin', 'wb')
## f.write(tmp_data)
## f.close()
-## debug('data read so far: %d bytes' % len(tmp_data))
+## log.debug('data read so far: %d bytes' % len(tmp_data))
raise IOError('incorrect OLE FAT, sector index out of range')
#TODO: merge this code with OleFileIO.getsect() ?
#TODO: check if this works with 4K sectors:
try:
fp.seek(offset + sectorsize * sect)
except:
- debug('sect=%d, seek=%d, filesize=%d' %
+ log.debug('sect=%d, seek=%d, filesize=%d' %
(sect, offset+sectorsize*sect, filesize))
raise IOError('OLE sector index out of range')
sector_data = fp.read(sectorsize)
@@ -776,9 +801,9 @@ class _OleStream(io.BytesIO):
# complete sector (of 512 or 4K), so we may read less than
# sectorsize.
if len(sector_data)!=sectorsize and sect!=(len(fat)-1):
- debug('sect=%d / len(fat)=%d, seek=%d / filesize=%d, len read=%d' %
+ log.debug('sect=%d / len(fat)=%d, seek=%d / filesize=%d, len read=%d' %
(sect, len(fat), offset+sectorsize*sect, filesize, len(sector_data)))
- debug('seek+len(read)=%d' % (offset+sectorsize*sect+len(sector_data)))
+ log.debug('seek+len(read)=%d' % (offset+sectorsize*sect+len(sector_data)))
raise IOError('incomplete OLE sector')
data.append(sector_data)
# jump to next sector in the FAT:
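The comments in these hunks describe a fixed-length loop that guards against FAT loops in malformed files. The sketch below illustrates that idea in isolation. It is a simplified stand-in, not the `_OleStream` code: the function name `follow_chain` is hypothetical, it collects sector indexes instead of reading data, and it ignores the unknown-size case.

```python
ENDOFCHAIN = 0xFFFFFFFE  # FAT marker for the last sector of a chain

def follow_chain(fat, first_sect, size, sectorsize=512):
    """Return the sector indexes of a stream, visiting at most nb_sectors."""
    nb_sectors = (size + sectorsize - 1) // sectorsize  # ceiling division
    if nb_sectors > len(fat):
        raise IOError('stream is larger than the FAT allows')
    chain = []
    sect = first_sect
    # A fixed-length loop cannot spin forever, even if the FAT has a cycle.
    for _ in range(nb_sectors):
        if sect == ENDOFCHAIN:
            raise IOError('incomplete OLE stream')
        if sect < 0 or sect >= len(fat):
            raise IOError('incorrect OLE FAT, sector index out of range')
        chain.append(sect)
        sect = fat[sect]
    return chain

# Hypothetical 4-entry FAT: the stream occupies sectors 0 -> 2 -> 1.
fat = [2, ENDOFCHAIN, 1, ENDOFCHAIN]
chain = follow_chain(fat, first_sect=0, size=1200)

# A looping FAT (0 -> 1 -> 0 ...) still terminates after nb_sectors steps.
looping = follow_chain([1, 0], first_sect=0, size=1024)
```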
@@ -802,7 +827,8 @@ class _OleStream(io.BytesIO):
self.size = len(data)
else:
# read data is less than expected:
- debug('len(data)=%d, size=%d' % (len(data), size))
+ log.debug('len(data)=%d, size=%d' % (len(data), size))
+ # TODO: provide details in exception message
raise IOError('OLE stream size is less than declared')
# when all data is read in memory, BytesIO constructor is called
io.BytesIO.__init__(self, data)
@@ -888,7 +914,7 @@ class _OleDirectoryEntry:
olefile._raise_defect(DEFECT_INCORRECT, 'duplicate OLE root entry')
if sid == 0 and self.entry_type != STGTY_ROOT:
olefile._raise_defect(DEFECT_INCORRECT, 'incorrect OLE root entry')
- #debug (struct.unpack(fmt_entry, entry[:len_entry]))
+ #log.debug(struct.unpack(fmt_entry, entry[:len_entry]))
# name should be at most 31 unicode characters + null character,
# so 64 bytes in total (31*2 + 2):
if self.namelength>64:
@@ -903,10 +929,10 @@ class _OleDirectoryEntry:
# name is converted from UTF-16LE to the path encoding specified in the OleFileIO:
self.name = olefile._decode_utf16_str(self.name_utf16)
- debug('DirEntry SID=%d: %s' % (self.sid, repr(self.name)))
- debug(' - type: %d' % self.entry_type)
- debug(' - sect: %d' % self.isectStart)
- debug(' - SID left: %d, right: %d, child: %d' % (self.sid_left,
+ log.debug('DirEntry SID=%d: %s' % (self.sid, repr(self.name)))
+ log.debug(' - type: %d' % self.entry_type)
+ log.debug(' - sect: %Xh' % self.isectStart)
+ log.debug(' - SID left: %d, right: %d, child: %d' % (self.sid_left,
self.sid_right, self.sid_child))
# sizeHigh is only used for 4K sectors, it should be zero for 512 bytes
@@ -914,13 +940,14 @@ class _OleDirectoryEntry:
# or some other value so it cannot be raised as a defect in general:
if olefile.sectorsize == 512:
if self.sizeHigh != 0 and self.sizeHigh != 0xFFFFFFFF:
- debug('sectorsize=%d, sizeLow=%d, sizeHigh=%d (%X)' %
+ log.debug('sectorsize=%d, sizeLow=%d, sizeHigh=%d (%X)' %
(olefile.sectorsize, self.sizeLow, self.sizeHigh, self.sizeHigh))
olefile._raise_defect(DEFECT_UNSURE, 'incorrect OLE stream size')
self.size = self.sizeLow
else:
self.size = self.sizeLow + (long(self.sizeHigh)<<32)
- debug(' - size: %d (sizeLow=%d, sizeHigh=%d)' % (self.size, self.sizeLow, self.sizeHigh))
+ log.debug(' - size: %d (sizeLow=%d, sizeHigh=%d)' % (self.size, self.sizeLow, self.sizeHigh))
+
self.clsid = _clsid(clsid)
# a storage should have a null size, BUT some implementations such as
# Word 8 for Mac seem to allow non-null values => Potential defect:
@@ -945,7 +972,7 @@ class _OleDirectoryEntry:
Note that this method builds a tree of all subentries, so it should
only be called for the root object once.
"""
- debug('build_storage_tree: SID=%d - %s - sid_child=%d'
+ log.debug('build_storage_tree: SID=%d - %s - sid_child=%d'
% (self.sid, repr(self.name), self.sid_child))
if self.sid_child != NOSTREAM:
# if child SID is not NOSTREAM, then this entry is a storage.
@@ -980,7 +1007,7 @@ class _OleDirectoryEntry:
self.olefile._raise_defect(DEFECT_FATAL, 'OLE DirEntry index out of range')
# get child direntry:
child = self.olefile._load_direntry(child_sid) #direntries[child_sid]
- debug('append_kids: child_sid=%d - %s - sid_left=%d, sid_right=%d, sid_child=%d'
+ log.debug('append_kids: child_sid=%d - %s - sid_left=%d, sid_right=%d, sid_child=%d'
% (child.sid, repr(child.name), child.sid_left, child.sid_right, child.sid_child))
# the directory entries are organized as a red-black tree.
# (cf. Wikipedia for details)
@@ -1121,14 +1148,13 @@ class OleFileIO:
:param write_mode: bool, if True the file is opened in read/write mode instead
of read-only by default.
- :param debug: bool, set debug mode
+ :param debug: bool, set debug mode (deprecated, not used anymore)
:param path_encoding: None or str, name of the codec to use for path
names (streams and storages), or None for Unicode.
Unicode by default on Python 3+, UTF-8 on Python 2.x.
(new in olefile 0.42, was hardcoded to Latin-1 until olefile v0.41)
"""
- set_debug_mode(debug)
# minimal level for defects to be raised as exceptions:
self._raise_defects_level = raise_defects
# list of defects/issues not raised as exceptions:
@@ -1160,10 +1186,12 @@ class OleFileIO:
"""
# added by [PL]
if defect_level >= self._raise_defects_level:
+ log.error(message)
raise exception_type(message)
else:
# just record the issue, no exception raised:
self.parsing_issues.append((exception_type, message))
+ log.warning(message)
def _decode_utf16_str(self, utf16_str, errors='replace'):
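The `_raise_defect` hunk combines a severity threshold with a list of recorded issues. A stripped-down sketch of that mechanism is shown below. The class name `DefectTracker` is hypothetical, the numeric values of the `DEFECT_*` constants are assumptions chosen only to give an ordering, and the logging calls added by the patch are left out.

```python
# Assumed ordering of severities; the real constants live in olefile.py.
DEFECT_UNSURE, DEFECT_POTENTIAL, DEFECT_INCORRECT, DEFECT_FATAL = 10, 20, 30, 40

class DefectTracker(object):
    """Raise defects at or above a threshold, record the others."""

    def __init__(self, raise_defects=DEFECT_FATAL):
        self.raise_defects_level = raise_defects
        self.parsing_issues = []

    def raise_defect(self, defect_level, message, exception_type=IOError):
        if defect_level >= self.raise_defects_level:
            raise exception_type(message)
        # Below the threshold: keep a record and carry on parsing.
        self.parsing_issues.append((exception_type, message))

tolerant = DefectTracker(raise_defects=DEFECT_FATAL)
tolerant.raise_defect(DEFECT_INCORRECT, 'incorrect CLSID in OLE header')
```

With the default threshold only fatal defects interrupt parsing; lowering it to `DEFECT_INCORRECT` would turn the same call into an exception.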
@@ -1235,6 +1263,7 @@ class OleFileIO:
finally:
self.fp.seek(0)
self._filesize = filesize
+ log.debug('File size: %d' % self._filesize)
# lists of streams in FAT and MiniFAT, to detect duplicate references
# (list of indexes of first sectors of each stream)
@@ -1244,6 +1273,7 @@ class OleFileIO:
header = self.fp.read(512)
if len(header) != 512 or header[:8] != MAGIC:
+ log.debug('Magic = %r instead of %r' % (header[:8], MAGIC))
self._raise_defect(DEFECT_FATAL, "not an OLE2 structured storage file")
# [PL] header structure according to AAF specifications:
@@ -1285,120 +1315,125 @@ class OleFileIO:
# '<' indicates little-endian byte ordering for Intel (cf. struct module help)
fmt_header = '<8s16sHHHHHHLLLLLLLLLL'
header_size = struct.calcsize(fmt_header)
- debug( "fmt_header size = %d, +FAT = %d" % (header_size, header_size + 109*4) )
+ log.debug( "fmt_header size = %d, +FAT = %d" % (header_size, header_size + 109*4) )
header1 = header[:header_size]
(
- self.Sig,
- self.clsid,
- self.MinorVersion,
- self.DllVersion,
- self.ByteOrder,
- self.SectorShift,
- self.MiniSectorShift,
- self.Reserved, self.Reserved1,
- self.csectDir,
- self.csectFat,
- self.sectDirStart,
- self.signature,
- self.MiniSectorCutoff,
- self.MiniFatStart,
- self.csectMiniFat,
- self.sectDifStart,
- self.csectDif
+ self.header_signature,
+ self.header_clsid,
+ self.minor_version,
+ self.dll_version,
+ self.byte_order,
+ self.sector_shift,
+ self.mini_sector_shift,
+ self.reserved1,
+ self.reserved2,
+ self.num_dir_sectors,
+ self.num_fat_sectors,
+ self.first_dir_sector,
+ self.transaction_signature_number,
+ self.mini_stream_cutoff_size,
+ self.first_mini_fat_sector,
+ self.num_mini_fat_sectors,
+ self.first_difat_sector,
+ self.num_difat_sectors
) = struct.unpack(fmt_header, header1)
- debug( struct.unpack(fmt_header, header1))
+ log.debug( struct.unpack(fmt_header, header1))
- if self.Sig != MAGIC:
+ if self.header_signature != MAGIC:
# OLE signature should always be present
self._raise_defect(DEFECT_FATAL, "incorrect OLE signature")
- if self.clsid != bytearray(16):
+ if self.header_clsid != bytearray(16):
# according to AAF specs, CLSID should always be zero
self._raise_defect(DEFECT_INCORRECT, "incorrect CLSID in OLE header")
- debug( "MinorVersion = %d" % self.MinorVersion )
- debug( "DllVersion = %d" % self.DllVersion )
- if self.DllVersion not in [3, 4]:
+ log.debug( "Minor Version = %d" % self.minor_version )
+ log.debug( "DLL Version = %d (expected: 3 or 4)" % self.dll_version )
+ if self.dll_version not in [3, 4]:
# version 3: usual format, 512 bytes per sector
# version 4: large format, 4K per sector
self._raise_defect(DEFECT_INCORRECT, "incorrect DllVersion in OLE header")
- debug( "ByteOrder = %X" % self.ByteOrder )
- if self.ByteOrder != 0xFFFE:
+ log.debug( "Byte Order = %X (expected: FFFE)" % self.byte_order )
+ if self.byte_order != 0xFFFE:
# For now only common little-endian documents are handled correctly
self._raise_defect(DEFECT_FATAL, "incorrect ByteOrder in OLE header")
# TODO: add big-endian support for documents created on Mac ?
# But according to [MS-CFB] ? v20140502, ByteOrder MUST be 0xFFFE.
- self.SectorSize = 2**self.SectorShift
- debug( "SectorSize = %d" % self.SectorSize )
- if self.SectorSize not in [512, 4096]:
- self._raise_defect(DEFECT_INCORRECT, "incorrect SectorSize in OLE header")
- if (self.DllVersion==3 and self.SectorSize!=512) \
- or (self.DllVersion==4 and self.SectorSize!=4096):
- self._raise_defect(DEFECT_INCORRECT, "SectorSize does not match DllVersion in OLE header")
- self.MiniSectorSize = 2**self.MiniSectorShift
- debug( "MiniSectorSize = %d" % self.MiniSectorSize )
- if self.MiniSectorSize not in [64]:
- self._raise_defect(DEFECT_INCORRECT, "incorrect MiniSectorSize in OLE header")
- if self.Reserved != 0 or self.Reserved1 != 0:
+ self.sector_size = 2**self.sector_shift
+ log.debug( "Sector Size = %d bytes (expected: 512 or 4096)" % self.sector_size )
+ if self.sector_size not in [512, 4096]:
+ self._raise_defect(DEFECT_INCORRECT, "incorrect sector_size in OLE header")
+ if (self.dll_version==3 and self.sector_size!=512) \
+ or (self.dll_version==4 and self.sector_size!=4096):
+ self._raise_defect(DEFECT_INCORRECT, "sector_size does not match DllVersion in OLE header")
+ self.mini_sector_size = 2**self.mini_sector_shift
+ log.debug( "MiniFAT Sector Size = %d bytes (expected: 64)" % self.mini_sector_size )
+ if self.mini_sector_size not in [64]:
+ self._raise_defect(DEFECT_INCORRECT, "incorrect mini_sector_size in OLE header")
+ if self.reserved1 != 0 or self.reserved2 != 0:
self._raise_defect(DEFECT_INCORRECT, "incorrect OLE header (non-null reserved bytes)")
- debug( "csectDir = %d" % self.csectDir )
+ log.debug( "Number of directory sectors = %d" % self.num_dir_sectors )
# Number of directory sectors (only allowed if DllVersion != 3)
- if self.SectorSize==512 and self.csectDir!=0:
- self._raise_defect(DEFECT_INCORRECT, "incorrect csectDir in OLE header")
- debug( "csectFat = %d" % self.csectFat )
- # csectFat = number of FAT sectors in the file
- debug( "sectDirStart = %X" % self.sectDirStart )
- # sectDirStart = 1st sector containing the directory
- debug( "signature = %d" % self.signature )
+ if self.sector_size==512 and self.num_dir_sectors!=0:
+ self._raise_defect(DEFECT_INCORRECT, "incorrect number of directory sectors in OLE header")
+ log.debug( "num_fat_sectors = %d" % self.num_fat_sectors )
+ # num_fat_sectors = number of FAT sectors in the file
+ log.debug( "first_dir_sector = %X" % self.first_dir_sector )
+ # first_dir_sector = 1st sector containing the directory
+ log.debug( "transaction_signature_number = %d" % self.transaction_signature_number )
# Signature should be zero, BUT some implementations do not follow this
# rule => only a potential defect:
# (according to MS-CFB, may be != 0 for applications supporting file
# transactions)
- if self.signature != 0:
- self._raise_defect(DEFECT_POTENTIAL, "incorrect OLE header (signature>0)")
- debug( "MiniSectorCutoff = %d" % self.MiniSectorCutoff )
+ if self.transaction_signature_number != 0:
+ self._raise_defect(DEFECT_POTENTIAL, "incorrect OLE header (transaction_signature_number>0)")
+ log.debug( "mini_stream_cutoff_size = 0x%X (expected: 0x1000)" % self.mini_stream_cutoff_size )
# MS-CFB: This integer field MUST be set to 0x00001000. This field
# specifies the maximum size of a user-defined data stream allocated
# from the mini FAT and mini stream, and that cutoff is 4096 bytes.
# Any user-defined data stream larger than or equal to this cutoff size
# must be allocated as normal sectors from the FAT.
- if self.MiniSectorCutoff != 0x1000:
- self._raise_defect(DEFECT_INCORRECT, "incorrect MiniSectorCutoff in OLE header")
- debug( "MiniFatStart = %X" % self.MiniFatStart )
- debug( "csectMiniFat = %d" % self.csectMiniFat )
- debug( "sectDifStart = %X" % self.sectDifStart )
- debug( "csectDif = %d" % self.csectDif )
+ if self.mini_stream_cutoff_size != 0x1000:
+ self._raise_defect(DEFECT_INCORRECT, "incorrect mini_stream_cutoff_size in OLE header")
+ # if no exception is raised, the cutoff size is fixed to 0x1000
+ log.warning('Fixing the mini_stream_cutoff_size to 4096 (mandatory value) instead of %d' %
+ self.mini_stream_cutoff_size)
+ self.mini_stream_cutoff_size = 0x1000
+ log.debug( "first_mini_fat_sector = %Xh" % self.first_mini_fat_sector )
+ log.debug( "num_mini_fat_sectors = %d" % self.num_mini_fat_sectors )
+ log.debug( "first_difat_sector = %Xh" % self.first_difat_sector )
+ log.debug( "num_difat_sectors = %d" % self.num_difat_sectors )
# calculate the number of sectors in the file
# (-1 because header doesn't count)
- self.nb_sect = ( (filesize + self.SectorSize-1) // self.SectorSize) - 1
- debug( "Number of sectors in the file: %d" % self.nb_sect )
+ self.nb_sect = ( (filesize + self.sector_size-1) // self.sector_size) - 1
+ log.debug( "Number of sectors in the file: %d" % self.nb_sect )
#TODO: change this test, because an OLE file MAY contain other data
# after the last sector.
# file clsid
- self.clsid = _clsid(header[8:24])
+ self.header_clsid = _clsid(header[8:24])
#TODO: remove redundant attributes, and fix the code which uses them?
- self.sectorsize = self.SectorSize #1 << i16(header, 30)
- self.minisectorsize = self.MiniSectorSize #1 << i16(header, 32)
- self.minisectorcutoff = self.MiniSectorCutoff # i32(header, 56)
+ self.sectorsize = self.sector_size #1 << i16(header, 30)
+ self.minisectorsize = self.mini_sector_size #1 << i16(header, 32)
+ self.minisectorcutoff = self.mini_stream_cutoff_size # i32(header, 56)
# check known streams for duplicate references (these are always in FAT,
# never in MiniFAT):
- self._check_duplicate_stream(self.sectDirStart)
+ self._check_duplicate_stream(self.first_dir_sector)
# check MiniFAT only if it is not empty:
- if self.csectMiniFat:
- self._check_duplicate_stream(self.MiniFatStart)
+ if self.num_mini_fat_sectors:
+ self._check_duplicate_stream(self.first_mini_fat_sector)
# check DIFAT only if it is not empty:
- if self.csectDif:
- self._check_duplicate_stream(self.sectDifStart)
+ if self.num_difat_sectors:
+ self._check_duplicate_stream(self.first_difat_sector)
# Load file allocation tables
self.loadfat(header)
# Load direcory. This sets both the direntries list (ordered by sid)
# and the root (ordered by hierarchy) members.
- self.loaddirectory(self.sectDirStart)#i32(header, 48))
+ self.loaddirectory(self.first_dir_sector)#i32(header, 48))
self.ministream = None
- self.minifatsect = self.MiniFatStart #i32(header, 60)
+ self.minifatsect = self.first_mini_fat_sector #i32(header, 60)
def close(self):
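To make the renamed header attributes easier to follow, the sketch below unpacks the same `fmt_header` layout into a dictionary keyed by the new names. The helper `parse_header` and the synthetic header bytes are illustrative assumptions; only the format string, the field order and the magic signature follow the hunk above.

```python
import struct

MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'
FMT_HEADER = '<8s16sHHHHHHLLLLLLLLLL'
FIELDS = (
    'header_signature', 'header_clsid', 'minor_version', 'dll_version',
    'byte_order', 'sector_shift', 'mini_sector_shift', 'reserved1',
    'reserved2', 'num_dir_sectors', 'num_fat_sectors', 'first_dir_sector',
    'transaction_signature_number', 'mini_stream_cutoff_size',
    'first_mini_fat_sector', 'num_mini_fat_sectors', 'first_difat_sector',
    'num_difat_sectors',
)

def parse_header(header):
    """Unpack the fixed part of an OLE header into a dict of named fields."""
    size = struct.calcsize(FMT_HEADER)  # 76 bytes; 109 FAT indexes follow
    fields = dict(zip(FIELDS, struct.unpack(FMT_HEADER, header[:size])))
    # Sector sizes are stored as powers of two.
    fields['sector_size'] = 2 ** fields['sector_shift']
    fields['mini_sector_size'] = 2 ** fields['mini_sector_shift']
    return fields

# Synthetic header with made-up values, for illustration only.
raw = struct.pack(FMT_HEADER, MAGIC, b'\x00' * 16, 0x3E, 3, 0xFFFE, 9, 6, 0,
                  0, 0, 1, 0, 0, 0x1000, 0xFFFFFFFE, 0, 0xFFFFFFFE, 0)
hdr = parse_header(raw.ljust(512, b'\xff'))
```

A sector shift of 9 gives 512-byte sectors and a mini sector shift of 6 gives 64-byte mini sectors, which are the values the checks in the hunk expect for version 3 files.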
@@ -1418,10 +1453,10 @@ class OleFileIO:
:param minifat: bool, if True, stream is located in the MiniFAT, else in the FAT
"""
if minifat:
- debug('_check_duplicate_stream: sect=%d in MiniFAT' % first_sect)
+ log.debug('_check_duplicate_stream: sect=%Xh in MiniFAT' % first_sect)
used_streams = self._used_streams_minifat
else:
- debug('_check_duplicate_stream: sect=%d in FAT' % first_sect)
+ log.debug('_check_duplicate_stream: sect=%Xh in FAT' % first_sect)
# some values can be safely ignored (not a real stream):
if first_sect in (DIFSECT,FATSECT,ENDOFCHAIN,FREESECT):
return
@@ -1435,10 +1470,9 @@ class OleFileIO:
def dumpfat(self, fat, firstindex=0):
- "Displays a part of FAT in human-readable form for debugging purpose"
- # [PL] added only for debug
- if not DEBUG_MODE:
- return
+ """
+ Display a part of FAT in human-readable form for debugging purposes
+ """
# dictionary to convert special FAT values in human-readable strings
VPL = 8 # values per line (8+1 * 8+1 = 81)
fatnames = {
@@ -1455,7 +1489,7 @@ class OleFileIO:
print()
for l in range(nlines):
index = l*VPL
- print("%8X:" % (firstindex+index), end=" ")
+ print("%6X:" % (firstindex+index), end=" ")
for i in range(index, index+VPL):
if i>=nbsect:
break
@@ -1473,9 +1507,9 @@ class OleFileIO:
def dumpsect(self, sector, firstindex=0):
- "Displays a sector in a human-readable form, for debugging purpose."
- if not DEBUG_MODE:
- return
+ """
+ Display a sector in a human-readable form, for debugging purposes
+ """
VPL=8 # number of values per line (8+1 * 8+1 = 81)
tab = array.array(UINT32, sector)
if sys.byteorder == 'big':
@@ -1488,7 +1522,7 @@ class OleFileIO:
print()
for l in range(nlines):
index = l*VPL
- print("%8X:" % (firstindex+index), end=" ")
+ print("%6X:" % (firstindex+index), end=" ")
for i in range(index, index+VPL):
if i>=nbsect:
break
@@ -1523,14 +1557,18 @@ class OleFileIO:
else:
# if it's a raw sector, it is parsed in an array
fat1 = self.sect2array(sect)
- self.dumpsect(sect)
+ # Display the sector contents only if the logging level is debug:
+ if log.isEnabledFor(logging.DEBUG):
+ self.dumpsect(sect)
# The FAT is a sector chain starting at the first index of itself.
+ # initialize isect, just in case:
+ isect = None
for isect in fat1:
isect = isect & 0xFFFFFFFF # JYTHON-WORKAROUND
- debug("isect = %X" % isect)
+ log.debug("isect = %X" % isect)
if isect == ENDOFCHAIN or isect == FREESECT:
# the end of the sector chain has been reached
- debug("found end of sector chain")
+ log.debug("found end of sector chain")
break
# read the FAT sector
s = self.getsect(isect)
@@ -1551,7 +1589,7 @@ class OleFileIO:
# Additional sectors are described by DIF blocks
sect = header[76:512]
- debug( "len(sect)=%d, so %d integers" % (len(sect), len(sect)//4) )
+ log.debug( "len(sect)=%d, so %d integers" % (len(sect), len(sect)//4) )
#fat = []
# [PL] FAT is an array of 32 bits unsigned ints, it's more effective
# to use an array than a list in Python.
@@ -1567,53 +1605,57 @@ class OleFileIO:
## s = self.getsect(ix)
## #fat = fat + [i32(s, i) for i in range(0, len(s), 4)]
## fat = fat + array.array(UINT32, s)
- if self.csectDif != 0:
+ if self.num_difat_sectors != 0:
# [PL] There's a DIFAT because file is larger than 6.8MB
# some checks just in case:
- if self.csectFat <= 109:
+ if self.num_fat_sectors <= 109:
# there must be at least 109 blocks in header and the rest in
# DIFAT, so number of sectors must be >109.
self._raise_defect(DEFECT_INCORRECT, 'incorrect DIFAT, not enough sectors')
- if self.sectDifStart >= self.nb_sect:
+ if self.first_difat_sector >= self.nb_sect:
# initial DIFAT block index must be valid
self._raise_defect(DEFECT_FATAL, 'incorrect DIFAT, first index out of range')
- debug( "DIFAT analysis..." )
+ log.debug( "DIFAT analysis..." )
# We compute the necessary number of DIFAT sectors :
# Number of pointers per DIFAT sector = (sectorsize/4)-1
# (-1 because the last pointer is the next DIFAT sector number)
nb_difat_sectors = (self.sectorsize//4)-1
# (if 512 bytes: each DIFAT sector = 127 pointers + 1 towards next DIFAT sector)
- nb_difat = (self.csectFat-109 + nb_difat_sectors-1)//nb_difat_sectors
- debug( "nb_difat = %d" % nb_difat )
- if self.csectDif != nb_difat:
+ nb_difat = (self.num_fat_sectors-109 + nb_difat_sectors-1)//nb_difat_sectors
+ log.debug( "nb_difat = %d" % nb_difat )
+ if self.num_difat_sectors != nb_difat:
raise IOError('incorrect DIFAT')
- isect_difat = self.sectDifStart
+ isect_difat = self.first_difat_sector
for i in iterrange(nb_difat):
- debug( "DIFAT block %d, sector %X" % (i, isect_difat) )
+ log.debug( "DIFAT block %d, sector %X" % (i, isect_difat) )
#TODO: check if corresponding FAT SID = DIFSECT
sector_difat = self.getsect(isect_difat)
difat = self.sect2array(sector_difat)
- self.dumpsect(sector_difat)
+ # Display the sector contents only if the logging level is debug:
+ if log.isEnabledFor(logging.DEBUG):
+ self.dumpsect(sector_difat)
self.loadfat_sect(difat[:nb_difat_sectors])
# last DIFAT pointer is next DIFAT sector:
isect_difat = difat[nb_difat_sectors]
- debug( "next DIFAT sector: %X" % isect_difat )
+ log.debug( "next DIFAT sector: %X" % isect_difat )
# checks:
if isect_difat not in [ENDOFCHAIN, FREESECT]:
# last DIFAT pointer value must be ENDOFCHAIN or FREESECT
raise IOError('incorrect end of DIFAT')
-## if len(self.fat) != self.csectFat:
-## # FAT should contain csectFat blocks
-## print("FAT length: %d instead of %d" % (len(self.fat), self.csectFat))
+## if len(self.fat) != self.num_fat_sectors:
+## # FAT should contain num_fat_sectors blocks
+## print("FAT length: %d instead of %d" % (len(self.fat), self.num_fat_sectors))
## raise IOError('incorrect DIFAT')
# since FAT is read from fixed-size sectors, it may contain more values
# than the actual number of sectors in the file.
# Keep only the relevant sector indexes:
if len(self.fat) > self.nb_sect:
- debug('len(fat)=%d, shrunk to nb_sect=%d' % (len(self.fat), self.nb_sect))
+ log.debug('len(fat)=%d, shrunk to nb_sect=%d' % (len(self.fat), self.nb_sect))
self.fat = self.fat[:self.nb_sect]
- debug('\nFAT:')
- self.dumpfat(self.fat)
+ # Display the FAT contents only if the logging level is debug:
+ if log.isEnabledFor(logging.DEBUG):
+ log.debug('\nFAT:')
+ self.dumpfat(self.fat)
def loadminifat(self):
@@ -1626,15 +1668,15 @@ class OleFileIO:
# 1) Stream size is calculated according to the number of sectors
# declared in the OLE header. This allocated stream may be more than
# needed to store the actual sector indexes.
- # (self.csectMiniFat is the number of sectors of size self.SectorSize)
- stream_size = self.csectMiniFat * self.SectorSize
+ # (self.num_mini_fat_sectors is the number of sectors of size self.sector_size)
+ stream_size = self.num_mini_fat_sectors * self.sector_size
# 2) Actually used size is calculated by dividing the MiniStream size
# (given by root entry size) by the size of mini sectors, *4 for
# 32 bits indexes:
- nb_minisectors = (self.root.size + self.MiniSectorSize-1) // self.MiniSectorSize
+ nb_minisectors = (self.root.size + self.mini_sector_size-1) // self.mini_sector_size
used_size = nb_minisectors * 4
- debug('loadminifat(): minifatsect=%d, nb FAT sectors=%d, used_size=%d, stream_size=%d, nb MiniSectors=%d' %
- (self.minifatsect, self.csectMiniFat, used_size, stream_size, nb_minisectors))
+ log.debug('loadminifat(): minifatsect=%d, nb FAT sectors=%d, used_size=%d, stream_size=%d, nb MiniSectors=%d' %
+ (self.minifatsect, self.num_mini_fat_sectors, used_size, stream_size, nb_minisectors))
if used_size > stream_size:
# This is not really a problem, but may indicate a wrong implementation:
self._raise_defect(DEFECT_INCORRECT, 'OLE MiniStream is larger than MiniFAT')
@@ -1644,11 +1686,13 @@ class OleFileIO:
#self.minifat = [i32(s, i) for i in range(0, len(s), 4)]
self.minifat = self.sect2array(s)
# Then shrink the array to used size, to avoid indexes out of MiniStream:
- debug('MiniFAT shrunk from %d to %d sectors' % (len(self.minifat), nb_minisectors))
+ log.debug('MiniFAT shrunk from %d to %d sectors' % (len(self.minifat), nb_minisectors))
self.minifat = self.minifat[:nb_minisectors]
- debug('loadminifat(): len=%d' % len(self.minifat))
- debug('\nMiniFAT:')
- self.dumpfat(self.minifat)
+ log.debug('loadminifat(): len=%d' % len(self.minifat))
+ # Display the MiniFAT contents only if the logging level is debug:
+ if log.isEnabledFor(logging.DEBUG):
+ log.debug('\nMiniFAT:')
+ self.dumpfat(self.minifat)
def getsect(self, sect):
"""
@@ -1671,12 +1715,12 @@ class OleFileIO:
try:
self.fp.seek(self.sectorsize * (sect+1))
except:
- debug('getsect(): sect=%X, seek=%d, filesize=%d' %
+ log.debug('getsect(): sect=%X, seek=%d, filesize=%d' %
(sect, self.sectorsize*(sect+1), self._filesize))
self._raise_defect(DEFECT_FATAL, 'OLE sector index out of range')
sector = self.fp.read(self.sectorsize)
if len(sector) != self.sectorsize:
- debug('getsect(): sect=%X, read=%d, sectorsize=%d' %
+ log.debug('getsect(): sect=%X, read=%d, sectorsize=%d' %
(sect, len(sector), self.sectorsize))
self._raise_defect(DEFECT_FATAL, 'incomplete OLE sector')
return sector
@@ -1698,7 +1742,7 @@ class OleFileIO:
try:
self.fp.seek(self.sectorsize * (sect+1))
except:
- debug('write_sect(): sect=%X, seek=%d, filesize=%d' %
+ log.debug('write_sect(): sect=%X, seek=%d, filesize=%d' %
(sect, self.sectorsize*(sect+1), self._filesize))
self._raise_defect(DEFECT_FATAL, 'OLE sector index out of range')
if len(data) < self.sectorsize:
@@ -1725,7 +1769,7 @@ class OleFileIO:
#[PL] to detect malformed documents and avoid DoS attacks, the maximum
# number of directory entries can be calculated:
max_entries = self.directory_fp.size // 128
- debug('loaddirectory: size=%d, max_entries=%d' %
+ log.debug('loaddirectory: size=%d, max_entries=%d' %
(self.directory_fp.size, max_entries))
# Create list of directory entries
@@ -1741,6 +1785,10 @@ class OleFileIO:
root_entry = self._load_direntry(0)
# Root entry is the first entry:
self.root = self.direntries[0]
+ # TODO: read ALL directory entries (ignore bad entries?)
+ # TODO: adapt build_storage_tree to avoid duplicate reads
+ # for i in range(1, max_entries):
+ # self._load_direntry(i)
# read and build all storage trees, starting from the root:
self.root.build_storage_tree()
@@ -1788,9 +1836,9 @@ class OleFileIO:
:param force_FAT: if False (default), stream will be opened in FAT or MiniFAT
according to size. If True, it will always be opened in FAT.
"""
- debug('OleFileIO.open(): sect=%d, size=%d, force_FAT=%s' %
+ log.debug('OleFileIO.open(): sect=%Xh, size=%d, force_FAT=%s' %
(start, size, str(force_FAT)))
- # stream size is compared to the MiniSectorCutoff threshold:
+ # stream size is compared to the mini_stream_cutoff_size threshold:
if size < self.minisectorcutoff and not force_FAT:
# ministream object
if not self.ministream:
@@ -1799,7 +1847,7 @@ class OleFileIO:
# The first sector index of the miniFAT stream is stored in the
# root directory entry:
size_ministream = self.root.size
- debug('Opening MiniStream: sect=%d, size=%d' %
+ log.debug('Opening MiniStream: sect=%Xh, size=%d' %
(self.root.isectStart, size_ministream))
self.ministream = self._open(self.root.isectStart,
size_ministream, force_FAT=True)
@@ -1940,12 +1988,12 @@ class OleFileIO:
sect = entry.isectStart
# number of sectors to write
nb_sectors = (size + (self.sectorsize-1)) // self.sectorsize
- debug('nb_sectors = %d' % nb_sectors)
+ log.debug('nb_sectors = %d' % nb_sectors)
for i in range(nb_sectors):
## try:
## self.fp.seek(offset + self.sectorsize * sect)
## except:
-## debug('sect=%d, seek=%d' %
+## log.debug('sect=%d, seek=%d' %
## (sect, offset+self.sectorsize*sect))
## raise IOError('OLE sector index out of range')
# extract one sector from data, the last one being smaller:
@@ -1956,7 +2004,7 @@ class OleFileIO:
else:
data_sector = data [i*self.sectorsize:]
#TODO: comment this if it works
- debug('write_stream: size=%d sectorsize=%d data_sector=%d size%%sectorsize=%d'
+ log.debug('write_stream: size=%d sectorsize=%d data_sector=%Xh size%%sectorsize=%d'
% (size, self.sectorsize, len(data_sector), size % self.sectorsize))
assert(len(data_sector) % self.sectorsize==size % self.sectorsize)
self.write_sect(sect, data_sector)
@@ -2113,31 +2161,31 @@ class OleFileIO:
return data
for i in range(num_props):
+ property_id = 0 # just in case of an exception
try:
- id = 0 # just in case of an exception
- id = i32(s, 8+i*8)
+ property_id = i32(s, 8+i*8)
offset = i32(s, 12+i*8)
- type = i32(s, offset)
+ property_type = i32(s, offset)
- debug ('property id=%d: type=%d offset=%X' % (id, type, offset))
+ log.debug('property id=%d: type=%d offset=%X' % (property_id, property_type, offset))
# test for common types first (should perhaps use
# a dictionary instead?)
- if type == VT_I2: # 16-bit signed integer
+ if property_type == VT_I2: # 16-bit signed integer
value = i16(s, offset+4)
if value >= 32768:
value = value - 65536
- elif type == VT_UI2: # 2-byte unsigned integer
+ elif property_type == VT_UI2: # 2-byte unsigned integer
value = i16(s, offset+4)
- elif type in (VT_I4, VT_INT, VT_ERROR):
+ elif property_type in (VT_I4, VT_INT, VT_ERROR):
# VT_I4: 32-bit signed integer
# VT_ERROR: HRESULT, similar to 32-bit signed integer,
# see http://msdn.microsoft.com/en-us/library/cc230330.aspx
value = i32(s, offset+4)
- elif type in (VT_UI4, VT_UINT): # 4-byte unsigned integer
+ elif property_type in (VT_UI4, VT_UINT): # 4-byte unsigned integer
value = i32(s, offset+4) # FIXME
- elif type in (VT_BSTR, VT_LPSTR):
+ elif property_type in (VT_BSTR, VT_LPSTR):
# CodePageString, see http://msdn.microsoft.com/en-us/library/dd942354.aspx
# size is a 32 bits integer, including the null terminator, and
# possibly trailing or embedded null chars
@@ -2146,50 +2194,50 @@ class OleFileIO:
value = s[offset+8:offset+8+count-1]
# remove all null chars:
value = value.replace(b'\x00', b'')
- elif type == VT_BLOB:
+ elif property_type == VT_BLOB:
# binary large object (BLOB)
# see http://msdn.microsoft.com/en-us/library/dd942282.aspx
count = i32(s, offset+4)
value = s[offset+8:offset+8+count]
- elif type == VT_LPWSTR:
+ elif property_type == VT_LPWSTR:
# UnicodeString
# see http://msdn.microsoft.com/en-us/library/dd942313.aspx
# "the string should NOT contain embedded or additional trailing
# null characters."
count = i32(s, offset+4)
value = self._decode_utf16_str(s[offset+8:offset+8+count*2])
- elif type == VT_FILETIME:
+ elif property_type == VT_FILETIME:
value = long(i32(s, offset+4)) + (long(i32(s, offset+8))<<32)
# FILETIME is a 64-bit int: "number of 100ns periods
# since Jan 1,1601".
- if convert_time and id not in no_conversion:
- debug('Converting property #%d to python datetime, value=%d=%fs'
- %(id, value, float(value)/10000000))
+ if convert_time and property_id not in no_conversion:
+ log.debug('Converting property #%d to python datetime, value=%d=%fs'
+ %(property_id, value, float(value)/10000000))
# convert FILETIME to Python datetime.datetime
# inspired from http://code.activestate.com/recipes/511425-filetime-to-datetime/
_FILETIME_null_date = datetime.datetime(1601, 1, 1, 0, 0, 0)
- debug('timedelta days=%d' % (value//(10*1000000*3600*24)))
+ log.debug('timedelta days=%d' % (value//(10*1000000*3600*24)))
value = _FILETIME_null_date + datetime.timedelta(microseconds=value//10)
else:
# legacy code kept for backward compatibility: returns a
# number of seconds since Jan 1,1601
value = value // 10000000 # seconds
- elif type == VT_UI1: # 1-byte unsigned integer
+ elif property_type == VT_UI1: # 1-byte unsigned integer
value = i8(s[offset+4])
- elif type == VT_CLSID:
+ elif property_type == VT_CLSID:
value = _clsid(s[offset+4:offset+20])
- elif type == VT_CF:
+ elif property_type == VT_CF:
# PropertyIdentifier or ClipboardData??
# see http://msdn.microsoft.com/en-us/library/dd941945.aspx
count = i32(s, offset+4)
value = s[offset+8:offset+8+count]
- elif type == VT_BOOL:
+ elif property_type == VT_BOOL:
# VARIANT_BOOL, 16 bits bool, 0x0000=False, 0xFFFF=True
# see http://msdn.microsoft.com/en-us/library/cc237864.aspx
value = bool(i16(s, offset+4))
else:
value = None # everything else yields "None"
- debug ('property id=%d: type=%d not implemented in parser yet' % (id, type))
+ log.debug('property id=%d: type=%d not implemented in parser yet' % (property_id, property_type))
# missing: VT_EMPTY, VT_NULL, VT_R4, VT_R8, VT_CY, VT_DATE,
# VT_DECIMAL, VT_I1, VT_I8, VT_UI8,
@@ -2201,15 +2249,15 @@ class OleFileIO:
# type of items, e.g. VT_VECTOR|VT_BSTR
# see http://msdn.microsoft.com/en-us/library/dd942011.aspx
- #print("%08x" % id, repr(value), end=" ")
+ #print("%08x" % property_id, repr(value), end=" ")
#print("(%s)" % VT[i32(s, offset) & 0xFFF])
- data[id] = value
+ data[property_id] = value
except BaseException as exc:
# catch exception while parsing each property, and only raise
# a DEFECT_INCORRECT, because parsing can go on
msg = 'Error while parsing property id %d in stream %s: %s' % (
- id, repr(streampath), exc)
+ property_id, repr(streampath), exc)
self._raise_defect(DEFECT_INCORRECT, msg, type(exc))
return data
@@ -2233,38 +2281,47 @@ class OleFileIO:
if __name__ == "__main__":
- import sys
-
- # [PL] display quick usage info if launched from command-line
- if len(sys.argv) <= 1:
- print('olefile version %s %s - %s' % (__version__, __date__, __author__))
- print(
-"""
-Launched from the command line, this script parses OLE files and prints info.
-
-Usage: olefile.py [-d] [-c] <file> [file2 ...]
+ import sys, optparse
+
+ DEFAULT_LOG_LEVEL = "warning" # Default log level
+ LOG_LEVELS = {
+ 'debug': logging.DEBUG,
+ 'info': logging.INFO,
+ 'warning': logging.WARNING,
+ 'error': logging.ERROR,
+ 'critical': logging.CRITICAL
+ }
+
+ usage = 'usage: %prog [options] <filename> [filename2 ...]'
+ parser = optparse.OptionParser(usage=usage)
+ parser.add_option("-c", action="store_true", dest="check_streams",
+ help='check all streams (for debugging purposes)')
+ parser.add_option("-d", action="store_true", dest="debug_mode",
+ help='debug mode, shortcut for -l debug (displays a lot of debug information, for developers only)')
+ parser.add_option('-l', '--loglevel', dest="loglevel", action="store", default=DEFAULT_LOG_LEVEL,
+ help="logging level debug/info/warning/error/critical (default=%default)")
+
+ (options, args) = parser.parse_args()
+
+ print('olefile version %s %s - http://www.decalage.info/en/olefile\n' % (__version__, __date__))
+
+ # Print help if no arguments are passed
+ if len(args) == 0:
+ print(__doc__)
+ parser.print_help()
+ sys.exit()
-Options:
--d : debug mode (displays a lot of debug information, for developers only)
--c : check all streams (for debugging purposes)
+ if options.debug_mode:
+ options.loglevel = 'debug'
-For more information, see http://www.decalage.info/olefile
-""")
- sys.exit()
+ # setup logging to the console
+ logging.basicConfig(level=LOG_LEVELS[options.loglevel], format='%(levelname)-8s %(message)s')
- check_streams = False
- for filename in sys.argv[1:]:
-## try:
- # OPTIONS:
- if filename == '-d':
- # option to switch debug mode on:
- set_debug_mode(True)
- continue
- if filename == '-c':
- # option to switch check streams mode on:
- check_streams = True
- continue
+ # also set the same log level for the module's logger to enable it:
+ log.setLevel(LOG_LEVELS[options.loglevel])
+ for filename in args:
+ try:
ole = OleFileIO(filename)#, raise_defects=DEFECT_INCORRECT)
print("-" * 68)
print(filename)
@@ -2272,24 +2329,27 @@ For more information, see http://www.decalage.info/olefile
ole.dumpdirectory()
for streamname in ole.listdir():
if streamname[-1][0] == "\005":
- print(streamname, ": properties")
- props = ole.getproperties(streamname, convert_time=True)
- props = sorted(props.items())
- for k, v in props:
- #[PL]: avoid to display too large or binary values:
- if isinstance(v, (basestring, bytes)):
- if len(v) > 50:
- v = v[:50]
- if isinstance(v, bytes):
- # quick and dirty binary check:
- for c in (1,2,3,4,5,6,7,11,12,14,15,16,17,18,19,20,
- 21,22,23,24,25,26,27,28,29,30,31):
- if c in bytearray(v):
- v = '(binary data)'
- break
- print(" ", k, v)
-
- if check_streams:
+ print("%r: properties" % streamname)
+ try:
+ props = ole.getproperties(streamname, convert_time=True)
+ props = sorted(props.items())
+ for k, v in props:
+ #[PL]: avoid displaying values that are too large or binary:
+ if isinstance(v, (basestring, bytes)):
+ if len(v) > 50:
+ v = v[:50]
+ if isinstance(v, bytes):
+ # quick and dirty binary check:
+ for c in (1,2,3,4,5,6,7,11,12,14,15,16,17,18,19,20,
+ 21,22,23,24,25,26,27,28,29,30,31):
+ if c in bytearray(v):
+ v = '(binary data)'
+ break
+ print(" ", k, v)
+ except:
+ log.exception('Error while parsing property stream %r' % streamname)
+
+ if options.check_streams:
# Read all streams to check if there are errors:
print('\nChecking streams...')
for streamname in ole.listdir():
@@ -2318,8 +2378,11 @@ For more information, see http://www.decalage.info/olefile
print()
# parse and display metadata:
- meta = ole.get_metadata()
- meta.dump()
+ try:
+ meta = ole.get_metadata()
+ meta.dump()
+ except:
+ log.exception('Error while parsing metadata')
print()
#[PL] Test a few new methods:
root = ole.get_rootentry_name()
@@ -2338,7 +2401,7 @@ For more information, see http://www.decalage.info/olefile
print('- %s: %s' % (exctype.__name__, msg))
else:
print('None')
-## except IOError as v:
-## print("***", "cannot read", file, "-", v)
+ except:
+ log.exception('Error while parsing file %r' % filename)
# this code was developed while listening to The Wedding Present "Sea Monsters"
diff --git a/oletools/thirdparty/olefile/olefile2.py b/oletools/thirdparty/olefile/olefile2.py
index d7333de..7f5ab0e 100644
--- a/oletools/thirdparty/olefile/olefile2.py
+++ b/oletools/thirdparty/olefile/olefile2.py
@@ -1166,33 +1166,33 @@ class OleFileIO:
self._raise_defect(DEFECT_FATAL, "incorrect ByteOrder in OLE header")
# TODO: add big-endian support for documents created on Mac ?
self.SectorSize = 2**self.SectorShift
- debug( "SectorSize = %d" % self.SectorSize )
+ debug( "sector_size = %d" % self.SectorSize )
if self.SectorSize not in [512, 4096]:
- self._raise_defect(DEFECT_INCORRECT, "incorrect SectorSize in OLE header")
+ self._raise_defect(DEFECT_INCORRECT, "incorrect sector_size in OLE header")
if (self.DllVersion==3 and self.SectorSize!=512) \
or (self.DllVersion==4 and self.SectorSize!=4096):
- self._raise_defect(DEFECT_INCORRECT, "SectorSize does not match DllVersion in OLE header")
+ self._raise_defect(DEFECT_INCORRECT, "sector_size does not match DllVersion in OLE header")
self.MiniSectorSize = 2**self.MiniSectorShift
- debug( "MiniSectorSize = %d" % self.MiniSectorSize )
+ debug( "mini_sector_size = %d" % self.MiniSectorSize )
if self.MiniSectorSize not in [64]:
- self._raise_defect(DEFECT_INCORRECT, "incorrect MiniSectorSize in OLE header")
+ self._raise_defect(DEFECT_INCORRECT, "incorrect mini_sector_size in OLE header")
if self.Reserved != 0 or self.Reserved1 != 0:
self._raise_defect(DEFECT_INCORRECT, "incorrect OLE header (non-null reserved bytes)")
debug( "csectDir = %d" % self.csectDir )
if self.SectorSize==512 and self.csectDir!=0:
self._raise_defect(DEFECT_INCORRECT, "incorrect csectDir in OLE header")
- debug( "csectFat = %d" % self.csectFat )
- debug( "sectDirStart = %X" % self.sectDirStart )
- debug( "signature = %d" % self.signature )
+ debug( "num_fat_sectors = %d" % self.csectFat )
+ debug( "first_dir_sector = %X" % self.sectDirStart )
+ debug( "transaction_signature_number = %d" % self.signature )
# Signature should be zero, BUT some implementations do not follow this
# rule => only a potential defect:
if self.signature != 0:
- self._raise_defect(DEFECT_POTENTIAL, "incorrect OLE header (signature>0)")
- debug( "MiniSectorCutoff = %d" % self.MiniSectorCutoff )
- debug( "MiniFatStart = %X" % self.MiniFatStart )
- debug( "csectMiniFat = %d" % self.csectMiniFat )
- debug( "sectDifStart = %X" % self.sectDifStart )
- debug( "csectDif = %d" % self.csectDif )
+ self._raise_defect(DEFECT_POTENTIAL, "incorrect OLE header (transaction_signature_number>0)")
+ debug( "mini_stream_cutoff_size = %d" % self.MiniSectorCutoff )
+ debug( "first_mini_fat_sector = %X" % self.MiniFatStart )
+ debug( "num_mini_fat_sectors = %d" % self.csectMiniFat )
+ debug( "first_difat_sector = %X" % self.sectDifStart )
+ debug( "num_difat_sectors = %d" % self.csectDif )
# calculate the number of sectors in the file
# (-1 because header doesn't count)
@@ -1414,9 +1414,9 @@ class OleFileIO:
if isect_difat not in [ENDOFCHAIN, FREESECT]:
# last DIFAT pointer value must be ENDOFCHAIN or FREESECT
raise IOError, 'incorrect end of DIFAT'
-## if len(self.fat) != self.csectFat:
-## # FAT should contain csectFat blocks
-## print "FAT length: %d instead of %d" % (len(self.fat), self.csectFat)
+## if len(self.fat) != self.num_fat_sectors:
+## # FAT should contain num_fat_sectors blocks
+## print "FAT length: %d instead of %d" % (len(self.fat), self.num_fat_sectors)
## raise IOError, 'incorrect DIFAT'
# since FAT is read from fixed-size sectors, it may contain more values
# than the actual number of sectors in the file.
@@ -1438,7 +1438,7 @@ class OleFileIO:
# 1) Stream size is calculated according to the number of sectors
# declared in the OLE header. This allocated stream may be more than
# needed to store the actual sector indexes.
- # (self.csectMiniFat is the number of sectors of size self.SectorSize)
+ # (self.num_mini_fat_sectors is the number of sectors of size self.sector_size)
stream_size = self.csectMiniFat * self.SectorSize
# 2) Actually used size is calculated by dividing the MiniStream size
# (given by root entry size) by the size of mini sectors, *4 for
@@ -1565,7 +1565,7 @@ class OleFileIO:
"""
debug('OleFileIO.open(): sect=%d, size=%d, force_FAT=%s' %
(start, size, str(force_FAT)))
- # stream size is compared to the MiniSectorCutoff threshold:
+ # stream size is compared to the mini_stream_cutoff_size threshold:
if size < self.minisectorcutoff and not force_FAT:
# ministream object
if not self.ministream: