Commit 430c7b98a68cf38ed3f67b8a23209d431aba0d0b
1 parent cdbcd101
updated doc and readme
Showing 13 changed files with 452 additions and 115 deletions
README.md
| ... | ... | @@ -22,7 +22,7 @@ Note: python-oletools is not related to OLETools published by BeCubed Software. |
| 22 | 22 | News |
| 23 | 23 | ---- |
| 24 | 24 | |
| 25 | -- **2015-02-05 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings | |
| 25 | +- **2015-02-08 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings | |
| 26 | 26 | obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western |
| 27 | 27 | codepages with olefile 0.42, improved API and display, several bugfixes. |
| 28 | 28 | - 2015-01-05 v0.07: improved [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) to detect suspicious | ... | ... |
oletools/README.html
| ... | ... | @@ -13,7 +13,7 @@ |
| 13 | 13 | <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p> |
| 14 | 14 | <h2 id="news">News</h2> |
| 15 | 15 | <ul> |
| 16 | -<li><strong>2015-02-05 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li> | |
| 16 | +<li><strong>2015-02-08 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li> | |
| 17 | 17 | <li>2015-01-05 v0.07: improved <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> to detect suspicious keywords and IOCs in VBA macros, can now scan several files and open password-protected zip archives, added a Python API, upgraded OleFileIO_PL to olefile v0.41</li> |
| 18 | 18 | <li>2014-08-28 v0.06: added <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved <a href="https://bitbucket.org/decalage/oletools/wiki">documentation</a></li> |
| 19 | 19 | <li>2013-07-24 v0.05: added new tools <a href="https://bitbucket.org/decalage/oletools/wiki/olemeta">olemeta</a> and <a href="https://bitbucket.org/decalage/oletools/wiki/oletimes">oletimes</a></li> | ... | ... |
oletools/README.rst
| ... | ... | @@ -27,7 +27,7 @@ Software. |
| 27 | 27 | News |
| 28 | 28 | ---- |
| 29 | 29 | |
| 30 | -- **2015-02-05 v0.08**: | |
| 30 | +- **2015-02-08 v0.08**: | |
| 31 | 31 | `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`_ can |
| 32 | 32 | now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and |
| 33 | 33 | extract IOCs. Added new triage mode, support for non-western | ... | ... |
oletools/doc/Home.html
| ... | ... | @@ -7,7 +7,7 @@ |
| 7 | 7 | <title></title> |
| 8 | 8 | </head> |
| 9 | 9 | <body> |
| 10 | -<h1 id="python-oletools-v0.07-documentation">python-oletools v0.07 documentation</h1> | |
| 10 | +<h1 id="python-oletools-v0.08-documentation">python-oletools v0.08 documentation</h1> | |
| 11 | 11 | <p>This is the home page of the documentation for python-oletools. The latest version can be found <a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> |
| 12 | 12 | <p><a href="http://www.decalage.info/python/oletools">python-oletools</a> is a package of python tools to analyze <a href="http://en.wikipedia.org/wiki/Compound_File_Binary_Format">Microsoft OLE2 files</a> (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the <a href="http://www.decalage.info/olefile">olefile</a> parser. See <a href="http://www.decalage.info/python/oletools">http://www.decalage.info/python/oletools</a> for more info.</p> |
| 13 | 13 | <p><strong>Quick links:</strong> <a href="http://www.decalage.info/python/oletools">Home page</a> - <a href="https://bitbucket.org/decalage/oletools/wiki/Install">Download/Install</a> - <a href="https://bitbucket.org/decalage/oletools/wiki">Documentation</a> - <a href="https://bitbucket.org/decalage/oletools/issues?status=new&status=open">Report Issues/Suggestions/Questions</a> - <a href="http://decalage.info/contact">Contact the author</a> - <a href="https://bitbucket.org/decalage/oletools">Repository</a> - <a href="https://twitter.com/decalage2">Updates on Twitter</a></p> |
| ... | ... | @@ -18,7 +18,7 @@ |
| 18 | 18 | <li><strong><a href="oleid.html">oleid</a></strong>: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.</li> |
| 19 | 19 | <li><strong><a href="olemeta.html">olemeta</a></strong>: a tool to extract all standard properties (metadata) from OLE files.</li> |
| 20 | 20 | <li><strong><a href="oletimes.html">oletimes</a></strong>: a tool to extract creation and modification timestamps of all streams and storages.</li> |
| 21 | -<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).</li> | |
| 21 | +<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).</li> | |
| 22 | 22 | <li><strong><a href="pyxswf.html">pyxswf</a></strong>: a tool to detect, extract and analyze Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is especially useful for malware analysis.</li> |
| 23 | 23 | <li><strong><a href="rtfobj.html">rtfobj</a></strong>: a tool and python module to extract embedded objects from RTF files.</li> |
| 24 | 24 | <li>and a few others (coming soon)</li> | ... | ... |
oletools/doc/Home.md
| 1 | -python-oletools v0.07 documentation | |
| 1 | +python-oletools v0.08 documentation | |
| 2 | 2 | =================================== |
| 3 | 3 | |
| 4 | 4 | This is the home page of the documentation for python-oletools. The latest version can be found |
| ... | ... | @@ -29,7 +29,7 @@ Tools in python-oletools: |
| 29 | 29 | - **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files. |
| 30 | 30 | - **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files. |
| 31 | 31 | - **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages. |
| 32 | -- **[[olevba]]**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML). | |
| 32 | +- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML). | |
| 33 | 33 | - **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may |
| 34 | 34 | be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, |
| 35 | 35 | which is especially useful for malware analysis. | ... | ... |
oletools/doc/Install.html
| ... | ... | @@ -9,14 +9,15 @@ |
| 9 | 9 | <body> |
| 10 | 10 | <h1 id="how-to-download-and-install-python-oletools">How to Download and Install python-oletools</h1> |
| 11 | 11 | <h2 id="pre-requisites">Pre-requisites</h2> |
| 12 | -<p>For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet.</p> | |
| 13 | -<h2 id="for-command-line-tools">For command-line tools</h2> | |
| 14 | -<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice.</p> | |
| 15 | -<p>You may then add the directory to your PATH environment variable to access the tools from anywhere.</p> | |
| 16 | -<p>To get the latest development version, click on "Download repository" on the <a href="https://bitbucket.org/decalage/oletools/downloads">downloads page</a>, or use mercurial to clone the repository.</p> | |
| 12 | +<p>For now, python-oletools require Python 2.x, if possible 2.6 or 2.7. They are not compatible with Python 3.x yet.</p> | |
| 13 | +<h2 id="to-use-oletools-as-command-line-tools">To use oletools as command-line tools</h2> | |
| 14 | +<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice. Pick the latest release version, or click on "Download Repository" to get the latest development version with the most recent features.</p> | |
| 15 | +<p>Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily in the future.</p> | |
| 16 | +<p>You may add the oletools directory to your PATH environment variable to access the tools from anywhere.</p> | |
| 17 | 17 | <h2 id="for-python-applications">For python applications</h2> |
| 18 | -<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use "<strong>pip install oletools</strong>" or "<strong>easy_install oletools</strong>" to download and install the package in one go.</p> | |
| 19 | -<p>Otherwise you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run "<strong>python setup.py install</strong>".</p> | |
| 18 | +<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use <strong>"pip install oletools"</strong> or <strong>"easy_install oletools"</strong> to download and install the package in one go. Pip is included with Python since version 2.7.9.</p> | |
| 19 | +<p><strong>Important: to update oletools</strong> if it is already installed, you must run <strong>"pip install -U oletools"</strong>, otherwise pip will not update it.</p> | |
| 20 | +<p>Alternatively, you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run <strong>"python setup.py install"</strong>.</p> | |
| 20 | 21 | <hr /> |
| 21 | 22 | <h2 id="python-oletools-documentation">python-oletools documentation</h2> |
| 22 | 23 | <ul> | ... | ... |
oletools/doc/Install.md
| ... | ... | @@ -4,30 +4,35 @@ How to Download and Install python-oletools |
| 4 | 4 | Pre-requisites |
| 5 | 5 | -------------- |
| 6 | 6 | |
| 7 | -For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet. | |
| 7 | +For now, python-oletools require Python 2.x, if possible 2.6 or 2.7. They are not compatible with Python 3.x yet. | |
| 8 | 8 | |
| 9 | 9 | |
| 10 | -For command-line tools | |
| 11 | ----------------------- | |
| 10 | +To use oletools as command-line tools | |
| 11 | +------------------------------------- | |
| 12 | 12 | |
| 13 | 13 | To use python-oletools from the command line as analysis tools, you may simply |
| 14 | 14 | [download the zip archive](https://bitbucket.org/decalage/oletools/downloads) |
| 15 | -and extract the files in the directory of your choice. | |
| 15 | +and extract the files in the directory of your choice. Pick the latest release version, or click on "Download Repository" | |
| 16 | +to get the latest development version with the most recent features. | |
| 16 | 17 | |
| 17 | -You may then add the directory to your PATH environment variable to access the tools from anywhere. | |
| 18 | +Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily | |
| 19 | +in the future. | |
| 18 | 20 | |
| 19 | -To get the latest development version, click on "Download repository" on the | |
| 20 | -[downloads page](https://bitbucket.org/decalage/oletools/downloads), or use mercurial to clone the repository. | |
| 21 | +You may add the oletools directory to your PATH environment variable to access the tools from anywhere. | |
| 21 | 22 | |
| 22 | 23 | |
| 23 | 24 | For python applications |
| 24 | ----------------------- | |
| 25 | +----------------------- | |
| 25 | 26 | |
| 26 | 27 | If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use |
| 27 | -"**pip install oletools**" or "**easy_install oletools**" to download and install the package in one go. | |
| 28 | +**"pip install oletools"** or **"easy_install oletools"** to download and install the package in one go. Pip is included | |
| 29 | +with Python since version 2.7.9. | |
| 28 | 30 | |
| 29 | -Otherwise you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary | |
| 30 | -directory and run "**python setup.py install**". | |
| 31 | +**Important: to update oletools** if it is already installed, you must run **"pip install -U oletools"**, otherwise pip | |
| 32 | +will not update it. | |
| 33 | + | |
| 34 | +Alternatively, you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary | |
| 35 | +directory and run **"python setup.py install"**. | |
| 31 | 36 | |
| 32 | 37 | -------------------------------------------------------------------------- |
| 33 | 38 | ... | ... |
oletools/doc/oleid.html
| ... | ... | @@ -39,19 +39,43 @@ |
| 39 | 39 | <pre><code>C:\oletools>oleid.py word_flash_vba.doc |
| 40 | 40 | |
| 41 | 41 | Filename: word_flash_vba.doc |
| 42 | -OLE format: True | |
| 43 | -Has SummaryInformation stream: True | |
| 44 | -Application name: Microsoft Office Word | |
| 45 | -Encrypted: False | |
| 46 | -Word Document: True | |
| 47 | -VBA Macros: True | |
| 48 | -Excel Workbook: False | |
| 49 | -PowerPoint Presentation: False | |
| 50 | -Visio Drawing: False | |
| 51 | -ObjectPool: True | |
| 52 | -Flash objects: 1</code></pre> | |
| 53 | -<h2 id="how-to-use-oleid-in-python-applications">How to use oleid in Python applications</h2> | |
| 54 | -<p>TODO</p> | |
| 42 | ++-------------------------------+-----------------------+ | |
| 43 | +| Indicator | Value | | |
| 44 | ++-------------------------------+-----------------------+ | |
| 45 | +| OLE format | True | | |
| 46 | +| Has SummaryInformation stream | True | | |
| 47 | +| Application name | Microsoft Office Word | | |
| 48 | +| Encrypted | False | | |
| 49 | +| Word Document | True | | |
| 50 | +| VBA Macros | True | | |
| 51 | +| Excel Workbook | False | | |
| 52 | +| PowerPoint Presentation | False | | |
| 53 | +| Visio Drawing | False | | |
| 54 | +| ObjectPool | True | | |
| 55 | +| Flash objects | 1 | | |
| 56 | ++-------------------------------+-----------------------+</code></pre> | |
| 57 | +<h2 id="how-to-use-oleid-in-your-python-applications">How to use oleid in your Python applications</h2> | |
| 58 | +<p>First, import oletools.oleid, and create an <strong>OleID</strong> object to scan a file:</p> | |
| 59 | +<pre><code>import oletools.oleid | |
| 60 | + | |
| 61 | +oid = oletools.oleid.OleID(filename)</code></pre> | |
| 62 | +<p>Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.</p> | |
| 63 | +<p>Second, call the <strong>check()</strong> method. It returns a list of <strong>Indicator</strong> objects.</p> | |
| 64 | +<p>Each Indicator object has the following attributes:</p> | |
| 65 | +<ul> | |
| 66 | +<li><strong>id</strong>: str, identifier for the indicator</li> | |
| 67 | +<li><strong>name</strong>: str, name to display the indicator</li> | |
| 68 | +<li><strong>description</strong>: str, long description of the indicator</li> | |
| 69 | +<li><strong>type</strong>: class of the indicator (e.g. bool, str, int)</li> | |
| 70 | +<li><strong>value</strong>: value of the indicator</li> | |
| 71 | +</ul> | |
| 72 | +<p>For example, the following code displays all the indicators:</p> | |
| 73 | +<pre><code>indicators = oid.check() | |
| 74 | +for i in indicators: | |
| 75 | + print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value)) | |
| 76 | + print 'description:', i.description | |
| 77 | + print ''</code></pre> | |
| 78 | +<p>See the source code of oleid.py for more details.</p> | |
| 55 | 79 | <hr /> |
| 56 | 80 | <h2 id="python-oletools-documentation">python-oletools documentation</h2> |
| 57 | 81 | <ul> | ... | ... |
oletools/doc/oleid.md
| ... | ... | @@ -42,21 +42,53 @@ Analyzing a Word document containing a Flash object and VBA macros: |
| 42 | 42 | C:\oletools>oleid.py word_flash_vba.doc |
| 43 | 43 | |
| 44 | 44 | Filename: word_flash_vba.doc |
| 45 | - OLE format: True | |
| 46 | - Has SummaryInformation stream: True | |
| 47 | - Application name: Microsoft Office Word | |
| 48 | - Encrypted: False | |
| 49 | - Word Document: True | |
| 50 | - VBA Macros: True | |
| 51 | - Excel Workbook: False | |
| 52 | - PowerPoint Presentation: False | |
| 53 | - Visio Drawing: False | |
| 54 | - ObjectPool: True | |
| 55 | - Flash objects: 1 | |
| 56 | - | |
| 57 | -## How to use oleid in Python applications | |
| 58 | - | |
| 59 | -TODO | |
| 45 | + +-------------------------------+-----------------------+ | |
| 46 | + | Indicator | Value | | |
| 47 | + +-------------------------------+-----------------------+ | |
| 48 | + | OLE format | True | | |
| 49 | + | Has SummaryInformation stream | True | | |
| 50 | + | Application name | Microsoft Office Word | | |
| 51 | + | Encrypted | False | | |
| 52 | + | Word Document | True | | |
| 53 | + | VBA Macros | True | | |
| 54 | + | Excel Workbook | False | | |
| 55 | + | PowerPoint Presentation | False | | |
| 56 | + | Visio Drawing | False | | |
| 57 | + | ObjectPool | True | | |
| 58 | + | Flash objects | 1 | | |
| 59 | + +-------------------------------+-----------------------+ | |
| 60 | + | |
| 61 | +## How to use oleid in your Python applications | |
| 62 | + | |
| 63 | +First, import oletools.oleid, and create an **OleID** object to scan a file: | |
| 64 | + | |
| 65 | + :::python | |
| 66 | + import oletools.oleid | |
| 67 | + | |
| 68 | + oid = oletools.oleid.OleID(filename) | |
| 69 | + | |
| 70 | +Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed. | |
| 71 | + | |
| 72 | +Second, call the **check()** method. It returns a list of **Indicator** objects. | |
| 73 | + | |
| 74 | +Each Indicator object has the following attributes: | |
| 75 | + | |
| 76 | +- **id**: str, identifier for the indicator | |
| 77 | +- **name**: str, name to display the indicator | |
| 78 | +- **description**: str, long description of the indicator | |
| 79 | +- **type**: class of the indicator (e.g. bool, str, int) | |
| 80 | +- **value**: value of the indicator | |
| 81 | + | |
| 82 | +For example, the following code displays all the indicators: | |
| 83 | + | |
| 84 | + :::python | |
| 85 | + indicators = oid.check() | |
| 86 | + for i in indicators: | |
| 87 | + print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value)) | |
| 88 | + print 'description:', i.description | |
| 89 | + print '' | |
| 90 | + | |
| 91 | +See the source code of oleid.py for more details. | |
| 60 | 92 | |
| 61 | 93 | -------------------------------------------------------------------------- |
| 62 | 94 | ... | ... |
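Note on the oleid API documented in oletools/doc/oleid.md above: the sketch below shows one way the list returned by `check()` might be post-processed. It is only an illustration. To stay runnable without oletools or a sample file, it uses a stand-in `Indicator` namedtuple carrying the five documented attributes; the indicator ids (`ole_format`, `encrypted`, `vba`, `flash`) and the helper names are hypothetical, while the names and values are taken from the `word_flash_vba.doc` example output.

```python
from collections import namedtuple

# Stand-in for the Indicator objects returned by OleID.check(); it only
# carries the attributes listed in the documentation (assumption: the real
# class may have more).
Indicator = namedtuple("Indicator", "id name description type value")


def indicators_to_dict(indicators):
    """Map each indicator id to its value for easy lookup."""
    return dict((i.id, i.value) for i in indicators)


def flagged(indicators):
    """Return the display names of boolean indicators that are True."""
    return [i.name for i in indicators if i.type is bool and i.value]


# Hypothetical ids; names and values mirror the sample table above.
sample = [
    Indicator("ole_format", "OLE format", "", bool, True),
    Indicator("encrypted", "Encrypted", "", bool, False),
    Indicator("vba", "VBA Macros", "", bool, True),
    Indicator("flash", "Flash objects", "", int, 1),
]

results = indicators_to_dict(sample)
true_flags = flagged(sample)
```

With a real file, `sample` would be replaced by `oletools.oleid.OleID(filename).check()`, as described in the documentation.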
oletools/doc/oletimes.html
| ... | ... | @@ -16,24 +16,29 @@ |
| 16 | 16 | <p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p> |
| 17 | 17 | <pre><code>>oletimes.py DIAN_caso-5415.doc |
| 18 | 18 | |
| 19 | -- Root mtime=2014-05-14 12:45:24.752000 ctime=None | |
| 20 | -- '\x01CompObj': mtime=None ctime=None | |
| 21 | -- '\x05DocumentSummaryInformation': mtime=None ctime=None | |
| 22 | -- '\x05SummaryInformation': mtime=None ctime=None | |
| 23 | -- '1Table': mtime=None ctime=None | |
| 24 | -- 'Data': mtime=None ctime=None | |
| 25 | -- 'Macros': mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000 | |
| 26 | -- 'Macros/PROJECT': mtime=None ctime=None | |
| 27 | -- 'Macros/PROJECTwm': mtime=None ctime=None | |
| 28 | -- 'Macros/VBA': mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000 | |
| 29 | -- 'Macros/VBA/ThisDocument': mtime=None ctime=None | |
| 30 | -- 'Macros/VBA/_VBA_PROJECT': mtime=None ctime=None | |
| 31 | -- 'Macros/VBA/__SRP_0': mtime=None ctime=None | |
| 32 | -- 'Macros/VBA/__SRP_1': mtime=None ctime=None | |
| 33 | -- 'Macros/VBA/__SRP_2': mtime=None ctime=None | |
| 34 | -- 'Macros/VBA/__SRP_3': mtime=None ctime=None | |
| 35 | -- 'Macros/VBA/dir': mtime=None ctime=None | |
| 36 | -- 'WordDocument': mtime=None ctime=None</code></pre> | |
| 19 | ++----------------------------+---------------------+---------------------+ | |
| 20 | +| Stream/Storage name | Modification Time | Creation Time | | |
| 21 | ++----------------------------+---------------------+---------------------+ | |
| 22 | +| Root | 2014-05-14 12:45:24 | None | | |
| 23 | +| '\x01CompObj' | None | None | | |
| 24 | +| '\x05DocumentSummaryInform | None | None | | |
| 25 | +| ation' | | | | |
| 26 | +| '\x05SummaryInformation' | None | None | | |
| 27 | +| '1Table' | None | None | | |
| 28 | +| 'Data' | None | None | | |
| 29 | +| 'Macros' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 | | |
| 30 | +| 'Macros/PROJECT' | None | None | | |
| 31 | +| 'Macros/PROJECTwm' | None | None | | |
| 32 | +| 'Macros/VBA' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 | | |
| 33 | +| 'Macros/VBA/ThisDocument' | None | None | | |
| 34 | +| 'Macros/VBA/_VBA_PROJECT' | None | None | | |
| 35 | +| 'Macros/VBA/__SRP_0' | None | None | | |
| 36 | +| 'Macros/VBA/__SRP_1' | None | None | | |
| 37 | +| 'Macros/VBA/__SRP_2' | None | None | | |
| 38 | +| 'Macros/VBA/__SRP_3' | None | None | | |
| 39 | +| 'Macros/VBA/dir' | None | None | | |
| 40 | +| 'WordDocument' | None | None | | |
| 41 | ++----------------------------+---------------------+---------------------+</code></pre> | |
| 37 | 42 | <h2 id="how-to-use-oletimes-in-python-applications">How to use oletimes in Python applications</h2> |
| 38 | 43 | <p>TODO</p> |
| 39 | 44 | <hr /> | ... | ... |
oletools/doc/oletimes.md
| ... | ... | @@ -19,24 +19,29 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4 |
| 19 | 19 | :::text |
| 20 | 20 | >oletimes.py DIAN_caso-5415.doc |
| 21 | 21 | |
| 22 | - - Root mtime=2014-05-14 12:45:24.752000 ctime=None | |
| 23 | - - '\x01CompObj': mtime=None ctime=None | |
| 24 | - - '\x05DocumentSummaryInformation': mtime=None ctime=None | |
| 25 | - - '\x05SummaryInformation': mtime=None ctime=None | |
| 26 | - - '1Table': mtime=None ctime=None | |
| 27 | - - 'Data': mtime=None ctime=None | |
| 28 | - - 'Macros': mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000 | |
| 29 | - - 'Macros/PROJECT': mtime=None ctime=None | |
| 30 | - - 'Macros/PROJECTwm': mtime=None ctime=None | |
| 31 | - - 'Macros/VBA': mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000 | |
| 32 | - - 'Macros/VBA/ThisDocument': mtime=None ctime=None | |
| 33 | - - 'Macros/VBA/_VBA_PROJECT': mtime=None ctime=None | |
| 34 | - - 'Macros/VBA/__SRP_0': mtime=None ctime=None | |
| 35 | - - 'Macros/VBA/__SRP_1': mtime=None ctime=None | |
| 36 | - - 'Macros/VBA/__SRP_2': mtime=None ctime=None | |
| 37 | - - 'Macros/VBA/__SRP_3': mtime=None ctime=None | |
| 38 | - - 'Macros/VBA/dir': mtime=None ctime=None | |
| 39 | - - 'WordDocument': mtime=None ctime=None | |
| 22 | + +----------------------------+---------------------+---------------------+ | |
| 23 | + | Stream/Storage name | Modification Time | Creation Time | | |
| 24 | + +----------------------------+---------------------+---------------------+ | |
| 25 | + | Root | 2014-05-14 12:45:24 | None | | |
| 26 | + | '\x01CompObj' | None | None | | |
| 27 | + | '\x05DocumentSummaryInform | None | None | | |
| 28 | + | ation' | | | | |
| 29 | + | '\x05SummaryInformation' | None | None | | |
| 30 | + | '1Table' | None | None | | |
| 31 | + | 'Data' | None | None | | |
| 32 | + | 'Macros' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 | | |
| 33 | + | 'Macros/PROJECT' | None | None | | |
| 34 | + | 'Macros/PROJECTwm' | None | None | | |
| 35 | + | 'Macros/VBA' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 | | |
| 36 | + | 'Macros/VBA/ThisDocument' | None | None | | |
| 37 | + | 'Macros/VBA/_VBA_PROJECT' | None | None | | |
| 38 | + | 'Macros/VBA/__SRP_0' | None | None | | |
| 39 | + | 'Macros/VBA/__SRP_1' | None | None | | |
| 40 | + | 'Macros/VBA/__SRP_2' | None | None | | |
| 41 | + | 'Macros/VBA/__SRP_3' | None | None | | |
| 42 | + | 'Macros/VBA/dir' | None | None | | |
| 43 | + | 'WordDocument' | None | None | | |
| 44 | + +----------------------------+---------------------+---------------------+ | |
| 40 | 45 | |
| 41 | 46 | ## How to use oletimes in Python applications |
| 42 | 47 | ... | ... |
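Note on the oletimes output above: OLE directory entries store creation and modification times as Windows FILETIME values (100-nanosecond ticks since 1601-01-01 UTC), and an unset timestamp is zero. The sketch below shows how such a value could be converted and printed with the second-level precision seen in the new table. It is an illustration under those assumptions, not the code used by oletimes.py; the function names are hypothetical.

```python
from datetime import datetime, timedelta

# FILETIME counts 100 ns ticks from this date.
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime):
    """Convert a FILETIME integer to a naive UTC datetime, or None if unset."""
    if not filetime:
        return None
    # 10 ticks of 100 ns make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def format_timestamp(dt):
    """Render a timestamp without fractional seconds, or 'None' if unset."""
    if dt is None:
        return "None"
    return dt.strftime("%Y-%m-%d %H:%M:%S")


unix_epoch = filetime_to_datetime(116444736000000000)
unset = format_timestamp(filetime_to_datetime(0))
root_mtime = format_timestamp(datetime(2014, 5, 14, 12, 45, 24, 752000))
```

Dropping the microseconds keeps each table column at a fixed width, at the cost of hiding sub-second differences such as the ones visible in the old output.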
oletools/doc/olevba.html
| ... | ... | @@ -8,7 +8,7 @@ |
| 8 | 8 | </head> |
| 9 | 9 | <body> |
| 10 | 10 | <h1 id="olevba">olevba</h1> |
| 11 | -<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc).</p> | |
| 11 | +<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc). It also detects and decodes several common <strong>obfuscation methods including Hex encoding, StrReverse, Base64, Dridex</strong>, and extracts IOCs from decoded strings.</p> | |
| 12 | 12 | <p>It can be used either as a command-line tool, or as a python module from your own applications.</p> |
| 13 | 13 | <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p> |
| 14 | 14 | <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a> by John William Davison, with significant modifications.</p> |
| ... | ... | @@ -24,14 +24,26 @@ |
| 24 | 24 | <li>Extract VBA macro source code</li> |
| 25 | 25 | <li>Detect auto-executable macros</li> |
| 26 | 26 | <li>Detect suspicious VBA keywords often used by malware</li> |
| 27 | +<li>Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex</li> | |
| 27 | 28 | <li>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</li> |
| 28 | 29 | <li>Scan multiple files and sample collections (wildcards, recursive)</li> |
| 30 | +<li>Triage mode for a summary view of multiple files</li> | |
| 29 | 31 | <li>Scan malware samples in password-protected Zip archives</li> |
| 30 | 32 | <li>Python API to use olevba from your applications</li> |
| 31 | 33 | </ul> |
| 32 | 34 | <p>MS Office files encrypted with a password are also supported, because VBA macro code is never encrypted, only the content of the document.</p> |
| 33 | 35 | <h2 id="about-vba-macros">About VBA Macros</h2> |
| 34 | 36 | <p>See <a href="http://www.decalage.info/en/vba_tools">this article</a> for more information and technical details about VBA Macros and how they are stored in MS Office documents.</p> |
| 37 | +<h2 id="how-it-works">How it works</h2> | |
| 38 | +<ol style="list-style-type: decimal"> | |
| 39 | +<li>olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.</li> | |
| 40 | +<li>If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.</li> | |
| 41 | +<li>olevba identifies all the VBA projects stored in the OLE structure.</li> | |
| 42 | +<li>Each VBA project is parsed to find the corresponding OLE streams containing macro code.</li> | |
| 43 | +<li>In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).</li> | |
| 44 | +<li>olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).</li> | |
| 45 | +<li>olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).</li> | |
| 46 | +</ol> | |
| 35 | 47 | <h2 id="usage">Usage</h2> |
| 36 | 48 | <pre><code>Usage: olevba.py [options] <filename> [filename2 ...] |
| 37 | 49 | |
| ... | ... | @@ -44,9 +56,34 @@ Options: |
| 44 | 56 | -f ZIP_FNAME, --zipfname=ZIP_FNAME |
| 45 | 57 | if the file is a zip archive, file(s) to be opened |
| 46 | 58 | within the zip. Wildcards * and ? are supported. |
| 47 | - (default:*)</code></pre> | |
| 48 | -<h3 id="example">Example</h3> | |
| 49 | -<p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p> | |
| 59 | + (default:*) | |
| 60 | + -t triage mode, display results as a summary table | |
| 61 | + (default for multiple files) | |
| 62 | + -d detailed mode, display full results (default for | |
| 63 | + single file) | |
| 64 | + -i INPUT, --input=INPUT | |
| 65 | + input file containing VBA source code to be analyzed | |
| 66 | + (no parsing) | |
| 67 | + --hex display all the hex-encoded strings with their decoded | |
| 68 | + content.</code></pre> | |
| 69 | +<h3 id="examples">Examples</h3> | |
| 70 | +<p>Scan a single file:</p> | |
| 71 | +<pre><code>olevba.py file.doc</code></pre> | |
| 72 | +<p>Scan a single file, stored in a Zip archive with password "infected":</p> | |
| 73 | +<pre><code>olevba.py malicious_file.xls.zip -z infected</code></pre> | |
| 74 | +<p>Scan a single file, showing all obfuscated strings decoded:</p> | |
| 75 | +<pre><code>olevba.py file.doc --hex</code></pre> | |
| 76 | +<p>Scan VBA source code extracted into a text file:</p> | |
| 77 | +<pre><code>olevba.py -i source_code.vba</code></pre> | |
| 78 | +<p>Scan a collection of files stored in a folder:</p> | |
| 79 | +<pre><code>olevba.py MalwareZoo/VBA/*</code></pre> | |
| 80 | +<p>Scan all .doc and .xls files, recursively in all subfolders:</p> | |
| 81 | +<pre><code>olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r</code></pre> | |
| 82 | +<p>Scan all .doc files within all .zip files with password, recursively:</p> | |
| 83 | +<pre><code>olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc</code></pre> | |
| 84 | +<h3 id="detailed-analysis-mode-default-for-single-file">Detailed analysis mode (default for single file)</h3> | |
| 85 | +<p>When a single file is scanned, or when using the option -d, all details of the analysis are displayed.</p> | |
| 86 | +<p>For example, checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p> | |
| 50 | 87 | <pre><code>>olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected |
| 51 | 88 | =============================================================================== |
| 52 | 89 | FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip |
| ... | ... | @@ -108,22 +145,63 @@ ANALYSIS: |
| 108 | 145 | | IOC | test.exe | Executable file name | |
| 109 | 146 | | IOC | sfjozjero.exe | Executable file name | |
| 110 | 147 | +------------+----------------------+-----------------------------------------+</code></pre> |
| 148 | +<h3 id="triage-mode-default-for-multiple-files">Triage mode (default for multiple files)</h3> | |
| 149 | +<p>When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. This is more convenient for quick triage of a collection of suspicious files.</p> | |
| 150 | +<p>The following flags show the results of the analysis:</p> | |
| 151 | +<ul> | |
| 152 | +<li><strong>OLE</strong>: the file type is OLE, for example MS Office 97-2003</li> | |
| 153 | +<li><strong>OpX</strong>: the file type is OpenXML, for example MS Office 2007+</li> | |
| 154 | +<li><strong>?</strong>: the file type is not supported</li> | |
| 155 | +<li><strong>M</strong>: contains VBA Macros</li> | |
| 156 | +<li><strong>A</strong>: auto-executable macros</li> | |
| 157 | +<li><strong>S</strong>: suspicious VBA keywords</li> | |
| 158 | +<li><strong>I</strong>: potential IOCs</li> | |
| 159 | +<li><strong>H</strong>: hex-encoded strings (potential obfuscation)</li> | |
| 160 | +<li><strong>B</strong>: Base64-encoded strings (potential obfuscation)</li> | |
| 161 | +<li><strong>D</strong>: Dridex-encoded strings (potential obfuscation)</li> | |
| 162 | +</ul> | |
| 163 | +<p>Here is an example:</p> | |
| 164 | +<pre><code>c:\>olevba.py \MalwareZoo\VBA\samples\* | |
| 165 | +Flags Filename | |
| 166 | +----------- ----------------------------------------------------------------- | |
| 167 | +OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware | |
| 168 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware | |
| 169 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware | |
| 170 | +OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware | |
| 171 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware | |
| 172 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware | |
| 173 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware | |
| 174 | +OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware | |
| 175 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware | |
| 176 | +OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware | |
| 177 | +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware | |
| 178 | +OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc | |
| 179 | +OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc | |
| 180 | +OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware | |
| 181 | +OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware | |
| 182 | +OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre> | |
| 183 | +<hr /> | |
| 111 | 184 | <h2 id="how-to-use-olevba-in-python-applications">How to use olevba in Python applications</h2> |
| 112 | 185 | <p>olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code from your own python applications.</p> |
| 186 | +<p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p> | |
| 113 | 187 | <h3 id="import-olevba">Import olevba</h3> |
| 114 | -<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser class:</p> | |
| 115 | -<pre><code>from oletools.olevba import VBA_Parser</code></pre> | |
| 188 | +<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p> | |
| 189 | +<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre> | |
| 116 | 190 | <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3> |
| 117 | -<p>Create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual filename may be provided as a second parameter, if available.</p> | |
| 191 | +<p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. For example:</p> | |
| 118 | 192 | <pre><code>vba = VBA_Parser('my_file_with_macros.doc')</code></pre> |
| 193 | +<p>The file may also be provided as a bytes string containing its data. In that case, the actual filename must still be provided for reference, and the file content is passed with the data parameter. For example:</p> | |
| 194 | +<pre><code>myfile = 'my_file_with_macros.doc' | |
| 195 | +filedata = open(myfile, 'rb').read() | |
| 196 | +vba = VBA_Parser(myfile, data=filedata)</code></pre> | |
| 119 | 197 | <p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML (MS Office 2007+).</p> |
| 120 | 198 | <h3 id="detect-vba-macros">Detect VBA macros</h3> |
| 121 | -<p>The method <strong>detect_vba_macros</strong> returns True if VBA macros have been found in the file, False otherwise.</p> | |
| 199 | +<p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file, False otherwise.</p> | |
| 122 | 200 | <pre><code>if vba.detect_vba_macros(): |
| 123 | 201 | print 'VBA Macros found' |
| 124 | 202 | else: |
| 125 | 203 | print 'No VBA Macros found'</code></pre> |
| 126 | -<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way.</p> | |
| 204 | +<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way which is not yet supported by olevba.</p> | |
| 127 | 205 | <p>Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.</p> |
| 128 | 206 | <h3 id="extract-vba-macro-source-code">Extract VBA Macro Source Code</h3> |
| 129 | 207 | <p>The method <strong>extract_macros</strong> extracts and decompresses source code for each VBA macro found in the file (possibly including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) for each VBA macro found.</p> |
| ... | ... | @@ -141,7 +219,29 @@ else: |
| 141 | 219 | print 'VBA filename:', vba_filename |
| 142 | 220 | print '- '*39 |
| 143 | 221 | print vba_code</code></pre> |
| 144 | -<h3 id="detect-auto-executable-macros">Detect auto-executable macros</h3> | |
| 222 | +<h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3> | |
| 223 | +<p>Note: this API is under active development and may change in the future.</p> | |
| 224 | +<p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p> | |
| 225 | +<p>First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the extract_macros method). Then call the methods <strong>scan</strong> or <strong>scan_summary</strong> to get the results of the analysis.</p> | |
| 226 | +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded strings found in the code (Hex, Base64, Dridex) with their decoded value.</p> | |
| 227 | +<p><strong>scan</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p> | |
| 228 | +<ul> | |
| 229 | +<li>type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.</li> | |
| 230 | +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is the decoded value of the string.</li> | |
| 231 | +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li> | |
| 232 | +</ul> | |
| 233 | +<p>Example:</p> | |
| 234 | +<pre><code>vba_scanner = VBA_Scanner(vba_code) | |
| 235 | +results = vba_scanner.scan(include_decoded_strings=True) | |
| 236 | +for kw_type, keyword, description in results: | |
| 237 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)</code></pre> | |
| 238 | +<p>The function <strong>scan_vba</strong> is a shortcut for VBA_Scanner(vba_code).scan():</p> | |
| 239 | +<pre><code>results = scan_vba(vba_code, include_decoded_strings=True) | |
| 240 | +for kw_type, keyword, description in results: | |
| 241 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)</code></pre> | |
| 242 | +<p><strong>scan_summary</strong> returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).</p> | |
| 243 | +<h3 id="detect-auto-executable-macros-deprecated">Detect auto-executable macros (deprecated)</h3> | |
| 244 | +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p> | |
| 145 | 245 | <p>The function <strong>detect_autoexec</strong> checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.</p> |
| 146 | 246 | <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)</p> |
| 147 | 247 | <p>Sample usage:</p> |
| ... | ... | @@ -153,7 +253,8 @@ if autoexec_keywords: |
| 153 | 253 | print '%s: %s' % (keyword, description) |
| 154 | 254 | else: |
| 155 | 255 | print 'Auto-executable macro keywords: None found'</code></pre> |
| 156 | -<h3 id="detect-suspicious-vba-keywords">Detect suspicious VBA keywords</h3> | |
| 256 | +<h3 id="detect-suspicious-vba-keywords-deprecated">Detect suspicious VBA keywords (deprecated)</h3> | |
| 257 | +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p> | |
| 157 | 258 | <p>The function <strong>detect_suspicious</strong> checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).</p> |
| 158 | 259 | <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)</p> |
| 159 | 260 | <p>Sample usage:</p> |
| ... | ... | @@ -165,7 +266,8 @@ if suspicious_keywords: |
| 165 | 266 | print '%s: %s' % (keyword, description) |
| 166 | 267 | else: |
| 167 | 268 | print 'Suspicious VBA keywords: None found'</code></pre> |
| 168 | -<h3 id="extract-potential-iocs">Extract potential IOCs</h3> | |
| 269 | +<h3 id="extract-potential-iocs-deprecated">Extract potential IOCs (deprecated)</h3> | |
| 270 | +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p> | |
| 169 | 271 | <p>The function <strong>detect_patterns</strong> checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.</p> |
| 170 | 272 | <p>It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)</p> |
| 171 | 273 | <p>Sample usage:</p> | ... | ... |
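The triage flags documented in the README.html changes above (a file type such as `OLE` or `OpX`, a colon, then the positions M, A, S, I, H, B, D) lend themselves to post-processing by a script. The following is a minimal sketch of how one line of triage output might be decoded, assuming the flag column always has the `TYPE:LETTERS` shape shown in the example table. The function name `parse_triage_flags` and the dictionary keys are hypothetical and are not part of olevba.

```python
# Hypothetical helper, not part of olevba: decode one triage flag column
# such as "OLE:MASIH--" into a dictionary of booleans.

# Flag letters in the order listed in the documentation: M, A, S, I, H, B, D.
# The key names on the right are made up for this sketch.
FLAG_NAMES = [
    ("M", "vba_macros"),
    ("A", "autoexec"),
    ("S", "suspicious_keywords"),
    ("I", "iocs"),
    ("H", "hex_strings"),
    ("B", "base64_strings"),
    ("D", "dridex_strings"),
]


def parse_triage_flags(flags):
    """Split 'TYPE:LETTERS' and map each letter position to True/False."""
    file_type, _, letters = flags.partition(":")
    # Pad with '-' so that a short column still yields every key.
    letters = letters.ljust(len(FLAG_NAMES), "-")
    result = {"file_type": file_type}
    for (letter, name), found in zip(FLAG_NAMES, letters):
        # A position is set when it shows its own letter instead of '-'.
        result[name] = (found == letter)
    return result


# Example taken from the triage table above (ROVNIX.doc.malware).
info = parse_triage_flags("OLE:MASI-B-")
```

With this sketch, `info["base64_strings"]` is true and `info["hex_strings"]` is false for the ROVNIX sample, which matches the `B` and `-` positions in the table. How olevba renders the unsupported `?` file type in this column is not shown in the diff, so the sketch makes no assumption about it beyond splitting on the colon.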
oletools/doc/olevba.md
| ... | ... | @@ -5,7 +5,8 @@ olevba is a script to parse OLE and OpenXML files such as MS Office documents |
| 5 | 5 | (e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text, |
| 6 | 6 | and detect security-related patterns such as **auto-executable macros**, **suspicious |
| 7 | 7 | VBA keywords** used by malware, and potential **IOCs** (IP addresses, URLs, executable |
| 8 | -filenames, etc). | |
| 8 | +filenames, etc). It also detects and decodes several common **obfuscation methods including Hex encoding, | |
| 9 | +StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings. | |
| 9 | 10 | |
| 10 | 11 | It can be used either as a command-line tool, or as a python module from your own applications. |
| 11 | 12 | |
| ... | ... | @@ -26,8 +27,10 @@ by John William Davison, with significant modifications. |
| 26 | 27 | - Extract VBA macro source code |
| 27 | 28 | - Detect auto-executable macros |
| 28 | 29 | - Detect suspicious VBA keywords often used by malware |
| 30 | +- Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex | |
| 29 | 31 | - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names |
| 30 | 32 | - Scan multiple files and sample collections (wildcards, recursive) |
| 33 | +- Triage mode for a summary view of multiple files | |
| 31 | 34 | - Scan malware samples in password-protected Zip archives |
| 32 | 35 | - Python API to use olevba from your applications |
| 33 | 36 | |
| ... | ... | @@ -39,6 +42,18 @@ encrypted, only the content of the document. |
| 39 | 42 | See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros |
| 40 | 43 | and how they are stored in MS Office documents. |
| 41 | 44 | |
| 45 | +## How it works | |
| 46 | + | |
| 47 | +1. olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away. | |
| 48 | +1. If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them. | |
| 49 | +1. olevba identifies all the VBA projects stored in the OLE structure. | |
| 50 | +1. Each VBA project is parsed to find the corresponding OLE streams containing macro code. | |
| 51 | +1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression). | |
| 52 | +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex). | |
| 53 | +1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros | |
| 54 | +and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc). | |
| 55 | + | |
| 56 | + | |
| 42 | 57 | ## Usage |
| 43 | 58 | |
| 44 | 59 | :::text |
| ... | ... | @@ -54,10 +69,59 @@ and how they are stored in MS Office documents. |
| 54 | 69 | if the file is a zip archive, file(s) to be opened |
| 55 | 70 | within the zip. Wildcards * and ? are supported. |
| 56 | 71 | (default:*) |
| 72 | + -t triage mode, display results as a summary table | |
| 73 | + (default for multiple files) | |
| 74 | + -d detailed mode, display full results (default for | |
| 75 | + single file) | |
| 76 | + -i INPUT, --input=INPUT | |
| 77 | + input file containing VBA source code to be analyzed | |
| 78 | + (no parsing) | |
| 79 | + --hex display all the hex-encoded strings with their decoded | |
| 80 | + content. | |
| 57 | 81 | |
| 58 | -### Example | |
| 82 | +### Examples | |
| 83 | + | |
| 84 | +Scan a single file: | |
| 85 | + | |
| 86 | + :::text | |
| 87 | + olevba.py file.doc | |
| 88 | + | |
| 89 | +Scan a single file, stored in a Zip archive with password "infected": | |
| 90 | + | |
| 91 | + :::text | |
| 92 | + olevba.py malicious_file.xls.zip -z infected | |
| 93 | + | |
| 94 | +Scan a single file, showing all obfuscated strings decoded: | |
| 95 | + | |
| 96 | + :::text | |
| 97 | + olevba.py file.doc --hex | |
| 98 | + | |
| 99 | +Scan VBA source code extracted into a text file: | |
| 100 | + | |
| 101 | + :::text | |
| 102 | + olevba.py -i source_code.vba | |
| 103 | + | |
| 104 | +Scan a collection of files stored in a folder: | |
| 59 | 105 | |
| 60 | -Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/): | |
| 106 | + :::text | |
| 107 | + olevba.py MalwareZoo/VBA/* | |
| 108 | + | |
| 109 | +Scan all .doc and .xls files, recursively in all subfolders: | |
| 110 | + | |
| 111 | + :::text | |
| 112 | + olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r | |
| 113 | + | |
| 114 | +Scan all .doc files within all .zip files with password, recursively: | |
| 115 | + | |
| 116 | + :::text | |
| 117 | + olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc | |
| 118 | + | |
| 119 | + | |
| 120 | +### Detailed analysis mode (default for single file) | |
| 121 | + | |
| 122 | +When a single file is scanned, or when using the option -d, all details of the analysis are displayed. | |
| 123 | + | |
| 124 | +For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/): | |
| 61 | 125 | |
| 62 | 126 | :::text |
| 63 | 127 | >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected |
| ... | ... | @@ -122,33 +186,87 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4 |
| 122 | 186 | | IOC | sfjozjero.exe | Executable file name | |
| 123 | 187 | +------------+----------------------+-----------------------------------------+ |
| 124 | 188 | |
| 189 | +### Triage mode (default for multiple files) | |
| 190 | + | |
| 191 | +When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. | |
| 192 | +This is more convenient for quick triage of a collection of suspicious files. | |
| 193 | + | |
| 194 | +The following flags show the results of the analysis: | |
| 195 | + | |
| 196 | +- **OLE**: the file type is OLE, for example MS Office 97-2003 | |
| 197 | +- **OpX**: the file type is OpenXML, for example MS Office 2007+ | |
| 198 | +- **?**: the file type is not supported | |
| 199 | +- **M**: contains VBA Macros | |
| 200 | +- **A**: auto-executable macros | |
| 201 | +- **S**: suspicious VBA keywords | |
| 202 | +- **I**: potential IOCs | |
| 203 | +- **H**: hex-encoded strings (potential obfuscation) | |
| 204 | +- **B**: Base64-encoded strings (potential obfuscation) | |
| 205 | +- **D**: Dridex-encoded strings (potential obfuscation) | |
| 206 | + | |
| 207 | +Here is an example: | |
| 208 | + | |
| 209 | + :::text | |
| 210 | + c:\>olevba.py \MalwareZoo\VBA\samples\* | |
| 211 | + Flags Filename | |
| 212 | + ----------- ----------------------------------------------------------------- | |
| 213 | + OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware | |
| 214 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware | |
| 215 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware | |
| 216 | + OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware | |
| 217 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware | |
| 218 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware | |
| 219 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware | |
| 220 | + OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware | |
| 221 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware | |
| 222 | + OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware | |
| 223 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware | |
| 224 | + OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc | |
| 225 | + OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc | |
| 226 | + OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware | |
| 227 | + OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware | |
| 228 | + OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc | |
| 229 | + | |
| 230 | + | |
| 231 | +-------------------------------------------------------------------------- | |
| 232 | + | |
| 125 | 233 | ## How to use olevba in Python applications |
| 126 | 234 | |
| 127 | 235 | olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code |
| 128 | 236 | from your own python applications. |
| 129 | 237 | |
| 238 | +IMPORTANT: olevba is currently under active development, therefore this API is likely to change. | |
| 239 | + | |
| 130 | 240 | ### Import olevba |
| 131 | 241 | |
| 132 | -First, import the **oletools.olevba** package, using at least the VBA_Parser class: | |
| 242 | +First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes: | |
| 133 | 243 | |
| 134 | 244 | :::python |
| 135 | - from oletools.olevba import VBA_Parser | |
| 245 | + from oletools.olevba import VBA_Parser, VBA_Scanner | |
| 136 | 246 | |
| 137 | 247 | ### Parse a MS Office file |
| 138 | 248 | |
| 139 | -Create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter. | |
| 140 | -The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual | |
| 141 | -filename may be provided as a second parameter, if available. | |
| 249 | +To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter. | |
| 250 | +For example: | |
| 142 | 251 | |
| 143 | 252 | :::python |
| 144 | 253 | vba = VBA_Parser('my_file_with_macros.doc') |
| 254 | + | |
| 255 | +The file may also be provided as a bytes string containing its data. In that case, the actual | |
| 256 | +filename must still be provided for reference, and the file content is passed with the data parameter. For example: | |
| 257 | + | |
| 258 | + :::python | |
| 259 | + myfile = 'my_file_with_macros.doc' | |
| 260 | + filedata = open(myfile, 'rb').read() | |
| 261 | + vba = VBA_Parser(myfile, data=filedata) | |
| 145 | 262 | |
| 146 | 263 | VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML |
| 147 | 264 | (MS Office 2007+). |
| 148 | 265 | |
| 149 | 266 | ### Detect VBA macros |
| 150 | 267 | |
| 151 | -The method **detect_vba_macros** returns True if VBA macros have been found in the file, False otherwise. | |
| 268 | +The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file, | |
| 269 | +False otherwise. | |
| 152 | 270 | |
| 153 | 271 | :::python |
| 154 | 272 | if vba.detect_vba_macros(): |
| ... | ... | @@ -158,7 +276,7 @@ The method **detect_vba_macros** returns True if VBA macros have been found in t |
| 158 | 276 | |
| 159 | 277 | Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine |
| 160 | 278 | for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will |
| 161 | -always return False because VBA Macros are stored in a different way. | |
| 279 | +always return False because VBA Macros are stored in a different way which is not yet supported by olevba. | |
| 162 | 280 | |
| 163 | 281 | Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method |
| 164 | 282 | may return True if the embedded document contains VBA Macros, even if the main document does not. |
| ... | ... | @@ -186,8 +304,49 @@ Example: |
| 186 | 304 | print 'VBA filename:', vba_filename |
| 187 | 305 | print '- '*39 |
| 188 | 306 | print vba_code |
| 189 | - | |
| 190 | -### Detect auto-executable macros | |
| 307 | + | |
| 308 | +### Analyze VBA Source Code | |
| 309 | + | |
| 310 | +Note: this API is under active development and may change in the future. | |
| 311 | + | |
| 312 | +The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings, | |
| 313 | +suspicious keywords, IOCs, auto-executable macros, etc. | |
| 314 | + | |
| 315 | +First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the | |
| 316 | +extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis. | |
| 317 | + | |
| 318 | +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 319 | +strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 320 | + | |
| 321 | +**scan** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 322 | + | |
| 323 | +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'. | |
| 324 | +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 325 | + the decoded value of the string. | |
| 326 | +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 327 | + | |
| 328 | +Example: | |
| 329 | + | |
| 330 | + :::python | |
| 331 | + vba_scanner = VBA_Scanner(vba_code) | |
| 332 | + results = vba_scanner.scan(include_decoded_strings=True) | |
| 333 | + for kw_type, keyword, description in results: | |
| 334 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 335 | + | |
| 336 | +The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan(): | |
| 337 | + | |
| 338 | + :::python | |
| 339 | + results = scan_vba(vba_code, include_decoded_strings=True) | |
| 340 | + for kw_type, keyword, description in results: | |
| 341 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 342 | + | |
| 343 | +**scan_summary** returns a tuple with the number of items found for each category: | |
| 344 | +(autoexec, suspicious, IOCs, hex, base64, dridex). | |
| 345 | + | |
| 346 | + | |
| 347 | +### Detect auto-executable macros (deprecated) | |
| 348 | + | |
| 349 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 191 | 350 | |
| 192 | 351 | The function **detect_autoexec** checks if VBA macro code contains specific macro names |
| 193 | 352 | that will be triggered when the document/workbook is opened, closed, changed, etc. |
| ... | ... | @@ -208,7 +367,9 @@ Sample usage: |
| 208 | 367 | print 'Auto-executable macro keywords: None found' |
| 209 | 368 | |
| 210 | 369 | |
| 211 | -### Detect suspicious VBA keywords | |
| 370 | +### Detect suspicious VBA keywords (deprecated) | |
| 371 | + | |
| 372 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 212 | 373 | |
| 213 | 374 | The function **detect_suspicious** checks if VBA macro code contains specific |
| 214 | 375 | keywords often used by malware to act on the system (create files, run |
| ... | ... | @@ -230,7 +391,9 @@ Sample usage: |
| 230 | 391 | print 'Suspicious VBA keywords: None found' |
| 231 | 392 | |
| 232 | 393 | |
| 233 | -### Extract potential IOCs | |
| 394 | +### Extract potential IOCs (deprecated) | |
| 395 | + | |
| 396 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 234 | 397 | |
| 235 | 398 | The function **detect_patterns** checks if VBA macro code contains specific |
| 236 | 399 | patterns of interest, that may be useful for malware analysis and detection | ... | ... |
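The olevba.md changes above describe `scan` as returning a list of `(type, keyword, description)` tuples and `scan_summary` as returning per-category counts in the order (autoexec, suspicious, IOCs, hex, base64, dridex). As a rough illustration of how the two relate, the sketch below tallies a list of such tuples into a count tuple. It does not call olevba and is not its implementation of `scan_summary`. The function name `summarize` is hypothetical, and the sample rows are made up for the example, apart from the two executable file names, which appear in the detailed analysis output.

```python
from collections import Counter

# Result types in the order documented for scan_summary.
CATEGORIES = [
    'AutoExec', 'Suspicious', 'IOC',
    'Hex String', 'Base64 String', 'Dridex String',
]


def summarize(results):
    """Count (type, keyword, description) tuples per documented category."""
    counts = Counter(kw_type for kw_type, _keyword, _description in results)
    # Counter returns 0 for categories that never occurred.
    return tuple(counts[category] for category in CATEGORIES)


# Illustrative data shaped like the documented scan() output.
sample_results = [
    ('AutoExec', 'AutoOpen', 'illustrative description'),
    ('Suspicious', 'Shell', 'illustrative description'),
    ('IOC', 'test.exe', 'Executable file name'),
    ('IOC', 'sfjozjero.exe', 'Executable file name'),
]

summary = summarize(sample_results)
```

Keeping the category order in a single list means the position of each count can be checked against the documented tuple order at a glance.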