Commit 430c7b98a68cf38ed3f67b8a23209d431aba0d0b

Authored by Philippe Lagadec
1 parent cdbcd101

updated doc and readme

README.md
@@ -22,7 +22,7 @@ Note: python-oletools is not related to OLETools published by BeCubed Software.
22 News 22 News
23 ---- 23 ----
24 24
25 -- **2015-02-05 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings 25 +- **2015-02-08 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings
26 obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western 26 obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western
27 codepages with olefile 0.42, improved API and display, several bugfixes. 27 codepages with olefile 0.42, improved API and display, several bugfixes.
28 - 2015-01-05 v0.07: improved [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) to detect suspicious 28 - 2015-01-05 v0.07: improved [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) to detect suspicious
oletools/README.html
@@ -13,7 +13,7 @@
13 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p> 13 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p>
14 <h2 id="news">News</h2> 14 <h2 id="news">News</h2>
15 <ul> 15 <ul>
16 -<li><strong>2015-02-05 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li> 16 +<li><strong>2015-02-08 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li>
17 <li>2015-01-05 v0.07: improved <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> to detect suspicious keywords and IOCs in VBA macros, can now scan several files and open password-protected zip archives, added a Python API, upgraded OleFileIO_PL to olefile v0.41</li> 17 <li>2015-01-05 v0.07: improved <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> to detect suspicious keywords and IOCs in VBA macros, can now scan several files and open password-protected zip archives, added a Python API, upgraded OleFileIO_PL to olefile v0.41</li>
18 <li>2014-08-28 v0.06: added <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved <a href="https://bitbucket.org/decalage/oletools/wiki">documentation</a></li> 18 <li>2014-08-28 v0.06: added <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved <a href="https://bitbucket.org/decalage/oletools/wiki">documentation</a></li>
19 <li>2013-07-24 v0.05: added new tools <a href="https://bitbucket.org/decalage/oletools/wiki/olemeta">olemeta</a> and <a href="https://bitbucket.org/decalage/oletools/wiki/oletimes">oletimes</a></li> 19 <li>2013-07-24 v0.05: added new tools <a href="https://bitbucket.org/decalage/oletools/wiki/olemeta">olemeta</a> and <a href="https://bitbucket.org/decalage/oletools/wiki/oletimes">oletimes</a></li>
oletools/README.rst
@@ -27,7 +27,7 @@ Software.
27 News 27 News
28 ---- 28 ----
29 29
30 -- **2015-02-05 v0.08**: 30 +- **2015-02-08 v0.08**:
31 `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`_ can 31 `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`_ can
32 now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and 32 now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and
33 extract IOCs. Added new triage mode, support for non-western 33 extract IOCs. Added new triage mode, support for non-western
oletools/doc/Home.html
@@ -7,7 +7,7 @@
7 <title></title> 7 <title></title>
8 </head> 8 </head>
9 <body> 9 <body>
10 -<h1 id="python-oletools-v0.07-documentation">python-oletools v0.07 documentation</h1> 10 +<h1 id="python-oletools-v0.08-documentation">python-oletools v0.08 documentation</h1>
11 <p>This is the home page of the documentation for python-oletools. The latest version can be found <a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> 11 <p>This is the home page of the documentation for python-oletools. The latest version can be found <a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
12 <p><a href="http://www.decalage.info/python/oletools">python-oletools</a> is a package of python tools to analyze <a href="http://en.wikipedia.org/wiki/Compound_File_Binary_Format">Microsoft OLE2 files</a> (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the <a href="http://www.decalage.info/olefile">olefile</a> parser. See <a href="http://www.decalage.info/python/oletools">http://www.decalage.info/python/oletools</a> for more info.</p> 12 <p><a href="http://www.decalage.info/python/oletools">python-oletools</a> is a package of python tools to analyze <a href="http://en.wikipedia.org/wiki/Compound_File_Binary_Format">Microsoft OLE2 files</a> (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the <a href="http://www.decalage.info/olefile">olefile</a> parser. See <a href="http://www.decalage.info/python/oletools">http://www.decalage.info/python/oletools</a> for more info.</p>
13 <p><strong>Quick links:</strong> <a href="http://www.decalage.info/python/oletools">Home page</a> - <a href="https://bitbucket.org/decalage/oletools/wiki/Install">Download/Install</a> - <a href="https://bitbucket.org/decalage/oletools/wiki">Documentation</a> - <a href="https://bitbucket.org/decalage/oletools/issues?status=new&amp;status=open">Report Issues/Suggestions/Questions</a> - <a href="http://decalage.info/contact">Contact the author</a> - <a href="https://bitbucket.org/decalage/oletools">Repository</a> - <a href="https://twitter.com/decalage2">Updates on Twitter</a></p> 13 <p><strong>Quick links:</strong> <a href="http://www.decalage.info/python/oletools">Home page</a> - <a href="https://bitbucket.org/decalage/oletools/wiki/Install">Download/Install</a> - <a href="https://bitbucket.org/decalage/oletools/wiki">Documentation</a> - <a href="https://bitbucket.org/decalage/oletools/issues?status=new&amp;status=open">Report Issues/Suggestions/Questions</a> - <a href="http://decalage.info/contact">Contact the author</a> - <a href="https://bitbucket.org/decalage/oletools">Repository</a> - <a href="https://twitter.com/decalage2">Updates on Twitter</a></p>
@@ -18,7 +18,7 @@
18 <li><strong><a href="oleid.html">oleid</a></strong>: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.</li> 18 <li><strong><a href="oleid.html">oleid</a></strong>: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.</li>
19 <li><strong><a href="olemeta.html">olemeta</a></strong>: a tool to extract all standard properties (metadata) from OLE files.</li> 19 <li><strong><a href="olemeta.html">olemeta</a></strong>: a tool to extract all standard properties (metadata) from OLE files.</li>
20 <li><strong><a href="oletimes.html">oletimes</a></strong>: a tool to extract creation and modification timestamps of all streams and storages.</li> 20 <li><strong><a href="oletimes.html">oletimes</a></strong>: a tool to extract creation and modification timestamps of all streams and storages.</li>
21 -<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).</li> 21 +<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).</li>
22 <li><strong><a href="pyxswf.html">pyxswf</a></strong>: a tool to detect, extract and analyze Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is especially useful for malware analysis.</li> 22 <li><strong><a href="pyxswf.html">pyxswf</a></strong>: a tool to detect, extract and analyze Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is especially useful for malware analysis.</li>
23 <li><strong><a href="rtfobj.html">rtfobj</a></strong>: a tool and python module to extract embedded objects from RTF files.</li> 23 <li><strong><a href="rtfobj.html">rtfobj</a></strong>: a tool and python module to extract embedded objects from RTF files.</li>
24 <li>and a few others (coming soon)</li> 24 <li>and a few others (coming soon)</li>
oletools/doc/Home.md
1 -python-oletools v0.07 documentation 1 +python-oletools v0.08 documentation
2 =================================== 2 ===================================
3 3
4 This is the home page of the documentation for python-oletools. The latest version can be found 4 This is the home page of the documentation for python-oletools. The latest version can be found
@@ -29,7 +29,7 @@ Tools in python-oletools:
29 - **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files. 29 - **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.
30 - **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files. 30 - **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files.
31 - **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages. 31 - **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages.
32 -- **[[olevba]]**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML). 32 +- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
33 - **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may 33 - **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may
34 be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, 34 be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
35 which is especially useful for malware analysis. 35 which is especially useful for malware analysis.
oletools/doc/Install.html
@@ -9,14 +9,15 @@
9 <body> 9 <body>
10 <h1 id="how-to-download-and-install-python-oletools">How to Download and Install python-oletools</h1> 10 <h1 id="how-to-download-and-install-python-oletools">How to Download and Install python-oletools</h1>
11 <h2 id="pre-requisites">Pre-requisites</h2> 11 <h2 id="pre-requisites">Pre-requisites</h2>
12 -<p>For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet.</p>  
13 -<h2 id="for-command-line-tools">For command-line tools</h2>  
14 -<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice.</p>  
15 -<p>You may then add the directory to your PATH environment variable to access the tools from anywhere.</p>  
16 -<p>To get the latest development version, click on &quot;Download repository&quot; on the <a href="https://bitbucket.org/decalage/oletools/downloads">downloads page</a>, or use mercurial to clone the repository.</p> 12 +<p>For now, python-oletools require Python 2.x, if possible 2.6 or 2.7. They are not compatible with Python 3.x yet.</p>
  13 +<h2 id="to-use-oletools-as-command-line-tools">To use oletools as command-line tools</h2>
  14 +<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice. Pick the latest release version, or click on &quot;Download Repository&quot; to get the latest development version with the most recent features.</p>
  15 +<p>Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily in the future.</p>
  16 +<p>You may add the oletools directory to your PATH environment variable to access the tools from anywhere.</p>
17 <h2 id="for-python-applications">For python applications</h2> 17 <h2 id="for-python-applications">For python applications</h2>
18 -<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use &quot;<strong>pip install oletools</strong>&quot; or &quot;<strong>easy_install oletools</strong>&quot; to download and install the package in one go.</p>  
19 -<p>Otherwise you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run &quot;<strong>python setup.py install</strong>&quot;.</p> 18 +<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use <strong>&quot;pip install oletools&quot;</strong> or <strong>&quot;easy_install oletools&quot;</strong> to download and install the package in one go. Pip is included with Python since version 2.7.9.</p>
  19 +<p><strong>Important: to update oletools</strong> if it is already installed, you must run <strong>&quot;pip install -U oletools&quot;</strong>, otherwise pip will not update it.</p>
  20 +<p>Alternatively, you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run <strong>&quot;python setup.py install&quot;</strong>.</p>
20 <hr /> 21 <hr />
21 <h2 id="python-oletools-documentation">python-oletools documentation</h2> 22 <h2 id="python-oletools-documentation">python-oletools documentation</h2>
22 <ul> 23 <ul>
oletools/doc/Install.md
@@ -4,30 +4,35 @@ How to Download and Install python-oletools
4 Pre-requisites 4 Pre-requisites
5 -------------- 5 --------------
6 6
7 -For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet. 7 +For now, python-oletools require Python 2.x, if possible 2.6 or 2.7. They are not compatible with Python 3.x yet.
8 8
9 9
10 -For command-line tools  
11 ----------------------- 10 +To use oletools as command-line tools
  11 +-------------------------------------
12 12
13 To use python-oletools from the command line as analysis tools, you may simply 13 To use python-oletools from the command line as analysis tools, you may simply
14 [download the zip archive](https://bitbucket.org/decalage/oletools/downloads) 14 [download the zip archive](https://bitbucket.org/decalage/oletools/downloads)
15 -and extract the files in the directory of your choice. 15 +and extract the files in the directory of your choice. Pick the latest release version, or click on "Download Repository"
  16 +to get the latest development version with the most recent features.
16 17
17 -You may then add the directory to your PATH environment variable to access the tools from anywhere. 18 +Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily
  19 +in the future.
18 20
19 -To get the latest development version, click on "Download repository" on the  
20 -[downloads page](https://bitbucket.org/decalage/oletools/downloads), or use mercurial to clone the repository. 21 +You may add the oletools directory to your PATH environment variable to access the tools from anywhere.
21 22
22 23
23 For python applications 24 For python applications
24 ----------------------- 25 +-----------------------
25 26
26 If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use 27 If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use
27 -"**pip install oletools**" or "**easy_install oletools**" to download and install the package in one go. 28 +**"pip install oletools"** or **"easy_install oletools"** to download and install the package in one go. Pip is included
  29 +with Python since version 2.7.9.
28 30
29 -Otherwise you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary  
30 -directory and run "**python setup.py install**". 31 +**Important: to update oletools** if it is already installed, you must run **"pip install -U oletools"**, otherwise pip
  32 +will not update it.
  33 +
  34 +Alternatively, you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary
  35 +directory and run **"python setup.py install"**.
31 36
32 -------------------------------------------------------------------------- 37 --------------------------------------------------------------------------
33 38
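The pre-requisites stated in Install.md can be checked from a script before the tools are used. The following is a minimal sketch that assumes only what the text says (Python 2.x required, 2.6 or 2.7 preferred, 3.x not compatible at this release). The function name `oletools_support` and its return labels are hypothetical and are not part of oletools.

```python
import sys


def oletools_support(version_info):
    """Classify an interpreter version against the stated pre-requisites.

    Hypothetical helper: the labels are illustrative, not an oletools API.
    """
    major, minor = version_info[0], version_info[1]
    if major != 2:
        return "unsupported"   # Python 3.x is not compatible yet
    if minor in (6, 7):
        return "recommended"   # "if possible 2.6 or 2.7"
    return "may work"          # other 2.x versions: right series, not the preferred ones


# Classify the interpreter running this script.
print(oletools_support(sys.version_info))
```

Passing a tuple such as `(2, 7, 9)` instead of `sys.version_info` makes it possible to check a target environment without running on it.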
oletools/doc/oleid.html
@@ -39,19 +39,43 @@
39 <pre><code>C:\oletools&gt;oleid.py word_flash_vba.doc 39 <pre><code>C:\oletools&gt;oleid.py word_flash_vba.doc
40 40
41 Filename: word_flash_vba.doc 41 Filename: word_flash_vba.doc
42 -OLE format: True  
43 -Has SummaryInformation stream: True  
44 -Application name: Microsoft Office Word  
45 -Encrypted: False  
46 -Word Document: True  
47 -VBA Macros: True  
48 -Excel Workbook: False  
49 -PowerPoint Presentation: False  
50 -Visio Drawing: False  
51 -ObjectPool: True  
52 -Flash objects: 1</code></pre>  
53 -<h2 id="how-to-use-oleid-in-python-applications">How to use oleid in Python applications</h2>  
54 -<p>TODO</p> 42 ++-------------------------------+-----------------------+
  43 +| Indicator | Value |
  44 ++-------------------------------+-----------------------+
  45 +| OLE format | True |
  46 +| Has SummaryInformation stream | True |
  47 +| Application name | Microsoft Office Word |
  48 +| Encrypted | False |
  49 +| Word Document | True |
  50 +| VBA Macros | True |
  51 +| Excel Workbook | False |
  52 +| PowerPoint Presentation | False |
  53 +| Visio Drawing | False |
  54 +| ObjectPool | True |
  55 +| Flash objects | 1 |
  56 ++-------------------------------+-----------------------+</code></pre>
  57 +<h2 id="how-to-use-oleid-in-your-python-applications">How to use oleid in your Python applications</h2>
  58 +<p>First, import oletools.oleid, and create an <strong>OleID</strong> object to scan a file:</p>
  59 +<pre><code>import oletools.oleid
  60 +
  61 +oid = oletools.oleid.OleID(filename)</code></pre>
  62 +<p>Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.</p>
  63 +<p>Second, call the <strong>check()</strong> method. It returns a list of <strong>Indicator</strong> objects.</p>
  64 +<p>Each Indicator object has the following attributes:</p>
  65 +<ul>
  66 +<li><strong>id</strong>: str, identifier for the indicator</li>
  67 +<li><strong>name</strong>: str, name to display the indicator</li>
  68 +<li><strong>description</strong>: str, long description of the indicator</li>
  69 +<li><strong>type</strong>: class of the indicator (e.g. bool, str, int)</li>
  70 +<li><strong>value</strong>: value of the indicator</li>
  71 +</ul>
  72 +<p>For example, the following code displays all the indicators:</p>
  73 +<pre><code>indicators = oid.check()
  74 +for i in indicators:
  75 + print &#39;Indicator id=%s name=&quot;%s&quot; type=%s value=%s&#39; % (i.id, i.name, i.type, repr(i.value))
  76 + print &#39;description:&#39;, i.description
  77 + print &#39;&#39;</code></pre>
  78 +<p>See the source code of oleid.py for more details.</p>
55 <hr /> 79 <hr />
56 <h2 id="python-oletools-documentation">python-oletools documentation</h2> 80 <h2 id="python-oletools-documentation">python-oletools documentation</h2>
57 <ul> 81 <ul>
oletools/doc/oleid.md
@@ -42,21 +42,53 @@ Analyzing a Word document containing a Flash object and VBA macros:
42 C:\oletools>oleid.py word_flash_vba.doc 42 C:\oletools>oleid.py word_flash_vba.doc
43 43
44 Filename: word_flash_vba.doc 44 Filename: word_flash_vba.doc
45 - OLE format: True  
46 - Has SummaryInformation stream: True  
47 - Application name: Microsoft Office Word  
48 - Encrypted: False  
49 - Word Document: True  
50 - VBA Macros: True  
51 - Excel Workbook: False  
52 - PowerPoint Presentation: False  
53 - Visio Drawing: False  
54 - ObjectPool: True  
55 - Flash objects: 1  
56 -  
57 -## How to use oleid in Python applications  
58 -  
59 -TODO 45 + +-------------------------------+-----------------------+
  46 + | Indicator | Value |
  47 + +-------------------------------+-----------------------+
  48 + | OLE format | True |
  49 + | Has SummaryInformation stream | True |
  50 + | Application name | Microsoft Office Word |
  51 + | Encrypted | False |
  52 + | Word Document | True |
  53 + | VBA Macros | True |
  54 + | Excel Workbook | False |
  55 + | PowerPoint Presentation | False |
  56 + | Visio Drawing | False |
  57 + | ObjectPool | True |
  58 + | Flash objects | 1 |
  59 + +-------------------------------+-----------------------+
  60 +
  61 +## How to use oleid in your Python applications
  62 +
  63 +First, import oletools.oleid, and create an **OleID** object to scan a file:
  64 +
  65 + :::python
  66 + import oletools.oleid
  67 +
  68 + oid = oletools.oleid.OleID(filename)
  69 +
  70 +Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.
  71 +
  72 +Second, call the **check()** method. It returns a list of **Indicator** objects.
  73 +
  74 +Each Indicator object has the following attributes:
  75 +
  76 +- **id**: str, identifier for the indicator
  77 +- **name**: str, name to display the indicator
  78 +- **description**: str, long description of the indicator
  79 +- **type**: class of the indicator (e.g. bool, str, int)
  80 +- **value**: value of the indicator
  81 +
  82 +For example, the following code displays all the indicators:
  83 +
  84 + :::python
  85 + indicators = oid.check()
  86 + for i in indicators:
  87 + print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
  88 + print 'description:', i.description
  89 + print ''
  90 +
  91 +See the source code of oleid.py for more details.
60 92
61 -------------------------------------------------------------------------- 93 --------------------------------------------------------------------------
62 94
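The Indicator attributes documented in this change (id, name, description, type, value) lend themselves to simple post-processing. The sketch below is an illustration only: it uses a stand-in `Indicator` namedtuple so that it runs without oletools installed, the sample ids are made up, and the helper names `indicators_by_id` and `true_flags` are hypothetical.

```python
from collections import namedtuple

# Stand-in for the Indicator objects returned by OleID.check(), limited to the
# five attributes listed in the documentation. Assumption for illustration only.
Indicator = namedtuple("Indicator", ["id", "name", "description", "type", "value"])


def indicators_by_id(indicators):
    """Map each indicator id to its value for direct lookup."""
    return dict((i.id, i.value) for i in indicators)


def true_flags(indicators):
    """Return the display names of boolean indicators whose value is True."""
    return [i.name for i in indicators if i.type is bool and i.value]


# Sample data mirroring three rows of the oleid table; the ids are made up.
sample = [
    Indicator("vba", "VBA Macros", "", bool, True),
    Indicator("encrypted", "Encrypted", "", bool, False),
    Indicator("flash", "Flash objects", "", int, 1),
]
print(indicators_by_id(sample))
print(true_flags(sample))
```

With oletools installed, the list returned by `oletools.oleid.OleID(filename).check()` would take the place of `sample`.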
oletools/doc/oletimes.html
@@ -16,24 +16,29 @@
16 <p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p> 16 <p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p>
17 <pre><code>&gt;oletimes.py DIAN_caso-5415.doc 17 <pre><code>&gt;oletimes.py DIAN_caso-5415.doc
18 18
19 -- Root mtime=2014-05-14 12:45:24.752000 ctime=None  
20 -- &#39;\x01CompObj&#39;: mtime=None ctime=None  
21 -- &#39;\x05DocumentSummaryInformation&#39;: mtime=None ctime=None  
22 -- &#39;\x05SummaryInformation&#39;: mtime=None ctime=None  
23 -- &#39;1Table&#39;: mtime=None ctime=None  
24 -- &#39;Data&#39;: mtime=None ctime=None  
25 -- &#39;Macros&#39;: mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000  
26 -- &#39;Macros/PROJECT&#39;: mtime=None ctime=None  
27 -- &#39;Macros/PROJECTwm&#39;: mtime=None ctime=None  
28 -- &#39;Macros/VBA&#39;: mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000  
29 -- &#39;Macros/VBA/ThisDocument&#39;: mtime=None ctime=None  
30 -- &#39;Macros/VBA/_VBA_PROJECT&#39;: mtime=None ctime=None  
31 -- &#39;Macros/VBA/__SRP_0&#39;: mtime=None ctime=None  
32 -- &#39;Macros/VBA/__SRP_1&#39;: mtime=None ctime=None  
33 -- &#39;Macros/VBA/__SRP_2&#39;: mtime=None ctime=None  
34 -- &#39;Macros/VBA/__SRP_3&#39;: mtime=None ctime=None  
35 -- &#39;Macros/VBA/dir&#39;: mtime=None ctime=None  
36 -- &#39;WordDocument&#39;: mtime=None ctime=None</code></pre> 19 ++----------------------------+---------------------+---------------------+
  20 +| Stream/Storage name | Modification Time | Creation Time |
  21 ++----------------------------+---------------------+---------------------+
  22 +| Root | 2014-05-14 12:45:24 | None |
  23 +| &#39;\x01CompObj&#39; | None | None |
  24 +| &#39;\x05DocumentSummaryInform | None | None |
  25 +| ation&#39; | | |
  26 +| &#39;\x05SummaryInformation&#39; | None | None |
  27 +| &#39;1Table&#39; | None | None |
  28 +| &#39;Data&#39; | None | None |
  29 +| &#39;Macros&#39; | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  30 +| &#39;Macros/PROJECT&#39; | None | None |
  31 +| &#39;Macros/PROJECTwm&#39; | None | None |
  32 +| &#39;Macros/VBA&#39; | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  33 +| &#39;Macros/VBA/ThisDocument&#39; | None | None |
  34 +| &#39;Macros/VBA/_VBA_PROJECT&#39; | None | None |
  35 +| &#39;Macros/VBA/__SRP_0&#39; | None | None |
  36 +| &#39;Macros/VBA/__SRP_1&#39; | None | None |
  37 +| &#39;Macros/VBA/__SRP_2&#39; | None | None |
  38 +| &#39;Macros/VBA/__SRP_3&#39; | None | None |
  39 +| &#39;Macros/VBA/dir&#39; | None | None |
  40 +| &#39;WordDocument&#39; | None | None |
  41 ++----------------------------+---------------------+---------------------+</code></pre>
37 <h2 id="how-to-use-oletimes-in-python-applications">How to use oletimes in Python applications</h2> 42 <h2 id="how-to-use-oletimes-in-python-applications">How to use oletimes in Python applications</h2>
38 <p>TODO</p> 43 <p>TODO</p>
39 <hr /> 44 <hr />
oletools/doc/oletimes.md
@@ -19,24 +19,29 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4
19 :::text 19 :::text
20 >oletimes.py DIAN_caso-5415.doc 20 >oletimes.py DIAN_caso-5415.doc
21 21
22 - - Root mtime=2014-05-14 12:45:24.752000 ctime=None  
23 - - '\x01CompObj': mtime=None ctime=None  
24 - - '\x05DocumentSummaryInformation': mtime=None ctime=None  
25 - - '\x05SummaryInformation': mtime=None ctime=None  
26 - - '1Table': mtime=None ctime=None  
27 - - 'Data': mtime=None ctime=None  
28 - - 'Macros': mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000  
29 - - 'Macros/PROJECT': mtime=None ctime=None  
30 - - 'Macros/PROJECTwm': mtime=None ctime=None  
31 - - 'Macros/VBA': mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000  
32 - - 'Macros/VBA/ThisDocument': mtime=None ctime=None  
33 - - 'Macros/VBA/_VBA_PROJECT': mtime=None ctime=None  
34 - - 'Macros/VBA/__SRP_0': mtime=None ctime=None  
35 - - 'Macros/VBA/__SRP_1': mtime=None ctime=None  
36 - - 'Macros/VBA/__SRP_2': mtime=None ctime=None  
37 - - 'Macros/VBA/__SRP_3': mtime=None ctime=None  
38 - - 'Macros/VBA/dir': mtime=None ctime=None  
39 - - 'WordDocument': mtime=None ctime=None 22 + +----------------------------+---------------------+---------------------+
  23 + | Stream/Storage name | Modification Time | Creation Time |
  24 + +----------------------------+---------------------+---------------------+
  25 + | Root | 2014-05-14 12:45:24 | None |
  26 + | '\x01CompObj' | None | None |
  27 + | '\x05DocumentSummaryInform | None | None |
  28 + | ation' | | |
  29 + | '\x05SummaryInformation' | None | None |
  30 + | '1Table' | None | None |
  31 + | 'Data' | None | None |
  32 + | 'Macros' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  33 + | 'Macros/PROJECT' | None | None |
  34 + | 'Macros/PROJECTwm' | None | None |
  35 + | 'Macros/VBA' | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  36 + | 'Macros/VBA/ThisDocument' | None | None |
  37 + | 'Macros/VBA/_VBA_PROJECT' | None | None |
  38 + | 'Macros/VBA/__SRP_0' | None | None |
  39 + | 'Macros/VBA/__SRP_1' | None | None |
  40 + | 'Macros/VBA/__SRP_2' | None | None |
  41 + | 'Macros/VBA/__SRP_3' | None | None |
  42 + | 'Macros/VBA/dir' | None | None |
  43 + | 'WordDocument' | None | None |
  44 + +----------------------------+---------------------+---------------------+
40 45
41 ## How to use oletimes in Python applications 46 ## How to use oletimes in Python applications
42 47
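The Python usage section for oletimes is still marked TODO in this commit. One possible sketch follows. It assumes the olefile API (`OleFileIO`, `root.getmtime()`, `root.getctime()`, `listdir(streams=True, storages=True)`, `getmtime(path)`, `getctime(path)`). It is not presented as the oletimes implementation, and the names `fmt`, `timestamp_rows` and `rows_for_file` are hypothetical.

```python
def fmt(dt):
    """Format a datetime without microseconds, or 'None' when no timestamp is stored."""
    return dt.strftime("%Y-%m-%d %H:%M:%S") if dt is not None else "None"


def timestamp_rows(ole):
    """Build (name, mtime, ctime) rows for the root entry and every stream/storage."""
    rows = [("Root", fmt(ole.root.getmtime()), fmt(ole.root.getctime()))]
    for path in ole.listdir(streams=True, storages=True):
        # Each path is a list of name components, e.g. ['Macros', 'VBA', 'dir'].
        name = repr("/".join(path))
        rows.append((name, fmt(ole.getmtime(path)), fmt(ole.getctime(path))))
    return rows


def rows_for_file(filename):
    """Open an OLE file with olefile and collect its timestamps."""
    import olefile  # imported here so the helpers above run without olefile installed
    ole = olefile.OleFileIO(filename)
    try:
        return timestamp_rows(ole)
    finally:
        ole.close()
```

Calling `rows_for_file("DIAN_caso-5415.doc")` would be expected to yield rows comparable to the table in this hunk, one tuple per stream or storage.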
oletools/doc/olevba.html
@@ -8,7 +8,7 @@
8 </head> 8 </head>
9 <body> 9 <body>
10 <h1 id="olevba">olevba</h1> 10 <h1 id="olevba">olevba</h1>
11 -<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc).</p> 11 +<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc). It also detects and decodes several common <strong>obfuscation methods including Hex encoding, StrReverse, Base64, Dridex</strong>, and extracts IOCs from decoded strings.</p>
12 <p>It can be used either as a command-line tool, or as a python module from your own applications.</p> 12 <p>It can be used either as a command-line tool, or as a python module from your own applications.</p>
13 <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p> 13 <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p>
14 <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a> by John William Davison, with significant modifications.</p> 14 <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a> by John William Davison, with significant modifications.</p>
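To make the obfuscation methods named in this hunk concrete, here is a minimal sketch of what decoding Hex, Base64 and StrReverse strings amounts to, using only the Python standard library. It is not olevba's detection code, which this diff does not show. The function names are hypothetical, and Dridex is left out because its algorithm is not described here.

```python
import base64
import binascii


def decode_hex(text):
    """Decode a string of hex digit pairs, e.g. '68' -> 'h'."""
    return binascii.unhexlify(text).decode("latin-1")


def decode_base64(text):
    """Decode a standard Base64 string."""
    return base64.b64decode(text).decode("latin-1")


def decode_strreverse(text):
    """Undo VBA StrReverse by reversing the string again."""
    return text[::-1]


# Illustrative inputs only; real macros hide such strings inside VBA code.
print(decode_hex("687474703a2f2f"))
print(decode_base64("Y21kLmV4ZQ=="))
print(decode_strreverse("exe.dmc"))
```

Decoding to latin-1 keeps every byte value representable, which suits strings whose original code page is unknown.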
@@ -24,14 +24,26 @@
24 <li>Extract VBA macro source code</li> 24 <li>Extract VBA macro source code</li>
25 <li>Detect auto-executable macros</li> 25 <li>Detect auto-executable macros</li>
26 <li>Detect suspicious VBA keywords often used by malware</li> 26 <li>Detect suspicious VBA keywords often used by malware</li>
  27 +<li>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</li>
27 <li>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</li> 28 <li>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</li>
28 <li>Scan multiple files and sample collections (wildcards, recursive)</li> 29 <li>Scan multiple files and sample collections (wildcards, recursive)</li>
  30 +<li>Triage mode for a summary view of multiple files</li>
29 <li>Scan malware samples in password-protected Zip archives</li> 31 <li>Scan malware samples in password-protected Zip archives</li>
30 <li>Python API to use olevba from your applications</li> 32 <li>Python API to use olevba from your applications</li>
31 </ul> 33 </ul>
32 <p>MS Office files encrypted with a password are also supported, because VBA macro code is never encrypted, only the content of the document.</p> 34 <p>MS Office files encrypted with a password are also supported, because VBA macro code is never encrypted, only the content of the document.</p>
33 <h2 id="about-vba-macros">About VBA Macros</h2> 35 <h2 id="about-vba-macros">About VBA Macros</h2>
34 <p>See <a href="http://www.decalage.info/en/vba_tools">this article</a> for more information and technical details about VBA Macros and how they are stored in MS Office documents.</p> 36 <p>See <a href="http://www.decalage.info/en/vba_tools">this article</a> for more information and technical details about VBA Macros and how they are stored in MS Office documents.</p>
  37 +<h2 id="how-it-works">How it works</h2>
  38 +<ol style="list-style-type: decimal">
  39 +<li>olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.</li>
  40 +<li>If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.</li>
  41 +<li>olevba identifies all the VBA projects stored in the OLE structure.</li>
  42 +<li>Each VBA project is parsed to find the corresponding OLE streams containing macro code.</li>
  43 +<li>In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).</li>
  44 +<li>olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).</li>
  45 +<li>olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).</li>
  46 +</ol>
35 <h2 id="usage">Usage</h2> 47 <h2 id="usage">Usage</h2>
36 <pre><code>Usage: olevba.py [options] &lt;filename&gt; [filename2 ...] 48 <pre><code>Usage: olevba.py [options] &lt;filename&gt; [filename2 ...]
37 49
@@ -44,9 +56,34 @@ Options: @@ -44,9 +56,34 @@ Options:
44 -f ZIP_FNAME, --zipfname=ZIP_FNAME 56 -f ZIP_FNAME, --zipfname=ZIP_FNAME
45 if the file is a zip archive, file(s) to be opened 57 if the file is a zip archive, file(s) to be opened
46 within the zip. Wildcards * and ? are supported. 58 within the zip. Wildcards * and ? are supported.
47 - (default:*)</code></pre>  
48 -<h3 id="example">Example</h3>  
49 -<p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p> 59 + (default:*)
  60 + -t triage mode, display results as a summary table
  61 + (default for multiple files)
  62 + -d detailed mode, display full results (default for
  63 + single file)
  64 + -i INPUT, --input=INPUT
  65 + input file containing VBA source code to be analyzed
  66 + (no parsing)
  67 + --hex display all the hex-encoded strings with their decoded
  68 + content.</code></pre>
  69 +<h3 id="examples">Examples</h3>
  70 +<p>Scan a single file:</p>
  71 +<pre><code>olevba.py file.doc</code></pre>
  72 +<p>Scan a single file, stored in a Zip archive with password &quot;infected&quot;:</p>
  73 +<pre><code>olevba.py malicious_file.xls.zip -z infected</code></pre>
  74 +<p>Scan a single file, showing all obfuscated strings decoded:</p>
  75 +<pre><code>olevba.py file.doc --hex</code></pre>
  76 +<p>Scan VBA source code extracted into a text file:</p>
  77 +<pre><code>olevba.py -i source_code.vba</code></pre>
  78 +<p>Scan a collection of files stored in a folder:</p>
  79 +<pre><code>olevba.py MalwareZoo/VBA/*</code></pre>
  80 +<p>Scan all .doc and .xls files, recursively in all subfolders:</p>
  81 +<pre><code>olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r</code></pre>
  82 +<p>Scan all .doc files within all .zip files with password, recursively:</p>
  83 +<pre><code>olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc</code></pre>
  84 +<h3 id="detailed-analysis-mode-default-for-single-file">Detailed analysis mode (default for single file)</h3>
  85 +<p>When a single file is scanned, or when using the option -d, all details of the analysis are displayed.</p>
  86 +<p>For example, checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p>
50 <pre><code>&gt;olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected 87 <pre><code>&gt;olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
51 =============================================================================== 88 ===============================================================================
52 FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip 89 FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
@@ -108,22 +145,63 @@ ANALYSIS: @@ -108,22 +145,63 @@ ANALYSIS:
108 | IOC | test.exe | Executable file name | 145 | IOC | test.exe | Executable file name |
109 | IOC | sfjozjero.exe | Executable file name | 146 | IOC | sfjozjero.exe | Executable file name |
110 +------------+----------------------+-----------------------------------------+</code></pre> 147 +------------+----------------------+-----------------------------------------+</code></pre>
  148 +<h3 id="triage-mode-default-for-multiple-files">Triage mode (default for multiple files)</h3>
  149 +<p>When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. This is more convenient for quick triage of a collection of suspicious files.</p>
  150 +<p>The following flags show the results of the analysis:</p>
  151 +<ul>
  152 +<li><strong>OLE</strong>: the file type is OLE, for example MS Office 97-2003</li>
  153 +<li><strong>OpX</strong>: the file type is OpenXML, for example MS Office 2007+</li>
  154 +<li><strong>?</strong>: the file type is not supported</li>
  155 +<li><strong>M</strong>: contains VBA Macros</li>
  156 +<li><strong>A</strong>: auto-executable macros</li>
  157 +<li><strong>S</strong>: suspicious VBA keywords</li>
  158 +<li><strong>I</strong>: potential IOCs</li>
  159 +<li><strong>H</strong>: hex-encoded strings (potential obfuscation)</li>
  160 +<li><strong>B</strong>: Base64-encoded strings (potential obfuscation)</li>
  161 +<li><strong>D</strong>: Dridex-encoded strings (potential obfuscation)</li>
  162 +</ul>
  163 +<p>Here is an example:</p>
  164 +<pre><code>c:\&gt;olevba.py \MalwareZoo\VBA\samples\*
  165 +Flags Filename
  166 +----------- -----------------------------------------------------------------
  167 +OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
  168 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
  169 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
  170 +OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
  171 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
  172 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
  173 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
  174 +OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
  175 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
  176 +OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
  177 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
  178 +OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
  179 +OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
  180 +OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
  181 +OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
  182 +OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre>
  183 +<hr />
111 <h2 id="how-to-use-olevba-in-python-applications">How to use olevba in Python applications</h2> 184 <h2 id="how-to-use-olevba-in-python-applications">How to use olevba in Python applications</h2>
112 <p>olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code from your own python applications.</p> 185 <p>olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code from your own python applications.</p>
  186 +<p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p>
113 <h3 id="import-olevba">Import olevba</h3> 187 <h3 id="import-olevba">Import olevba</h3>
114 -<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser class:</p>  
115 -<pre><code>from oletools.olevba import VBA_Parser</code></pre> 188 +<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p>
  189 +<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre>
116 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3> 190 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3>
117 -<p>Create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual filename may be provided as a second parameter, if available.</p> 191 +<p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. For example:</p>
118 <pre><code>vba = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre> 192 <pre><code>vba = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
  193 +<p>The file may also be provided as a bytes string containing its data. In that case, the actual filename must still be provided for reference, and the file content is passed with the data parameter. For example:</p>
  194 +<pre><code>myfile = &#39;my_file_with_macros.doc&#39;
  195 +filedata = open(myfile, &#39;rb&#39;).read()
  196 +vba = VBA_Parser(myfile, data=filedata)</code></pre>
119 <p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML (MS Office 2007+).</p> 197 <p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML (MS Office 2007+).</p>
120 <h3 id="detect-vba-macros">Detect VBA macros</h3> 198 <h3 id="detect-vba-macros">Detect VBA macros</h3>
121 -<p>The method <strong>detect_vba_macros</strong> returns True if VBA macros have been found in the file, False otherwise.</p> 199 +<p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file, False otherwise.</p>
122 <pre><code>if vba.detect_vba_macros(): 200 <pre><code>if vba.detect_vba_macros():
123 print &#39;VBA Macros found&#39; 201 print &#39;VBA Macros found&#39;
124 else: 202 else:
125 print &#39;No VBA Macros found&#39;</code></pre> 203 print &#39;No VBA Macros found&#39;</code></pre>
126 -<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way.</p> 204 +<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way which is not yet supported by olevba.</p>
127 <p>Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.</p> 205 <p>Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.</p>
128 <h3 id="extract-vba-macro-source-code">Extract VBA Macro Source Code</h3> 206 <h3 id="extract-vba-macro-source-code">Extract VBA Macro Source Code</h3>
129 <p>The method <strong>extract_macros</strong> extracts and decompresses source code for each VBA macro found in the file (possibly including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) for each VBA macro found.</p> 207 <p>The method <strong>extract_macros</strong> extracts and decompresses source code for each VBA macro found in the file (possibly including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) for each VBA macro found.</p>
@@ -141,7 +219,29 @@ else: @@ -141,7 +219,29 @@ else:
141 print &#39;VBA filename:&#39;, vba_filename 219 print &#39;VBA filename:&#39;, vba_filename
142 print &#39;- &#39;*39 220 print &#39;- &#39;*39
143 print vba_code</code></pre> 221 print vba_code</code></pre>
144 -<h3 id="detect-auto-executable-macros">Detect auto-executable macros</h3> 222 +<h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3>
  223 +<p>Note: this API is under active development and may change in the future.</p>
  224 +<p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p>
  225 +<p>First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the extract_macros method). Then call the methods <strong>scan</strong> or <strong>scan_summary</strong> to get the results of the analysis.</p>
  226 +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded strings found in the code (Hex, Base64, Dridex) with their decoded value.</p>
  227 +<p><strong>scan</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
  228 +<ul>
  229 +<li>type is one of 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.</li>
  230 +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is the decoded value of the string.</li>
  231 +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li>
  232 +</ul>
  233 +<p>Example:</p>
  234 +<pre><code>vba_scanner = VBA_Scanner(vba_code)
  235 +results = vba_scanner.scan(include_decoded_strings=True)
  236 +for kw_type, keyword, description in results:
  237 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  238 +<p>The function <strong>scan_vba</strong> is a shortcut for VBA_Scanner(vba_code).scan():</p>
  239 +<pre><code>results = scan_vba(vba_code, include_decoded_strings=True)
  240 +for kw_type, keyword, description in results:
  241 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  242 +<p><strong>scan_summary</strong> returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).</p>
  243 +<h3 id="detect-auto-executable-macros-deprecated">Detect auto-executable macros (deprecated)</h3>
  244 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
145 <p>The function <strong>detect_autoexec</strong> checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.</p> 245 <p>The function <strong>detect_autoexec</strong> checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.</p>
146 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)</p> 246 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)</p>
147 <p>Sample usage:</p> 247 <p>Sample usage:</p>
@@ -153,7 +253,8 @@ if autoexec_keywords: @@ -153,7 +253,8 @@ if autoexec_keywords:
153 print &#39;%s: %s&#39; % (keyword, description) 253 print &#39;%s: %s&#39; % (keyword, description)
154 else: 254 else:
155 print &#39;Auto-executable macro keywords: None found&#39;</code></pre> 255 print &#39;Auto-executable macro keywords: None found&#39;</code></pre>
156 -<h3 id="detect-suspicious-vba-keywords">Detect suspicious VBA keywords</h3> 256 +<h3 id="detect-suspicious-vba-keywords-deprecated">Detect suspicious VBA keywords (deprecated)</h3>
  257 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
157 <p>The function <strong>detect_suspicious</strong> checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).</p> 258 <p>The function <strong>detect_suspicious</strong> checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).</p>
158 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)</p> 259 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)</p>
159 <p>Sample usage:</p> 260 <p>Sample usage:</p>
@@ -165,7 +266,8 @@ if suspicious_keywords: @@ -165,7 +266,8 @@ if suspicious_keywords:
165 print &#39;%s: %s&#39; % (keyword, description) 266 print &#39;%s: %s&#39; % (keyword, description)
166 else: 267 else:
167 print &#39;Suspicious VBA keywords: None found&#39;</code></pre> 268 print &#39;Suspicious VBA keywords: None found&#39;</code></pre>
168 -<h3 id="extract-potential-iocs">Extract potential IOCs</h3> 269 +<h3 id="extract-potential-iocs-deprecated">Extract potential IOCs (deprecated)</h3>
  270 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
169 <p>The function <strong>detect_patterns</strong> checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.</p> 271 <p>The function <strong>detect_patterns</strong> checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.</p>
170 <p>It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)</p> 272 <p>It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)</p>
171 <p>Sample usage:</p> 273 <p>Sample usage:</p>
oletools/doc/olevba.md
@@ -5,7 +5,8 @@ olevba is a script to parse OLE and OpenXML files such as MS Office documents @@ -5,7 +5,8 @@ olevba is a script to parse OLE and OpenXML files such as MS Office documents
5 (e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text, 5 (e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text,
6 and detect security-related patterns such as **auto-executable macros**, **suspicious 6 and detect security-related patterns such as **auto-executable macros**, **suspicious
7 VBA keywords** used by malware, and potential **IOCs** (IP addresses, URLs, executable 7 VBA keywords** used by malware, and potential **IOCs** (IP addresses, URLs, executable
8 -filenames, etc). 8 +filenames, etc). It also detects and decodes several common **obfuscation methods including Hex encoding,
  9 +StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings.
9 10
10 It can be used either as a command-line tool, or as a python module from your own applications. 11 It can be used either as a command-line tool, or as a python module from your own applications.
11 12
@@ -26,8 +27,10 @@ by John William Davison, with significant modifications. @@ -26,8 +27,10 @@ by John William Davison, with significant modifications.
26 - Extract VBA macro source code 27 - Extract VBA macro source code
27 - Detect auto-executable macros 28 - Detect auto-executable macros
28 - Detect suspicious VBA keywords often used by malware 29 - Detect suspicious VBA keywords often used by malware
  30 +- Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex
29 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names 31 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
30 - Scan multiple files and sample collections (wildcards, recursive) 32 - Scan multiple files and sample collections (wildcards, recursive)
  33 +- Triage mode for a summary view of multiple files
31 - Scan malware samples in password-protected Zip archives 34 - Scan malware samples in password-protected Zip archives
32 - Python API to use olevba from your applications 35 - Python API to use olevba from your applications
33 36
@@ -39,6 +42,18 @@ encrypted, only the content of the document. @@ -39,6 +42,18 @@ encrypted, only the content of the document.
39 See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros 42 See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros
40 and how they are stored in MS Office documents. 43 and how they are stored in MS Office documents.
41 44
  45 +## How it works
  46 +
  47 +1. olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.
  48 +1. If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.
  49 +1. olevba identifies all the VBA projects stored in the OLE structure.
  50 +1. Each VBA project is parsed to find the corresponding OLE streams containing macro code.
  51 +1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
  52 +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).
  53 +1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
  54 +and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
  55 +
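Steps 1 and 2 above amount to a container check. The following is a minimal sketch of such a check, not olevba's own code: the helper name `guess_container` is hypothetical, and it assumes the decision can be made from the raw file bytes alone. It relies only on the standard OLE2 file signature and on `zipfile.is_zipfile` from the Python standard library (Python 3 syntax).

```python
import io
import zipfile

# Signature found at the start of every OLE2 (Compound File Binary) file.
OLE_MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'


def guess_container(data):
    """Return 'OLE', 'Zip' or '?' for the raw bytes of a file.

    Hypothetical helper for illustration only; olevba may decide differently.
    """
    if data.startswith(OLE_MAGIC):
        # MS Office 97-2003 style file: can be parsed as OLE right away.
        return 'OLE'
    if zipfile.is_zipfile(io.BytesIO(data)):
        # MS Office 2007+ style file: OLE parts such as vbaProject.bin
        # would have to be looked up inside the archive.
        return 'Zip'
    return '?'
```

A caller would read the file in binary mode and branch on the returned value before looking for VBA projects.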
  56 +
42 ## Usage 57 ## Usage
43 58
44 :::text 59 :::text
@@ -54,10 +69,59 @@ and how they are stored in MS Office documents. @@ -54,10 +69,59 @@ and how they are stored in MS Office documents.
54 if the file is a zip archive, file(s) to be opened 69 if the file is a zip archive, file(s) to be opened
55 within the zip. Wildcards * and ? are supported. 70 within the zip. Wildcards * and ? are supported.
56 (default:*) 71 (default:*)
  72 + -t triage mode, display results as a summary table
  73 + (default for multiple files)
  74 + -d detailed mode, display full results (default for
  75 + single file)
  76 + -i INPUT, --input=INPUT
  77 + input file containing VBA source code to be analyzed
  78 + (no parsing)
  79 + --hex display all the hex-encoded strings with their decoded
  80 + content.
57 81
58 -### Example 82 +### Examples
  83 +
  84 +Scan a single file:
  85 +
  86 + :::text
  87 + olevba.py file.doc
  88 +
  89 +Scan a single file, stored in a Zip archive with password "infected":
  90 +
  91 + :::text
  92 + olevba.py malicious_file.xls.zip -z infected
  93 +
  94 +Scan a single file, showing all obfuscated strings decoded:
  95 +
  96 + :::text
  97 + olevba.py file.doc --hex
  98 +
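To make "decoded" concrete, here is a minimal sketch of how three of the encodings named in this document (Hex, Base64, StrReverse) can be reversed with the Python standard library. This is an illustration under assumptions, not olevba's implementation: the function names are hypothetical, latin-1 is an arbitrary choice of output encoding, and the Dridex scheme is left out because its algorithm is not described here (Python 3 syntax).

```python
import base64
import binascii


def decode_hex(text):
    """Decode a hex-encoded string, e.g. '746573742E657865'."""
    return binascii.unhexlify(text).decode('latin-1')


def decode_base64(text):
    """Decode a Base64-encoded string."""
    return base64.b64decode(text).decode('latin-1')


def str_reverse(text):
    """Python equivalent of the VBA StrReverse function."""
    return text[::-1]


if __name__ == '__main__':
    print(decode_hex('746573742E657865'))
    print(str_reverse('exe.tset'))
```

A scanner would typically run its keyword and IOC checks a second time on the decoded values, which is what step 7 of "How it works" describes.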
  99 +Scan VBA source code extracted into a text file:
  100 +
  101 + :::text
  102 + olevba.py -i source_code.vba
  103 +
  104 +Scan a collection of files stored in a folder:
59 105
60 -Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/): 106 + :::text
  107 + olevba.py MalwareZoo/VBA/*
  108 +
  109 +Scan all .doc and .xls files, recursively in all subfolders:
  110 +
  111 + :::text
  112 + olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
  113 +
  114 +Scan all .doc files within all .zip files with password, recursively:
  115 +
  116 + :::text
  117 + olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
  118 +
  119 +
  120 +### Detailed analysis mode (default for single file)
  121 +
  122 +When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
  123 +
  124 +For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/):
61 125
62 :::text 126 :::text
63 >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected 127 >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
@@ -122,33 +186,87 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4 @@ -122,33 +186,87 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4
122 | IOC | sfjozjero.exe | Executable file name | 186 | IOC | sfjozjero.exe | Executable file name |
123 +------------+----------------------+-----------------------------------------+ 187 +------------+----------------------+-----------------------------------------+
124 188
  189 +### Triage mode (default for multiple files)
  190 +
  191 +When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed.
  192 +This is more convenient for quick triage of a collection of suspicious files.
  193 +
  194 +The following flags show the results of the analysis:
  195 +
  196 +- **OLE**: the file type is OLE, for example MS Office 97-2003
  197 +- **OpX**: the file type is OpenXML, for example MS Office 2007+
  198 +- **?**: the file type is not supported
  199 +- **M**: contains VBA Macros
  200 +- **A**: auto-executable macros
  201 +- **S**: suspicious VBA keywords
  202 +- **I**: potential IOCs
  203 +- **H**: hex-encoded strings (potential obfuscation)
  204 +- **B**: Base64-encoded strings (potential obfuscation)
  205 +- **D**: Dridex-encoded strings (potential obfuscation)
  206 +
  207 +Here is an example:
  208 +
  209 + :::text
  210 + c:\>olevba.py \MalwareZoo\VBA\samples\*
  211 + Flags Filename
  212 + ----------- -----------------------------------------------------------------
  213 + OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
  214 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
  215 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
  216 + OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
  217 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
  218 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
  219 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
  220 + OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
  221 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
  222 + OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
  223 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
  224 + OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
  225 + OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
  226 + OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
  227 + OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
  228 + OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
  229 +
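The flags column in the table above can be post-processed by a script. The sketch below assumes the field always has the shape `TYPE:LETTERS` with `-` for an absent flag, which is inferred from the sample output only; the helper name `parse_triage_flags` is hypothetical and not part of olevba (Python 3 syntax).

```python
# Flag letters and meanings, copied from the list in this section.
FLAG_MEANINGS = [
    ('M', 'contains VBA Macros'),
    ('A', 'auto-executable macros'),
    ('S', 'suspicious VBA keywords'),
    ('I', 'potential IOCs'),
    ('H', 'hex-encoded strings'),
    ('B', 'Base64-encoded strings'),
    ('D', 'Dridex-encoded strings'),
]


def parse_triage_flags(field):
    """Split a field such as 'OLE:MASIH--' into (file_type, meanings).

    Assumes the 'TYPE:LETTERS' layout seen in the sample output.
    """
    file_type, _, letters = field.partition(':')
    meanings = [meaning for letter, meaning in FLAG_MEANINGS if letter in letters]
    return file_type, meanings


if __name__ == '__main__':
    print(parse_triage_flags('OLE:MASI-B-'))
```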
  230 +
  231 +--------------------------------------------------------------------------
  232 +
125 ## How to use olevba in Python applications 233 ## How to use olevba in Python applications
126 234
127 olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code 235 olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code
128 from your own python applications. 236 from your own python applications.
129 237
  238 +IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
  239 +
130 ### Import olevba 240 ### Import olevba
131 241
132 -First, import the **oletools.olevba** package, using at least the VBA_Parser class: 242 +First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes:
133 243
134 :::python 244 :::python
135 - from oletools.olevba import VBA_Parser 245 + from oletools.olevba import VBA_Parser, VBA_Scanner
136 246
137 ### Parse a MS Office file 247 ### Parse a MS Office file
138 248
139 -Create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.  
140 -The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual  
141 -filename may be provided as a second parameter, if available. 249 +To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
  250 +For example:
142 251
143 :::python 252 :::python
144 vba = VBA_Parser('my_file_with_macros.doc') 253 vba = VBA_Parser('my_file_with_macros.doc')
  254 +
  255 +The file may also be provided as a bytes string containing its data. In that case, the actual
  256 +filename must still be provided for reference, and the file content is passed with the data parameter. For example:
  257 +
  258 + :::python
  259 + myfile = 'my_file_with_macros.doc'
  260 + filedata = open(myfile, 'rb').read()
  261 + vba = VBA_Parser(myfile, data=filedata)
145 262
146 VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML 263 VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML
147 (MS Office 2007+). 264 (MS Office 2007+).
148 265
149 ### Detect VBA macros 266 ### Detect VBA macros
150 267
151 -The method **detect_vba_macros** returns True if VBA macros have been found in the file, False otherwise. 268 +The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file,
  269 +False otherwise.
152 270
153 :::python 271 :::python
154 if vba.detect_vba_macros(): 272 if vba.detect_vba_macros():
@@ -158,7 +276,7 @@ The method **detect_vba_macros** returns True if VBA macros have been found in t @@ -158,7 +276,7 @@ The method **detect_vba_macros** returns True if VBA macros have been found in t
158 276
159 Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine 277 Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine
160 for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will 278 for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will
161 -always return False because VBA Macros are stored in a different way. 279 +always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
162 280
163 Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method 281 Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method
164 may return True if the embedded document contains VBA Macros, even if the main document does not. 282 may return True if the embedded document contains VBA Macros, even if the main document does not.
@@ -186,8 +304,49 @@ Example: @@ -186,8 +304,49 @@ Example:
186 print 'VBA filename:', vba_filename 304 print 'VBA filename:', vba_filename
187 print '- '*39 305 print '- '*39
188 print vba_code 306 print vba_code
189 -  
190 -### Detect auto-executable macros 307 +
  308 +### Analyze VBA Source Code
  309 +
  310 +Note: this API is under active development and may change in the future.
  311 +
  312 +The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
  313 +suspicious keywords, IOCs, auto-executable macros, etc.
  314 +
  315 +First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the
  316 +extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis.
  317 +
  318 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  319 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
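
To make the notion of "encoded string" and "decoded value" concrete, here is a small illustration using only the Python standard library. It is not olevba's own decoding code, and the sample strings are invented for this example. Dridex decoding is left out because the encoding is not described in this document.

```python
import base64
import binascii

# Illustration only: standard-library decoders, not olevba's implementation.
hex_encoded = '636d642e657865'   # hex encoding of the bytes "cmd.exe"
b64_encoded = 'Y21kLmV4ZQ=='     # Base64 encoding of the same bytes

hex_decoded = binascii.unhexlify(hex_encoded)
b64_decoded = base64.b64decode(b64_encoded)

# Both encodings carry the same payload.
print(hex_decoded == b64_decoded)
```

Malicious macros often hide file names, commands or URLs this way, which is why having the decoded value next to the encoded one is useful during triage.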
  320 +
  321 +**scan** returns a list of tuples (type, keyword, description), one for each item in the results.
  322 +
  323 +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.
  324 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  325 + the decoded value of the string.
  326 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  327 +
  328 +Example:
  329 +
  330 + :::python
  331 + vba_scanner = VBA_Scanner(vba_code)
  332 + results = vba_scanner.scan(include_decoded_strings=True)
  333 + for kw_type, keyword, description in results:
  334 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  335 +
  336 +The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan():
  337 +
  338 + :::python
  339 + results = scan_vba(vba_code, include_decoded_strings=True)
  340 + for kw_type, keyword, description in results:
  341 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  342 +
  343 +**scan_summary** returns a tuple with the number of items found for each category:
  344 +(autoexec, suspicious, IOCs, hex, base64, dridex).
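
The relationship between the list returned by scan and the counts described above can be sketched without olevba installed. The snippet below tallies a hand-written list of (type, keyword, description) tuples by type. The sample data and the `summarize` helper are hypothetical, and this is not how scan_summary is implemented; it only mirrors the documented category order.

```python
from collections import Counter

# Hypothetical sample in the (type, keyword, description) shape returned by scan().
sample_results = [
    ('AutoExec', 'AutoOpen', 'illustrative description'),
    ('Suspicious', 'Shell', 'illustrative description'),
    ('Suspicious', 'CreateObject', 'illustrative description'),
    ('IOC', 'http://example.com/payload.exe', 'illustrative description'),
    ('Hex String', 'cmd.exe', '636d642e657865'),
]

# Same order as the documented tuple: autoexec, suspicious, IOCs, hex, base64, dridex.
CATEGORIES = ('AutoExec', 'Suspicious', 'IOC',
              'Hex String', 'Base64 String', 'Dridex String')


def summarize(results):
    """Count the results per type and return the counts in CATEGORIES order."""
    counts = Counter(kw_type for kw_type, _keyword, _description in results)
    return tuple(counts[category] for category in CATEGORIES)


autoexec, suspicious, iocs, hexstr, base64str, dridex = summarize(sample_results)
print('autoexec=%d suspicious=%d iocs=%d' % (autoexec, suspicious, iocs))
```

Unpacking the tuple into named variables, as in the last lines, keeps triage code readable when only a few of the counts matter.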
  345 +
  346 +
  347 +### Detect auto-executable macros (deprecated)
  348 +
  349 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
191 350
192 The function **detect_autoexec** checks if VBA macro code contains specific macro names 351 The function **detect_autoexec** checks if VBA macro code contains specific macro names
193 that will be triggered when the document/workbook is opened, closed, changed, etc. 352 that will be triggered when the document/workbook is opened, closed, changed, etc.
@@ -208,7 +367,9 @@ Sample usage: @@ -208,7 +367,9 @@ Sample usage:
208 print 'Auto-executable macro keywords: None found' 367 print 'Auto-executable macro keywords: None found'
209 368
210 369
211 -### Detect suspicious VBA keywords 370 +### Detect suspicious VBA keywords (deprecated)
  371 +
  372 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
212 373
213 The function **detect_suspicious** checks if VBA macro code contains specific 374 The function **detect_suspicious** checks if VBA macro code contains specific
214 keywords often used by malware to act on the system (create files, run 375 keywords often used by malware to act on the system (create files, run
@@ -230,7 +391,9 @@ Sample usage: @@ -230,7 +391,9 @@ Sample usage:
230 print 'Suspicious VBA keywords: None found' 391 print 'Suspicious VBA keywords: None found'
231 392
232 393
233 -### Extract potential IOCs 394 +### Extract potential IOCs (deprecated)
  395 +
  396 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
234 397
235 The function **detect_patterns** checks if VBA macro code contains specific 398 The function **detect_patterns** checks if VBA macro code contains specific
236 patterns of interest, that may be useful for malware analysis and detection 399 patterns of interest, that may be useful for malware analysis and detection