Commit 430c7b98a68cf38ed3f67b8a23209d431aba0d0b

Authored by Philippe Lagadec
1 parent cdbcd101

updated doc and readme

README.md
... ... @@ -22,7 +22,7 @@ Note: python-oletools is not related to OLETools published by BeCubed Software.
22 22 News
23 23 ----
24 24  
25   -- **2015-02-05 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings
  25 +- **2015-02-08 v0.08**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now decode strings
26 26 obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western
27 27 codepages with olefile 0.42, improved API and display, several bugfixes.
28 28 - 2015-01-05 v0.07: improved [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) to detect suspicious
... ...
oletools/README.html
... ... @@ -13,7 +13,7 @@
13 13 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p>
14 14 <h2 id="news">News</h2>
15 15 <ul>
16   -<li><strong>2015-02-05 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li>
  16 +<li><strong>2015-02-08 v0.08</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li>
17 17 <li>2015-01-05 v0.07: improved <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> to detect suspicious keywords and IOCs in VBA macros, can now scan several files and open password-protected zip archives, added a Python API, upgraded OleFileIO_PL to olefile v0.41</li>
18 18 <li>2014-08-28 v0.06: added <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved <a href="https://bitbucket.org/decalage/oletools/wiki">documentation</a></li>
19 19 <li>2013-07-24 v0.05: added new tools <a href="https://bitbucket.org/decalage/oletools/wiki/olemeta">olemeta</a> and <a href="https://bitbucket.org/decalage/oletools/wiki/oletimes">oletimes</a></li>
... ...
oletools/README.rst
... ... @@ -27,7 +27,7 @@ Software.
27 27 News
28 28 ----
29 29  
30   -- **2015-02-05 v0.08**:
  30 +- **2015-02-08 v0.08**:
31 31 `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`_ can
32 32 now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and
33 33 extract IOCs. Added new triage mode, support for non-western
... ...
oletools/doc/Home.html
... ... @@ -7,7 +7,7 @@
7 7 <title></title>
8 8 </head>
9 9 <body>
10   -<h1 id="python-oletools-v0.07-documentation">python-oletools v0.07 documentation</h1>
  10 +<h1 id="python-oletools-v0.08-documentation">python-oletools v0.08 documentation</h1>
11 11 <p>This is the home page of the documentation for python-oletools. The latest version can be found <a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
12 12 <p><a href="http://www.decalage.info/python/oletools">python-oletools</a> is a package of python tools to analyze <a href="http://en.wikipedia.org/wiki/Compound_File_Binary_Format">Microsoft OLE2 files</a> (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the <a href="http://www.decalage.info/olefile">olefile</a> parser. See <a href="http://www.decalage.info/python/oletools">http://www.decalage.info/python/oletools</a> for more info.</p>
13 13 <p><strong>Quick links:</strong> <a href="http://www.decalage.info/python/oletools">Home page</a> - <a href="https://bitbucket.org/decalage/oletools/wiki/Install">Download/Install</a> - <a href="https://bitbucket.org/decalage/oletools/wiki">Documentation</a> - <a href="https://bitbucket.org/decalage/oletools/issues?status=new&amp;status=open">Report Issues/Suggestions/Questions</a> - <a href="http://decalage.info/contact">Contact the author</a> - <a href="https://bitbucket.org/decalage/oletools">Repository</a> - <a href="https://twitter.com/decalage2">Updates on Twitter</a></p>
... ... @@ -18,7 +18,7 @@
18 18 <li><strong><a href="oleid.html">oleid</a></strong>: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.</li>
19 19 <li><strong><a href="olemeta.html">olemeta</a></strong>: a tool to extract all standard properties (metadata) from OLE files.</li>
20 20 <li><strong><a href="oletimes.html">oletimes</a></strong>: a tool to extract creation and modification timestamps of all streams and storages.</li>
21   -<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).</li>
  21 +<li><strong><a href="olevba.html">olevba</a></strong>: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).</li>
22 22 <li><strong><a href="pyxswf.html">pyxswf</a></strong>: a tool to detect, extract and analyze Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is especially useful for malware analysis.</li>
23 23 <li><strong><a href="rtfobj.html">rtfobj</a></strong>: a tool and python module to extract embedded objects from RTF files.</li>
24 24 <li>and a few others (coming soon)</li>
... ...
oletools/doc/Home.md
1   -python-oletools v0.07 documentation
  1 +python-oletools v0.08 documentation
2 2 ===================================
3 3  
4 4 This is the home page of the documentation for python-oletools. The latest version can be found
... ... @@ -29,7 +29,7 @@ Tools in python-oletools:
29 29 - **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.
30 30 - **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files.
31 31 - **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages.
32   -- **[[olevba]]**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).
  32 +- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
33 33 - **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may
34 34 be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
35 35 which is especially useful for malware analysis.
... ...
oletools/doc/Install.html
... ... @@ -9,14 +9,15 @@
9 9 <body>
10 10 <h1 id="how-to-download-and-install-python-oletools">How to Download and Install python-oletools</h1>
11 11 <h2 id="pre-requisites">Pre-requisites</h2>
12   -<p>For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet.</p>
13   -<h2 id="for-command-line-tools">For command-line tools</h2>
14   -<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice.</p>
15   -<p>You may then add the directory to your PATH environment variable to access the tools from anywhere.</p>
16   -<p>To get the latest development version, click on &quot;Download repository&quot; on the <a href="https://bitbucket.org/decalage/oletools/downloads">downloads page</a>, or use mercurial to clone the repository.</p>
  12 +<p>For now, python-oletools requires Python 2.x, preferably 2.6 or 2.7. It is not compatible with Python 3.x yet.</p>
  13 +<h2 id="to-use-oletools-as-command-line-tools">To use oletools as command-line tools</h2>
  14 +<p>To use python-oletools from the command line as analysis tools, you may simply <a href="https://bitbucket.org/decalage/oletools/downloads">download the zip archive</a> and extract the files in the directory of your choice. Pick the latest release version, or click on &quot;Download Repository&quot; to get the latest development version with the most recent features.</p>
  15 +<p>Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily in the future.</p>
  16 +<p>You may add the oletools directory to your PATH environment variable to access the tools from anywhere.</p>
17 17 <h2 id="for-python-applications">For python applications</h2>
18   -<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use &quot;<strong>pip install oletools</strong>&quot; or &quot;<strong>easy_install oletools</strong>&quot; to download and install the package in one go.</p>
19   -<p>Otherwise you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run &quot;<strong>python setup.py install</strong>&quot;.</p>
  18 +<p>If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use <strong>&quot;pip install oletools&quot;</strong> or <strong>&quot;easy_install oletools&quot;</strong> to download and install the package in one go. Pip is included with Python since version 2.7.9.</p>
  19 +<p><strong>Important: to update oletools</strong> if it is already installed, you must run <strong>&quot;pip install -U oletools&quot;</strong>, otherwise pip will not update it.</p>
  20 +<p>Alternatively, you may download/extract the <a href="https://bitbucket.org/decalage/oletools/downloads">zip archive</a> in a temporary directory and run <strong>&quot;python setup.py install&quot;</strong>.</p>
20 21 <hr />
21 22 <h2 id="python-oletools-documentation">python-oletools documentation</h2>
22 23 <ul>
... ...
oletools/doc/Install.md
... ... @@ -4,30 +4,35 @@ How to Download and Install python-oletools
4 4 Pre-requisites
5 5 --------------
6 6  
7   -For now, python-oletools require Python 2.x. They are not compatible with Python 3.x yet.
  7 +For now, python-oletools requires Python 2.x, preferably 2.6 or 2.7. It is not compatible with Python 3.x yet.
8 8  
9 9  
10   -For command-line tools
11   -----------------------
  10 +To use oletools as command-line tools
  11 +-------------------------------------
12 12  
13 13 To use python-oletools from the command line as analysis tools, you may simply
14 14 [download the zip archive](https://bitbucket.org/decalage/oletools/downloads)
15   -and extract the files in the directory of your choice.
  15 +and extract the files in the directory of your choice. Pick the latest release version, or click on "Download Repository"
  16 +to get the latest development version with the most recent features.
16 17  
17   -You may then add the directory to your PATH environment variable to access the tools from anywhere.
  18 +Another possibility is to use a Mercurial client (hg) to clone the repository in a folder. You can then update it easily
  19 +in the future.
18 20  
19   -To get the latest development version, click on "Download repository" on the
20   -[downloads page](https://bitbucket.org/decalage/oletools/downloads), or use mercurial to clone the repository.
  21 +You may add the oletools directory to your PATH environment variable to access the tools from anywhere.
21 22  
22 23  
23 24 For python applications
24   -----------------------
  25 +-----------------------
25 26  
26 27 If you plan to use python-oletools with other Python applications or your own scripts, the simplest solution is to use
27   -"**pip install oletools**" or "**easy_install oletools**" to download and install the package in one go.
  28 +**"pip install oletools"** or **"easy_install oletools"** to download and install the package in one go. Pip is included
  29 +with Python since version 2.7.9.
28 30  
29   -Otherwise you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary
30   -directory and run "**python setup.py install**".
  31 +**Important: to update oletools** if it is already installed, you must run **"pip install -U oletools"**, otherwise pip
  32 +will not update it.
  33 +
  34 +Alternatively, you may download/extract the [zip archive](https://bitbucket.org/decalage/oletools/downloads) in a temporary
  35 +directory and run **"python setup.py install"**.
31 36  
32 37 --------------------------------------------------------------------------
33 38  
... ...
oletools/doc/oleid.html
... ... @@ -39,19 +39,43 @@
39 39 <pre><code>C:\oletools&gt;oleid.py word_flash_vba.doc
40 40  
41 41 Filename: word_flash_vba.doc
42   -OLE format: True
43   -Has SummaryInformation stream: True
44   -Application name: Microsoft Office Word
45   -Encrypted: False
46   -Word Document: True
47   -VBA Macros: True
48   -Excel Workbook: False
49   -PowerPoint Presentation: False
50   -Visio Drawing: False
51   -ObjectPool: True
52   -Flash objects: 1</code></pre>
53   -<h2 id="how-to-use-oleid-in-python-applications">How to use oleid in Python applications</h2>
54   -<p>TODO</p>
  42 ++-------------------------------+-----------------------+
  43 +| Indicator                     | Value                 |
  44 ++-------------------------------+-----------------------+
  45 +| OLE format                    | True                  |
  46 +| Has SummaryInformation stream | True                  |
  47 +| Application name              | Microsoft Office Word |
  48 +| Encrypted                     | False                 |
  49 +| Word Document                 | True                  |
  50 +| VBA Macros                    | True                  |
  51 +| Excel Workbook                | False                 |
  52 +| PowerPoint Presentation       | False                 |
  53 +| Visio Drawing                 | False                 |
  54 +| ObjectPool                    | True                  |
  55 +| Flash objects                 | 1                     |
  56 ++-------------------------------+-----------------------+</code></pre>
  57 +<h2 id="how-to-use-oleid-in-your-python-applications">How to use oleid in your Python applications</h2>
  58 +<p>First, import oletools.oleid, and create an <strong>OleID</strong> object to scan a file:</p>
  59 +<pre><code>import oletools.oleid
  60 +
  61 +oid = oletools.oleid.OleID(filename)</code></pre>
  62 +<p>Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.</p>
  63 +<p>Second, call the <strong>check()</strong> method. It returns a list of <strong>Indicator</strong> objects.</p>
  64 +<p>Each Indicator object has the following attributes:</p>
  65 +<ul>
  66 +<li><strong>id</strong>: str, identifier for the indicator</li>
  67 +<li><strong>name</strong>: str, name to display the indicator</li>
  68 +<li><strong>description</strong>: str, long description of the indicator</li>
  69 +<li><strong>type</strong>: class of the indicator (e.g. bool, str, int)</li>
  70 +<li><strong>value</strong>: value of the indicator</li>
  71 +</ul>
  72 +<p>For example, the following code displays all the indicators:</p>
  73 +<pre><code>indicators = oid.check()
  74 +for i in indicators:
  75 +    print &#39;Indicator id=%s name=&quot;%s&quot; type=%s value=%s&#39; % (i.id, i.name, i.type, repr(i.value))
  76 +    print &#39;description:&#39;, i.description
  77 +    print &#39;&#39;</code></pre>
  78 +<p>See the source code of oleid.py for more details.</p>
55 79 <hr />
56 80 <h2 id="python-oletools-documentation">python-oletools documentation</h2>
57 81 <ul>
... ...
oletools/doc/oleid.md
... ... @@ -42,21 +42,53 @@ Analyzing a Word document containing a Flash object and VBA macros:
42 42     C:\oletools>oleid.py word_flash_vba.doc
43 43  
44 44     Filename: word_flash_vba.doc
45   -    OLE format: True
46   -    Has SummaryInformation stream: True
47   -    Application name: Microsoft Office Word
48   -    Encrypted: False
49   -    Word Document: True
50   -    VBA Macros: True
51   -    Excel Workbook: False
52   -    PowerPoint Presentation: False
53   -    Visio Drawing: False
54   -    ObjectPool: True
55   -    Flash objects: 1
56   -
57   -## How to use oleid in Python applications
58   -
59   -TODO
  45 +    +-------------------------------+-----------------------+
  46 +    | Indicator                     | Value                 |
  47 +    +-------------------------------+-----------------------+
  48 +    | OLE format                    | True                  |
  49 +    | Has SummaryInformation stream | True                  |
  50 +    | Application name              | Microsoft Office Word |
  51 +    | Encrypted                     | False                 |
  52 +    | Word Document                 | True                  |
  53 +    | VBA Macros                    | True                  |
  54 +    | Excel Workbook                | False                 |
  55 +    | PowerPoint Presentation       | False                 |
  56 +    | Visio Drawing                 | False                 |
  57 +    | ObjectPool                    | True                  |
  58 +    | Flash objects                 | 1                     |
  59 +    +-------------------------------+-----------------------+
  60 +
  61 +## How to use oleid in your Python applications
  62 +
  63 +First, import oletools.oleid, and create an **OleID** object to scan a file:
  64 +
  65 +    :::python
  66 +    import oletools.oleid
  67 +
  68 +    oid = oletools.oleid.OleID(filename)
  69 +
  70 +Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.
  71 +
  72 +Second, call the **check()** method. It returns a list of **Indicator** objects.
  73 +
  74 +Each Indicator object has the following attributes:
  75 +
  76 +- **id**: str, identifier for the indicator
  77 +- **name**: str, name to display the indicator
  78 +- **description**: str, long description of the indicator
  79 +- **type**: class of the indicator (e.g. bool, str, int)
  80 +- **value**: value of the indicator
  81 +
  82 +For example, the following code displays all the indicators:
  83 +
  84 + :::python
  85 + indicators = oid.check()
  86 + for i in indicators:
  87 + print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
  88 + print 'description:', i.description
  89 + print ''
  90 +
  91 +See the source code of oleid.py for more details.
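
Reviewer note: the documented loop uses Python 2 print statements. The sketch below shows the same display logic in a form that also runs on Python 3. It is only an illustration: the `Indicator` namedtuple is a hypothetical stand-in that models the five documented attributes so the example runs without oletools installed, and the sample ids and descriptions are made up. With the real package, the list would come from `oid.check()`.

```python
from collections import namedtuple

# Hypothetical stand-in for oleid's Indicator objects: only the five
# attributes listed in the documentation are modelled here.
Indicator = namedtuple('Indicator', 'id name description type value')


def format_indicator(ind):
    """Return the two display lines for one indicator."""
    header = 'Indicator id=%s name="%s" type=%s value=%s' % (
        ind.id, ind.name, ind.type, repr(ind.value))
    return [header, 'description: %s' % ind.description]


# Sample data invented for illustration; real values come from oid.check().
indicators = [
    Indicator('vba_macros', 'VBA Macros', 'Contains VBA macros', bool, True),
    Indicator('flash', 'Flash objects', 'Number of embedded Flash objects', int, 1),
]

for ind in indicators:
    for line in format_indicator(ind):
        print(line)
    print('')
```

Because `format_indicator` only reads attributes, it should work on any object exposing `id`, `name`, `description`, `type` and `value`.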
60 92  
61 93 --------------------------------------------------------------------------
62 94  
... ...
oletools/doc/oletimes.html
... ... @@ -16,24 +16,29 @@
16 16 <p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p>
17 17 <pre><code>&gt;oletimes.py DIAN_caso-5415.doc
18 18  
19   -- Root mtime=2014-05-14 12:45:24.752000 ctime=None
20   -- &#39;\x01CompObj&#39;: mtime=None ctime=None
21   -- &#39;\x05DocumentSummaryInformation&#39;: mtime=None ctime=None
22   -- &#39;\x05SummaryInformation&#39;: mtime=None ctime=None
23   -- &#39;1Table&#39;: mtime=None ctime=None
24   -- &#39;Data&#39;: mtime=None ctime=None
25   -- &#39;Macros&#39;: mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000
26   -- &#39;Macros/PROJECT&#39;: mtime=None ctime=None
27   -- &#39;Macros/PROJECTwm&#39;: mtime=None ctime=None
28   -- &#39;Macros/VBA&#39;: mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000
29   -- &#39;Macros/VBA/ThisDocument&#39;: mtime=None ctime=None
30   -- &#39;Macros/VBA/_VBA_PROJECT&#39;: mtime=None ctime=None
31   -- &#39;Macros/VBA/__SRP_0&#39;: mtime=None ctime=None
32   -- &#39;Macros/VBA/__SRP_1&#39;: mtime=None ctime=None
33   -- &#39;Macros/VBA/__SRP_2&#39;: mtime=None ctime=None
34   -- &#39;Macros/VBA/__SRP_3&#39;: mtime=None ctime=None
35   -- &#39;Macros/VBA/dir&#39;: mtime=None ctime=None
36   -- &#39;WordDocument&#39;: mtime=None ctime=None</code></pre>
  19 ++----------------------------+---------------------+---------------------+
  20 +| Stream/Storage name        | Modification Time   | Creation Time       |
  21 ++----------------------------+---------------------+---------------------+
  22 +| Root                       | 2014-05-14 12:45:24 | None                |
  23 +| &#39;\x01CompObj&#39;              | None                | None                |
  24 +| &#39;\x05DocumentSummaryInform | None                | None                |
  25 +| ation&#39;                     |                     |                     |
  26 +| &#39;\x05SummaryInformation&#39;   | None                | None                |
  27 +| &#39;1Table&#39;                   | None                | None                |
  28 +| &#39;Data&#39;                     | None                | None                |
  29 +| &#39;Macros&#39;                   | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  30 +| &#39;Macros/PROJECT&#39;           | None                | None                |
  31 +| &#39;Macros/PROJECTwm&#39;         | None                | None                |
  32 +| &#39;Macros/VBA&#39;               | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  33 +| &#39;Macros/VBA/ThisDocument&#39;  | None                | None                |
  34 +| &#39;Macros/VBA/_VBA_PROJECT&#39;  | None                | None                |
  35 +| &#39;Macros/VBA/__SRP_0&#39;       | None                | None                |
  36 +| &#39;Macros/VBA/__SRP_1&#39;       | None                | None                |
  37 +| &#39;Macros/VBA/__SRP_2&#39;       | None                | None                |
  38 +| &#39;Macros/VBA/__SRP_3&#39;       | None                | None                |
  39 +| &#39;Macros/VBA/dir&#39;           | None                | None                |
  40 +| &#39;WordDocument&#39;             | None                | None                |
  41 ++----------------------------+---------------------+---------------------+</code></pre>
37 42 <h2 id="how-to-use-oletimes-in-python-applications">How to use oletimes in Python applications</h2>
38 43 <p>TODO</p>
39 44 <hr />
... ...
oletools/doc/oletimes.md
... ... @@ -19,24 +19,29 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4
19 19     :::text
20 20     >oletimes.py DIAN_caso-5415.doc
21 21  
22   -    - Root mtime=2014-05-14 12:45:24.752000 ctime=None
23   -    - '\x01CompObj': mtime=None ctime=None
24   -    - '\x05DocumentSummaryInformation': mtime=None ctime=None
25   -    - '\x05SummaryInformation': mtime=None ctime=None
26   -    - '1Table': mtime=None ctime=None
27   -    - 'Data': mtime=None ctime=None
28   -    - 'Macros': mtime=2014-05-14 12:45:24.708000 ctime=2014-05-14 12:45:24.355000
29   -    - 'Macros/PROJECT': mtime=None ctime=None
30   -    - 'Macros/PROJECTwm': mtime=None ctime=None
31   -    - 'Macros/VBA': mtime=2014-05-14 12:45:24.684000 ctime=2014-05-14 12:45:24.355000
32   -    - 'Macros/VBA/ThisDocument': mtime=None ctime=None
33   -    - 'Macros/VBA/_VBA_PROJECT': mtime=None ctime=None
34   -    - 'Macros/VBA/__SRP_0': mtime=None ctime=None
35   -    - 'Macros/VBA/__SRP_1': mtime=None ctime=None
36   -    - 'Macros/VBA/__SRP_2': mtime=None ctime=None
37   -    - 'Macros/VBA/__SRP_3': mtime=None ctime=None
38   -    - 'Macros/VBA/dir': mtime=None ctime=None
39   -    - 'WordDocument': mtime=None ctime=None
  22 +    +----------------------------+---------------------+---------------------+
  23 +    | Stream/Storage name        | Modification Time   | Creation Time       |
  24 +    +----------------------------+---------------------+---------------------+
  25 +    | Root                       | 2014-05-14 12:45:24 | None                |
  26 +    | '\x01CompObj'              | None                | None                |
  27 +    | '\x05DocumentSummaryInform | None                | None                |
  28 +    | ation'                     |                     |                     |
  29 +    | '\x05SummaryInformation'   | None                | None                |
  30 +    | '1Table'                   | None                | None                |
  31 +    | 'Data'                     | None                | None                |
  32 +    | 'Macros'                   | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  33 +    | 'Macros/PROJECT'           | None                | None                |
  34 +    | 'Macros/PROJECTwm'         | None                | None                |
  35 +    | 'Macros/VBA'               | 2014-05-14 12:45:24 | 2014-05-14 12:45:24 |
  36 +    | 'Macros/VBA/ThisDocument'  | None                | None                |
  37 +    | 'Macros/VBA/_VBA_PROJECT'  | None                | None                |
  38 +    | 'Macros/VBA/__SRP_0'       | None                | None                |
  39 +    | 'Macros/VBA/__SRP_1'       | None                | None                |
  40 +    | 'Macros/VBA/__SRP_2'       | None                | None                |
  41 +    | 'Macros/VBA/__SRP_3'       | None                | None                |
  42 +    | 'Macros/VBA/dir'           | None                | None                |
  43 +    | 'WordDocument'             | None                | None                |
  44 +    +----------------------------+---------------------+---------------------+
40 45  
41 46 ## How to use oletimes in Python applications
42 47  
... ...
oletools/doc/olevba.html
... ... @@ -8,7 +8,7 @@
8 8 </head>
9 9 <body>
10 10 <h1 id="olevba">olevba</h1>
11   -<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc).</p>
  11 +<p>olevba is a script to parse OLE and OpenXML files such as MS Office documents (e.g. Word, Excel), to <strong>detect VBA Macros</strong>, extract their <strong>source code</strong> in clear text, and detect security-related patterns such as <strong>auto-executable macros</strong>, <strong>suspicious VBA keywords</strong> used by malware, and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc). It also detects and decodes several common <strong>obfuscation methods including Hex encoding, StrReverse, Base64, Dridex</strong>, and extracts IOCs from decoded strings.</p>
12 12 <p>It can be used either as a command-line tool, or as a python module from your own applications.</p>
13 13 <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p>
14 14 <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a> by John William Davison, with significant modifications.</p>
... ... @@ -24,14 +24,26 @@
24 24 <li>Extract VBA macro source code</li>
25 25 <li>Detect auto-executable macros</li>
26 26 <li>Detect suspicious VBA keywords often used by malware</li>
  27 +<li>Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex</li>
27 28 <li>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</li>
28 29 <li>Scan multiple files and sample collections (wildcards, recursive)</li>
  30 +<li>Triage mode for a summary view of multiple files</li>
29 31 <li>Scan malware samples in password-protected Zip archives</li>
30 32 <li>Python API to use olevba from your applications</li>
31 33 </ul>
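
Reviewer note: to make the deobfuscation feature in the list above concrete, here is a minimal sketch of what decoding three of the named encodings amounts to, using only the Python standard library. This is not olevba's actual code: the function names are hypothetical, the step of locating candidate strings inside macro source is not shown, and Dridex decoding is left out because this page does not describe its algorithm.

```python
import base64
import binascii


def decode_hex(text):
    """Decode a hex-encoded string (two hex digits per character)."""
    return binascii.unhexlify(text).decode('latin-1')


def decode_base64(text):
    """Decode a Base64-encoded string."""
    return base64.b64decode(text).decode('latin-1')


def decode_strreverse(text):
    """Undo VBA's StrReverse by reversing the string again."""
    return text[::-1]


# Sample inputs made up for illustration.
print(decode_hex('687474703a2f2f'))
print(decode_base64(base64.b64encode(b'test.exe').decode('ascii')))
print(decode_strreverse('exe.tset'))
```

Decoding to `latin-1` keeps every byte value representable, which is a convenient assumption for a sketch; a real scanner would need to consider the document's codepage.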
32 34 <p>MS Office files encrypted with a password are also supported, because VBA macro code is never encrypted, only the content of the document.</p>
33 35 <h2 id="about-vba-macros">About VBA Macros</h2>
34 36 <p>See <a href="http://www.decalage.info/en/vba_tools">this article</a> for more information and technical details about VBA Macros and how they are stored in MS Office documents.</p>
  37 +<h2 id="how-it-works">How it works</h2>
  38 +<ol style="list-style-type: decimal">
  39 +<li>olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.</li>
  40 +<li>If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.</li>
  41 +<li>olevba identifies all the VBA projects stored in the OLE structure.</li>
  42 +<li>Each VBA project is parsed to find the corresponding OLE streams containing macro code.</li>
  43 +<li>In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).</li>
  44 +<li>olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).</li>
  45 +<li>olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).</li>
  46 +</ol>
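
Reviewer note: steps 1 and 2 of the list above can be sketched as follows. This is an illustration under stated assumptions, not olevba's implementation: `is_ole` and `find_ole_containers` are hypothetical names, and the check relies only on the 8-byte signature that starts every OLE2 (Compound File Binary) file. Steps 3 to 7 need a real OLE parser and are not shown.

```python
import io
import zipfile

# Signature found at the start of every OLE2 / Compound File Binary file.
OLE_MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'


def is_ole(data):
    """Return True if the bytes start with the OLE2 signature."""
    return data[:8] == OLE_MAGIC


def find_ole_containers(data):
    """Return a list of (name, bytes) pairs for the OLE data to parse.

    An OLE file is returned as-is with an empty name. For a zip archive,
    every member that is itself an OLE file (e.g. vbaProject.bin) is
    returned. Anything else gives an empty list.
    """
    if is_ole(data):
        return [('', data)]
    found = []
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as archive:
            for name in archive.namelist():
                member = archive.read(name)
                if is_ole(member):
                    found.append((name, member))
    return found


# Build a tiny in-memory archive that mimics an OpenXML layout (made-up data).
sample = io.BytesIO()
with zipfile.ZipFile(sample, 'w') as archive:
    archive.writestr('word/document.xml', b'<xml/>')
    archive.writestr('word/vbaProject.bin', OLE_MAGIC + b'\x00' * 16)

for name, payload in find_ole_containers(sample.getvalue()):
    print(name, len(payload))
```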
35 47 <h2 id="usage">Usage</h2>
36 48 <pre><code>Usage: olevba.py [options] &lt;filename&gt; [filename2 ...]
37 49  
... ... @@ -44,9 +56,34 @@ Options:
44 56 -f ZIP_FNAME, --zipfname=ZIP_FNAME
45 57 if the file is a zip archive, file(s) to be opened
46 58 within the zip. Wildcards * and ? are supported.
47   - (default:*)</code></pre>
48   -<h3 id="example">Example</h3>
49   -<p>Checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p>
  59 + (default:*)
  60 + -t triage mode, display results as a summary table
  61 + (default for multiple files)
  62 + -d detailed mode, display full results (default for
  63 + single file)
  64 + -i INPUT, --input=INPUT
  65 + input file containing VBA source code to be analyzed
  66 + (no parsing)
  67 + --hex display all the hex-encoded strings with their decoded
  68 + content.</code></pre>
  69 +<h3 id="examples">Examples</h3>
  70 +<p>Scan a single file:</p>
  71 +<pre><code>olevba.py file.doc</code></pre>
  72 +<p>Scan a single file, stored in a Zip archive with password &quot;infected&quot;:</p>
  73 +<pre><code>olevba.py malicious_file.xls.zip -z infected</code></pre>
  74 +<p>Scan a single file, showing all obfuscated strings decoded:</p>
  75 +<pre><code>olevba.py file.doc --hex</code></pre>
  76 +<p>Scan VBA source code extracted into a text file:</p>
  77 +<pre><code>olevba.py -i source_code.vba</code></pre>
  78 +<p>Scan a collection of files stored in a folder:</p>
  79 +<pre><code>olevba.py MalwareZoo/VBA/*</code></pre>
  80 +<p>Scan all .doc and .xls files, recursively in all subfolders:</p>
  81 +<pre><code>olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r</code></pre>
  82 +<p>Scan all .doc files within all .zip files with password, recursively:</p>
  83 +<pre><code>olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc</code></pre>
  84 +<h3 id="detailed-analysis-mode-default-for-single-file">Detailed analysis mode (default for single file)</h3>
  85 +<p>When a single file is scanned, or when using the option -d, all details of the analysis are displayed.</p>
  86 +<p>For example, checking the malware sample <a href="https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/">DIAN_caso-5415.doc</a>:</p>
50 87 <pre><code>&gt;olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
51 88 ===============================================================================
52 89 FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
... ... @@ -108,22 +145,63 @@ ANALYSIS:
108 145 | IOC | test.exe | Executable file name |
109 146 | IOC | sfjozjero.exe | Executable file name |
110 147 +------------+----------------------+-----------------------------------------+</code></pre>
  148 +<h3 id="triage-mode-default-for-multiple-files">Triage mode (default for multiple files)</h3>
  149 +<p>When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. This is more convenient for quick triage of a collection of suspicious files.</p>
  150 +<p>The following flags show the results of the analysis:</p>
  151 +<ul>
  152 +<li><strong>OLE</strong>: the file type is OLE, for example MS Office 97-2003</li>
  153 +<li><strong>OpX</strong>: the file type is OpenXML, for example MS Office 2007+</li>
  154 +<li><strong>?</strong>: the file type is not supported</li>
  155 +<li><strong>M</strong>: contains VBA Macros</li>
  156 +<li><strong>A</strong>: auto-executable macros</li>
  157 +<li><strong>S</strong>: suspicious VBA keywords</li>
  158 +<li><strong>I</strong>: potential IOCs</li>
  159 +<li><strong>H</strong>: hex-encoded strings (potential obfuscation)</li>
  160 +<li><strong>B</strong>: Base64-encoded strings (potential obfuscation)</li>
  161 +<li><strong>D</strong>: Dridex-encoded strings (potential obfuscation)</li>
  162 +</ul>
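
Reviewer note: a triage line can be turned back into readable findings with a small helper. The sketch below is hypothetical and not part of olevba. It assumes the layout seen in the sample output on this page: a type field, a colon, then seven fixed positions in the order M, A, S, I, H, B, D, with `-` marking an absent flag. How the unsupported type `?` is padded in real output is not shown here, so the type field is simply stripped of spaces.

```python
# Meanings copied from the flag list in the documentation.
TYPE_MEANINGS = {
    'OLE': 'the file type is OLE, for example MS Office 97-2003',
    'OpX': 'the file type is OpenXML, for example MS Office 2007+',
    '?': 'the file type is not supported',
}

# Fixed positions after the colon (order assumed from the sample output).
FLAG_MEANINGS = [
    ('M', 'contains VBA Macros'),
    ('A', 'auto-executable macros'),
    ('S', 'suspicious VBA keywords'),
    ('I', 'potential IOCs'),
    ('H', 'hex-encoded strings (potential obfuscation)'),
    ('B', 'Base64-encoded strings (potential obfuscation)'),
    ('D', 'Dridex-encoded strings (potential obfuscation)'),
]


def explain_flags(flags):
    """Split a field such as 'OLE:MASI-B-' into (type text, list of findings)."""
    file_type, _, letters = flags.partition(':')
    type_text = TYPE_MEANINGS.get(file_type.strip(), 'unrecognised type field')
    findings = [meaning
                for (letter, meaning), shown in zip(FLAG_MEANINGS, letters)
                if shown == letter]
    return type_text, findings


type_text, findings = explain_flags('OLE:MASI-B-')
print(type_text)
for finding in findings:
    print(' -', finding)
```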
  163 +<p>Here is an example:</p>
  164 +<pre><code>c:\&gt;olevba.py \MalwareZoo\VBA\samples\*
  165 +Flags Filename
  166 +----------- -----------------------------------------------------------------
  167 +OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
  168 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
  169 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
  170 +OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
  171 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
  172 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
  173 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
  174 +OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
  175 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
  176 +OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
  177 +OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
  178 +OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
  179 +OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
  180 +OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
  181 +OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
  182 +OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre>
  183 +<hr />
111 184 <h2 id="how-to-use-olevba-in-python-applications">How to use olevba in Python applications</h2>
112 185 <p>olevba may be used from your own Python applications to open an MS Office file, detect whether it contains VBA macros, and extract and analyze the VBA source code.</p>
  186 +<p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p>
113 187 <h3 id="import-olevba">Import olevba</h3>
114   -<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser class:</p>
115   -<pre><code>from oletools.olevba import VBA_Parser</code></pre>
  188 +<p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p>
  189 +<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre>
116 190 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3>
117   -<p>Create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual filename may be provided as a second parameter, if available.</p>
  191 +<p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter. For example:</p>
118 192 <pre><code>vba = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
  193 +<p>The file may also be provided as a bytes string containing its data. In that case, the actual filename must still be provided for reference, and the file content is passed with the data parameter. For example:</p>
  194 +<pre><code>myfile = &#39;my_file_with_macros.doc&#39;
  195 +filedata = open(myfile, &#39;rb&#39;).read()
  196 +vba = VBA_Parser(myfile, data=filedata)</code></pre>
119 197 <p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML (MS Office 2007+).</p>
120 198 <h3 id="detect-vba-macros">Detect VBA macros</h3>
121   -<p>The method <strong>detect_vba_macros</strong> returns True if VBA macros have been found in the file, False otherwise.</p>
  199 +<p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file, False otherwise.</p>
122 200 <pre><code>if vba.detect_vba_macros():
123 201 print &#39;VBA Macros found&#39;
124 202 else:
125 203 print &#39;No VBA Macros found&#39;</code></pre>
126   -<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way.</p>
  204 +<p>Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way which is not yet supported by olevba.</p>
127 205 <p>Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.</p>
128 206 <h3 id="extract-vba-macro-source-code">Extract VBA Macro Source Code</h3>
129 207 <p>The method <strong>extract_macros</strong> extracts and decompresses source code for each VBA macro found in the file (possibly including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) for each VBA macro found.</p>
... ... @@ -141,7 +219,29 @@ else:
141 219 print &#39;VBA filename:&#39;, vba_filename
142 220 print &#39;- &#39;*39
143 221 print vba_code</code></pre>
144   -<h3 id="detect-auto-executable-macros">Detect auto-executable macros</h3>
  222 +<h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3>
  223 +<p>Note: this API is under active development and may change in the future.</p>
  224 +<p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p>
  225 +<p>First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the extract_macros method). Then call the methods <strong>scan</strong> or <strong>scan_summary</strong> to get the results of the analysis.</p>
  226 +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded strings found in the code (Hex, Base64, Dridex) with their decoded value.</p>
  227 +<p><strong>scan</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
  228 +<ul>
  229 +<li>type may be one of 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.</li>
  230 +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is the decoded value of the string.</li>
  231 +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li>
  232 +</ul>
  233 +<p>Example:</p>
  234 +<pre><code>vba_scanner = VBA_Scanner(vba_code)
  235 +results = vba_scanner.scan(include_decoded_strings=True)
  236 +for kw_type, keyword, description in results:
  237 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  238 +<p>The function <strong>scan_vba</strong> is a shortcut for VBA_Scanner(vba_code).scan():</p>
  239 +<pre><code>results = scan_vba(vba_code, include_decoded_strings=True)
  240 +for kw_type, keyword, description in results:
  241 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  242 +<p><strong>scan_summary</strong> returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).</p>
  243 +<h3 id="detect-auto-executable-macros-deprecated">Detect auto-executable macros (deprecated)</h3>
  244 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
145 245 <p>The function <strong>detect_autoexec</strong> checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.</p>
146 246 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)</p>
147 247 <p>Sample usage:</p>
... ... @@ -153,7 +253,8 @@ if autoexec_keywords:
153 253 print &#39;%s: %s&#39; % (keyword, description)
154 254 else:
155 255 print &#39;Auto-executable macro keywords: None found&#39;</code></pre>
156   -<h3 id="detect-suspicious-vba-keywords">Detect suspicious VBA keywords</h3>
  256 +<h3 id="detect-suspicious-vba-keywords-deprecated">Detect suspicious VBA keywords (deprecated)</h3>
  257 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
157 258 <p>The function <strong>detect_suspicious</strong> checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).</p>
158 259 <p>It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)</p>
159 260 <p>Sample usage:</p>
... ... @@ -165,7 +266,8 @@ if suspicious_keywords:
165 266 print &#39;%s: %s&#39; % (keyword, description)
166 267 else:
167 268 print &#39;Suspicious VBA keywords: None found&#39;</code></pre>
168   -<h3 id="extract-potential-iocs">Extract potential IOCs</h3>
  269 +<h3 id="extract-potential-iocs-deprecated">Extract potential IOCs (deprecated)</h3>
  270 +<p><strong>Deprecated</strong>: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.</p>
169 271 <p>The function <strong>detect_patterns</strong> checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.</p>
170 272 <p>It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)</p>
171 273 <p>Sample usage:</p>
... ...
oletools/doc/olevba.md
... ... @@ -5,7 +5,8 @@ olevba is a script to parse OLE and OpenXML files such as MS Office documents
5 5 (e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text,
6 6 and detect security-related patterns such as **auto-executable macros**, **suspicious
7 7 VBA keywords** used by malware, and potential **IOCs** (IP addresses, URLs, executable
8   -filenames, etc).
  8 +filenames, etc). It also detects and decodes several common **obfuscation methods including Hex encoding,
  9 +StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings.
9 10  
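As a rough illustration of what decoding these obfuscation methods involves, the sketch below undoes Hex, Base64 and StrReverse encoding using only the Python standard library. The helper names are hypothetical and this is not olevba's own code. The Dridex algorithm is specific to that malware family and is not reproduced here.

```python
import base64
import binascii


def decode_hex(encoded):
    """Decode a hex-encoded string such as '636d64' back to text."""
    return binascii.unhexlify(encoded).decode('latin-1')


def decode_base64(encoded):
    """Decode a Base64-encoded string back to text."""
    return base64.b64decode(encoded).decode('latin-1')


def decode_strreverse(encoded):
    """Undo VBA's StrReverse, which simply reverses the characters."""
    return encoded[::-1]


print(decode_hex('636d642e657865'))       # cmd.exe
print(decode_base64('Y21kLmV4ZQ=='))      # cmd.exe
print(decode_strreverse('exe.dmc'))       # cmd.exe
```

Once decoded, such strings can be searched for IOCs in the same way as the plain macro source code.
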
10 11 It can be used either as a command-line tool, or as a python module from your own applications.
11 12  
... ... @@ -26,8 +27,10 @@ by John William Davison, with significant modifications.
26 27 - Extract VBA macro source code
27 28 - Detect auto-executable macros
28 29 - Detect suspicious VBA keywords often used by malware
  30 +- Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex
29 31 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
30 32 - Scan multiple files and sample collections (wildcards, recursive)
  33 +- Triage mode for a summary view of multiple files
31 34 - Scan malware samples in password-protected Zip archives
32 35 - Python API to use olevba from your applications
33 36  
... ... @@ -39,6 +42,18 @@ encrypted, only the content of the document.
39 42 See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros
40 43 and how they are stored in MS Office documents.
41 44  
  45 +## How it works
  46 +
  47 +1. olevba checks the file type: If it is an OLE file (e.g. MS Office 97-2003), it is parsed right away.
  48 +1. If it is a zip file (e.g. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.
  49 +1. olevba identifies all the VBA projects stored in the OLE structure.
  50 +1. Each VBA project is parsed to find the corresponding OLE streams containing macro code.
  51 +1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
  52 +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).
  53 +1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
  54 +and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
  55 +
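The first two steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not olevba's actual implementation: the function names `guess_container` and `find_ole_members` are hypothetical, and detection relies only on the OLE magic bytes and on `zipfile.is_zipfile`.

```python
import io
import zipfile

# Signature found at the start of every OLE2 (Compound File) container.
OLE_MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'


def guess_container(data):
    """Return 'OLE', 'zip' or None for a bytes string holding a whole file."""
    if data.startswith(OLE_MAGIC):
        return 'OLE'
    if zipfile.is_zipfile(io.BytesIO(data)):
        return 'zip'
    return None


def find_ole_members(data):
    """List the zip members whose content starts with the OLE signature."""
    names = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                if member.read(len(OLE_MAGIC)) == OLE_MAGIC:
                    names.append(name)
    return names
```

Checking the content of each zip member rather than its name means an OLE file is found even if it is not called vbaProject.bin.
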
  56 +
42 57 ## Usage
43 58  
44 59 :::text
... ... @@ -54,10 +69,59 @@ and how they are stored in MS Office documents.
54 69 if the file is a zip archive, file(s) to be opened
55 70 within the zip. Wildcards * and ? are supported.
56 71 (default:*)
  72 + -t triage mode, display results as a summary table
  73 + (default for multiple files)
  74 + -d detailed mode, display full results (default for
  75 + single file)
  76 + -i INPUT, --input=INPUT
  77 + input file containing VBA source code to be analyzed
  78 + (no parsing)
  79 + --hex display all the hex-encoded strings with their decoded
  80 + content.
57 81  
58   -### Example
  82 +### Examples
  83 +
  84 +Scan a single file:
  85 +
  86 + :::text
  87 + olevba.py file.doc
  88 +
  89 +Scan a single file, stored in a Zip archive with password "infected":
  90 +
  91 + :::text
  92 + olevba.py malicious_file.xls.zip -z infected
  93 +
  94 +Scan a single file, showing all hex-encoded strings with their decoded content:
  95 +
  96 + :::text
  97 + olevba.py file.doc --hex
  98 +
  99 +Scan VBA source code extracted into a text file:
  100 +
  101 + :::text
  102 + olevba.py -i source_code.vba
  103 +
  104 +Scan a collection of files stored in a folder:
59 105  
60   -Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/):
  106 + :::text
  107 + olevba.py MalwareZoo/VBA/*
  108 +
  109 +Scan all .doc and .xls files, recursively in all subfolders:
  110 +
  111 + :::text
  112 + olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
  113 +
  114 +Scan all .doc files within all password-protected .zip files, recursively:
  115 +
  116 + :::text
  117 + olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
  118 +
  119 +
  120 +### Detailed analysis mode (default for single file)
  121 +
  122 +When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
  123 +
  124 +For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/):
61 125  
62 126 :::text
63 127 >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
... ... @@ -122,33 +186,87 @@ Checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4
122 186 | IOC | sfjozjero.exe | Executable file name |
123 187 +------------+----------------------+-----------------------------------------+
124 188  
  189 +### Triage mode (default for multiple files)
  190 +
  191 +When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed.
  192 +This is more convenient for quick triage of a collection of suspicious files.
  193 +
  194 +The following flags show the results of the analysis:
  195 +
  196 +- **OLE**: the file type is OLE, for example MS Office 97-2003
  197 +- **OpX**: the file type is OpenXML, for example MS Office 2007+
  198 +- **?**: the file type is not supported
  199 +- **M**: contains VBA Macros
  200 +- **A**: auto-executable macros
  201 +- **S**: suspicious VBA keywords
  202 +- **I**: potential IOCs
  203 +- **H**: hex-encoded strings (potential obfuscation)
  204 +- **B**: Base64-encoded strings (potential obfuscation)
  205 +- **D**: Dridex-encoded strings (potential obfuscation)
  206 +
  207 +Here is an example:
  208 +
  209 + :::text
  210 + c:\>olevba.py \MalwareZoo\VBA\samples\*
  211 + Flags Filename
  212 + ----------- -----------------------------------------------------------------
  213 + OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
  214 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
  215 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
  216 + OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
  217 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
  218 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
  219 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
  220 + OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
  221 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
  222 + OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
  223 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
  224 + OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
  225 + OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
  226 + OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
  227 + OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
  228 + OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
  229 +
  230 +
  231 +--------------------------------------------------------------------------
  232 +
125 233 ## How to use olevba in Python applications
126 234  
127 235 olevba may be used from your own Python applications to open an MS Office file, detect whether it contains VBA macros,
128 236 and extract and analyze the VBA source code.
129 237  
  238 +IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
  239 +
130 240 ### Import olevba
131 241  
132   -First, import the **oletools.olevba** package, using at least the VBA_Parser class:
  242 +First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes:
133 243  
134 244 :::python
135   - from oletools.olevba import VBA_Parser
  245 + from oletools.olevba import VBA_Parser, VBA_Scanner
136 246  
137 247 ### Parse a MS Office file
138 248  
139   -Create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
140   -The file may also be provided as a bytes string containing its data, or a file-like object. In that case, the actual
141   -filename may be provided as a second parameter, if available.
  249 +To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
  250 +For example:
142 251  
143 252 :::python
144 253 vba = VBA_Parser('my_file_with_macros.doc')
  254 +
  255 +The file may also be provided as a bytes string containing its data. In that case, the actual
  256 +filename must still be provided for reference, and the file content is passed with the data parameter. For example:
  257 +
  258 + :::python
  259 + myfile = 'my_file_with_macros.doc'
  260 + filedata = open(myfile, 'rb').read()
  261 + vba = VBA_Parser(myfile, data=filedata)
145 262  
146 263 VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML
147 264 (MS Office 2007+).
148 265  
149 266 ### Detect VBA macros
150 267  
151   -The method **detect_vba_macros** returns True if VBA macros have been found in the file, False otherwise.
  268 +The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file,
  269 +False otherwise.
152 270  
153 271 :::python
154 272 if vba.detect_vba_macros():
... ... @@ -158,7 +276,7 @@ The method **detect_vba_macros** returns True if VBA macros have been found in t
158 276  
159 277 Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine
160 278 for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will
161   -always return False because VBA Macros are stored in a different way.
  279 +always return False because VBA Macros are stored in a different way, which is not yet supported by olevba.
162 280  
163 281 Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method
164 282 may return True if the embedded document contains VBA Macros, even if the main document does not.
... ... @@ -186,8 +304,49 @@ Example:
186 304 print 'VBA filename:', vba_filename
187 305 print '- '*39
188 306 print vba_code
189   -
190   -### Detect auto-executable macros
  307 +
  308 +### Analyze VBA Source Code
  309 +
  310 +Note: this API is under active development and may change in the future.
  311 +
  312 +The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
  313 +suspicious keywords, IOCs, auto-executable macros, etc.
  314 +
  315 +First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the
  316 +extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis.
  317 +
  318 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  319 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
  320 +
  321 +**scan** returns a list of tuples (type, keyword, description), one for each item in the results.
  322 +
  323 +- type may be one of 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.
  324 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  325 + the decoded value of the string.
  326 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  327 +
  328 +Example:
  329 +
  330 + :::python
  331 + vba_scanner = VBA_Scanner(vba_code)
  332 + results = vba_scanner.scan(include_decoded_strings=True)
  333 + for kw_type, keyword, description in results:
  334 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  335 +
  336 +The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan():
  337 +
  338 + :::python
  339 + results = scan_vba(vba_code, include_decoded_strings=True)
  340 + for kw_type, keyword, description in results:
  341 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  342 +
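Because each result is a plain (type, keyword, description) tuple, the list is easy to regroup. The sketch below groups results by type. `group_by_type` is a hypothetical helper, and the sample tuples are illustrative data written for this example (the descriptions are invented), not real olevba output.

```python
from collections import defaultdict


def group_by_type(results):
    """Group (type, keyword, description) tuples into a dict keyed by type."""
    grouped = defaultdict(list)
    for kw_type, keyword, description in results:
        grouped[kw_type].append((keyword, description))
    return dict(grouped)


# Illustrative data only, shaped like the tuples described above.
sample_results = [
    ('AutoExec', 'AutoOpen', 'example description of the trigger'),
    ('Suspicious', 'Shell', 'example description of the behaviour'),
    ('IOC', 'sfjozjero.exe', 'Executable file name'),
]

for kw_type, items in sorted(group_by_type(sample_results).items()):
    print('%s: %d item(s)' % (kw_type, len(items)))
```
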
  343 +**scan_summary** returns a tuple with the number of items found for each category:
  344 +(autoexec, suspicious, IOCs, hex, base64, dridex).
  345 +
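A six-element tuple is easy to misread by position, so calling code may prefer to label the counts. A minimal sketch, assuming the field order given above; `ScanSummary` and `label_summary` are hypothetical names, not part of olevba.

```python
from collections import namedtuple

# Field order follows the tuple described above.
ScanSummary = namedtuple(
    'ScanSummary', 'autoexec suspicious iocs hex base64 dridex')


def label_summary(counts):
    """Wrap a 6-item tuple of counts so that each count can be read by name."""
    return ScanSummary(*counts)


# Illustrative counts only.
summary = label_summary((1, 4, 2, 0, 0, 0))
if summary.autoexec and summary.suspicious:
    print('auto-executable macro with suspicious keywords')
```
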
  346 +
  347 +### Detect auto-executable macros (deprecated)
  348 +
  349 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
191 350  
192 351 The function **detect_autoexec** checks if VBA macro code contains specific macro names
193 352 that will be triggered when the document/workbook is opened, closed, changed, etc.
... ... @@ -208,7 +367,9 @@ Sample usage:
208 367 print 'Auto-executable macro keywords: None found'
209 368  
210 369  
211   -### Detect suspicious VBA keywords
  370 +### Detect suspicious VBA keywords (deprecated)
  371 +
  372 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
212 373  
213 374 The function **detect_suspicious** checks if VBA macro code contains specific
214 375 keywords often used by malware to act on the system (create files, run
... ... @@ -230,7 +391,9 @@ Sample usage:
230 391 print 'Suspicious VBA keywords: None found'
231 392  
232 393  
233   -### Extract potential IOCs
  394 +### Extract potential IOCs (deprecated)
  395 +
  396 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
234 397  
235 398 The function **detect_patterns** checks if VBA macro code contains specific
236 399 patterns of interest, that may be useful for malware analysis and detection
... ...