Commit 6e4e7d5f0e9076620c4f46636e3449a8ba4c01ae

Authored by Philippe Lagadec
1 parent bd53eff6

updated readme and doc for oletools 0.12

README.md
@@ -22,7 +22,11 @@ Note: python-oletools is not related to OLETools published by BeCubed Software.
22 News 22 News
23 ---- 23 ----
24 24
25 -- **2015-05-29 v0.11**: Improved parsing of MHTML and ActiveMime/MSO files in 25 +- **2015-06-19 v0.12**: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) can now deobfuscate VBA
  26 +expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, \&, using a VBA parser built with
  27 +[pyparsing](http://pyparsing.wikispaces.com). New options to display only the analysis results or only the macros source code.
  28 +The analysis is now done on all the VBA modules at once.
  29 +- 2015-05-29 v0.11: Improved parsing of MHTML and ActiveMime/MSO files in
26 [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba), added several suspicious keywords to VBA scanner 30 [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba), added several suspicious keywords to VBA scanner
27 (thanks to @ozhermit and Davy Douhine for the suggestions) 31 (thanks to @ozhermit and Davy Douhine for the suggestions)
28 - 2015-05-06 v0.10: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) now supports Word MHTML files 32 - 2015-05-06 v0.10: [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) now supports Word MHTML files
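
To illustrate the kind of constant expression the v0.12 entry describes, here is a minimal sketch of an evaluator for `Chr`, `Asc`, `StrReverse`, `+` and `&`. It is not olevba's implementation: olevba uses a pyparsing grammar, whereas this sketch is a hand-written recursive-descent parser limited to the standard library. The names `tokenize`, `evaluate` and `FUNCTIONS` are hypothetical, and `Val` and `Environ` are left out.

```python
import re

# One token: string literal, integer, identifier, or one of ( ) + &
# VBA string literals escape a double quote by doubling it ("").
TOKEN_RE = re.compile(r'\s*(?:("(?:[^"]|"")*")|(\d+)|([A-Za-z_]\w*\$?)|([()+&]))')

# Hypothetical subset of VBA functions handled by this sketch.
FUNCTIONS = {
    'chr': lambda arg: chr(int(arg)),
    'asc': lambda arg: ord(str(arg)[0]),
    'strreverse': lambda arg: str(arg)[::-1],
}


def tokenize(text):
    """Split a VBA expression into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError('unexpected input at offset %d' % pos)
        literal, number, name, symbol = match.groups()
        if literal is not None:
            tokens.append(('str', literal[1:-1].replace('""', '"')))
        elif number is not None:
            tokens.append(('int', int(number)))
        elif name is not None:
            # VBA is case-insensitive; Chr$ is treated like Chr.
            tokens.append(('name', name.rstrip('$').lower()))
        else:
            tokens.append(('sym', symbol))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate a constant VBA expression and return its value."""
    tokens = tokenize(text)
    value, pos = _parse_concat(tokens, 0)
    if pos != len(tokens):
        raise ValueError('unexpected token %r' % (tokens[pos],))
    return value


def _parse_concat(tokens, pos):
    # concat := sum ('&' sum)*    '&' always concatenates as strings
    value, pos = _parse_sum(tokens, pos)
    while tokens[pos:pos + 1] == [('sym', '&')]:
        right, pos = _parse_sum(tokens, pos + 1)
        value = str(value) + str(right)
    return value, pos


def _parse_sum(tokens, pos):
    # sum := operand ('+' operand)*    adds two integers, else concatenates
    value, pos = _parse_operand(tokens, pos)
    while tokens[pos:pos + 1] == [('sym', '+')]:
        right, pos = _parse_operand(tokens, pos + 1)
        if isinstance(value, int) and isinstance(right, int):
            value = value + right
        else:
            value = str(value) + str(right)
    return value, pos


def _parse_operand(tokens, pos):
    if pos >= len(tokens):
        raise ValueError('unexpected end of expression')
    kind, value = tokens[pos]
    if kind in ('str', 'int'):
        return value, pos + 1
    if kind == 'name' and value in FUNCTIONS:
        if tokens[pos + 1:pos + 2] != [('sym', '(')]:
            raise ValueError('expected "(" after %s' % value)
        arg, pos = _parse_concat(tokens, pos + 2)
        if tokens[pos:pos + 1] != [('sym', ')')]:
            raise ValueError('expected ")"')
        return FUNCTIONS[value](arg), pos + 1
    if (kind, value) == ('sym', '('):
        inner, pos = _parse_concat(tokens, pos + 1)
        if tokens[pos:pos + 1] != [('sym', ')')]:
            raise ValueError('expected ")"')
        return inner, pos + 1
    raise ValueError('unsupported token %r' % ((kind, value),))
```

For example, `evaluate('Chr(99) & "md" + StrReverse("exe.")')` rebuilds the string `cmd.exe` from its obfuscated parts.
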
oletools/README.html
@@ -4,7 +4,8 @@
4 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p> 4 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p>
5 <h2 id="news">News</h2> 5 <h2 id="news">News</h2>
6 <ul> 6 <ul>
7 -<li><strong>2015-05-29 v0.11</strong>: Improved parsing of MHTML and ActiveMime/MSO files in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, added several suspicious keywords to VBA scanner (thanks to <span class="citation">@ozhermit</span> and Davy Douhine for the suggestions)</li> 7 +<li><strong>2015-06-19 v0.12</strong>: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now deobfuscate VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with <a href="http://pyparsing.wikispaces.com">pyparsing</a>. New options to display only the analysis results or only the macros source code. The analysis is now done on all the VBA modules at once.</li>
  8 +<li>2015-05-29 v0.11: Improved parsing of MHTML and ActiveMime/MSO files in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, added several suspicious keywords to VBA scanner (thanks to <span class="citation">@ozhermit</span> and Davy Douhine for the suggestions)</li>
8 <li>2015-05-06 v0.10: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word MHTML files with macros, aka &quot;Single File Web Page&quot; (.mht) - see <a href="https://bitbucket.org/decalage/oletools/issue/10">issue #10</a> for more info</li> 9 <li>2015-05-06 v0.10: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word MHTML files with macros, aka &quot;Single File Web Page&quot; (.mht) - see <a href="https://bitbucket.org/decalage/oletools/issue/10">issue #10</a> for more info</li>
9 <li>2015-03-23 v0.09: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word 2003 XML files, added anti-sandboxing/VM detection</li> 10 <li>2015-03-23 v0.09: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word 2003 XML files, added anti-sandboxing/VM detection</li>
10 <li>2015-02-08 v0.08: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li> 11 <li>2015-02-08 v0.08: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now decode strings obfuscated with Hex/StrReverse/Base64/Dridex and extract IOCs. Added new triage mode, support for non-western codepages with olefile 0.42, improved API and display, several bugfixes.</li>
oletools/README.rst
@@ -26,9 +26,15 @@ Software.
26 News 26 News
27 ---- 27 ----
28 28
29 -- **2015-05-29 v0.11**: Improved parsing of MHTML and ActiveMime/MSO  
30 - files in  
31 - `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__, 29 +- **2015-06-19 v0.12**:
  30 + `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__ can
  31 + now deobfuscate VBA expressions with any combination of Chr, Asc,
  32 + Val, StrReverse, Environ, +, &, using a VBA parser built with
  33 + `pyparsing <http://pyparsing.wikispaces.com>`__. New options to
  34 + display only the analysis results or only the macros source code. The
  35 + analysis is now done on all the VBA modules at once.
  36 +- 2015-05-29 v0.11: Improved parsing of MHTML and ActiveMime/MSO files
  37 + in `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__,
32 added several suspicious keywords to VBA scanner (thanks to @ozhermit 38 added several suspicious keywords to VBA scanner (thanks to @ozhermit
33 and Davy Douhine for the suggestions) 39 and Davy Douhine for the suggestions)
34 - 2015-05-06 v0.10: 40 - 2015-05-06 v0.10:
oletools/doc/Home.html
1 -<p>python-oletools v0.11 documentation</p> 1 +<p>python-oletools v0.12 documentation</p>
2 <p>===================================</p> 2 <p>===================================</p>
3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p> 3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p>
4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> 4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
oletools/doc/Home.md
1 -python-oletools v0.11 documentation 1 +python-oletools v0.12 documentation
2 =================================== 2 ===================================
3 3
4 This is the home page of the documentation for python-oletools. The latest version can be found 4 This is the home page of the documentation for python-oletools. The latest version can be found
oletools/doc/olevba.html
@@ -6,7 +6,7 @@
6 <p>VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,</p> 6 <p>VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,</p>
7 <p>and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc).</p> 7 <p>and potential <strong>IOCs</strong> (IP addresses, URLs, executable filenames, etc).</p>
8 <p>It also detects and decodes several common **obfuscation methods including Hex encoding,</p> 8 <p>It also detects and decodes several common **obfuscation methods including Hex encoding,</p>
9 -<p>StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings.</p> 9 +<p>StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings.</p>
10 <p>It can be used either as a command-line tool, or as a python module from your own applications.</p> 10 <p>It can be used either as a command-line tool, or as a python module from your own applications.</p>
11 <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p> 11 <p>It is part of the <a href="http://www.decalage.info/python/oletools">python-oletools</a> package.</p>
12 <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a></p> 12 <p>olevba is based on source code from <a href="https://github.com/unixfreak0037/officeparser">officeparser</a></p>
@@ -29,6 +29,10 @@
29 <li><p>Detect suspicious VBA keywords often used by malware</p></li> 29 <li><p>Detect suspicious VBA keywords often used by malware</p></li>
30 <li><p>Detect anti-sandboxing and anti-virtualization techniques</p></li> 30 <li><p>Detect anti-sandboxing and anti-virtualization techniques</p></li>
31 <li><p>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li> 31 <li><p>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li>
  32 +<li><p>Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with</p></li>
  33 +</ul>
  34 +<p><a href="http://pyparsing.wikispaces.com">pyparsing</a></p>
  35 +<ul>
32 <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li> 36 <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li>
33 <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li> 37 <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li>
34 <li><p>Triage mode for a summary view of multiple files</p></li> 38 <li><p>Triage mode for a summary view of multiple files</p></li>
@@ -43,11 +47,11 @@
43 <h2 id="how-it-works">How it works</h2> 47 <h2 id="how-it-works">How it works</h2>
44 <ol style="list-style-type: decimal"> 48 <ol style="list-style-type: decimal">
45 <li><p>olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away.</p></li> 49 <li><p>olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away.</p></li>
46 -<li><p>If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them.</p></li> 50 +<li><p>If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them.</p></li>
47 <li><p>olevba identifies all the VBA projects stored in the OLE structure.</p></li> 51 <li><p>olevba identifies all the VBA projects stored in the OLE structure.</p></li>
48 <li><p>Each VBA project is parsed to find the corresponding OLE streams containing macro code.</p></li> 52 <li><p>Each VBA project is parsed to find the corresponding OLE streams containing macro code.</p></li>
49 <li><p>In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).</p></li> 53 <li><p>In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).</p></li>
50 -<li><p>olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex).</p></li> 54 +<li><p>olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions).</p></li>
51 <li><p>olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros</p></li> 55 <li><p>olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros</p></li>
52 </ol> 56 </ol>
53 <p>and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).</p> 57 <p>and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).</p>
@@ -76,14 +80,20 @@ Options:
76 80
77 (default:*) 81 (default:*)
78 82
79 - -t triage mode, display results as a summary table 83 + -t, --triage triage mode, display results as a summary table
80 84
81 (default for multiple files) 85 (default for multiple files)
82 86
83 - -d detailed mode, display full results (default for 87 + -d, --detailed detailed mode, display full results (default for
84 88
85 single file) 89 single file)
86 90
  91 + -a, --analysis display only analysis results, not the macro source
  92 +
  93 + code
  94 +
  95 + -c, --code display only VBA source code, do not analyze it
  96 +
87 -i INPUT, --input=INPUT 97 -i INPUT, --input=INPUT
88 98
89 input file containing VBA source code to be analyzed 99 input file containing VBA source code to be analyzed
@@ -92,7 +102,13 @@ Options:
92 102
93 --decode display all the obfuscated strings with their decoded 103 --decode display all the obfuscated strings with their decoded
94 104
95 - content (Hex, Base64, StrReverse, Dridex). </code></pre> 105 + content (Hex, Base64, StrReverse, Dridex, VBA).
  106 +
  107 + --attr display the attribute lines at the beginning of VBA
  108 +
  109 + source code
  110 +
  111 + --each analyze each VBA module separately</code></pre>
96 <h3 id="examples">Examples</h3> 112 <h3 id="examples">Examples</h3>
97 <p>Scan a single file:</p> 113 <p>Scan a single file:</p>
98 <pre><code>olevba.py file.doc</code></pre> 114 <pre><code>olevba.py file.doc</code></pre>
@@ -249,6 +265,7 @@ ANALYSIS:
249 <li><p><strong>H</strong>: hex-encoded strings (potential obfuscation)</p></li> 265 <li><p><strong>H</strong>: hex-encoded strings (potential obfuscation)</p></li>
250 <li><p><strong>B</strong>: Base64-encoded strings (potential obfuscation)</p></li> 266 <li><p><strong>B</strong>: Base64-encoded strings (potential obfuscation)</p></li>
251 <li><p><strong>D</strong>: Dridex-encoded strings (potential obfuscation)</p></li> 267 <li><p><strong>D</strong>: Dridex-encoded strings (potential obfuscation)</p></li>
  268 +<li><p><strong>V</strong>: VBA string expressions (potential obfuscation)</p></li>
252 </ul> 269 </ul>
253 <p>Here is an example:</p> 270 <p>Here is an example:</p>
254 <pre><code>c:\&gt;olevba.py \MalwareZoo\VBA\samples\* 271 <pre><code>c:\&gt;olevba.py \MalwareZoo\VBA\samples\*
oletools/doc/olevba.md
@@ -7,7 +7,7 @@ and detect security-related patterns such as **auto-executable macros**, **suspi
7 VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques, 7 VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,
8 and potential **IOCs** (IP addresses, URLs, executable filenames, etc). 8 and potential **IOCs** (IP addresses, URLs, executable filenames, etc).
9 It also detects and decodes several common **obfuscation methods including Hex encoding, 9 It also detects and decodes several common **obfuscation methods including Hex encoding,
10 -StrReverse, Base64, Dridex**, and extracts IOCs from decoded strings. 10 +StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings.
11 11
12 It can be used either as a command-line tool, or as a python module from your own applications. 12 It can be used either as a command-line tool, or as a python module from your own applications.
13 13
@@ -34,6 +34,8 @@ by John William Davison, with significant modifications.
34 - Detect suspicious VBA keywords often used by malware 34 - Detect suspicious VBA keywords often used by malware
35 - Detect anti-sandboxing and anti-virtualization techniques 35 - Detect anti-sandboxing and anti-virtualization techniques
36 - Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex 36 - Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex
  37 +- Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, \&, using a VBA parser built with
  38 +[pyparsing](http://pyparsing.wikispaces.com)
37 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names 39 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
38 - Scan multiple files and sample collections (wildcards, recursive) 40 - Scan multiple files and sample collections (wildcards, recursive)
39 - Triage mode for a summary view of multiple files 41 - Triage mode for a summary view of multiple files
@@ -51,11 +53,11 @@ and how they are stored in MS Office documents.
51 ## How it works 53 ## How it works
52 54
53 1. olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away. 55 1. olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away.
54 -1. If it is a zip file (i.e. MS Office 2007+), olevba looks for all OLE files stored in it (e.g. vbaProject.bin), and opens them. 56 +1. If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them.
55 1. olevba identifies all the VBA projects stored in the OLE structure. 57 1. olevba identifies all the VBA projects stored in the OLE structure.
56 1. Each VBA project is parsed to find the corresponding OLE streams containing macro code. 58 1. Each VBA project is parsed to find the corresponding OLE streams containing macro code.
57 1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression). 59 1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
58 -1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex). 60 +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions).
59 1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros 61 1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
60 and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc). 62 and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
61 63
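
The first two steps of "How it works" (checking the file type, then looking for OLE files inside a zip container) can be sketched as follows. This is a minimal illustration using only the standard library, not olevba's own code: the function name `find_ole_containers` is hypothetical, and the XML and MHTML cases mentioned in step 2 are not covered. It relies on the 8-byte signature that starts every OLE2 (Compound File Binary) file.

```python
import io
import zipfile

# Signature found at the start of every OLE2 / Compound File Binary container.
OLE_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'


def find_ole_containers(data):
    """Return a list of (name, bytes) pairs for OLE containers found in data.

    Hypothetical helper: an OLE file is returned as-is, a zip archive is
    searched for members that start with the OLE signature.
    """
    if data.startswith(OLE_MAGIC):
        # Office 97-2003 style document: the file itself is the container.
        return [('<file>', data)]
    found = []
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        # Office 2007+ style document: look at every member of the archive.
        with zipfile.ZipFile(buffer) as archive:
            for name in archive.namelist():
                member = archive.read(name)
                if member.startswith(OLE_MAGIC):
                    found.append((name, member))
    return found
```

Checking the signature of each member, rather than matching on a name such as `vbaProject.bin`, keeps the sketch independent of how the embedded OLE file happens to be named.
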
@@ -75,15 +77,21 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames,
75 if the file is a zip archive, file(s) to be opened 77 if the file is a zip archive, file(s) to be opened
76 within the zip. Wildcards * and ? are supported. 78 within the zip. Wildcards * and ? are supported.
77 (default:*) 79 (default:*)
78 - -t triage mode, display results as a summary table 80 + -t, --triage triage mode, display results as a summary table
79 (default for multiple files) 81 (default for multiple files)
80 - -d detailed mode, display full results (default for 82 + -d, --detailed detailed mode, display full results (default for
81 single file) 83 single file)
  84 + -a, --analysis display only analysis results, not the macro source
  85 + code
  86 + -c, --code display only VBA source code, do not analyze it
82 -i INPUT, --input=INPUT 87 -i INPUT, --input=INPUT
83 input file containing VBA source code to be analyzed 88 input file containing VBA source code to be analyzed
84 (no parsing) 89 (no parsing)
85 --decode display all the obfuscated strings with their decoded 90 --decode display all the obfuscated strings with their decoded
86 - content (Hex, Base64, StrReverse, Dridex). 91 + content (Hex, Base64, StrReverse, Dridex, VBA).
  92 + --attr display the attribute lines at the beginning of VBA
  93 + source code
  94 + --each analyze each VBA module separately
87 95
88 ### Examples 96 ### Examples
89 97
@@ -211,6 +219,7 @@ The following flags show the results of the analysis:
211 - **H**: hex-encoded strings (potential obfuscation) 219 - **H**: hex-encoded strings (potential obfuscation)
212 - **B**: Base64-encoded strings (potential obfuscation) 220 - **B**: Base64-encoded strings (potential obfuscation)
213 - **D**: Dridex-encoded strings (potential obfuscation) 221 - **D**: Dridex-encoded strings (potential obfuscation)
  222 +- **V**: VBA string expressions (potential obfuscation)
214 223
215 Here is an example: 224 Here is an example:
216 225
oletools/olevba.py
@@ -145,6 +145,7 @@ __version__ = '0.31'
145 145
146 #------------------------------------------------------------------------------ 146 #------------------------------------------------------------------------------
147 # TODO: 147 # TODO:
  148 +# + option --fast to disable VBA expressions parsing
148 # + do not use logging, but a provided logger (null logger by default) 149 # + do not use logging, but a provided logger (null logger by default)
149 # + setup logging (common with other oletools) 150 # + setup logging (common with other oletools)
150 # + add xor bruteforcing like bbharvest 151 # + add xor bruteforcing like bbharvest