Commit 4b4775d3d2dae2365639fbbf002fcbcb03a72f0e

Authored by Philippe Lagadec
1 parent f1cefbd4

updated doc for v0.40

oletools/doc/Home.html
1 -<p>python-oletools v0.12 documentation</p> 1 +<p>python-oletools v0.40 documentation</p>
2 <p>===================================</p> 2 <p>===================================</p>
3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p> 3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p>
4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> 4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
oletools/doc/Home.md
1 -python-oletools v0.12 documentation 1 +python-oletools v0.40 documentation
2 =================================== 2 ===================================
3 3
4 This is the home page of the documentation for python-oletools. The latest version can be found 4 This is the home page of the documentation for python-oletools. The latest version can be found
oletools/doc/olevba.html
@@ -31,7 +31,7 @@ @@ -31,7 +31,7 @@
31 <li><p>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li> 31 <li><p>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li>
32 <li><p>Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with</p></li> 32 <li><p>Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with</p></li>
33 </ul> 33 </ul>
34 -<p><a href="http://pyparsing.wikispaces.com">pyparsing</a></p> 34 +<p><a href="http://pyparsing.wikispaces.com">pyparsing</a>, including custom Hex and Base64 encodings</p>
35 <ul> 35 <ul>
36 <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li> 36 <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li>
37 <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li> 37 <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li>
@@ -68,7 +68,7 @@ Options: @@ -68,7 +68,7 @@ Options:
68 68
69 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD 69 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
70 70
71 - if the file is a zip archive, open first file from it, 71 + if the file is a zip archive, open all files from it,
72 72
73 using the provided password (requires Python 2.6+) 73 using the provided password (requires Python 2.6+)
74 74
@@ -106,9 +106,7 @@ Options: @@ -106,9 +106,7 @@ Options:
106 106
107 --attr display the attribute lines at the beginning of VBA 107 --attr display the attribute lines at the beginning of VBA
108 108
109 - source code  
110 -  
111 - --each analyze each VBA module separately</code></pre> 109 + source code</code></pre>
112 <h3 id="examples">Examples</h3> 110 <h3 id="examples">Examples</h3>
113 <p>Scan a single file:</p> 111 <p>Scan a single file:</p>
114 <pre><code>olevba.py file.doc</code></pre> 112 <pre><code>olevba.py file.doc</code></pre>
@@ -292,24 +290,26 @@ OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre> @@ -292,24 +290,26 @@ OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre>
292 <p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p> 290 <p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p>
293 <h3 id="import-olevba">Import olevba</h3> 291 <h3 id="import-olevba">Import olevba</h3>
294 <p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p> 292 <p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p>
295 -<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre> 293 +<pre><code>from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML </code></pre>
296 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3> 294 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3>
297 <p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter.</p> 295 <p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter.</p>
298 <p>For example:</p> 296 <p>For example:</p>
299 -<pre><code>vba = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre> 297 +<pre><code>vbaparser = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
300 <p>The file may also be provided as a bytes string containing its data. In that case, the actual</p> 298 <p>The file may also be provided as a bytes string containing its data. In that case, the actual</p>
301 <p>filename must be provided for reference, and the file content with the data parameter. For example:</p> 299 <p>filename must be provided for reference, and the file content with the data parameter. For example:</p>
302 <pre><code>myfile = &#39;my_file_with_macros.doc&#39; 300 <pre><code>myfile = &#39;my_file_with_macros.doc&#39;
303 301
304 filedata = open(myfile, &#39;rb&#39;).read() 302 filedata = open(myfile, &#39;rb&#39;).read()
305 303
306 -vba = VBA_Parser(myfile, data=filedata)</code></pre>  
307 -<p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML</p>  
308 -<p>(MS Office 2007+).</p> 304 +vbaparser = VBA_Parser(myfile, data=filedata)</code></pre>
  305 +<p>VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML</p>
  306 +<p>(MS Office 2007+), MHTML or Word 2003 XML.</p>
  307 +<p>After parsing the file, the attribute <strong>VBA_Parser.type</strong> is a string indicating the file type.</p>
  308 +<p>It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)</p>
309 <h3 id="detect-vba-macros">Detect VBA macros</h3> 309 <h3 id="detect-vba-macros">Detect VBA macros</h3>
310 <p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file,</p> 310 <p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file,</p>
311 <p>False otherwise.</p> 311 <p>False otherwise.</p>
312 -<pre><code>if vba.detect_vba_macros(): 312 +<pre><code>if vbaparser.detect_vba_macros():
313 313
314 print &#39;VBA Macros found&#39; 314 print &#39;VBA Macros found&#39;
315 315
@@ -334,7 +334,7 @@ else: @@ -334,7 +334,7 @@ else:
334 <li><p>vba_code: string containing the VBA source code in clear text</p></li> 334 <li><p>vba_code: string containing the VBA source code in clear text</p></li>
335 </ul> 335 </ul>
336 <p>Example:</p> 336 <p>Example:</p>
337 -<pre><code>for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros(): 337 +<pre><code>for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
338 338
339 print &#39;-&#39;*79 339 print &#39;-&#39;*79
340 340
@@ -347,7 +347,44 @@ else: @@ -347,7 +347,44 @@ else:
347 print &#39;- &#39;*39 347 print &#39;- &#39;*39
348 348
349 print vba_code</code></pre> 349 print vba_code</code></pre>
  350 +<p>Alternatively, the VBA_Parser method <strong>extract_all_macros</strong> returns the same results as a list of tuples.</p>
350 <h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3> 351 <h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3>
  352 +<p>Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained</p>
  353 +<p>in a file:</p>
  354 +<p>The methods <strong>scan</strong> or <strong>scan_summary</strong> from the class <strong>VBA_Parser</strong> can be used to scan the source code of all</p>
  355 +<p>VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p>
  356 +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded</p>
  357 +<p>strings found in the code (Hex, Base64, Dridex) with their decoded value.</p>
  358 +<p>By default, it will include the strings which contain printable characters only.</p>
  359 +<p><strong>VBA_Parser.scan()</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
  360 +<ul>
  361 +<li>type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or</li>
  362 +</ul>
  363 +<p>'VBA obfuscated Strings'.</p>
  364 +<ul>
  365 +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is</li>
  366 +</ul>
  367 +<p>the decoded value of the string.</p>
  368 +<ul>
  369 +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li>
  370 +</ul>
  371 +<p>Example:</p>
  372 +<pre><code>results = vbaparser.scan()
  373 +
  374 +for kw_type, keyword, description in results:
  375 +
  376 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  377 +<p><strong>VBA_Parser.scan_summary()</strong> returns a tuple with the number of items found for each category:</p>
  378 +<p>(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).</p>
  379 +<h3 id="close-the-vba_parser">Close the VBA_Parser</h3>
  380 +<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p>
  381 +<p>especially if your application is parsing many files.</p>
  382 +<pre><code>vbaparser.close()</code></pre>
  383 +<hr />
  384 +<h2 id="deprecated-api">Deprecated API</h2>
  385 +<p>The following methods and functions are still functional, but their usage is not recommended</p>
  386 +<p>since they have been replaced by better solutions.</p>
  387 +<h3 id="vba_scanner-deprecated">VBA_Scanner (deprecated)</h3>
351 <p>Note: this API is under active development and may change in the future.</p> 388 <p>Note: this API is under active development and may change in the future.</p>
352 <p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings,</p> 389 <p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings,</p>
353 <p>suspicious keywords, IOCs, auto-executable macros, etc.</p> 390 <p>suspicious keywords, IOCs, auto-executable macros, etc.</p>
@@ -449,10 +486,6 @@ if patterns: @@ -449,10 +486,6 @@ if patterns:
449 else: 486 else:
450 487
451 print &#39;Patterns: None found&#39;</code></pre> 488 print &#39;Patterns: None found&#39;</code></pre>
452 -<h3 id="close-the-vba_parser">Close the VBA_Parser</h3>  
453 -<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p>  
454 -<p>especially if your application is parsing many files.</p>  
455 -<pre><code>vba.close()</code></pre>  
456 <hr /> 489 <hr />
457 <p>python-oletools documentation</p> 490 <p>python-oletools documentation</p>
458 <hr /> 491 <hr />
oletools/doc/olevba.md
@@ -35,7 +35,7 @@ by John William Davison, with significant modifications. @@ -35,7 +35,7 @@ by John William Davison, with significant modifications.
35 - Detect anti-sandboxing and anti-virtualization techniques 35 - Detect anti-sandboxing and anti-virtualization techniques
36 - Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex 36 - Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex
37 - Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with 37 - Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with
38 -[pyparsing](http://pyparsing.wikispaces.com) 38 +[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings
39 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names 39 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
40 - Scan multiple files and sample collections (wildcards, recursive) 40 - Scan multiple files and sample collections (wildcards, recursive)
41 - Triage mode for a summary view of multiple files 41 - Triage mode for a summary view of multiple files
@@ -71,7 +71,7 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, @@ -71,7 +71,7 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames,
71 -h, --help show this help message and exit 71 -h, --help show this help message and exit
72 -r find files recursively in subdirectories. 72 -r find files recursively in subdirectories.
73 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD 73 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
74 - if the file is a zip archive, open first file from it, 74 + if the file is a zip archive, open all files from it,
75 using the provided password (requires Python 2.6+) 75 using the provided password (requires Python 2.6+)
76 -f ZIP_FNAME, --zipfname=ZIP_FNAME 76 -f ZIP_FNAME, --zipfname=ZIP_FNAME
77 if the file is a zip archive, file(s) to be opened 77 if the file is a zip archive, file(s) to be opened
@@ -91,7 +91,6 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, @@ -91,7 +91,6 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames,
91 content (Hex, Base64, StrReverse, Dridex, VBA). 91 content (Hex, Base64, StrReverse, Dridex, VBA).
92 --attr display the attribute lines at the beginning of VBA 92 --attr display the attribute lines at the beginning of VBA
93 source code 93 source code
94 - --each analyze each VBA module separately  
95 94
96 ### Examples 95 ### Examples
97 96
@@ -249,7 +248,7 @@ IMPORTANT: olevba is currently under active development, therefore this API is l @@ -249,7 +248,7 @@ IMPORTANT: olevba is currently under active development, therefore this API is l
249 First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes: 248 First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes:
250 249
251 :::python 250 :::python
252 - from oletools.olevba import VBA_Parser, VBA_Scanner 251 + from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
253 252
254 ### Parse a MS Office file 253 ### Parse a MS Office file
255 254
@@ -257,7 +256,7 @@ To parse a file on disk, create an instance of the **VBA_Parser** class, providi @@ -257,7 +256,7 @@ To parse a file on disk, create an instance of the **VBA_Parser** class, providi
257 For example: 256 For example:
258 257
259 :::python 258 :::python
260 - vba = VBA_Parser('my_file_with_macros.doc') 259 + vbaparser = VBA_Parser('my_file_with_macros.doc')
261 260
262 The file may also be provided as a bytes string containing its data. In that case, the actual 261 The file may also be provided as a bytes string containing its data. In that case, the actual
263 filename must be provided for reference, and the file content with the data parameter. For example: 262 filename must be provided for reference, and the file content with the data parameter. For example:
@@ -265,10 +264,13 @@ filename must be provided for reference, and the file content with the data para @@ -265,10 +264,13 @@ filename must be provided for reference, and the file content with the data para
265 :::python 264 :::python
266 myfile = 'my_file_with_macros.doc' 265 myfile = 'my_file_with_macros.doc'
267 filedata = open(myfile, 'rb').read() 266 filedata = open(myfile, 'rb').read()
268 - vba = VBA_Parser(myfile, data=filedata) 267 + vbaparser = VBA_Parser(myfile, data=filedata)
269 268
270 -VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML  
271 -(MS Office 2007+). 269 +VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML
  270 +(MS Office 2007+), MHTML or Word 2003 XML.
  271 +
  272 +After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type.
  273 +It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)
272 274
273 ### Detect VBA macros 275 ### Detect VBA macros
274 276
@@ -276,7 +278,7 @@ The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macr @@ -276,7 +278,7 @@ The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macr
276 False otherwise. 278 False otherwise.
277 279
278 :::python 280 :::python
279 - if vba.detect_vba_macros(): 281 + if vbaparser.detect_vba_macros():
280 print 'VBA Macros found' 282 print 'VBA Macros found'
281 else: 283 else:
282 print 'No VBA Macros found' 284 print 'No VBA Macros found'
@@ -304,16 +306,66 @@ for each VBA macro found. @@ -304,16 +306,66 @@ for each VBA macro found.
304 Example: 306 Example:
305 307
306 :::python 308 :::python
307 - for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros(): 309 + for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
308 print '-'*79 310 print '-'*79
309 print 'Filename :', filename 311 print 'Filename :', filename
310 print 'OLE stream :', stream_path 312 print 'OLE stream :', stream_path
311 print 'VBA filename:', vba_filename 313 print 'VBA filename:', vba_filename
312 print '- '*39 314 print '- '*39
313 print vba_code 315 print vba_code
  316 +
  317 +Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples.
314 318
315 ### Analyze VBA Source Code 319 ### Analyze VBA Source Code
316 320
  321 +Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained
  322 +in a file:
  323 +
  324 +The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all
  325 +VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
  326 +
  327 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  328 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
  329 +By default, it will include the strings which contain printable characters only.
  330 +
  331 +**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results.
  332 +
  333 +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or
  334 + 'VBA obfuscated Strings'.
  335 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  336 + the decoded value of the string.
  337 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  338 +
  339 +Example:
  340 +
  341 + :::python
  342 + results = vbaparser.scan()
  343 + for kw_type, keyword, description in results:
  344 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  345 +
  346 +**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category:
  347 +(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).
  348 +
  349 +
  350 +
  351 +### Close the VBA_Parser
  352 +
  353 +After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,
  354 +especially if your application is parsing many files.
  355 +
  356 + :::python
  357 + vbaparser.close()
  358 +
  359 +
  360 +--------------------------------------------------------------------------
  361 +
  362 +## Deprecated API
  363 +
  364 +The following methods and functions are still functional, but their usage is not recommended
  365 +since they have been replaced by better solutions.
  366 +
  367 +### VBA_Scanner (deprecated)
  368 +
317 Note: this API is under active development and may change in the future. 369 Note: this API is under active development and may change in the future.
318 370
319 The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings, 371 The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
@@ -423,16 +475,6 @@ Sample usage: @@ -423,16 +475,6 @@ Sample usage:
423 print 'Patterns: None found' 475 print 'Patterns: None found'
424 476
425 477
426 -### Close the VBA_Parser  
427 -  
428 -After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,  
429 -especially if your application is parsing many files.  
430 -  
431 - :::python  
432 - vba.close()  
433 -  
434 -  
435 -  
436 -------------------------------------------------------------------------- 478 --------------------------------------------------------------------------
437 479
438 python-oletools documentation 480 python-oletools documentation