Commit 4b4775d3d2dae2365639fbbf002fcbcb03a72f0e

Authored by Philippe Lagadec
1 parent f1cefbd4

updated doc for v0.40

oletools/doc/Home.html
1   -<p>python-oletools v0.12 documentation</p>
  1 +<p>python-oletools v0.40 documentation</p>
2 2 <p>===================================</p>
3 3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p>
4 4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
... ...
oletools/doc/Home.md
1   -python-oletools v0.12 documentation
  1 +python-oletools v0.40 documentation
2 2 ===================================
3 3  
4 4 This is the home page of the documentation for python-oletools. The latest version can be found
... ...
oletools/doc/olevba.html
... ... @@ -31,7 +31,7 @@
31 31 <li><p>Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li>
32 32 <li><p>Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with</p></li>
33 33 </ul>
34   -<p><a href="http://pyparsing.wikispaces.com">pyparsing</a></p>
  34 +<p><a href="http://pyparsing.wikispaces.com">pyparsing</a>, including custom Hex and Base64 encodings</p>
35 35 <ul>
36 36 <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li>
37 37 <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li>
... ... @@ -68,7 +68,7 @@ Options:
68 68  
69 69 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
70 70  
71   - if the file is a zip archive, open first file from it,
  71 + if the file is a zip archive, open all files from it,
72 72  
73 73 using the provided password (requires Python 2.6+)
74 74  
... ... @@ -106,9 +106,7 @@ Options:
106 106  
107 107 --attr display the attribute lines at the beginning of VBA
108 108  
109   - source code
110   -
111   - --each analyze each VBA module separately</code></pre>
  109 + source code</code></pre>
112 110 <h3 id="examples">Examples</h3>
113 111 <p>Scan a single file:</p>
114 112 <pre><code>olevba.py file.doc</code></pre>
... ... @@ -292,24 +290,26 @@ OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc&lt;/code&gt;&lt;/pre&gt;
292 290 <p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p>
293 291 <h3 id="import-olevba">Import olevba</h3>
294 292 <p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser class and the file type constants:</p>
295   -<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre>
  293 +<pre><code>from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML </code></pre>
296 294 <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3>
297 295 <p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter.</p>
298 296 <p>For example:</p>
299   -<pre><code>vba = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
  297 +<pre><code>vbaparser = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
300 298 <p>The file may also be provided as a bytes string containing its data. In that case, the actual</p>
301 299 <p>filename must be provided for reference, and the file content with the data parameter. For example:</p>
302 300 <pre><code>myfile = &#39;my_file_with_macros.doc&#39;
303 301  
304 302 filedata = open(myfile, &#39;rb&#39;).read()
305 303  
306   -vba = VBA_Parser(myfile, data=filedata)</code></pre>
307   -<p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML</p>
308   -<p>(MS Office 2007+).</p>
  304 +vbaparser = VBA_Parser(myfile, data=filedata)</code></pre>
  305 +<p>VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML</p>
  306 +<p>(MS Office 2007+), MHTML or Word 2003 XML.</p>
  307 +<p>After parsing the file, the attribute <strong>VBA_Parser.type</strong> is a string indicating the file type.</p>
  308 +<p>It is one of TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML (constants defined in the olevba module).</p>
309 309 <h3 id="detect-vba-macros">Detect VBA macros</h3>
310 310 <p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file,</p>
311 311 <p>False otherwise.</p>
312   -<pre><code>if vba.detect_vba_macros():
  312 +<pre><code>if vbaparser.detect_vba_macros():
313 313  
314 314 print &#39;VBA Macros found&#39;
315 315  
... ... @@ -334,7 +334,7 @@ else:
334 334 <li><p>vba_code: string containing the VBA source code in clear text</p></li>
335 335 </ul>
336 336 <p>Example:</p>
337   -<pre><code>for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros():
  337 +<pre><code>for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
338 338  
339 339 print &#39;-&#39;*79
340 340  
... ... @@ -347,7 +347,44 @@ else:
347 347 print &#39;- &#39;*39
348 348  
349 349 print vba_code</code></pre>
  350 +<p>Alternatively, the VBA_Parser method <strong>extract_all_macros</strong> returns the same results as a list of tuples.</p>
350 351 <h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3>
  352 +<p>Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained</p>
  353 +<p>in a file:</p>
  354 +<p>The methods <strong>scan</strong> or <strong>scan_summary</strong> from the class <strong>VBA_Parser</strong> can be used to scan the source code of all</p>
  355 +<p>VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p>
  356 +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded</p>
  357 +<p>strings found in the code (Hex, Base64, Dridex) with their decoded value.</p>
  358 +<p>By default, only strings that contain exclusively printable characters are included.</p>
  359 +<p><strong>VBA_Parser.scan()</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
  360 +<ul>
  361 +<li>type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or</li>
  362 +</ul>
  363 +<p>'VBA obfuscated Strings'.</p>
  364 +<ul>
  365 +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is</li>
  366 +</ul>
  367 +<p>the decoded value of the string.</p>
  368 +<ul>
  369 +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li>
  370 +</ul>
  371 +<p>Example:</p>
  372 +<pre><code>results = vbaparser.scan()
  373 +
  374 +for kw_type, keyword, description in results:
  375 +
  376 + print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
  377 +<p><strong>VBA_Parser.scan_summary()</strong> returns a tuple with the number of items found for each category:</p>
  378 +<p>(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).</p>
  379 +<h3 id="close-the-vba_parser">Close the VBA_Parser</h3>
  380 +<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p>
  381 +<p>especially if your application is parsing many files.</p>
  382 +<pre><code>vbaparser.close()</code></pre>
  383 +<hr />
  384 +<h2 id="deprecated-api">Deprecated API</h2>
  385 +<p>The following methods and functions are still functional, but their usage is not recommended</p>
  386 +<p>since they have been replaced by better solutions.</p>
  387 +<h3 id="vba_scanner-deprecated">VBA_Scanner (deprecated)</h3>
351 388 <p>Note: this API is under active development and may change in the future.</p>
352 389 <p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings,</p>
353 390 <p>suspicious keywords, IOCs, auto-executable macros, etc.</p>
... ... @@ -449,10 +486,6 @@ if patterns:
449 486 else:
450 487  
451 488 print &#39;Patterns: None found&#39;</code></pre>
452   -<h3 id="close-the-vba_parser">Close the VBA_Parser</h3>
453   -<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p>
454   -<p>especially if your application is parsing many files.</p>
455   -<pre><code>vba.close()</code></pre>
456 489 <hr />
457 490 <p>python-oletools documentation</p>
458 491 <hr />
... ...
oletools/doc/olevba.md
... ... @@ -35,7 +35,7 @@ by John William Davison, with significant modifications.
35 35 - Detect anti-sandboxing and anti-virtualization techniques
36 36 - Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex
37 37 - Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with
38   -[pyparsing](http://pyparsing.wikispaces.com)
  38 +[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings
39 39 - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
40 40 - Scan multiple files and sample collections (wildcards, recursive)
41 41 - Triage mode for a summary view of multiple files
... ... @@ -71,7 +71,7 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames,
71 71 -h, --help show this help message and exit
72 72 -r find files recursively in subdirectories.
73 73 -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
74   - if the file is a zip archive, open first file from it,
  74 + if the file is a zip archive, open all files from it,
75 75 using the provided password (requires Python 2.6+)
76 76 -f ZIP_FNAME, --zipfname=ZIP_FNAME
77 77 if the file is a zip archive, file(s) to be opened
... ... @@ -91,7 +91,6 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames,
91 91 content (Hex, Base64, StrReverse, Dridex, VBA).
92 92 --attr display the attribute lines at the beginning of VBA
93 93 source code
94   - --each analyze each VBA module separately
95 94  
96 95 ### Examples
97 96  
... ... @@ -249,7 +248,7 @@ IMPORTANT: olevba is currently under active development, therefore this API is l
249 248 First, import the **oletools.olevba** package, using at least the VBA_Parser class and the file type constants:
250 249  
251 250 :::python
252   - from oletools.olevba import VBA_Parser, VBA_Scanner
  251 + from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
253 252  
254 253 ### Parse a MS Office file
255 254  
... ... @@ -257,7 +256,7 @@ To parse a file on disk, create an instance of the **VBA_Parser** class, providi
257 256 For example:
258 257  
259 258 :::python
260   - vba = VBA_Parser('my_file_with_macros.doc')
  259 + vbaparser = VBA_Parser('my_file_with_macros.doc')
261 260  
262 261 The file may also be provided as a bytes string containing its data. In that case, the actual
263 262 filename must be provided for reference, and the file content with the data parameter. For example:
... ... @@ -265,10 +264,13 @@ filename must be provided for reference, and the file content with the data para
265 264 :::python
266 265 myfile = 'my_file_with_macros.doc'
267 266 filedata = open(myfile, 'rb').read()
268   - vba = VBA_Parser(myfile, data=filedata)
  267 + vbaparser = VBA_Parser(myfile, data=filedata)
269 268  
270   -VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML
271   -(MS Office 2007+).
  269 +VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML
  270 +(MS Office 2007+), MHTML or Word 2003 XML.
  271 +
  272 +After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type.
  273 +It is one of TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML (constants defined in the olevba module).
272 274  
273 275 ### Detect VBA macros
274 276  
... ... @@ -276,7 +278,7 @@ The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macr
276 278 False otherwise.
277 279  
278 280 :::python
279   - if vba.detect_vba_macros():
  281 + if vbaparser.detect_vba_macros():
280 282 print 'VBA Macros found'
281 283 else:
282 284 print 'No VBA Macros found'
... ... @@ -304,16 +306,66 @@ for each VBA macro found.
304 306 Example:
305 307  
306 308 :::python
307   - for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros():
  309 + for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
308 310 print '-'*79
309 311 print 'Filename :', filename
310 312 print 'OLE stream :', stream_path
311 313 print 'VBA filename:', vba_filename
312 314 print '- '*39
313 315 print vba_code
  316 +
  317 +Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples.
314 318  
315 319 ### Analyze VBA Source Code
316 320  
  321 +Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained
  322 +in a file:
  323 +
  324 +The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all
  325 +VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
  326 +
  327 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  328 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
  329 +By default, only strings that contain exclusively printable characters are included.
  330 +
  331 +**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results.
  332 +
  333 +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or
  334 + 'VBA obfuscated Strings'.
  335 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  336 + the decoded value of the string.
  337 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  338 +
  339 +Example:
  340 +
  341 + :::python
  342 + results = vbaparser.scan()
  343 + for kw_type, keyword, description in results:
  344 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  345 +
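Because scan() is documented as returning a flat list of (type, keyword, description) tuples, a caller may want to group the items by type before reporting them. The following is a minimal sketch of one way to do that. The helper name `group_by_type` and the sample tuples are hypothetical and are not part of olevba; the sample data only imitates the documented tuple shape so the block runs without oletools installed.

```python
from collections import defaultdict


def group_by_type(results):
    """Group (type, keyword, description) tuples by their type field.

    Hypothetical helper, not part of olevba.
    """
    grouped = defaultdict(list)
    for kw_type, keyword, description in results:
        grouped[kw_type].append((keyword, description))
    return dict(grouped)


# Hypothetical sample shaped like the documented scan() output;
# the keywords and descriptions are illustrative only.
sample_results = [
    ('AutoExec', 'AutoOpen', 'illustrative description'),
    ('Suspicious', 'Shell', 'illustrative description'),
    ('Suspicious', 'CreateObject', 'illustrative description'),
]

grouped = group_by_type(sample_results)
```

With a real parser, `sample_results` would presumably be replaced by the list returned from `vbaparser.scan()`.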
  346 +**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category:
  347 +(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).
  348 +
  349 +
  350 +
  351 +### Close the VBA_Parser
  352 +
  353 +After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,
  354 +especially if your application is parsing many files.
  355 +
  356 + :::python
  357 + vbaparser.close()
  358 +
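One way to make sure close() is always called, even when an exception is raised during analysis, is the standard library's `contextlib.closing`, which works with any object that has a close method. The sketch below uses a hypothetical `FakeParser` stand-in so that it runs without oletools; the documentation above does not state whether VBA_Parser supports the `with` statement directly, so that is not assumed here.

```python
from contextlib import closing


class FakeParser(object):
    """Hypothetical stand-in for VBA_Parser, used only to make this sketch runnable."""

    def __init__(self):
        self.closed = False

    def detect_vba_macros(self):
        return True

    def close(self):
        self.closed = True


parser = FakeParser()
# closing() calls parser.close() when the block exits, whether it exits
# normally or through an exception.
with closing(parser) as p:
    found = p.detect_vba_macros()
```

In real code, `FakeParser()` would presumably be replaced by `VBA_Parser('my_file_with_macros.doc')`.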
  359 +
  360 +--------------------------------------------------------------------------
  361 +
  362 +## Deprecated API
  363 +
  364 +The following methods and functions are still functional, but their usage is not recommended
  365 +since they have been replaced by better solutions.
  366 +
  367 +### VBA_Scanner (deprecated)
  368 +
317 369 Note: this API is under active development and may change in the future.
318 370  
319 371 The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
... ... @@ -423,16 +475,6 @@ Sample usage:
423 475 print 'Patterns: None found'
424 476  
425 477  
426   -### Close the VBA_Parser
427   -
428   -After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,
429   -especially if your application is parsing many files.
430   -
431   - :::python
432   - vba.close()
433   -
434   -
435   -
436 478 --------------------------------------------------------------------------
437 479  
438 480 python-oletools documentation
... ...