Commit 145f062f199b1f6a7858b26cd19efc41c3601648

Authored by Philippe Lagadec
1 parent 7f6a924d

olevba: updated doc to v0.42

oletools/doc/olevba.html
@@ -20,6 +20,8 @@
 <li><p>Excel 97-2003 (.xls)</p></li>
 <li><p>Excel 2007+ (.xlsm, .xlsb)</p></li>
 <li><p>PowerPoint 2007+ (.pptm, .ppsm)</p></li>
+<li><p>Text file containing VBA or VBScript source code</p></li>
+<li><p>Password-protected Zip archive containing any of the above</p></li>
 </ul>
 <h2 id="main-features">Main Features</h2>
 <ul>
@@ -297,7 +299,7 @@ OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc&lt;/code&gt;&lt;/pre&gt;
 <h3 id="import-olevba">Import olevba</h3>
 <p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p>
 <pre><code>from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML </code></pre>
-<h3 id="parse-a-ms-office-file">Parse a MS Office file</h3>
+<h3 id="parse-a-ms-office-file---vba_parser">Parse a MS Office file - VBA_Parser</h3>
 <p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter.</p>
 <p>For example:</p>
 <pre><code>vbaparser = VBA_Parser(&#39;my_file_with_macros.doc&#39;)</code></pre>
@@ -357,12 +359,12 @@ else:
 <h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3>
 <p>Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained</p>
 <p>in a file:</p>
-<p>The methods <strong>scan</strong> or <strong>scan_summary</strong> from the class <strong>VBA_Parser</strong> can be used to scan the source code of all</p>
+<p>The method <strong>analyze_macros</strong> from the class <strong>VBA_Parser</strong> can be used to scan the source code of all</p>
 <p>VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p>
-<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded</p>
+<p>analyze_macros() takes an optional argument show_decoded_strings: if set to True, the results will contain all the encoded</p>
 <p>strings found in the code (Hex, Base64, Dridex) with their decoded value.</p>
-<p>By default, it will include the strings which contain printable characters only.</p>
-<p><strong>VBA_Parser.scan()</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
+<p>By default, it will only include the strings which contain printable characters.</p>
+<p><strong>VBA_Parser.analyze_macros()</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p>
 <ul>
 <li>type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or</li>
 </ul>
@@ -375,13 +377,31 @@ else:
 <li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li>
 </ul>
 <p>Example:</p>
-<pre><code>results = vbaparser.scan()
+<pre><code>results = vbaparser.analyze_macros()
 
 for kw_type, keyword, description in results:
 
     print &#39;type=%s - keyword=%s - description=%s&#39; % (kw_type, keyword, description)</code></pre>
-<p><strong>VBA_Parser.scan_summary()</strong> returns a tuple with the number of items found for each category:</p>
-<p>(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).</p>
+<p>After calling analyze_macros, the following VBA_Parser attributes also provide the number</p>
+<p>of items found for each category:</p>
+<pre><code>print &#39;AutoExec keywords: %d&#39; % vbaparser.nb_autoexec
+
+print &#39;Suspicious keywords: %d&#39; % vbaparser.nb_suspicious
+
+print &#39;IOCs: %d&#39; % vbaparser.nb_iocs
+
+print &#39;Hex obfuscated strings: %d&#39; % vbaparser.nb_hexstrings
+
+print &#39;Base64 obfuscated strings: %d&#39; % vbaparser.nb_base64strings
+
+print &#39;Dridex obfuscated strings: %d&#39; % vbaparser.nb_dridexstrings
+
+print &#39;VBA obfuscated strings: %d&#39; % vbaparser.nb_vbastrings</code></pre>
+<h3 id="deobfuscate-vba-macro-source-code">Deobfuscate VBA Macro Source Code</h3>
+<p>The method <strong>reveal</strong> attempts to deobfuscate the macro source code by replacing all</p>
+<p>the obfuscated strings by their decoded content. Returns a single string.</p>
+<p>Example:</p>
+<pre><code>print vbaparser.reveal()</code></pre>
 <h3 id="close-the-vba_parser">Close the VBA_Parser</h3>
 <p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p>
 <p>especially if your application is parsing many files.</p>
@@ -391,7 +411,6 @@ for kw_type, keyword, description in results:
 <p>The following methods and functions are still functional, but their usage is not recommended</p>
 <p>since they have been replaced by better solutions.</p>
 <h3 id="vba_scanner-deprecated">VBA_Scanner (deprecated)</h3>
-<p>Note: this API is under active development and may change in the future.</p>
 <p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings,</p>
 <p>suspicious keywords, IOCs, auto-executable macros, etc.</p>
 <p>First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the</p>
oletools/doc/olevba.md
@@ -25,6 +25,8 @@ by John William Davison, with significant modifications.
 - Excel 97-2003 (.xls)
 - Excel 2007+ (.xlsm, .xlsb)
 - PowerPoint 2007+ (.pptm, .ppsm)
+- Text file containing VBA or VBScript source code
+- Password-protected Zip archive containing any of the above
 
 ## Main Features
 
@@ -257,7 +259,7 @@ First, import the **oletools.olevba** package, using at least the VBA_Parser and
     :::python
     from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
 
-### Parse a MS Office file
+### Parse a MS Office file - VBA_Parser
 
 To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
 For example:
@@ -323,29 +325,19 @@ Example:
 
 Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples.
 
-### Extract Experimental Deobfuscated VBA Macro Source Code
-
-The method **reveal** extracts, decompresses, and deofuscates VBA source code into a single string.
-
-Example:
-
-    :::python
-    print vbaparser.reveal()
-
-
 ### Analyze VBA Source Code
 
 Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained
 in a file:
 
-The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all
+The method **analyze_macros** from the class **VBA_Parser** can be used to scan the source code of all
 VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
 
-scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
+analyze_macros() takes an optional argument show_decoded_strings: if set to True, the results will contain all the encoded
 strings found in the code (Hex, Base64, Dridex) with their decoded value.
-By default, it will include the strings which contain printable characters only.
+By default, it will only include the strings which contain printable characters.
 
-**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results.
+**VBA_Parser.analyze_macros()** returns a list of tuples (type, keyword, description), one for each item in the results.
 
 - type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or
 'VBA obfuscated Strings'.
@@ -356,13 +348,32 @@ By default, it will include the strings which contain printable characters only.
 Example:
 
     :::python
-    results = vbaparser.scan()
+    results = vbaparser.analyze_macros()
     for kw_type, keyword, description in results:
         print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
 
-**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category:
-(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).
+After calling analyze_macros, the following VBA_Parser attributes also provide the number
+of items found for each category:
 
+    :::python
+    print 'AutoExec keywords: %d' % vbaparser.nb_autoexec
+    print 'Suspicious keywords: %d' % vbaparser.nb_suspicious
+    print 'IOCs: %d' % vbaparser.nb_iocs
+    print 'Hex obfuscated strings: %d' % vbaparser.nb_hexstrings
+    print 'Base64 obfuscated strings: %d' % vbaparser.nb_base64strings
+    print 'Dridex obfuscated strings: %d' % vbaparser.nb_dridexstrings
+    print 'VBA obfuscated strings: %d' % vbaparser.nb_vbastrings
+
+
+### Deobfuscate VBA Macro Source Code
+
+The method **reveal** attempts to deobfuscate the macro source code by replacing all
+the obfuscated strings by their decoded content. Returns a single string.
+
+Example:
+
+    :::python
+    print vbaparser.reveal()
 
 
 ### Close the VBA_Parser
@@ -383,8 +394,6 @@ since they have been replaced by better solutions.
 
 ### VBA_Scanner (deprecated)
 
-Note: this API is under active development and may change in the future.
-
 The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
 suspicious keywords, IOCs, auto-executable macros, etc.
 
@@ -508,4 +517,4 @@ python-oletools documentation
 - [[oletimes]]
 - [[olevba]]
 - [[pyxswf]]
-- [[rtfobj]]
\ No newline at end of file
+- [[rtfobj]]
\ No newline at end of file
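
Note on the updated doc: it describes the return value of `analyze_macros()` as a list of `(type, keyword, description)` tuples. As a rough sketch of how such a list could be tallied by type without relying on the parser object, the snippet below uses only the standard library. The helper name `count_by_type` and the sample tuples are hypothetical and are not part of oletools; real results would come from `vbaparser.analyze_macros()`. Whether these tallies always match the `nb_*` attributes is not something the diff confirms.

```python
from collections import Counter


def count_by_type(results):
    """Tally (type, keyword, description) tuples by their type field."""
    return Counter(kw_type for kw_type, _keyword, _description in results)


# Hypothetical sample data shaped like the tuples described in the doc;
# in real use this list would be returned by vbaparser.analyze_macros().
sample_results = [
    ('AutoExec', 'AutoOpen', 'Runs when the Word document is opened'),
    ('Suspicious', 'Shell', 'May run an executable file or a system command'),
    ('Suspicious', 'CreateObject', 'May create an OLE object'),
    ('IOC', 'http://example.com/payload.exe', 'URL'),
]

counts = count_by_type(sample_results)
for kw_type in sorted(counts):
    # Single-argument print() behaves the same on Python 2 and Python 3.
    print('%s: %d' % (kw_type, counts[kw_type]))
```

A `Counter` returns 0 for types that never appear, so callers can look up any of the documented type names without checking for missing keys first.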