Commit 4b4775d3d2dae2365639fbbf002fcbcb03a72f0e
1 parent f1cefbd4
updated doc for v0.40
Showing 4 changed files with 113 additions and 38 deletions
oletools/doc/Home.html
| 1 | -<p>python-oletools v0.12 documentation</p> | |
| 1 | +<p>python-oletools v0.40 documentation</p> | |
| 2 | 2 | <p>===================================</p> |
| 3 | 3 | <p>This is the home page of the documentation for python-oletools. The latest version can be found</p> |
| 4 | 4 | <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> |
oletools/doc/Home.md
oletools/doc/olevba.html
| ... | ... | @@ -31,7 +31,7 @@ |
| 31 | 31 | <li><p>Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex</p></li> |
| 32 | 32 | <li><p>Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with</p></li> |
| 33 | 33 | </ul> |
| 34 | -<p><a href="http://pyparsing.wikispaces.com">pyparsing</a></p> | |
| 34 | +<p><a href="http://pyparsing.wikispaces.com">pyparsing</a>, including custom Hex and Base64 encodings</p> | |
| 35 | 35 | <ul> |
| 36 | 36 | <li><p>Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names</p></li> |
| 37 | 37 | <li><p>Scan multiple files and sample collections (wildcards, recursive)</p></li> |
| ... | ... | @@ -68,7 +68,7 @@ Options: |
| 68 | 68 | |
| 69 | 69 | -z ZIP_PASSWORD, --zip=ZIP_PASSWORD |
| 70 | 70 | |
| 71 | - if the file is a zip archive, open first file from it, | |
| 71 | + if the file is a zip archive, open all files from it, | |
| 72 | 72 | |
| 73 | 73 | using the provided password (requires Python 2.6+) |
| 74 | 74 | |
| ... | ... | @@ -106,9 +106,7 @@ Options: |
| 106 | 106 | |
| 107 | 107 | --attr display the attribute lines at the beginning of VBA |
| 108 | 108 | |
| 109 | - source code | |
| 110 | - | |
| 111 | - --each analyze each VBA module separately</code></pre> | |
| 109 | + source code</code></pre> | |
| 112 | 110 | <h3 id="examples">Examples</h3> |
| 113 | 111 | <p>Scan a single file:</p> |
| 114 | 112 | <pre><code>olevba.py file.doc</code></pre> |
| ... | ... | @@ -292,24 +290,26 @@ OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc</code></pre> |
| 292 | 290 | <p>IMPORTANT: olevba is currently under active development, therefore this API is likely to change.</p> |
| 293 | 291 | <h3 id="import-olevba">Import olevba</h3> |
| 294 | 292 | <p>First, import the <strong>oletools.olevba</strong> package, using at least the VBA_Parser and VBA_Scanner classes:</p> |
| 295 | -<pre><code>from oletools.olevba import VBA_Parser, VBA_Scanner</code></pre> | |
| 293 | +<pre><code>from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML </code></pre> | |
| 296 | 294 | <h3 id="parse-a-ms-office-file">Parse a MS Office file</h3> |
| 297 | 295 | <p>To parse a file on disk, create an instance of the <strong>VBA_Parser</strong> class, providing the name of the file to open as parameter.</p> |
| 298 | 296 | <p>For example:</p> |
| 299 | -<pre><code>vba = VBA_Parser('my_file_with_macros.doc')</code></pre> | |
| 297 | +<pre><code>vbaparser = VBA_Parser('my_file_with_macros.doc')</code></pre> | |
| 300 | 298 | <p>The file may also be provided as a bytes string containing its data. In that case, the actual</p> |
| 301 | 299 | <p>filename must be provided for reference, and the file content with the data parameter. For example:</p> |
| 302 | 300 | <pre><code>myfile = 'my_file_with_macros.doc' |
| 303 | 301 | |
| 304 | 302 | filedata = open(myfile, 'rb').read() |
| 305 | 303 | |
| 306 | -vba = VBA_Parser(myfile, data=filedata)</code></pre> | |
| 307 | -<p>VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML</p> | |
| 308 | -<p>(MS Office 2007+).</p> | |
| 304 | +vbaparser = VBA_Parser(myfile, data=filedata)</code></pre> | |
| 305 | +<p>VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML</p> | |
| 306 | +<p>(MS Office 2007+), MHTML or Word 2003 XML.</p> | |
| 307 | +<p>After parsing the file, the attribute <strong>VBA_Parser.type</strong> is a string indicating the file type.</p> | |
| 308 | +<p>It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)</p> | |
| 309 | 309 | <h3 id="detect-vba-macros">Detect VBA macros</h3> |
| 310 | 310 | <p>The method <strong>detect_vba_macros</strong> of a VBA_Parser object returns True if VBA macros have been found in the file,</p> |
| 311 | 311 | <p>False otherwise.</p> |
| 312 | -<pre><code>if vba.detect_vba_macros(): | |
| 312 | +<pre><code>if vbaparser.detect_vba_macros(): | |
| 313 | 313 | |
| 314 | 314 | print 'VBA Macros found' |
| 315 | 315 | |
| ... | ... | @@ -334,7 +334,7 @@ else: |
| 334 | 334 | <li><p>vba_code: string containing the VBA source code in clear text</p></li> |
| 335 | 335 | </ul> |
| 336 | 336 | <p>Example:</p> |
| 337 | -<pre><code>for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros(): | |
| 337 | +<pre><code>for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros(): | |
| 338 | 338 | |
| 339 | 339 | print '-'*79 |
| 340 | 340 | |
| ... | ... | @@ -347,7 +347,44 @@ else: |
| 347 | 347 | print '- '*39 |
| 348 | 348 | |
| 349 | 349 | print vba_code</code></pre> |
| 350 | +<p>Alternatively, the VBA_Parser method <strong>extract_all_macros</strong> returns the same results as a list of tuples.</p> | |
| 350 | 351 | <h3 id="analyze-vba-source-code">Analyze VBA Source Code</h3> |
| 352 | +<p>Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained</p> | |
| 353 | +<p>in a file:</p> | |
| 354 | +<p>The methods <strong>scan</strong> or <strong>scan_summary</strong> from the class <strong>VBA_Parser</strong> can be used to scan the source code of all</p> | |
| 355 | +<p>VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.</p> | |
| 356 | +<p>scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded</p> | |
| 357 | +<p>strings found in the code (Hex, Base64, Dridex) with their decoded value.</p> | |
| 358 | +<p>By default, it will include the strings which contain printable characters only.</p> | |
| 359 | +<p><strong>VBA_Parser.scan()</strong> returns a list of tuples (type, keyword, description), one for each item in the results.</p> | |
| 360 | +<ul> | |
| 361 | +<li>type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or</li> | |
| 362 | +</ul> | |
| 363 | +<p>'VBA obfuscated Strings'.</p> | |
| 364 | +<ul> | |
| 365 | +<li>keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is</li> | |
| 366 | +</ul> | |
| 367 | +<p>the decoded value of the string.</p> | |
| 368 | +<ul> | |
| 369 | +<li>description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.</li> | |
| 370 | +</ul> | |
| 371 | +<p>Example:</p> | |
| 372 | +<pre><code>results = vbaparser.scan() | |
| 373 | + | |
| 374 | +for kw_type, keyword, description in results: | |
| 375 | + | |
| 376 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)</code></pre> | |
| 377 | +<p><strong>VBA_Parser.scan_summary()</strong> returns a tuple with the number of items found for each category:</p> | |
| 378 | +<p>(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).</p> | |
| 379 | +<h3 id="close-the-vba_parser">Close the VBA_Parser</h3> | |
| 380 | +<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p> | |
| 381 | +<p>especially if your application is parsing many files.</p> | |
| 382 | +<pre><code>vbaparser.close()</code></pre> | |
| 383 | +<hr /> | |
| 384 | +<h2 id="deprecated-api">Deprecated API</h2> | |
| 385 | +<p>The following methods and functions are still functional, but their usage is not recommended</p> | |
| 386 | +<p>since they have been replaced by better solutions.</p> | |
| 387 | +<h3 id="vba_scanner-deprecated">VBA_Scanner (deprecated)</h3> | |
| 351 | 388 | <p>Note: this API is under active development and may change in the future.</p> |
| 352 | 389 | <p>The class <strong>VBA_Scanner</strong> can be used to scan the source code of a VBA module to find obfuscated strings,</p> |
| 353 | 390 | <p>suspicious keywords, IOCs, auto-executable macros, etc.</p> |
| ... | ... | @@ -449,10 +486,6 @@ if patterns: |
| 449 | 486 | else: |
| 450 | 487 | |
| 451 | 488 | print 'Patterns: None found'</code></pre> |
| 452 | -<h3 id="close-the-vba_parser">Close the VBA_Parser</h3> | |
| 453 | -<p>After usage, it is better to call the <strong>close</strong> method of the VBA_Parser object, to make sure the file is closed,</p> | |
| 454 | -<p>especially if your application is parsing many files.</p> | |
| 455 | -<pre><code>vba.close()</code></pre> | |
| 456 | 489 | <hr /> |
| 457 | 490 | <p>python-oletools documentation</p> |
| 458 | 491 | <hr /> |
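The updated documentation in this diff describes `VBA_Parser.scan()` as returning a list of `(type, keyword, description)` tuples. A minimal sketch of tallying such a list per type, similar in spirit to the counts that `scan_summary()` is documented to return, might look like this. The helper name `count_by_type` and the sample tuples are made up for illustration and are not part of oletools; in real use the list would come from `vbaparser.scan()`.

```python
from collections import Counter


def count_by_type(results):
    """Count scan results per type.

    `results` is expected to be a list of (type, keyword, description)
    tuples, the shape documented for VBA_Parser.scan().
    """
    return Counter(kw_type for kw_type, _keyword, _description in results)


# Hypothetical sample data, NOT real olevba output.
sample_results = [
    ('AutoExec', 'AutoOpen', 'illustrative description'),
    ('Suspicious', 'Shell', 'illustrative description'),
    ('Suspicious', 'CreateObject', 'illustrative description'),
    ('IOC', 'http://example.com/payload.exe', 'illustrative description'),
]

counts = count_by_type(sample_results)
for kw_type in sorted(counts):
    print('%s: %d' % (kw_type, counts[kw_type]))
```

A `Counter` returns 0 for types that never appear, so callers can look up any of the documented type strings without checking for missing keys first.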
oletools/doc/olevba.md
| ... | ... | @@ -35,7 +35,7 @@ by John William Davison, with significant modifications. |
| 35 | 35 | - Detect anti-sandboxing and anti-virtualization techniques |
| 36 | 36 | - Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex |
| 37 | 37 | - Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with |
| 38 | -[pyparsing](http://pyparsing.wikispaces.com) | |
| 38 | +[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings | |
| 39 | 39 | - Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names |
| 40 | 40 | - Scan multiple files and sample collections (wildcards, recursive) |
| 41 | 41 | - Triage mode for a summary view of multiple files |
| ... | ... | @@ -71,7 +71,7 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, |
| 71 | 71 | -h, --help show this help message and exit |
| 72 | 72 | -r find files recursively in subdirectories. |
| 73 | 73 | -z ZIP_PASSWORD, --zip=ZIP_PASSWORD |
| 74 | - if the file is a zip archive, open first file from it, | |
| 74 | + if the file is a zip archive, open all files from it, | |
| 75 | 75 | using the provided password (requires Python 2.6+) |
| 76 | 76 | -f ZIP_FNAME, --zipfname=ZIP_FNAME |
| 77 | 77 | if the file is a zip archive, file(s) to be opened |
| ... | ... | @@ -91,7 +91,6 @@ and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, |
| 91 | 91 | content (Hex, Base64, StrReverse, Dridex, VBA). |
| 92 | 92 | --attr display the attribute lines at the beginning of VBA |
| 93 | 93 | source code |
| 94 | - --each analyze each VBA module separately | |
| 95 | 94 | |
| 96 | 95 | ### Examples |
| 97 | 96 | |
| ... | ... | @@ -249,7 +248,7 @@ IMPORTANT: olevba is currently under active development, therefore this API is l |
| 249 | 248 | First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes: |
| 250 | 249 | |
| 251 | 250 | :::python |
| 252 | - from oletools.olevba import VBA_Parser, VBA_Scanner | |
| 251 | + from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML | |
| 253 | 252 | |
| 254 | 253 | ### Parse a MS Office file |
| 255 | 254 | |
| ... | ... | @@ -257,7 +256,7 @@ To parse a file on disk, create an instance of the **VBA_Parser** class, providi |
| 257 | 256 | For example: |
| 258 | 257 | |
| 259 | 258 | :::python |
| 260 | - vba = VBA_Parser('my_file_with_macros.doc') | |
| 259 | + vbaparser = VBA_Parser('my_file_with_macros.doc') | |
| 261 | 260 | |
| 262 | 261 | The file may also be provided as a bytes string containing its data. In that case, the actual |
| 263 | 262 | filename must be provided for reference, and the file content with the data parameter. For example: |
| ... | ... | @@ -265,10 +264,13 @@ filename must be provided for reference, and the file content with the data para |
| 265 | 264 | :::python |
| 266 | 265 | myfile = 'my_file_with_macros.doc' |
| 267 | 266 | filedata = open(myfile, 'rb').read() |
| 268 | - vba = VBA_Parser(myfile, data=filedata) | |
| 267 | + vbaparser = VBA_Parser(myfile, data=filedata) | |
| 269 | 268 | |
| 270 | -VBA_Parser will raise an exception if the file is not a supported format, either OLE (MS Office 97-2003) or OpenXML | |
| 271 | -(MS Office 2007+). | |
| 269 | +VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML | |
| 270 | +(MS Office 2007+), MHTML or Word 2003 XML. | |
| 271 | + | |
| 272 | +After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type. | |
| 273 | +It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module) | |
| 272 | 274 | |
| 273 | 275 | ### Detect VBA macros |
| 274 | 276 | |
| ... | ... | @@ -276,7 +278,7 @@ The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macr |
| 276 | 278 | False otherwise. |
| 277 | 279 | |
| 278 | 280 | :::python |
| 279 | - if vba.detect_vba_macros(): | |
| 281 | + if vbaparser.detect_vba_macros(): | |
| 280 | 282 | print 'VBA Macros found' |
| 281 | 283 | else: |
| 282 | 284 | print 'No VBA Macros found' |
| ... | ... | @@ -304,16 +306,66 @@ for each VBA macro found. |
| 304 | 306 | Example: |
| 305 | 307 | |
| 306 | 308 | :::python |
| 307 | - for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros(): | |
| 309 | + for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros(): | |
| 308 | 310 | print '-'*79 |
| 309 | 311 | print 'Filename :', filename |
| 310 | 312 | print 'OLE stream :', stream_path |
| 311 | 313 | print 'VBA filename:', vba_filename |
| 312 | 314 | print '- '*39 |
| 313 | 315 | print vba_code |
| 316 | + | |
| 317 | +Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples. | |
| 314 | 318 | |
| 315 | 319 | ### Analyze VBA Source Code |
| 316 | 320 | |
| 321 | +Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained | |
| 322 | +in a file: | |
| 323 | + | |
| 324 | +The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all | |
| 325 | +VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc. | |
| 326 | + | |
| 327 | +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 328 | +strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 329 | +By default, it will include the strings which contain printable characters only. | |
| 330 | + | |
| 331 | +**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 332 | + | |
| 333 | +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or | |
| 334 | + 'VBA obfuscated Strings'. | |
| 335 | +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 336 | + the decoded value of the string. | |
| 337 | +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 338 | + | |
| 339 | +Example: | |
| 340 | + | |
| 341 | + :::python | |
| 342 | + results = vbaparser.scan() | |
| 343 | + for kw_type, keyword, description in results: | |
| 344 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 345 | + | |
| 346 | +**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category: | |
| 347 | +(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings). | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | +### Close the VBA_Parser | |
| 352 | + | |
| 353 | +After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed, | |
| 354 | +especially if your application is parsing many files. | |
| 355 | + | |
| 356 | + :::python | |
| 357 | + vbaparser.close() | |
| 358 | + | |
| 359 | + | |
| 360 | +-------------------------------------------------------------------------- | |
| 361 | + | |
| 362 | +## Deprecated API | |
| 363 | + | |
| 364 | +The following methods and functions are still functional, but their usage is not recommended | |
| 365 | +since they have been replaced by better solutions. | |
| 366 | + | |
| 367 | +### VBA_Scanner (deprecated) | |
| 368 | + | |
| 317 | 369 | Note: this API is under active development and may change in the future. |
| 318 | 370 | |
| 319 | 371 | The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings, |
| ... | ... | @@ -423,16 +475,6 @@ Sample usage: |
| 423 | 475 | print 'Patterns: None found' |
| 424 | 476 | |
| 425 | 477 | |
| 426 | -### Close the VBA_Parser | |
| 427 | - | |
| 428 | -After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed, | |
| 429 | -especially if your application is parsing many files. | |
| 430 | - | |
| 431 | - :::python | |
| 432 | - vba.close() | |
| 433 | - | |
| 434 | - | |
| 435 | - | |
| 436 | 478 | -------------------------------------------------------------------------- |
| 437 | 479 | |
| 438 | 480 | python-oletools documentation |
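The "Close the VBA_Parser" section moved by this commit recommends calling `close()` after use, especially when parsing many files. One way to make that call hard to forget is the standard library's `contextlib.closing`, sketched below. `FakeParser` and `has_macros` are hypothetical names introduced only so the example runs without oletools installed; `FakeParser` imitates just the two methods the documentation mentions.

```python
from contextlib import closing


class FakeParser(object):
    """Hypothetical stand-in for VBA_Parser so the sketch runs without oletools."""

    def __init__(self, filename):
        self.filename = filename
        self.closed = False

    def detect_vba_macros(self):
        # Stand-in behaviour only; the real method inspects the file.
        return False

    def close(self):
        self.closed = True


def has_macros(filename, parser_class=FakeParser):
    """Return (macros_found, parser) and always close the parser."""
    # closing() calls parser.close() on exit, even if an exception is raised.
    with closing(parser_class(filename)) as parser:
        found = parser.detect_vba_macros()
    return found, parser


found, parser = has_macros('my_file_with_macros.doc')
print('macros found: %s, parser closed: %s' % (found, parser.closed))
```

Since `closing()` only requires an object with a `close()` method, passing the real `VBA_Parser` class as `parser_class` should behave the same way, assuming it is constructed from a filename as the documentation shows.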