Commit c47b13c16461fd24a8217e12419989c0a81721fc
1 parent dfd6b4f0
updated documentation for v0.41
Showing 7 changed files with 569 additions and 551 deletions
oletools/README.html
| ... | ... | @@ -4,7 +4,8 @@ |
| 4 | 4 | <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p> |
| 5 | 5 | <h2 id="news">News</h2> |
| 6 | 6 | <ul> |
| 7 | -<li><strong>2015-09-17 v0.40</strong>: Improved macro deobfuscation in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to decode Hex and Base64 within VBA expressions. Display printable deobfuscated strings by default. Improved the VBA_Parser API. Improved performance. Fixed <a href="https://bitbucket.org/decalage/oletools/issue/23">issue #23</a> with sys.stderr.</li> | |
| 7 | +<li><strong>2015-09-22 v0.41</strong>: added new --reveal option to <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to show the macro code with VBA strings deobfuscated.</li> | |
| 8 | +<li>2015-09-17 v0.40: Improved macro deobfuscation in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to decode Hex and Base64 within VBA expressions. Display printable deobfuscated strings by default. Improved the VBA_Parser API. Improved performance. Fixed <a href="https://bitbucket.org/decalage/oletools/issue/23">issue #23</a> with sys.stderr.</li> | |
| 8 | 9 | <li>2015-06-19 v0.12: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now deobfuscate VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with <a href="http://pyparsing.wikispaces.com">pyparsing</a>. New options to display only the analysis results or only the macros source code. The analysis is now done on all the VBA modules at once.</li> |
| 9 | 10 | <li>2015-05-29 v0.11: Improved parsing of MHTML and ActiveMime/MSO files in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, added several suspicious keywords to VBA scanner (thanks to <span class="citation">@ozhermit</span> and Davy Douhine for the suggestions)</li> |
| 10 | 11 | <li>2015-05-06 v0.10: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word MHTML files with macros, aka "Single File Web Page" (.mht) - see <a href="https://bitbucket.org/decalage/oletools/issue/10">issue #10</a> for more info</li> | ... | ... |
oletools/README.rst
| ... | ... | @@ -26,7 +26,10 @@ Software. |
| 26 | 26 | News |
| 27 | 27 | ---- |
| 28 | 28 | |
| 29 | -- **2015-09-17 v0.40**: Improved macro deobfuscation in | |
| 29 | +- **2015-09-22 v0.41**: added new --reveal option to | |
| 30 | + `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__, to | |
| 31 | + show the macro code with VBA strings deobfuscated. | |
| 32 | +- 2015-09-17 v0.40: Improved macro deobfuscation in | |
| 30 | 33 | `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__, to |
| 31 | 34 | decode Hex and Base64 within VBA expressions. Display printable |
| 32 | 35 | deobfuscated strings by default. Improved the VBA\_Parser API. | ... | ... |
oletools/doc/Home.html
| 1 | -<p>python-oletools v0.40 documentation</p> | |
| 1 | +<p>python-oletools v0.41 documentation</p> | |
| 2 | 2 | <p>===================================</p> |
| 3 | 3 | <p>This is the home page of the documentation for python-oletools. The latest version can be found</p> |
| 4 | 4 | <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p> | ... | ... |
oletools/doc/Home.md
| 1 | -python-oletools v0.40 documentation | |
| 2 | -=================================== | |
| 3 | - | |
| 4 | -This is the home page of the documentation for python-oletools. The latest version can be found | |
| 5 | -[online](https://bitbucket.org/decalage/oletools/wiki), otherwise a copy is provided in the doc subfolder of the package. | |
| 6 | - | |
| 7 | -[python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze | |
| 8 | -[Microsoft OLE2 files](http://en.wikipedia.org/wiki/Compound_File_Binary_Format) | |
| 9 | -(also called Structured Storage, Compound File Binary Format or Compound Document File Format), | |
| 10 | -such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. | |
| 11 | -It is based on the [olefile](http://www.decalage.info/olefile) parser. | |
| 12 | -See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info. | |
| 13 | - | |
| 14 | -**Quick links:** [Home page](http://www.decalage.info/python/oletools) - | |
| 15 | -[Download/Install](https://bitbucket.org/decalage/oletools/wiki/Install) - | |
| 16 | -[Documentation](https://bitbucket.org/decalage/oletools/wiki) - | |
| 17 | -[Report Issues/Suggestions/Questions](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) - | |
| 18 | -[Contact the author](http://decalage.info/contact) - | |
| 19 | -[Repository](https://bitbucket.org/decalage/oletools) - | |
| 20 | -[Updates on Twitter](https://twitter.com/decalage2) | |
| 21 | - | |
| 22 | -Note: python-oletools is not related to OLETools published by BeCubed Software. | |
| 23 | - | |
| 24 | -Tools in python-oletools: | |
| 25 | -------------------------- | |
| 26 | - | |
| 27 | -- **[[olebrowse]]**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to | |
| 28 | - view and extract individual data streams. | |
| 29 | -- **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files. | |
| 30 | -- **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files. | |
| 31 | -- **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages. | |
| 32 | -- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML). | |
| 33 | -- **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may | |
| 34 | - be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, | |
| 35 | - which is especially useful for malware analysis. | |
| 36 | -- **[[rtfobj]]**: a tool and python module to extract embedded objects from RTF files. | |
| 37 | -- and a few others (coming soon) | |
| 38 | - | |
| 39 | --------------------------------------------------------------------------- | |
| 40 | - | |
| 41 | -python-oletools documentation | |
| 42 | ------------------------------ | |
| 43 | - | |
| 44 | -- [[Home]] | |
| 45 | -- [[License]] | |
| 46 | -- [[Install]] | |
| 47 | -- [[Contribute]], Suggest Improvements or Report Issues | |
| 48 | -- Tools: | |
| 49 | - - [[olebrowse]] | |
| 50 | - - [[oleid]] | |
| 51 | - - [[olemeta]] | |
| 52 | - - [[oletimes]] | |
| 53 | - - [[olevba]] | |
| 54 | - - [[pyxswf]] | |
| 1 | +python-oletools v0.41 documentation | |
| 2 | +=================================== | |
| 3 | + | |
| 4 | +This is the home page of the documentation for python-oletools. The latest version can be found | |
| 5 | +[online](https://bitbucket.org/decalage/oletools/wiki), otherwise a copy is provided in the doc subfolder of the package. | |
| 6 | + | |
| 7 | +[python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze | |
| 8 | +[Microsoft OLE2 files](http://en.wikipedia.org/wiki/Compound_File_Binary_Format) | |
| 9 | +(also called Structured Storage, Compound File Binary Format or Compound Document File Format), | |
| 10 | +such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. | |
| 11 | +It is based on the [olefile](http://www.decalage.info/olefile) parser. | |
| 12 | +See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info. | |
| 13 | + | |
| 14 | +**Quick links:** [Home page](http://www.decalage.info/python/oletools) - | |
| 15 | +[Download/Install](https://bitbucket.org/decalage/oletools/wiki/Install) - | |
| 16 | +[Documentation](https://bitbucket.org/decalage/oletools/wiki) - | |
| 17 | +[Report Issues/Suggestions/Questions](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) - | |
| 18 | +[Contact the author](http://decalage.info/contact) - | |
| 19 | +[Repository](https://bitbucket.org/decalage/oletools) - | |
| 20 | +[Updates on Twitter](https://twitter.com/decalage2) | |
| 21 | + | |
| 22 | +Note: python-oletools is not related to OLETools published by BeCubed Software. | |
| 23 | + | |
| 24 | +Tools in python-oletools: | |
| 25 | +------------------------- | |
| 26 | + | |
| 27 | +- **[[olebrowse]]**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to | |
| 28 | + view and extract individual data streams. | |
| 29 | +- **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files. | |
| 30 | +- **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files. | |
| 31 | +- **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages. | |
| 32 | +- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML). | |
| 33 | +- **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may | |
| 34 | + be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, | |
| 35 | + which is especially useful for malware analysis. | |
| 36 | +- **[[rtfobj]]**: a tool and python module to extract embedded objects from RTF files. | |
| 37 | +- and a few others (coming soon) | |
| 38 | + | |
| 39 | +-------------------------------------------------------------------------- | |
| 40 | + | |
| 41 | +python-oletools documentation | |
| 42 | +----------------------------- | |
| 43 | + | |
| 44 | +- [[Home]] | |
| 45 | +- [[License]] | |
| 46 | +- [[Install]] | |
| 47 | +- [[Contribute]], Suggest Improvements or Report Issues | |
| 48 | +- Tools: | |
| 49 | + - [[olebrowse]] | |
| 50 | + - [[oleid]] | |
| 51 | + - [[olemeta]] | |
| 52 | + - [[oletimes]] | |
| 53 | + - [[olevba]] | |
| 54 | + - [[pyxswf]] | |
| 55 | 55 | - [[rtfobj]] |
| 56 | 56 | \ No newline at end of file | ... | ... |
oletools/doc/olevba.html
| ... | ... | @@ -106,7 +106,11 @@ Options: |
| 106 | 106 | |
| 107 | 107 | --attr display the attribute lines at the beginning of VBA |
| 108 | 108 | |
| 109 | - source code</code></pre> | |
| 109 | + source code | |
| 110 | + | |
| 111 | + --reveal display the macro source code after replacing all the | |
| 112 | + | |
| 113 | + obfuscated strings by their decoded content.</code></pre> | |
| 110 | 114 | <h3 id="examples">Examples</h3> |
| 111 | 115 | <p>Scan a single file:</p> |
| 112 | 116 | <pre><code>olevba.py file.doc</code></pre> |
| ... | ... | @@ -114,6 +118,8 @@ Options: |
| 114 | 118 | <pre><code>olevba.py malicious_file.xls.zip -z infected</code></pre> |
| 115 | 119 | <p>Scan a single file, showing all obfuscated strings decoded:</p> |
| 116 | 120 | <pre><code>olevba.py file.doc --decode</code></pre> |
| 121 | +<p>Scan a single file, showing the macro source code with VBA strings deobfuscated:</p> | |
| 122 | +<pre><code>olevba.py file.doc --reveal</code></pre> | |
| 117 | 123 | <p>Scan VBA source code extracted into a text file:</p> |
| 118 | 124 | <pre><code>olevba.py -i source_code.vba</code></pre> |
| 119 | 125 | <p>Scan a collection of files stored in a folder:</p> | ... | ... |
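Review note on the new `--reveal` option: the help text above says it displays the macro source "after replacing all the obfuscated strings by their decoded content". The sketch below illustrates that idea on a toy scale and is not olevba's implementation (the README says olevba uses a VBA parser built with pyparsing). The helper name `reveal_chr_chains` and both regular expressions are hypothetical, and only chains of `Chr(n)` calls and quoted literals joined with `&` or `+` are handled.

```python
import re

# Hypothetical illustration only: NOT how olevba implements --reveal.
# One term is either Chr(n) / Chr$(n) with a 1-3 digit code, or a "quoted" literal.
_TERM = r'(?:Chr\$?\(\s*\d{1,3}\s*\)|"[^"]*")'
# A chain is two or more terms joined with the VBA operators & or +.
_CHAIN = re.compile(r'%s(?:\s*[&+]\s*%s)+' % (_TERM, _TERM), re.IGNORECASE)
# Captures either the numeric code of a Chr() call or the body of a literal.
_PIECE = re.compile(r'Chr\$?\(\s*(\d{1,3})\s*\)|"([^"]*)"', re.IGNORECASE)


def reveal_chr_chains(vba_code):
    """Replace each Chr()/literal concatenation with one decoded literal."""
    def decode(match):
        parts = []
        for number, literal in _PIECE.findall(match.group(0)):
            # Exactly one of the two groups is non-empty for a Chr() term;
            # an empty literal "" contributes an empty string.
            parts.append(chr(int(number)) if number else literal)
        # VBA escapes a double quote inside a literal by doubling it.
        return '"%s"' % ''.join(parts).replace('"', '""')
    return _CHAIN.sub(decode, vba_code)


if __name__ == '__main__':
    print(reveal_chr_chains('s = Chr(99) & Chr(109) & "d.exe"'))
    # prints: s = "cmd.exe"
```

Lines with no such chain pass through unchanged, so the function can be applied to a whole module's source. Expressions using Asc, Val, StrReverse, Environ, Hex or Base64 are outside the scope of this sketch.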
oletools/doc/olevba.md
| 1 | -olevba | |
| 2 | -====== | |
| 3 | - | |
| 4 | -olevba is a script to parse OLE and OpenXML files such as MS Office documents | |
| 5 | -(e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text, | |
| 6 | -and detect security-related patterns such as **auto-executable macros**, **suspicious | |
| 7 | -VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques, | |
| 8 | -and potential **IOCs** (IP addresses, URLs, executable filenames, etc). | |
| 9 | -It also detects and decodes several common **obfuscation methods including Hex encoding, | |
| 10 | -StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings. | |
| 11 | - | |
| 12 | -It can be used either as a command-line tool, or as a python module from your own applications. | |
| 13 | - | |
| 14 | -It is part of the [python-oletools](http://www.decalage.info/python/oletools) package. | |
| 15 | - | |
| 16 | -olevba is based on source code from [officeparser](https://github.com/unixfreak0037/officeparser) | |
| 17 | -by John William Davison, with significant modifications. | |
| 18 | - | |
| 19 | -## Supported formats | |
| 20 | - | |
| 21 | -- Word 97-2003 (.doc, .dot) | |
| 22 | -- Word 2007+ (.docm, .dotm) | |
| 23 | -- Word 2003 XML (.xml) | |
| 24 | -- Word/Excel MHTML, aka Single File Web Page (.mht) | |
| 25 | -- Excel 97-2003 (.xls) | |
| 26 | -- Excel 2007+ (.xlsm, .xlsb) | |
| 27 | -- PowerPoint 2007+ (.pptm, .ppsm) | |
| 28 | - | |
| 29 | -## Main Features | |
| 30 | - | |
| 31 | -- Detect VBA macros in MS Office 97-2003 and 2007+ files, XML, MHT | |
| 32 | -- Extract VBA macro source code | |
| 33 | -- Detect auto-executable macros | |
| 34 | -- Detect suspicious VBA keywords often used by malware | |
| 35 | -- Detect anti-sandboxing and anti-virtualization techniques | |
| 36 | -- Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex | |
| 37 | -- Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with | |
| 38 | -[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings | |
| 39 | -- Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names | |
| 40 | -- Scan multiple files and sample collections (wildcards, recursive) | |
| 41 | -- Triage mode for a summary view of multiple files | |
| 42 | -- Scan malware samples in password-protected Zip archives | |
| 43 | -- Python API to use olevba from your applications | |
| 44 | - | |
| 45 | -MS Office files encrypted with a password are also supported, because VBA macro code is never | |
| 46 | -encrypted, only the content of the document. | |
| 47 | - | |
| 48 | -## About VBA Macros | |
| 49 | - | |
| 50 | -See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros | |
| 51 | -and how they are stored in MS Office documents. | |
| 52 | - | |
| 53 | -## How it works | |
| 54 | - | |
| 55 | -1. olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away. | |
| 56 | -1. If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them. | |
| 57 | -1. olevba identifies all the VBA projects stored in the OLE structure. | |
| 58 | -1. Each VBA project is parsed to find the corresponding OLE streams containing macro code. | |
| 59 | -1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression). | |
| 60 | -1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions). | |
| 61 | -1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros | |
| 62 | -and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc). | |
| 63 | - | |
| 64 | - | |
| 65 | -## Usage | |
| 66 | - | |
| 67 | - :::text | |
| 68 | - Usage: olevba.py [options] <filename> [filename2 ...] | |
| 69 | - | |
| 70 | - Options: | |
| 71 | - -h, --help show this help message and exit | |
| 72 | - -r find files recursively in subdirectories. | |
| 73 | - -z ZIP_PASSWORD, --zip=ZIP_PASSWORD | |
| 74 | - if the file is a zip archive, open all files from it, | |
| 75 | - using the provided password (requires Python 2.6+) | |
| 76 | - -f ZIP_FNAME, --zipfname=ZIP_FNAME | |
| 77 | - if the file is a zip archive, file(s) to be opened | |
| 78 | - within the zip. Wildcards * and ? are supported. | |
| 79 | - (default:*) | |
| 80 | - -t, --triage triage mode, display results as a summary table | |
| 81 | - (default for multiple files) | |
| 82 | - -d, --detailed detailed mode, display full results (default for | |
| 83 | - single file) | |
| 84 | - -a, --analysis display only analysis results, not the macro source | |
| 85 | - code | |
| 86 | - -c, --code display only VBA source code, do not analyze it | |
| 87 | - -i INPUT, --input=INPUT | |
| 88 | - input file containing VBA source code to be analyzed | |
| 89 | - (no parsing) | |
| 90 | - --decode display all the obfuscated strings with their decoded | |
| 91 | - content (Hex, Base64, StrReverse, Dridex, VBA). | |
| 92 | - --attr display the attribute lines at the beginning of VBA | |
| 93 | - source code | |
| 94 | - | |
| 95 | -### Examples | |
| 96 | - | |
| 97 | -Scan a single file: | |
| 98 | - | |
| 99 | - :::text | |
| 100 | - olevba.py file.doc | |
| 101 | - | |
| 102 | -Scan a single file, stored in a Zip archive with password "infected": | |
| 103 | - | |
| 104 | - :::text | |
| 105 | - olevba.py malicious_file.xls.zip -z infected | |
| 106 | - | |
| 107 | -Scan a single file, showing all obfuscated strings decoded: | |
| 108 | - | |
| 109 | - :::text | |
| 110 | - olevba.py file.doc --decode | |
| 111 | - | |
| 112 | -Scan VBA source code extracted into a text file: | |
| 113 | - | |
| 114 | - :::text | |
| 115 | - olevba.py -i source_code.vba | |
| 116 | - | |
| 117 | -Scan a collection of files stored in a folder: | |
| 118 | - | |
| 119 | - :::text | |
| 120 | - olevba.py MalwareZoo/VBA/* | |
| 121 | - | |
| 122 | -Scan all .doc and .xls files, recursively in all subfolders: | |
| 123 | - | |
| 124 | - :::text | |
| 125 | - olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r | |
| 126 | - | |
| 127 | -Scan all .doc files within all .zip files with password, recursively: | |
| 128 | - | |
| 129 | - :::text | |
| 130 | - olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc | |
| 131 | - | |
| 132 | - | |
| 133 | -### Detailed analysis mode (default for single file) | |
| 134 | - | |
| 135 | -When a single file is scanned, or when using the option -d, all details of the analysis are displayed. | |
| 136 | - | |
| 137 | -For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/): | |
| 138 | - | |
| 139 | - :::text | |
| 140 | - >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected | |
| 141 | - =============================================================================== | |
| 142 | - FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip | |
| 143 | - Type: OLE | |
| 144 | - ------------------------------------------------------------------------------- | |
| 145 | - VBA MACRO ThisDocument.cls | |
| 146 | - in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument | |
| 147 | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
| 148 | - Option Explicit | |
| 149 | - Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_ | |
| 150 | - ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _ | |
| 151 | - ByVal HQTLDG As Long) As Long | |
| 152 | - Sub AutoOpen() | |
| 153 | - Auto_Open | |
| 154 | - End Sub | |
| 155 | - Sub Auto_Open() | |
| 156 | - SNVJYQ | |
| 157 | - End Sub | |
| 158 | - Public Sub SNVJYQ() | |
| 159 | - [Malicious Code...] | |
| 160 | - End Sub | |
| 161 | - Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean | |
| 162 | - [Malicious Code...] | |
| 163 | - Application.DisplayAlerts = False | |
| 164 | - Application.Quit | |
| 165 | - End Function | |
| 166 | - Sub Workbook_Open() | |
| 167 | - Auto_Open | |
| 168 | - End Sub | |
| 169 | - | |
| 170 | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
| 171 | - ANALYSIS: | |
| 172 | - +------------+----------------------+-----------------------------------------+ | |
| 173 | - | Type | Keyword | Description | | |
| 174 | - +------------+----------------------+-----------------------------------------+ | |
| 175 | - | AutoExec | AutoOpen | Runs when the Word document is opened | | |
| 176 | - | AutoExec | Auto_Open | Runs when the Excel Workbook is opened | | |
| 177 | - | AutoExec | Workbook_Open | Runs when the Excel Workbook is opened | | |
| 178 | - | Suspicious | Lib | May run code from a DLL | | |
| 179 | - | Suspicious | Shell | May run an executable file or a system | | |
| 180 | - | | | command | | |
| 181 | - | Suspicious | Environ | May read system environment variables | | |
| 182 | - | Suspicious | URLDownloadToFileA | May download files from the Internet | | |
| 183 | - | IOC | http://germanya.com. | URL | | |
| 184 | - | | ec/logs/test.exe" | | | |
| 185 | - | IOC | http://germanya.com. | URL | | |
| 186 | - | | ec/logs/counter.php" | | | |
| 187 | - | IOC | germanya.com | Executable file name | | |
| 188 | - | IOC | test.exe | Executable file name | | |
| 189 | - | IOC | sfjozjero.exe | Executable file name | | |
| 190 | - +------------+----------------------+-----------------------------------------+ | |
| 191 | - | |
| 192 | -### Triage mode (default for multiple files) | |
| 193 | - | |
| 194 | -When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. | |
| 195 | -This is more convenient for quick triage of a collection of suspicious files. | |
| 196 | - | |
| 197 | -The following flags show the results of the analysis: | |
| 198 | - | |
| 199 | -- **OLE**: the file type is OLE, for example MS Office 97-2003 | |
| 200 | -- **OpX**: the file type is OpenXML, for example MS Office 2007+ | |
| 201 | -- **XML**: the file type is Word 2003 XML | |
| 202 | -- **MHT**: the file type is Word MHTML, aka Single File Web Page (.mht) | |
| 203 | -- **?**: the file type is not supported | |
| 204 | -- **M**: contains VBA Macros | |
| 205 | -- **A**: auto-executable macros | |
| 206 | -- **S**: suspicious VBA keywords | |
| 207 | -- **I**: potential IOCs | |
| 208 | -- **H**: hex-encoded strings (potential obfuscation) | |
| 209 | -- **B**: Base64-encoded strings (potential obfuscation) | |
| 210 | -- **D**: Dridex-encoded strings (potential obfuscation) | |
| 211 | -- **V**: VBA string expressions (potential obfuscation) | |
| 212 | - | |
| 213 | -Here is an example: | |
| 214 | - | |
| 215 | - :::text | |
| 216 | - c:\>olevba.py \MalwareZoo\VBA\samples\* | |
| 217 | - Flags Filename | |
| 218 | - ----------- ----------------------------------------------------------------- | |
| 219 | - OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware | |
| 220 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware | |
| 221 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware | |
| 222 | - OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware | |
| 223 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware | |
| 224 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware | |
| 225 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware | |
| 226 | - OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware | |
| 227 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware | |
| 228 | - OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware | |
| 229 | - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware | |
| 230 | - OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc | |
| 231 | - OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc | |
| 232 | - OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware | |
| 233 | - OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware | |
| 234 | - OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc | |
| 235 | - | |
| 236 | - | |
| 237 | --------------------------------------------------------------------------- | |
| 238 | - | |
| 239 | -## How to use olevba in Python applications | |
| 240 | - | |
| 241 | -olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code | |
| 242 | -from your own python applications. | |
| 243 | - | |
| 244 | -IMPORTANT: olevba is currently under active development, therefore this API is likely to change. | |
| 245 | - | |
| 246 | -### Import olevba | |
| 247 | - | |
| 248 | -First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes: | |
| 249 | - | |
| 250 | - :::python | |
| 251 | - from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML | |
| 252 | - | |
| 253 | -### Parse a MS Office file | |
| 254 | - | |
| 255 | -To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter. | |
| 256 | -For example: | |
| 257 | - | |
| 258 | - :::python | |
| 259 | - vbaparser = VBA_Parser('my_file_with_macros.doc') | |
| 260 | - | |
| 261 | -The file may also be provided as a bytes string containing its data. In that case, the actual | |
| 262 | -filename must be provided for reference, and the file content with the data parameter. For example: | |
| 263 | - | |
| 264 | - :::python | |
| 265 | - myfile = 'my_file_with_macros.doc' | |
| 266 | - filedata = open(myfile, 'rb').read() | |
| 267 | - vbaparser = VBA_Parser(myfile, data=filedata) | |
| 268 | - | |
| 269 | -VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML | |
| 270 | -(MS Office 2007+), MHTML or Word 2003 XML. | |
| 271 | - | |
| 272 | -After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type. | |
| 273 | -It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module) | |
| 274 | - | |
| 275 | -### Detect VBA macros | |
| 276 | - | |
| 277 | -The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file, | |
| 278 | -False otherwise. | |
| 279 | - | |
| 280 | - :::python | |
| 281 | - if vbaparser.detect_vba_macros(): | |
| 282 | - print 'VBA Macros found' | |
| 283 | - else: | |
| 284 | - print 'No VBA Macros found' | |
| 285 | - | |
| 286 | -Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine | |
| 287 | -for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will | |
| 288 | -always return False because VBA Macros are stored in a different way which is not yet supported by olevba. | |
| 289 | - | |
| 290 | -Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method | |
| 291 | -may return True if the embedded document contains VBA Macros, even if the main document does not. | |
| 292 | - | |
| 293 | -### Extract VBA Macro Source Code | |
| 294 | - | |
| 295 | -The method **extract_macros** extracts and decompresses source code for each VBA macro found in the file (possibly | |
| 296 | -including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) | |
| 297 | -for each VBA macro found. | |
| 298 | - | |
| 299 | -- filename: If the file is OLE (MS Office 97-2003), filename is the path of the file. | |
| 300 | - If the file is OpenXML (MS Office 2007+), filename is the path of the OLE subfile containing VBA macros within the zip archive, | |
| 301 | - e.g. word/vbaProject.bin. | |
| 302 | -- stream_path: path of the OLE stream containing the VBA macro source code | |
| 303 | -- vba_filename: corresponding VBA filename | |
| 304 | -- vba_code: string containing the VBA source code in clear text | |
| 305 | - | |
| 306 | -Example: | |
| 307 | - | |
| 308 | - :::python | |
| 309 | - for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros(): | |
| 310 | - print '-'*79 | |
| 311 | - print 'Filename :', filename | |
| 312 | - print 'OLE stream :', stream_path | |
| 313 | - print 'VBA filename:', vba_filename | |
| 314 | - print '- '*39 | |
| 315 | - print vba_code | |
| 316 | - | |
| 317 | -Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples. | |
| 318 | - | |
| 319 | -### Analyze VBA Source Code | |
| 320 | - | |
| 321 | -Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained | |
| 322 | -in a file: | |
| 323 | - | |
| 324 | -The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all | |
| 325 | -VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc. | |
| 326 | - | |
| 327 | -scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 328 | -strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 329 | -By default, it will include the strings which contain printable characters only. | |
| 330 | - | |
| 331 | -**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 332 | - | |
| 333 | -- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or | |
| 334 | - 'VBA obfuscated Strings'. | |
| 335 | -- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 336 | - the decoded value of the string. | |
| 337 | -- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 338 | - | |
| 339 | -Example: | |
| 340 | - | |
| 341 | - :::python | |
| 342 | - results = vbaparser.scan() | |
| 343 | - for kw_type, keyword, description in results: | |
| 344 | - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 345 | - | |
| 346 | -**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category: | |
| 347 | -(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings). | |
| 348 | - | |
| 349 | - | |
| 350 | - | |
| 351 | -### Close the VBA_Parser | |
| 352 | - | |
| 353 | -After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed, | |
| 354 | -especially if your application is parsing many files. | |
| 355 | - | |
| 356 | - :::python | |
| 357 | - vbaparser.close() | |
| 358 | - | |
| 359 | - | |
| 360 | --------------------------------------------------------------------------- | |
| 361 | - | |
| 362 | -## Deprecated API | |
| 363 | - | |
| 364 | -The following methods and functions are still functional, but their usage is not recommended | |
| 365 | -since they have been replaced by better solutions. | |
| 366 | - | |
| 367 | -### VBA_Scanner (deprecated) | |
| 368 | - | |
| 369 | -Note: this API is under active development and may change in the future. | |
| 370 | - | |
| 371 | -The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings, | |
| 372 | -suspicious keywords, IOCs, auto-executable macros, etc. | |
| 373 | - | |
| 374 | -First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the | |
| 375 | -extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis. | |
| 376 | - | |
| 377 | -scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 378 | -strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 379 | - | |
| 380 | -**scan** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 381 | - | |
| 382 | -- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'. | |
| 383 | -- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 384 | - the decoded value of the string. | |
| 385 | -- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 386 | - | |
| 387 | -Example: | |
| 388 | - | |
| 389 | - :::python | |
| 390 | - vba_scanner = VBA_Scanner(vba_code) | |
| 391 | - results = vba_scanner.scan(include_decoded_strings=True) | |
| 392 | - for kw_type, keyword, description in results: | |
| 393 | - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 394 | - | |
| 395 | -The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan(): | |
| 396 | - | |
| 397 | - :::python | |
| 398 | - results = scan_vba(vba_code, include_decoded_strings=True) | |
| 399 | - for kw_type, keyword, description in results: | |
| 400 | - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 401 | - | |
| 402 | -**scan_summary** returns a tuple with the number of items found for each category: | |
| 403 | -(autoexec, suspicious, IOCs, hex, base64, dridex). | |
| 404 | - | |
| 405 | - | |
| 406 | -### Detect auto-executable macros (deprecated) | |
| 407 | - | |
| 408 | -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 409 | - | |
| 410 | -The function **detect_autoexec** checks if VBA macro code contains specific macro names | |
| 411 | -that will be triggered when the document/workbook is opened, closed, changed, etc. | |
| 412 | - | |
| 413 | -It returns a list of tuples containing two strings, the detected keyword, and the | |
| 414 | -description of the trigger. (See the malware example above) | |
| 415 | - | |
| 416 | -Sample usage: | |
| 417 | - | |
| 418 | - :::python | |
| 419 | - from oletools.olevba import detect_autoexec | |
| 420 | - autoexec_keywords = detect_autoexec(vba_code) | |
| 421 | - if autoexec_keywords: | |
| 422 | - print 'Auto-executable macro keywords found:' | |
| 423 | - for keyword, description in autoexec_keywords: | |
| 424 | - print '%s: %s' % (keyword, description) | |
| 425 | - else: | |
| 426 | - print 'Auto-executable macro keywords: None found' | |
| 427 | - | |
| 428 | - | |
| 429 | -### Detect suspicious VBA keywords (deprecated) | |
| 430 | - | |
| 431 | -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 432 | - | |
| 433 | -The function **detect_suspicious** checks if VBA macro code contains specific | |
| 434 | -keywords often used by malware to act on the system (create files, run | |
| 435 | -commands or applications, write to the registry, etc). | |
| 436 | - | |
| 437 | -It returns a list of tuples containing two strings, the detected keyword, and the | |
| 438 | -description of the corresponding malicious behaviour. (See the malware example above) | |
| 439 | - | |
| 440 | -Sample usage: | |
| 441 | - | |
| 442 | - :::python | |
| 443 | - from oletools.olevba import detect_suspicious | |
| 444 | - suspicious_keywords = detect_suspicious(vba_code) | |
| 445 | - if suspicious_keywords: | |
| 446 | - print 'Suspicious VBA keywords found:' | |
| 447 | - for keyword, description in suspicious_keywords: | |
| 448 | - print '%s: %s' % (keyword, description) | |
| 449 | - else: | |
| 450 | - print 'Suspicious VBA keywords: None found' | |
| 451 | - | |
| 452 | - | |
| 453 | -### Extract potential IOCs (deprecated) | |
| 454 | - | |
| 455 | -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 456 | - | |
| 457 | -The function **detect_patterns** checks if VBA macro code contains specific | |
| 458 | -patterns of interest, that may be useful for malware analysis and detection | |
| 459 | -(potential Indicators of Compromise): IP addresses, e-mail addresses, | |
| 460 | -URLs, executable file names. | |
| 461 | - | |
| 462 | -It returns a list of tuples containing two strings, the pattern type, and the | |
| 463 | -extracted value. (See the malware example above) | |
| 464 | - | |
| 465 | -Sample usage: | |
| 466 | - | |
| 467 | - :::python | |
| 468 | - from oletools.olevba import detect_patterns | |
| 469 | - patterns = detect_patterns(vba_code) | |
| 470 | - if patterns: | |
| 471 | - print 'Patterns found:' | |
| 472 | - for pattern_type, value in patterns: | |
| 473 | - print '%s: %s' % (pattern_type, value) | |
| 474 | - else: | |
| 475 | - print 'Patterns: None found' | |
| 476 | - | |
| 477 | - | |
| 478 | --------------------------------------------------------------------------- | |
| 479 | - | |
| 480 | -python-oletools documentation | |
| 481 | ------------------------------ | |
| 482 | - | |
| 483 | -- [[Home]] | |
| 484 | -- [[License]] | |
| 485 | -- [[Install]] | |
| 486 | -- [[Contribute]], Suggest Improvements or Report Issues | |
| 487 | -- Tools: | |
| 488 | - - [[olebrowse]] | |
| 489 | - - [[oleid]] | |
| 490 | - - [[olemeta]] | |
| 491 | - - [[oletimes]] | |
| 492 | - - [[olevba]] | |
| 493 | - - [[pyxswf]] | |
| 1 | +olevba | |
| 2 | +====== | |
| 3 | + | |
| 4 | +olevba is a script to parse OLE and OpenXML files such as MS Office documents | |
| 5 | +(e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text, | |
| 6 | +and detect security-related patterns such as **auto-executable macros**, **suspicious | |
| 7 | +VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques, | |
| 8 | +and potential **IOCs** (IP addresses, URLs, executable filenames, etc). | |
| 9 | +It also detects and decodes several common **obfuscation methods including Hex encoding, | |
| 10 | +StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings. | |
| 11 | + | |
| 12 | +It can be used either as a command-line tool, or as a python module from your own applications. | |
| 13 | + | |
| 14 | +It is part of the [python-oletools](http://www.decalage.info/python/oletools) package. | |
| 15 | + | |
| 16 | +olevba is based on source code from [officeparser](https://github.com/unixfreak0037/officeparser) | |
| 17 | +by John William Davison, with significant modifications. | |
| 18 | + | |
| 19 | +## Supported formats | |
| 20 | + | |
| 21 | +- Word 97-2003 (.doc, .dot) | |
| 22 | +- Word 2007+ (.docm, .dotm) | |
| 23 | +- Word 2003 XML (.xml) | |
| 24 | +- Word/Excel MHTML, aka Single File Web Page (.mht) | |
| 25 | +- Excel 97-2003 (.xls) | |
| 26 | +- Excel 2007+ (.xlsm, .xlsb) | |
| 27 | +- PowerPoint 2007+ (.pptm, .ppsm) | |
| 28 | + | |
| 29 | +## Main Features | |
| 30 | + | |
| 31 | +- Detect VBA macros in MS Office 97-2003 and 2007+ files, XML, MHT | |
| 32 | +- Extract VBA macro source code | |
| 33 | +- Detect auto-executable macros | |
| 34 | +- Detect suspicious VBA keywords often used by malware | |
| 35 | +- Detect anti-sandboxing and anti-virtualization techniques | |
| 36 | +- Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex | |
| 37 | +- Deobfuscate VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, including custom Hex and Base64 encodings, | |
| 38 | +using a VBA parser built with [pyparsing](http://pyparsing.wikispaces.com) | |
| 39 | +- Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names | |
| 40 | +- Scan multiple files and sample collections (wildcards, recursive) | |
| 41 | +- Triage mode for a summary view of multiple files | |
| 42 | +- Scan malware samples in password-protected Zip archives | |
| 43 | +- Python API to use olevba from your applications | |
| 44 | + | |
| 45 | +MS Office files encrypted with a password are also supported, because VBA macro code is never | |
| 46 | +encrypted, only the content of the document. | |
| 47 | + | |
| 48 | +## About VBA Macros | |
| 49 | + | |
| 50 | +See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros | |
| 51 | +and how they are stored in MS Office documents. | |
| 52 | + | |
| 53 | +## How it works | |
| 54 | + | |
| 55 | +1. olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away. | |
| 56 | +1. If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them. | |
| 57 | +1. olevba identifies all the VBA projects stored in the OLE structure. | |
| 58 | +1. Each VBA project is parsed to find the corresponding OLE streams containing macro code. | |
| 59 | +1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression). | |
| 60 | +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions). | |
| 61 | +1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros | |
| 62 | +and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc). | |
| 63 | + | |
| 64 | + | |
| 65 | +## Usage | |
| 66 | + | |
| 67 | + :::text | |
| 68 | + Usage: olevba.py [options] <filename> [filename2 ...] | |
| 69 | + | |
| 70 | + Options: | |
| 71 | + -h, --help show this help message and exit | |
| 72 | + -r find files recursively in subdirectories. | |
| 73 | + -z ZIP_PASSWORD, --zip=ZIP_PASSWORD | |
| 74 | + if the file is a zip archive, open all files from it, | |
| 75 | + using the provided password (requires Python 2.6+) | |
| 76 | + -f ZIP_FNAME, --zipfname=ZIP_FNAME | |
| 77 | + if the file is a zip archive, file(s) to be opened | |
| 78 | + within the zip. Wildcards * and ? are supported. | |
| 79 | + (default:*) | |
| 80 | + -t, --triage triage mode, display results as a summary table | |
| 81 | + (default for multiple files) | |
| 82 | + -d, --detailed detailed mode, display full results (default for | |
| 83 | + single file) | |
| 84 | + -a, --analysis display only analysis results, not the macro source | |
| 85 | + code | |
| 86 | + -c, --code display only VBA source code, do not analyze it | |
| 87 | + -i INPUT, --input=INPUT | |
| 88 | + input file containing VBA source code to be analyzed | |
| 89 | + (no parsing) | |
| 90 | + --decode display all the obfuscated strings with their decoded | |
| 91 | + content (Hex, Base64, StrReverse, Dridex, VBA). | |
| 92 | + --attr display the attribute lines at the beginning of VBA | |
| 93 | + source code | |
| 94 | + --reveal display the macro source code after replacing all the | |
| 95 | + obfuscated strings by their decoded content. | |
| 96 | + | |
| 97 | +### Examples | |
| 98 | + | |
| 99 | +Scan a single file: | |
| 100 | + | |
| 101 | + :::text | |
| 102 | + olevba.py file.doc | |
| 103 | + | |
| 104 | +Scan a single file, stored in a Zip archive with password "infected": | |
| 105 | + | |
| 106 | + :::text | |
| 107 | + olevba.py malicious_file.xls.zip -z infected | |
| 108 | + | |
| 109 | +Scan a single file, showing all obfuscated strings decoded: | |
| 110 | + | |
| 111 | + :::text | |
| 112 | + olevba.py file.doc --decode | |
| 113 | + | |
| 114 | +Scan a single file, showing the macro source code with VBA strings deobfuscated: | |
| 115 | + | |
| 116 | + :::text | |
| 117 | + olevba.py file.doc --reveal | |
| 118 | + | |
| 119 | +Scan VBA source code extracted into a text file: | |
| 120 | + | |
| 121 | + :::text | |
| 122 | + olevba.py -i source_code.vba | |
| 123 | + | |
| 124 | +Scan a collection of files stored in a folder: | |
| 125 | + | |
| 126 | + :::text | |
| 127 | + olevba.py MalwareZoo/VBA/* | |
| 128 | + | |
| 129 | +Scan all .doc and .xls files, recursively in all subfolders: | |
| 130 | + | |
| 131 | + :::text | |
| 132 | + olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r | |
| 133 | + | |
| 134 | +Scan all .doc files within all .zip files with password, recursively: | |
| 135 | + | |
| 136 | + :::text | |
| 137 | + olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc | |
| 138 | + | |
| 139 | + | |
| 140 | +### Detailed analysis mode (default for single file) | |
| 141 | + | |
| 142 | +When a single file is scanned, or when using the option -d, all details of the analysis are displayed. | |
| 143 | + | |
| 144 | +For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/): | |
| 145 | + | |
| 146 | + :::text | |
| 147 | + >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected | |
| 148 | + =============================================================================== | |
| 149 | + FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip | |
| 150 | + Type: OLE | |
| 151 | + ------------------------------------------------------------------------------- | |
| 152 | + VBA MACRO ThisDocument.cls | |
| 153 | + in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument | |
| 154 | + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
| 155 | + Option Explicit | |
| 156 | + Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_ | |
| 157 | + ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _ | |
| 158 | + ByVal HQTLDG As Long) As Long | |
| 159 | + Sub AutoOpen() | |
| 160 | + Auto_Open | |
| 161 | + End Sub | |
| 162 | + Sub Auto_Open() | |
| 163 | + SNVJYQ | |
| 164 | + End Sub | |
| 165 | + Public Sub SNVJYQ() | |
| 166 | + [Malicious Code...] | |
| 167 | + End Sub | |
| 168 | + Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean | |
| 169 | + [Malicious Code...] | |
| 170 | + Application.DisplayAlerts = False | |
| 171 | + Application.Quit | |
| 172 | + End Function | |
| 173 | + Sub Workbook_Open() | |
| 174 | + Auto_Open | |
| 175 | + End Sub | |
| 176 | + | |
| 177 | + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
| 178 | + ANALYSIS: | |
| 179 | + +------------+----------------------+-----------------------------------------+ | |
| 180 | + | Type | Keyword | Description | | |
| 181 | + +------------+----------------------+-----------------------------------------+ | |
| 182 | + | AutoExec | AutoOpen | Runs when the Word document is opened | | |
| 183 | + | AutoExec | Auto_Open | Runs when the Excel Workbook is opened | | |
| 184 | + | AutoExec | Workbook_Open | Runs when the Excel Workbook is opened | | |
| 185 | + | Suspicious | Lib | May run code from a DLL | | |
| 186 | + | Suspicious | Shell | May run an executable file or a system | | |
| 187 | + | | | command | | |
| 188 | + | Suspicious | Environ | May read system environment variables | | |
| 189 | + | Suspicious | URLDownloadToFileA | May download files from the Internet | | |
| 190 | + | IOC | http://germanya.com. | URL | | |
| 191 | + | | ec/logs/test.exe" | | | |
| 192 | + | IOC | http://germanya.com. | URL | | |
| 193 | + | | ec/logs/counter.php" | | | |
| 194 | + | IOC | germanya.com | Executable file name | | |
| 195 | + | IOC | test.exe | Executable file name | | |
| 196 | + | IOC | sfjozjero.exe | Executable file name | | |
| 197 | + +------------+----------------------+-----------------------------------------+ | |
| 198 | + | |
| 199 | +### Triage mode (default for multiple files) | |
| 200 | + | |
| 201 | +When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed. | |
| 202 | +This is more convenient for quick triage of a collection of suspicious files. | |
| 203 | + | |
| 204 | +The following flags show the results of the analysis: | |
| 205 | + | |
| 206 | +- **OLE**: the file type is OLE, for example MS Office 97-2003 | |
| 207 | +- **OpX**: the file type is OpenXML, for example MS Office 2007+ | |
| 208 | +- **XML**: the file type is Word 2003 XML | |
| 209 | +- **MHT**: the file type is Word MHTML, aka Single File Web Page (.mht) | |
| 210 | +- **?**: the file type is not supported | |
| 211 | +- **M**: contains VBA Macros | |
| 212 | +- **A**: auto-executable macros | |
| 213 | +- **S**: suspicious VBA keywords | |
| 214 | +- **I**: potential IOCs | |
| 215 | +- **H**: hex-encoded strings (potential obfuscation) | |
| 216 | +- **B**: Base64-encoded strings (potential obfuscation) | |
| 217 | +- **D**: Dridex-encoded strings (potential obfuscation) | |
| 218 | +- **V**: VBA string expressions (potential obfuscation) | |
| 219 | + | |
| 220 | +Here is an example: | |
| 221 | + | |
| 222 | + :::text | |
| 223 | + c:\>olevba.py \MalwareZoo\VBA\samples\* | |
| 224 | + Flags Filename | |
| 225 | + ----------- ----------------------------------------------------------------- | |
| 226 | + OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware | |
| 227 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware | |
| 228 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware | |
| 229 | + OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware | |
| 230 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware | |
| 231 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware | |
| 232 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware | |
| 233 | + OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware | |
| 234 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware | |
| 235 | + OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware | |
| 236 | + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware | |
| 237 | + OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc | |
| 238 | + OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc | |
| 239 | + OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware | |
| 240 | + OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware | |
| 241 | + OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc | |
| 242 | + | |
| 243 | + | |
| 244 | +-------------------------------------------------------------------------- | |
| 245 | + | |
| 246 | +## How to use olevba in Python applications | |
| 247 | + | |
| 248 | +olevba may be used from your own Python applications to open an MS Office file, detect whether it contains VBA macros, | |
| 249 | +and extract and analyze the VBA source code. | |
| 250 | + | |
| 251 | +IMPORTANT: olevba is currently under active development, therefore this API is likely to change. | |
| 252 | + | |
| 253 | +### Import olevba | |
| 254 | + | |
| 255 | +First, import from the **oletools.olevba** module at least the VBA_Parser class and the file type constants: | |
| 256 | + | |
| 257 | + :::python | |
| 258 | + from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML | |
| 259 | + | |
| 260 | +### Parse a MS Office file | |
| 261 | + | |
| 262 | +To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter. | |
| 263 | +For example: | |
| 264 | + | |
| 265 | + :::python | |
| 266 | + vbaparser = VBA_Parser('my_file_with_macros.doc') | |
| 267 | + | |
| 268 | +The file may also be provided as a bytes string containing its data. In that case, pass the actual | |
| 269 | +filename for reference, and the file content through the data parameter. For example: | |
| 270 | + | |
| 271 | + :::python | |
| 272 | + myfile = 'my_file_with_macros.doc' | |
| 273 | + filedata = open(myfile, 'rb').read() | |
| 274 | + vbaparser = VBA_Parser(myfile, data=filedata) | |
| 275 | + | |
| 276 | +VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML | |
| 277 | +(MS Office 2007+), MHTML or Word 2003 XML. | |
| 278 | + | |
| 279 | +After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type. | |
| 280 | +It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module) | |
| 281 | + | |
| 282 | +### Detect VBA macros | |
| 283 | + | |
| 284 | +The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file, | |
| 285 | +False otherwise. | |
| 286 | + | |
| 287 | + :::python | |
| 288 | + if vbaparser.detect_vba_macros(): | |
| 289 | + print 'VBA Macros found' | |
| 290 | + else: | |
| 291 | + print 'No VBA Macros found' | |
| 292 | + | |
| 293 | +Note: The detection algorithm looks for streams and storages with specific names in the OLE structure, which works fine | |
| 294 | +for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will | |
| 295 | +always return False because VBA Macros are stored in a different way which is not yet supported by olevba. | |
| 296 | + | |
| 297 | +Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method | |
| 298 | +may return True if the embedded document contains VBA Macros, even if the main document does not. | |
| 299 | + | |
| 300 | +### Extract VBA Macro Source Code | |
| 301 | + | |
| 302 | +The method **extract_macros** extracts and decompresses source code for each VBA macro found in the file (possibly | |
| 303 | +including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code) | |
| 304 | +for each VBA macro found. | |
| 305 | + | |
| 306 | +- filename: If the file is OLE (MS Office 97-2003), filename is the path of the file. | |
| 307 | + If the file is OpenXML (MS Office 2007+), filename is the path of the OLE subfile containing VBA macros within the zip archive, | |
| 308 | + e.g. word/vbaProject.bin. | |
| 309 | +- stream_path: path of the OLE stream containing the VBA macro source code | |
| 310 | +- vba_filename: corresponding VBA filename | |
| 311 | +- vba_code: string containing the VBA source code in clear text | |
| 312 | + | |
| 313 | +Example: | |
| 314 | + | |
| 315 | + :::python | |
| 316 | + for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros(): | |
| 317 | + print '-'*79 | |
| 318 | + print 'Filename :', filename | |
| 319 | + print 'OLE stream :', stream_path | |
| 320 | + print 'VBA filename:', vba_filename | |
| 321 | + print '- '*39 | |
| 322 | + print vba_code | |
| 323 | + | |
| 324 | +Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples. | |
| 325 | + | |
| 326 | +### Analyze VBA Source Code | |
| 327 | + | |
| 328 | +Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained | |
| 329 | +in a file: | |
| 330 | + | |
| 331 | +The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all | |
| 332 | +VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc. | |
| 333 | + | |
| 334 | +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 335 | +strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 336 | +By default, it will include the strings which contain printable characters only. | |
| 337 | + | |
| 338 | +**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 339 | + | |
| 340 | +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or | |
| 341 | + 'VBA obfuscated Strings'. | |
| 342 | +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 343 | + the decoded value of the string. | |
| 344 | +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 345 | + | |
| 346 | +Example: | |
| 347 | + | |
| 348 | + :::python | |
| 349 | + results = vbaparser.scan() | |
| 350 | + for kw_type, keyword, description in results: | |
| 351 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 352 | + | |
| 353 | +**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category: | |
| 354 | +(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings). | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | +### Close the VBA_Parser | |
| 359 | + | |
| 360 | +After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed, | |
| 361 | +especially if your application is parsing many files. | |
| 362 | + | |
| 363 | + :::python | |
| 364 | + vbaparser.close() | |
| 365 | + | |
| 366 | + | |
| 367 | +-------------------------------------------------------------------------- | |
| 368 | + | |
| 369 | +## Deprecated API | |
| 370 | + | |
| 371 | +The following methods and functions are still functional, but their usage is not recommended | |
| 372 | +since they have been replaced by better solutions. | |
| 373 | + | |
| 374 | +### VBA_Scanner (deprecated) | |
| 375 | + | |
| 376 | +Note: this API is under active development and may change in the future. | |
| 377 | + | |
| 378 | +The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings, | |
| 379 | +suspicious keywords, IOCs, auto-executable macros, etc. | |
| 380 | + | |
| 381 | +First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the | |
| 382 | +extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis. | |
| 383 | + | |
| 384 | +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded | |
| 385 | +strings found in the code (Hex, Base64, Dridex) with their decoded value. | |
| 386 | + | |
| 387 | +**scan** returns a list of tuples (type, keyword, description), one for each item in the results. | |
| 388 | + | |
| 389 | +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'. | |
| 390 | +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is | |
| 391 | + the decoded value of the string. | |
| 392 | +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string. | |
| 393 | + | |
| 394 | +Example: | |
| 395 | + | |
| 396 | + :::python | |
| 397 | + vba_scanner = VBA_Scanner(vba_code) | |
| 398 | + results = vba_scanner.scan(include_decoded_strings=True) | |
| 399 | + for kw_type, keyword, description in results: | |
| 400 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 401 | + | |
| 402 | +The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan(): | |
| 403 | + | |
| 404 | + :::python | |
| 405 | + results = scan_vba(vba_code, include_decoded_strings=True) | |
| 406 | + for kw_type, keyword, description in results: | |
| 407 | + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description) | |
| 408 | + | |
| 409 | +**scan_summary** returns a tuple with the number of items found for each category: | |
| 410 | +(autoexec, suspicious, IOCs, hex, base64, dridex). | |
| 411 | + | |
| 412 | + | |
| 413 | +### Detect auto-executable macros (deprecated) | |
| 414 | + | |
| 415 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 416 | + | |
| 417 | +The function **detect_autoexec** checks if VBA macro code contains specific macro names | |
| 418 | +that will be triggered when the document/workbook is opened, closed, changed, etc. | |
| 419 | + | |
| 420 | +It returns a list of tuples containing two strings, the detected keyword, and the | |
| 421 | +description of the trigger. (See the malware example above) | |
| 422 | + | |
| 423 | +Sample usage: | |
| 424 | + | |
| 425 | + :::python | |
| 426 | + from oletools.olevba import detect_autoexec | |
| 427 | + autoexec_keywords = detect_autoexec(vba_code) | |
| 428 | + if autoexec_keywords: | |
| 429 | + print 'Auto-executable macro keywords found:' | |
| 430 | + for keyword, description in autoexec_keywords: | |
| 431 | + print '%s: %s' % (keyword, description) | |
| 432 | + else: | |
| 433 | + print 'Auto-executable macro keywords: None found' | |
| 434 | + | |
| 435 | + | |
| 436 | +### Detect suspicious VBA keywords (deprecated) | |
| 437 | + | |
| 438 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 439 | + | |
| 440 | +The function **detect_suspicious** checks if VBA macro code contains specific | |
| 441 | +keywords often used by malware to act on the system (create files, run | |
| 442 | +commands or applications, write to the registry, etc). | |
| 443 | + | |
| 444 | +It returns a list of tuples containing two strings, the detected keyword, and the | |
| 445 | +description of the corresponding malicious behaviour. (See the malware example above) | |
| 446 | + | |
| 447 | +Sample usage: | |
| 448 | + | |
| 449 | + :::python | |
| 450 | + from oletools.olevba import detect_suspicious | |
| 451 | + suspicious_keywords = detect_suspicious(vba_code) | |
| 452 | + if suspicious_keywords: | |
| 453 | + print 'Suspicious VBA keywords found:' | |
| 454 | + for keyword, description in suspicious_keywords: | |
| 455 | + print '%s: %s' % (keyword, description) | |
| 456 | + else: | |
| 457 | + print 'Suspicious VBA keywords: None found' | |
| 458 | + | |
| 459 | + | |
| 460 | +### Extract potential IOCs (deprecated) | |
| 461 | + | |
| 462 | +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once. | |
| 463 | + | |
| 464 | +The function **detect_patterns** checks if VBA macro code contains specific | |
| 465 | +patterns of interest, that may be useful for malware analysis and detection | |
| 466 | +(potential Indicators of Compromise): IP addresses, e-mail addresses, | |
| 467 | +URLs, executable file names. | |
| 468 | + | |
| 469 | +It returns a list of tuples containing two strings, the pattern type, and the | |
| 470 | +extracted value. (See the malware example above) | |
| 471 | + | |
| 472 | +Sample usage: | |
| 473 | + | |
| 474 | + :::python | |
| 475 | + from oletools.olevba import detect_patterns | |
| 476 | + patterns = detect_patterns(vba_code) | |
| 477 | + if patterns: | |
| 478 | + print 'Patterns found:' | |
| 479 | + for pattern_type, value in patterns: | |
| 480 | + print '%s: %s' % (pattern_type, value) | |
| 481 | + else: | |
| 482 | + print 'Patterns: None found' | |
| 483 | + | |
| 484 | + | |
| 485 | +-------------------------------------------------------------------------- | |
| 486 | + | |
| 487 | +python-oletools documentation | |
| 488 | +----------------------------- | |
| 489 | + | |
| 490 | +- [[Home]] | |
| 491 | +- [[License]] | |
| 492 | +- [[Install]] | |
| 493 | +- [[Contribute]], Suggest Improvements or Report Issues | |
| 494 | +- Tools: | |
| 495 | + - [[olebrowse]] | |
| 496 | + - [[oleid]] | |
| 497 | + - [[olemeta]] | |
| 498 | + - [[oletimes]] | |
| 499 | + - [[olevba]] | |
| 500 | + - [[pyxswf]] | |
| 494 | 501 | - [[rtfobj]] |
| 495 | 502 | \ No newline at end of file | ... | ... |
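The triage flags listed in the olevba documentation above (M, A, S, I, H, B, D, V) can be illustrated with a small sketch. It assumes the flag string is the file type, a colon, `M` when macros are present, and then one letter per non-zero count in the tuple documented for scan_summary: (autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings). The helper name `triage_flags` and the exact column order are assumptions made for illustration; this is not olevba's own implementation.

```python
def triage_flags(file_type, has_macros, summary):
    """Build a triage flag string such as 'OLE:MASI----'.

    file_type  -- short type label, e.g. 'OLE' or 'OpX'
    has_macros -- True if VBA macros were detected
    summary    -- 7 counts in the order documented for scan_summary():
                  (autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings)
    """
    letters = 'ASIHBDV'  # assumed order, following the flag list in the docs
    flags = 'M' if has_macros else '-'
    for letter, count in zip(letters, summary):
        # a letter is shown only when at least one item of that category was found
        flags += letter if count else '-'
    return '%s:%s' % (file_type, flags)


print(triage_flags('OLE', True, (3, 4, 5, 0, 0, 0, 0)))  # OLE:MASI----
```

With the documented API, `has_macros` would presumably come from `detect_vba_macros()` and `summary` from `scan_summary()`; the sketch keeps them as plain arguments so that it runs without oletools installed.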
oletools/olevba.py
| ... | ... | @@ -2100,6 +2100,7 @@ class VBA_Parser_CLI(VBA_Parser): |
| 2100 | 2100 | |
| 2101 | 2101 | |
| 2102 | 2102 | def reveal(self): |
| 2103 | + #TODO: move this code to the VBA_Parser class (without print) | |
| 2103 | 2104 | print 'MACRO SOURCE CODE WITH DEOBFUSCATED VBA STRINGS (EXPERIMENTAL):\n' |
| 2104 | 2105 | # we only want printable strings: |
| 2105 | 2106 | analysis = self.analyze_macros(show_decoded_strings=False) | ... | ... |
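The idea behind --reveal, as described in the usage text ("display the macro source code after replacing all the obfuscated strings by their decoded content"), can be sketched as a plain string substitution. The function below is a hypothetical stand-alone illustration, not the code of `VBA_Parser_CLI.reveal`: the name `reveal_strings`, the `(encoded, decoded)` pair format and the longest-first ordering are all assumptions.

```python
def reveal_strings(vba_code, decoded_pairs):
    """Replace obfuscated VBA expressions by quoted string literals.

    vba_code      -- VBA source code as a string
    decoded_pairs -- iterable of (encoded_expression, decoded_value) pairs
                     (hypothetical input format, for illustration only)
    """
    # Replace the longest expressions first, so that a short expression
    # contained in a longer one does not break the longer match.
    ordered = sorted(decoded_pairs, key=lambda pair: len(pair[0]), reverse=True)
    for encoded, decoded in ordered:
        # In a VBA string literal, a double quote is escaped by doubling it.
        literal = '"%s"' % decoded.replace('"', '""')
        vba_code = vba_code.replace(encoded, literal)
    return vba_code


code = 'Shell StrReverse("exe.dmc")\nMsgBox Chr(104) & Chr(105)'
pairs = [('Chr(104) & Chr(105)', 'hi'), ('StrReverse("exe.dmc")', 'cmd.exe')]
print(reveal_strings(code, pairs))
```

Keeping the substitution free of print statements, as the TODO in the diff suggests for the real method, makes it easy to reuse the result from another application.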