Commit c47b13c16461fd24a8217e12419989c0a81721fc

Authored by Philippe Lagadec
1 parent dfd6b4f0

updated documentation for v0.41
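
For context, the --reveal option documented in this commit displays the macro source code after replacing the obfuscated strings by their decoded content. The sketch below illustrates that idea only. It is not olevba's implementation: the function name `reveal_strings` and the list of `(encoded, decoded)` pairs are assumptions made for illustration, and the pairs stand in for whatever a deobfuscation pass would report.

```python
def reveal_strings(vba_code, decoded_pairs):
    """Return vba_code with each obfuscated expression replaced by a VBA string literal.

    decoded_pairs is a hypothetical list of (encoded_expression, decoded_value)
    tuples, as a deobfuscation pass might produce.
    """
    # Replace longer expressions first, so that a short expression never
    # clobbers part of a longer one that contains it.
    ordered = sorted(decoded_pairs, key=lambda pair: len(pair[0]), reverse=True)
    for encoded, decoded in ordered:
        # VBA escapes a double quote inside a string literal by doubling it.
        literal = '"%s"' % decoded.replace('"', '""')
        vba_code = vba_code.replace(encoded, literal)
    return vba_code


code = 'Shell Chr(99) & Chr(109) & Chr(100)'
pairs = [('Chr(99) & Chr(109) & Chr(100)', 'cmd')]
revealed = reveal_strings(code, pairs)
print(revealed)  # prints: Shell "cmd"
```

Plain text substitution keeps the sketch short. A real tool would more likely replace expressions at the positions where its parser found them, so that identical text elsewhere in the code is left alone.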

oletools/README.html
... ... @@ -4,7 +4,8 @@
4 4 <p>Note: python-oletools is not related to OLETools published by BeCubed Software.</p>
5 5 <h2 id="news">News</h2>
6 6 <ul>
7   -<li><strong>2015-09-17 v0.40</strong>: Improved macro deobfuscation in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to decode Hex and Base64 within VBA expressions. Display printable deobfuscated strings by default. Improved the VBA_Parser API. Improved performance. Fixed <a href="https://bitbucket.org/decalage/oletools/issue/23">issue #23</a> with sys.stderr.</li>
  7 +<li><strong>2015-09-22 v0.41</strong>: added new --reveal option to <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to show the macro code with VBA strings deobfuscated.</li>
  8 +<li>2015-09-17 v0.40: Improved macro deobfuscation in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, to decode Hex and Base64 within VBA expressions. Display printable deobfuscated strings by default. Improved the VBA_Parser API. Improved performance. Fixed <a href="https://bitbucket.org/decalage/oletools/issue/23">issue #23</a> with sys.stderr.</li>
8 9 <li>2015-06-19 v0.12: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> can now deobfuscate VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &amp;, using a VBA parser built with <a href="http://pyparsing.wikispaces.com">pyparsing</a>. New options to display only the analysis results or only the macros source code. The analysis is now done on all the VBA modules at once.</li>
9 10 <li>2015-05-29 v0.11: Improved parsing of MHTML and ActiveMime/MSO files in <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a>, added several suspicious keywords to VBA scanner (thanks to <span class="citation">@ozhermit</span> and Davy Douhine for the suggestions)</li>
10 11 <li>2015-05-06 v0.10: <a href="https://bitbucket.org/decalage/oletools/wiki/olevba">olevba</a> now supports Word MHTML files with macros, aka &quot;Single File Web Page&quot; (.mht) - see <a href="https://bitbucket.org/decalage/oletools/issue/10">issue #10</a> for more info</li>
... ...
oletools/README.rst
... ... @@ -26,7 +26,10 @@ Software.
26 26 News
27 27 ----
28 28  
29   -- **2015-09-17 v0.40**: Improved macro deobfuscation in
  29 +- **2015-09-22 v0.41**: added new --reveal option to
  30 + `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__, to
  31 + show the macro code with VBA strings deobfuscated.
  32 +- 2015-09-17 v0.40: Improved macro deobfuscation in
30 33 `olevba <https://bitbucket.org/decalage/oletools/wiki/olevba>`__, to
31 34 decode Hex and Base64 within VBA expressions. Display printable
32 35 deobfuscated strings by default. Improved the VBA\_Parser API.
... ...
oletools/doc/Home.html
1   -<p>python-oletools v0.40 documentation</p>
  1 +<p>python-oletools v0.41 documentation</p>
2 2 <p>===================================</p>
3 3 <p>This is the home page of the documentation for python-oletools. The latest version can be found</p>
4 4 <p><a href="https://bitbucket.org/decalage/oletools/wiki">online</a>, otherwise a copy is provided in the doc subfolder of the package.</p>
... ...
oletools/doc/Home.md
1   -python-oletools v0.40 documentation
2   -===================================
3   -
4   -This is the home page of the documentation for python-oletools. The latest version can be found
5   -[online](https://bitbucket.org/decalage/oletools/wiki), otherwise a copy is provided in the doc subfolder of the package.
6   -
7   -[python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze
8   -[Microsoft OLE2 files](http://en.wikipedia.org/wiki/Compound_File_Binary_Format)
9   -(also called Structured Storage, Compound File Binary Format or Compound Document File Format),
10   -such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging.
11   -It is based on the [olefile](http://www.decalage.info/olefile) parser.
12   -See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info.
13   -
14   -**Quick links:** [Home page](http://www.decalage.info/python/oletools) -
15   -[Download/Install](https://bitbucket.org/decalage/oletools/wiki/Install) -
16   -[Documentation](https://bitbucket.org/decalage/oletools/wiki) -
17   -[Report Issues/Suggestions/Questions](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) -
18   -[Contact the author](http://decalage.info/contact) -
19   -[Repository](https://bitbucket.org/decalage/oletools) -
20   -[Updates on Twitter](https://twitter.com/decalage2)
21   -
22   -Note: python-oletools is not related to OLETools published by BeCubed Software.
23   -
24   -Tools in python-oletools:
25   --------------------------
26   -
27   -- **[[olebrowse]]**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to
28   - view and extract individual data streams.
29   -- **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.
30   -- **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files.
31   -- **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages.
32   -- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
33   -- **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may
34   - be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
35   - which is especially useful for malware analysis.
36   -- **[[rtfobj]]**: a tool and python module to extract embedded objects from RTF files.
37   -- and a few others (coming soon)
38   -
39   ---------------------------------------------------------------------------
40   -
41   -python-oletools documentation
42   ------------------------------
43   -
44   -- [[Home]]
45   -- [[License]]
46   -- [[Install]]
47   -- [[Contribute]], Suggest Improvements or Report Issues
48   -- Tools:
49   - - [[olebrowse]]
50   - - [[oleid]]
51   - - [[olemeta]]
52   - - [[oletimes]]
53   - - [[olevba]]
54   - - [[pyxswf]]
  1 +python-oletools v0.41 documentation
  2 +===================================
  3 +
  4 +This is the home page of the documentation for python-oletools. The latest version can be found
  5 +[online](https://bitbucket.org/decalage/oletools/wiki), otherwise a copy is provided in the doc subfolder of the package.
  6 +
  7 +[python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze
  8 +[Microsoft OLE2 files](http://en.wikipedia.org/wiki/Compound_File_Binary_Format)
  9 +(also called Structured Storage, Compound File Binary Format or Compound Document File Format),
  10 +such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging.
  11 +It is based on the [olefile](http://www.decalage.info/olefile) parser.
  12 +See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info.
  13 +
  14 +**Quick links:** [Home page](http://www.decalage.info/python/oletools) -
  15 +[Download/Install](https://bitbucket.org/decalage/oletools/wiki/Install) -
  16 +[Documentation](https://bitbucket.org/decalage/oletools/wiki) -
  17 +[Report Issues/Suggestions/Questions](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) -
  18 +[Contact the author](http://decalage.info/contact) -
  19 +[Repository](https://bitbucket.org/decalage/oletools) -
  20 +[Updates on Twitter](https://twitter.com/decalage2)
  21 +
  22 +Note: python-oletools is not related to OLETools published by BeCubed Software.
  23 +
  24 +Tools in python-oletools:
  25 +-------------------------
  26 +
  27 +- **[[olebrowse]]**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to
  28 + view and extract individual data streams.
  29 +- **[[oleid]]**: a tool to analyze OLE files to detect specific characteristics usually found in malicious files.
  30 +- **[[olemeta]]**: a tool to extract all standard properties (metadata) from OLE files.
  31 +- **[[oletimes]]**: a tool to extract creation and modification timestamps of all streams and storages.
  32 +- **[[olevba]]**: a tool to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
  33 +- **[[pyxswf]]**: a tool to detect, extract and analyze Flash objects (SWF) that may
  34 + be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
  35 + which is especially useful for malware analysis.
  36 +- **[[rtfobj]]**: a tool and python module to extract embedded objects from RTF files.
  37 +- and a few others (coming soon)
  38 +
  39 +--------------------------------------------------------------------------
  40 +
  41 +python-oletools documentation
  42 +-----------------------------
  43 +
  44 +- [[Home]]
  45 +- [[License]]
  46 +- [[Install]]
  47 +- [[Contribute]], Suggest Improvements or Report Issues
  48 +- Tools:
  49 + - [[olebrowse]]
  50 + - [[oleid]]
  51 + - [[olemeta]]
  52 + - [[oletimes]]
  53 + - [[olevba]]
  54 + - [[pyxswf]]
55 55 - [[rtfobj]]
56 56 \ No newline at end of file
... ...
oletools/doc/olevba.html
... ... @@ -106,7 +106,11 @@ Options:
106 106  
107 107 --attr display the attribute lines at the beginning of VBA
108 108  
109   - source code</code></pre>
  109 + source code
  110 +
  111 + --reveal display the macro source code after replacing all the
  112 +
  113 + obfuscated strings by their decoded content.</code></pre>
110 114 <h3 id="examples">Examples</h3>
111 115 <p>Scan a single file:</p>
112 116 <pre><code>olevba.py file.doc</code></pre>
... ... @@ -114,6 +118,8 @@ Options:
114 118 <pre><code>olevba.py malicious_file.xls.zip -z infected</code></pre>
115 119 <p>Scan a single file, showing all obfuscated strings decoded:</p>
116 120 <pre><code>olevba.py file.doc --decode</code></pre>
  121 +<p>Scan a single file, showing the macro source code with VBA strings deobfuscated:</p>
  122 +<pre><code>olevba.py file.doc --reveal</code></pre>
117 123 <p>Scan VBA source code extracted into a text file:</p>
118 124 <pre><code>olevba.py -i source_code.vba</code></pre>
119 125 <p>Scan a collection of files stored in a folder:</p>
... ...
oletools/doc/olevba.md
1   -olevba
2   -======
3   -
4   -olevba is a script to parse OLE and OpenXML files such as MS Office documents
5   -(e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text,
6   -and detect security-related patterns such as **auto-executable macros**, **suspicious
7   -VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,
8   -and potential **IOCs** (IP addresses, URLs, executable filenames, etc).
9   -It also detects and decodes several common **obfuscation methods including Hex encoding,
10   -StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings.
11   -
12   -It can be used either as a command-line tool, or as a python module from your own applications.
13   -
14   -It is part of the [python-oletools](http://www.decalage.info/python/oletools) package.
15   -
16   -olevba is based on source code from [officeparser](https://github.com/unixfreak0037/officeparser)
17   -by John William Davison, with significant modifications.
18   -
19   -## Supported formats
20   -
21   -- Word 97-2003 (.doc, .dot)
22   -- Word 2007+ (.docm, .dotm)
23   -- Word 2003 XML (.xml)
24   -- Word/Excel MHTML, aka Single File Web Page (.mht)
25   -- Excel 97-2003 (.xls)
26   -- Excel 2007+ (.xlsm, .xlsb)
27   -- PowerPoint 2007+ (.pptm, .ppsm)
28   -
29   -## Main Features
30   -
31   -- Detect VBA macros in MS Office 97-2003 and 2007+ files, XML, MHT
32   -- Extract VBA macro source code
33   -- Detect auto-executable macros
34   -- Detect suspicious VBA keywords often used by malware
35   -- Detect anti-sandboxing and anti-virtualization techniques
36   -- Detect and decodes strings obfuscated with Hex/Base64/StrReverse/Dridex
37   -- Deobfuscates VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with
38   -[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings
39   -- Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
40   -- Scan multiple files and sample collections (wildcards, recursive)
41   -- Triage mode for a summary view of multiple files
42   -- Scan malware samples in password-protected Zip archives
43   -- Python API to use olevba from your applications
44   -
45   -MS Office files encrypted with a password are also supported, because VBA macro code is never
46   -encrypted, only the content of the document.
47   -
48   -## About VBA Macros
49   -
50   -See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros
51   -and how they are stored in MS Office documents.
52   -
53   -## How it works
54   -
55   -1. olevba checks the file type: If it is an OLE file (i.e MS Office 97-2003), it is parsed right away.
56   -1. If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them.
57   -1. olevba identifies all the VBA projects stored in the OLE structure.
58   -1. Each VBA project is parsed to find the corresponding OLE streams containing macro code.
59   -1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
60   -1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions).
61   -1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
62   -and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
63   -
64   -
65   -## Usage
66   -
67   - :::text
68   - Usage: olevba.py [options] <filename> [filename2 ...]
69   -
70   - Options:
71   - -h, --help show this help message and exit
72   - -r find files recursively in subdirectories.
73   - -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
74   - if the file is a zip archive, open all files from it,
75   - using the provided password (requires Python 2.6+)
76   - -f ZIP_FNAME, --zipfname=ZIP_FNAME
77   - if the file is a zip archive, file(s) to be opened
78   - within the zip. Wildcards * and ? are supported.
79   - (default:*)
80   - -t, --triage triage mode, display results as a summary table
81   - (default for multiple files)
82   - -d, --detailed detailed mode, display full results (default for
83   - single file)
84   - -a, --analysis display only analysis results, not the macro source
85   - code
86   - -c, --code display only VBA source code, do not analyze it
87   - -i INPUT, --input=INPUT
88   - input file containing VBA source code to be analyzed
89   - (no parsing)
90   - --decode display all the obfuscated strings with their decoded
91   - content (Hex, Base64, StrReverse, Dridex, VBA).
92   - --attr display the attribute lines at the beginning of VBA
93   - source code
94   -
95   -### Examples
96   -
97   -Scan a single file:
98   -
99   - :::text
100   - olevba.py file.doc
101   -
102   -Scan a single file, stored in a Zip archive with password "infected":
103   -
104   - :::text
105   - olevba.py malicious_file.xls.zip -z infected
106   -
107   -Scan a single file, showing all obfuscated strings decoded:
108   -
109   - :::text
110   - olevba.py file.doc --decode
111   -
112   -Scan VBA source code extracted into a text file:
113   -
114   - :::text
115   - olevba.py -i source_code.vba
116   -
117   -Scan a collection of files stored in a folder:
118   -
119   - :::text
120   - olevba.py MalwareZoo/VBA/*
121   -
122   -Scan all .doc and .xls files, recursively in all subfolders:
123   -
124   - :::text
125   - olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
126   -
127   -Scan all .doc files within all .zip files with password, recursively:
128   -
129   - :::text
130   - olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
131   -
132   -
133   -### Detailed analysis mode (default for single file)
134   -
135   -When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
136   -
137   -For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/):
138   -
139   - :::text
140   - >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
141   - ===============================================================================
142   - FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
143   - Type: OLE
144   - -------------------------------------------------------------------------------
145   - VBA MACRO ThisDocument.cls
146   - in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument
147   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
148   - Option Explicit
149   - Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_
150   - ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _
151   - ByVal HQTLDG As Long) As Long
152   - Sub AutoOpen()
153   - Auto_Open
154   - End Sub
155   - Sub Auto_Open()
156   - SNVJYQ
157   - End Sub
158   - Public Sub SNVJYQ()
159   - [Malicious Code...]
160   - End Sub
161   - Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean
162   - [Malicious Code...]
163   - Application.DisplayAlerts = False
164   - Application.Quit
165   - End Function
166   - Sub Workbook_Open()
167   - Auto_Open
168   - End Sub
169   -
170   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
171   - ANALYSIS:
172   - +------------+----------------------+-----------------------------------------+
173   - | Type | Keyword | Description |
174   - +------------+----------------------+-----------------------------------------+
175   - | AutoExec | AutoOpen | Runs when the Word document is opened |
176   - | AutoExec | Auto_Open | Runs when the Excel Workbook is opened |
177   - | AutoExec | Workbook_Open | Runs when the Excel Workbook is opened |
178   - | Suspicious | Lib | May run code from a DLL |
179   - | Suspicious | Shell | May run an executable file or a system |
180   - | | | command |
181   - | Suspicious | Environ | May read system environment variables |
182   - | Suspicious | URLDownloadToFileA | May download files from the Internet |
183   - | IOC | http://germanya.com. | URL |
184   - | | ec/logs/test.exe" | |
185   - | IOC | http://germanya.com. | URL |
186   - | | ec/logs/counter.php" | |
187   - | IOC | germanya.com | Executable file name |
188   - | IOC | test.exe | Executable file name |
189   - | IOC | sfjozjero.exe | Executable file name |
190   - +------------+----------------------+-----------------------------------------+
191   -
192   -### Triage mode (default for multiple files)
193   -
194   -When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed.
195   -This is more convenient for quick triage of a collection of suspicious files.
196   -
197   -The following flags show the results of the analysis:
198   -
199   -- **OLE**: the file type is OLE, for example MS Office 97-2003
200   -- **OpX**: the file type is OpenXML, for example MS Office 2007+
201   -- **XML**: the file type is Word 2003 XML
202   -- **MHT**: the file type is Word MHTML, aka Single File Web Page (.mht)
203   -- **?**: the file type is not supported
204   -- **M**: contains VBA Macros
205   -- **A**: auto-executable macros
206   -- **S**: suspicious VBA keywords
207   -- **I**: potential IOCs
208   -- **H**: hex-encoded strings (potential obfuscation)
209   -- **B**: Base64-encoded strings (potential obfuscation)
210   -- **D**: Dridex-encoded strings (potential obfuscation)
211   -- **V**: VBA string expressions (potential obfuscation)
212   -
213   -Here is an example:
214   -
215   - :::text
216   - c:\>olevba.py \MalwareZoo\VBA\samples\*
217   - Flags Filename
218   - ----------- -----------------------------------------------------------------
219   - OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
220   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
221   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
222   - OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
223   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
224   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
225   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
226   - OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
227   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
228   - OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
229   - OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
230   - OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
231   - OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
232   - OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
233   - OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
234   - OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
235   -
236   -
237   ---------------------------------------------------------------------------
238   -
239   -## How to use olevba in Python applications
240   -
241   -olevba may be used to open a MS Office file, detect if it contains VBA macros, extract and analyze the VBA source code
242   -from your own python applications.
243   -
244   -IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
245   -
246   -### Import olevba
247   -
248   -First, import the **oletools.olevba** package, using at least the VBA_Parser and VBA_Scanner classes:
249   -
250   - :::python
251   - from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
252   -
253   -### Parse a MS Office file
254   -
255   -To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
256   -For example:
257   -
258   - :::python
259   - vbaparser = VBA_Parser('my_file_with_macros.doc')
260   -
261   -The file may also be provided as a bytes string containing its data. In that case, the actual
262   -filename must be provided for reference, and the file content with the data parameter. For example:
263   -
264   - :::python
265   - myfile = 'my_file_with_macros.doc'
266   - filedata = open(myfile, 'rb').read()
267   - vbaparser = VBA_Parser(myfile, data=filedata)
268   -
269   -VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML
270   -(MS Office 2007+), MHTML or Word 2003 XML.
271   -
272   -After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type.
273   -It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)
274   -
275   -### Detect VBA macros
276   -
277   -The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file,
278   -False otherwise.
279   -
280   - :::python
281   - if vbaparser.detect_vba_macros():
282   - print 'VBA Macros found'
283   - else:
284   - print 'No VBA Macros found'
285   -
286   -Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine
287   -for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will
288   -always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
289   -
290   -Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method
291   -may return True if the embedded document contains VBA Macros, even if the main document does not.
292   -
293   -### Extract VBA Macro Source Code
294   -
295   -The method **extract_macros** extracts and decompresses source code for each VBA macro found in the file (possibly
296   -including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code)
297   -for each VBA macro found.
298   -
299   -- filename: If the file is OLE (MS Office 97-2003), filename is the path of the file.
300   - If the file is OpenXML (MS Office 2007+), filename is the path of the OLE subfile containing VBA macros within the zip archive,
301   - e.g. word/vbaProject.bin.
302   -- stream_path: path of the OLE stream containing the VBA macro source code
303   -- vba_filename: corresponding VBA filename
304   -- vba_code: string containing the VBA source code in clear text
305   -
306   -Example:
307   -
308   - :::python
309   - for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
310   - print '-'*79
311   - print 'Filename :', filename
312   - print 'OLE stream :', stream_path
313   - print 'VBA filename:', vba_filename
314   - print '- '*39
315   - print vba_code
316   -
317   -Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples.
318   -
319   -### Analyze VBA Source Code
320   -
321   -Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained
322   -in a file:
323   -
324   -The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all
325   -VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
326   -
327   -scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
328   -strings found in the code (Hex, Base64, Dridex) with their decoded value.
329   -By default, it will include the strings which contain printable characters only.
330   -
331   -**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results.
332   -
333   -- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or
334   - 'VBA obfuscated Strings'.
335   -- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
336   - the decoded value of the string.
337   -- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
338   -
339   -Example:
340   -
341   - :::python
342   - results = vbaparser.scan()
343   - for kw_type, keyword, description in results:
344   - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
345   -
346   -**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category:
347   -(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).
348   -
349   -
350   -
351   -### Close the VBA_Parser
352   -
353   -After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,
354   -especially if your application is parsing many files.
355   -
356   - :::python
357   - vbaparser.close()
358   -
359   -
360   ---------------------------------------------------------------------------
361   -
362   -## Deprecated API
363   -
364   -The following methods and functions are still functional, but their usage is not recommended
365   -since they have been replaced by better solutions.
366   -
367   -### VBA_Scanner (deprecated)
368   -
369   -Note: this API is under active development and may change in the future.
370   -
371   -The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
372   -suspicious keywords, IOCs, auto-executable macros, etc.
373   -
374   -First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the
375   -extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis.
376   -
377   -scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
378   -strings found in the code (Hex, Base64, Dridex) with their decoded value.
379   -
380   -**scan** returns a list of tuples (type, keyword, description), one for each item in the results.
381   -
382   -- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.
383   -- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
384   - the decoded value of the string.
385   -- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
386   -
387   -Example:
388   -
389   - :::python
390   - vba_scanner = VBA_Scanner(vba_code)
391   - results = vba_scanner.scan(include_decoded_strings=True)
392   - for kw_type, keyword, description in results:
393   - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
394   -
395   -The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan():
396   -
397   - :::python
398   - results = scan_vba(vba_code, include_decoded_strings=True)
399   - for kw_type, keyword, description in results:
400   - print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
401   -
402   -**scan_summary** returns a tuple with the number of items found for each category:
403   -(autoexec, suspicious, IOCs, hex, base64, dridex).
404   -
405   -
406   -### Detect auto-executable macros (deprecated)
407   -
408   -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
409   -
410   -The function **detect_autoexec** checks if VBA macro code contains specific macro names
411   -that will be triggered when the document/workbook is opened, closed, changed, etc.
412   -
413   -It returns a list of tuples containing two strings, the detected keyword, and the
414   -description of the trigger. (See the malware example above)
415   -
416   -Sample usage:
417   -
418   - :::python
419   - from oletools.olevba import detect_autoexec
420   - autoexec_keywords = detect_autoexec(vba_code)
421   - if autoexec_keywords:
422   - print 'Auto-executable macro keywords found:'
423   - for keyword, description in autoexec_keywords:
424   - print '%s: %s' % (keyword, description)
425   - else:
426   - print 'Auto-executable macro keywords: None found'
427   -
428   -
429   -### Detect suspicious VBA keywords (deprecated)
430   -
431   -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
432   -
433   -The function **detect_suspicious** checks if VBA macro code contains specific
434   -keywords often used by malware to act on the system (create files, run
435   -commands or applications, write to the registry, etc).
436   -
437   -It returns a list of tuples containing two strings, the detected keyword, and the
438   -description of the corresponding malicious behaviour. (See the malware example above)
439   -
440   -Sample usage:
441   -
442   - :::python
443   - from oletools.olevba import detect_suspicious
444   - suspicious_keywords = detect_suspicious(vba_code)
445   - if suspicious_keywords:
446   - print 'Suspicious VBA keywords found:'
447   - for keyword, description in suspicious_keywords:
448   - print '%s: %s' % (keyword, description)
449   - else:
450   - print 'Suspicious VBA keywords: None found'
451   -
452   -
453   -### Extract potential IOCs (deprecated)
454   -
455   -**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
456   -
457   -The function **detect_patterns** checks if VBA macro code contains specific
458   -patterns of interest, that may be useful for malware analysis and detection
459   -(potential Indicators of Compromise): IP addresses, e-mail addresses,
460   -URLs, executable file names.
461   -
462   -It returns a list of tuples containing two strings, the pattern type, and the
463   -extracted value. (See the malware example above)
464   -
465   -Sample usage:
466   -
467   - :::python
468   - from oletools.olevba import detect_patterns
469   - patterns = detect_patterns(vba_code)
470   - if patterns:
471   - print 'Patterns found:'
472   - for pattern_type, value in patterns:
473   - print '%s: %s' % (pattern_type, value)
474   - else:
475   - print 'Patterns: None found'
476   -
477   -
478   ---------------------------------------------------------------------------
479   -
480   -python-oletools documentation
481   ------------------------------
482   -
483   -- [[Home]]
484   -- [[License]]
485   -- [[Install]]
486   -- [[Contribute]], Suggest Improvements or Report Issues
487   -- Tools:
488   - - [[olebrowse]]
489   - - [[oleid]]
490   - - [[olemeta]]
491   - - [[oletimes]]
492   - - [[olevba]]
493   - - [[pyxswf]]
  1 +olevba
  2 +======
  3 +
  4 +olevba is a script to parse OLE and OpenXML files such as MS Office documents
  5 +(e.g. Word, Excel), to **detect VBA Macros**, extract their **source code** in clear text,
  6 +and detect security-related patterns such as **auto-executable macros**, **suspicious
  7 +VBA keywords** used by malware, anti-sandboxing and anti-virtualization techniques,
  8 +and potential **IOCs** (IP addresses, URLs, executable filenames, etc).
  9 +It also detects and decodes several common **obfuscation methods including Hex encoding,
  10 +StrReverse, Base64, Dridex, VBA expressions**, and extracts IOCs from decoded strings.
  11 +
  12 +It can be used either as a command-line tool, or as a Python module from your own applications.
  13 +
  14 +It is part of the [python-oletools](http://www.decalage.info/python/oletools) package.
  15 +
  16 +olevba is based on source code from [officeparser](https://github.com/unixfreak0037/officeparser)
  17 +by John William Davison, with significant modifications.
  18 +
  19 +## Supported formats
  20 +
  21 +- Word 97-2003 (.doc, .dot)
  22 +- Word 2007+ (.docm, .dotm)
  23 +- Word 2003 XML (.xml)
  24 +- Word/Excel MHTML, aka Single File Web Page (.mht)
  25 +- Excel 97-2003 (.xls)
  26 +- Excel 2007+ (.xlsm, .xlsb)
  27 +- PowerPoint 2007+ (.pptm, .ppsm)
  28 +
  29 +## Main Features
  30 +
  31 +- Detect VBA macros in MS Office 97-2003 and 2007+ files, XML, MHT
  32 +- Extract VBA macro source code
  33 +- Detect auto-executable macros
  34 +- Detect suspicious VBA keywords often used by malware
  35 +- Detect anti-sandboxing and anti-virtualization techniques
  36 +- Detect and decode strings obfuscated with Hex/Base64/StrReverse/Dridex
  37 +- Deobfuscate VBA expressions with any combination of Chr, Asc, Val, StrReverse, Environ, +, &, using a VBA parser built with
  38 +[pyparsing](http://pyparsing.wikispaces.com), including custom Hex and Base64 encodings
  39 +- Extract IOCs/patterns of interest such as IP addresses, URLs, e-mail addresses and executable file names
  40 +- Scan multiple files and sample collections (wildcards, recursive)
  41 +- Triage mode for a summary view of multiple files
  42 +- Scan malware samples in password-protected Zip archives
  43 +- Python API to use olevba from your applications
  44 +
  45 +MS Office files encrypted with a password are also supported, because VBA macro code is never
  46 +encrypted, only the content of the document.
  47 +
  48 +## About VBA Macros
  49 +
  50 +See [this article](http://www.decalage.info/en/vba_tools) for more information and technical details about VBA Macros
  51 +and how they are stored in MS Office documents.
  52 +
  53 +## How it works
  54 +
  55 +1. olevba checks the file type: if it is an OLE file (i.e. MS Office 97-2003), it is parsed right away.
  56 +1. If it is a zip file (i.e. MS Office 2007+), XML or MHTML, olevba looks for all OLE files stored in it (e.g. vbaProject.bin, editdata.mso), and opens them.
  57 +1. olevba identifies all the VBA projects stored in the OLE structure.
  58 +1. Each VBA project is parsed to find the corresponding OLE streams containing macro code.
  59 +1. In each of these OLE streams, the VBA macro source code is extracted and decompressed (RLE compression).
  60 +1. olevba looks for specific strings obfuscated with various algorithms (Hex, Base64, StrReverse, Dridex, VBA expressions).
  61 +1. olevba scans the macro source code and the deobfuscated strings to find suspicious keywords, auto-executable macros
  62 +and potential IOCs (URLs, IP addresses, e-mail addresses, executable filenames, etc).
  63 +
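The first two steps above can be illustrated with a minimal sketch. It is not olevba's actual detection code: the helper names `sniff_container` and `list_ole_members` are hypothetical, and XML and MHTML handling is left out. It only relies on the well-known OLE2 magic bytes and on the standard `zipfile` module.

```python
import io
import zipfile

# Magic bytes found at the start of every OLE2 (Compound File) container.
OLE_MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'


def sniff_container(data):
    """Classify raw file bytes as 'OLE', 'ZIP' or 'OTHER' (hypothetical helper)."""
    if data.startswith(OLE_MAGIC):
        return 'OLE'
    if zipfile.is_zipfile(io.BytesIO(data)):
        return 'ZIP'
    return 'OTHER'


def list_ole_members(zip_data):
    """Return the names of zip members that start with the OLE magic bytes."""
    found = []
    with zipfile.ZipFile(io.BytesIO(zip_data)) as archive:
        for name in archive.namelist():
            if archive.read(name).startswith(OLE_MAGIC):
                found.append(name)
    return found
```

For an OpenXML document, `list_ole_members` would typically report a member such as word/vbaProject.bin, which can then be parsed as an OLE file.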
  64 +
  65 +## Usage
  66 +
  67 + :::text
  68 + Usage: olevba.py [options] <filename> [filename2 ...]
  69 +
  70 + Options:
  71 + -h, --help show this help message and exit
  72 + -r find files recursively in subdirectories.
  73 + -z ZIP_PASSWORD, --zip=ZIP_PASSWORD
  74 + if the file is a zip archive, open all files from it,
  75 + using the provided password (requires Python 2.6+)
  76 + -f ZIP_FNAME, --zipfname=ZIP_FNAME
  77 + if the file is a zip archive, file(s) to be opened
  78 + within the zip. Wildcards * and ? are supported.
  79 + (default:*)
  80 + -t, --triage triage mode, display results as a summary table
  81 + (default for multiple files)
  82 + -d, --detailed detailed mode, display full results (default for
  83 + single file)
  84 + -a, --analysis display only analysis results, not the macro source
  85 + code
  86 + -c, --code display only VBA source code, do not analyze it
  87 + -i INPUT, --input=INPUT
  88 + input file containing VBA source code to be analyzed
  89 + (no parsing)
  90 + --decode display all the obfuscated strings with their decoded
  91 + content (Hex, Base64, StrReverse, Dridex, VBA).
  92 + --attr display the attribute lines at the beginning of VBA
  93 + source code
  94 + --reveal display the macro source code after replacing all the
  95 + obfuscated strings by their decoded content.
  96 +
  97 +### Examples
  98 +
  99 +Scan a single file:
  100 +
  101 + :::text
  102 + olevba.py file.doc
  103 +
  104 +Scan a single file, stored in a Zip archive with password "infected":
  105 +
  106 + :::text
  107 + olevba.py malicious_file.xls.zip -z infected
  108 +
  109 +Scan a single file, showing all obfuscated strings decoded:
  110 +
  111 + :::text
  112 + olevba.py file.doc --decode
  113 +
  114 +Scan a single file, showing the macro source code with VBA strings deobfuscated:
  115 +
  116 + :::text
  117 + olevba.py file.doc --reveal
  118 +
  119 +Scan VBA source code extracted into a text file:
  120 +
  121 + :::text
  122 + olevba.py -i source_code.vba
  123 +
  124 +Scan a collection of files stored in a folder:
  125 +
  126 + :::text
  127 + olevba.py MalwareZoo/VBA/*
  128 +
  129 +Scan all .doc and .xls files, recursively in all subfolders:
  130 +
  131 + :::text
  132 + olevba.py MalwareZoo/VBA/*.doc MalwareZoo/VBA/*.xls -r
  133 +
  134 +Scan all .doc files within all .zip files with password, recursively:
  135 +
  136 + :::text
  137 + olevba.py MalwareZoo/VBA/*.zip -r -z infected -f *.doc
  138 +
  139 +
  140 +### Detailed analysis mode (default for single file)
  141 +
  142 +When a single file is scanned, or when using the option -d, all details of the analysis are displayed.
  143 +
  144 +For example, checking the malware sample [DIAN_caso-5415.doc](https://malwr.com/analysis/M2I4YWRhM2IwY2QwNDljN2E3ZWFjYTg3ODk4NmZhYmE/):
  145 +
  146 + :::text
  147 + >olevba.py c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip -z infected
  148 + ===============================================================================
  149 + FILE: DIAN_caso-5415.doc.malware in c:\MalwareZoo\VBA\DIAN_caso-5415.doc.zip
  150 + Type: OLE
  151 + -------------------------------------------------------------------------------
  152 + VBA MACRO ThisDocument.cls
  153 + in file: DIAN_caso-5415.doc.malware - OLE stream: Macros/VBA/ThisDocument
  154 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  155 + Option Explicit
  156 + Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal FVQGKS As Long,_
  157 + ByVal WSGSGY As String, ByVal IFRRFV As String, ByVal NCVOLV As Long, _
  158 + ByVal HQTLDG As Long) As Long
  159 + Sub AutoOpen()
  160 + Auto_Open
  161 + End Sub
  162 + Sub Auto_Open()
  163 + SNVJYQ
  164 + End Sub
  165 + Public Sub SNVJYQ()
  166 + [Malicious Code...]
  167 + End Sub
  168 + Function OGEXYR(XSTAHU As String, PHHWIV As String) As Boolean
  169 + [Malicious Code...]
  170 + Application.DisplayAlerts = False
  171 + Application.Quit
  172 + End Function
  173 + Sub Workbook_Open()
  174 + Auto_Open
  175 + End Sub
  176 +
  177 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  178 + ANALYSIS:
  179 + +------------+----------------------+-----------------------------------------+
  180 + | Type | Keyword | Description |
  181 + +------------+----------------------+-----------------------------------------+
  182 + | AutoExec | AutoOpen | Runs when the Word document is opened |
  183 + | AutoExec | Auto_Open | Runs when the Excel Workbook is opened |
  184 + | AutoExec | Workbook_Open | Runs when the Excel Workbook is opened |
  185 + | Suspicious | Lib | May run code from a DLL |
  186 + | Suspicious | Shell | May run an executable file or a system |
  187 + | | | command |
  188 + | Suspicious | Environ | May read system environment variables |
  189 + | Suspicious | URLDownloadToFileA | May download files from the Internet |
  190 + | IOC | http://germanya.com. | URL |
  191 + | | ec/logs/test.exe" | |
  192 + | IOC | http://germanya.com. | URL |
  193 + | | ec/logs/counter.php" | |
  194 + | IOC | germanya.com | Executable file name |
  195 + | IOC | test.exe | Executable file name |
  196 + | IOC | sfjozjero.exe | Executable file name |
  197 + +------------+----------------------+-----------------------------------------+
  198 +
  199 +### Triage mode (default for multiple files)
  200 +
  201 +When several files are scanned, or when using the option -t, a summary of the analysis for each file is displayed.
  202 +This is more convenient for quick triage of a collection of suspicious files.
  203 +
  204 +The following flags show the results of the analysis:
  205 +
  206 +- **OLE**: the file type is OLE, for example MS Office 97-2003
  207 +- **OpX**: the file type is OpenXML, for example MS Office 2007+
  208 +- **XML**: the file type is Word 2003 XML
  209 +- **MHT**: the file type is Word MHTML, aka Single File Web Page (.mht)
  210 +- **?**: the file type is not supported
  211 +- **M**: contains VBA Macros
  212 +- **A**: auto-executable macros
  213 +- **S**: suspicious VBA keywords
  214 +- **I**: potential IOCs
  215 +- **H**: hex-encoded strings (potential obfuscation)
  216 +- **B**: Base64-encoded strings (potential obfuscation)
  217 +- **D**: Dridex-encoded strings (potential obfuscation)
  218 +- **V**: VBA string expressions (potential obfuscation)
  219 +
  220 +Here is an example:
  221 +
  222 + :::text
  223 + c:\>olevba.py \MalwareZoo\VBA\samples\*
  224 + Flags Filename
  225 + ----------- -----------------------------------------------------------------
  226 + OLE:MASI--- \MalwareZoo\VBA\samples\DIAN_caso-5415.doc.malware
  227 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_1.doc.malware
  228 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_2.doc.malware
  229 + OLE:MASI--- \MalwareZoo\VBA\samples\DRIDEX_3.doc.malware
  230 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_4.doc.malware
  231 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_5.doc.malware
  232 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_6.doc.malware
  233 + OLE:MAS---- \MalwareZoo\VBA\samples\DRIDEX_7.doc.malware
  234 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_8.doc.malware
  235 + OLE:MASIHBD \MalwareZoo\VBA\samples\DRIDEX_9.xls.malware
  236 + OLE:MASIH-- \MalwareZoo\VBA\samples\DRIDEX_A.doc.malware
  237 + OLE:------- \MalwareZoo\VBA\samples\Normal_Document.doc
  238 + OLE:M------ \MalwareZoo\VBA\samples\Normal_Document_Macro.doc
  239 + OpX:MASI--- \MalwareZoo\VBA\samples\RottenKitten.xlsb.malware
  240 + OLE:MASI-B- \MalwareZoo\VBA\samples\ROVNIX.doc.malware
  241 + OLE:MA----- \MalwareZoo\VBA\samples\Word within Word macro auto.doc
  242 +
  243 +
  244 +--------------------------------------------------------------------------
  245 +
  246 +## How to use olevba in Python applications
  247 +
  248 +olevba can be used from your own Python applications to open an MS Office file, detect whether it contains VBA macros,
  249 +and extract and analyze the VBA source code.
  250 +
  251 +IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
  252 +
  253 +### Import olevba
  254 +
  255 +First, import the **oletools.olevba** module, using at least the VBA_Parser class and the file type constants:
  256 +
  257 + :::python
  258 + from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
  259 +
  260 +### Parse a MS Office file
  261 +
  262 +To parse a file on disk, create an instance of the **VBA_Parser** class, providing the name of the file to open as parameter.
  263 +For example:
  264 +
  265 + :::python
  266 + vbaparser = VBA_Parser('my_file_with_macros.doc')
  267 +
  268 +The file may also be provided as a bytes string containing its data. In that case, pass the actual
  269 +filename for reference, and the file content through the data parameter. For example:
  270 +
  271 + :::python
  272 + myfile = 'my_file_with_macros.doc'
  273 + filedata = open(myfile, 'rb').read()
  274 + vbaparser = VBA_Parser(myfile, data=filedata)
  275 +
  276 +VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML
  277 +(MS Office 2007+), MHTML or Word 2003 XML.
  278 +
  279 +After parsing the file, the attribute **VBA_Parser.type** is a string indicating the file type.
  280 +It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)
  281 +
  282 +### Detect VBA macros
  283 +
  284 +The method **detect_vba_macros** of a VBA_Parser object returns True if VBA macros have been found in the file,
  285 +False otherwise.
  286 +
  287 + :::python
  288 + if vbaparser.detect_vba_macros():
  289 + print 'VBA Macros found'
  290 + else:
  291 + print 'No VBA Macros found'
  292 +
  293 +Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine
  294 +for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will
  295 +always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
  296 +
  297 +Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method
  298 +may return True if the embedded document contains VBA Macros, even if the main document does not.
  299 +
  300 +### Extract VBA Macro Source Code
  301 +
  302 +The method **extract_macros** extracts and decompresses source code for each VBA macro found in the file (possibly
  303 +including embedded files). It is a generator yielding a tuple (filename, stream_path, vba_filename, vba_code)
  304 +for each VBA macro found.
  305 +
  306 +- filename: If the file is OLE (MS Office 97-2003), filename is the path of the file.
  307 + If the file is OpenXML (MS Office 2007+), filename is the path of the OLE subfile containing VBA macros within the zip archive,
  308 + e.g. word/vbaProject.bin.
  309 +- stream_path: path of the OLE stream containing the VBA macro source code
  310 +- vba_filename: corresponding VBA filename
  311 +- vba_code: string containing the VBA source code in clear text
  312 +
  313 +Example:
  314 +
  315 + :::python
  316 + for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
  317 + print '-'*79
  318 + print 'Filename :', filename
  319 + print 'OLE stream :', stream_path
  320 + print 'VBA filename:', vba_filename
  321 + print '- '*39
  322 + print vba_code
  323 +
  324 +Alternatively, the VBA_Parser method **extract_all_macros** returns the same results as a list of tuples.
  325 +
  326 +### Analyze VBA Source Code
  327 +
  328 +Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained
  329 +in a file:
  330 +
  331 +The methods **scan** or **scan_summary** from the class **VBA_Parser** can be used to scan the source code of all
  332 +VBA modules to find obfuscated strings, suspicious keywords, IOCs, auto-executable macros, etc.
  333 +
  334 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  335 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
  336 +By default, only the decoded strings that consist entirely of printable characters are included.
  337 +
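To make the idea of a "printable only" filter concrete, here is a minimal sketch for hex-encoded strings. It is an assumption about what such a filter could look like, not olevba's own implementation: the helper name `decode_hex_if_printable` is hypothetical, and the exact criterion olevba uses for printable characters may differ.

```python
import binascii
import string


def decode_hex_if_printable(hex_text):
    """Decode a hex string; return the text only if every character is printable.

    Returns None for invalid hex, non-ASCII bytes or non-printable characters.
    """
    try:
        raw = binascii.unhexlify(hex_text)
    except (TypeError, ValueError, binascii.Error):
        # Odd length or non-hex characters: not a valid hex string.
        return None
    try:
        text = raw.decode('ascii')
    except UnicodeDecodeError:
        return None
    if all(ch in string.printable for ch in text):
        return text
    return None
```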
  338 +**VBA_Parser.scan()** returns a list of tuples (type, keyword, description), one for each item in the results.
  339 +
  340 +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String', 'Dridex String' or
  341 + 'VBA obfuscated Strings'.
  342 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  343 + the decoded value of the string.
  344 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  345 +
  346 +Example:
  347 +
  348 + :::python
  349 + results = vbaparser.scan()
  350 + for kw_type, keyword, description in results:
  351 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  352 +
  353 +**VBA_Parser.scan_summary()** returns a tuple with the number of items found for each category:
  354 +(autoexec, suspicious, IOCs, hex, base64, dridex, vbastrings).
  355 +
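The relationship between the two return values can be sketched as a simple tally of scan()-style tuples, in the category order given above. This is an illustration only, not the library's code: the helper name `summarize` is hypothetical, and the sample rows are illustrative values shaped like the analysis table shown earlier.

```python
from collections import Counter

# Result types, in the same order as the scan_summary() tuple described above.
SUMMARY_ORDER = (
    'AutoExec', 'Suspicious', 'IOC', 'Hex String',
    'Base64 String', 'Dridex String', 'VBA obfuscated Strings',
)


def summarize(results):
    """Count (type, keyword, description) tuples for each category."""
    counts = Counter(kw_type for kw_type, _keyword, _description in results)
    return tuple(counts[kw_type] for kw_type in SUMMARY_ORDER)


# Illustrative rows only, not real scan output.
sample_results = [
    ('AutoExec', 'AutoOpen', 'Runs when the Word document is opened'),
    ('AutoExec', 'Workbook_Open', 'Runs when the Excel Workbook is opened'),
    ('Suspicious', 'Shell', 'May run an executable file or a system command'),
    ('IOC', 'test.exe', 'Executable file name'),
]
print(summarize(sample_results))  # (2, 1, 1, 0, 0, 0, 0)
```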
  356 +
  357 +
  358 +### Close the VBA_Parser
  359 +
  360 +After usage, it is better to call the **close** method of the VBA_Parser object, to make sure the file is closed,
  361 +especially if your application is parsing many files.
  362 +
  363 + :::python
  364 + vbaparser.close()
  365 +
  366 +
  367 +--------------------------------------------------------------------------
  368 +
  369 +## Deprecated API
  370 +
  371 +The following methods and functions are still functional, but their usage is not recommended
  372 +since they have been replaced by better solutions.
  373 +
  374 +### VBA_Scanner (deprecated)
  375 +
  376 +Note: this API is under active development and may change in the future.
  377 +
  378 +The class **VBA_Scanner** can be used to scan the source code of a VBA module to find obfuscated strings,
  379 +suspicious keywords, IOCs, auto-executable macros, etc.
  380 +
  381 +First, create a VBA_Scanner object with a string containing the VBA source code (for example returned by the
  382 +extract_macros method). Then call the methods **scan** or **scan_summary** to get the results of the analysis.
  383 +
  384 +scan() takes an optional argument include_decoded_strings: if set to True, the results will contain all the encoded
  385 +strings found in the code (Hex, Base64, Dridex) with their decoded value.
  386 +
  387 +**scan** returns a list of tuples (type, keyword, description), one for each item in the results.
  388 +
  389 +- type may be either 'AutoExec', 'Suspicious', 'IOC', 'Hex String', 'Base64 String' or 'Dridex String'.
  390 +- keyword is the string found for auto-executable macros, suspicious keywords or IOCs. For obfuscated strings, it is
  391 + the decoded value of the string.
  392 +- description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  393 +
  394 +Example:
  395 +
  396 + :::python
  397 + vba_scanner = VBA_Scanner(vba_code)
  398 + results = vba_scanner.scan(include_decoded_strings=True)
  399 + for kw_type, keyword, description in results:
  400 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  401 +
  402 +The function **scan_vba** is a shortcut for VBA_Scanner(vba_code).scan():
  403 +
  404 + :::python
  405 + results = scan_vba(vba_code, include_decoded_strings=True)
  406 + for kw_type, keyword, description in results:
  407 + print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
  408 +
  409 +**scan_summary** returns a tuple with the number of items found for each category:
  410 +(autoexec, suspicious, IOCs, hex, base64, dridex).
  411 +
  412 +
  413 +### Detect auto-executable macros (deprecated)
  414 +
  415 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
  416 +
  417 +The function **detect_autoexec** checks if VBA macro code contains specific macro names
  418 +that will be triggered when the document/workbook is opened, closed, changed, etc.
  419 +
  420 +It returns a list of tuples containing two strings, the detected keyword, and the
  421 +description of the trigger. (See the malware example above)
  422 +
  423 +Sample usage:
  424 +
  425 + :::python
  426 + from oletools.olevba import detect_autoexec
  427 + autoexec_keywords = detect_autoexec(vba_code)
  428 + if autoexec_keywords:
  429 + print 'Auto-executable macro keywords found:'
  430 + for keyword, description in autoexec_keywords:
  431 + print '%s: %s' % (keyword, description)
  432 + else:
  433 + print 'Auto-executable macro keywords: None found'
  434 +
  435 +
  436 +### Detect suspicious VBA keywords (deprecated)
  437 +
  438 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
  439 +
  440 +The function **detect_suspicious** checks if VBA macro code contains specific
  441 +keywords often used by malware to act on the system (create files, run
  442 +commands or applications, write to the registry, etc).
  443 +
  444 +It returns a list of tuples containing two strings, the detected keyword, and the
  445 +description of the corresponding malicious behaviour. (See the malware example above)
  446 +
  447 +Sample usage:
  448 +
  449 + :::python
  450 + from oletools.olevba import detect_suspicious
  451 + suspicious_keywords = detect_suspicious(vba_code)
  452 + if suspicious_keywords:
  453 + print 'Suspicious VBA keywords found:'
  454 + for keyword, description in suspicious_keywords:
  455 + print '%s: %s' % (keyword, description)
  456 + else:
  457 + print 'Suspicious VBA keywords: None found'
  458 +
  459 +
  460 +### Extract potential IOCs (deprecated)
  461 +
  462 +**Deprecated**: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
  463 +
  464 +The function **detect_patterns** checks if VBA macro code contains specific
  465 +patterns of interest, that may be useful for malware analysis and detection
  466 +(potential Indicators of Compromise): IP addresses, e-mail addresses,
  467 +URLs, executable file names.
  468 +
  469 +It returns a list of tuples containing two strings, the pattern type, and the
  470 +extracted value. (See the malware example above)
  471 +
  472 +Sample usage:
  473 +
  474 + :::python
  475 + from oletools.olevba import detect_patterns
  476 + patterns = detect_patterns(vba_code)
  477 + if patterns:
  478 + print 'Patterns found:'
  479 + for pattern_type, value in patterns:
  480 + print '%s: %s' % (pattern_type, value)
  481 + else:
  482 + print 'Patterns: None found'
  483 +
  484 +
  485 +--------------------------------------------------------------------------
  486 +
  487 +python-oletools documentation
  488 +-----------------------------
  489 +
  490 +- [[Home]]
  491 +- [[License]]
  492 +- [[Install]]
  493 +- [[Contribute]], Suggest Improvements or Report Issues
  494 +- Tools:
  495 + - [[olebrowse]]
  496 + - [[oleid]]
  497 + - [[olemeta]]
  498 + - [[oletimes]]
  499 + - [[olevba]]
  500 + - [[pyxswf]]
494 501 - [[rtfobj]]
495 502 \ No newline at end of file
... ...
oletools/olevba.py
... ... @@ -2100,6 +2100,7 @@ class VBA_Parser_CLI(VBA_Parser):
2100 2100  
2101 2101  
2102 2102 def reveal(self):
  2103 + #TODO: move this code to the VBA_Parser class (without print)
2103 2104 print 'MACRO SOURCE CODE WITH DEOBFUSCATED VBA STRINGS (EXPERIMENTAL):\n'
2104 2105 # we only want printable strings:
2105 2106 analysis = self.analyze_macros(show_decoded_strings=False)
... ...