Commit 2e0e1529c778b2edadd2c04546367b4567eb72d4

Authored by Philippe Lagadec
1 parent 3033d641

updated readme, moved tools details to the documentation wiki

Showing 2 changed files with 56 additions and 149 deletions
README.md
@@ -3,185 +3,62 @@ python-oletools
3 3
4 [python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis and debugging. It is based on the [OleFileIO_PL](http://www.decalage.info/python/olefileio) parser. See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info. 4 [python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis and debugging. It is based on the [OleFileIO_PL](http://www.decalage.info/python/olefileio) parser. See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info.
5 5
  6 +**Quick links:** [Home page](http://www.decalage.info/python/oletools) - [Download](https://bitbucket.org/decalage/oletools/downloads) - [Documentation](https://bitbucket.org/decalage/oletools/wiki) - [Report issues](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) - [Contact the author](http://decalage.info/contact) - [Repository](https://bitbucket.org/decalage/oletools) - [Updates on Twitter](https://twitter.com/decalage2)
  7 +
6 Note: python-oletools is not related to OLETools published by BeCubed Software. 8 Note: python-oletools is not related to OLETools published by BeCubed Software.
7 9
8 Tools in python-oletools: 10 Tools in python-oletools:
9 ------------------------- 11 -------------------------
10 12
11 -- **olebrowse**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to 13 +- **[olebrowse](https://bitbucket.org/decalage/oletools/wiki/olebrowse)**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to
12 view and extract individual data streams. 14 view and extract individual data streams.
13 -- **oleid**: a tool to analyze OLE files to detect specific characteristics that could potentially indicate that the file is suspicious or malicious.  
14 -- **olemeta**: a tool to extract all standard properties (metadata) from OLE files.  
15 -- **oletimes**: a tool to extract creation and modification timestamps of all streams and storages.  
16 -- **olevba (new)**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).  
17 -- **pyxswf**: a tool to detect, extract and analyze Flash objects (SWF) that may 15 +- **[oleid](https://bitbucket.org/decalage/oletools/wiki/oleid)**: a tool to analyze OLE files to detect specific characteristics that could potentially indicate that the file is suspicious or malicious.
  16 +- **[olemeta](https://bitbucket.org/decalage/oletools/wiki/olemeta)**: a tool to extract all standard properties (metadata) from OLE files.
  17 +- **[oletimes](https://bitbucket.org/decalage/oletools/wiki/oletimes)**: a tool to extract creation and modification timestamps of all streams and storages.
  18 +- **[olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) (new)**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML).
  19 +- **[pyxswf](https://bitbucket.org/decalage/oletools/wiki/pyxswf)**: a tool to detect, extract and analyze Flash objects (SWF) that may
18 be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, 20 be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF,
19 which is especially useful for malware analysis. 21 which is especially useful for malware analysis.
20 -- **rtfobj**: a tool and python module to extract embedded objects from RTF files. 22 +- **[rtfobj](https://bitbucket.org/decalage/oletools/wiki/rtfobj)**: a tool and python module to extract embedded objects from RTF files.
21 - and a few others (coming soon) 23 - and a few others (coming soon)
22 24
23 News 25 News
24 ---- 26 ----
25 27
26 -- 2014-08-15 v0.06alpha: added olevba, a new tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML)  
27 -- 2013-07-24 v0.05: added new tools olemeta and oletimes  
28 -- 2013-04-18 v0.04: fixed bug in rtfobj, added documentation for rtfobj  
29 -- 2012-11-09 v0.03: Improved pyxswf to extract Flash objects from RTF  
30 -- 2012-10-29 v0.02: Added oleid  
31 -- 2012-10-09 v0.01: Initial version of olebrowse and pyxswf 28 +- **2014-08-16 v0.06**: added [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba), a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved [documentation](https://bitbucket.org/decalage/oletools/wiki)
  29 +- 2013-07-24 v0.05: added new tools [olemeta](https://bitbucket.org/decalage/oletools/wiki/olemeta) and [oletimes](https://bitbucket.org/decalage/oletools/wiki/oletimes)
  30 +- 2013-04-18 v0.04: fixed bug in rtfobj, added documentation for [rtfobj](https://bitbucket.org/decalage/oletools/wiki/rtfobj)
  31 +- 2012-11-09 v0.03: Improved [pyxswf](https://bitbucket.org/decalage/oletools/wiki/pyxswf) to extract Flash objects from RTF
  32 +- 2012-10-29 v0.02: Added [oleid](https://bitbucket.org/decalage/oletools/wiki/oleid)
  33 +- 2012-10-09 v0.01: Initial version of [olebrowse](https://bitbucket.org/decalage/oletools/wiki/olebrowse) and pyxswf
32 - see changelog in source code for more info. 34 - see changelog in source code for more info.
33 35
34 -Download:  
35 ----------  
36 -  
37 -The archive is available on [the project page](https://bitbucket.org/decalage/oletools/downloads).  
38 -  
39 -  
40 -olebrowse:  
41 -----------  
42 -  
43 -A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to  
44 -view and extract individual data streams.  
45 -  
46 - Usage: olebrowse.py [file]  
47 -  
48 -If you provide a file it will be opened, else a dialog will allow you to browse folders to open a file. Then if it is a valid OLE file, the list of data streams will be displayed. You can select a stream, and then either view its content in a builtin hexadecimal viewer, or save it to a file for further analysis.  
49 -  
50 -For screenshots and other info, see [http://www.decalage.info/python/olebrowse](http://www.decalage.info/python/olebrowse)  
51 -  
52 -oleid:  
53 -------  
54 -  
55 -oleid is a script to analyze OLE files such as MS Office documents (e.g. Word,  
56 -Excel), to detect specific characteristics that could potentially indicate that  
57 -the file is suspicious or malicious, in terms of security (e.g. malware).  
58 -For example it can detect VBA macros, embedded Flash objects, fragmentation.  
59 -  
60 - Usage: oleid.py <file>  
61 -  
62 -Example - analyzing a Word document containing a Flash object and VBA macros:  
63 -  
64 - C:\oletools>oleid.py word_flash_vba.doc  
65 - Filename: word_flash_vba.doc  
66 - OLE format: True  
67 - Has SummaryInformation stream: True  
68 - Application name: Microsoft Office Word  
69 - Encrypted: False  
70 - Word Document: True  
71 - VBA Macros: True  
72 - Excel Workbook: False  
73 - PowerPoint Presentation: False  
74 - Visio Drawing: False  
75 - ObjectPool: True  
76 - Flash objects: 1  
77 -  
78 -oleid project website: [http://www.decalage.info/python/oleid](http://www.decalage.info/python/oleid)  
79 -  
80 -  
81 -pyxswf:  
82 ---------  
83 -  
84 -pyxswf is a script to detect, extract and analyze Flash objects (SWF files) that may  
85 -be embedded in files such as MS Office documents (e.g. Word, Excel),  
86 -which is especially useful for malware analysis.  
87 -  
88 -pyxswf is an extension to [xxxswf.py](http://hooked-on-mnemonics.blogspot.nl/2011/12/xxxswfpy.html) published by Alexander Hanel.  
89 -  
90 -Compared to xxxswf, it can extract streams from MS Office documents by parsing  
91 -their OLE structure properly, which is necessary when streams are fragmented.  
92 -Stream fragmentation is a known obfuscation technique, as explained on  
93 -[http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/](http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/)  
94 -  
95 -It can also extract Flash objects from RTF documents, by parsing embedded objects encoded in hexadecimal format (-f option).  
96 -  
97 -  
98 -For this, simply add the -o option to work on OLE streams rather than raw files, or the -f option to work on RTF files. 36 +Download and Install:
  37 +---------------------
99 38
100 - Usage: pyxswf.py [options] <file.bad>  
101 -  
102 - Options:  
103 - -o, --ole Parse an OLE file (e.g. Word, Excel) to look for SWF  
104 - in each stream  
105 - -f, --rtf Parse an RTF file to look for SWF in each embedded  
106 - object  
107 - -x, --extract Extracts the embedded SWF(s), names it MD5HASH.swf &  
108 - saves it in the working dir. No addition args needed  
109 - -h, --help show this help message and exit  
110 - -y, --yara Scans the SWF(s) with yara. If the SWF(s) is  
111 - compressed it will be deflated. No addition args  
112 - needed  
113 - -s, --md5scan Scans the SWF(s) for MD5 signatures. Please see func  
114 - checkMD5 to define hashes. No addition args needed  
115 - -H, --header Displays the SWFs file header. No addition args needed  
116 - -d, --decompress Deflates compressed SWFS(s)  
117 - -r PATH, --recdir=PATH  
118 - Will recursively scan a directory for files that  
119 - contain SWFs. Must provide path in quotes  
120 - -c, --compress Compresses the SWF using Zlib  
121 -  
122 -Example 1 - detecting and extracting a SWF file from a Word document on Windows:  
123 -  
124 - C:\oletools>pyxswf.py -o word_flash.doc  
125 - OLE stream: 'Contents'  
126 - [SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents  
127 - [ADDR] SWF 1 at 0x8 - FWS Header  
128 -  
129 - C:\oletools>pyxswf.py -xo word_flash.doc  
130 - OLE stream: 'Contents'  
131 - [SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents  
132 - [ADDR] SWF 1 at 0x8 - FWS Header  
133 - [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf  
134 -  
135 -Example 2 - detecting and extracting a SWF file from a RTF document on Windows:  
136 -  
137 - C:\oletools>pyxswf.py -xf "rtf_flash.rtf"  
138 - RTF embedded object size 1498557 at index 000036DD  
139 - [SUMMARY] 1 SWF(s) in MD5:46a110548007e04f4043785ac4184558:RTF_embedded_object_0  
140 - 00036DD  
141 - [ADDR] SWF 1 at 0xc40 - FWS Header  
142 - [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf  
143 -  
144 -For more info, see [http://www.decalage.info/python/pyxswf](http://www.decalage.info/python/pyxswf)  
145 -  
146 -  
147 -rtfobj  
148 -------  
149 -  
150 -rtfobj is a Python module to extract embedded objects from RTF files, such as  
151 -OLE ojects. It can be used as a Python library or a command-line tool.  
152 -  
153 - Usage: rtfobj.py <file.rtf>  
154 -  
155 -It extracts and decodes all the data blocks encoded as hexadecimal in the RTF document, and saves them as files named "object_xxxx.bin", xxxx being the location of the object in the RTF file.  
156 -  
157 -Usage as python module: rtf_iter_objects(filename) is an iterator which yields a tuple (index, object) providing the index of each hexadecimal stream in the RTF file, and the corresponding decoded object. Example:  
158 -  
159 - import rtfobj  
160 - for index, data in rtfobj.rtf_iter_objects("myfile.rtf"):  
161 - print 'found object size %d at index %08X' % (len(data), index)  
162 -  
163 -  
164 -For more info, see [http://www.decalage.info/python/rtfobj](http://www.decalage.info/python/rtfobj) 39 +To use python-oletools from the command line as analysis tools, you may simply [download the zip archive](https://bitbucket.org/decalage/oletools/downloads) and extract the files in the directory of your choice.
165 40
  41 +If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "**easy_install oletools**" or "**pip install oletools**" to download and install in one go. Otherwise you may download the zip archive and run "**setup.py install**".
166 42
167 How to contribute: 43 How to contribute:
168 ------------------ 44 ------------------
169 45
170 -The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/oletools). You may use it to submit enhancements or to report any issue. 46 +The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/oletools). You may use it to submit enhancements (using fork and pull requests) or to report any issue.
171 47
172 -If you would like to help us improve this module, or simply provide feedback, you may also send an e-mail to decalage(at)laposte.net. 48 +If you would like to help us improve this module, or simply provide feedback, you may also [contact the author](http://decalage.info/contact).
173 49
174 -How to report bugs:  
175 -------------------- 50 +How to suggest improvements or report bugs:
  51 +-------------------------------------------
176 52
177 -To report a bug or any issue, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), or send an e-mail with all the information and files to reproduce the problem. 53 +To suggest improvements, report a bug or any issue, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), providing all the information and files to reproduce the problem. You may also [contact the author](http://decalage.info/contact).
178 54
179 License 55 License
180 ------- 56 -------
181 57
182 This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license. 58 This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
183 59
184 -The python-oletools package is copyright (c) 2012-2013, Philippe Lagadec (http://www.decalage.info) 60 +The python-oletools package is copyright (c) 2012-2014 Philippe Lagadec (http://www.decalage.info)
  61 +
185 All rights reserved. 62 All rights reserved.
186 63
187 Redistribution and use in source and binary forms, with or without modification, 64 Redistribution and use in source and binary forms, with or without modification,
@@ -204,3 +81,28 @@ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
204 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 81 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
205 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 82 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
206 83
  84 +
  85 +----------
  86 +
  87 +olevba contains modified source code from the officeparser project, published
  88 +under the following MIT License (MIT):
  89 +
  90 +officeparser is copyright (c) 2014 John William Davison
  91 +
  92 +Permission is hereby granted, free of charge, to any person obtaining a copy
  93 +of this software and associated documentation files (the "Software"), to deal
  94 +in the Software without restriction, including without limitation the rights
  95 +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  96 +copies of the Software, and to permit persons to whom the Software is
  97 +furnished to do so, subject to the following conditions:
  98 +
  99 +The above copyright notice and this permission notice shall be included in all
  100 +copies or substantial portions of the Software.
  101 +
  102 +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  103 +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  104 +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  105 +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  106 +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  107 +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  108 +SOFTWARE.
oletools/pyxswf.py
@@ -55,6 +55,11 @@ __version__ = '0.02'
55 55
56 #------------------------------------------------------------------------------ 56 #------------------------------------------------------------------------------
57 # TODO: 57 # TODO:
  58 +# + add support for LZMA-compressed flash files (ZWS header)
  59 +# references: http://blog.malwaretracker.com/2014/01/cve-2013-5331-evaded-av-by-using.html
  60 +# http://code.metager.de/source/xref/adobe/flash/crossbridge/tools/swf-info.py
  61 +# http://room32.dyndns.org/forums/showthread.php?766-SWFCompression
  62 +# sample code: http://room32.dyndns.org/SWFCompression.py
58 # - check if file is OLE 63 # - check if file is OLE
59 # - support -r 64 # - support -r
60 65
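
The TODO added to pyxswf.py concerns Flash files whose header starts with ZWS (LZMA compression), alongside the FWS (uncompressed) and CWS (zlib) signatures. As a rough sketch only, and not the code used in pyxswf or xxxswf, the three cases can be told apart from the first eight bytes of the file. The names `parse_swf_header` and `inflate_cws` are hypothetical, and LZMA decompression of ZWS bodies is deliberately left out.

```python
import struct
import zlib

# First three bytes of an SWF file -> compression scheme of the body.
SWF_SIGNATURES = {
    b'FWS': 'uncompressed',
    b'CWS': 'zlib',
    b'ZWS': 'lzma',
}


def parse_swf_header(data):
    """Return (compression, version, declared_length), or None if not SWF.

    Hypothetical helper for illustration; not part of pyxswf.
    """
    if len(data) < 8:
        return None
    compression = SWF_SIGNATURES.get(bytes(data[:3]))
    if compression is None:
        return None
    # Byte 3 is the SWF version; bytes 4-7 hold the length of the
    # uncompressed file as a little-endian 32-bit integer.
    version, declared_length = struct.unpack('<BI', data[3:8])
    return compression, version, declared_length


def inflate_cws(data):
    """Turn a zlib-compressed (CWS) file into its uncompressed (FWS) form.

    Only the body after the 8-byte header is compressed, so the version
    and length fields are copied through unchanged.
    """
    if bytes(data[:3]) != b'CWS':
        raise ValueError('not a CWS file')
    return b'FWS' + data[3:8] + zlib.decompress(data[8:])
```

A caller could run `parse_swf_header` on each carved candidate and report those flagged as `'lzma'` as unsupported until ZWS handling is implemented.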