Commit 2e0e1529c778b2edadd2c04546367b4567eb72d4
1 parent 3033d641
updated readme, moved tools details to the documentation wiki
Showing 2 changed files with 56 additions and 149 deletions
README.md
| @@ -3,185 +3,62 @@ python-oletools | @@ -3,185 +3,62 @@ python-oletools | ||
| 3 | 3 | ||
| 4 | [python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis and debugging. It is based on the [OleFileIO_PL](http://www.decalage.info/python/olefileio) parser. See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info. | 4 | [python-oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze [Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format)](http://en.wikipedia.org/wiki/Compound_File_Binary_Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis and debugging. It is based on the [OleFileIO_PL](http://www.decalage.info/python/olefileio) parser. See [http://www.decalage.info/python/oletools](http://www.decalage.info/python/oletools) for more info. |
| 5 | 5 | ||
| 6 | +**Quick links:** [Home page](http://www.decalage.info/python/oletools) - [Download](https://bitbucket.org/decalage/oletools/downloads) - [Documentation](https://bitbucket.org/decalage/oletools/wiki) - [Report issues](https://bitbucket.org/decalage/oletools/issues?status=new&status=open) - [Contact the author](http://decalage.info/contact) - [Repository](https://bitbucket.org/decalage/oletools) - [Updates on Twitter](https://twitter.com/decalage2) | ||
| 7 | + | ||
| 6 | Note: python-oletools is not related to OLETools published by BeCubed Software. | 8 | Note: python-oletools is not related to OLETools published by BeCubed Software. |
| 7 | 9 | ||
| 8 | Tools in python-oletools: | 10 | Tools in python-oletools: |
| 9 | ------------------------- | 11 | ------------------------- |
| 10 | 12 | ||
| 11 | -- **olebrowse**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to | 13 | +- **[olebrowse](https://bitbucket.org/decalage/oletools/wiki/olebrowse)**: A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to |
| 12 | view and extract individual data streams. | 14 | view and extract individual data streams. |
| 13 | -- **oleid**: a tool to analyze OLE files to detect specific characteristics that could potentially indicate that the file is suspicious or malicious. | ||
| 14 | -- **olemeta**: a tool to extract all standard properties (metadata) from OLE files. | ||
| 15 | -- **oletimes**: a tool to extract creation and modification timestamps of all streams and storages. | ||
| 16 | -- **olevba (new)**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML). | ||
| 17 | -- **pyxswf**: a tool to detect, extract and analyze Flash objects (SWF) that may | 15 | +- **[oleid](https://bitbucket.org/decalage/oletools/wiki/oleid)**: a tool to analyze OLE files to detect specific characteristics that could potentially indicate that the file is suspicious or malicious. |
| 16 | +- **[olemeta](https://bitbucket.org/decalage/oletools/wiki/olemeta)**: a tool to extract all standard properties (metadata) from OLE files. | ||
| 17 | +- **[oletimes](https://bitbucket.org/decalage/oletools/wiki/oletimes)**: a tool to extract creation and modification timestamps of all streams and storages. | ||
| 18 | +- **[olevba](https://bitbucket.org/decalage/oletools/wiki/olevba) (new)**: a tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML). | ||
| 19 | +- **[pyxswf](https://bitbucket.org/decalage/oletools/wiki/pyxswf)**: a tool to detect, extract and analyze Flash objects (SWF) that may | ||
| 18 | be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, | 20 | be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, |
| 19 | which is especially useful for malware analysis. | 21 | which is especially useful for malware analysis. |
| 20 | -- **rtfobj**: a tool and python module to extract embedded objects from RTF files. | 22 | +- **[rtfobj](https://bitbucket.org/decalage/oletools/wiki/rtfobj)**: a tool and python module to extract embedded objects from RTF files. |
| 21 | - and a few others (coming soon) | 23 | - and a few others (coming soon) |
| 22 | 24 | ||
| 23 | News | 25 | News |
| 24 | ---- | 26 | ---- |
| 25 | 27 | ||
| 26 | -- 2014-08-15 v0.06alpha: added olevba, a new tool to extract VBA Macro source code from MS Office documents (OLE and OpenXML) | ||
| 27 | -- 2013-07-24 v0.05: added new tools olemeta and oletimes | ||
| 28 | -- 2013-04-18 v0.04: fixed bug in rtfobj, added documentation for rtfobj | ||
| 29 | -- 2012-11-09 v0.03: Improved pyxswf to extract Flash objects from RTF | ||
| 30 | -- 2012-10-29 v0.02: Added oleid | ||
| 31 | -- 2012-10-09 v0.01: Initial version of olebrowse and pyxswf | 28 | +- **2014-08-16 v0.06**: added [olevba](https://bitbucket.org/decalage/oletools/wiki/olevba), a new tool to extract VBA Macro source code from MS Office documents (97-2003 and 2007+). Improved [documentation](https://bitbucket.org/decalage/oletools/wiki) |
| 29 | +- 2013-07-24 v0.05: added new tools [olemeta](https://bitbucket.org/decalage/oletools/wiki/olemeta) and [oletimes](https://bitbucket.org/decalage/oletools/wiki/oletimes) | ||
| 30 | +- 2013-04-18 v0.04: fixed bug in rtfobj, added documentation for [rtfobj](https://bitbucket.org/decalage/oletools/wiki/rtfobj) | ||
| 31 | +- 2012-11-09 v0.03: Improved [pyxswf](https://bitbucket.org/decalage/oletools/wiki/pyxswf) to extract Flash objects from RTF | ||
| 32 | +- 2012-10-29 v0.02: Added [oleid](https://bitbucket.org/decalage/oletools/wiki/oleid) | ||
| 33 | +- 2012-10-09 v0.01: Initial version of [olebrowse](https://bitbucket.org/decalage/oletools/wiki/olebrowse) and pyxswf | ||
| 32 | - see changelog in source code for more info. | 34 | - see changelog in source code for more info. |
| 33 | 35 | ||
| 34 | -Download: | ||
| 35 | ---------- | ||
| 36 | - | ||
| 37 | -The archive is available on [the project page](https://bitbucket.org/decalage/oletools/downloads). | ||
| 38 | - | ||
| 39 | - | ||
| 40 | -olebrowse: | ||
| 41 | ----------- | ||
| 42 | - | ||
| 43 | -A simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to | ||
| 44 | -view and extract individual data streams. | ||
| 45 | - | ||
| 46 | - Usage: olebrowse.py [file] | ||
| 47 | - | ||
| 48 | -If you provide a file it will be opened, else a dialog will allow you to browse folders to open a file. Then if it is a valid OLE file, the list of data streams will be displayed. You can select a stream, and then either view its content in a builtin hexadecimal viewer, or save it to a file for further analysis. | ||
| 49 | - | ||
| 50 | -For screenshots and other info, see [http://www.decalage.info/python/olebrowse](http://www.decalage.info/python/olebrowse) | ||
| 51 | - | ||
| 52 | -oleid: | ||
| 53 | ------- | ||
| 54 | - | ||
| 55 | -oleid is a script to analyze OLE files such as MS Office documents (e.g. Word, | ||
| 56 | -Excel), to detect specific characteristics that could potentially indicate that | ||
| 57 | -the file is suspicious or malicious, in terms of security (e.g. malware). | ||
| 58 | -For example it can detect VBA macros, embedded Flash objects, fragmentation. | ||
| 59 | - | ||
| 60 | - Usage: oleid.py <file> | ||
| 61 | - | ||
| 62 | -Example - analyzing a Word document containing a Flash object and VBA macros: | ||
| 63 | - | ||
| 64 | - C:\oletools>oleid.py word_flash_vba.doc | ||
| 65 | - Filename: word_flash_vba.doc | ||
| 66 | - OLE format: True | ||
| 67 | - Has SummaryInformation stream: True | ||
| 68 | - Application name: Microsoft Office Word | ||
| 69 | - Encrypted: False | ||
| 70 | - Word Document: True | ||
| 71 | - VBA Macros: True | ||
| 72 | - Excel Workbook: False | ||
| 73 | - PowerPoint Presentation: False | ||
| 74 | - Visio Drawing: False | ||
| 75 | - ObjectPool: True | ||
| 76 | - Flash objects: 1 | ||
| 77 | - | ||
| 78 | -oleid project website: [http://www.decalage.info/python/oleid](http://www.decalage.info/python/oleid) | ||
| 79 | - | ||
| 80 | - | ||
| 81 | -pyxswf: | ||
| 82 | --------- | ||
| 83 | - | ||
| 84 | -pyxswf is a script to detect, extract and analyze Flash objects (SWF files) that may | ||
| 85 | -be embedded in files such as MS Office documents (e.g. Word, Excel), | ||
| 86 | -which is especially useful for malware analysis. | ||
| 87 | - | ||
| 88 | -pyxswf is an extension to [xxxswf.py](http://hooked-on-mnemonics.blogspot.nl/2011/12/xxxswfpy.html) published by Alexander Hanel. | ||
| 89 | - | ||
| 90 | -Compared to xxxswf, it can extract streams from MS Office documents by parsing | ||
| 91 | -their OLE structure properly, which is necessary when streams are fragmented. | ||
| 92 | -Stream fragmentation is a known obfuscation technique, as explained on | ||
| 93 | -[http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/](http://www.breakingpointsystems.com/resources/blog/evasion-with-ole2-fragmentation/) | ||
| 94 | - | ||
| 95 | -It can also extract Flash objects from RTF documents, by parsing embedded objects encoded in hexadecimal format (-f option). | ||
| 96 | - | ||
| 97 | - | ||
| 98 | -For this, simply add the -o option to work on OLE streams rather than raw files, or the -f option to work on RTF files. | 36 | +Download and Install: |
| 37 | +--------------------- | ||
| 99 | 38 | ||
| 100 | - Usage: pyxswf.py [options] <file.bad> | ||
| 101 | - | ||
| 102 | - Options: | ||
| 103 | - -o, --ole Parse an OLE file (e.g. Word, Excel) to look for SWF | ||
| 104 | - in each stream | ||
| 105 | - -f, --rtf Parse an RTF file to look for SWF in each embedded | ||
| 106 | - object | ||
| 107 | - -x, --extract Extracts the embedded SWF(s), names it MD5HASH.swf & | ||
| 108 | - saves it in the working dir. No addition args needed | ||
| 109 | - -h, --help show this help message and exit | ||
| 110 | - -y, --yara Scans the SWF(s) with yara. If the SWF(s) is | ||
| 111 | - compressed it will be deflated. No addition args | ||
| 112 | - needed | ||
| 113 | - -s, --md5scan Scans the SWF(s) for MD5 signatures. Please see func | ||
| 114 | - checkMD5 to define hashes. No addition args needed | ||
| 115 | - -H, --header Displays the SWFs file header. No addition args needed | ||
| 116 | - -d, --decompress Deflates compressed SWFS(s) | ||
| 117 | - -r PATH, --recdir=PATH | ||
| 118 | - Will recursively scan a directory for files that | ||
| 119 | - contain SWFs. Must provide path in quotes | ||
| 120 | - -c, --compress Compresses the SWF using Zlib | ||
| 121 | - | ||
| 122 | -Example 1 - detecting and extracting a SWF file from a Word document on Windows: | ||
| 123 | - | ||
| 124 | - C:\oletools>pyxswf.py -o word_flash.doc | ||
| 125 | - OLE stream: 'Contents' | ||
| 126 | - [SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents | ||
| 127 | - [ADDR] SWF 1 at 0x8 - FWS Header | ||
| 128 | - | ||
| 129 | - C:\oletools>pyxswf.py -xo word_flash.doc | ||
| 130 | - OLE stream: 'Contents' | ||
| 131 | - [SUMMARY] 1 SWF(s) in MD5:993664cc86f60d52d671b6610813cfd1:Contents | ||
| 132 | - [ADDR] SWF 1 at 0x8 - FWS Header | ||
| 133 | - [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf | ||
| 134 | - | ||
| 135 | -Example 2 - detecting and extracting a SWF file from a RTF document on Windows: | ||
| 136 | - | ||
| 137 | - C:\oletools>pyxswf.py -xf "rtf_flash.rtf" | ||
| 138 | - RTF embedded object size 1498557 at index 000036DD | ||
| 139 | - [SUMMARY] 1 SWF(s) in MD5:46a110548007e04f4043785ac4184558:RTF_embedded_object_0 | ||
| 140 | - 00036DD | ||
| 141 | - [ADDR] SWF 1 at 0xc40 - FWS Header | ||
| 142 | - [FILE] Carved SWF MD5: 2498e9c0701dc0e461ab4358f9102bc5.swf | ||
| 143 | - | ||
| 144 | -For more info, see [http://www.decalage.info/python/pyxswf](http://www.decalage.info/python/pyxswf) | ||
| 145 | - | ||
| 146 | - | ||
| 147 | -rtfobj | ||
| 148 | ------- | ||
| 149 | - | ||
| 150 | -rtfobj is a Python module to extract embedded objects from RTF files, such as | ||
| 151 | -OLE ojects. It can be used as a Python library or a command-line tool. | ||
| 152 | - | ||
| 153 | - Usage: rtfobj.py <file.rtf> | ||
| 154 | - | ||
| 155 | -It extracts and decodes all the data blocks encoded as hexadecimal in the RTF document, and saves them as files named "object_xxxx.bin", xxxx being the location of the object in the RTF file. | ||
| 156 | - | ||
| 157 | -Usage as python module: rtf_iter_objects(filename) is an iterator which yields a tuple (index, object) providing the index of each hexadecimal stream in the RTF file, and the corresponding decoded object. Example: | ||
| 158 | - | ||
| 159 | - import rtfobj | ||
| 160 | - for index, data in rtfobj.rtf_iter_objects("myfile.rtf"): | ||
| 161 | - print 'found object size %d at index %08X' % (len(data), index) | ||
| 162 | - | ||
| 163 | - | ||
| 164 | -For more info, see [http://www.decalage.info/python/rtfobj](http://www.decalage.info/python/rtfobj) | 39 | +To use python-oletools from the command line as analysis tools, you may simply [download the zip archive](https://bitbucket.org/decalage/oletools/downloads) and extract the files in the directory of your choice. |
| 165 | 40 | ||
| 41 | +If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "**easy_install oletools**" or "**pip install oletools**" to download and install in one go. Otherwise you may download the zip archive and run "**setup.py install**". | ||
| 166 | 42 | ||
| 167 | How to contribute: | 43 | How to contribute: |
| 168 | ------------------ | 44 | ------------------ |
| 169 | 45 | ||
| 170 | -The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/oletools). You may use it to submit enhancements or to report any issue. | 46 | +The code is available in [a Mercurial repository on bitbucket](https://bitbucket.org/decalage/oletools). You may use it to submit enhancements (using fork and pull requests) or to report any issue. |
| 171 | 47 | ||
| 172 | -If you would like to help us improve this module, or simply provide feedback, you may also send an e-mail to decalage(at)laposte.net. | 48 | +If you would like to help us improve this module, or simply provide feedback, you may also [contact the author](http://decalage.info/contact). |
| 173 | 49 | ||
| 174 | -How to report bugs: | ||
| 175 | -------------------- | 50 | +How to suggest improvements or report bugs: |
| 51 | +------------------------------------------- | ||
| 176 | 52 | ||
| 177 | -To report a bug or any issue, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), or send an e-mail with all the information and files to reproduce the problem. | 53 | +To suggest improvements, report a bug or any issue, please use the [issue reporting page](https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open), providing all the information and files to reproduce the problem. You may also [contact the author](http://decalage.info/contact). |
| 178 | 54 | ||
| 179 | License | 55 | License |
| 180 | ------- | 56 | ------- |
| 181 | 57 | ||
| 182 | This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license. | 58 | This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license. |
| 183 | 59 | ||
| 184 | -The python-oletools package is copyright (c) 2012-2013, Philippe Lagadec (http://www.decalage.info) | 60 | +The python-oletools package is copyright (c) 2012-2014 Philippe Lagadec (http://www.decalage.info) |
| 61 | + | ||
| 185 | All rights reserved. | 62 | All rights reserved. |
| 186 | 63 | ||
| 187 | Redistribution and use in source and binary forms, with or without modification, | 64 | Redistribution and use in source and binary forms, with or without modification, |
| @@ -204,3 +81,28 @@ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | @@ -204,3 +81,28 @@ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | ||
| 204 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | 81 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
| 205 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | 82 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
| 206 | 83 | ||
| 84 | + | ||
| 85 | +---------- | ||
| 86 | + | ||
| 87 | +olevba contains modified source code from the officeparser project, published | ||
| 88 | +under the following MIT License (MIT): | ||
| 89 | + | ||
| 90 | +officeparser is copyright (c) 2014 John William Davison | ||
| 91 | + | ||
| 92 | +Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| 93 | +of this software and associated documentation files (the "Software"), to deal | ||
| 94 | +in the Software without restriction, including without limitation the rights | ||
| 95 | +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| 96 | +copies of the Software, and to permit persons to whom the Software is | ||
| 97 | +furnished to do so, subject to the following conditions: | ||
| 98 | + | ||
| 99 | +The above copyright notice and this permission notice shall be included in all | ||
| 100 | +copies or substantial portions of the Software. | ||
| 101 | + | ||
| 102 | +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| 103 | +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| 104 | +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| 105 | +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| 106 | +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| 107 | +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
| 108 | +SOFTWARE. |
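The rtfobj description that this commit moves from the README to the wiki says the tool decodes the data blocks stored as hexadecimal text in an RTF document and reports the index of each one. The idea can be sketched in a few lines of Python 3. This is not rtfobj's actual implementation: the name `iter_hex_blocks`, the regular expression, and the minimum run length of 16 byte pairs are all assumptions made for illustration, and the sketch ignores hex digits separated from their pair by whitespace.

```python
import binascii
import re

# Hypothetical pattern: at least 16 consecutive hex byte pairs, each optionally
# followed by whitespace. The threshold of 16 is an arbitrary assumption.
HEX_RUN = re.compile(rb"(?:[0-9A-Fa-f]{2}\s*){16,}")


def iter_hex_blocks(rtf_bytes):
    """Yield (index, decoded_bytes) for each long hex run found in rtf_bytes."""
    for match in HEX_RUN.finditer(rtf_bytes):
        # Drop the line breaks and spaces that RTF writers insert between pairs.
        hex_text = re.sub(rb"\s+", b"", match.group(0))
        yield match.start(), binascii.unhexlify(hex_text)
```

Each yielded index is the byte offset of the hex run in the input, which mirrors the "location of the object in the RTF file" mentioned in the removed README text.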
oletools/pyxswf.py
| @@ -55,6 +55,11 @@ __version__ = '0.02' | @@ -55,6 +55,11 @@ __version__ = '0.02' | ||
| 55 | 55 | ||
| 56 | #------------------------------------------------------------------------------ | 56 | #------------------------------------------------------------------------------ |
| 57 | # TODO: | 57 | # TODO: |
| 58 | +# + add support for LZMA-compressed flash files (ZWS header) | ||
| 59 | +# references: http://blog.malwaretracker.com/2014/01/cve-2013-5331-evaded-av-by-using.html | ||
| 60 | +# http://code.metager.de/source/xref/adobe/flash/crossbridge/tools/swf-info.py | ||
| 61 | +# http://room32.dyndns.org/forums/showthread.php?766-SWFCompression | ||
| 62 | +# sample code: http://room32.dyndns.org/SWFCompression.py | ||
| 58 | # - check if file is OLE | 63 | # - check if file is OLE |
| 59 | # - support -r | 64 | # - support -r |
| 60 | 65 |
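The new TODO item refers to the three SWF signatures: `FWS` (uncompressed), `CWS` (zlib-compressed) and `ZWS` (LZMA-compressed). A minimal Python 3 sketch of how a header check could tell them apart is shown below. It is not code from pyxswf.py; the names `SWF_SIGNATURES`, `parse_swf_header` and `inflate_cws` are hypothetical, and LZMA decompression is deliberately left out, since that is the part the TODO says is still missing.

```python
import struct
import zlib

# Signature -> compression scheme, based on the first three bytes of the file.
SWF_SIGNATURES = {
    b"FWS": "uncompressed",
    b"CWS": "zlib",
    b"ZWS": "lzma",
}


def parse_swf_header(data):
    """Return (compression, version, declared_length), or None if not an SWF header."""
    if len(data) < 8:
        return None
    compression = SWF_SIGNATURES.get(data[:3])
    if compression is None:
        return None
    version = data[3]
    # Bytes 4-7 hold the length of the uncompressed file, little-endian.
    (declared_length,) = struct.unpack("<I", data[4:8])
    return compression, version, declared_length


def inflate_cws(data):
    """Convert a zlib-compressed (CWS) SWF into its uncompressed (FWS) form."""
    header = parse_swf_header(data)
    if header is None or header[0] != "zlib":
        raise ValueError("not a CWS (zlib-compressed) SWF")
    # Only the body after the 8-byte header is compressed.
    return b"FWS" + data[3:8] + zlib.decompress(data[8:])
```

A scanner that only looks for `FWS` and `CWS` would pass over a `ZWS` file entirely, which is why recognizing the third signature is a useful first step even before decompression is supported.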