diff --git a/.travis.yml b/.travis.yml
new file mode 100644
index 0000000..2c621c8
--- /dev/null
+++ b/.travis.yml
@@ -0,0 +1,9 @@
+language: python
+
+python:
+  - "2.7"
+  - "3.6"
+  - "nightly"
+cache: pip
+script:
+  - python setup.py test
diff --git a/README.md b/README.md
index 9cbe39b..7f3664c 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
 python-oletools
 ===============
+[![PyPI](https://img.shields.io/pypi/v/oletools.svg)](https://pypi.python.org/pypi/oletools)
+[![Build Status](https://travis-ci.org/decalage2/oletools.svg?branch=master)](https://travis-ci.org/decalage2/oletools)
 [oletools](http://www.decalage.info/python/oletools) is a package of python tools to analyze
 [Microsoft OLE2 files](http://en.wikipedia.org/wiki/Compound_File_Binary_Format)
@@ -22,7 +24,17 @@ Note: python-oletools is not related to OLETools published by BeCubed Software.
 News
 ----
-- **2016-11-01 v0.50**: all oletools now support python 2 and 3.
+- **2017-06-29 v0.51**:
+    - added the [oletools cheatsheet](https://github.com/decalage2/oletools/blob/master/cheatsheet/oletools_cheatsheet.pdf)
+    - improved [rtfobj](https://github.com/decalage2/oletools/wiki/rtfobj) to handle malformed RTF files, detect vulnerability CVE-2017-0199
+    - olevba: improved deobfuscation and Mac files support
+    - [mraptor](https://github.com/decalage2/oletools/wiki/mraptor): added more ActiveX macro triggers
+    - added [DocVarDump.vba](https://github.com/decalage2/oletools/blob/master/oletools/DocVarDump.vba) to dump document variables using Word
+    - olemap: can now detect and extract [extra data at end of file](http://decalage.info/en/ole_extradata), improved display
+    - oledir, olemeta, oletimes: added support for zip files and wildcards
+    - many [bugfixes](https://github.com/decalage2/oletools/milestone/3?closed=1) in all the tools
+    - improved Python 2+3 support
+- 2016-11-01 v0.50: all oletools now support python 2 and 3.
     - olevba: several bugfixes and improvements.
     - mraptor: improved detection, added mraptor_milter for Sendmail/Postfix integration.
     - rtfobj: brand new RTF parser, obfuscation-aware, improved display, detect
@@ -33,13 +45,6 @@
 improved handling of malformed/incomplete documents, improved error handling and now returns an exit code based on analysis results, new --relaxed option. [rtfobj](https://github.com/decalage2/oletools/wiki/rtfobj): improved parsing to handle obfuscated RTF documents, added -d option to set output dir. Moved repository and documentation to GitHub.
-- 2016-04-19 v0.46: [olevba](https://github.com/decalage2/oletools/wiki/olevba)
-does not deobfuscate VBA expressions by default (much faster), new option --deobf
-to enable it. Fixed color display bug on Windows for several tools.
-- 2016-04-12 v0.45: improved [rtfobj](https://github.com/decalage2/oletools/wiki/rtfobj)
-to handle several [anti-analysis tricks](http://www.decalage.info/rtf_tricks),
-improved [olevba](https://github.com/decalage2/oletools/wiki/olevba)
-to export results in JSON format.
 See the [full changelog](https://github.com/decalage2/oletools/wiki/Changelog) for more information.
@@ -67,6 +72,7 @@ Projects using oletools:
 oletools are used by a number of projects and online malware analysis services,
 including [Viper](http://viper.li/),
 [REMnux](https://remnux.org/),
+[FAME](https://certsocietegenerale.github.io/fame/),
 [Hybrid-analysis.com](https://www.hybrid-analysis.com/),
 [Joe Sandbox](https://www.document-analyzer.net/),
 [Deepviz](https://sandbox.deepviz.com/),
@@ -129,7 +135,7 @@ License
 This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2016 Philippe Lagadec (http://www.decalage.info)
+The python-oletools package is copyright (c) 2012-2017 Philippe Lagadec (http://www.decalage.info)
 All rights reserved.
diff --git a/cheatsheet/oletools_cheatsheet.docx b/cheatsheet/oletools_cheatsheet.docx
new file mode 100644
index 0000000..f8ee793
--- /dev/null
+++ b/cheatsheet/oletools_cheatsheet.docx
diff --git a/cheatsheet/oletools_cheatsheet.pdf b/cheatsheet/oletools_cheatsheet.pdf
new file mode 100644
index 0000000..d91fd32
--- /dev/null
+++ b/cheatsheet/oletools_cheatsheet.pdf
diff --git a/oletools/DocVarDump.vba b/oletools/DocVarDump.vba
new file mode 100644
index 0000000..70b2137
--- /dev/null
+++ b/oletools/DocVarDump.vba
@@ -0,0 +1,117 @@
+' DocVarDump.vba
+'
+' DocVarDump is a VBA macro that can be used to dump the content of all document
+' variables stored in a MS Word document.
+'
+' USAGE:
+' 1. Open the document to be analyzed in MS Word
+' 2. Do NOT click on "Enable Content", to avoid running malicious macros
+' 3. Save the document with a new name, using the DOCX format (not doc, not docm)
+'    This will remove all VBA macro code.
+' 4. Close the file, and reopen the DOCX file you just saved
+' 5. Press Alt+F11 to open the VBA Editor
+' 6. Double-click on "This Document" under Project
+' 7. Copy and Paste all the code from DocVarDump.vba
+' 8. Move the cursor on the line "Sub DocVarDump()"
+' 9. Press F5: This should run the code, and create a file "docvardump.txt"
+'    containing a hex dump of all document variables.
+'
+' ALTERNATIVE: Open the document in LibreOffice/OpenOffice,
+' then go to File / Properties / Custom Properties
+'
+' Author: Philippe Lagadec - http://www.decalage.info
+' License: BSD, see source code or documentation
+'
+' DocVarDump is part of the python-oletools package:
+' http://www.decalage.info/python/oletools
+
+' CHANGELOG:
+' 2016-09-21 v0.01 PL: - First working version
+' 2017-04-10 v0.02 PL: - Added usage instructions
+
+Sub DocVarDump()
+    intFileNum = FreeFile
+    FName = Environ("TEMP") & "\docvardump.txt"
+    Open FName For Output As intFileNum
+    For Each myvar In ActiveDocument.Variables
+        Write #intFileNum, "Name = " & myvar.Name
+        'TODO: check VarType, and only use hexdump for strings with non-printable chars
+        Write #intFileNum, "Value = " & HexDump(myvar.value)
+        Write #intFileNum,
+    Next myvar
+    Close intFileNum
+    Documents.Open (FName)
+End Sub
+
+Function Hex2(value As Integer)
+    h = Hex(value)
+    If Len(h) < 2 Then
+        h = "0" & h
+    End If
+    Hex2 = h
+End Function
+
+Function HexN(value As Integer, nchars As Integer)
+    h = Hex(value)
+    Do While Len(h) < nchars
+        h = "0" & h
+    Loop
+    HexN = h
+End Function
+
+Function ReplaceClean1(sText As String)
+    Dim J As Integer
+    Dim vAddText
+
+    vAddText = Array(Chr(129), Chr(141), Chr(143), Chr(144), Chr(157))
+    For J = 0 To 31
+        sText = Replace(sText, Chr(J), "\x" & Hex2(J))
+    Next
+    For J = 0 To UBound(vAddText)
+        c = vAddText(J)
+        a = Asc(c)
+        sText = Replace(sText, c, "\x" & Hex2(a))
+    Next
+    ReplaceClean1 = sText
+End Function
+
+Function ReplaceClean3(sText As String)
+    Dim J As Integer
+    For J = 0 To 31
+        sText = Replace(sText, Chr(J), ".")
+    Next
+    For J = 127 To 255
+        sText = Replace(sText, Chr(J), ".")
+    Next
+    ReplaceClean3 = sText
+End Function
+
+Function HexBytes(sText As String)
+    Dim i As Integer
+    HexBytes = ""
+    For i = 1 To Len(sText)
+        HexBytes = HexBytes & Hex2(Asc(Mid(sText, i))) & " "
+    Next
+End Function
+
+
+Function HexDump(sText As String)
+    Dim chunk As String
+    Dim i As Long
+    ' "\" is integer division, "/" is normal division (float)
+    nbytes = 8
+    nchunks = Len(sText) \ nbytes
+    lastchunk = Len(sText) Mod nbytes
+    HexDump = ""
+    For i = 0 To nchunks - 1
+        Offset = HexN(i * nbytes, 8)
+        chunk = Mid(sText, i * nbytes + 1, nbytes)
+        HexDump = HexDump & Offset & " " & HexBytes(chunk) & " " & ReplaceClean3(chunk) & vbCrLf
+    Next i
+    'TODO: LAST CHUNK!
+    If lastchunk > 0 Then
+        Offset = HexN(nchunks * nbytes, 8)
+        chunk = Mid(sText, nchunks * nbytes + 1, lastchunk)
+        HexDump = HexDump & Offset & " " & HexBytes(chunk) & " " & ReplaceClean3(chunk) & vbCrLf
+    End If
+End Function
diff --git a/oletools/LICENSE.txt b/oletools/LICENSE.txt
index 4b9f629..5651e93 100644
--- a/oletools/LICENSE.txt
+++ b/oletools/LICENSE.txt
@@ -3,7 +3,7 @@ LICENSE for the python-oletools package:
 This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2016 Philippe Lagadec (http://www.decalage.info)
+The python-oletools package is copyright (c) 2012-2017 Philippe Lagadec (http://www.decalage.info)
 All rights reserved.
diff --git a/oletools/README.html b/oletools/README.html
index 2e40975..5a3199e 100644
--- a/oletools/README.html
+++ b/oletools/README.html
@@ -9,12 +9,24 @@

python-oletools

-

oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.

+

oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.

Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the Author - Repository - Updates on Twitter

Note: python-oletools is not related to OLETools published by BeCubed Software.

News

-

Tools in python-oletools:

+

See the full changelog for more information.

+

Tools:

Projects using oletools:

-

oletools are used by a number of projects and online malware analysis services, including Viper, REMnux, Hybrid-analysis.com, Joe Sandbox, Deepviz, Laika BOSS, Cuckoo Sandbox, Anlyz.io, pcodedmp and probably VirusTotal. (Please contact me if you have or know a project using oletools)

+

oletools are used by a number of projects and online malware analysis services, including Viper, REMnux, FAME, Hybrid-analysis.com, Joe Sandbox, Deepviz, Laika BOSS, Cuckoo Sandbox, Anlyz.io, ViperMonkey, pcodedmp, dridex.malwareconfig.com, and probably VirusTotal. (Please contact me if you have or know a project using oletools)

Download and Install:

-

To use python-oletools from the command line as analysis tools, you may simply download the latest release archive and extract the files into the directory of your choice.

-

You may also download the latest development version with the most recent features.

-

Another possibility is to use a git client to clone the repository (https://github.com/decalage2/oletools.git) into a folder. You can then update it easily in the future.

-

If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "pip install oletools" or "easy_install oletools" to download and install in one go. Otherwise you may download/extract the zip archive and run "setup.py install".

-

Important: to update oletools if it is already installed, you must run "pip install -U oletools", otherwise pip will not update it.

+

The recommended way to download and install/update the latest stable release of oletools is to use pip:

+ +

This should automatically create command-line scripts to run each tool from any directory: olevba, mraptor, rtfobj, etc.

+

To get the latest development version instead:

+ +

See the documentation for other installation options.

Documentation:

The latest version of the documentation can be found online, otherwise a copy is provided in the doc subfolder of the package.

How to Suggest Improvements, Report Issues or Contribute:

@@ -75,7 +75,7 @@

The code is available in a GitHub repository. You may use it to submit enhancements using forks and pull requests.

License

This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.

-

The python-oletools package is copyright (c) 2012-2016 Philippe Lagadec (http://www.decalage.info)

+

The python-oletools package is copyright (c) 2012-2017 Philippe Lagadec (http://www.decalage.info)

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Example:

-
for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
-    print '-'*79
-    print 'Filename    :', filename
-    print 'OLE stream  :', stream_path
-    print 'VBA filename:', vba_filename
-    print '- '*39
-    print vba_code
+
for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
+    print '-'*79
+    print 'Filename    :', filename
+    print 'OLE stream  :', stream_path
+    print 'VBA filename:', vba_filename
+    print '- '*39
+    print vba_code

Alternatively, the VBA_Parser method extract_all_macros returns the same results as a list of tuples.
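The documented example uses Python 2 print statements. Since the package now targets Python 2 and 3, the same report can be sketched with a small formatting helper that runs on both. This is only an illustration: `format_macro` and the sample tuple are hypothetical and not part of oletools; the tuple merely has the four fields described above.

```python
def format_macro(filename, stream_path, vba_filename, vba_code):
    """Build the report text for one extracted macro (hypothetical helper)."""
    return "\n".join([
        "-" * 79,
        "Filename    : %s" % filename,
        "OLE stream  : %s" % stream_path,
        "VBA filename: %s" % vba_filename,
        "- " * 39,
        vba_code,
    ])


# Hypothetical sample standing in for one tuple yielded by extract_macros().
sample_macros = [
    ("sample.doc", "Macros/VBA/ThisDocument", "ThisDocument.cls",
     "Sub AutoOpen()\nEnd Sub"),
]

for macro in sample_macros:
    print(format_macro(*macro))
```

With a real parser, the loop would presumably iterate over `vbaparser.extract_macros()` instead of `sample_macros`.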

Analyze VBA Source Code

Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained in a file:

@@ -265,24 +283,24 @@ vbaparser = VBA_Parser(myfile, data=filedata)
  • description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  • Example:

    -
    results = vbaparser.analyze_macros()
    -for kw_type, keyword, description in results:
    -    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
    +
    results = vbaparser.analyze_macros()
    +for kw_type, keyword, description in results:
    +    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)

    After calling analyze_macros, the following VBA_Parser attributes also provide the number of items found for each category:

    -
    print 'AutoExec keywords: %d' % vbaparser.nb_autoexec
    -print 'Suspicious keywords: %d' % vbaparser.nb_suspicious
    -print 'IOCs: %d' % vbaparser.nb_iocs
    -print 'Hex obfuscated strings: %d' % vbaparser.nb_hexstrings
    -print 'Base64 obfuscated strings: %d' % vbaparser.nb_base64strings
    -print 'Dridex obfuscated strings: %d' % vbaparser.nb_dridexstrings
    -print 'VBA obfuscated strings: %d' % vbaparser.nb_vbastrings
    +
    print 'AutoExec keywords: %d' % vbaparser.nb_autoexec
    +print 'Suspicious keywords: %d' % vbaparser.nb_suspicious
    +print 'IOCs: %d' % vbaparser.nb_iocs
    +print 'Hex obfuscated strings: %d' % vbaparser.nb_hexstrings
    +print 'Base64 obfuscated strings: %d' % vbaparser.nb_base64strings
    +print 'Dridex obfuscated strings: %d' % vbaparser.nb_dridexstrings
    +print 'VBA obfuscated strings: %d' % vbaparser.nb_vbastrings
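As a rough cross-check of those counters, the `(type, keyword, description)` tuples can also be tallied directly. The sketch below is an assumption-laden illustration: `count_by_type` is a hypothetical helper, and the sample tuples (including their type strings and descriptions) are made up rather than real olevba output.

```python
from collections import Counter


def count_by_type(results):
    """Count result tuples per type (hypothetical helper, not part of oletools)."""
    return Counter(kw_type for kw_type, _keyword, _description in results)


# Made-up sample shaped like the (type, keyword, description) tuples described above.
sample_results = [
    ("AutoExec", "AutoOpen", "Runs when the document is opened"),
    ("Suspicious", "Shell", "May run a system command"),
    ("Suspicious", "CreateObject", "May create an OLE object"),
    ("IOC", "http://example.test/payload.exe", "URL"),
]

counts = count_by_type(sample_results)
for kw_type, number in sorted(counts.items()):
    print("%s: %d" % (kw_type, number))
```

A `Counter` returns 0 for types that never occur, so missing categories need no special handling.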

    Deobfuscate VBA Macro Source Code

    The method reveal attempts to deobfuscate the macro source code by replacing all the obfuscated strings by their decoded content. Returns a single string.

    Example:

    -
    print vbaparser.reveal()
    +
    print vbaparser.reveal()

    Close the VBA_Parser

    After usage, it is better to call the close method of the VBA_Parser object, to make sure the file is closed, especially if your application is parsing many files.

    -
    vbaparser.close()
    +
    vbaparser.close()
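When parsing many files, one way to make sure `close` is always called, even if an exception interrupts the analysis, is the standard library's `contextlib.closing`, which works with any object that has a `close` method. The sketch below uses a hypothetical stand-in class so that it runs without oletools; it assumes nothing about VBA_Parser beyond the `close` method mentioned above.

```python
from contextlib import closing


class StandInParser(object):
    """Hypothetical stand-in exposing only a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


parser = StandInParser()
try:
    # closing() calls parser.close() when the block exits, even on error.
    with closing(parser):
        raise ValueError("simulated parsing failure")
except ValueError:
    pass

print(parser.closed)
```

A plain `try: ... finally: vbaparser.close()` achieves the same effect.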

    Deprecated API

    The following methods and functions are still functional, but their usage is not recommended since they have been replaced by better solutions.

    @@ -297,54 +315,54 @@ vbaparser = VBA_Parser(myfile, data=filedata)
  • description provides a description of the keyword. For obfuscated strings, it is the encoded value of the string.
  • Example:

    -
    vba_scanner = VBA_Scanner(vba_code)
    -results = vba_scanner.scan(include_decoded_strings=True)
    -for kw_type, keyword, description in results:
    -    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
    +
    vba_scanner = VBA_Scanner(vba_code)
    +results = vba_scanner.scan(include_decoded_strings=True)
    +for kw_type, keyword, description in results:
    +    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)

    The function scan_vba is a shortcut for VBA_Scanner(vba_code).scan():

    -
    results = scan_vba(vba_code, include_decoded_strings=True)
    -for kw_type, keyword, description in results:
    -    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
    +
    results = scan_vba(vba_code, include_decoded_strings=True)
    +for kw_type, keyword, description in results:
    +    print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)

    scan_summary returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).

    Detect auto-executable macros (deprecated)

    Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.

    The function detect_autoexec checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.

    It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)

    Sample usage:

    -
    from oletools.olevba import detect_autoexec
    -autoexec_keywords = detect_autoexec(vba_code)
    -if autoexec_keywords:
    -    print 'Auto-executable macro keywords found:'
    -    for keyword, description in autoexec_keywords:
    -        print '%s: %s' % (keyword, description)
    -else:
    -    print 'Auto-executable macro keywords: None found'
    +
    from oletools.olevba import detect_autoexec
    +autoexec_keywords = detect_autoexec(vba_code)
    +if autoexec_keywords:
    +    print 'Auto-executable macro keywords found:'
    +    for keyword, description in autoexec_keywords:
    +        print '%s: %s' % (keyword, description)
    +else:
    +    print 'Auto-executable macro keywords: None found'

    Detect suspicious VBA keywords (deprecated)

    Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.

    The function detect_suspicious checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).

    It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)

    Sample usage:

    -
    from oletools.olevba import detect_suspicious
    -suspicious_keywords = detect_suspicious(vba_code)
    -if suspicious_keywords:
    -    print 'Suspicious VBA keywords found:'
    -    for keyword, description in suspicious_keywords:
    -        print '%s: %s' % (keyword, description)
    -else:
    -    print 'Suspicious VBA keywords: None found'
    +
    from oletools.olevba import detect_suspicious
    +suspicious_keywords = detect_suspicious(vba_code)
    +if suspicious_keywords:
    +    print 'Suspicious VBA keywords found:'
    +    for keyword, description in suspicious_keywords:
    +        print '%s: %s' % (keyword, description)
    +else:
    +    print 'Suspicious VBA keywords: None found'

    Extract potential IOCs (deprecated)

    Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.

    The function detect_patterns checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.

    It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)

    Sample usage:

    -
    from oletools.olevba import detect_patterns
    -patterns = detect_patterns(vba_code)
    -if patterns:
    -    print 'Patterns found:'
    -    for pattern_type, value in patterns:
    -        print '%s: %s' % (pattern_type, value)
    -else:
    -    print 'Patterns: None found'
    +
    from oletools.olevba import detect_patterns
    +patterns = detect_patterns(vba_code)
    +if patterns:
    +    print 'Patterns found:'
    +    for pattern_type, value in patterns:
    +        print '%s: %s' % (pattern_type, value)
    +else:
    +    print 'Patterns: None found'
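To illustrate the general idea of pattern-based IOC extraction, here is a deliberately simplified sketch built on the standard `re` module. The two regular expressions and the `find_patterns` helper are assumptions for illustration only; they are not the patterns olevba uses, and they only mimic the `(pattern type, value)` tuple shape described above.

```python
import re

# Simplified, illustrative patterns; these are NOT the expressions used by olevba.
PATTERNS = [
    ("URL", re.compile(r"https?://[^\s\"'<>]+")),
    ("IPv4 address", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def find_patterns(vba_code):
    """Return (pattern_type, value) tuples found in the text (hypothetical helper)."""
    found = []
    for pattern_type, regex in PATTERNS:
        for match in regex.finditer(vba_code):
            found.append((pattern_type, match.group()))
    return found


# Made-up macro fragment used as sample input.
sample = ('URLDownloadToFile 0, "http://example.test/a.exe", "C:\\a.exe"\n'
          'host = "192.0.2.10"')

for pattern_type, value in find_patterns(sample):
    print("%s: %s" % (pattern_type, value))
```

Real-world detection needs stricter expressions (for example, validating octet ranges) and more pattern types.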

    python-oletools documentation