Peter M. Groen / oletools

17 Jan, 2018
40 commits
  • ppt_parser: add warning that this might be replaced ...
    207ca8a7
    Want to discourage people from working on ppt_parser, which would increase
    the amount of code that has to be reproduced in ppt_record_parser in order
    for it to replace ppt_parser
    Christian Herdtweck authored
    2018-01-17 15:43:38 +0100  
    Browse Code »
  • unittests: run pylint and pep8 on oleobj test
    cb072e36
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • unittests: add more samples to oleobj test
    46920be6
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: make sane filenames always ascii-only ...
    ccbe0b23
    The regular expression \w behaves differently in Python 2 (matches only
    ASCII) and Python 3 (matches all Unicode word characters). Clarify that we
    only want ASCII in sanitized filenames.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
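
The difference described in this commit message can be illustrated with a minimal sketch. The helper name `sanitize_ascii`, the set of allowed characters and the replacement character are assumptions for illustration, not the code used in oleobj. Spelling out the character class avoids relying on `\w`, so the result is the same on Python 2 and Python 3.

```python
import re

# Explicit ASCII character class. Unlike \w, it means the same thing on
# Python 2 and on Python 3, where \w in a str pattern also matches
# non-ASCII word characters such as accented letters.
NOT_ALLOWED = re.compile(r'[^a-zA-Z0-9_.-]')


def sanitize_ascii(filename, replacement='_'):
    """Hypothetical helper: replace every character outside the class."""
    return NOT_ALLOWED.sub(replacement, filename)


sane = sanitize_ascii(u'r\u00e9sum\u00e9 2018.doc')
print(sane)
```

On Python 3, passing `flags=re.ASCII` to a pattern that uses `\w` would be another way to get ASCII-only matching; that flag does not exist on Python 2.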
  • oleobj: unify closing of ole stream in error case
    2c0f8847
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: improve logging slightly
    d2920ad4
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • ppt_record_parser: ensure import is relative ...
    5d7a6445
    Strangest thing: this change was necessary for unit-testing oleobj. Without
    it, running python3.3 -m unittest tests.oleobj.test_basic resulted in:
    AttributeError: 'module' object has no attribute 'oleobj'. That was a
    rather unhelpful error message.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: use ZipSubFile to allow OleFileIO to seek()
    a5036230
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: use absolute import to make py3-compatible
    ea58877a
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • unittest: create unittests for ooxml.ZipSubFile
    56b79d1e
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • ooxml: re-implement complete seek(); add attr closed ...
    d4eb585e
    OleFileIO requires a complete seek() and checks for closed attribute.
    
    Also added some commented debug print commands to ZipSubFile
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: fix logging --> log and make it lazy where possible ...
    a7d1050e
    Also remove 1 exception from output and add a comment
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
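
"Lazy" logging here presumably means passing arguments to the logging call instead of formatting the message up front. The sketch below shows the effect with the standard `logging` module; the logger name and the `Expensive` class are made up for the demonstration.

```python
import logging

log = logging.getLogger('lazy_logging_example')  # hypothetical logger name
log.setLevel(logging.WARNING)                    # DEBUG messages are dropped


class Expensive(object):
    """Counts how often it is converted to a string."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return 'expensive'


obj = Expensive()

# Eager: the message string is built before logging decides to drop it.
log.debug('value: {0}'.format(obj))
eager_calls = Expensive.calls

# Lazy: logging formats the message only if the record is really emitted.
log.debug('value: %s', obj)
lazy_calls = Expensive.calls - eager_calls
```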
  • unittest: add 3 tests with 6 samples for oleobj
    1ee956aa
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: accept custom command line args for testing
    9977c523
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: upgrade from optparse to argparse
    3f009e76
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
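
The two commits above fit together: `argparse` can parse an explicit argument list instead of `sys.argv`, which lets a unit test call the entry point directly. A minimal sketch follows; the function names, the `-v` option and the return value are assumptions and do not reproduce the oleobj command line.

```python
import argparse


def parse_args(cmd_line_args=None):
    """Parse the given list; argparse falls back to sys.argv[1:] if None."""
    parser = argparse.ArgumentParser(description='example parser')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='hypothetical option, for illustration only')
    parser.add_argument('input', nargs='*', help='files to process')
    return parser.parse_args(cmd_line_args)


def main(cmd_line_args=None):
    """Hypothetical entry point; tests pass an explicit argument list."""
    options = parse_args(cmd_line_args)
    return len(options.input)


n_files = main(['-v', 'a.doc', 'b.doc'])
```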
  • oleobj: encode filenames/paths to unicode ...
    471b141f
    This makes compatibility with py3 easier, but requires us to guess an
    encoding. Should work fine for European-generated files, could produce
    strange results for Asian files.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: make pylint and pep8 happier ...
    670d7075
    Most changes are just whitespace or line break or case changes. But:
    - this did find an actual error (variable exc was used before creation)
    - did move imports up between license and changelog (although I would prefer
    it in its original place)
    - removed the _ansi_ from read_*_ansi_string
    - move logging constants from main to global scope
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: vary shell status code if dumped or not / error occurred ...
    1665aeea
    Tell caller of script roughly what happened in call.
    
    Also: check whether the given file arguments exist and return a non-zero
    exit status if they do not; remove print of non-existent __doc__
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
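
One way to tell the caller "roughly what happened" is to combine bit flags into a single status code. The constant names and values below are assumptions chosen for this sketch, and `dump_func` stands in for whatever does the actual work.

```python
import os

# Hypothetical status values; bit flags so several conditions can combine.
RETURN_NO_DUMP = 0    # nothing was dumped, no error
RETURN_DID_DUMP = 1   # at least one object was dumped
RETURN_ERR_ARGS = 2   # at least one file argument does not exist


def run(filenames, dump_func):
    """Return a status code summarizing what happened over all files."""
    status = RETURN_NO_DUMP
    for name in filenames:
        if not os.path.isfile(name):
            status |= RETURN_ERR_ARGS
            continue
        if dump_func(name):
            status |= RETURN_DID_DUMP
    return status
```

A script would then end with something like `sys.exit(run(sys.argv[1:], dump_func))`, so that a calling shell can inspect `$?`.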
  • oleobj: remember whether OleNativeStream data is stream/link
    7680eb11
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: parse and dump from stream ...
    c8a4b6a9
    This way we do not have to keep a whole big office file in memory.
    (Olefile might do that, anyway, but then we have one copy less.)
    
    Also merge subfunction process_native_stream back into process_file
    (harder to read but makes more sense for exception handling)
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: parse OleNativeStream and OleObject from stream ...
    aa95f26a
    Both can now be parsed from a byte array or from a stream
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
  • oleobj: change data parsing to change index rather than data ...
    dad20c2c
    This is more efficient and simplifies generalization to using byte-streams
    instead of byte arrays as data input.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
    Browse Code »
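
The idea of advancing an index instead of slicing the data can be sketched as follows. Each reader returns the parsed value together with the new index, so the input is never copied. The function bodies are written for this illustration and are not taken from oleobj.

```python
import struct


def read_uint32(data, index):
    """Read a little-endian uint32 at index; return (value, new_index)."""
    (value,) = struct.unpack_from('<L', data, index)
    return value, index + 4


def read_zero_terminated_string(data, index):
    """Read bytes up to the next NUL byte; return (bytes, new_index)."""
    end = data.index(b'\x00', index)
    return data[index:end], end + 1


data = b'\x05\x00\x00\x00name\x00rest'
index = 0
size, index = read_uint32(data, index)
name, index = read_zero_terminated_string(data, index)
```

Because every reader only needs "the next few bytes", the same interface carries over to a stream by replacing the index arithmetic with `stream.read(n)`.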
  • oleobj: generalize "opening" of ole files to allow for other types ...
    2b3f8d3e
    This way, oleobj can now handle office 2007+ types (docx, xlsx, pptx, and
    derivatives).
    
    Since this adds another loop level into process_file, created own function
    for inner-most code part (the actual dumping).
    Christian Herdtweck authored
    2018-01-17 15:07:30 +0100  
    Browse Code »
  • oleobj: add options -v and -i for compatibility with ripOLE
    cc142ee3
    Christian Herdtweck authored
    2018-01-17 15:05:16 +0100  
    Browse Code »
  • xls_parser: fix "wrong" variable name
    becb96f7
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
  • record_base: ensure streams are closed in iter_streams
    217d6114
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
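
A common way to guarantee that streams handed out by a generator get closed is a `try`/`finally` around the `yield`. The sketch below uses `io.BytesIO` objects as stand-ins for the real streams; it illustrates the pattern and is not the code in record_base.

```python
import io


def iter_streams(raw_chunks):
    """Yield one stream per chunk and close it when the consumer moves on."""
    for chunk in raw_chunks:
        stream = io.BytesIO(chunk)
        try:
            yield stream
        finally:
            # Runs when iteration continues, when the generator is closed
            # early, and when the consumer's error is thrown into it.
            stream.close()


gen = iter_streams([b'abc', b'def'])
first = next(gen)
content = first.read()
gen.close()  # abandon the iteration early; the finally clause still runs
```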
  • ppt_record_parser: pylint, pep8; fix history, add todo
    de9f5e91
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
  • ppt_record_parser: provide OleFileIO from embedded files ...
    79564711
    This was not easy to do if we want to avoid having the complete embedded file
    in uncompressed form in memory. Had to create a stream around an iterable,
    kind of fun :-)
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
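
The commit message does not show how the "stream around an iterable" is built. A minimal sketch of the general technique, assuming a hypothetical class `IterStream` and zlib-compressed input, could look like this. It is read-only and forward-only, so only one chunk is held in memory at a time.

```python
import io
import zlib


class IterStream(io.RawIOBase):
    """Hypothetical read-only stream over an iterable of byte chunks."""

    def __init__(self, chunks):
        super(IterStream, self).__init__()
        self._chunks = iter(chunks)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buf):
        # Refill from the iterable only once the previous chunk is used up.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # end of stream
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def decompress_chunks(compressed, chunk_size=4):
    """Yield decompressed data piece by piece."""
    decomp = zlib.decompressobj()
    for start in range(0, len(compressed), chunk_size):
        piece = decomp.decompress(compressed[start:start + chunk_size])
        if piece:
            yield piece
    tail = decomp.flush()
    if tail:
        yield tail


stream = IterStream(decompress_chunks(zlib.compress(b'hello ole world')))
result = stream.read()
```

Note that this sketch does not implement `seek()`, so it could not be handed to a consumer that needs random access without further work.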
  • record_base: simplify bugfixing by offering more verbosity
    a93b2109
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
  • unittest: create tests for ppt_record_parser.is_ppt
    faeb2aed
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
    Browse Code »
  • ppt_record_parser: create function is_ppt
    5609051f
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • ppt_record_parser: move constants to top of file
    8dc4854d
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • record_base: make pylint and pep8 happier
    8be66d11
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • record_base: provide stream type constants from olefile
    97035144
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • xls_parser: close stream after xlsb-parsing; update stream constructor
    989ead6c
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • record_base: offer a OleRecordStream.close
    cbbbfa23
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • xls_parser: fixup forgot rename parse-->finish_constructing
    ef014417
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • ppt_record_parser: find and decompress embedded ole streams
    acfb36b3
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • record_base: rename parse --> finish_constructing, more docu
    38418c29
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
    Browse Code »
  • ppt records: compensate wrong size in CurrentUserAtom ...
    e90e0e5a
    This compensates for an inconsistency that is probably just an error in
    some ppt versions. The size attribute of the CurrentUserAtom "forgets"
    about the optional unicode user name, which then creates strange data
    behind the record (where nothing should be)
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
    Browse Code »