Peter M. Groen / oletools

17 Jan, 2018
28 commits
  • oleobj: parse and dump from stream ...
    c8a4b6a9
    This way we do not have to keep an entire large Office file in memory.
    (olefile might do that anyway, but then we hold one copy fewer.)
    
    Also merge the subfunction process_native_stream back into process_file
    (harder to read, but makes more sense for exception handling).
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
  • oleobj: parse OleNativeStream and OleObject from stream ...
    aa95f26a
    Both can now be parsed from either a byte array or a stream.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
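As a rough illustration of accepting either a byte array or a stream, the sketch below normalizes the input before reading from it. The names `as_stream` and `read_uint16` are hypothetical and are not taken from oleobj.

```python
import io
import struct


def as_stream(data_or_stream):
    """Return a readable binary stream for bytes-like input or a stream."""
    if isinstance(data_or_stream, (bytes, bytearray)):
        return io.BytesIO(data_or_stream)
    return data_or_stream


def read_uint16(data_or_stream):
    """Read one little-endian uint16 from either kind of input."""
    stream = as_stream(data_or_stream)
    raw = stream.read(2)
    if len(raw) != 2:
        raise ValueError('unexpected end of data')
    return struct.unpack('<H', raw)[0]


from_bytes = read_uint16(b'\x02\x00')
from_stream = read_uint16(io.BytesIO(b'\x01\x01'))
```

Wrapping bytes in `io.BytesIO` lets the rest of the parser deal with a single interface, at the cost of one thin wrapper object.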
  • oleobj: change data parsing to change index rather than data ...
    dad20c2c
    This is more efficient and simplifies generalization to using byte-streams
    instead of byte arrays as data input.
    Christian Herdtweck authored
    2018-01-17 15:07:33 +0100  
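The idea of advancing an index instead of slicing the data can be sketched as follows. This is only an illustration: the helper names and the sample layout (a length field followed by a NUL-terminated name) are assumptions and not the actual oleobj code.

```python
import struct


def read_uint32(data, index):
    """Read a little-endian uint32 at `index`; return (value, new_index)."""
    (value,) = struct.unpack_from('<L', data, index)
    return value, index + 4


def read_zero_terminated_string(data, index):
    """Read bytes up to the next NUL byte; return (string, new_index)."""
    end = data.index(b'\x00', index)
    return data[index:end], end + 1


# Hypothetical sample: a length field followed by a NUL-terminated name
sample = struct.pack('<L', 7) + b'foo.txt\x00' + b'rest'
index = 0
size, index = read_uint32(sample, index)
name, index = read_zero_terminated_string(sample, index)
```

Because `data` is never re-sliced into a shrinking copy, each read only touches the bytes it needs, and the `(value, new_index)` pattern maps naturally onto a stream position later on.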
  • oleobj: generalize "opening" of ole files to allow for other types ...
    2b3f8d3e
    This way, oleobj can now handle Office 2007+ types (docx, xlsx, pptx, and
    derivatives).
    
    Since this adds another loop level to process_file, the innermost part of
    the code (the actual dumping) was moved into its own function.
    Christian Herdtweck authored
    2018-01-17 15:07:30 +0100  
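Office 2007+ files are zip archives, so one way to find embedded OLE data is to look at each archive member and check for the OLE magic bytes. The sketch below uses only the standard library; `find_ole_members` and the sample member names are made up for illustration, and the real oleobj code may detect OLE parts differently.

```python
import io
import zipfile

# Magic bytes at the start of every OLE2 compound file
OLE_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'


def find_ole_members(path_or_file):
    """Yield names of zip members that start with the OLE magic bytes."""
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                if member.read(len(OLE_MAGIC)) == OLE_MAGIC:
                    yield name


# Build a tiny in-memory zip with one fake embedded OLE part
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('word/document.xml', '<xml/>')
    archive.writestr('word/embeddings/oleObject1.bin',
                     OLE_MAGIC + b'\x00' * 8)
buffer.seek(0)
found = list(find_ole_members(buffer))
```

Only the first eight bytes of each member are read, so large parts of the archive are not decompressed just to classify them.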
  • oleobj: add options -v and -i for compatibility with ripOLE
    cc142ee3
    Christian Herdtweck authored
    2018-01-17 15:05:16 +0100  
  • xls_parser: fix "wrong" variable name
    becb96f7
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
  • record_base: ensure streams are closed in iter_streams
    217d6114
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
  • ppt_record_parser: pylint, pep8; fix history, add todo
    de9f5e91
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
  • ppt_record_parser: provide OleFileIO from embedded files ...
    79564711
    This was not easy to do if we want to avoid having the complete embedded file
    in uncompressed form in memory. Had to create a stream around an iterable,
    kind of fun :-)
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
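A stream around an iterable can be built by subclassing `io.RawIOBase` and implementing `readinto`. The following is a minimal sketch of that technique, not the code from this commit: the class name `IterableStream`, the helper `iter_decompressed`, and the use of zlib for the compressed chunks are all assumptions made for the example.

```python
import io
import zlib


class IterableStream(io.RawIOBase):
    """Read-only raw stream that pulls its data from an iterable of bytes."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buffer):
        # Refill from the iterable until we have data or it is exhausted
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0   # zero bytes read signals end of stream
        count = min(len(buffer), len(self._leftover))
        buffer[:count] = self._leftover[:count]
        self._leftover = self._leftover[count:]
        return count


def iter_decompressed(compressed_chunks):
    """Yield decompressed data piece by piece (zlib assumed for the demo)."""
    decomp = zlib.decompressobj()
    for chunk in compressed_chunks:
        data = decomp.decompress(chunk)
        if data:
            yield data
    tail = decomp.flush()
    if tail:
        yield tail


payload = b'embedded object data ' * 100
compressed = zlib.compress(payload)
pieces = [compressed[i:i + 16] for i in range(0, len(compressed), 16)]
stream = io.BufferedReader(IterableStream(iter_decompressed(pieces)))
first = stream.read(8)
rest = stream.read()
```

Wrapping the raw stream in `io.BufferedReader` supplies `read(n)` on top of `readinto`, and the uncompressed data is only produced as the consumer asks for it.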
  • record_base: simplify bugfixing by offering more verbosity
    a93b2109
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
  • unittest: create tests for ppt_record_parser.is_ppt
    faeb2aed
    Christian Herdtweck authored
    2018-01-17 15:00:18 +0100  
  • ppt_record_parser: create function is_ppt
    5609051f
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • ppt_record_parser: move constants to top of file
    8dc4854d
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • record_base: make pylint and pep8 happier
    8be66d11
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • record_base: provide stream type constants from olefile
    97035144
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • xls_parser: close stream after xlsb-parsing; update stream constructor
    989ead6c
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • record_base: offer an OleRecordStream.close
    cbbbfa23
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • xls_parser: fixup forgot rename parse-->finish_constructing
    ef014417
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • ppt_record_parser: find and decompress embedded ole streams
    acfb36b3
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • record_base: rename parse --> finish_constructing, more docu
    38418c29
    Christian Herdtweck authored
    2018-01-17 15:00:17 +0100  
  • ppt records: compensate wrong size in CurrentUserAtom ...
    e90e0e5a
    This compensates for an inconsistency that is probably just an error in
    some ppt versions. The size attribute of the CurrentUserAtom "forgets"
    about the optional Unicode user name, which then shows up as strange data
    behind the record (where nothing should be).
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • record_base: make compatible with container substreams
    470d0806
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • record_base: ignore [Document]SummaryInformation streams
    5c9b328c
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • xls_parser: rename type to rec_type to make pylint happier
    97990227
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • msodde: use new method name (get-->iter)_streams
    f1c708ac
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • ppt_parser: create new alternative based on records ...
    730c5088
    So far, ppt_parser is rather naive: it does not understand the structure
    of the streams but just looks for a certain byte sequence anywhere in the
    stream (search_* methods).
    
    There was another attempt to understand and parse the stream structure,
    but that failed (parse_* methods).
    
    Encouraged by xls_parser, which also parses the data as a series of
    records, I tried the same with ppt files, and it works nicely so far. It
    might be able to replace ppt_parser soon.
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
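Reading a stream as a series of records, rather than searching it for byte sequences, can be sketched as below. The 8-byte header layout used here (4-bit version and 12-bit instance packed into one uint16, then a uint16 type and a uint32 length, all little-endian) is an assumption based on the published PowerPoint binary format and is not stated in the commit message. `iter_records` and the sample type values are made up.

```python
import io
import struct

# Assumed header layout: version/instance, record type, payload length
HEADER = struct.Struct('<HHL')


def iter_records(stream):
    """Yield (rec_type, rec_instance, rec_ver, payload) until data runs out."""
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            return
        ver_inst, rec_type, rec_len = HEADER.unpack(raw)
        payload = stream.read(rec_len)
        # Low 4 bits hold the version, the upper 12 bits the instance
        yield rec_type, ver_inst >> 4, ver_inst & 0x0F, payload


# Two made-up records; the type values are arbitrary
data = (HEADER.pack(0x0000, 0x1111, 3) + b'abc' +
        HEADER.pack(0x0021, 0x2222, 0))
records = list(iter_records(io.BytesIO(data)))
```

Because every record announces its own length, the parser can skip payloads it does not understand and still stay aligned with the next header, which a plain byte-sequence search cannot guarantee.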
  • ooxml: implement skipping data in ZipSubFile
    3781f711
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  
  • xls_parser: move code to new record_base for re-use with ppt files ...
    d397edb5
    Parsing through records seems to make sense. Try to repeat the same with
    ppt files next. To avoid copy-and-paste, move the code used by both to a
    common base, record_base.py.
    Christian Herdtweck authored
    2018-01-17 15:00:16 +0100  

11 Jan, 2018
1 commit
  • fixed issue #242 (apply unquote to fldSimple tags)
    27dc5360
    decalage2 authored
    2018-01-11 11:43:26 +0100  

09 Jan, 2018
1 commit
  • Merge pull request #241 from christian-intra2net/dde-in-csv ...
    95ca88d2
    Dde in csv
    Philippe Lagadec authored
    2018-01-09 23:03:14 +0100  

05 Jan, 2018
8 commits
  • msodde: update doc, history and version
    874a5105
    Christian Herdtweck authored
    2018-01-05 10:48:55 +0100  
  • unittests: make pylint and pep8 a bit happier ...
    3977c68c
    They actually found a few \ in strings I had overlooked
    Christian Herdtweck authored
    2018-01-05 10:44:10 +0100  
  • unittest: add simple csv file and test it
    59a85138
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  
  • unittest: add .csv to list of files to be ignored ...
    4ac29b53
    Replace #print(...) with DEBUG_FLAG and conditional print(...)
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  
  • msodde: Wrap sys.stdout into unicode-encoder only in py2 ...
    63546685
    This is not necessary in Python 3.
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  
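A version-dependent wrapper of this kind might look like the following sketch. The function name `ensure_unicode_stdout` and the choice of UTF-8 are assumptions; the commit message does not say which encoding msodde uses.

```python
import codecs
import sys


def ensure_unicode_stdout(stream, major_version=sys.version_info[0]):
    """Wrap `stream` in a UTF-8 encoder on Python 2, else return it as is."""
    if major_version < 3:
        # Python 2 byte streams need an explicit encoder for unicode output
        return codecs.getwriter('utf8')(stream)
    # Python 3 text streams already encode str on their own
    return stream


unchanged = ensure_unicode_stdout(sys.stdout, major_version=3)
```

Passing the major version as a parameter is only there to make the branch easy to exercise; in real use the default would be taken from the running interpreter.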
  • msodde: limit csv delimiters to reasonable set ...
    5fe1f070
    Python's csv sniffer would find "i" as the delimiter in text or "<" in
    XML. We prefer an error over misinterpretation.
    
    Also, try all delimiters, not just a second one.
    
    Rename one constant (added the prefix CSV_).
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  
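Restricting the sniffer to a fixed set of delimiters is supported directly by `csv.Sniffer.sniff` through its `delimiters` argument. The sketch below shows the idea; the constant `CSV_DELIMITERS` and its contents are hypothetical, and the set actually used in msodde may differ.

```python
import csv

# Hypothetical restricted delimiter set; msodde's real set may differ
CSV_DELIMITERS = ',\t;|'


def sniff_delimiter(sample):
    """Return the detected delimiter, or None if no allowed one fits."""
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=CSV_DELIMITERS)
    except csv.Error:
        # The sniffer could not settle on any of the allowed delimiters
        return None
    return dialect.delimiter


semicolon = sniff_delimiter('a;b;c\n1;2;3\n4;5;6\n')
plain_text = sniff_delimiter('this is it\nthis is it\n')
```

With the restriction in place, plain text that happens to repeat a letter consistently no longer yields a bogus delimiter; the sniffer raises `csv.Error` instead, which the caller can treat as "not a CSV file".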
  • unittest: create new test for finding dde in csv
    fe87211b
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  
  • unittest: fix bug with OutputCapture and old loggers
    d6ffbaa5
    Christian Herdtweck authored
    2018-01-05 10:27:38 +0100  

04 Jan, 2018
2 commits
  • msodde: find DDE-links in CSV exported by Excel
    6956fce2
    Christian Herdtweck authored
    2018-01-04 18:48:01 +0100  
  • msodde: unify debug-logging what file type we found
    43c043d3
    Christian Herdtweck authored
    2018-01-04 18:47:20 +0100  