Commit ab1ba65e68e1534961a8ada1bbcbe4f5c623231f

Authored by Philippe Lagadec
1 parent b9b82e25

olevba: added pyparsing into thirdparty folder

oletools/thirdparty/pyparsing/LICENSE 0 → 100644
  1 +Permission is hereby granted, free of charge, to any person obtaining
  2 +a copy of this software and associated documentation files (the
  3 +"Software"), to deal in the Software without restriction, including
  4 +without limitation the rights to use, copy, modify, merge, publish,
  5 +distribute, sublicense, and/or sell copies of the Software, and to
  6 +permit persons to whom the Software is furnished to do so, subject to
  7 +the following conditions:
  8 +
  9 +The above copyright notice and this permission notice shall be
  10 +included in all copies or substantial portions of the Software.
  11 +
  12 +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  13 +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  14 +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
  15 +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
  16 +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  17 +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  18 +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
oletools/thirdparty/pyparsing/README 0 → 100644
  1 +====================================
  2 +PyParsing -- A Python Parsing Module
  3 +====================================
  4 +
  5 +Introduction
  6 +============
  7 +
  8 +The pyparsing module is an alternative approach to creating and executing
  9 +simple grammars, vs. the traditional lex/yacc approach, or the use of
  10 +regular expressions. The pyparsing module provides a library of classes
  11 +that client code uses to construct the grammar directly in Python code.
  12 +
  13 +Here is a program to parse "Hello, World!" (or any greeting of the form
  14 +"<salutation>, <addressee>!"):
  15 +
  16 + from pyparsing import Word, alphas
  17 + greet = Word( alphas ) + "," + Word( alphas ) + "!"
  18 + hello = "Hello, World!"
  19 + print hello, "->", greet.parseString( hello )
  20 +
  21 +The program outputs the following:
  22 +
  23 + Hello, World! -> ['Hello', ',', 'World', '!']
  24 +
  25 +The Python representation of the grammar is quite readable, owing to the
  26 +self-explanatory class names, and the use of '+', '|' and '^' operator
  27 +definitions.
  28 +
  29 +The parsed results returned from parseString() can be accessed as a
  30 +nested list, a dictionary, or an object with named attributes.
  31 +
  32 +The pyparsing module handles some of the problems that are typically
  33 +vexing when writing text parsers:
  34 +- extra or missing whitespace (the above program will also handle
  35 + "Hello,World!", "Hello , World !", etc.)
  36 +- quoted strings
  37 +- embedded comments
  38 +
  39 +The .zip file includes examples of a simple SQL parser, simple CORBA IDL
  40 +parser, a config file parser, a chemical formula parser, and a four-
  41 +function algebraic notation parser. It also includes a simple how-to
  42 +document, and a UML class diagram of the library's classes.
  43 +
  44 +
  45 +
  46 +Installation
  47 +============
  48 +
  49 +Do the usual:
  50 +
  51 + python setup.py install
  52 +
  53 +(pyparsing requires Python 2.3.2 or later.)
  54 +
  55 +
  56 +Documentation
  57 +=============
  58 +
  59 +See:
  60 +
  61 + HowToUsePyparsing.html
  62 +
  63 +
  64 +License
  65 +=======
  66 +
  67 + MIT License. See header of pyparsing.py
  68 +
  69 +History
  70 +=======
  71 +
  72 + See CHANGES file.
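
The README above says parsed results can be read as a list, a dictionary, or an object with named attributes, but its example only shows the list form. The following is a minimal sketch of all three, assuming pyparsing is importable under its usual top-level name. The results names `salutation` and `addressee` are illustrative choices, not part of the README's grammar.

```python
from pyparsing import Word, alphas

# Same greeting grammar as the README, with results names attached
# (the names "salutation" and "addressee" are illustrative assumptions).
greet = Word(alphas)("salutation") + "," + Word(alphas)("addressee") + "!"

# Extra whitespace around the punctuation is skipped by default.
result = greet.parseString("Hello , World !")

as_list = result.asList()       # list-style access
by_key = result["salutation"]   # dictionary-style access
by_attr = result.addressee      # attribute-style access

print(as_list, by_key, by_attr)
```

Only the two `Word` expressions carry names, so the literal "," and "!" tokens appear in the list form but have no key.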
oletools/thirdparty/pyparsing/__init__.py 0 → 100644
oletools/thirdparty/pyparsing/pyparsing.py 0 → 100644
  1 +# module pyparsing.py
  2 +#
  3 +# Copyright (c) 2003-2013 Paul T. McGuire
  4 +#
  5 +# Permission is hereby granted, free of charge, to any person obtaining
  6 +# a copy of this software and associated documentation files (the
  7 +# "Software"), to deal in the Software without restriction, including
  8 +# without limitation the rights to use, copy, modify, merge, publish,
  9 +# distribute, sublicense, and/or sell copies of the Software, and to
  10 +# permit persons to whom the Software is furnished to do so, subject to
  11 +# the following conditions:
  12 +#
  13 +# The above copyright notice and this permission notice shall be
  14 +# included in all copies or substantial portions of the Software.
  15 +#
  16 +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  17 +# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  18 +# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
  19 +# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
  20 +# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  21 +# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  22 +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  23 +#
  24 +
  25 +__doc__ = \
  26 +"""
  27 +pyparsing module - Classes and methods to define and execute parsing grammars
  28 +
  29 +The pyparsing module is an alternative approach to creating and executing simple grammars,
  30 +vs. the traditional lex/yacc approach, or the use of regular expressions. With pyparsing, you
  31 +don't need to learn a new syntax for defining grammars or matching expressions - the parsing module
  32 +provides a library of classes that you use to construct the grammar directly in Python.
  33 +
  34 +Here is a program to parse "Hello, World!" (or any greeting of the form C{"<salutation>, <addressee>!"})::
  35 +
  36 + from pyparsing import Word, alphas
  37 +
  38 + # define grammar of a greeting
  39 + greet = Word( alphas ) + "," + Word( alphas ) + "!"
  40 +
  41 + hello = "Hello, World!"
  42 + print (hello, "->", greet.parseString( hello ))
  43 +
  44 +The program outputs the following::
  45 +
  46 + Hello, World! -> ['Hello', ',', 'World', '!']
  47 +
  48 +The Python representation of the grammar is quite readable, owing to the self-explanatory
  49 +class names, and the use of '+', '|' and '^' operators.
  50 +
  51 +The parsed results returned from C{parseString()} can be accessed as a nested list, a dictionary, or an
  52 +object with named attributes.
  53 +
  54 +The pyparsing module handles some of the problems that are typically vexing when writing text parsers:
  55 + - extra or missing whitespace (the above program will also handle "Hello,World!", "Hello , World !", etc.)
  56 + - quoted strings
  57 + - embedded comments
  58 +"""
  59 +
  60 +__version__ = "2.0.3"
  61 +__versionTime__ = "16 Aug 2014 00:12"
  62 +__author__ = "Paul McGuire <ptmcg@users.sourceforge.net>"
  63 +
  64 +import string
  65 +from weakref import ref as wkref
  66 +import copy
  67 +import sys
  68 +import warnings
  69 +import re
  70 +import sre_constants
  71 +import collections
  72 +import pprint
  73 +#~ sys.stderr.write( "testing pyparsing module, version %s, %s\n" % (__version__,__versionTime__ ) )
  74 +
  75 +__all__ = [
  76 +'And', 'CaselessKeyword', 'CaselessLiteral', 'CharsNotIn', 'Combine', 'Dict', 'Each', 'Empty',
  77 +'FollowedBy', 'Forward', 'GoToColumn', 'Group', 'Keyword', 'LineEnd', 'LineStart', 'Literal',
  78 +'MatchFirst', 'NoMatch', 'NotAny', 'OneOrMore', 'OnlyOnce', 'Optional', 'Or',
  79 +'ParseBaseException', 'ParseElementEnhance', 'ParseException', 'ParseExpression', 'ParseFatalException',
  80 +'ParseResults', 'ParseSyntaxException', 'ParserElement', 'QuotedString', 'RecursiveGrammarException',
  81 +'Regex', 'SkipTo', 'StringEnd', 'StringStart', 'Suppress', 'Token', 'TokenConverter', 'Upcase',
  82 +'White', 'Word', 'WordEnd', 'WordStart', 'ZeroOrMore',
  83 +'alphanums', 'alphas', 'alphas8bit', 'anyCloseTag', 'anyOpenTag', 'cStyleComment', 'col',
  84 +'commaSeparatedList', 'commonHTMLEntity', 'countedArray', 'cppStyleComment', 'dblQuotedString',
  85 +'dblSlashComment', 'delimitedList', 'dictOf', 'downcaseTokens', 'empty', 'hexnums',
  86 +'htmlComment', 'javaStyleComment', 'keepOriginalText', 'line', 'lineEnd', 'lineStart', 'lineno',
  87 +'makeHTMLTags', 'makeXMLTags', 'matchOnlyAtCol', 'matchPreviousExpr', 'matchPreviousLiteral',
  88 +'nestedExpr', 'nullDebugAction', 'nums', 'oneOf', 'opAssoc', 'operatorPrecedence', 'printables',
  89 +'punc8bit', 'pythonStyleComment', 'quotedString', 'removeQuotes', 'replaceHTMLEntity',
  90 +'replaceWith', 'restOfLine', 'sglQuotedString', 'srange', 'stringEnd',
  91 +'stringStart', 'traceParseAction', 'unicodeString', 'upcaseTokens', 'withAttribute',
  92 +'indentedBlock', 'originalTextFor', 'ungroup', 'infixNotation','locatedExpr',
  93 +]
  94 +
  95 +PY_3 = sys.version.startswith('3')
  96 +if PY_3:
  97 +    _MAX_INT = sys.maxsize
  98 +    basestring = str
  99 +    unichr = chr
  100 +    _ustr = str
  101 +
  102 +    # build list of single arg builtins, that can be used as parse actions
  103 +    singleArgBuiltins = [sum, len, sorted, reversed, list, tuple, set, any, all, min, max]
  104 +
  105 +else:
  106 +    _MAX_INT = sys.maxint
  107 +    range = xrange
  108 +
  109 +    def _ustr(obj):
  110 +        """Drop-in replacement for str(obj) that tries to be Unicode friendly. It first tries
  111 +           str(obj). If that fails with a UnicodeEncodeError, then it tries unicode(obj). It
  112 +           then < returns the unicode object | encodes it with the default encoding | ... >.
  113 +        """
  114 +        if isinstance(obj,unicode):
  115 +            return obj
  116 +
  117 +        try:
  118 +            # If this works, then _ustr(obj) has the same behaviour as str(obj), so
  119 +            # it won't break any existing code.
  120 +            return str(obj)
  121 +
  122 +        except UnicodeEncodeError:
  123 +            # The Python docs (http://docs.python.org/ref/customization.html#l2h-182)
  124 +            # state that "The return value must be a string object". However, does a
  125 +            # unicode object (being a subclass of basestring) count as a "string
  126 +            # object"?
  127 +            # If so, then return a unicode object:
  128 +            return unicode(obj)
  129 +            # Else encode it... but how? There are many choices... :)
  130 +            # Replace unprintables with escape codes?
  131 +            #return unicode(obj).encode(sys.getdefaultencoding(), 'backslashreplace_errors')
  132 +            # Replace unprintables with question marks?
  133 +            #return unicode(obj).encode(sys.getdefaultencoding(), 'replace')
  134 +            # ...
  135 +
  136 +    # build list of single arg builtins, tolerant of Python version, that can be used as parse actions
  137 +    singleArgBuiltins = []
  138 +    import __builtin__
  139 +    for fname in "sum len sorted reversed list tuple set any all min max".split():
  140 +        try:
  141 +            singleArgBuiltins.append(getattr(__builtin__,fname))
  142 +        except AttributeError:
  143 +            continue
  144 +
  145 +_generatorType = type((y for y in range(1)))
  146 +
  147 +def _xml_escape(data):
  148 +    """Escape &, <, >, ", ', etc. in a string of data."""
  149 +
  150 +    # ampersand must be replaced first
  151 +    from_symbols = '&><"\''
  152 +    to_symbols = ('&'+s+';' for s in "amp gt lt quot apos".split())
  153 +    for from_,to_ in zip(from_symbols, to_symbols):
  154 +        data = data.replace(from_, to_)
  155 +    return data
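
The comment in `_xml_escape` notes that the ampersand must be replaced first. The reason is that every later replacement introduces a new `&`, which would be escaped a second time if `&` were handled afterwards. Below is a standalone sketch of the same ordering. The name `xml_escape_sketch` is hypothetical and the explicit pair list is a restatement for readability, not the module's code.

```python
def xml_escape_sketch(data):
    """Escape &, >, <, " and ' in that order ('&' must go first)."""
    pairs = [
        ("&", "&amp;"),   # first, so entities added below are not re-escaped
        (">", "&gt;"),
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("'", "&apos;"),
    ]
    for raw, entity in pairs:
        data = data.replace(raw, entity)
    return data


escaped = xml_escape_sketch('a < b & "c"')
print(escaped)
```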
  156 +
  157 +class _Constants(object):
  158 + pass
  159 +
  160 +alphas = string.ascii_lowercase + string.ascii_uppercase
  161 +nums = "0123456789"
  162 +hexnums = nums + "ABCDEFabcdef"
  163 +alphanums = alphas + nums
  164 +_bslash = chr(92)
  165 +printables = "".join(c for c in string.printable if c not in string.whitespace)
  166 +
  167 +class ParseBaseException(Exception):
  168 + """base exception class for all parsing runtime exceptions"""
  169 + # Performance tuning: we construct a *lot* of these, so keep this
  170 + # constructor as small and fast as possible
  171 + def __init__( self, pstr, loc=0, msg=None, elem=None ):
  172 + self.loc = loc
  173 + if msg is None:
  174 + self.msg = pstr
  175 + self.pstr = ""
  176 + else:
  177 + self.msg = msg
  178 + self.pstr = pstr
  179 + self.parserElement = elem
  180 +
  181 + def __getattr__( self, aname ):
  182 + """supported attributes by name are:
  183 + - lineno - returns the line number of the exception text
  184 + - col - returns the column number of the exception text
  185 + - line - returns the line containing the exception text
  186 + """
  187 + if( aname == "lineno" ):
  188 + return lineno( self.loc, self.pstr )
  189 + elif( aname in ("col", "column") ):
  190 + return col( self.loc, self.pstr )
  191 + elif( aname == "line" ):
  192 + return line( self.loc, self.pstr )
  193 + else:
  194 + raise AttributeError(aname)
  195 +
  196 + def __str__( self ):
  197 + return "%s (at char %d), (line:%d, col:%d)" % \
  198 + ( self.msg, self.loc, self.lineno, self.column )
  199 + def __repr__( self ):
  200 + return _ustr(self)
  201 + def markInputline( self, markerString = ">!<" ):
  202 + """Extracts the exception line from the input string, and marks
  203 + the location of the exception with a special symbol.
  204 + """
  205 + line_str = self.line
  206 + line_column = self.column - 1
  207 + if markerString:
  208 + line_str = "".join((line_str[:line_column],
  209 + markerString, line_str[line_column:]))
  210 + return line_str.strip()
  211 + def __dir__(self):
  212 + return "loc msg pstr parserElement lineno col line " \
  213 + "markInputline __str__ __repr__".split()
  214 +
  215 +class ParseException(ParseBaseException):
  216 + """exception thrown when parse expressions don't match class;
  217 + supported attributes by name are:
  218 + - lineno - returns the line number of the exception text
  219 + - col - returns the column number of the exception text
  220 + - line - returns the line containing the exception text
  221 + """
  222 + pass
  223 +
  224 +class ParseFatalException(ParseBaseException):
  225 + """user-throwable exception thrown when inconsistent parse content
  226 + is found; stops all parsing immediately"""
  227 + pass
  228 +
  229 +class ParseSyntaxException(ParseFatalException):
  230 + """just like C{L{ParseFatalException}}, but thrown internally when an
  231 + C{L{ErrorStop<And._ErrorStop>}} ('-' operator) indicates that parsing is to stop immediately because
  232 + an unbacktrackable syntax error has been found"""
  233 + def __init__(self, pe):
  234 + super(ParseSyntaxException, self).__init__(
  235 + pe.pstr, pe.loc, pe.msg, pe.parserElement)
  236 +
  237 +#~ class ReparseException(ParseBaseException):
  238 + #~ """Experimental class - parse actions can raise this exception to cause
  239 + #~ pyparsing to reparse the input string:
  240 + #~ - with a modified input string, and/or
  241 + #~ - with a modified start location
  242 + #~ Set the values of the ReparseException in the constructor, and raise the
  243 + #~ exception in a parse action to cause pyparsing to use the new string/location.
  244 + #~ Setting the values as None causes no change to be made.
  245 + #~ """
  246 + #~ def __init_( self, newstring, restartLoc ):
  247 + #~ self.newParseText = newstring
  248 + #~ self.reparseLoc = restartLoc
  249 +
  250 +class RecursiveGrammarException(Exception):
  251 + """exception thrown by C{validate()} if the grammar could be improperly recursive"""
  252 + def __init__( self, parseElementList ):
  253 + self.parseElementTrace = parseElementList
  254 +
  255 + def __str__( self ):
  256 + return "RecursiveGrammarException: %s" % self.parseElementTrace
  257 +
  258 +class _ParseResultsWithOffset(object):
  259 + def __init__(self,p1,p2):
  260 + self.tup = (p1,p2)
  261 + def __getitem__(self,i):
  262 + return self.tup[i]
  263 + def __repr__(self):
  264 + return repr(self.tup)
  265 + def setOffset(self,i):
  266 + self.tup = (self.tup[0],i)
  267 +
  268 +class ParseResults(object):
  269 + """Structured parse results, to provide multiple means of access to the parsed data:
  270 + - as a list (C{len(results)})
  271 + - by list index (C{results[0], results[1]}, etc.)
  272 + - by attribute (C{results.<resultsName>})
  273 + """
  274 + def __new__(cls, toklist, name=None, asList=True, modal=True ):
  275 + if isinstance(toklist, cls):
  276 + return toklist
  277 + retobj = object.__new__(cls)
  278 + retobj.__doinit = True
  279 + return retobj
  280 +
  281 + # Performance tuning: we construct a *lot* of these, so keep this
  282 + # constructor as small and fast as possible
  283 + def __init__( self, toklist, name=None, asList=True, modal=True, isinstance=isinstance ):
  284 + if self.__doinit:
  285 + self.__doinit = False
  286 + self.__name = None
  287 + self.__parent = None
  288 + self.__accumNames = {}
  289 + if isinstance(toklist, list):
  290 + self.__toklist = toklist[:]
  291 + elif isinstance(toklist, _generatorType):
  292 + self.__toklist = list(toklist)
  293 + else:
  294 + self.__toklist = [toklist]
  295 + self.__tokdict = dict()
  296 +
  297 + if name is not None and name:
  298 + if not modal:
  299 + self.__accumNames[name] = 0
  300 + if isinstance(name,int):
  301 + name = _ustr(name) # will always return a str, but use _ustr for consistency
  302 + self.__name = name
  303 + if not (isinstance(toklist, (type(None), basestring, list)) and toklist in (None,'',[])):
  304 + if isinstance(toklist,basestring):
  305 + toklist = [ toklist ]
  306 + if asList:
  307 + if isinstance(toklist,ParseResults):
  308 + self[name] = _ParseResultsWithOffset(toklist.copy(),0)
  309 + else:
  310 + self[name] = _ParseResultsWithOffset(ParseResults(toklist[0]),0)
  311 + self[name].__name = name
  312 + else:
  313 + try:
  314 + self[name] = toklist[0]
  315 + except (KeyError,TypeError,IndexError):
  316 + self[name] = toklist
  317 +
  318 + def __getitem__( self, i ):
  319 + if isinstance( i, (int,slice) ):
  320 + return self.__toklist[i]
  321 + else:
  322 + if i not in self.__accumNames:
  323 + return self.__tokdict[i][-1][0]
  324 + else:
  325 + return ParseResults([ v[0] for v in self.__tokdict[i] ])
  326 +
  327 + def __setitem__( self, k, v, isinstance=isinstance ):
  328 + if isinstance(v,_ParseResultsWithOffset):
  329 + self.__tokdict[k] = self.__tokdict.get(k,list()) + [v]
  330 + sub = v[0]
  331 + elif isinstance(k,int):
  332 + self.__toklist[k] = v
  333 + sub = v
  334 + else:
  335 + self.__tokdict[k] = self.__tokdict.get(k,list()) + [_ParseResultsWithOffset(v,0)]
  336 + sub = v
  337 + if isinstance(sub,ParseResults):
  338 + sub.__parent = wkref(self)
  339 +
  340 + def __delitem__( self, i ):
  341 + if isinstance(i,(int,slice)):
  342 + mylen = len( self.__toklist )
  343 + del self.__toklist[i]
  344 +
  345 + # convert int to slice
  346 + if isinstance(i, int):
  347 + if i < 0:
  348 + i += mylen
  349 + i = slice(i, i+1)
  350 + # get removed indices
  351 + removed = list(range(*i.indices(mylen)))
  352 + removed.reverse()
  353 + # fixup indices in token dictionary
  354 + for name in self.__tokdict:
  355 + occurrences = self.__tokdict[name]
  356 + for j in removed:
  357 + for k, (value, position) in enumerate(occurrences):
  358 + occurrences[k] = _ParseResultsWithOffset(value, position - (position > j))
  359 + else:
  360 + del self.__tokdict[i]
  361 +
  362 + def __contains__( self, k ):
  363 + return k in self.__tokdict
  364 +
  365 + def __len__( self ): return len( self.__toklist )
  366 + def __bool__(self): return len( self.__toklist ) > 0
  367 + __nonzero__ = __bool__
  368 + def __iter__( self ): return iter( self.__toklist )
  369 + def __reversed__( self ): return iter( self.__toklist[::-1] )
  370 + def iterkeys( self ):
  371 + """Returns all named result keys."""
  372 + if hasattr(self.__tokdict, "iterkeys"):
  373 + return self.__tokdict.iterkeys()
  374 + else:
  375 + return iter(self.__tokdict)
  376 +
  377 + def itervalues( self ):
  378 + """Returns all named result values."""
  379 + return (self[k] for k in self.iterkeys())
  380 +
  381 + def iteritems( self ):
  382 + return ((k, self[k]) for k in self.iterkeys())
  383 +
  384 + if PY_3:
  385 + keys = iterkeys
  386 + values = itervalues
  387 + items = iteritems
  388 + else:
  389 + def keys( self ):
  390 + """Returns all named result keys."""
  391 + return list(self.iterkeys())
  392 +
  393 + def values( self ):
  394 + """Returns all named result values."""
  395 + return list(self.itervalues())
  396 +
  397 + def items( self ):
  398 + """Returns all named result keys and values as a list of tuples."""
  399 + return list(self.iteritems())
  400 +
  401 + def haskeys( self ):
  402 + """Since keys() returns an iterator, this method is helpful in bypassing
  403 + code that looks for the existence of any defined results names."""
  404 + return bool(self.__tokdict)
  405 +
  406 + def pop( self, *args, **kwargs):
  407 + """Removes and returns item at specified index (default=last).
  408 + Supports both list and dict semantics for pop(). If passed no
  409 + argument or an integer argument, it will use list semantics
  410 + and pop tokens from the list of parsed tokens. If passed a
  411 + non-integer argument (most likely a string), it will use dict
  412 + semantics and pop the corresponding value from any defined
  413 + results names. A second default return value argument is
  414 + supported, just as in dict.pop()."""
  415 + if not args:
  416 + args = [-1]
  417 + for k,v in kwargs.items():
  418 + if k == 'default':
  419 + args = (args[0], v)
  420 + else:
  421 + raise TypeError("pop() got an unexpected keyword argument '%s'" % k)
  422 + if (isinstance(args[0], int) or
  423 + len(args) == 1 or
  424 + args[0] in self):
  425 + index = args[0]
  426 + ret = self[index]
  427 + del self[index]
  428 + return ret
  429 + else:
  430 + defaultvalue = args[1]
  431 + return defaultvalue
  432 +
  433 + def get(self, key, defaultValue=None):
  434 + """Returns named result matching the given key, or if there is no
  435 + such name, then returns the given C{defaultValue} or C{None} if no
  436 + C{defaultValue} is specified."""
  437 + if key in self:
  438 + return self[key]
  439 + else:
  440 + return defaultValue
  441 +
  442 + def insert( self, index, insStr ):
  443 + """Inserts new element at location index in the list of parsed tokens."""
  444 + self.__toklist.insert(index, insStr)
  445 + # fixup indices in token dictionary
  446 + for name in self.__tokdict:
  447 + occurrences = self.__tokdict[name]
  448 + for k, (value, position) in enumerate(occurrences):
  449 + occurrences[k] = _ParseResultsWithOffset(value, position + (position > index))
  450 +
  451 + def append( self, item ):
  452 + """Add single element to end of ParseResults list of elements."""
  453 + self.__toklist.append(item)
  454 +
  455 + def extend( self, itemseq ):
  456 + """Add sequence of elements to end of ParseResults list of elements."""
  457 + if isinstance(itemseq, ParseResults):
  458 + self += itemseq
  459 + else:
  460 + self.__toklist.extend(itemseq)
  461 +
  462 + def clear( self ):
  463 + """Clear all elements and results names."""
  464 + del self.__toklist[:]
  465 + self.__tokdict.clear()
  466 +
  467 + def __getattr__( self, name ):
  468 + try:
  469 + return self[name]
  470 + except KeyError:
  471 + return ""
  472 +
  473 + if name in self.__tokdict:
  474 + if name not in self.__accumNames:
  475 + return self.__tokdict[name][-1][0]
  476 + else:
  477 + return ParseResults([ v[0] for v in self.__tokdict[name] ])
  478 + else:
  479 + return ""
  480 +
  481 + def __add__( self, other ):
  482 + ret = self.copy()
  483 + ret += other
  484 + return ret
  485 +
  486 + def __iadd__( self, other ):
  487 + if other.__tokdict:
  488 + offset = len(self.__toklist)
  489 + addoffset = ( lambda a: (a<0 and offset) or (a+offset) )
  490 + otheritems = other.__tokdict.items()
  491 + otherdictitems = [(k, _ParseResultsWithOffset(v[0],addoffset(v[1])) )
  492 + for (k,vlist) in otheritems for v in vlist]
  493 + for k,v in otherdictitems:
  494 + self[k] = v
  495 + if isinstance(v[0],ParseResults):
  496 + v[0].__parent = wkref(self)
  497 +
  498 + self.__toklist += other.__toklist
  499 + self.__accumNames.update( other.__accumNames )
  500 + return self
  501 +
  502 + def __radd__(self, other):
  503 + if isinstance(other,int) and other == 0:
  504 + return self.copy()
  505 +
  506 + def __repr__( self ):
  507 + return "(%s, %s)" % ( repr( self.__toklist ), repr( self.__tokdict ) )
  508 +
  509 + def __str__( self ):
  510 + out = []
  511 + for i in self.__toklist:
  512 + if isinstance(i, ParseResults):
  513 + out.append(_ustr(i))
  514 + else:
  515 + out.append(repr(i))
  516 + return '[' + ', '.join(out) + ']'
  517 +
  518 + def _asStringList( self, sep='' ):
  519 + out = []
  520 + for item in self.__toklist:
  521 + if out and sep:
  522 + out.append(sep)
  523 + if isinstance( item, ParseResults ):
  524 + out += item._asStringList()
  525 + else:
  526 + out.append( _ustr(item) )
  527 + return out
  528 +
  529 + def asList( self ):
  530 + """Returns the parse results as a nested list of matching tokens, all converted to strings."""
  531 + out = []
  532 + for res in self.__toklist:
  533 + if isinstance(res,ParseResults):
  534 + out.append( res.asList() )
  535 + else:
  536 + out.append( res )
  537 + return out
  538 +
  539 + def asDict( self ):
  540 + """Returns the named parse results as dictionary."""
  541 + if PY_3:
  542 + return dict( self.items() )
  543 + else:
  544 + return dict( self.iteritems() )
  545 +
  546 + def copy( self ):
  547 + """Returns a new copy of a C{ParseResults} object."""
  548 + ret = ParseResults( self.__toklist )
  549 + ret.__tokdict = self.__tokdict.copy()
  550 + ret.__parent = self.__parent
  551 + ret.__accumNames.update( self.__accumNames )
  552 + ret.__name = self.__name
  553 + return ret
  554 +
  555 + def asXML( self, doctag=None, namedItemsOnly=False, indent="", formatted=True ):
  556 + """Returns the parse results as XML. Tags are created for tokens and lists that have defined results names."""
  557 + nl = "\n"
  558 + out = []
  559 + namedItems = dict((v[1],k) for (k,vlist) in self.__tokdict.items()
  560 + for v in vlist)
  561 + nextLevelIndent = indent + " "
  562 +
  563 + # collapse out indents if formatting is not desired
  564 + if not formatted:
  565 + indent = ""
  566 + nextLevelIndent = ""
  567 + nl = ""
  568 +
  569 + selfTag = None
  570 + if doctag is not None:
  571 + selfTag = doctag
  572 + else:
  573 + if self.__name:
  574 + selfTag = self.__name
  575 +
  576 + if not selfTag:
  577 + if namedItemsOnly:
  578 + return ""
  579 + else:
  580 + selfTag = "ITEM"
  581 +
  582 + out += [ nl, indent, "<", selfTag, ">" ]
  583 +
  584 + worklist = self.__toklist
  585 + for i,res in enumerate(worklist):
  586 + if isinstance(res,ParseResults):
  587 + if i in namedItems:
  588 + out += [ res.asXML(namedItems[i],
  589 + namedItemsOnly and doctag is None,
  590 + nextLevelIndent,
  591 + formatted)]
  592 + else:
  593 + out += [ res.asXML(None,
  594 + namedItemsOnly and doctag is None,
  595 + nextLevelIndent,
  596 + formatted)]
  597 + else:
  598 + # individual token, see if there is a name for it
  599 + resTag = None
  600 + if i in namedItems:
  601 + resTag = namedItems[i]
  602 + if not resTag:
  603 + if namedItemsOnly:
  604 + continue
  605 + else:
  606 + resTag = "ITEM"
  607 + xmlBodyText = _xml_escape(_ustr(res))
  608 + out += [ nl, nextLevelIndent, "<", resTag, ">",
  609 + xmlBodyText,
  610 + "</", resTag, ">" ]
  611 +
  612 + out += [ nl, indent, "</", selfTag, ">" ]
  613 + return "".join(out)
  614 +
  615 + def __lookup(self,sub):
  616 + for k,vlist in self.__tokdict.items():
  617 + for v,loc in vlist:
  618 + if sub is v:
  619 + return k
  620 + return None
  621 +
  622 + def getName(self):
  623 + """Returns the results name for this token expression."""
  624 + if self.__name:
  625 + return self.__name
  626 + elif self.__parent:
  627 + par = self.__parent()
  628 + if par:
  629 + return par.__lookup(self)
  630 + else:
  631 + return None
  632 + elif (len(self) == 1 and
  633 + len(self.__tokdict) == 1 and
  634 + self.__tokdict.values()[0][0][1] in (0,-1)):
  635 + return self.__tokdict.keys()[0]
  636 + else:
  637 + return None
  638 +
  639 + def dump(self,indent='',depth=0):
  640 + """Diagnostic method for listing out the contents of a C{ParseResults}.
  641 + Accepts an optional C{indent} argument so that this string can be embedded
  642 + in a nested display of other data."""
  643 + out = []
  644 + NL = '\n'
  645 + out.append( indent+_ustr(self.asList()) )
  646 + items = sorted(self.items())
  647 + for k,v in items:
  648 + if out:
  649 + out.append(NL)
  650 + out.append( "%s%s- %s: " % (indent,(' '*depth), k) )
  651 + if isinstance(v,ParseResults):
  652 + if v:
  653 + if v.haskeys():
  654 + out.append( v.dump(indent,depth+1) )
  655 + elif any(isinstance(vv,ParseResults) for vv in v):
  656 + for i,vv in enumerate(v):
  657 + if isinstance(vv,ParseResults):
  658 + out.append("\n%s%s[%d]:\n%s%s%s" % (indent,(' '*(depth+1)),i,indent,(' '*(depth+2)),vv.dump(indent,depth+2) ))
  659 + else:
  660 + out.append("\n%s%s[%d]:\n%s%s%s" % (indent,(' '*(depth+1)),i,indent,(' '*(depth+2)),_ustr(vv)))
  661 + else:
  662 + out.append(_ustr(v))
  663 + else:
  664 + out.append(_ustr(v))
  665 + else:
  666 + out.append(_ustr(v))
  667 + return "".join(out)
  668 +
  669 + def pprint(self, *args, **kwargs):
  670 + """Pretty-printer for parsed results as a list, using the C{pprint} module.
  671 + Accepts additional positional or keyword args as defined for the
  672 + C{pprint.pprint} method. (U{http://docs.python.org/3/library/pprint.html#pprint.pprint})"""
  673 + pprint.pprint(self.asList(), *args, **kwargs)
  674 +
  675 + # add support for pickle protocol
  676 + def __getstate__(self):
  677 + return ( self.__toklist,
  678 + ( self.__tokdict.copy(),
  679 + self.__parent is not None and self.__parent() or None,
  680 + self.__accumNames,
  681 + self.__name ) )
  682 +
  683 + def __setstate__(self,state):
  684 + self.__toklist = state[0]
  685 + (self.__tokdict,
  686 + par,
  687 + inAccumNames,
  688 + self.__name) = state[1]
  689 + self.__accumNames = {}
  690 + self.__accumNames.update(inAccumNames)
  691 + if par is not None:
  692 + self.__parent = wkref(par)
  693 + else:
  694 + self.__parent = None
  695 +
  696 + def __dir__(self):
  697 + return dir(super(ParseResults,self)) + list(self.keys())
  698 +
  699 +collections.MutableMapping.register(ParseResults)
  700 +
  701 +def col (loc,strg):
  702 +    """Returns current column within a string, counting newlines as line separators.
  703 +    The first column is number 1.
  704 +
  705 +    Note: the default parsing behavior is to expand tabs in the input string
  706 +    before starting the parsing process. See L{I{ParserElement.parseString}<ParserElement.parseString>} for more information
  707 +    on parsing strings containing C{<TAB>}s, and suggested methods to maintain a
  708 +    consistent view of the parsed string, the parse location, and line and column
  709 +    positions within the parsed string.
  710 +    """
  711 +    return (loc<len(strg) and strg[loc] == '\n') and 1 or loc - strg.rfind("\n", 0, loc)
  712 +
  713 +def lineno(loc,strg):
  714 +    """Returns current line number within a string, counting newlines as line separators.
  715 +    The first line is number 1.
  716 +
  717 +    Note: the default parsing behavior is to expand tabs in the input string
  718 +    before starting the parsing process. See L{I{ParserElement.parseString}<ParserElement.parseString>} for more information
  719 +    on parsing strings containing C{<TAB>}s, and suggested methods to maintain a
  720 +    consistent view of the parsed string, the parse location, and line and column
  721 +    positions within the parsed string.
  722 +    """
  723 +    return strg.count("\n",0,loc) + 1
  724 +
  725 +def line( loc, strg ):
  726 +    """Returns the line of text containing loc within a string, counting newlines as line separators.
  727 +    """
  728 +    lastCR = strg.rfind("\n", 0, loc)
  729 +    nextCR = strg.find("\n", loc)
  730 +    if nextCR >= 0:
  731 +        return strg[lastCR+1:nextCR]
  732 +    else:
  733 +        return strg[lastCR+1:]
  734 +
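A minimal standalone sketch of the three location helpers above, restated outside pyparsing so they can be run alone. The sample string and the name `text` are assumptions for illustration; the function bodies follow the logic shown in the diff.

```python
def col(loc, strg):
    # Columns are 1-based; a location that sits on a newline reports column 1.
    if loc < len(strg) and strg[loc] == "\n":
        return 1
    return loc - strg.rfind("\n", 0, loc)

def lineno(loc, strg):
    # Lines are 1-based: count the newlines that precede loc.
    return strg.count("\n", 0, loc) + 1

def line(loc, strg):
    # Return the text of the line containing loc, without its newline.
    last_nl = strg.rfind("\n", 0, loc)
    next_nl = strg.find("\n", loc)
    if next_nl >= 0:
        return strg[last_nl + 1:next_nl]
    return strg[last_nl + 1:]

text = "alpha\nbeta gamma\ndelta"   # hypothetical input
loc = text.index("gamma")            # offset of "gamma" in the whole string

position = (lineno(loc, text), col(loc, text))
current_line = line(loc, text)
```

Note that these helpers index into the string as pyparsing sees it, so with the default tab expansion the offsets refer to the expanded string rather than the original input.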
  735 +def _defaultStartDebugAction( instring, loc, expr ):
  736 + print (("Match " + _ustr(expr) + " at loc " + _ustr(loc) + "(%d,%d)" % ( lineno(loc,instring), col(loc,instring) )))
  737 +
  738 +def _defaultSuccessDebugAction( instring, startloc, endloc, expr, toks ):
  739 + print ("Matched " + _ustr(expr) + " -> " + str(toks.asList()))
  740 +
  741 +def _defaultExceptionDebugAction( instring, loc, expr, exc ):
  742 + print ("Exception raised:" + _ustr(exc))
  743 +
  744 +def nullDebugAction(*args):
  745 + """'Do-nothing' debug action, to suppress debugging output during parsing."""
  746 + pass
  747 +
  748 +# Only works on Python 3.x - nonlocal is toxic to Python 2 installs
  749 +#~ 'decorator to trim function calls to match the arity of the target'
  750 +#~ def _trim_arity(func, maxargs=3):
  751 + #~ if func in singleArgBuiltins:
  752 + #~ return lambda s,l,t: func(t)
  753 + #~ limit = 0
  754 + #~ foundArity = False
  755 + #~ def wrapper(*args):
  756 + #~ nonlocal limit,foundArity
  757 + #~ while 1:
  758 + #~ try:
  759 + #~ ret = func(*args[limit:])
  760 + #~ foundArity = True
  761 + #~ return ret
  762 + #~ except TypeError:
  763 + #~ if limit == maxargs or foundArity:
  764 + #~ raise
  765 + #~ limit += 1
  766 + #~ continue
  767 + #~ return wrapper
  768 +
  769 +# this version is Python 2.x-3.x cross-compatible
  770 +'decorator to trim function calls to match the arity of the target'
  771 +def _trim_arity(func, maxargs=2):
  772 + if func in singleArgBuiltins:
  773 + return lambda s,l,t: func(t)
  774 + limit = [0]
  775 + foundArity = [False]
  776 + def wrapper(*args):
  777 + while 1:
  778 + try:
  779 + ret = func(*args[limit[0]:])
  780 + foundArity[0] = True
  781 + return ret
  782 + except TypeError:
  783 + if limit[0] <= maxargs and not foundArity[0]:
  784 + limit[0] += 1
  785 + continue
  786 + raise
  787 + return wrapper
  788 +
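The arity trimming above can be tried in isolation. This is a sketch under assumptions: `trim_arity` mirrors the cross-compatible `_trim_arity` shown in the diff but omits the `singleArgBuiltins` shortcut (that list is defined elsewhere in the module), and the four callbacks are hypothetical.

```python
def trim_arity(func, maxargs=2):
    # One-element lists stand in for "nonlocal", which Python 2 lacks.
    limit = [0]
    found_arity = [False]

    def wrapper(*args):
        while True:
            try:
                # Drop leading arguments until the call signature fits.
                ret = func(*args[limit[0]:])
                found_arity[0] = True
                return ret
            except TypeError:
                if limit[0] <= maxargs and not found_arity[0]:
                    limit[0] += 1
                    continue
                raise
    return wrapper

# Hypothetical parse actions with 3, 2, 1 and 0 parameters.
def full(s, loc, toks):
    return ("full", s, loc, toks)

def two(loc, toks):
    return ("two", loc, toks)

def one(toks):
    return ("one", toks)

def zero():
    return "zero"

# Every wrapper is called the way pyparsing calls parse actions: (s, loc, toks).
results = [trim_arity(f)("input", 3, ["tok"]) for f in (full, two, one, zero)]
```

Because the wrapper probes by catching `TypeError`, a `TypeError` raised inside the callback body on its first call is indistinguishable from an arity mismatch and triggers a retry with fewer arguments.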
  789 +class ParserElement(object):
  790 + """Abstract base level parser element class."""
  791 + DEFAULT_WHITE_CHARS = " \n\t\r"
  792 + verbose_stacktrace = False
  793 +
  794 + def setDefaultWhitespaceChars( chars ):
  795 + """Overrides the default whitespace chars
  796 + """
  797 + ParserElement.DEFAULT_WHITE_CHARS = chars
  798 + setDefaultWhitespaceChars = staticmethod(setDefaultWhitespaceChars)
  799 +
  800 + def inlineLiteralsUsing(cls):
  801 + """
  802 + Set class to be used for inclusion of string literals into a parser.
  803 + """
  804 + ParserElement.literalStringClass = cls
  805 + inlineLiteralsUsing = staticmethod(inlineLiteralsUsing)
  806 +
  807 + def __init__( self, savelist=False ):
  808 + self.parseAction = list()
  809 + self.failAction = None
  810 + #~ self.name = "<unknown>" # don't define self.name, let subclasses try/except upcall
  811 + self.strRepr = None
  812 + self.resultsName = None
  813 + self.saveAsList = savelist
  814 + self.skipWhitespace = True
  815 + self.whiteChars = ParserElement.DEFAULT_WHITE_CHARS
  816 + self.copyDefaultWhiteChars = True
  817 + self.mayReturnEmpty = False # used when checking for left-recursion
  818 + self.keepTabs = False
  819 + self.ignoreExprs = list()
  820 + self.debug = False
  821 + self.streamlined = False
  822 + self.mayIndexError = True # used to optimize exception handling for subclasses that don't advance parse index
  823 + self.errmsg = ""
  824 + self.modalResults = True # used to mark results names as modal (report only last) or cumulative (list all)
  825 + self.debugActions = ( None, None, None ) #custom debug actions
  826 + self.re = None
  827 + self.callPreparse = True # used to avoid redundant calls to preParse
  828 + self.callDuringTry = False
  829 +
  830 + def copy( self ):
  831 + """Make a copy of this C{ParserElement}. Useful for defining different parse actions
  832 + for the same parsing pattern, using copies of the original parse element."""
  833 + cpy = copy.copy( self )
  834 + cpy.parseAction = self.parseAction[:]
  835 + cpy.ignoreExprs = self.ignoreExprs[:]
  836 + if self.copyDefaultWhiteChars:
  837 + cpy.whiteChars = ParserElement.DEFAULT_WHITE_CHARS
  838 + return cpy
  839 +
  840 + def setName( self, name ):
  841 + """Define name for this expression, for use in debugging."""
  842 + self.name = name
  843 + self.errmsg = "Expected " + self.name
  844 + if hasattr(self,"exception"):
  845 + self.exception.msg = self.errmsg
  846 + return self
  847 +
  848 + def setResultsName( self, name, listAllMatches=False ):
  849 + """Define name for referencing matching tokens as a nested attribute
  850 + of the returned parse results.
  851 + NOTE: this returns a *copy* of the original C{ParserElement} object;
  852 + this is so that the client can define a basic element, such as an
  853 + integer, and reference it in multiple places with different names.
  854 +
  855 + You can also set results names using the abbreviated syntax,
  856 + C{expr("name")} in place of C{expr.setResultsName("name")} -
  857 + see L{I{__call__}<__call__>}.
  858 + """
  859 + newself = self.copy()
  860 + if name.endswith("*"):
  861 + name = name[:-1]
  862 + listAllMatches=True
  863 + newself.resultsName = name
  864 + newself.modalResults = not listAllMatches
  865 + return newself
  866 +
  867 + def setBreak(self,breakFlag = True):
  868 + """Method to invoke the Python pdb debugger when this element is
  869 + about to be parsed. Set C{breakFlag} to True to enable, False to
  870 + disable.
  871 + """
  872 + if breakFlag:
  873 + _parseMethod = self._parse
  874 + def breaker(instring, loc, doActions=True, callPreParse=True):
  875 + import pdb
  876 + pdb.set_trace()
  877 + return _parseMethod( instring, loc, doActions, callPreParse )
  878 + breaker._originalParseMethod = _parseMethod
  879 + self._parse = breaker
  880 + else:
  881 + if hasattr(self._parse,"_originalParseMethod"):
  882 + self._parse = self._parse._originalParseMethod
  883 + return self
  884 +
  885 + def setParseAction( self, *fns, **kwargs ):
  886 + """Define action to perform when successfully matching parse element definition.
  887 + Parse action fn is a callable method with 0-3 arguments, called as C{fn(s,loc,toks)},
  888 + C{fn(loc,toks)}, C{fn(toks)}, or just C{fn()}, where:
  889 + - s = the original string being parsed (see note below)
  890 + - loc = the location of the matching substring
  891 + - toks = a list of the matched tokens, packaged as a C{L{ParseResults}} object
  892 + If the functions in fns modify the tokens, they can return them as the return
  893 + value from fn, and the modified list of tokens will replace the original.
  894 + Otherwise, fn does not need to return any value.
  895 +
  896 + Note: the default parsing behavior is to expand tabs in the input string
  897 + before starting the parsing process. See L{I{parseString}<parseString>} for more information
  898 + on parsing strings containing C{<TAB>}s, and suggested methods to maintain a
  899 + consistent view of the parsed string, the parse location, and line and column
  900 + positions within the parsed string.
  901 + """
  902 + self.parseAction = list(map(_trim_arity, list(fns)))
  903 + self.callDuringTry = ("callDuringTry" in kwargs and kwargs["callDuringTry"])
  904 + return self
  905 +
  906 + def addParseAction( self, *fns, **kwargs ):
  907 + """Add parse action to expression's list of parse actions. See L{I{setParseAction}<setParseAction>}."""
  908 + self.parseAction += list(map(_trim_arity, list(fns)))
  909 + self.callDuringTry = self.callDuringTry or ("callDuringTry" in kwargs and kwargs["callDuringTry"])
  910 + return self
  911 +
  912 + def setFailAction( self, fn ):
  913 + """Define action to perform if parsing fails at this expression.
  914 + Fail action fn is a callable function that takes the arguments
  915 + C{fn(s,loc,expr,err)} where:
  916 + - s = string being parsed
  917 + - loc = location where expression match was attempted and failed
  918 + - expr = the parse expression that failed
  919 + - err = the exception thrown
  920 + The function returns no value. It may throw C{L{ParseFatalException}}
  921 + if it is desired to stop parsing immediately."""
  922 + self.failAction = fn
  923 + return self
  924 +
  925 + def _skipIgnorables( self, instring, loc ):
  926 + exprsFound = True
  927 + while exprsFound:
  928 + exprsFound = False
  929 + for e in self.ignoreExprs:
  930 + try:
  931 + while 1:
  932 + loc,dummy = e._parse( instring, loc )
  933 + exprsFound = True
  934 + except ParseException:
  935 + pass
  936 + return loc
  937 +
  938 + def preParse( self, instring, loc ):
  939 + if self.ignoreExprs:
  940 + loc = self._skipIgnorables( instring, loc )
  941 +
  942 + if self.skipWhitespace:
  943 + wt = self.whiteChars
  944 + instrlen = len(instring)
  945 + while loc < instrlen and instring[loc] in wt:
  946 + loc += 1
  947 +
  948 + return loc
  949 +
  950 + def parseImpl( self, instring, loc, doActions=True ):
  951 + return loc, []
  952 +
  953 + def postParse( self, instring, loc, tokenlist ):
  954 + return tokenlist
  955 +
  956 + #~ @profile
  957 + def _parseNoCache( self, instring, loc, doActions=True, callPreParse=True ):
  958 + debugging = ( self.debug ) #and doActions )
  959 +
  960 + if debugging or self.failAction:
  961 + #~ print ("Match",self,"at loc",loc,"(%d,%d)" % ( lineno(loc,instring), col(loc,instring) ))
  962 + if (self.debugActions[0] ):
  963 + self.debugActions[0]( instring, loc, self )
  964 + if callPreParse and self.callPreparse:
  965 + preloc = self.preParse( instring, loc )
  966 + else:
  967 + preloc = loc
  968 + tokensStart = preloc
  969 + try:
  970 + try:
  971 + loc,tokens = self.parseImpl( instring, preloc, doActions )
  972 + except IndexError:
  973 + raise ParseException( instring, len(instring), self.errmsg, self )
  974 + except ParseBaseException as err:
  975 + #~ print ("Exception raised:", err)
  976 + if self.debugActions[2]:
  977 + self.debugActions[2]( instring, tokensStart, self, err )
  978 + if self.failAction:
  979 + self.failAction( instring, tokensStart, self, err )
  980 + raise
  981 + else:
  982 + if callPreParse and self.callPreparse:
  983 + preloc = self.preParse( instring, loc )
  984 + else:
  985 + preloc = loc
  986 + tokensStart = preloc
  987 + if self.mayIndexError or loc >= len(instring):
  988 + try:
  989 + loc,tokens = self.parseImpl( instring, preloc, doActions )
  990 + except IndexError:
  991 + raise ParseException( instring, len(instring), self.errmsg, self )
  992 + else:
  993 + loc,tokens = self.parseImpl( instring, preloc, doActions )
  994 +
  995 + tokens = self.postParse( instring, loc, tokens )
  996 +
  997 + retTokens = ParseResults( tokens, self.resultsName, asList=self.saveAsList, modal=self.modalResults )
  998 + if self.parseAction and (doActions or self.callDuringTry):
  999 + if debugging:
  1000 + try:
  1001 + for fn in self.parseAction:
  1002 + tokens = fn( instring, tokensStart, retTokens )
  1003 + if tokens is not None:
  1004 + retTokens = ParseResults( tokens,
  1005 + self.resultsName,
  1006 + asList=self.saveAsList and isinstance(tokens,(ParseResults,list)),
  1007 + modal=self.modalResults )
  1008 + except ParseBaseException as err:
  1009 + #~ print "Exception raised in user parse action:", err
  1010 + if (self.debugActions[2] ):
  1011 + self.debugActions[2]( instring, tokensStart, self, err )
  1012 + raise
  1013 + else:
  1014 + for fn in self.parseAction:
  1015 + tokens = fn( instring, tokensStart, retTokens )
  1016 + if tokens is not None:
  1017 + retTokens = ParseResults( tokens,
  1018 + self.resultsName,
  1019 + asList=self.saveAsList and isinstance(tokens,(ParseResults,list)),
  1020 + modal=self.modalResults )
  1021 +
  1022 + if debugging:
  1023 + #~ print ("Matched",self,"->",retTokens.asList())
  1024 + if (self.debugActions[1] ):
  1025 + self.debugActions[1]( instring, tokensStart, loc, self, retTokens )
  1026 +
  1027 + return loc, retTokens
  1028 +
  1029 + def tryParse( self, instring, loc ):
  1030 + try:
  1031 + return self._parse( instring, loc, doActions=False )[0]
  1032 + except ParseFatalException:
  1033 + raise ParseException( instring, loc, self.errmsg, self)
  1034 +
  1035 + # this method gets repeatedly called during backtracking with the same arguments -
  1036 + # we can cache these arguments and save ourselves the trouble of re-parsing the contained expression
  1037 + def _parseCache( self, instring, loc, doActions=True, callPreParse=True ):
  1038 + lookup = (self,instring,loc,callPreParse,doActions)
  1039 + if lookup in ParserElement._exprArgCache:
  1040 + value = ParserElement._exprArgCache[ lookup ]
  1041 + if isinstance(value, Exception):
  1042 + raise value
  1043 + return (value[0],value[1].copy())
  1044 + else:
  1045 + try:
  1046 + value = self._parseNoCache( instring, loc, doActions, callPreParse )
  1047 + ParserElement._exprArgCache[ lookup ] = (value[0],value[1].copy())
  1048 + return value
  1049 + except ParseBaseException as pe:
  1050 + pe.__traceback__ = None
  1051 + ParserElement._exprArgCache[ lookup ] = pe
  1052 + raise
  1053 +
  1054 + _parse = _parseNoCache
  1055 +
  1056 + # argument cache for optimizing repeated calls when backtracking through recursive expressions
  1057 + _exprArgCache = {}
  1058 + def resetCache():
  1059 + ParserElement._exprArgCache.clear()
  1060 + resetCache = staticmethod(resetCache)
  1061 +
  1062 + _packratEnabled = False
  1063 + def enablePackrat():
  1064 + """Enables "packrat" parsing, which adds memoizing to the parsing logic.
  1065 + Repeated parse attempts at the same string location (which happens
  1066 + often in many complex grammars) can immediately return a cached value,
  1067 + instead of re-executing parsing/validating code. Both valid results
  1068 + and parsing exceptions are memoized.
  1069 +
  1070 + This speedup may break existing programs that use parse actions that
  1071 + have side-effects. For this reason, packrat parsing is disabled when
  1072 + you first import pyparsing. To activate the packrat feature, your
  1073 + program must call the static method C{ParserElement.enablePackrat()}. If
  1074 + your program uses C{psyco} to "compile as you go", you must call
  1075 + C{enablePackrat} before calling C{psyco.full()}. If you do not do this,
  1076 + Python will crash. For best results, call C{enablePackrat()} immediately
  1077 + after importing pyparsing.
  1078 + """
  1079 + if not ParserElement._packratEnabled:
  1080 + ParserElement._packratEnabled = True
  1081 + ParserElement._parse = ParserElement._parseCache
  1082 + enablePackrat = staticmethod(enablePackrat)
  1083 +
  1084 + def parseString( self, instring, parseAll=False ):
  1085 + """Execute the parse expression with the given string.
  1086 + This is the main interface to the client code, once the complete
  1087 + expression has been built.
  1088 +
  1089 + If you want the grammar to require that the entire input string be
  1090 + successfully parsed, then set C{parseAll} to True (equivalent to ending
  1091 + the grammar with C{L{StringEnd()}}).
  1092 +
  1093 + Note: C{parseString} implicitly calls C{expandtabs()} on the input string,
  1094 + in order to report proper column numbers in parse actions.
  1095 + If the input string contains tabs and
  1096 + the grammar uses parse actions that use the C{loc} argument to index into the
  1097 + string being parsed, you can ensure you have a consistent view of the input
  1098 + string by:
  1099 + - calling C{parseWithTabs} on your grammar before calling C{parseString}
  1100 + (see L{I{parseWithTabs}<parseWithTabs>})
  1101 + - define your parse action using the full C{(s,loc,toks)} signature, and
  1102 + reference the input string using the parse action's C{s} argument
  1103 + - explicitly expand the tabs in your input string before calling
  1104 + C{parseString}
  1105 + """
  1106 + ParserElement.resetCache()
  1107 + if not self.streamlined:
  1108 + self.streamline()
  1109 + #~ self.saveAsList = True
  1110 + for e in self.ignoreExprs:
  1111 + e.streamline()
  1112 + if not self.keepTabs:
  1113 + instring = instring.expandtabs()
  1114 + try:
  1115 + loc, tokens = self._parse( instring, 0 )
  1116 + if parseAll:
  1117 + loc = self.preParse( instring, loc )
  1118 + se = Empty() + StringEnd()
  1119 + se._parse( instring, loc )
  1120 + except ParseBaseException as exc:
  1121 + if ParserElement.verbose_stacktrace:
  1122 + raise
  1123 + else:
  1124 + # catch and re-raise exception from here, clears out pyparsing internal stack trace
  1125 + raise exc
  1126 + else:
  1127 + return tokens
  1128 +
  1129 + def scanString( self, instring, maxMatches=_MAX_INT, overlap=False ):
  1130 + """Scan the input string for expression matches. Each match will return the
  1131 + matching tokens, start location, and end location. May be called with optional
  1132 + C{maxMatches} argument, to clip scanning after 'n' matches are found. If
  1133 + C{overlap} is specified, then overlapping matches will be reported.
  1134 +
  1135 + Note that the start and end locations are reported relative to the string
  1136 + being parsed. See L{I{parseString}<parseString>} for more information on parsing
  1137 + strings with embedded tabs."""
  1138 + if not self.streamlined:
  1139 + self.streamline()
  1140 + for e in self.ignoreExprs:
  1141 + e.streamline()
  1142 +
  1143 + if not self.keepTabs:
  1144 + instring = _ustr(instring).expandtabs()
  1145 + instrlen = len(instring)
  1146 + loc = 0
  1147 + preparseFn = self.preParse
  1148 + parseFn = self._parse
  1149 + ParserElement.resetCache()
  1150 + matches = 0
  1151 + try:
  1152 + while loc <= instrlen and matches < maxMatches:
  1153 + try:
  1154 + preloc = preparseFn( instring, loc )
  1155 + nextLoc,tokens = parseFn( instring, preloc, callPreParse=False )
  1156 + except ParseException:
  1157 + loc = preloc+1
  1158 + else:
  1159 + if nextLoc > loc:
  1160 + matches += 1
  1161 + yield tokens, preloc, nextLoc
  1162 + if overlap:
  1163 + nextloc = preparseFn( instring, loc )
  1164 + if nextloc > loc:
  1165 + loc = nextLoc
  1166 + else:
  1167 + loc += 1
  1168 + else:
  1169 + loc = nextLoc
  1170 + else:
  1171 + loc = preloc+1
  1172 + except ParseBaseException as exc:
  1173 + if ParserElement.verbose_stacktrace:
  1174 + raise
  1175 + else:
  1176 + # catch and re-raise exception from here, clears out pyparsing internal stack trace
  1177 + raise exc
  1178 +
  1179 + def transformString( self, instring ):
  1180 + """Extension to C{L{scanString}}, to modify matching text with modified tokens that may
  1181 + be returned from a parse action. To use C{transformString}, define a grammar and
  1182 + attach a parse action to it that modifies the returned token list.
  1183 + Invoking C{transformString()} on a target string will then scan for matches,
  1184 + and replace the matched text patterns according to the logic in the parse
  1185 + action. C{transformString()} returns the resulting transformed string."""
  1186 + out = []
  1187 + lastE = 0
  1188 + # force preservation of <TAB>s, to minimize unwanted transformation of string, and to
  1189 + # keep string locs straight between transformString and scanString
  1190 + self.keepTabs = True
  1191 + try:
  1192 + for t,s,e in self.scanString( instring ):
  1193 + out.append( instring[lastE:s] )
  1194 + if t:
  1195 + if isinstance(t,ParseResults):
  1196 + out += t.asList()
  1197 + elif isinstance(t,list):
  1198 + out += t
  1199 + else:
  1200 + out.append(t)
  1201 + lastE = e
  1202 + out.append(instring[lastE:])
  1203 + out = [o for o in out if o]
  1204 + return "".join(map(_ustr,_flatten(out)))
  1205 + except ParseBaseException as exc:
  1206 + if ParserElement.verbose_stacktrace:
  1207 + raise
  1208 + else:
  1209 + # catch and re-raise exception from here, clears out pyparsing internal stack trace
  1210 + raise exc
  1211 +
  1212 + def searchString( self, instring, maxMatches=_MAX_INT ):
  1213 + """Another extension to C{L{scanString}}, simplifying the access to the tokens found
  1214 + to match the given parse expression. May be called with optional
  1215 + C{maxMatches} argument, to clip searching after 'n' matches are found.
  1216 + """
  1217 + try:
  1218 + return ParseResults([ t for t,s,e in self.scanString( instring, maxMatches ) ])
  1219 + except ParseBaseException as exc:
  1220 + if ParserElement.verbose_stacktrace:
  1221 + raise
  1222 + else:
  1223 + # catch and re-raise exception from here, clears out pyparsing internal stack trace
  1224 + raise exc
  1225 +
  1226 + def __add__(self, other ):
  1227 + """Implementation of + operator - returns C{L{And}}"""
  1228 + if isinstance( other, basestring ):
  1229 + other = ParserElement.literalStringClass( other )
  1230 + if not isinstance( other, ParserElement ):
  1231 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1232 + SyntaxWarning, stacklevel=2)
  1233 + return None
  1234 + return And( [ self, other ] )
  1235 +
  1236 + def __radd__(self, other ):
  1237 + """Implementation of + operator when left operand is not a C{L{ParserElement}}"""
  1238 + if isinstance( other, basestring ):
  1239 + other = ParserElement.literalStringClass( other )
  1240 + if not isinstance( other, ParserElement ):
  1241 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1242 + SyntaxWarning, stacklevel=2)
  1243 + return None
  1244 + return other + self
  1245 +
  1246 + def __sub__(self, other):
  1247 + """Implementation of - operator, returns C{L{And}} with error stop"""
  1248 + if isinstance( other, basestring ):
  1249 + other = ParserElement.literalStringClass( other )
  1250 + if not isinstance( other, ParserElement ):
  1251 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1252 + SyntaxWarning, stacklevel=2)
  1253 + return None
  1254 + return And( [ self, And._ErrorStop(), other ] )
  1255 +
  1256 + def __rsub__(self, other ):
  1257 + """Implementation of - operator when left operand is not a C{L{ParserElement}}"""
  1258 + if isinstance( other, basestring ):
  1259 + other = ParserElement.literalStringClass( other )
  1260 + if not isinstance( other, ParserElement ):
  1261 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1262 + SyntaxWarning, stacklevel=2)
  1263 + return None
  1264 + return other - self
  1265 +
  1266 + def __mul__(self,other):
  1267 + """Implementation of * operator, allows use of C{expr * 3} in place of
  1268 + C{expr + expr + expr}. Expressions may also be multiplied by a 2-integer
  1269 + tuple, similar to C{{min,max}} multipliers in regular expressions. Tuples
  1270 + may also include C{None} as in:
  1271 + - C{expr*(n,None)} or C{expr*(n,)} is equivalent
  1272 + to C{expr*n + L{ZeroOrMore}(expr)}
  1273 + (read as "at least n instances of C{expr}")
  1274 + - C{expr*(None,n)} is equivalent to C{expr*(0,n)}
  1275 + (read as "0 to n instances of C{expr}")
  1276 + - C{expr*(None,None)} is equivalent to C{L{ZeroOrMore}(expr)}
  1277 + - C{expr*(1,None)} is equivalent to C{L{OneOrMore}(expr)}
  1278 +
  1279 + Note that C{expr*(None,n)} does not raise an exception if
  1280 + more than n exprs exist in the input stream; that is,
  1281 + C{expr*(None,n)} does not enforce a maximum number of expr
  1282 + occurrences. If this behavior is desired, then write
  1283 + C{expr*(None,n) + ~expr}
  1284 +
  1285 + """
  1286 + if isinstance(other,int):
  1287 + minElements, optElements = other,0
  1288 + elif isinstance(other,tuple):
  1289 + other = (other + (None, None))[:2]
  1290 + if other[0] is None:
  1291 + other = (0, other[1])
  1292 + if isinstance(other[0],int) and other[1] is None:
  1293 + if other[0] == 0:
  1294 + return ZeroOrMore(self)
  1295 + if other[0] == 1:
  1296 + return OneOrMore(self)
  1297 + else:
  1298 + return self*other[0] + ZeroOrMore(self)
  1299 + elif isinstance(other[0],int) and isinstance(other[1],int):
  1300 + minElements, optElements = other
  1301 + optElements -= minElements
  1302 + else:
  1303 + raise TypeError("cannot multiply 'ParserElement' and ('%s','%s') objects" % (type(other[0]),type(other[1])))
  1304 + else:
  1305 + raise TypeError("cannot multiply 'ParserElement' and '%s' objects" % type(other))
  1306 +
  1307 + if minElements < 0:
  1308 + raise ValueError("cannot multiply ParserElement by negative value")
  1309 + if optElements < 0:
  1310 + raise ValueError("second tuple value must be greater than or equal to first tuple value")
  1311 + if minElements == optElements == 0:
  1312 + raise ValueError("cannot multiply ParserElement by 0 or (0,0)")
  1313 +
  1314 + if (optElements):
  1315 + def makeOptionalList(n):
  1316 + if n>1:
  1317 + return Optional(self + makeOptionalList(n-1))
  1318 + else:
  1319 + return Optional(self)
  1320 + if minElements:
  1321 + if minElements == 1:
  1322 + ret = self + makeOptionalList(optElements)
  1323 + else:
  1324 + ret = And([self]*minElements) + makeOptionalList(optElements)
  1325 + else:
  1326 + ret = makeOptionalList(optElements)
  1327 + else:
  1328 + if minElements == 1:
  1329 + ret = self
  1330 + else:
  1331 + ret = And([self]*minElements)
  1332 + return ret
  1333 +
  1334 + def __rmul__(self, other):
  1335 + return self.__mul__(other)
  1336 +
  1337 + def __or__(self, other ):
  1338 + """Implementation of | operator - returns C{L{MatchFirst}}"""
  1339 + if isinstance( other, basestring ):
  1340 + other = ParserElement.literalStringClass( other )
  1341 + if not isinstance( other, ParserElement ):
  1342 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1343 + SyntaxWarning, stacklevel=2)
  1344 + return None
  1345 + return MatchFirst( [ self, other ] )
  1346 +
  1347 + def __ror__(self, other ):
  1348 + """Implementation of | operator when left operand is not a C{L{ParserElement}}"""
  1349 + if isinstance( other, basestring ):
  1350 + other = ParserElement.literalStringClass( other )
  1351 + if not isinstance( other, ParserElement ):
  1352 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1353 + SyntaxWarning, stacklevel=2)
  1354 + return None
  1355 + return other | self
  1356 +
  1357 + def __xor__(self, other ):
  1358 + """Implementation of ^ operator - returns C{L{Or}}"""
  1359 + if isinstance( other, basestring ):
  1360 + other = ParserElement.literalStringClass( other )
  1361 + if not isinstance( other, ParserElement ):
  1362 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1363 + SyntaxWarning, stacklevel=2)
  1364 + return None
  1365 + return Or( [ self, other ] )
  1366 +
  1367 + def __rxor__(self, other ):
  1368 + """Implementation of ^ operator when left operand is not a C{L{ParserElement}}"""
  1369 + if isinstance( other, basestring ):
  1370 + other = ParserElement.literalStringClass( other )
  1371 + if not isinstance( other, ParserElement ):
  1372 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1373 + SyntaxWarning, stacklevel=2)
  1374 + return None
  1375 + return other ^ self
  1376 +
  1377 + def __and__(self, other ):
  1378 + """Implementation of & operator - returns C{L{Each}}"""
  1379 + if isinstance( other, basestring ):
  1380 + other = ParserElement.literalStringClass( other )
  1381 + if not isinstance( other, ParserElement ):
  1382 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1383 + SyntaxWarning, stacklevel=2)
  1384 + return None
  1385 + return Each( [ self, other ] )
  1386 +
  1387 + def __rand__(self, other ):
  1388 + """Implementation of & operator when left operand is not a C{L{ParserElement}}"""
  1389 + if isinstance( other, basestring ):
  1390 + other = ParserElement.literalStringClass( other )
  1391 + if not isinstance( other, ParserElement ):
  1392 + warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
  1393 + SyntaxWarning, stacklevel=2)
  1394 + return None
  1395 + return other & self
  1396 +
  1397 + def __invert__( self ):
  1398 + """Implementation of ~ operator - returns C{L{NotAny}}"""
  1399 + return NotAny( self )
  1400 +
  1401 + def __call__(self, name=None):
  1402 + """Shortcut for C{L{setResultsName}}, with C{listAllMatches=default}::
  1403 + userdata = Word(alphas).setResultsName("name") + Word(nums+"-").setResultsName("socsecno")
  1404 + could be written as::
  1405 + userdata = Word(alphas)("name") + Word(nums+"-")("socsecno")
  1406 +
  1407 + If C{name} is given with a trailing C{'*'} character, then C{listAllMatches} will be
  1408 + passed as C{True}.
  1409 +
  1410 + If C{name} is omitted, same as calling C{L{copy}}.
  1411 + """
  1412 + if name is not None:
  1413 + return self.setResultsName(name)
  1414 + else:
  1415 + return self.copy()
  1416 +
  1417 + def suppress( self ):
  1418 + """Suppresses the output of this C{ParserElement}; useful to keep punctuation from
  1419 + cluttering up returned output.
  1420 + """
  1421 + return Suppress( self )
  1422 +
  1423 + def leaveWhitespace( self ):
  1424 + """Disables the skipping of whitespace before matching the characters in the
  1425 + C{ParserElement}'s defined pattern. This is normally only used internally by
  1426 + the pyparsing module, but may be needed in some whitespace-sensitive grammars.
  1427 + """
  1428 + self.skipWhitespace = False
  1429 + return self
  1430 +
  1431 + def setWhitespaceChars( self, chars ):
  1432 + """Overrides the default whitespace chars
  1433 + """
  1434 + self.skipWhitespace = True
  1435 + self.whiteChars = chars
  1436 + self.copyDefaultWhiteChars = False
  1437 + return self
  1438 +
  1439 + def parseWithTabs( self ):
  1440 + """Overrides default behavior to expand C{<TAB>}s to spaces before parsing the input string.
  1441 + Must be called before C{parseString} when the input grammar contains elements that
  1442 + match C{<TAB>} characters."""
  1443 + self.keepTabs = True
  1444 + return self
  1445 +
  1446 + def ignore( self, other ):
  1447 + """Define expression to be ignored (e.g., comments) while doing pattern
  1448 + matching; may be called repeatedly, to define multiple comment or other
  1449 + ignorable patterns.
  1450 + """
  1451 + if isinstance( other, Suppress ):
  1452 + if other not in self.ignoreExprs:
  1453 + self.ignoreExprs.append( other.copy() )
  1454 + else:
  1455 + self.ignoreExprs.append( Suppress( other.copy() ) )
  1456 + return self
  1457 +
  1458 + def setDebugActions( self, startAction, successAction, exceptionAction ):
  1459 + """Enable display of debugging messages while doing pattern matching."""
  1460 + self.debugActions = (startAction or _defaultStartDebugAction,
  1461 + successAction or _defaultSuccessDebugAction,
  1462 + exceptionAction or _defaultExceptionDebugAction)
  1463 + self.debug = True
  1464 + return self
  1465 +
  1466 + def setDebug( self, flag=True ):
  1467 + """Enable display of debugging messages while doing pattern matching.
  1468 + Set C{flag} to True to enable, False to disable."""
  1469 + if flag:
  1470 + self.setDebugActions( _defaultStartDebugAction, _defaultSuccessDebugAction, _defaultExceptionDebugAction )
  1471 + else:
  1472 + self.debug = False
  1473 + return self
  1474 +
  1475 + def __str__( self ):
  1476 + return self.name
  1477 +
  1478 + def __repr__( self ):
  1479 + return _ustr(self)
  1480 +
  1481 + def streamline( self ):
  1482 + self.streamlined = True
  1483 + self.strRepr = None
  1484 + return self
  1485 +
  1486 + def checkRecursion( self, parseElementList ):
  1487 + pass
  1488 +
  1489 + def validate( self, validateTrace=[] ):
  1490 + """Check defined expressions for valid structure, check for infinite recursive definitions."""
  1491 + self.checkRecursion( [] )
  1492 +
  1493 + def parseFile( self, file_or_filename, parseAll=False ):
  1494 + """Execute the parse expression on the given file or filename.
  1495 + If a filename is specified (instead of a file object),
  1496 + the entire file is opened, read, and closed before parsing.
  1497 + """
  1498 + try:
  1499 + file_contents = file_or_filename.read()
  1500 + except AttributeError:
  1501 + f = open(file_or_filename, "r")
  1502 + file_contents = f.read()
  1503 + f.close()
  1504 + try:
  1505 + return self.parseString(file_contents, parseAll)
  1506 + except ParseBaseException as exc:
  1507 + if ParserElement.verbose_stacktrace:
  1508 + raise
  1509 + else:
  1510 + # catch and re-raise exception from here, clears out pyparsing internal stack trace
  1511 + raise exc
  1512 +
  1513 + def __eq__(self,other):
  1514 + if isinstance(other, ParserElement):
  1515 + return self is other or self.__dict__ == other.__dict__
  1516 + elif isinstance(other, basestring):
  1517 + try:
  1518 + self.parseString(_ustr(other), parseAll=True)
  1519 + return True
  1520 + except ParseBaseException:
  1521 + return False
  1522 + else:
  1523 + return super(ParserElement,self)==other
  1524 +
  1525 + def __ne__(self,other):
  1526 + return not (self == other)
  1527 +
  1528 + def __hash__(self):
  1529 + return hash(id(self))
  1530 +
  1531 + def __req__(self,other):
  1532 + return self == other
  1533 +
  1534 + def __rne__(self,other):
  1535 + return not (self == other)
  1536 +
  1537 +
  1538 +class Token(ParserElement):
  1539 + """Abstract C{ParserElement} subclass, for defining atomic matching patterns."""
  1540 + def __init__( self ):
  1541 + super(Token,self).__init__( savelist=False )
  1542 +
  1543 + def setName(self, name):
  1544 + s = super(Token,self).setName(name)
  1545 + self.errmsg = "Expected " + self.name
  1546 + return s
  1547 +
  1548 +
  1549 +class Empty(Token):
  1550 + """An empty token, will always match."""
  1551 + def __init__( self ):
  1552 + super(Empty,self).__init__()
  1553 + self.name = "Empty"
  1554 + self.mayReturnEmpty = True
  1555 + self.mayIndexError = False
  1556 +
  1557 +
  1558 +class NoMatch(Token):
  1559 + """A token that will never match."""
  1560 + def __init__( self ):
  1561 + super(NoMatch,self).__init__()
  1562 + self.name = "NoMatch"
  1563 + self.mayReturnEmpty = True
  1564 + self.mayIndexError = False
  1565 + self.errmsg = "Unmatchable token"
  1566 +
  1567 + def parseImpl( self, instring, loc, doActions=True ):
  1568 + raise ParseException(instring, loc, self.errmsg, self)
  1569 +
  1570 +
  1571 +class Literal(Token):
  1572 + """Token to exactly match a specified string."""
  1573 + def __init__( self, matchString ):
  1574 + super(Literal,self).__init__()
  1575 + self.match = matchString
  1576 + self.matchLen = len(matchString)
  1577 + try:
  1578 + self.firstMatchChar = matchString[0]
  1579 + except IndexError:
  1580 + warnings.warn("null string passed to Literal; use Empty() instead",
  1581 + SyntaxWarning, stacklevel=2)
  1582 + self.__class__ = Empty
  1583 + self.name = '"%s"' % _ustr(self.match)
  1584 + self.errmsg = "Expected " + self.name
  1585 + self.mayReturnEmpty = False
  1586 + self.mayIndexError = False
  1587 +
  1588 + # Performance tuning: this routine gets called a *lot*
  1589 + # if this is a single character match string and the first character matches,
  1590 + # short-circuit as quickly as possible, and avoid calling startswith
  1591 + #~ @profile
  1592 + def parseImpl( self, instring, loc, doActions=True ):
  1593 + if (instring[loc] == self.firstMatchChar and
  1594 + (self.matchLen==1 or instring.startswith(self.match,loc)) ):
  1595 + return loc+self.matchLen, self.match
  1596 + raise ParseException(instring, loc, self.errmsg, self)
  1597 +_L = Literal
  1598 +ParserElement.literalStringClass = Literal
  1599 +
  1600 +class Keyword(Token):
  1601 + """Token to exactly match a specified string as a keyword, that is, it must not be
  1602 + immediately preceded or followed by a keyword character. Compare with C{L{Literal}}::
  1603 + Literal("if") will match the leading C{'if'} in C{'ifAndOnlyIf'}.
  1604 + Keyword("if") will not; it will only match the leading C{'if'} in C{'if x=1'}, or C{'if(y==2)'}
  1605 + Accepts two optional constructor arguments in addition to the keyword string:
  1606 + C{identChars} is a string of characters that would be valid identifier characters,
  1607 + defaulting to all alphanumerics + "_" and "$"; C{caseless} allows case-insensitive
  1608 + matching, default is C{False}.
  1609 + """
  1610 + DEFAULT_KEYWORD_CHARS = alphanums+"_$"
  1611 +
  1612 + def __init__( self, matchString, identChars=DEFAULT_KEYWORD_CHARS, caseless=False ):
  1613 + super(Keyword,self).__init__()
  1614 + self.match = matchString
  1615 + self.matchLen = len(matchString)
  1616 + try:
  1617 + self.firstMatchChar = matchString[0]
  1618 + except IndexError:
  1619 + warnings.warn("null string passed to Keyword; use Empty() instead",
  1620 + SyntaxWarning, stacklevel=2)
  1621 + self.name = '"%s"' % self.match
  1622 + self.errmsg = "Expected " + self.name
  1623 + self.mayReturnEmpty = False
  1624 + self.mayIndexError = False
  1625 + self.caseless = caseless
  1626 + if caseless:
  1627 + self.caselessmatch = matchString.upper()
  1628 + identChars = identChars.upper()
  1629 + self.identChars = set(identChars)
  1630 +
  1631 + def parseImpl( self, instring, loc, doActions=True ):
  1632 + if self.caseless:
  1633 + if ( (instring[ loc:loc+self.matchLen ].upper() == self.caselessmatch) and
  1634 + (loc >= len(instring)-self.matchLen or instring[loc+self.matchLen].upper() not in self.identChars) and
  1635 + (loc == 0 or instring[loc-1].upper() not in self.identChars) ):
  1636 + return loc+self.matchLen, self.match
  1637 + else:
  1638 + if (instring[loc] == self.firstMatchChar and
  1639 + (self.matchLen==1 or instring.startswith(self.match,loc)) and
  1640 + (loc >= len(instring)-self.matchLen or instring[loc+self.matchLen] not in self.identChars) and
  1641 + (loc == 0 or instring[loc-1] not in self.identChars) ):
  1642 + return loc+self.matchLen, self.match
  1643 + raise ParseException(instring, loc, self.errmsg, self)
  1644 +
  1645 + def copy(self):
  1646 + c = super(Keyword,self).copy()
  1647 + c.identChars = Keyword.DEFAULT_KEYWORD_CHARS
  1648 + return c
  1649 +
  1650 + def setDefaultKeywordChars( chars ):
  1651 + """Overrides the default Keyword chars
  1652 + """
  1653 + Keyword.DEFAULT_KEYWORD_CHARS = chars
  1654 + setDefaultKeywordChars = staticmethod(setDefaultKeywordChars)
  1655 +
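The difference between Literal and Keyword matching described in the docstring above can be sketched without pyparsing itself. This is a minimal standalone illustration, not the library's code: the helper names literal_matches and keyword_matches and the IDENT_CHARS constant are hypothetical, and IDENT_CHARS is assumed to mirror the default of alphanumerics plus "_" and "$".

```python
import string

# Assumed stand-in for Keyword.DEFAULT_KEYWORD_CHARS (alphanumerics + "_$").
IDENT_CHARS = set(string.ascii_letters + string.digits + "_$")


def literal_matches(text, loc, match):
    """Literal-style check: only the text starting at loc matters."""
    return text.startswith(match, loc)


def keyword_matches(text, loc, match, ident_chars=IDENT_CHARS):
    """Keyword-style check: the match must not be embedded in an identifier."""
    if not text.startswith(match, loc):
        return False
    end = loc + len(match)
    # The character after the match (if any) must not be an identifier character.
    followed_ok = end >= len(text) or text[end] not in ident_chars
    # The character before the match (if any) must not be one either.
    preceded_ok = loc == 0 or text[loc - 1] not in ident_chars
    return followed_ok and preceded_ok


if __name__ == "__main__":
    print(literal_matches("ifAndOnlyIf", 0, "if"))
    print(keyword_matches("ifAndOnlyIf", 0, "if"))
    print(keyword_matches("if(y==2)", 0, "if"))
```

The two boundary tests correspond to the last two conditions in the case-sensitive branch of Keyword.parseImpl.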
  1656 +class CaselessLiteral(Literal):
  1657 + """Token to match a specified string, ignoring case of letters.
  1658 + Note: the matched results will always be in the case of the given
  1659 + match string, NOT the case of the input text.
  1660 + """
  1661 + def __init__( self, matchString ):
  1662 + super(CaselessLiteral,self).__init__( matchString.upper() )
  1663 + # Preserve the defining literal.
  1664 + self.returnString = matchString
  1665 + self.name = "'%s'" % self.returnString
  1666 + self.errmsg = "Expected " + self.name
  1667 +
  1668 + def parseImpl( self, instring, loc, doActions=True ):
  1669 + if instring[ loc:loc+self.matchLen ].upper() == self.match:
  1670 + return loc+self.matchLen, self.returnString
  1671 + raise ParseException(instring, loc, self.errmsg, self)
  1672 +
  1673 +class CaselessKeyword(Keyword):
  1674 + def __init__( self, matchString, identChars=Keyword.DEFAULT_KEYWORD_CHARS ):
  1675 + super(CaselessKeyword,self).__init__( matchString, identChars, caseless=True )
  1676 +
  1677 + def parseImpl( self, instring, loc, doActions=True ):
  1678 + if ( (instring[ loc:loc+self.matchLen ].upper() == self.caselessmatch) and
  1679 + (loc >= len(instring)-self.matchLen or instring[loc+self.matchLen].upper() not in self.identChars) ):
  1680 + return loc+self.matchLen, self.match
  1681 + raise ParseException(instring, loc, self.errmsg, self)
  1682 +
  1683 +class Word(Token):
  1684 + """Token for matching words composed of allowed character sets.
  1685 + Defined with string containing all allowed initial characters,
  1686 + an optional string containing allowed body characters (if omitted,
  1687 + defaults to the initial character set), and an optional minimum,
  1688 + maximum, and/or exact length. The default value for C{min} is 1 (a
  1689 + minimum value < 1 is not valid); the default values for C{max} and C{exact}
  1690 + are 0, meaning no maximum or exact length restriction. An optional
  1691 + C{excludeChars} parameter can list characters to be removed from the
  1692 + C{initChars} and C{bodyChars} strings; useful to define a word of all printables
  1693 + except for one or two characters, for instance.
  1694 + """
  1695 + def __init__( self, initChars, bodyChars=None, min=1, max=0, exact=0, asKeyword=False, excludeChars=None ):
  1696 + super(Word,self).__init__()
  1697 + if excludeChars:
  1698 + initChars = ''.join(c for c in initChars if c not in excludeChars)
  1699 + if bodyChars:
  1700 + bodyChars = ''.join(c for c in bodyChars if c not in excludeChars)
  1701 + self.initCharsOrig = initChars
  1702 + self.initChars = set(initChars)
  1703 + if bodyChars :
  1704 + self.bodyCharsOrig = bodyChars
  1705 + self.bodyChars = set(bodyChars)
  1706 + else:
  1707 + self.bodyCharsOrig = initChars
  1708 + self.bodyChars = set(initChars)
  1709 +
  1710 + self.maxSpecified = max > 0
  1711 +
  1712 + if min < 1:
  1713 + raise ValueError("cannot specify a minimum length < 1; use Optional(Word()) if zero-length word is permitted")
  1714 +
  1715 + self.minLen = min
  1716 +
  1717 + if max > 0:
  1718 + self.maxLen = max
  1719 + else:
  1720 + self.maxLen = _MAX_INT
  1721 +
  1722 + if exact > 0:
  1723 + self.maxLen = exact
  1724 + self.minLen = exact
  1725 +
  1726 + self.name = _ustr(self)
  1727 + self.errmsg = "Expected " + self.name
  1728 + self.mayIndexError = False
  1729 + self.asKeyword = asKeyword
  1730 +
  1731 + if ' ' not in self.initCharsOrig+self.bodyCharsOrig and (min==1 and max==0 and exact==0):
  1732 + if self.bodyCharsOrig == self.initCharsOrig:
  1733 + self.reString = "[%s]+" % _escapeRegexRangeChars(self.initCharsOrig)
  1734 + elif len(self.bodyCharsOrig) == 1:
  1735 + self.reString = "%s[%s]*" % \
  1736 + (re.escape(self.initCharsOrig),
  1737 + _escapeRegexRangeChars(self.bodyCharsOrig),)
  1738 + else:
  1739 + self.reString = "[%s][%s]*" % \
  1740 + (_escapeRegexRangeChars(self.initCharsOrig),
  1741 + _escapeRegexRangeChars(self.bodyCharsOrig),)
  1742 + if self.asKeyword:
  1743 + self.reString = r"\b"+self.reString+r"\b"
  1744 + try:
  1745 + self.re = re.compile( self.reString )
  1746 + except:
  1747 + self.re = None
  1748 +
  1749 + def parseImpl( self, instring, loc, doActions=True ):
  1750 + if self.re:
  1751 + result = self.re.match(instring,loc)
  1752 + if not result:
  1753 + raise ParseException(instring, loc, self.errmsg, self)
  1754 +
  1755 + loc = result.end()
  1756 + return loc, result.group()
  1757 +
  1758 + if not(instring[ loc ] in self.initChars):
  1759 + raise ParseException(instring, loc, self.errmsg, self)
  1760 +
  1761 + start = loc
  1762 + loc += 1
  1763 + instrlen = len(instring)
  1764 + bodychars = self.bodyChars
  1765 + maxloc = start + self.maxLen
  1766 + maxloc = min( maxloc, instrlen )
  1767 + while loc < maxloc and instring[loc] in bodychars:
  1768 + loc += 1
  1769 +
  1770 + throwException = False
  1771 + if loc - start < self.minLen:
  1772 + throwException = True
  1773 + if self.maxSpecified and loc < instrlen and instring[loc] in bodychars:
  1774 + throwException = True
  1775 + if self.asKeyword:
  1776 + if (start>0 and instring[start-1] in bodychars) or (loc<instrlen and instring[loc] in bodychars):
  1777 + throwException = True
  1778 +
  1779 + if throwException:
  1780 + raise ParseException(instring, loc, self.errmsg, self)
  1781 +
  1782 + return loc, instring[start:loc]
  1783 +
  1784 + def __str__( self ):
  1785 + try:
  1786 + return super(Word,self).__str__()
  1787 + except:
  1788 + pass
  1789 +
  1790 +
  1791 + if self.strRepr is None:
  1792 +
  1793 + def charsAsStr(s):
  1794 + if len(s)>4:
  1795 + return s[:4]+"..."
  1796 + else:
  1797 + return s
  1798 +
  1799 + if ( self.initCharsOrig != self.bodyCharsOrig ):
  1800 + self.strRepr = "W:(%s,%s)" % ( charsAsStr(self.initCharsOrig), charsAsStr(self.bodyCharsOrig) )
  1801 + else:
  1802 + self.strRepr = "W:(%s)" % charsAsStr(self.initCharsOrig)
  1803 +
  1804 + return self.strRepr
  1805 +
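The non-regex path of Word.parseImpl above is a plain character scan with length checks. The following is a rough standalone sketch of that scan, assuming a hypothetical helper named scan_word that raises ValueError where the library raises ParseException; unlike the original, it also checks bounds before reading the first character.

```python
import string


def scan_word(text, loc, init_chars, body_chars=None, min_len=1, max_len=0):
    """Return (new_loc, matched_text), or raise ValueError if no word matches at loc."""
    body_chars = body_chars or init_chars
    if loc >= len(text) or text[loc] not in init_chars:
        raise ValueError("expected word at position %d" % loc)

    start = loc
    loc += 1
    # A max_len of 0 means "no maximum", as in the Word constructor.
    limit = len(text) if max_len <= 0 else min(start + max_len, len(text))
    while loc < limit and text[loc] in body_chars:
        loc += 1

    if loc - start < min_len:
        raise ValueError("word shorter than %d characters" % min_len)
    # With an explicit maximum, a longer run of body characters is a failure,
    # not a truncated match.
    if max_len > 0 and loc < len(text) and text[loc] in body_chars:
        raise ValueError("word longer than %d characters" % max_len)
    return loc, text[start:loc]


if __name__ == "__main__":
    letters = string.ascii_letters
    print(scan_word("abc123 x", 0, letters, letters + string.digits))
```

Note the last check: a word that runs past an explicit maximum is rejected rather than cut short, which matches the maxSpecified test in the original.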
  1806 +
  1807 +class Regex(Token):
  1808 + """Token for matching strings that match a given regular expression.
  1809 + Defined with a string specifying the regular expression in a form recognized by the built-in Python C{re} module.
  1810 + """
  1811 + compiledREtype = type(re.compile("[A-Z]"))
  1812 + def __init__( self, pattern, flags=0):
  1813 + """The parameters C{pattern} and C{flags} are passed to the C{re.compile()} function as-is. See the Python C{re} module for an explanation of the acceptable patterns and flags."""
  1814 + super(Regex,self).__init__()
  1815 +
  1816 + if isinstance(pattern, basestring):
  1817 + if len(pattern) == 0:
  1818 + warnings.warn("null string passed to Regex; use Empty() instead",
  1819 + SyntaxWarning, stacklevel=2)
  1820 +
  1821 + self.pattern = pattern
  1822 + self.flags = flags
  1823 +
  1824 + try:
  1825 + self.re = re.compile(self.pattern, self.flags)
  1826 + self.reString = self.pattern
  1827 + except sre_constants.error:
  1828 + warnings.warn("invalid pattern (%s) passed to Regex" % pattern,
  1829 + SyntaxWarning, stacklevel=2)
  1830 + raise
  1831 +
  1832 + elif isinstance(pattern, Regex.compiledREtype):
  1833 + self.re = pattern
  1834 + self.pattern = \
  1835 + self.reString = str(pattern)
  1836 + self.flags = flags
  1837 +
  1838 + else:
  1839 + raise ValueError("Regex may only be constructed with a string or a compiled RE object")
  1840 +
  1841 + self.name = _ustr(self)
  1842 + self.errmsg = "Expected " + self.name
  1843 + self.mayIndexError = False
  1844 + self.mayReturnEmpty = True
  1845 +
  1846 + def parseImpl( self, instring, loc, doActions=True ):
  1847 + result = self.re.match(instring,loc)
  1848 + if not result:
  1849 + raise ParseException(instring, loc, self.errmsg, self)
  1850 +
  1851 + loc = result.end()
  1852 + d = result.groupdict()
  1853 + ret = ParseResults(result.group())
  1854 + if d:
  1855 + for k in d:
  1856 + ret[k] = d[k]
  1857 + return loc,ret
  1858 +
  1859 + def __str__( self ):
  1860 + try:
  1861 + return super(Regex,self).__str__()
  1862 + except:
  1863 + pass
  1864 +
  1865 + if self.strRepr is None:
  1866 + self.strRepr = "Re:(%s)" % repr(self.pattern)
  1867 +
  1868 + return self.strRepr
  1869 +
  1870 +
  1871 +class QuotedString(Token):
  1872 + """Token for matching strings that are delimited by quoting characters.
  1873 + """
  1874 + def __init__( self, quoteChar, escChar=None, escQuote=None, multiline=False, unquoteResults=True, endQuoteChar=None):
  1875 + """
  1876 + Defined with the following parameters:
  1877 + - quoteChar - string of one or more characters defining the quote delimiting string
  1878 + - escChar - character to escape quotes, typically backslash (default=None)
  1879 + - escQuote - special quote sequence to escape an embedded quote string (such as SQL's "" to escape an embedded ") (default=None)
  1880 + - multiline - boolean indicating whether quotes can span multiple lines (default=C{False})
  1881 + - unquoteResults - boolean indicating whether the matched text should be unquoted (default=C{True})
  1882 + - endQuoteChar - string of one or more characters defining the end of the quote delimited string (default=C{None} => same as quoteChar)
  1883 + """
  1884 + super(QuotedString,self).__init__()
  1885 +
  1886 + # remove white space from quote chars - won't work anyway
  1887 + quoteChar = quoteChar.strip()
  1888 + if len(quoteChar) == 0:
  1889 + warnings.warn("quoteChar cannot be the empty string",SyntaxWarning,stacklevel=2)
  1890 + raise SyntaxError()
  1891 +
  1892 + if endQuoteChar is None:
  1893 + endQuoteChar = quoteChar
  1894 + else:
  1895 + endQuoteChar = endQuoteChar.strip()
  1896 + if len(endQuoteChar) == 0:
  1897 + warnings.warn("endQuoteChar cannot be the empty string",SyntaxWarning,stacklevel=2)
  1898 + raise SyntaxError()
  1899 +
  1900 + self.quoteChar = quoteChar
  1901 + self.quoteCharLen = len(quoteChar)
  1902 + self.firstQuoteChar = quoteChar[0]
  1903 + self.endQuoteChar = endQuoteChar
  1904 + self.endQuoteCharLen = len(endQuoteChar)
  1905 + self.escChar = escChar
  1906 + self.escQuote = escQuote
  1907 + self.unquoteResults = unquoteResults
  1908 +
  1909 + if multiline:
  1910 + self.flags = re.MULTILINE | re.DOTALL
  1911 + self.pattern = r'%s(?:[^%s%s]' % \
  1912 + ( re.escape(self.quoteChar),
  1913 + _escapeRegexRangeChars(self.endQuoteChar[0]),
  1914 + (escChar is not None and _escapeRegexRangeChars(escChar) or '') )
  1915 + else:
  1916 + self.flags = 0
  1917 + self.pattern = r'%s(?:[^%s\n\r%s]' % \
  1918 + ( re.escape(self.quoteChar),
  1919 + _escapeRegexRangeChars(self.endQuoteChar[0]),
  1920 + (escChar is not None and _escapeRegexRangeChars(escChar) or '') )
  1921 + if len(self.endQuoteChar) > 1:
  1922 + self.pattern += (
  1923 + '|(?:' + ')|(?:'.join("%s[^%s]" % (re.escape(self.endQuoteChar[:i]),
  1924 + _escapeRegexRangeChars(self.endQuoteChar[i]))
  1925 + for i in range(len(self.endQuoteChar)-1,0,-1)) + ')'
  1926 + )
  1927 + if escQuote:
  1928 + self.pattern += (r'|(?:%s)' % re.escape(escQuote))
  1929 + if escChar:
  1930 + self.pattern += (r'|(?:%s.)' % re.escape(escChar))
  1931 + self.escCharReplacePattern = re.escape(self.escChar)+"(.)"
  1932 + self.pattern += (r')*%s' % re.escape(self.endQuoteChar))
  1933 +
  1934 + try:
  1935 + self.re = re.compile(self.pattern, self.flags)
  1936 + self.reString = self.pattern
  1937 + except sre_constants.error:
  1938 + warnings.warn("invalid pattern (%s) passed to Regex" % self.pattern,
  1939 + SyntaxWarning, stacklevel=2)
  1940 + raise
  1941 +
  1942 + self.name = _ustr(self)
  1943 + self.errmsg = "Expected " + self.name
  1944 + self.mayIndexError = False
  1945 + self.mayReturnEmpty = True
  1946 +
  1947 + def parseImpl( self, instring, loc, doActions=True ):
  1948 + result = instring[loc] == self.firstQuoteChar and self.re.match(instring,loc) or None
  1949 + if not result:
  1950 + raise ParseException(instring, loc, self.errmsg, self)
  1951 +
  1952 + loc = result.end()
  1953 + ret = result.group()
  1954 +
  1955 + if self.unquoteResults:
  1956 +
  1957 + # strip off quotes
  1958 + ret = ret[self.quoteCharLen:-self.endQuoteCharLen]
  1959 +
  1960 + if isinstance(ret,basestring):
  1961 + # replace escaped characters
  1962 + if self.escChar:
  1963 + ret = re.sub(self.escCharReplacePattern,r"\g<1>",ret)
  1964 +
  1965 + # replace escaped quotes
  1966 + if self.escQuote:
  1967 + ret = ret.replace(self.escQuote, self.endQuoteChar)
  1968 +
  1969 + return loc, ret
  1970 +
  1971 + def __str__( self ):
  1972 + try:
  1973 + return super(QuotedString,self).__str__()
  1974 + except:
  1975 + pass
  1976 +
  1977 + if self.strRepr is None:
  1978 + self.strRepr = "quoted string, starting with %s ending with %s" % (self.quoteChar, self.endQuoteChar)
  1979 +
  1980 + return self.strRepr
  1981 +
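The unquoting step in QuotedString.parseImpl above (strip the delimiters, drop escape characters, collapse escaped quotes) can be tried in isolation. This is a minimal sketch using only the re module; the function name unquote and its keyword arguments are hypothetical and merely modelled on the constructor parameters.

```python
import re


def unquote(matched, quote_char='"', esc_char=None, esc_quote=None, end_quote_char=None):
    """Apply the same three unquoting steps to an already-matched quoted string."""
    end_quote_char = end_quote_char or quote_char

    # 1. strip off the opening and closing quote strings
    ret = matched[len(quote_char):-len(end_quote_char)]

    # 2. replace each escaped character with the character itself
    if esc_char:
        ret = re.sub(re.escape(esc_char) + "(.)", r"\g<1>", ret)

    # 3. replace each escaped-quote sequence with a single closing quote
    if esc_quote:
        ret = ret.replace(esc_quote, end_quote_char)

    return ret


if __name__ == "__main__":
    print(unquote('"a\\"b"', esc_char="\\"))
    print(unquote("'it''s'", quote_char="'", esc_quote="''"))
```

The first call models backslash escaping, the second the SQL-style doubled quote mentioned in the constructor docstring.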
  1982 +
  1983 +class CharsNotIn(Token):
  1984 + """Token for matching words composed of characters *not* in a given set.
  1985 + Defined with string containing all disallowed characters, and an optional
  1986 + minimum, maximum, and/or exact length. The default value for C{min} is 1 (a
  1987 + minimum value < 1 is not valid); the default values for C{max} and C{exact}
  1988 + are 0, meaning no maximum or exact length restriction.
  1989 + """
  1990 + def __init__( self, notChars, min=1, max=0, exact=0 ):
  1991 + super(CharsNotIn,self).__init__()
  1992 + self.skipWhitespace = False
  1993 + self.notChars = notChars
  1994 +
  1995 + if min < 1:
  1996 + raise ValueError("cannot specify a minimum length < 1; use Optional(CharsNotIn()) if zero-length char group is permitted")
  1997 +
  1998 + self.minLen = min
  1999 +
  2000 + if max > 0:
  2001 + self.maxLen = max
  2002 + else:
  2003 + self.maxLen = _MAX_INT
  2004 +
  2005 + if exact > 0:
  2006 + self.maxLen = exact
  2007 + self.minLen = exact
  2008 +
  2009 + self.name = _ustr(self)
  2010 + self.errmsg = "Expected " + self.name
  2011 + self.mayReturnEmpty = ( self.minLen == 0 )
  2012 + self.mayIndexError = False
  2013 +
  2014 + def parseImpl( self, instring, loc, doActions=True ):
  2015 + if instring[loc] in self.notChars:
  2016 + raise ParseException(instring, loc, self.errmsg, self)
  2017 +
  2018 + start = loc
  2019 + loc += 1
  2020 + notchars = self.notChars
  2021 + maxlen = min( start+self.maxLen, len(instring) )
  2022 + while loc < maxlen and \
  2023 + (instring[loc] not in notchars):
  2024 + loc += 1
  2025 +
  2026 + if loc - start < self.minLen:
  2027 + raise ParseException(instring, loc, self.errmsg, self)
  2028 +
  2029 + return loc, instring[start:loc]
  2030 +
  2031 + def __str__( self ):
  2032 + try:
  2033 + return super(CharsNotIn, self).__str__()
  2034 + except:
  2035 + pass
  2036 +
  2037 + if self.strRepr is None:
  2038 + if len(self.notChars) > 4:
  2039 + self.strRepr = "!W:(%s...)" % self.notChars[:4]
  2040 + else:
  2041 + self.strRepr = "!W:(%s)" % self.notChars
  2042 +
  2043 + return self.strRepr
  2044 +
  2045 +class White(Token):
  2046 + """Special matching class for matching whitespace. Normally, whitespace is ignored
  2047 + by pyparsing grammars. This class is included when some whitespace structures
  2048 + are significant. Define with a string containing the whitespace characters to be
  2049 + matched; default is C{" \\t\\r\\n"}. Also takes optional C{min}, C{max}, and C{exact} arguments,
  2050 + as defined for the C{L{Word}} class."""
  2051 + whiteStrs = {
  2052 + " " : "<SPC>",
  2053 + "\t": "<TAB>",
  2054 + "\n": "<LF>",
  2055 + "\r": "<CR>",
  2056 + "\f": "<FF>",
  2057 + }
  2058 + def __init__(self, ws=" \t\r\n", min=1, max=0, exact=0):
  2059 + super(White,self).__init__()
  2060 + self.matchWhite = ws
  2061 + self.setWhitespaceChars( "".join(c for c in self.whiteChars if c not in self.matchWhite) )
  2062 + #~ self.leaveWhitespace()
  2063 + self.name = ("".join(White.whiteStrs[c] for c in self.matchWhite))
  2064 + self.mayReturnEmpty = True
  2065 + self.errmsg = "Expected " + self.name
  2066 +
  2067 + self.minLen = min
  2068 +
  2069 + if max > 0:
  2070 + self.maxLen = max
  2071 + else:
  2072 + self.maxLen = _MAX_INT
  2073 +
  2074 + if exact > 0:
  2075 + self.maxLen = exact
  2076 + self.minLen = exact
  2077 +
  2078 + def parseImpl( self, instring, loc, doActions=True ):
  2079 + if not(instring[ loc ] in self.matchWhite):
  2080 + raise ParseException(instring, loc, self.errmsg, self)
  2081 + start = loc
  2082 + loc += 1
  2083 + maxloc = start + self.maxLen
  2084 + maxloc = min( maxloc, len(instring) )
  2085 + while loc < maxloc and instring[loc] in self.matchWhite:
  2086 + loc += 1
  2087 +
  2088 + if loc - start < self.minLen:
  2089 + raise ParseException(instring, loc, self.errmsg, self)
  2090 +
  2091 + return loc, instring[start:loc]
  2092 +
  2093 +
  2094 +class _PositionToken(Token):
  2095 + def __init__( self ):
  2096 + super(_PositionToken,self).__init__()
  2097 + self.name=self.__class__.__name__
  2098 + self.mayReturnEmpty = True
  2099 + self.mayIndexError = False
  2100 +
  2101 +class GoToColumn(_PositionToken):
  2102 + """Token to advance to a specific column of input text; useful for tabular report scraping."""
  2103 + def __init__( self, colno ):
  2104 + super(GoToColumn,self).__init__()
  2105 + self.col = colno
  2106 +
  2107 + def preParse( self, instring, loc ):
  2108 + if col(loc,instring) != self.col:
  2109 + instrlen = len(instring)
  2110 + if self.ignoreExprs:
  2111 + loc = self._skipIgnorables( instring, loc )
  2112 + while loc < instrlen and instring[loc].isspace() and col( loc, instring ) != self.col :
  2113 + loc += 1
  2114 + return loc
  2115 +
  2116 + def parseImpl( self, instring, loc, doActions=True ):
  2117 + thiscol = col( loc, instring )
  2118 + if thiscol > self.col:
  2119 + raise ParseException( instring, loc, "Text not in expected column", self )
  2120 + newloc = loc + self.col - thiscol
  2121 + ret = instring[ loc: newloc ]
  2122 + return newloc, ret
  2123 +
  2124 +class LineStart(_PositionToken):
  2125 + """Matches if current position is at the beginning of a line within the parse string"""
  2126 + def __init__( self ):
  2127 + super(LineStart,self).__init__()
  2128 + self.setWhitespaceChars( ParserElement.DEFAULT_WHITE_CHARS.replace("\n","") )
  2129 + self.errmsg = "Expected start of line"
  2130 +
  2131 + def preParse( self, instring, loc ):
  2132 + preloc = super(LineStart,self).preParse(instring,loc)
  2133 + if instring[preloc] == "\n":
  2134 + loc += 1
  2135 + return loc
  2136 +
  2137 + def parseImpl( self, instring, loc, doActions=True ):
  2138 + if not( loc==0 or
  2139 + (loc == self.preParse( instring, 0 )) or
  2140 + (instring[loc-1] == "\n") ): #col(loc, instring) != 1:
  2141 + raise ParseException(instring, loc, self.errmsg, self)
  2142 + return loc, []
  2143 +
  2144 +class LineEnd(_PositionToken):
  2145 + """Matches if current position is at the end of a line within the parse string"""
  2146 + def __init__( self ):
  2147 + super(LineEnd,self).__init__()
  2148 + self.setWhitespaceChars( ParserElement.DEFAULT_WHITE_CHARS.replace("\n","") )
  2149 + self.errmsg = "Expected end of line"
  2150 +
  2151 + def parseImpl( self, instring, loc, doActions=True ):
  2152 + if loc<len(instring):
  2153 + if instring[loc] == "\n":
  2154 + return loc+1, "\n"
  2155 + else:
  2156 + raise ParseException(instring, loc, self.errmsg, self)
  2157 + elif loc == len(instring):
  2158 + return loc+1, []
  2159 + else:
  2160 + raise ParseException(instring, loc, self.errmsg, self)
  2161 +
  2162 +class StringStart(_PositionToken):
  2163 + """Matches if current position is at the beginning of the parse string"""
  2164 + def __init__( self ):
  2165 + super(StringStart,self).__init__()
  2166 + self.errmsg = "Expected start of text"
  2167 +
  2168 + def parseImpl( self, instring, loc, doActions=True ):
  2169 + if loc != 0:
  2170 + # see if entire string up to here is just whitespace and ignorables
  2171 + if loc != self.preParse( instring, 0 ):
  2172 + raise ParseException(instring, loc, self.errmsg, self)
  2173 + return loc, []
  2174 +
  2175 +class StringEnd(_PositionToken):
  2176 + """Matches if current position is at the end of the parse string"""
  2177 + def __init__( self ):
  2178 + super(StringEnd,self).__init__()
  2179 + self.errmsg = "Expected end of text"
  2180 +
  2181 + def parseImpl( self, instring, loc, doActions=True ):
  2182 + if loc < len(instring):
  2183 + raise ParseException(instring, loc, self.errmsg, self)
  2184 + elif loc == len(instring):
  2185 + return loc+1, []
  2186 + elif loc > len(instring):
  2187 + return loc, []
  2188 + else:
  2189 + raise ParseException(instring, loc, self.errmsg, self)
  2190 +
  2191 +class WordStart(_PositionToken):
  2192 + """Matches if the current position is at the beginning of a Word, and
  2193 + is not preceded by any character in a given set of C{wordChars}
  2194 + (default=C{printables}). To emulate the C{\\b} behavior of regular expressions,
  2195 + use C{WordStart(alphanums)}. C{WordStart} will also match at the beginning of
  2196 + the string being parsed, or at the beginning of a line.
  2197 + """
  2198 + def __init__(self, wordChars = printables):
  2199 + super(WordStart,self).__init__()
  2200 + self.wordChars = set(wordChars)
  2201 + self.errmsg = "Not at the start of a word"
  2202 +
  2203 + def parseImpl(self, instring, loc, doActions=True ):
  2204 + if loc != 0:
  2205 + if (instring[loc-1] in self.wordChars or
  2206 + instring[loc] not in self.wordChars):
  2207 + raise ParseException(instring, loc, self.errmsg, self)
  2208 + return loc, []
  2209 +
  2210 +class WordEnd(_PositionToken):
  2211 + """Matches if the current position is at the end of a Word, and
  2212 + is not followed by any character in a given set of C{wordChars}
  2213 + (default=C{printables}). To emulate the C{\\b} behavior of regular expressions,
  2214 + use C{WordEnd(alphanums)}. C{WordEnd} will also match at the end of
  2215 + the string being parsed, or at the end of a line.
  2216 + """
  2217 + def __init__(self, wordChars = printables):
  2218 + super(WordEnd,self).__init__()
  2219 + self.wordChars = set(wordChars)
  2220 + self.skipWhitespace = False
  2221 + self.errmsg = "Not at the end of a word"
  2222 +
  2223 + def parseImpl(self, instring, loc, doActions=True ):
  2224 + instrlen = len(instring)
  2225 + if instrlen>0 and loc<instrlen:
  2226 + if (instring[loc] in self.wordChars or
  2227 + instring[loc-1] not in self.wordChars):
  2228 + raise ParseException(instring, loc, self.errmsg, self)
  2229 + return loc, []
  2230 +
  2231 +
  2232 +class ParseExpression(ParserElement):
  2233 + """Abstract subclass of ParserElement, for combining and post-processing parsed tokens."""
  2234 + def __init__( self, exprs, savelist = False ):
  2235 + super(ParseExpression,self).__init__(savelist)
  2236 + if isinstance( exprs, _generatorType ):
  2237 + exprs = list(exprs)
  2238 +
  2239 + if isinstance( exprs, basestring ):
  2240 + self.exprs = [ Literal( exprs ) ]
  2241 + elif isinstance( exprs, collections.Sequence ):
  2242 + # if sequence of strings provided, wrap with Literal
  2243 + if all(isinstance(expr, basestring) for expr in exprs):
  2244 + exprs = map(Literal, exprs)
  2245 + self.exprs = list(exprs)
  2246 + else:
  2247 + try:
  2248 + self.exprs = list( exprs )
  2249 + except TypeError:
  2250 + self.exprs = [ exprs ]
  2251 + self.callPreparse = False
  2252 +
  2253 + def __getitem__( self, i ):
  2254 + return self.exprs[i]
  2255 +
  2256 + def append( self, other ):
  2257 + self.exprs.append( other )
  2258 + self.strRepr = None
  2259 + return self
  2260 +
  2261 + def leaveWhitespace( self ):
  2262 + """Extends C{leaveWhitespace} defined in base class, and also invokes C{leaveWhitespace} on
  2263 + all contained expressions."""
  2264 + self.skipWhitespace = False
  2265 + self.exprs = [ e.copy() for e in self.exprs ]
  2266 + for e in self.exprs:
  2267 + e.leaveWhitespace()
  2268 + return self
  2269 +
  2270 + def ignore( self, other ):
  2271 + if isinstance( other, Suppress ):
  2272 + if other not in self.ignoreExprs:
  2273 + super( ParseExpression, self).ignore( other )
  2274 + for e in self.exprs:
  2275 + e.ignore( self.ignoreExprs[-1] )
  2276 + else:
  2277 + super( ParseExpression, self).ignore( other )
  2278 + for e in self.exprs:
  2279 + e.ignore( self.ignoreExprs[-1] )
  2280 + return self
  2281 +
  2282 + def __str__( self ):
  2283 + try:
  2284 + return super(ParseExpression,self).__str__()
  2285 + except:
  2286 + pass
  2287 +
  2288 + if self.strRepr is None:
  2289 + self.strRepr = "%s:(%s)" % ( self.__class__.__name__, _ustr(self.exprs) )
  2290 + return self.strRepr
  2291 +
  2292 + def streamline( self ):
  2293 + super(ParseExpression,self).streamline()
  2294 +
  2295 + for e in self.exprs:
  2296 + e.streamline()
  2297 +
  2298 + # collapse nested And's of the form And( And( And( a,b), c), d) to And( a,b,c,d )
  2299 + # but only if there are no parse actions or resultsNames on the nested And's
  2300 + # (likewise for Or's and MatchFirst's)
  2301 + if ( len(self.exprs) == 2 ):
  2302 + other = self.exprs[0]
  2303 + if ( isinstance( other, self.__class__ ) and
  2304 + not(other.parseAction) and
  2305 + other.resultsName is None and
  2306 + not other.debug ):
  2307 + self.exprs = other.exprs[:] + [ self.exprs[1] ]
  2308 + self.strRepr = None
  2309 + self.mayReturnEmpty |= other.mayReturnEmpty
  2310 + self.mayIndexError |= other.mayIndexError
  2311 +
  2312 + other = self.exprs[-1]
  2313 + if ( isinstance( other, self.__class__ ) and
  2314 + not(other.parseAction) and
  2315 + other.resultsName is None and
  2316 + not other.debug ):
  2317 + self.exprs = self.exprs[:-1] + other.exprs[:]
  2318 + self.strRepr = None
  2319 + self.mayReturnEmpty |= other.mayReturnEmpty
  2320 + self.mayIndexError |= other.mayIndexError
  2321 +
  2322 + return self
  2323 +
  2324 + def setResultsName( self, name, listAllMatches=False ):
  2325 + ret = super(ParseExpression,self).setResultsName(name,listAllMatches)
  2326 + return ret
  2327 +
  2328 + def validate( self, validateTrace=[] ):
  2329 + tmp = validateTrace[:]+[self]
  2330 + for e in self.exprs:
  2331 + e.validate(tmp)
  2332 + self.checkRecursion( [] )
  2333 +
  2334 + def copy(self):
  2335 + ret = super(ParseExpression,self).copy()
  2336 + ret.exprs = [e.copy() for e in self.exprs]
  2337 + return ret
  2338 +
  2339 +class And(ParseExpression):
  2340 + """Requires all given C{ParseExpression}s to be found in the given order.
  2341 + Expressions may be separated by whitespace.
  2342 + May be constructed using the C{'+'} operator.
  2343 + """
  2344 +
  2345 + class _ErrorStop(Empty):
  2346 + def __init__(self, *args, **kwargs):
  2347 + super(And._ErrorStop,self).__init__(*args, **kwargs)
  2348 + self.name = '-'
  2349 + self.leaveWhitespace()
  2350 +
  2351 + def __init__( self, exprs, savelist = True ):
  2352 + super(And,self).__init__(exprs, savelist)
  2353 + self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
  2354 + self.setWhitespaceChars( self.exprs[0].whiteChars )
  2355 + self.skipWhitespace = self.exprs[0].skipWhitespace
  2356 + self.callPreparse = True
  2357 +
  2358 + def parseImpl( self, instring, loc, doActions=True ):
  2359 + # pass False as last arg to _parse for first element, since we already
  2360 + # pre-parsed the string as part of our And pre-parsing
  2361 + loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  2362 + errorStop = False
  2363 + for e in self.exprs[1:]:
  2364 + if isinstance(e, And._ErrorStop):
  2365 + errorStop = True
  2366 + continue
  2367 + if errorStop:
  2368 + try:
  2369 + loc, exprtokens = e._parse( instring, loc, doActions )
  2370 + except ParseSyntaxException:
  2371 + raise
  2372 + except ParseBaseException as pe:
  2373 + pe.__traceback__ = None
  2374 + raise ParseSyntaxException(pe)
  2375 + except IndexError:
  2376 + raise ParseSyntaxException( ParseException(instring, len(instring), self.errmsg, self) )
  2377 + else:
  2378 + loc, exprtokens = e._parse( instring, loc, doActions )
  2379 + if exprtokens or exprtokens.haskeys():
  2380 + resultlist += exprtokens
  2381 + return loc, resultlist
  2382 +
  2383 + def __iadd__(self, other ):
  2384 + if isinstance( other, basestring ):
  2385 + other = Literal( other )
  2386 + return self.append( other ) #And( [ self, other ] )
  2387 +
  2388 + def checkRecursion( self, parseElementList ):
  2389 + subRecCheckList = parseElementList[:] + [ self ]
  2390 + for e in self.exprs:
  2391 + e.checkRecursion( subRecCheckList )
  2392 + if not e.mayReturnEmpty:
  2393 + break
  2394 +
  2395 + def __str__( self ):
  2396 + if hasattr(self,"name"):
  2397 + return self.name
  2398 +
  2399 + if self.strRepr is None:
  2400 + self.strRepr = "{" + " ".join(_ustr(e) for e in self.exprs) + "}"
  2401 +
  2402 + return self.strRepr
  2403 +
  2404 +
  2405 +class Or(ParseExpression):
  2406 + """Requires that at least one C{ParseExpression} is found.
  2407 + If two expressions match, the expression that matches the longest string will be used.
  2408 + May be constructed using the C{'^'} operator.
  2409 + """
  2410 + def __init__( self, exprs, savelist = False ):
  2411 + super(Or,self).__init__(exprs, savelist)
  2412 + if self.exprs:
  2413 + self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
  2414 + else:
  2415 + self.mayReturnEmpty = True
  2416 +
  2417 + def parseImpl( self, instring, loc, doActions=True ):
  2418 + maxExcLoc = -1
  2419 + maxMatchLoc = -1
  2420 + maxException = None
  2421 + for e in self.exprs:
  2422 + try:
  2423 + loc2 = e.tryParse( instring, loc )
  2424 + except ParseException as err:
  2425 + err.__traceback__ = None
  2426 + if err.loc > maxExcLoc:
  2427 + maxException = err
  2428 + maxExcLoc = err.loc
  2429 + except IndexError:
  2430 + if len(instring) > maxExcLoc:
  2431 + maxException = ParseException(instring,len(instring),e.errmsg,self)
  2432 + maxExcLoc = len(instring)
  2433 + else:
  2434 + if loc2 > maxMatchLoc:
  2435 + maxMatchLoc = loc2
  2436 + maxMatchExp = e
  2437 +
  2438 + if maxMatchLoc < 0:
  2439 + if maxException is not None:
  2440 + raise maxException
  2441 + else:
  2442 + raise ParseException(instring, loc, "no defined alternatives to match", self)
  2443 +
  2444 + return maxMatchExp._parse( instring, loc, doActions )
  2445 +
  2446 + def __ixor__(self, other ):
  2447 + if isinstance( other, basestring ):
  2448 + other = ParserElement.literalStringClass( other )
  2449 + return self.append( other ) #Or( [ self, other ] )
  2450 +
  2451 + def __str__( self ):
  2452 + if hasattr(self,"name"):
  2453 + return self.name
  2454 +
  2455 + if self.strRepr is None:
  2456 + self.strRepr = "{" + " ^ ".join(_ustr(e) for e in self.exprs) + "}"
  2457 +
  2458 + return self.strRepr
  2459 +
  2460 + def checkRecursion( self, parseElementList ):
  2461 + subRecCheckList = parseElementList[:] + [ self ]
  2462 + for e in self.exprs:
  2463 + e.checkRecursion( subRecCheckList )
  2464 +
  2465 +
  2466 +class MatchFirst(ParseExpression):
  2467 + """Requires that at least one C{ParseExpression} is found.
  2468 + If two expressions match, the first one listed is the one that will match.
  2469 + May be constructed using the C{'|'} operator.
  2470 + """
  2471 + def __init__( self, exprs, savelist = False ):
  2472 + super(MatchFirst,self).__init__(exprs, savelist)
  2473 + if self.exprs:
  2474 + self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
  2475 + else:
  2476 + self.mayReturnEmpty = True
  2477 +
  2478 + def parseImpl( self, instring, loc, doActions=True ):
  2479 + maxExcLoc = -1
  2480 + maxException = None
  2481 + for e in self.exprs:
  2482 + try:
  2483 + ret = e._parse( instring, loc, doActions )
  2484 + return ret
  2485 + except ParseException as err:
  2486 + if err.loc > maxExcLoc:
  2487 + maxException = err
  2488 + maxExcLoc = err.loc
  2489 + except IndexError:
  2490 + if len(instring) > maxExcLoc:
  2491 + maxException = ParseException(instring,len(instring),e.errmsg,self)
  2492 + maxExcLoc = len(instring)
  2493 +
  2494 + # only got here if no expression matched, raise exception for match that made it the furthest
  2495 + else:
  2496 + if maxException is not None:
  2497 + raise maxException
  2498 + else:
  2499 + raise ParseException(instring, loc, "no defined alternatives to match", self)
  2500 +
  2501 + def __ior__(self, other ):
  2502 + if isinstance( other, basestring ):
  2503 + other = ParserElement.literalStringClass( other )
  2504 + return self.append( other ) #MatchFirst( [ self, other ] )
  2505 +
  2506 + def __str__( self ):
  2507 + if hasattr(self,"name"):
  2508 + return self.name
  2509 +
  2510 + if self.strRepr is None:
  2511 + self.strRepr = "{" + " | ".join(_ustr(e) for e in self.exprs) + "}"
  2512 +
  2513 + return self.strRepr
  2514 +
  2515 + def checkRecursion( self, parseElementList ):
  2516 + subRecCheckList = parseElementList[:] + [ self ]
  2517 + for e in self.exprs:
  2518 + e.checkRecursion( subRecCheckList )
  2519 +
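The docstrings of `Or` and `MatchFirst` above differ only in how they resolve several matching alternatives: `Or` keeps the longest match, `MatchFirst` keeps the first one listed. The following is a toy, stdlib-only sketch of that difference. It is not pyparsing's implementation; `literal`, `longest_match`, and `first_match` are hypothetical names introduced for illustration.

```python
# Toy sketch, NOT pyparsing's code: each "matcher" is a hypothetical function
# taking (text, loc) and returning the end location, or None on failure.

def literal(s):
    """Build a matcher for the exact string s."""
    def match(text, loc):
        return loc + len(s) if text.startswith(s, loc) else None
    return match

def longest_match(matchers, text, loc=0):
    """Or-style: try every alternative and keep the one that ends furthest."""
    best = None
    for m in matchers:
        end = m(text, loc)
        if end is not None and (best is None or end > best):
            best = end
    return best

def first_match(matchers, text, loc=0):
    """MatchFirst-style: return as soon as one alternative succeeds."""
    for m in matchers:
        end = m(text, loc)
        if end is not None:
            return end
    return None

alternatives = [literal("in"), literal("int"), literal("integer")]
print(longest_match(alternatives, "integer x"))  # 7
print(first_match(alternatives, "integer x"))    # 2
```

With first-match semantics the order of alternatives matters, so longer alternatives usually need to be listed before their prefixes.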
  2520 +
  2521 +class Each(ParseExpression):
  2522 + """Requires all given C{ParseExpression}s to be found, but in any order.
  2523 + Expressions may be separated by whitespace.
  2524 + May be constructed using the C{'&'} operator.
  2525 + """
  2526 + def __init__( self, exprs, savelist = True ):
  2527 + super(Each,self).__init__(exprs, savelist)
  2528 + self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
  2529 + self.skipWhitespace = True
  2530 + self.initExprGroups = True
  2531 +
  2532 + def parseImpl( self, instring, loc, doActions=True ):
  2533 + if self.initExprGroups:
  2534 + opt1 = [ e.expr for e in self.exprs if isinstance(e,Optional) ]
  2535 + opt2 = [ e for e in self.exprs if e.mayReturnEmpty and e not in opt1 ]
  2536 + self.optionals = opt1 + opt2
  2537 + self.multioptionals = [ e.expr for e in self.exprs if isinstance(e,ZeroOrMore) ]
  2538 + self.multirequired = [ e.expr for e in self.exprs if isinstance(e,OneOrMore) ]
  2539 + self.required = [ e for e in self.exprs if not isinstance(e,(Optional,ZeroOrMore,OneOrMore)) ]
  2540 + self.required += self.multirequired
  2541 + self.initExprGroups = False
  2542 + tmpLoc = loc
  2543 + tmpReqd = self.required[:]
  2544 + tmpOpt = self.optionals[:]
  2545 + matchOrder = []
  2546 +
  2547 + keepMatching = True
  2548 + while keepMatching:
  2549 + tmpExprs = tmpReqd + tmpOpt + self.multioptionals + self.multirequired
  2550 + failed = []
  2551 + for e in tmpExprs:
  2552 + try:
  2553 + tmpLoc = e.tryParse( instring, tmpLoc )
  2554 + except ParseException:
  2555 + failed.append(e)
  2556 + else:
  2557 + matchOrder.append(e)
  2558 + if e in tmpReqd:
  2559 + tmpReqd.remove(e)
  2560 + elif e in tmpOpt:
  2561 + tmpOpt.remove(e)
  2562 + if len(failed) == len(tmpExprs):
  2563 + keepMatching = False
  2564 +
  2565 + if tmpReqd:
  2566 + missing = ", ".join(_ustr(e) for e in tmpReqd)
  2567 + raise ParseException(instring,loc,"Missing one or more required elements (%s)" % missing )
  2568 +
  2569 + # add any unmatched Optionals, in case they have default values defined
  2570 + matchOrder += [e for e in self.exprs if isinstance(e,Optional) and e.expr in tmpOpt]
  2571 +
  2572 + resultlist = []
  2573 + for e in matchOrder:
  2574 + loc,results = e._parse(instring,loc,doActions)
  2575 + resultlist.append(results)
  2576 +
  2577 + finalResults = ParseResults([])
  2578 + for r in resultlist:
  2579 + dups = {}
  2580 + for k in r.keys():
  2581 + if k in finalResults:
  2582 + tmp = ParseResults(finalResults[k])
  2583 + tmp += ParseResults(r[k])
  2584 + dups[k] = tmp
  2585 + finalResults += ParseResults(r)
  2586 + for k,v in dups.items():
  2587 + finalResults[k] = v
  2588 + return loc, finalResults
  2589 +
  2590 + def __str__( self ):
  2591 + if hasattr(self,"name"):
  2592 + return self.name
  2593 +
  2594 + if self.strRepr is None:
  2595 + self.strRepr = "{" + " & ".join(_ustr(e) for e in self.exprs) + "}"
  2596 +
  2597 + return self.strRepr
  2598 +
  2599 + def checkRecursion( self, parseElementList ):
  2600 + subRecCheckList = parseElementList[:] + [ self ]
  2601 + for e in self.exprs:
  2602 + e.checkRecursion( subRecCheckList )
  2603 +
  2604 +
  2605 +class ParseElementEnhance(ParserElement):
  2606 + """Abstract subclass of C{ParserElement}, for combining and post-processing parsed tokens."""
  2607 + def __init__( self, expr, savelist=False ):
  2608 + super(ParseElementEnhance,self).__init__(savelist)
  2609 + if isinstance( expr, basestring ):
  2610 + expr = Literal(expr)
  2611 + self.expr = expr
  2612 + self.strRepr = None
  2613 + if expr is not None:
  2614 + self.mayIndexError = expr.mayIndexError
  2615 + self.mayReturnEmpty = expr.mayReturnEmpty
  2616 + self.setWhitespaceChars( expr.whiteChars )
  2617 + self.skipWhitespace = expr.skipWhitespace
  2618 + self.saveAsList = expr.saveAsList
  2619 + self.callPreparse = expr.callPreparse
  2620 + self.ignoreExprs.extend(expr.ignoreExprs)
  2621 +
  2622 + def parseImpl( self, instring, loc, doActions=True ):
  2623 + if self.expr is not None:
  2624 + return self.expr._parse( instring, loc, doActions, callPreParse=False )
  2625 + else:
  2626 + raise ParseException("",loc,self.errmsg,self)
  2627 +
  2628 + def leaveWhitespace( self ):
  2629 + self.skipWhitespace = False
  2630 + self.expr = self.expr.copy()
  2631 + if self.expr is not None:
  2632 + self.expr.leaveWhitespace()
  2633 + return self
  2634 +
  2635 + def ignore( self, other ):
  2636 + if isinstance( other, Suppress ):
  2637 + if other not in self.ignoreExprs:
  2638 + super( ParseElementEnhance, self).ignore( other )
  2639 + if self.expr is not None:
  2640 + self.expr.ignore( self.ignoreExprs[-1] )
  2641 + else:
  2642 + super( ParseElementEnhance, self).ignore( other )
  2643 + if self.expr is not None:
  2644 + self.expr.ignore( self.ignoreExprs[-1] )
  2645 + return self
  2646 +
  2647 + def streamline( self ):
  2648 + super(ParseElementEnhance,self).streamline()
  2649 + if self.expr is not None:
  2650 + self.expr.streamline()
  2651 + return self
  2652 +
  2653 + def checkRecursion( self, parseElementList ):
  2654 + if self in parseElementList:
  2655 + raise RecursiveGrammarException( parseElementList+[self] )
  2656 + subRecCheckList = parseElementList[:] + [ self ]
  2657 + if self.expr is not None:
  2658 + self.expr.checkRecursion( subRecCheckList )
  2659 +
  2660 + def validate( self, validateTrace=[] ):
  2661 + tmp = validateTrace[:]+[self]
  2662 + if self.expr is not None:
  2663 + self.expr.validate(tmp)
  2664 + self.checkRecursion( [] )
  2665 +
  2666 + def __str__( self ):
  2667 + try:
  2668 + return super(ParseElementEnhance,self).__str__()
  2669 + except:
  2670 + pass
  2671 +
  2672 + if self.strRepr is None and self.expr is not None:
  2673 + self.strRepr = "%s:(%s)" % ( self.__class__.__name__, _ustr(self.expr) )
  2674 + return self.strRepr
  2675 +
  2676 +
  2677 +class FollowedBy(ParseElementEnhance):
  2678 + """Lookahead matching of the given parse expression. C{FollowedBy}
  2679 + does *not* advance the parsing position within the input string, it only
  2680 + verifies that the specified parse expression matches at the current
  2681 + position. C{FollowedBy} always returns a null token list."""
  2682 + def __init__( self, expr ):
  2683 + super(FollowedBy,self).__init__(expr)
  2684 + self.mayReturnEmpty = True
  2685 +
  2686 + def parseImpl( self, instring, loc, doActions=True ):
  2687 + self.expr.tryParse( instring, loc )
  2688 + return loc, []
  2689 +
  2690 +
  2691 +class NotAny(ParseElementEnhance):
  2692 + """Lookahead to disallow matching with the given parse expression. C{NotAny}
  2693 + does *not* advance the parsing position within the input string, it only
  2694 + verifies that the specified parse expression does *not* match at the current
  2695 + position. Also, C{NotAny} does *not* skip over leading whitespace. C{NotAny}
  2696 + always returns a null token list. May be constructed using the '~' operator."""
  2697 + def __init__( self, expr ):
  2698 + super(NotAny,self).__init__(expr)
  2699 + #~ self.leaveWhitespace()
  2700 + self.skipWhitespace = False # do NOT use self.leaveWhitespace(), don't want to propagate to exprs
  2701 + self.mayReturnEmpty = True
  2702 + self.errmsg = "Found unwanted token, "+_ustr(self.expr)
  2703 +
  2704 + def parseImpl( self, instring, loc, doActions=True ):
  2705 + try:
  2706 + self.expr.tryParse( instring, loc )
  2707 + except (ParseException,IndexError):
  2708 + pass
  2709 + else:
  2710 + raise ParseException(instring, loc, self.errmsg, self)
  2711 + return loc, []
  2712 +
  2713 + def __str__( self ):
  2714 + if hasattr(self,"name"):
  2715 + return self.name
  2716 +
  2717 + if self.strRepr is None:
  2718 + self.strRepr = "~{" + _ustr(self.expr) + "}"
  2719 +
  2720 + return self.strRepr
  2721 +
  2722 +
  2723 +class ZeroOrMore(ParseElementEnhance):
  2724 + """Optional repetition of zero or more of the given expression."""
  2725 + def __init__( self, expr ):
  2726 + super(ZeroOrMore,self).__init__(expr)
  2727 + self.mayReturnEmpty = True
  2728 +
  2729 + def parseImpl( self, instring, loc, doActions=True ):
  2730 + tokens = []
  2731 + try:
  2732 + loc, tokens = self.expr._parse( instring, loc, doActions, callPreParse=False )
  2733 + hasIgnoreExprs = ( len(self.ignoreExprs) > 0 )
  2734 + while 1:
  2735 + if hasIgnoreExprs:
  2736 + preloc = self._skipIgnorables( instring, loc )
  2737 + else:
  2738 + preloc = loc
  2739 + loc, tmptokens = self.expr._parse( instring, preloc, doActions )
  2740 + if tmptokens or tmptokens.haskeys():
  2741 + tokens += tmptokens
  2742 + except (ParseException,IndexError):
  2743 + pass
  2744 +
  2745 + return loc, tokens
  2746 +
  2747 + def __str__( self ):
  2748 + if hasattr(self,"name"):
  2749 + return self.name
  2750 +
  2751 + if self.strRepr is None:
  2752 + self.strRepr = "[" + _ustr(self.expr) + "]..."
  2753 +
  2754 + return self.strRepr
  2755 +
  2756 + def setResultsName( self, name, listAllMatches=False ):
  2757 + ret = super(ZeroOrMore,self).setResultsName(name,listAllMatches)
  2758 + ret.saveAsList = True
  2759 + return ret
  2760 +
  2761 +
  2762 +class OneOrMore(ParseElementEnhance):
  2763 + """Repetition of one or more of the given expression."""
  2764 + def parseImpl( self, instring, loc, doActions=True ):
  2765 + # must be at least one
  2766 + loc, tokens = self.expr._parse( instring, loc, doActions, callPreParse=False )
  2767 + try:
  2768 + hasIgnoreExprs = ( len(self.ignoreExprs) > 0 )
  2769 + while 1:
  2770 + if hasIgnoreExprs:
  2771 + preloc = self._skipIgnorables( instring, loc )
  2772 + else:
  2773 + preloc = loc
  2774 + loc, tmptokens = self.expr._parse( instring, preloc, doActions )
  2775 + if tmptokens or tmptokens.haskeys():
  2776 + tokens += tmptokens
  2777 + except (ParseException,IndexError):
  2778 + pass
  2779 +
  2780 + return loc, tokens
  2781 +
  2782 + def __str__( self ):
  2783 + if hasattr(self,"name"):
  2784 + return self.name
  2785 +
  2786 + if self.strRepr is None:
  2787 + self.strRepr = "{" + _ustr(self.expr) + "}..."
  2788 +
  2789 + return self.strRepr
  2790 +
  2791 + def setResultsName( self, name, listAllMatches=False ):
  2792 + ret = super(OneOrMore,self).setResultsName(name,listAllMatches)
  2793 + ret.saveAsList = True
  2794 + return ret
  2795 +
  2796 +class _NullToken(object):
  2797 + def __bool__(self):
  2798 + return False
  2799 + __nonzero__ = __bool__
  2800 + def __str__(self):
  2801 + return ""
  2802 +
  2803 +_optionalNotMatched = _NullToken()
  2804 +class Optional(ParseElementEnhance):
  2805 + """Optional matching of the given expression.
  2806 + A default return string can also be specified, if the optional expression
  2807 + is not found.
  2808 + """
  2809 + def __init__( self, expr, default=_optionalNotMatched ):
  2810 + super(Optional,self).__init__( expr, savelist=False )
  2811 + self.defaultValue = default
  2812 + self.mayReturnEmpty = True
  2813 +
  2814 + def parseImpl( self, instring, loc, doActions=True ):
  2815 + try:
  2816 + loc, tokens = self.expr._parse( instring, loc, doActions, callPreParse=False )
  2817 + except (ParseException,IndexError):
  2818 + if self.defaultValue is not _optionalNotMatched:
  2819 + if self.expr.resultsName:
  2820 + tokens = ParseResults([ self.defaultValue ])
  2821 + tokens[self.expr.resultsName] = self.defaultValue
  2822 + else:
  2823 + tokens = [ self.defaultValue ]
  2824 + else:
  2825 + tokens = []
  2826 + return loc, tokens
  2827 +
  2828 + def __str__( self ):
  2829 + if hasattr(self,"name"):
  2830 + return self.name
  2831 +
  2832 + if self.strRepr is None:
  2833 + self.strRepr = "[" + _ustr(self.expr) + "]"
  2834 +
  2835 + return self.strRepr
  2836 +
  2837 +
  2838 +class SkipTo(ParseElementEnhance):
  2839 + """Token for skipping over all undefined text until the matched expression is found.
  2840 + If C{include} is set to true, the matched expression is also parsed (the skipped text
  2841 + and matched expression are returned as a 2-element list). The C{ignore}
  2842 + argument is used to define grammars (typically quoted strings and comments) that
  2843 + might contain false matches.
  2844 + """
  2845 + def __init__( self, other, include=False, ignore=None, failOn=None ):
  2846 + super( SkipTo, self ).__init__( other )
  2847 + self.ignoreExpr = ignore
  2848 + self.mayReturnEmpty = True
  2849 + self.mayIndexError = False
  2850 + self.includeMatch = include
  2851 + self.asList = False
  2852 + if failOn is not None and isinstance(failOn, basestring):
  2853 + self.failOn = Literal(failOn)
  2854 + else:
  2855 + self.failOn = failOn
  2856 + self.errmsg = "No match found for "+_ustr(self.expr)
  2857 +
  2858 + def parseImpl( self, instring, loc, doActions=True ):
  2859 + startLoc = loc
  2860 + instrlen = len(instring)
  2861 + expr = self.expr
  2862 + failParse = False
  2863 + while loc <= instrlen:
  2864 + try:
  2865 + if self.failOn:
  2866 + try:
  2867 + self.failOn.tryParse(instring, loc)
  2868 + except ParseBaseException:
  2869 + pass
  2870 + else:
  2871 + failParse = True
  2872 + raise ParseException(instring, loc, "Found expression " + str(self.failOn))
  2873 + failParse = False
  2874 + if self.ignoreExpr is not None:
  2875 + while 1:
  2876 + try:
  2877 + loc = self.ignoreExpr.tryParse(instring,loc)
  2878 + # print("found ignoreExpr, advance to", loc)
  2879 + except ParseBaseException:
  2880 + break
  2881 + expr._parse( instring, loc, doActions=False, callPreParse=False )
  2882 + skipText = instring[startLoc:loc]
  2883 + if self.includeMatch:
  2884 + loc,mat = expr._parse(instring,loc,doActions,callPreParse=False)
  2885 + if mat:
  2886 + skipRes = ParseResults( skipText )
  2887 + skipRes += mat
  2888 + return loc, [ skipRes ]
  2889 + else:
  2890 + return loc, [ skipText ]
  2891 + else:
  2892 + return loc, [ skipText ]
  2893 + except (ParseException,IndexError):
  2894 + if failParse:
  2895 + raise
  2896 + else:
  2897 + loc += 1
  2898 + raise ParseException(instring, loc, self.errmsg, self)
  2899 +
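The scanning loop in `SkipTo.parseImpl` above advances one character at a time until the target matches, and aborts if the `failOn` expression matches first. A simplified sketch of that idea, restricted to plain strings, might look as follows. `skip_to` is a hypothetical helper, not pyparsing API, and it returns a flat list rather than nested `ParseResults`.

```python
# Simplified, string-only illustration of the skip-until-match loop.
# Assumption: targets are plain substrings rather than parser expressions.

def skip_to(text, target, loc=0, include=False, fail_on=None):
    """Return (new_loc, tokens), where tokens holds the skipped text
    (and the target itself when include is true)."""
    start = loc
    while loc <= len(text):
        # give up as soon as the forbidden substring appears at this position
        if fail_on is not None and text.startswith(fail_on, loc):
            raise ValueError("found %r before %r" % (fail_on, target))
        if text.startswith(target, loc):
            skipped = text[start:loc]
            if include:
                return loc + len(target), [skipped, target]
            return loc, [skipped]
        loc += 1
    raise ValueError("no match found for %r" % target)

print(skip_to("abc;def", ";"))                # (3, ['abc'])
print(skip_to("abc;def", ";", include=True))  # (4, ['abc', ';'])
```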
  2900 +class Forward(ParseElementEnhance):
  2901 + """Forward declaration of an expression to be defined later -
  2902 + used for recursive grammars, such as algebraic infix notation.
  2903 + When the expression is known, it is assigned to the C{Forward} variable using the '<<' operator.
  2904 +
  2905 + Note: take care when assigning to C{Forward} not to overlook precedence of operators.
  2906 + Specifically, '|' has a lower precedence than '<<', so that::
  2907 + fwdExpr << a | b | c
  2908 + will actually be evaluated as::
  2909 + (fwdExpr << a) | b | c
  2910 + thereby leaving b and c out as parseable alternatives. It is recommended that you
  2911 + explicitly group the values inserted into the C{Forward}::
  2912 + fwdExpr << (a | b | c)
  2913 + Converting to use the '<<=' operator instead will avoid this problem.
  2914 + """
  2915 + def __init__( self, other=None ):
  2916 + super(Forward,self).__init__( other, savelist=False )
  2917 +
  2918 + def __lshift__( self, other ):
  2919 + if isinstance( other, basestring ):
  2920 + other = ParserElement.literalStringClass(other)
  2921 + self.expr = other
  2922 + self.mayReturnEmpty = other.mayReturnEmpty
  2923 + self.strRepr = None
  2924 + self.mayIndexError = self.expr.mayIndexError
  2925 + self.mayReturnEmpty = self.expr.mayReturnEmpty
  2926 + self.setWhitespaceChars( self.expr.whiteChars )
  2927 + self.skipWhitespace = self.expr.skipWhitespace
  2928 + self.saveAsList = self.expr.saveAsList
  2929 + self.ignoreExprs.extend(self.expr.ignoreExprs)
  2930 + return self
  2931 +
  2932 + def __ilshift__(self, other):
  2933 + return self << other
  2934 +
  2935 + def leaveWhitespace( self ):
  2936 + self.skipWhitespace = False
  2937 + return self
  2938 +
  2939 + def streamline( self ):
  2940 + if not self.streamlined:
  2941 + self.streamlined = True
  2942 + if self.expr is not None:
  2943 + self.expr.streamline()
  2944 + return self
  2945 +
  2946 + def validate( self, validateTrace=[] ):
  2947 + if self not in validateTrace:
  2948 + tmp = validateTrace[:]+[self]
  2949 + if self.expr is not None:
  2950 + self.expr.validate(tmp)
  2951 + self.checkRecursion([])
  2952 +
  2953 + def __str__( self ):
  2954 + if hasattr(self,"name"):
  2955 + return self.name
  2956 +
  2957 + self._revertClass = self.__class__
  2958 + self.__class__ = _ForwardNoRecurse
  2959 + try:
  2960 + if self.expr is not None:
  2961 + retString = _ustr(self.expr)
  2962 + else:
  2963 + retString = "None"
  2964 + finally:
  2965 + self.__class__ = self._revertClass
  2966 + return self.__class__.__name__ + ": " + retString
  2967 +
  2968 + def copy(self):
  2969 + if self.expr is not None:
  2970 + return super(Forward,self).copy()
  2971 + else:
  2972 + ret = Forward()
  2973 + ret <<= self
  2974 + return ret
  2975 +
  2976 +class _ForwardNoRecurse(Forward):
  2977 + def __str__( self ):
  2978 + return "..."
  2979 +
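The precedence warning in the `Forward` docstring above follows directly from Python's operator rules: `<<` binds more tightly than `|`, while the augmented assignment `<<=` evaluates its whole right-hand side first. The sketch below demonstrates this with a hypothetical stand-in class `Node` (not pyparsing's `Forward`) that simply records what was assigned to it.

```python
# Hypothetical stand-in class, used only to show how Python groups '<<' and '|'.

class Node:
    def __init__(self, label):
        self.label = label
        self.assigned = None

    def __lshift__(self, other):
        # mimic Forward.__lshift__: store the operand and return self
        self.assigned = other
        return self

    def __ilshift__(self, other):
        return self << other

    def __or__(self, other):
        return Node("(%s | %s)" % (self.label, other.label))

a, b, c = Node("a"), Node("b"), Node("c")

fwd = Node("fwd")
fwd << a | b | c            # grouped as ((fwd << a) | b) | c
print(fwd.assigned.label)   # a

fwd = Node("fwd")
fwd << (a | b | c)          # explicit grouping
print(fwd.assigned.label)   # ((a | b) | c)

fwd = Node("fwd")
fwd <<= a | b | c           # right-hand side is evaluated first
print(fwd.assigned.label)   # ((a | b) | c)
```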
  2980 +class TokenConverter(ParseElementEnhance):
  2981 + """Abstract subclass of C{ParseElementEnhance}, for converting parsed results."""
  2982 + def __init__( self, expr, savelist=False ):
  2983 + super(TokenConverter,self).__init__( expr )#, savelist )
  2984 + self.saveAsList = False
  2985 +
  2986 +class Upcase(TokenConverter):
  2987 + """Converter to upper case all matching tokens."""
  2988 + def __init__(self, *args):
  2989 + super(Upcase,self).__init__(*args)
  2990 + warnings.warn("Upcase class is deprecated, use upcaseTokens parse action instead",
  2991 + DeprecationWarning,stacklevel=2)
  2992 +
  2993 + def postParse( self, instring, loc, tokenlist ):
  2994 + return list(map( str.upper, tokenlist ))
  2995 +
  2996 +
  2997 +class Combine(TokenConverter):
  2998 + """Converter to concatenate all matching tokens to a single string.
  2999 + By default, the matching patterns must also be contiguous in the input string;
  3000 + this can be disabled by specifying C{'adjacent=False'} in the constructor.
  3001 + """
  3002 + def __init__( self, expr, joinString="", adjacent=True ):
  3003 + super(Combine,self).__init__( expr )
  3004 + # suppress whitespace-stripping in contained parse expressions, but re-enable it on the Combine itself
  3005 + if adjacent:
  3006 + self.leaveWhitespace()
  3007 + self.adjacent = adjacent
  3008 + self.skipWhitespace = True
  3009 + self.joinString = joinString
  3010 + self.callPreparse = True
  3011 +
  3012 + def ignore( self, other ):
  3013 + if self.adjacent:
  3014 + ParserElement.ignore(self, other)
  3015 + else:
  3016 + super( Combine, self).ignore( other )
  3017 + return self
  3018 +
  3019 + def postParse( self, instring, loc, tokenlist ):
  3020 + retToks = tokenlist.copy()
  3021 + del retToks[:]
  3022 + retToks += ParseResults([ "".join(tokenlist._asStringList(self.joinString)) ], modal=self.modalResults)
  3023 +
  3024 + if self.resultsName and retToks.haskeys():
  3025 + return [ retToks ]
  3026 + else:
  3027 + return retToks
  3028 +
  3029 +class Group(TokenConverter):
  3030 + """Converter to return the matched tokens as a list - useful for returning tokens of C{L{ZeroOrMore}} and C{L{OneOrMore}} expressions."""
  3031 + def __init__( self, expr ):
  3032 + super(Group,self).__init__( expr )
  3033 + self.saveAsList = True
  3034 +
  3035 + def postParse( self, instring, loc, tokenlist ):
  3036 + return [ tokenlist ]
  3037 +
  3038 +class Dict(TokenConverter):
  3039 + """Converter to return a repetitive expression as a list, but also as a dictionary.
  3040 + Each element can also be referenced using the first token in the expression as its key.
  3041 + Useful for tabular report scraping when the first column can be used as an item key.
  3042 + """
  3043 + def __init__( self, expr ):
  3044 + super(Dict,self).__init__( expr )
  3045 + self.saveAsList = True
  3046 +
  3047 + def postParse( self, instring, loc, tokenlist ):
  3048 + for i,tok in enumerate(tokenlist):
  3049 + if len(tok) == 0:
  3050 + continue
  3051 + ikey = tok[0]
  3052 + if isinstance(ikey,int):
  3053 + ikey = _ustr(tok[0]).strip()
  3054 + if len(tok)==1:
  3055 + tokenlist[ikey] = _ParseResultsWithOffset("",i)
  3056 + elif len(tok)==2 and not isinstance(tok[1],ParseResults):
  3057 + tokenlist[ikey] = _ParseResultsWithOffset(tok[1],i)
  3058 + else:
  3059 + dictvalue = tok.copy() #ParseResults(i)
  3060 + del dictvalue[0]
  3061 + if len(dictvalue)!= 1 or (isinstance(dictvalue,ParseResults) and dictvalue.haskeys()):
  3062 + tokenlist[ikey] = _ParseResultsWithOffset(dictvalue,i)
  3063 + else:
  3064 + tokenlist[ikey] = _ParseResultsWithOffset(dictvalue[0],i)
  3065 +
  3066 + if self.resultsName:
  3067 + return [ tokenlist ]
  3068 + else:
  3069 + return tokenlist
  3070 +
  3071 +
  3072 +class Suppress(TokenConverter):
  3073 + """Converter for ignoring the results of a parsed expression."""
  3074 + def postParse( self, instring, loc, tokenlist ):
  3075 + return []
  3076 +
  3077 + def suppress( self ):
  3078 + return self
  3079 +
  3080 +
  3081 +class OnlyOnce(object):
  3082 + """Wrapper for parse actions, to ensure they are only called once."""
  3083 + def __init__(self, methodCall):
  3084 + self.callable = _trim_arity(methodCall)
  3085 + self.called = False
  3086 + def __call__(self,s,l,t):
  3087 + if not self.called:
  3088 + results = self.callable(s,l,t)
  3089 + self.called = True
  3090 + return results
  3091 + raise ParseException(s,l,"")
  3092 + def reset(self):
  3093 + self.called = False
  3094 +
  3095 +def traceParseAction(f):
  3096 + """Decorator for debugging parse actions."""
  3097 + f = _trim_arity(f)
  3098 + def z(*paArgs):
  3099 + thisFunc = f.__name__
  3100 + s,l,t = paArgs[-3:]
  3101 + if len(paArgs)>3:
  3102 + thisFunc = paArgs[0].__class__.__name__ + '.' + thisFunc
  3103 + sys.stderr.write( ">>entering %s(line: '%s', %d, %s)\n" % (thisFunc,line(l,s),l,t) )
  3104 + try:
  3105 + ret = f(*paArgs)
  3106 + except Exception as exc:
  3107 + sys.stderr.write( "<<leaving %s (exception: %s)\n" % (thisFunc,exc) )
  3108 + raise
  3109 + sys.stderr.write( "<<leaving %s (ret: %s)\n" % (thisFunc,ret) )
  3110 + return ret
  3111 + try:
  3112 + z.__name__ = f.__name__
  3113 + except AttributeError:
  3114 + pass
  3115 + return z
  3116 +
  3117 +#
  3118 +# global helpers
  3119 +#
  3120 +def delimitedList( expr, delim=",", combine=False ):
  3121 + """Helper to define a delimited list of expressions - the delimiter defaults to ','.
  3122 + By default, the list elements and delimiters can have intervening whitespace, and
  3123 + comments, but this can be overridden by passing C{combine=True} in the call.
  3124 + If C{combine} is set to C{True}, the matching tokens are returned as a single token
  3125 + string, with the delimiters included; otherwise, the matching tokens are returned
  3126 + as a list of tokens, with the delimiters suppressed.
  3127 + """
  3128 + dlName = _ustr(expr)+" ["+_ustr(delim)+" "+_ustr(expr)+"]..."
  3129 + if combine:
  3130 + return Combine( expr + ZeroOrMore( delim + expr ) ).setName(dlName)
  3131 + else:
  3132 + return ( expr + ZeroOrMore( Suppress( delim ) + expr ) ).setName(dlName)
  3133 +
  3134 +def countedArray( expr, intExpr=None ):
  3135 + """Helper to define a counted list of expressions.
  3136 + This helper defines a pattern of the form::
  3137 + integer expr expr expr...
  3138 + where the leading integer tells how many expr expressions follow.
  3139 + The matched tokens are returned as a list of expr tokens - the leading count token is suppressed.
  3140 + """
  3141 + arrayExpr = Forward()
  3142 + def countFieldParseAction(s,l,t):
  3143 + n = t[0]
  3144 + arrayExpr << (n and Group(And([expr]*n)) or Group(empty))
  3145 + return []
  3146 + if intExpr is None:
  3147 + intExpr = Word(nums).setParseAction(lambda t:int(t[0]))
  3148 + else:
  3149 + intExpr = intExpr.copy()
  3150 + intExpr.setName("arrayLen")
  3151 + intExpr.addParseAction(countFieldParseAction, callDuringTry=True)
  3152 + return ( intExpr + arrayExpr )
  3153 +
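Note on countedArray: the "integer expr expr expr..." pattern described in the docstring can be illustrated without pyparsing. The sketch below is a hypothetical stand-in, not the library's implementation; the name `take_counted` and the pre-split token list are assumptions made for illustration only.

```python
def take_counted(tokens):
    """Read a leading integer n, then return the next n items and the remainder.

    Hypothetical helper: it mimics the counted-array idea on a plain list of
    strings and drops the leading count, as the docstring describes.
    """
    n = int(tokens[0])
    items = tokens[1:1 + n]
    if len(items) != n:
        raise ValueError("expected %d items, found %d" % (n, len(items)))
    return items, tokens[1 + n:]


items, rest = take_counted("3 a b c d".split())
print(items, rest)
```

The count token itself does not appear in the returned items, mirroring the "leading count token is suppressed" behaviour stated above.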
  3154 +def _flatten(L):
  3155 + ret = []
  3156 + for i in L:
  3157 + if isinstance(i,list):
  3158 + ret.extend(_flatten(i))
  3159 + else:
  3160 + ret.append(i)
  3161 + return ret
  3162 +
  3163 +def matchPreviousLiteral(expr):
  3164 + """Helper to define an expression that is indirectly defined from
  3165 + the tokens matched in a previous expression, that is, it looks
  3166 + for a 'repeat' of a previous expression. For example::
  3167 + first = Word(nums)
  3168 + second = matchPreviousLiteral(first)
  3169 + matchExpr = first + ":" + second
  3170 + will match C{"1:1"}, but not C{"1:2"}. Because this matches a
  3171 + previous literal, it will also match the leading C{"1:1"} in C{"1:10"}.
  3172 + If this is not desired, use C{matchPreviousExpr}.
  3173 + Do *not* use with packrat parsing enabled.
  3174 + """
  3175 + rep = Forward()
  3176 + def copyTokenToRepeater(s,l,t):
  3177 + if t:
  3178 + if len(t) == 1:
  3179 + rep << t[0]
  3180 + else:
  3181 + # flatten t tokens
  3182 + tflat = _flatten(t.asList())
  3183 + rep << And( [ Literal(tt) for tt in tflat ] )
  3184 + else:
  3185 + rep << Empty()
  3186 + expr.addParseAction(copyTokenToRepeater, callDuringTry=True)
  3187 + return rep
  3188 +
  3189 +def matchPreviousExpr(expr):
  3190 + """Helper to define an expression that is indirectly defined from
  3191 + the tokens matched in a previous expression, that is, it looks
  3192 + for a 'repeat' of a previous expression. For example::
  3193 + first = Word(nums)
  3194 + second = matchPreviousExpr(first)
  3195 + matchExpr = first + ":" + second
  3196 + will match C{"1:1"}, but not C{"1:2"}. Because this matches by
  3197 + expressions, it will *not* match the leading C{"1:1"} in C{"1:10"};
  3198 + the expressions are evaluated first, and then compared, so
  3199 + C{"1"} is compared with C{"10"}.
  3200 + Do *not* use with packrat parsing enabled.
  3201 + """
  3202 + rep = Forward()
  3203 + e2 = expr.copy()
  3204 + rep <<= e2
  3205 + def copyTokenToRepeater(s,l,t):
  3206 + matchTokens = _flatten(t.asList())
  3207 + def mustMatchTheseTokens(s,l,t):
  3208 + theseTokens = _flatten(t.asList())
  3209 + if theseTokens != matchTokens:
  3210 + raise ParseException("",0,"")
  3211 + rep.setParseAction( mustMatchTheseTokens, callDuringTry=True )
  3212 + expr.addParseAction(copyTokenToRepeater, callDuringTry=True)
  3213 + return rep
  3214 +
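Note on matchPreviousLiteral vs. matchPreviousExpr: the difference the two docstrings describe for the input "1:10" can be reproduced with the standard `re` module. This is only an analogy under stated assumptions, not how pyparsing implements either helper; the function names `repeat_literal` and `repeat_expr` are hypothetical.

```python
import re


def repeat_literal(text):
    """Literal repeat: the second part only has to start with the first text."""
    m = re.match(r"(\d+):\1", text)
    return m.group(0) if m else None


def repeat_expr(text):
    """Expression repeat: parse both parts fully, then compare the results."""
    m = re.match(r"(\d+):(\d+)", text)
    if m and m.group(1) == m.group(2):
        return m.group(0)
    return None


print(repeat_literal("1:10"))  # literal repeat accepts the leading "1:1"
print(repeat_expr("1:10"))     # expression repeat compares "1" with "10"
```

Both functions accept "1:1" and reject "1:2"; they differ only on inputs such as "1:10", which is the case the docstrings single out.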
  3215 +def _escapeRegexRangeChars(s):
  3216 + #~ escape these chars: ^-]
  3217 + for c in r"\^-]":
  3218 + s = s.replace(c,_bslash+c)
  3219 + s = s.replace("\n",r"\n")
  3220 + s = s.replace("\t",r"\t")
  3221 + return _ustr(s)
  3222 +
  3223 +def oneOf( strs, caseless=False, useRegex=True ):
  3224 + """Helper to quickly define a set of alternative Literals, and makes sure to do
  3225 + longest-first testing when there is a conflict, regardless of the input order,
  3226 + but returns a C{L{MatchFirst}} for best performance.
  3227 +
  3228 + Parameters:
  3229 + - strs - a string of space-delimited literals, or a list of string literals
  3230 + - caseless - (default=False) - treat all literals as caseless
  3231 + - useRegex - (default=True) - as an optimization, will generate a Regex
  3232 + object; otherwise, will generate a C{MatchFirst} object (if C{caseless=True}, or
  3233 + if creating a C{Regex} raises an exception)
  3234 + """
  3235 + if caseless:
  3236 + isequal = ( lambda a,b: a.upper() == b.upper() )
  3237 + masks = ( lambda a,b: b.upper().startswith(a.upper()) )
  3238 + parseElementClass = CaselessLiteral
  3239 + else:
  3240 + isequal = ( lambda a,b: a == b )
  3241 + masks = ( lambda a,b: b.startswith(a) )
  3242 + parseElementClass = Literal
  3243 +
  3244 + symbols = []
  3245 + if isinstance(strs,basestring):
  3246 + symbols = strs.split()
  3247 + elif isinstance(strs, collections.Sequence):
  3248 + symbols = list(strs[:])
  3249 + elif isinstance(strs, _generatorType):
  3250 + symbols = list(strs)
  3251 + else:
  3252 + warnings.warn("Invalid argument to oneOf, expected string or list",
  3253 + SyntaxWarning, stacklevel=2)
  3254 + if not symbols:
  3255 + return NoMatch()
  3256 +
  3257 + i = 0
  3258 + while i < len(symbols)-1:
  3259 + cur = symbols[i]
  3260 + for j,other in enumerate(symbols[i+1:]):
  3261 + if ( isequal(other, cur) ):
  3262 + del symbols[i+j+1]
  3263 + break
  3264 + elif ( masks(cur, other) ):
  3265 + del symbols[i+j+1]
  3266 + symbols.insert(i,other)
  3267 + cur = other
  3268 + break
  3269 + else:
  3270 + i += 1
  3271 +
  3272 + if not caseless and useRegex:
  3273 + #~ print (strs,"->", "|".join( [ _escapeRegexChars(sym) for sym in symbols] ))
  3274 + try:
  3275 + if len(symbols)==len("".join(symbols)):
  3276 + return Regex( "[%s]" % "".join(_escapeRegexRangeChars(sym) for sym in symbols) )
  3277 + else:
  3278 + return Regex( "|".join(re.escape(sym) for sym in symbols) )
  3279 + except:
  3280 + warnings.warn("Exception creating Regex for oneOf, building MatchFirst",
  3281 + SyntaxWarning, stacklevel=2)
  3282 +
  3283 +
  3284 + # last resort, just use MatchFirst
  3285 + return MatchFirst( [ parseElementClass(sym) for sym in symbols ] )
  3286 +
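Note on oneOf: the reordering loop above is what guarantees longest-first testing. The sketch below copies that loop into a standalone function for the case-sensitive branch so its effect can be observed with the standard `re` module; the function name `reorder_longest_first` is an assumption for illustration and is not part of pyparsing.

```python
import re


def reorder_longest_first(symbols):
    """Drop duplicates and move any symbol in front of a shorter prefix of it."""
    symbols = list(symbols)
    i = 0
    while i < len(symbols) - 1:
        cur = symbols[i]
        for j, other in enumerate(symbols[i + 1:]):
            if other == cur:
                # exact duplicate: remove the later copy
                del symbols[i + j + 1]
                break
            elif other.startswith(cur):
                # cur would mask other, so move other ahead of cur
                del symbols[i + j + 1]
                symbols.insert(i, other)
                cur = other
                break
        else:
            i += 1
    return symbols


ordered = reorder_longest_first(["<", "<=", "=", "<"])
pattern = re.compile("|".join(re.escape(sym) for sym in ordered))
print(ordered, pattern.match("<= 3").group(0))
```

Without the reordering, an alternation built as `<|<=|=` would stop at `<` when given `<= 3`, which is the conflict the docstring refers to.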
  3287 +def dictOf( key, value ):
  3288 + """Helper to easily and clearly define a dictionary by specifying the respective patterns
  3289 + for the key and value. Takes care of defining the C{L{Dict}}, C{L{ZeroOrMore}}, and C{L{Group}} tokens
  3290 + in the proper order. The key pattern can include delimiting markers or punctuation,
  3291 + as long as they are suppressed, thereby leaving the significant key text. The value
  3292 + pattern can include named results, so that the C{Dict} results can include named token
  3293 + fields.
  3294 + """
  3295 + return Dict( ZeroOrMore( Group ( key + value ) ) )
  3296 +
  3297 +def originalTextFor(expr, asString=True):
  3298 + """Helper to return the original, untokenized text for a given expression. Useful to
  3299 + restore the parsed fields of an HTML start tag into the raw tag text itself, or to
  3300 + revert separate tokens with intervening whitespace back to the original matching
  3301 + input text. Simpler to use than the parse action C{L{keepOriginalText}}, and does not
  3302 + require the inspect module to chase up the call stack. By default, returns a
  3303 + string containing the original parsed text.
  3304 +
  3305 + If the optional C{asString} argument is passed as C{False}, then the return value is a
  3306 + C{L{ParseResults}} containing any results names that were originally matched, and a
  3307 + single token containing the original matched text from the input string. So if
  3308 + the expression passed to C{L{originalTextFor}} contains expressions with defined
  3309 + results names, you must set C{asString} to C{False} if you want to preserve those
  3310 + results name values."""
  3311 + locMarker = Empty().setParseAction(lambda s,loc,t: loc)
  3312 + endlocMarker = locMarker.copy()
  3313 + endlocMarker.callPreparse = False
  3314 + matchExpr = locMarker("_original_start") + expr + endlocMarker("_original_end")
  3315 + if asString:
  3316 + extractText = lambda s,l,t: s[t._original_start:t._original_end]
  3317 + else:
  3318 + def extractText(s,l,t):
  3319 + del t[:]
  3320 + t.insert(0, s[t._original_start:t._original_end])
  3321 + del t["_original_start"]
  3322 + del t["_original_end"]
  3323 + matchExpr.setParseAction(extractText)
  3324 + return matchExpr
  3325 +
  3326 +def ungroup(expr):
  3327 + """Helper to undo pyparsing's default grouping of And expressions, even
  3328 + if all but one are non-empty."""
  3329 + return TokenConverter(expr).setParseAction(lambda t:t[0])
  3330 +
  3331 +def locatedExpr(expr):
  3332 + """Helper to decorate a returned token with its starting and ending locations in the input string.
  3333 + This helper adds the following results names:
  3334 + - locn_start = location where matched expression begins
  3335 + - locn_end = location where matched expression ends
  3336 + - value = the actual parsed results
  3337 +
  3338 + Be careful if the input text contains C{<TAB>} characters, you may want to call
  3339 + C{L{ParserElement.parseWithTabs}}
  3340 + """
  3341 + locator = Empty().setParseAction(lambda s,l,t: l)
  3342 + return Group(locator("locn_start") + expr("value") + locator.copy().leaveWhitespace()("locn_end"))
  3343 +
  3344 +
  3345 +# convenience constants for positional expressions
  3346 +empty = Empty().setName("empty")
  3347 +lineStart = LineStart().setName("lineStart")
  3348 +lineEnd = LineEnd().setName("lineEnd")
  3349 +stringStart = StringStart().setName("stringStart")
  3350 +stringEnd = StringEnd().setName("stringEnd")
  3351 +
  3352 +_escapedPunc = Word( _bslash, r"\[]-*.$+^?()~ ", exact=2 ).setParseAction(lambda s,l,t:t[0][1])
  3353 +_escapedHexChar = Regex(r"\\0?[xX][0-9a-fA-F]+").setParseAction(lambda s,l,t:unichr(int(t[0].lstrip(r'\0x'),16)))
  3354 +_escapedOctChar = Regex(r"\\0[0-7]+").setParseAction(lambda s,l,t:unichr(int(t[0][1:],8)))
  3355 +_singleChar = _escapedPunc | _escapedHexChar | _escapedOctChar | Word(printables, excludeChars=r'\]', exact=1)
  3356 +_charRange = Group(_singleChar + Suppress("-") + _singleChar)
  3357 +_reBracketExpr = Literal("[") + Optional("^").setResultsName("negate") + Group( OneOrMore( _charRange | _singleChar ) ).setResultsName("body") + "]"
  3358 +
  3359 +def srange(s):
  3360 + r"""Helper to easily define string ranges for use in Word construction. Borrows
  3361 + syntax from regexp '[]' string range definitions::
  3362 + srange("[0-9]") -> "0123456789"
  3363 + srange("[a-z]") -> "abcdefghijklmnopqrstuvwxyz"
  3364 + srange("[a-z$_]") -> "abcdefghijklmnopqrstuvwxyz$_"
  3365 + The input string must be enclosed in []'s, and the returned string is the expanded
  3366 + character set joined into a single string.
  3367 + The values enclosed in the []'s may be::
  3368 + a single character
  3369 + an escaped character with a leading backslash (such as \- or \])
  3370 + an escaped hex character with a leading '\x' (\x21, which is a '!' character)
  3371 + (\0x## is also supported for backwards compatibility)
  3372 + an escaped octal character with a leading '\0' (\041, which is a '!' character)
  3373 + a range of any of the above, separated by a dash ('a-z', etc.)
  3374 + any combination of the above ('aeiouy', 'a-zA-Z0-9_$', etc.)
  3375 + """
  3376 + _expanded = lambda p: p if not isinstance(p,ParseResults) else ''.join(unichr(c) for c in range(ord(p[0]),ord(p[1])+1))
  3377 + try:
  3378 + return "".join(_expanded(part) for part in _reBracketExpr.parseString(s).body)
  3379 + except:
  3380 + return ""
  3381 +
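Note on srange: the expansion shown in the docstring examples can be sketched in plain Python. The version below is a deliberately simplified, hypothetical helper (`expand_simple_range` is not a pyparsing name): it handles only single characters and `a-z` style ranges, and it does not support the escaped, hex, or octal forms that the real helper accepts.

```python
def expand_simple_range(spec):
    """Expand a bracketed spec such as "[a-z$_]" into its character set.

    Simplified sketch: no escape handling. Returns "" when the spec is not
    enclosed in brackets, similar in spirit to srange returning "" on failure.
    """
    if not (spec.startswith("[") and spec.endswith("]")):
        return ""
    body = spec[1:-1]
    out = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # a range: first char, dash, last char (inclusive)
            out.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


print(expand_simple_range("[0-9]"))
print(expand_simple_range("[a-z$_]"))
```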
  3382 +def matchOnlyAtCol(n):
  3383 + """Helper method for defining parse actions that require matching at a specific
  3384 + column in the input text.
  3385 + """
  3386 + def verifyCol(strg,locn,toks):
  3387 + if col(locn,strg) != n:
  3388 + raise ParseException(strg,locn,"matched token not at column %d" % n)
  3389 + return verifyCol
  3390 +
  3391 +def replaceWith(replStr):
  3392 + """Helper method for common parse actions that simply return a literal value. Especially
  3393 + useful when used with C{L{transformString<ParserElement.transformString>}()}.
  3394 + """
  3395 + def _replFunc(*args):
  3396 + return [replStr]
  3397 + return _replFunc
  3398 +
  3399 +def removeQuotes(s,l,t):
  3400 + """Helper parse action for removing quotation marks from parsed quoted strings.
  3401 + To use, add this parse action to quoted string using::
  3402 + quotedString.setParseAction( removeQuotes )
  3403 + """
  3404 + return t[0][1:-1]
  3405 +
  3406 +def upcaseTokens(s,l,t):
  3407 + """Helper parse action to convert tokens to upper case."""
  3408 + return [ tt.upper() for tt in map(_ustr,t) ]
  3409 +
  3410 +def downcaseTokens(s,l,t):
  3411 + """Helper parse action to convert tokens to lower case."""
  3412 + return [ tt.lower() for tt in map(_ustr,t) ]
  3413 +
  3414 +def keepOriginalText(s,startLoc,t):
  3415 + """DEPRECATED - use new helper method C{L{originalTextFor}}.
  3416 + Helper parse action to preserve original parsed text,
  3417 + overriding any nested parse actions."""
  3418 + try:
  3419 + endloc = getTokensEndLoc()
  3420 + except ParseException:
  3421 + raise ParseFatalException("incorrect usage of keepOriginalText - may only be called as a parse action")
  3422 + del t[:]
  3423 + t += ParseResults(s[startLoc:endloc])
  3424 + return t
  3425 +
  3426 +def getTokensEndLoc():
  3427 + """Method to be called from within a parse action to determine the end
  3428 + location of the parsed tokens."""
  3429 + import inspect
  3430 + fstack = inspect.stack()
  3431 + try:
  3432 + # search up the stack (through intervening argument normalizers) for correct calling routine
  3433 + for f in fstack[2:]:
  3434 + if f[3] == "_parseNoCache":
  3435 + endloc = f[0].f_locals["loc"]
  3436 + return endloc
  3437 + else:
  3438 + raise ParseFatalException("incorrect usage of getTokensEndLoc - may only be called from within a parse action")
  3439 + finally:
  3440 + del fstack
  3441 +
  3442 +def _makeTags(tagStr, xml):
  3443 + """Internal helper to construct opening and closing tag expressions, given a tag name"""
  3444 + if isinstance(tagStr,basestring):
  3445 + resname = tagStr
  3446 + tagStr = Keyword(tagStr, caseless=not xml)
  3447 + else:
  3448 + resname = tagStr.name
  3449 +
  3450 + tagAttrName = Word(alphas,alphanums+"_-:")
  3451 + if (xml):
  3452 + tagAttrValue = dblQuotedString.copy().setParseAction( removeQuotes )
  3453 + openTag = Suppress("<") + tagStr("tag") + \
  3454 + Dict(ZeroOrMore(Group( tagAttrName + Suppress("=") + tagAttrValue ))) + \
  3455 + Optional("/",default=[False]).setResultsName("empty").setParseAction(lambda s,l,t:t[0]=='/') + Suppress(">")
  3456 + else:
  3457 + printablesLessRAbrack = "".join(c for c in printables if c not in ">")
  3458 + tagAttrValue = quotedString.copy().setParseAction( removeQuotes ) | Word(printablesLessRAbrack)
  3459 + openTag = Suppress("<") + tagStr("tag") + \
  3460 + Dict(ZeroOrMore(Group( tagAttrName.setParseAction(downcaseTokens) + \
  3461 + Optional( Suppress("=") + tagAttrValue ) ))) + \
  3462 + Optional("/",default=[False]).setResultsName("empty").setParseAction(lambda s,l,t:t[0]=='/') + Suppress(">")
  3463 + closeTag = Combine(_L("</") + tagStr + ">")
  3464 +
  3465 + openTag = openTag.setResultsName("start"+"".join(resname.replace(":"," ").title().split())).setName("<%s>" % tagStr)
  3466 + closeTag = closeTag.setResultsName("end"+"".join(resname.replace(":"," ").title().split())).setName("</%s>" % tagStr)
  3467 + openTag.tag = resname
  3468 + closeTag.tag = resname
  3469 + return openTag, closeTag
  3470 +
  3471 +def makeHTMLTags(tagStr):
  3472 + """Helper to construct opening and closing tag expressions for HTML, given a tag name"""
  3473 + return _makeTags( tagStr, False )
  3474 +
  3475 +def makeXMLTags(tagStr):
  3476 + """Helper to construct opening and closing tag expressions for XML, given a tag name"""
  3477 + return _makeTags( tagStr, True )
  3478 +
  3479 +def withAttribute(*args,**attrDict):
  3480 + """Helper to create a validating parse action to be used with start tags created
  3481 + with C{L{makeXMLTags}} or C{L{makeHTMLTags}}. Use C{withAttribute} to qualify a starting tag
  3482 + with a required attribute value, to avoid false matches on common tags such as
  3483 + C{<TD>} or C{<DIV>}.
  3484 +
  3485 + Call C{withAttribute} with a series of attribute names and values. Specify the list
  3486 + of filter attributes names and values as:
  3487 + - keyword arguments, as in C{(align="right")}, or
  3488 + - as an explicit dict with C{**} operator, when an attribute name is also a Python
  3489 + reserved word, as in C{**{"class":"Customer", "align":"right"}}
  3490 + - a list of name-value tuples, as in ( ("ns1:class", "Customer"), ("ns2:align","right") )
  3491 + For attribute names with a namespace prefix, you must use the second or third form. Attribute
  3492 + names are matched insensitive to upper/lower case.
  3493 +
  3494 + To verify that the attribute exists, but without specifying a value, pass
  3495 + C{withAttribute.ANY_VALUE} as the value.
  3496 + """
  3497 + if args:
  3498 + attrs = args[:]
  3499 + else:
  3500 + attrs = attrDict.items()
  3501 + attrs = [(k,v) for k,v in attrs]
  3502 + def pa(s,l,tokens):
  3503 + for attrName,attrValue in attrs:
  3504 + if attrName not in tokens:
  3505 + raise ParseException(s,l,"no matching attribute " + attrName)
  3506 + if attrValue != withAttribute.ANY_VALUE and tokens[attrName] != attrValue:
  3507 + raise ParseException(s,l,"attribute '%s' has value '%s', must be '%s'" %
  3508 + (attrName, tokens[attrName], attrValue))
  3509 + return pa
  3510 +withAttribute.ANY_VALUE = object()
  3511 +
  3512 +opAssoc = _Constants()
  3513 +opAssoc.LEFT = object()
  3514 +opAssoc.RIGHT = object()
  3515 +
  3516 +def infixNotation( baseExpr, opList, lpar=Suppress('('), rpar=Suppress(')') ):
  3517 + """Helper method for constructing grammars of expressions made up of
  3518 + operators working in a precedence hierarchy. Operators may be unary or
  3519 + binary, left- or right-associative. Parse actions can also be attached
  3520 + to operator expressions.
  3521 +
  3522 + Parameters:
  3523 + - baseExpr - expression representing the most basic element for the nested expression grammar
  3524 + - opList - list of tuples, one for each operator precedence level in the
  3525 + expression grammar; each tuple is of the form
  3526 + (opExpr, numTerms, rightLeftAssoc, parseAction), where:
  3527 + - opExpr is the pyparsing expression for the operator;
  3528 + may also be a string, which will be converted to a Literal;
  3529 + if numTerms is 3, opExpr is a tuple of two expressions, for the
  3530 + two operators separating the 3 terms
  3531 + - numTerms is the number of terms for this operator (must
  3532 + be 1, 2, or 3)
  3533 + - rightLeftAssoc is the indicator whether the operator is
  3534 + right or left associative, using the pyparsing-defined
  3535 + constants C{opAssoc.RIGHT} and C{opAssoc.LEFT}.
  3536 + - parseAction is the parse action to be associated with
  3537 + expressions matching this operator expression (the
  3538 + parse action tuple member may be omitted)
  3539 + - lpar - expression for matching left-parentheses (default=Suppress('('))
  3540 + - rpar - expression for matching right-parentheses (default=Suppress(')'))
  3541 + """
  3542 + ret = Forward()
  3543 + lastExpr = baseExpr | ( lpar + ret + rpar )
  3544 + for i,operDef in enumerate(opList):
  3545 + opExpr,arity,rightLeftAssoc,pa = (operDef + (None,))[:4]
  3546 + if arity == 3:
  3547 + if opExpr is None or len(opExpr) != 2:
  3548 + raise ValueError("if numterms=3, opExpr must be a tuple or list of two expressions")
  3549 + opExpr1, opExpr2 = opExpr
  3550 + thisExpr = Forward()#.setName("expr%d" % i)
  3551 + if rightLeftAssoc == opAssoc.LEFT:
  3552 + if arity == 1:
  3553 + matchExpr = FollowedBy(lastExpr + opExpr) + Group( lastExpr + OneOrMore( opExpr ) )
  3554 + elif arity == 2:
  3555 + if opExpr is not None:
  3556 + matchExpr = FollowedBy(lastExpr + opExpr + lastExpr) + Group( lastExpr + OneOrMore( opExpr + lastExpr ) )
  3557 + else:
  3558 + matchExpr = FollowedBy(lastExpr+lastExpr) + Group( lastExpr + OneOrMore(lastExpr) )
  3559 + elif arity == 3:
  3560 + matchExpr = FollowedBy(lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr) + \
  3561 + Group( lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr )
  3562 + else:
  3563 + raise ValueError("operator must be unary (1), binary (2), or ternary (3)")
  3564 + elif rightLeftAssoc == opAssoc.RIGHT:
  3565 + if arity == 1:
  3566 + # try to avoid LR with this extra test
  3567 + if not isinstance(opExpr, Optional):
  3568 + opExpr = Optional(opExpr)
  3569 + matchExpr = FollowedBy(opExpr.expr + thisExpr) + Group( opExpr + thisExpr )
  3570 + elif arity == 2:
  3571 + if opExpr is not None:
  3572 + matchExpr = FollowedBy(lastExpr + opExpr + thisExpr) + Group( lastExpr + OneOrMore( opExpr + thisExpr ) )
  3573 + else:
  3574 + matchExpr = FollowedBy(lastExpr + thisExpr) + Group( lastExpr + OneOrMore( thisExpr ) )
  3575 + elif arity == 3:
  3576 + matchExpr = FollowedBy(lastExpr + opExpr1 + thisExpr + opExpr2 + thisExpr) + \
  3577 + Group( lastExpr + opExpr1 + thisExpr + opExpr2 + thisExpr )
  3578 + else:
  3579 + raise ValueError("operator must be unary (1), binary (2), or ternary (3)")
  3580 + else:
  3581 + raise ValueError("operator must indicate right or left associativity")
  3582 + if pa:
  3583 + matchExpr.setParseAction( pa )
  3584 + thisExpr <<= ( matchExpr | lastExpr )
  3585 + lastExpr = thisExpr
  3586 + ret <<= lastExpr
  3587 + return ret
  3588 +operatorPrecedence = infixNotation
  3589 +
  3590 +dblQuotedString = Regex(r'"(?:[^"\n\r\\]|(?:"")|(?:\\x[0-9a-fA-F]+)|(?:\\.))*"').setName("string enclosed in double quotes")
  3591 +sglQuotedString = Regex(r"'(?:[^'\n\r\\]|(?:'')|(?:\\x[0-9a-fA-F]+)|(?:\\.))*'").setName("string enclosed in single quotes")
  3592 +quotedString = Regex(r'''(?:"(?:[^"\n\r\\]|(?:"")|(?:\\x[0-9a-fA-F]+)|(?:\\.))*")|(?:'(?:[^'\n\r\\]|(?:'')|(?:\\x[0-9a-fA-F]+)|(?:\\.))*')''').setName("quotedString using single or double quotes")
  3593 +unicodeString = Combine(_L('u') + quotedString.copy())
  3594 +
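Note on the quoted-string patterns: the regular expression passed to `Regex` for `dblQuotedString` can be exercised directly with the standard `re` module to see which escapes it accepts. The helper name `first_double_quoted` is hypothetical; the pattern string itself is copied from the definition above.

```python
import re

# Same pattern text as dblQuotedString above, compiled with the stdlib.
DBL_QUOTED = re.compile(r'"(?:[^"\n\r\\]|(?:"")|(?:\\x[0-9a-fA-F]+)|(?:\\.))*"')


def first_double_quoted(text):
    """Return the first double-quoted string in text, quotes included."""
    m = DBL_QUOTED.search(text)
    return m.group(0) if m else None


quoted = first_double_quoted(r'say "a \"quoted\" word" now')
print(quoted)
# Slicing off the first and last character is what removeQuotes does.
print(quoted[1:-1])
```

Backslash-escaped quotes and doubled quotes (`""`) are both treated as part of the string body, while a bare newline ends the match attempt.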
  3595 +def nestedExpr(opener="(", closer=")", content=None, ignoreExpr=quotedString.copy()):
  3596 + """Helper method for defining nested lists enclosed in opening and closing
  3597 + delimiters ("(" and ")" are the default).
  3598 +
  3599 + Parameters:
  3600 + - opener - opening character for a nested list (default="("); can also be a pyparsing expression
  3601 + - closer - closing character for a nested list (default=")"); can also be a pyparsing expression
  3602 + - content - expression for items within the nested lists (default=None)
  3603 + - ignoreExpr - expression for ignoring opening and closing delimiters (default=quotedString)
  3604 +
  3605 + If an expression is not provided for the content argument, the nested
  3606 + expression will capture all whitespace-delimited content between delimiters
  3607 + as a list of separate values.
  3608 +
  3609 + Use the C{ignoreExpr} argument to define expressions that may contain
  3610 + opening or closing characters that should not be treated as opening
  3611 + or closing characters for nesting, such as quotedString or a comment
  3612 + expression. Specify multiple expressions using an C{L{Or}} or C{L{MatchFirst}}.
  3613 + The default is L{quotedString}, but if no expressions are to be ignored,
  3614 + then pass C{None} for this argument.
  3615 + """
  3616 + if opener == closer:
  3617 + raise ValueError("opening and closing strings cannot be the same")
  3618 + if content is None:
  3619 + if isinstance(opener,basestring) and isinstance(closer,basestring):
  3620 + if len(opener) == 1 and len(closer)==1:
  3621 + if ignoreExpr is not None:
  3622 + content = (Combine(OneOrMore(~ignoreExpr +
  3623 + CharsNotIn(opener+closer+ParserElement.DEFAULT_WHITE_CHARS,exact=1))
  3624 + ).setParseAction(lambda t:t[0].strip()))
  3625 + else:
  3626 + content = (empty.copy()+CharsNotIn(opener+closer+ParserElement.DEFAULT_WHITE_CHARS
  3627 + ).setParseAction(lambda t:t[0].strip()))
  3628 + else:
  3629 + if ignoreExpr is not None:
  3630 + content = (Combine(OneOrMore(~ignoreExpr +
  3631 + ~Literal(opener) + ~Literal(closer) +
  3632 + CharsNotIn(ParserElement.DEFAULT_WHITE_CHARS,exact=1))
  3633 + ).setParseAction(lambda t:t[0].strip()))
  3634 + else:
  3635 + content = (Combine(OneOrMore(~Literal(opener) + ~Literal(closer) +
  3636 + CharsNotIn(ParserElement.DEFAULT_WHITE_CHARS,exact=1))
  3637 + ).setParseAction(lambda t:t[0].strip()))
  3638 + else:
  3639 + raise ValueError("opening and closing arguments must be strings if no content expression is given")
  3640 + ret = Forward()
  3641 + if ignoreExpr is not None:
  3642 + ret <<= Group( Suppress(opener) + ZeroOrMore( ignoreExpr | ret | content ) + Suppress(closer) )
  3643 + else:
  3644 + ret <<= Group( Suppress(opener) + ZeroOrMore( ret | content ) + Suppress(closer) )
  3645 + return ret
  3646 +
  3647 +def indentedBlock(blockStatementExpr, indentStack, indent=True):
  3648 + """Helper method for defining space-delimited indentation blocks, such as
  3649 + those used to define block statements in Python source code.
  3650 +
  3651 + Parameters:
  3652 + - blockStatementExpr - expression defining syntax of statement that
  3653 + is repeated within the indented block
  3654 + - indentStack - list created by caller to manage indentation stack
  3655 + (multiple statementWithIndentedBlock expressions within a single grammar
  3656 + should share a common indentStack)
  3657 + - indent - boolean indicating whether block must be indented beyond
  3658 + the current level; set to False for block of left-most statements
  3659 + (default=True)
  3660 +
  3661 + A valid block must contain at least one C{blockStatement}.
  3662 + """
  3663 + def checkPeerIndent(s,l,t):
  3664 + if l >= len(s): return
  3665 + curCol = col(l,s)
  3666 + if curCol != indentStack[-1]:
  3667 + if curCol > indentStack[-1]:
  3668 + raise ParseFatalException(s,l,"illegal nesting")
  3669 + raise ParseException(s,l,"not a peer entry")
  3670 +
  3671 + def checkSubIndent(s,l,t):
  3672 + curCol = col(l,s)
  3673 + if curCol > indentStack[-1]:
  3674 + indentStack.append( curCol )
  3675 + else:
  3676 + raise ParseException(s,l,"not a subentry")
  3677 +
  3678 + def checkUnindent(s,l,t):
  3679 + if l >= len(s): return
  3680 + curCol = col(l,s)
  3681 + if not(indentStack and curCol < indentStack[-1] and curCol <= indentStack[-2]):
  3682 + raise ParseException(s,l,"not an unindent")
  3683 + indentStack.pop()
  3684 +
  3685 + NL = OneOrMore(LineEnd().setWhitespaceChars("\t ").suppress())
  3686 + INDENT = Empty() + Empty().setParseAction(checkSubIndent)
  3687 + PEER = Empty().setParseAction(checkPeerIndent)
  3688 + UNDENT = Empty().setParseAction(checkUnindent)
  3689 + if indent:
  3690 + smExpr = Group( Optional(NL) +
  3691 + #~ FollowedBy(blockStatementExpr) +
  3692 + INDENT + (OneOrMore( PEER + Group(blockStatementExpr) + Optional(NL) )) + UNDENT)
  3693 + else:
  3694 + smExpr = Group( Optional(NL) +
  3695 + (OneOrMore( PEER + Group(blockStatementExpr) + Optional(NL) )) )
  3696 + blockStatementExpr.ignore(_bslash + LineEnd())
  3697 + return smExpr
  3698 +
  3699 +alphas8bit = srange(r"[\0xc0-\0xd6\0xd8-\0xf6\0xf8-\0xff]")
  3700 +punc8bit = srange(r"[\0xa1-\0xbf\0xd7\0xf7]")
  3701 +
  3702 +anyOpenTag,anyCloseTag = makeHTMLTags(Word(alphas,alphanums+"_:"))
  3703 +commonHTMLEntity = Combine(_L("&") + oneOf("gt lt amp nbsp quot").setResultsName("entity") +";").streamline()
  3704 +_htmlEntityMap = dict(zip("gt lt amp nbsp quot".split(),'><& "'))
  3705 +replaceHTMLEntity = lambda t : t.entity in _htmlEntityMap and _htmlEntityMap[t.entity] or None
  3706 +
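Note on replaceHTMLEntity: the `zip` call above pairs each entity name with one character of the string `'><& "'`. The sketch below rebuilds the same mapping and applies it with `re.sub` instead of a pyparsing parse action; `replace_entities` is a hypothetical name used only for this illustration.

```python
import re

# Same construction as _htmlEntityMap above: names zipped with characters.
ENTITY_MAP = dict(zip("gt lt amp nbsp quot".split(), '><& "'))


def replace_entities(text):
    """Replace the five common entities; leave anything else untouched."""
    return re.sub(r"&(gt|lt|amp|nbsp|quot);",
                  lambda m: ENTITY_MAP[m.group(1)], text)


print(ENTITY_MAP)
print(replace_entities("a &lt; b &amp;&amp; c &gt; d"))
```

Note that `nbsp` maps to an ordinary space here, because that is the fourth character of the zipped string.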
  3707 +# it's easy to get these comment structures wrong - they're very common, so may as well make them available
  3708 +cStyleComment = Regex(r"/\*(?:[^*]*\*+)+?/").setName("C style comment")
  3709 +
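Note on cStyleComment: the pattern above relies on a lazy repetition so that a `*` inside the comment body does not end the match early. The sketch below runs the same pattern text through the standard `re` module; `find_c_comments` is a hypothetical helper name.

```python
import re

# Same pattern text as cStyleComment above.
C_COMMENT = re.compile(r"/\*(?:[^*]*\*+)+?/")


def find_c_comments(source):
    """Return every /* ... */ comment found in source, in order."""
    return C_COMMENT.findall(source)


print(find_c_comments("a /* x * y */ b /* z */"))
```

Each comment is matched separately, and an unterminated `/*` produces no match at all.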
  3710 +htmlComment = Regex(r"<!--[\s\S]*?-->")
  3711 +restOfLine = Regex(r".*").leaveWhitespace()
  3712 +dblSlashComment = Regex(r"\/\/(\\\n|.)*").setName("// comment")
  3713 +cppStyleComment = Regex(r"/(?:\*(?:[^*]*\*+)+?/|/[^\n]*(?:\n[^\n]*)*?(?:(?<!\\)|\Z))").setName("C++ style comment")
  3714 +
  3715 +javaStyleComment = cppStyleComment
  3716 +pythonStyleComment = Regex(r"#.*").setName("Python style comment")
  3717 +_commasepitem = Combine(OneOrMore(Word(printables, excludeChars=',') +
  3718 + Optional( Word(" \t") +
  3719 + ~Literal(",") + ~LineEnd() ) ) ).streamline().setName("commaItem")
  3720 +commaSeparatedList = delimitedList( Optional( quotedString.copy() | _commasepitem, default="") ).setName("commaSeparatedList")
  3721 +
  3722 +
  3723 +if __name__ == "__main__":
  3724 +
  3725 + def test( teststring ):
  3726 + try:
  3727 + tokens = simpleSQL.parseString( teststring )
  3728 + tokenlist = tokens.asList()
  3729 + print (teststring + "->" + str(tokenlist))
  3730 + print ("tokens = " + str(tokens))
  3731 + print ("tokens.columns = " + str(tokens.columns))
  3732 + print ("tokens.tables = " + str(tokens.tables))
  3733 + print (tokens.asXML("SQL",True))
  3734 + except ParseBaseException as err:
  3735 + print (teststring + "->")
  3736 + print (err.line)
  3737 + print (" "*(err.column-1) + "^")
  3738 + print (err)
  3739 + print()
  3740 +
  3741 + selectToken = CaselessLiteral( "select" )
  3742 + fromToken = CaselessLiteral( "from" )
  3743 +
  3744 + ident = Word( alphas, alphanums + "_$" )
  3745 + columnName = delimitedList( ident, ".", combine=True ).setParseAction( upcaseTokens )
  3746 + columnNameList = Group( delimitedList( columnName ) )#.setName("columns")
  3747 + tableName = delimitedList( ident, ".", combine=True ).setParseAction( upcaseTokens )
  3748 + tableNameList = Group( delimitedList( tableName ) )#.setName("tables")
  3749 + simpleSQL = ( selectToken + \
  3750 + ( '*' | columnNameList ).setResultsName( "columns" ) + \
  3751 + fromToken + \
  3752 + tableNameList.setResultsName( "tables" ) )
  3753 +
  3754 + test( "SELECT * from XYZZY, ABC" )
  3755 + test( "select * from SYS.XYZZY" )
  3756 + test( "Select A from Sys.dual" )
  3757 + test( "Select AA,BB,CC from Sys.dual" )
  3758 + test( "Select A, B, C from Sys.dual" )
  3759 + test( "Select A, B, C from Sys.dual" )
  3760 + test( "Xelect A, B, C from Sys.dual" )
  3761 + test( "Select A, B, C frox Sys.dual" )
  3762 + test( "Select" )
  3763 + test( "Select ^^^ frox Sys.dual" )
  3764 + test( "Select A, B, C from Sys.dual, Table2 " )