java: How to build an optimal AST of a regular expression?

jeudi 14 juillet 2016

How to build an optimal AST of a regular expression?

I am trying to build a parser for regular expressions by building the abstract syntax tree from scratch (without any project dependencies or tools such as the cup parser in Java, etc). I don't want to save all the information contained in the regex but, instead, I want to simplify it as much as possible.

As an example, x::=y|z should lead to the same AST as the character class x::=[yz]. However, since regexes can (and do) get very complicated, I can't decide what equivalences to implement. For instance, I don't know how to save negative choice x::=[^b], which would be equivalent to x::=a|c|d|e|...

What abstractions would you make? Could some of those abstractions lead to wrong ASTs later on?

java

jeudi 14 juillet 2016

How to build an optimal AST of a regular expression?

Aucun commentaire:

Enregistrer un commentaire