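With this in mind, how is this implemented in Node or V8? Would it be possible to improve performance with a JS implementation of a Thompson NFA, perhaps one that supports only a limited subset of features (possibly dropping lookahead or other "advanced" features)?
Solution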
Here are some quotes about the implementation of this engine:
A fundamental decision we made early in the design of Irregexp was
that we would be willing to spend extra time compiling a regular
expression if that would make running it faster. During compilation
Irregexp first converts a regexp into an intermediate automaton
representation. This is in many ways the “natural” and most accessible
representation and makes it much easier to analyze and optimize the
regexp. For instance, when compiling /Sun|Mon/ the automaton
representation lets us recognize that both alternatives have an ‘n’ as
their third character. We can quickly scan the input until we find an
‘n’ and then start to match the regexp two characters earlier.
Irregexp looks up to four characters ahead and matches up to four
characters at a time.

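The /Sun|Mon/ observation can be made concrete with a minimal sketch. It is plain JavaScript rather than the machine code Irregexp emits, and `findSunOrMon` is a hypothetical helper written only for this illustration: it scans for the shared 'n' and then checks the candidate that starts two characters earlier.

```javascript
// Hypothetical illustration of the "scan for the shared character" idea.
// This is not V8's code; it only mirrors the strategy described in the quote.
function findSunOrMon(input) {
  // Both alternatives have 'n' at offset 2, so a match cannot end before index 2.
  let nPos = input.indexOf("n", 2);
  while (nPos !== -1) {
    const start = nPos - 2; // the candidate match begins two characters earlier
    const candidate = input.slice(start, nPos + 1);
    if (candidate === "Sun" || candidate === "Mon") {
      return start; // index of the first match, like String.prototype.search
    }
    nPos = input.indexOf("n", nPos + 1);
  }
  return -1; // no match
}

console.log(findSunOrMon("See you on Monday")); // 11
```

The point of the trick is that most input positions are rejected by a single character comparison before any alternative is tried.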
After optimization we generate native machine code which uses
backtracking to try different alternatives. Backtracking can be
time-consuming so we use optimizations to avoid as much of it as we
can. There are techniques to avoid backtracking altogether but the
nature of regexps in JavaScript makes it difficult to apply them in
our case, though it is something we may implement in the future.
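
One technique that avoids backtracking is the Thompson NFA simulation the question asks about. The following is a minimal sketch under stated assumptions, not how V8 or Node implement anything: it supports only literals, `.`, `|`, `*`, `+`, `?` and parentheses, matches the whole input only, and every name in it (`compileToNfa`, `epsilonClosure`, `matchesWhole`) is hypothetical.

```javascript
// Thompson-NFA sketch: whole-string matching over a small assumed syntax
// (literals, '.', '|', '*', '+', '?', '(' ')'). No escapes, classes,
// anchors, captures, or lookaround. All names here are hypothetical.
function compileToNfa(pattern) {
  const states = [];
  // Each state has at most one character edge (ch -> next) plus epsilon edges.
  const add = (ch = null, next = -1) => states.push({ ch, next, eps: [] }) - 1;
  const link = (from, ...to) => states[from].eps.push(...to);
  const empty = () => { const s = add(); return { start: s, end: s }; };
  const literal = (ch) => { const end = add(); return { start: add(ch, end), end }; };
  const concat = (a, b) => { link(a.end, b.start); return { start: a.start, end: b.end }; };
  const alt = (a, b) => {
    const start = add(), end = add();
    link(start, a.start, b.start); link(a.end, end); link(b.end, end);
    return { start, end };
  };
  // canSkip: zero occurrences allowed; canLoop: more than one occurrence allowed.
  const repeat = (a, canSkip, canLoop) => {
    const start = add(), end = add();
    link(start, a.start); link(a.end, end);
    if (canSkip) link(start, end);
    if (canLoop) link(a.end, a.start);
    return { start, end };
  };

  // Recursive-descent parser that builds NFA fragments directly.
  let pos = 0;
  const more = () => pos < pattern.length;
  function parseAlt() {
    let frag = parseConcat();
    while (more() && pattern[pos] === "|") { pos++; frag = alt(frag, parseConcat()); }
    return frag;
  }
  function parseConcat() {
    let frag = empty();
    while (more() && pattern[pos] !== "|" && pattern[pos] !== ")") {
      frag = concat(frag, parseRepeat());
    }
    return frag;
  }
  function parseRepeat() {
    let frag = parseAtom();
    while (more() && "*+?".includes(pattern[pos])) {
      const op = pattern[pos++];
      frag = repeat(frag, op !== "+", op !== "?");
    }
    return frag;
  }
  function parseAtom() {
    const ch = pattern[pos++];
    if (ch === "(") {
      const inner = parseAlt();
      if (pattern[pos++] !== ")") throw new SyntaxError("missing ')'");
      return inner;
    }
    if ("*+?".includes(ch)) throw new SyntaxError("nothing to repeat");
    return literal(ch); // "." is stored as-is and treated as a wildcard when matching
  }
  const frag = parseAlt();
  if (more()) throw new SyntaxError("unmatched ')'");
  return { states, start: frag.start, accept: frag.end };
}

// Follow epsilon edges from the seed states; the Set also guards against cycles.
function epsilonClosure(nfa, seeds) {
  const seen = new Set();
  const stack = [...seeds];
  while (stack.length > 0) {
    const s = stack.pop();
    if (seen.has(s)) continue;
    seen.add(s);
    stack.push(...nfa.states[s].eps);
  }
  return seen;
}

// Advance every live state in lockstep: one pass per input character, no backtracking.
function matchesWhole(nfa, input) {
  let current = epsilonClosure(nfa, [nfa.start]);
  for (let i = 0; i < input.length && current.size > 0; i++) {
    const next = [];
    for (const s of current) {
      const { ch, next: target } = nfa.states[s];
      if (ch !== null && (ch === "." || ch === input[i])) next.push(target);
    }
    current = epsilonClosure(nfa, next);
  }
  return current.has(nfa.accept);
}

const nfa = compileToNfa("(a*)*b");
console.log(matchesWhole(nfa, "aaab"));           // true
console.log(matchesWhole(nfa, "a".repeat(5000))); // false
```

Because the set of live states can never exceed the number of NFA states, the work per input character is bounded by the pattern size, even for nested quantifiers such as `(a*)*b` that are a classic worst case for backtracking matchers. Features such as backreferences do not fit this state-set model, which is consistent with the question's idea of supporting only a restricted subset.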