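Thinking about my other problem, I decided I can't even create a regular expression that will match Roman numerals (let alone a context-free grammar that will generate them).
The problem is matching only valid Roman numerals.
For example, 990 is not "XM", it's "CMXC".
My difficulty in writing the regex is that, in order to allow or disallow certain characters, I need to look back at what has already matched.
Take the thousands as an example.
I can allow M{0,2}C?M (which covers 900, 1000, 1900, 2000, 2900 and 3000). However, if the match ended on CM, I can't allow the following character to be C or D (because I'm already at 900).
How can I express this in a regex?
If it simply isn't expressible in a regex, is it expressible in a context-free grammar?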
Try:
^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$
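As a quick way to try the pattern, here is a minimal Python sketch. The helper name is_roman is hypothetical, and rejecting the empty string is an assumption on my part: every part of the pattern is optional, so on its own it also matches an empty input.

```python
import re

# The pattern from above, compiled once.
ROMAN_RE = re.compile(
    r"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def is_roman(text: str) -> bool:
    """Return True if text is a well-formed Roman numeral (hypothetical helper)."""
    # Every part of the pattern is optional, so the empty string would match;
    # it is rejected explicitly here (an assumption, drop the check if you
    # want the empty string to stand for zero).
    # fullmatch is used so that a trailing newline is not silently accepted,
    # which "$" alone would allow in Python.
    return bool(text) and ROMAN_RE.fullmatch(text) is not None
```

With this, is_roman("CMXC") is true while is_roman("XM") and is_roman("CMC") are false. No look-behind is needed: once the hundreds group has consumed CM, the next group only accepts tens.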
Breaking the pattern down:
M{0,4}
This specifies the thousands section and restricts it to between 0 and 4000. It's relatively simple:
    0: <empty> matched by M{0}
    1000: M matched by M{1}
    2000: MM matched by M{2}
    3000: MMM matched by M{3}
    4000: MMMM matched by M{4}
(CM|CD|D?C{0,3})
Slightly more complex, this is for the hundreds section and covers all the possibilities:
    0: <empty> matched by D?C{0} (with D not there)
    100: C matched by D?C{1} (with D not there)
    200: CC matched by D?C{2} (with D not there)
    300: CCC matched by D?C{3} (with D not there)
    400: CD matched by CD
    500: D matched by D?C{0} (with D there)
    600: DC matched by D?C{1} (with D there)
    700: DCC matched by D?C{2} (with D there)
    800: DCCC matched by D?C{3} (with D there)
    900: CM matched by CM
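One way to convince yourself that the hundreds group accepts exactly these ten strings and nothing else is to enumerate candidates by brute force. This is only a sketch; the helper matching_strings is hypothetical, and the length limit of 4 is chosen because the longest string the group can match (DCCC) has four characters.

```python
import re
from itertools import product

# The hundreds group on its own, without the surrounding pattern.
HUNDREDS_RE = re.compile(r"CM|CD|D?C{0,3}")


def matching_strings(pattern, alphabet, max_len):
    """List every string over alphabet, up to max_len chars, that pattern fully matches."""
    found = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if pattern.fullmatch(candidate):
                found.append(candidate)
    return found


# Try every string built from C, D and M with at most four characters.
hundreds = matching_strings(HUNDREDS_RE, "CDM", 4)
print(len(hundreds), hundreds)
```

The same check works for the tens and units groups by swapping in their pattern and letters (X, L, C and I, V, X respectively).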
(XC|XL|L?X{0,3})
Same rules as the previous section, but for the tens place:
    0: <empty> matched by L?X{0} (with L not there)
    10: X matched by L?X{1} (with L not there)
    20: XX matched by L?X{2} (with L not there)
    30: XXX matched by L?X{3} (with L not there)
    40: XL matched by XL
    50: L matched by L?X{0} (with L there)
    60: LX matched by L?X{1} (with L there)
    70: LXX matched by L?X{2} (with L there)
    80: LXXX matched by L?X{3} (with L there)
    90: XC matched by XC
(IX|IV|V?I{0,3})
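This is the units section, handling 0 through 9, and it also works like the previous two sections (Roman numerals, despite their seeming weirdness, follow some logical rules once you figure out what they are):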
    0: <empty> matched by V?I{0} (with V not there)
    1: I matched by V?I{1} (with V not there)
    2: II matched by V?I{2} (with V not there)
    3: III matched by V?I{3} (with V not there)
    4: IV matched by IV
    5: V matched by V?I{0} (with V there)
    6: VI matched by V?I{1} (with V there)
    7: VII matched by V?I{2} (with V there)
    8: VIII matched by V?I{3} (with V there)
    9: IX matched by IX
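The tables above can also be turned around into a small generator, which gives an exhaustive check of the whole pattern. This is a sketch under assumptions: to_roman and the three lookup lists are hypothetical names, and the range 0 to 4999 follows from M{0,4} allowing up to four thousands.

```python
import re

ROMAN_RE = re.compile(
    r"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)

# One entry per digit 0..9, copied from the tables above.
HUNDREDS = ["", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"]
TENS = ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]
UNITS = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]


def to_roman(n: int) -> str:
    """Build the numeral for n digit by digit (hypothetical helper)."""
    if not 0 <= n <= 4999:
        raise ValueError("n must be between 0 and 4999")
    return (
        "M" * (n // 1000)
        + HUNDREDS[n // 100 % 10]
        + TENS[n // 10 % 10]
        + UNITS[n % 10]
    )


# Every generated numeral should be accepted by the pattern.
all_numerals = [to_roman(n) for n in range(5000)]
assert all(ROMAN_RE.fullmatch(s) for s in all_numerals)
```

Because each of the three groups has exactly ten alternatives and M{0,4} has five, the pattern cannot accept more than 5 x 10 x 10 x 10 = 5000 strings, so if the 5000 generated numerals are all distinct and all match, the pattern accepts those strings and no others. Note that the numeral for 0 is the empty string.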