Problem description
The code
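import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "y z a a a b c c z";
// One or more "a ", one or more "b ", any number of "c ", then a final "c".
Pattern p = Pattern.compile("(a )+(b )+(c )*c");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}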
prints
a a a b c c
which is right.
But logically, the substrings
a a a b c
a a b c c
a a b c
a b c c
a b c
match the regular expression as well.
So, how can I make the code find those substrings too, i.e. not only the most extended one, but also its children?
Recommended answer
You can use the reluctant quantifiers, such as *? and +?. These match as little as possible, in contrast to the standard * and +, which are greedy, i.e. match as much as possible. Still, this only lets you find particular "sub-matches", not all of them. Somewhat more control can be achieved with lookahead and non-capturing groups, also described in the docs. But to really find all sub-matches, you would probably have to do the work yourself, i.e. build the automaton that the regex corresponds to and navigate it with custom code.
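As a rough illustration of the difference, the sketch below first runs a reluctant variant of the pattern, then enumerates every sub-match by brute force. It does not build an automaton as suggested above; instead it swaps in a simpler technique, testing each (start, end) pair with Matcher.region and Matcher.matches. The class name AllSubMatches and the helper findAll are hypothetical, and the pattern is assumed to be (a )+(b )+(c )*c, which is what the question's list of substrings implies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllSubMatches {

    // Hypothetical helper: returns every non-empty substring of s that the
    // pattern matches in full, ordered by start index and then by end index.
    static List<String> findAll(Pattern p, String s) {
        List<String> result = new ArrayList<>();
        Matcher m = p.matcher(s);
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                // Restrict the matcher to [start, end) and require a full match.
                m.region(start, end);
                if (m.matches()) {
                    result.add(s.substring(start, end));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "y z a a a b c c z";

        // Reluctant quantifiers: find() still reports one match per start position.
        Matcher lazy = Pattern.compile("(a )+?(b )+?(c )*?c").matcher(s);
        while (lazy.find()) {
            System.out.println("reluctant: " + lazy.group());
        }

        // Brute force: assumed intent of the question's pattern.
        Pattern p = Pattern.compile("(a )+(b )+(c )*c");
        for (String match : findAll(p, s)) {
            System.out.println(match);
        }
    }
}
```

On the question's input, the reluctant pattern reports only a a a b c, while findAll returns six substrings: a a a b c, a a a b c c, a a b c, a a b c c, a b c and a b c c. The enumeration calls matches() once per (start, end) pair, so it is only practical for short inputs. Because the default region bounds hide text outside the region from lookarounds and boundary matchers, patterns that use those may need useTransparentBounds or useAnchoringBounds adjusted.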