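I'm trying to write a regular expression that finds a keyword in a long string, but only when the keyword is not surrounded by letters. It is fine if the keyword is surrounded by dashes or underscores, as long as it is not surrounded by letters. A single occurrence of any keyword is enough to count as a match; I only care whether it appears somewhere in the long string. Currently I can't get it to return True when the word has a '_' next to it. Any ideas for a better expression?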

Edit: I found a case that needs to be True and had not added it to the examples.

import re

key_words = ['go', 'at', 'why', 'stop' ]

false_match = ['going_get_that', 'that_is_wstop', 'whysper','stoping_tat' ]

positive_match = ['go-around', 'go_at_going','stop-by_the_store', 'stop','something-stop', 'something_stop']
pattern = r"\b(%s)\b" % '|'.join(key_words)

for word in false_match + positive_match:
    # re.match only tries to match at the start of the string
    if re.match(pattern, word):
        print(True, word)
    else:
        print(False, word)


Current output:

False going_get_that
False that_is_wstop
False whysper
False stoping_tat
True go-around
False go_at_going
True stop-by_the_store
True stop


Edit: these must be True:

  False something-stop
  False something_stop


Desired output:

    False going_get_that
    False that_is_wstop
    False whysper
    False stoping_tat
    True go-around
    True go_at_going
    True stop-by_the_store
    True stop
    True something-stop
    True something_stop

Best answer

Use negative lookarounds (a lookbehind and a lookahead):

import re

key_words = ['go', 'at', 'why', 'stop' ]

false_match = ['going_get_that', 'that_is_wstop', 'whysper','stoping_tat' ]

positive_match = ['go-around', 'go_at_going','stop-by_the_store', 'stop', 'something-stop', 'something_stop']
pattern = r"(?<![a-zA-Z])(%s)(?![a-zA-Z])" % '|'.join(key_words)

for word in false_match + positive_match:
    # re.search scans the whole string, not just the start
    if re.search(pattern, word):
        print(True, word)
    else:
        print(False, word)
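
For what it's worth, here is a small sketch of why the original `\b` attempt misses the underscore cases, and how the lookaround pattern could be wrapped for reuse. The names `build_keyword_pattern`, `contains_keyword`, and `KEYWORD_PATTERN` are made up for illustration and are not part of the original answer. `re.escape` is added on the assumption that keywords might someday contain regex metacharacters.

```python
import re


def build_keyword_pattern(key_words):
    """Compile a pattern matching any keyword not touched by ASCII letters."""
    # re.escape keeps keywords literal even if they contain characters
    # such as '.' or '+' (an assumption; the original keywords are plain).
    alternation = "|".join(re.escape(k) for k in key_words)
    return re.compile(r"(?<![a-zA-Z])(?:%s)(?![a-zA-Z])" % alternation)


def contains_keyword(pattern, text):
    """Return True if at least one keyword occurs anywhere in text."""
    return pattern.search(text) is not None


KEYWORD_PATTERN = build_keyword_pattern(["go", "at", "why", "stop"])

# '_' is a word character (part of \w), so \b finds no boundary between
# 'go' and '_'; the \b-based pattern fails even when used with re.search.
word_boundary = re.compile(r"\b(?:go|at|why|stop)\b")
print(bool(word_boundary.search("go_at_going")))
print(contains_keyword(KEYWORD_PATTERN, "go_at_going"))
```

Note that `[a-zA-Z]` only excludes ASCII letters, so a keyword next to a digit (for example `stop2`) still counts as a match with this pattern. Whether that is desirable depends on the data.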

Regarding "python - finding a word between key characters", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29061479/
