Regex for Markdown emphasis

Problem description

I'm trying to match the following markdown text for emphasis:

_this should match_
__this shouldn't__
_ neither should this _
_nor this _
this _should match_as well_
__       (double underscore, shouldn't match)

The issue that I'm facing with my own efforts as well as other solutions on SO is that they still end up matching the third line:

_ neither should this _

Is there a way to check for my particular use case? I'm aiming this at browser applications, and since Firefox and Safari are yet to support lookbehinds, is there a way to do this without lookbehinds?

Here's the regex pattern that I've come up with so far: /(_)((?!\1|\s).*)?\1/

Luckily, I'm able to fulfil almost all of my checks; however, my pattern still matches:

_nor this _
__       (double underscore, shouldn't match)    

So, is there a way to ensure that there is at least one character between the underscores, and that they are not separated from the text by a space?

Link to regexr playground: regexr.com/5300j

Example:

const regex = /(_)((?!\1|\s).*)?\1/gm;
const str = `_this should match_
__this shouldn't__
_ neither should this _
_nor this _
this _should match_as well_
__
_ neither should this _`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

Recommended answer

You can use either of the following:

\b_(?![_\s])(.*?[^_\s])_\b
\b_(?![_\s])(.*?[^_\s])_(?!\S)
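As a quick sanity check, the first pattern can be run against the sample lines from the question. This is only a minimal sketch: the variable names `emphasis`, `lines`, and `matched` are made up for illustration, and each line is tested on its own rather than as one multi-line string.

```javascript
// First pattern from the answer. No `g` flag, so `test` keeps no state
// between calls.
const emphasis = /\b_(?![_\s])(.*?[^_\s])_\b/;

// Sample lines from the question.
const lines = [
  "_this should match_",
  "__this shouldn't__",
  "_ neither should this _",
  "_nor this _",
  "this _should match_as well_",
  "__       (double underscore, shouldn't match)",
];

// Keep only the lines that contain a valid emphasis span.
const matched = lines.filter((line) => emphasis.test(line));

console.log(matched);
// → [ '_this should match_', 'this _should match_as well_' ]
```

Only the two lines that the question expects to match are kept.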

Details

  • \b - no word char (letter, digit, _) is allowed immediately before the match
  • _ - an underscore
  • (?![_\s]) - no _ or whitespace char is allowed immediately after the _
  • (.*?[^_\s]) - Group 1:
    • .*? - any 0 or more chars other than line break chars, as few as possible
    • [^_\s] - any 1 char other than _ and whitespace
  • _ - an underscore
  • \b - no word char is allowed immediately after the closing _ (first pattern only)
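To illustrate what Group 1 ends up holding, the sketch below collects the captured text from a small multi-line input using `String.prototype.matchAll`. The names `pattern`, `text`, and `captured` are assumptions made for this example only.

```javascript
// First pattern from the answer, with the `g` flag that matchAll requires.
const pattern = /\b_(?![_\s])(.*?[^_\s])_\b/g;

// Three of the sample lines from the question, joined into one string.
const text = [
  "_this should match_",
  "_nor this _",
  "this _should match_as well_",
].join("\n");

// m[1] is Group 1: the text between the opening and closing underscores.
const captured = Array.from(text.matchAll(pattern), (m) => m[1]);

console.log(captured);
// → [ 'this should match', 'should match_as well' ]
```

The middle line contributes nothing because the character before its closing underscore is a space, which `[^_\s]` rejects. Since `.` does not match line breaks, a match cannot span two lines either.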

Note that (?!\S), used in the second pattern, fails the match if there is a non-whitespace char immediately to the right of the current location, so it acts as a right-hand whitespace boundary.
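The practical difference between the two closing conditions shows up when punctuation follows the closing underscore. The sketch below is only an illustration; the variable names and the two sample sentences are invented for it.

```javascript
// The two patterns from the answer; they differ only in the closing condition.
const wordBoundary = /\b_(?![_\s])(.*?[^_\s])_\b/;
const whitespaceBoundary = /\b_(?![_\s])(.*?[^_\s])_(?!\S)/;

// Hypothetical inputs: a closing underscore followed by a comma, then by a space.
const beforeComma = "an _emphasis_, then text";
const beforeSpace = "an _emphasis_ then text";

const results = [
  wordBoundary.test(beforeComma),       // true: "," is not a word char
  whitespaceBoundary.test(beforeComma), // false: "," is a non-whitespace char
  wordBoundary.test(beforeSpace),       // true
  whitespaceBoundary.test(beforeSpace), // true
];

console.log(results);
// → [ true, false, true, true ]
```

So the `\b` variant accepts emphasis that is directly followed by punctuation, while the `(?!\S)` variant requires whitespace or the end of the input after the closing underscore.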
