Retrieving the matching context of a MySQL full-text search in PHP (and security)

Problem description

    I'm doing a fulltext search on my MySQL table "pages". I'm displaying a list of pages that match the keyword in their "title" (plain text, VARCHAR, 255) or "content" (html, TEXT). When the match is found in the "content" field, I'd like to display the snippet in which the match was found. I have no idea how to go about this.

    Can you put me in the right direction?

    $query = '  SELECT 
                    *, 
                    MATCH(title, content) AGAINST("'.$keyword.'") AS score 
                FROM 
                    page 
                WHERE 
                    MATCH(title, content) AGAINST("'.$keyword.'")
                ORDER BY 
                    score 
                DESC    ';
    $result = mysql_query($query) or die (mysql_error());
    if(mysql_num_rows($result) > 0) {   
        $output .= '<p>Your keyword matches the following pages:</p>';
        while($row = mysql_fetch_assoc($result)){
    
            $title      = htmlentities($row['title']);
            $content    = htmlentities(strip_tags($row['content']));
            $content    = limit_text($content, 250); // Cuts it down to 250 characters plus ...
    
            $output .= '<h2>'.$title.'</h2>';
            if(trim($content) != '') {
                $output .= '<p>'.$content.'</p>'; // I'd like to place a snippet here with the matched context
            }           
        }   
    } else {
        $output .= '<p>Keyword not found...</p>';       
    }
    

    Also, I have a question regarding security. Right now I'm checking $keyword in three ways:

    • Not blank?
    • More than 2 characters?
    • Not dangerous? (see below)

    I use a regular expression to match the following, to see if the user input is dangerous

    <script|&lt;script|&gt;script|document.|alert|bcc:|cc:|x-mailer:|to:|recipient|truncate|drop table
    

    This might be a little bit ridiculous and easy to work around, but it is at least a minimal form of protection against XSS exploits. What is the recommended way to securely filter a keyword intended for search? Is PHPIDS overkill?

    Solution

    This should get you started on the "context" part...

    // return the part of the content where the keyword was matched
    function get_surrounding_text($keyword, $content, $padding)
    {
        $position = strpos($content, $keyword);
        // starting at (where keyword was found - padding), retrieve
        // (padding + keyword length + padding) characters from the content
        $snippet = substr($content, $position - $padding, (strlen($keyword) + $padding * 2));
        return '...' . $snippet . '...';
    }
    
    $content = 'this is a really long string of characters with a magic word buried somewhere in it';
    $keyword = 'magic';
    echo get_surrounding_text($keyword, $content, 15); // echoes '...racters with a magic word buried so...'
    

    This function does not account for cases where the padding boundaries would go outside the content string, like when the keyword is found near the beginning or end of the content. It also doesn't account for multiple matches, etc. But it should hopefully at least point you in the right direction.
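
The boundary and no-match cases mentioned above could be handled roughly as follows. This is only a sketch, not part of the original answer: the name `get_snippet` is hypothetical, and the switch to `stripos()` assumes the search should be case-insensitive, which the original code does not state. It still returns only the first match.

```php
<?php
// Hypothetical helper (not from the original answer): like get_surrounding_text(),
// but clamps the window to the string and handles a missing keyword.
function get_snippet(string $keyword, string $content, int $padding): ?string
{
    // stripos() is case-insensitive and returns false when there is no match.
    $position = stripos($content, $keyword);
    if ($position === false) {
        return null;
    }

    // Clamp the window so it never starts before 0 or runs past the end.
    $start   = max(0, $position - $padding);
    $end     = min(strlen($content), $position + strlen($keyword) + $padding);
    $snippet = substr($content, $start, $end - $start);

    // Only add an ellipsis on a side where text was actually cut off.
    $prefix = $start > 0 ? '...' : '';
    $suffix = $end < strlen($content) ? '...' : '';

    return $prefix . $snippet . $suffix;
}

$content = 'magic is the first word of this fairly long string of characters';
echo get_snippet('magic', $content, 15), "\n"; // echoes 'magic is the first w...'
```

In the question's loop, this would probably be called on the `strip_tags()` output before `htmlentities()`, so that the cut never lands in the middle of an HTML entity. `strlen()` and `substr()` count bytes, so multi-byte content would likely need the `mb_*` equivalents instead.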


11-03 10:01