Get all documents whose source contains the given search text in Elasticsearch

Problem description

I am new to Elasticsearch. I mapped a field to 'string' in an Elasticsearch index. I need to retrieve the documents whose field value contains the given search text.

JSON1: {"id": "1", "message": "Welcome to elastic search"}
JSON2: {"id": "2", "message": "elasticsearch"}

If I search for 'elastic', I need to get both records, but I am getting only the first one.

At the moment I am getting documents based on full-text search (FTS). Please guide me on how to achieve a search like LIKE/ILIKE in psql in Elasticsearch.

Thanks in advance.

Solution

It's a matter of the tokenizer. You can take a look at NGram: http://www.elasticsearch.org/guide/reference/index-modules/analysis/ngram-tokenizer/

You can test it using the route /_analyze

Here is how Elasticsearch tokenizes by default.

curl -XGET 'localhost:9200/_analyze?tokenizer=standard' -d 'this is a test elasticsearch'

{
"tokens": [{
        "token": "this",
        "start_offset": 0,
        "end_offset": 4,
        "type": "<ALPHANUM>",
        "position": 1
    }, {
        "token": "is",
        "start_offset": 5,
        "end_offset": 7,
        "type": "<ALPHANUM>",
        "position": 2
    }, {
        "token": "a",
        "start_offset": 8,
        "end_offset": 9,
        "type": "<ALPHANUM>",
        "position": 3
    }, {
        "token": "test",
        "start_offset": 10,
        "end_offset": 14,
        "type": "<ALPHANUM>",
        "position": 4
    }, {
        "token": "elasticsearch",
        "start_offset": 15,
        "end_offset": 28,
        "type": "<ALPHANUM>",
        "position": 5
    }
]

}

Here is an example with nGram and the default values

curl -XGET 'localhost:9200/_analyze?tokenizer=nGram' -d 'this is a test elasticsearch'

{
    "tokens": [{
            "token": "t",
            "start_offset": 0,
            "end_offset": 1,
            "type": "word",
            "position": 1
        }, {
            "token": "h",
            "start_offset": 1,
            "end_offset": 2,
            "type": "word",
            "position": 2
        }, {
            "token": "i",
            "start_offset": 2,
            "end_offset": 3,
            "type": "word",
            "position": 3
        }, {
            "token": "s",
            "start_offset": 3,
            "end_offset": 4,
            "type": "word",
            "position": 4
        }, {
            "token": " ",
            "start_offset": 4,
            "end_offset": 5,
            "type": "word",
            "position": 5
        }, {
            "token": "i",
            "start_offset": 5,
            "end_offset": 6,
            "type": "word",
            "position": 6
        }, {
            "token": "s",
            "start_offset": 6,
            "end_offset": 7,
            "type": "word",
            "position": 7
        }, {
            "token": " ",
            "start_offset": 7,
            "end_offset": 8,
            "type": "word",
            "position": 8
        }, {
            "token": "a",
            "start_offset": 8,
            "end_offset": 9,
            "type": "word",
            "position": 9
        }, {
            "token": " ",
            "start_offset": 9,
            "end_offset": 10,
            "type": "word",
            "position": 10
        }, {
            "token": "t",
            "start_offset": 10,
            "end_offset": 11,
            "type": "word",
            "position": 11
        }, {
            "token": "e",
            "start_offset": 11,
            "end_offset": 12,
            "type": "word",
            "position": 12
        }, {
            "token": "s",
            "start_offset": 12,
            "end_offset": 13,
            "type": "word",
            "position": 13
        }, {
            "token": "t",
            "start_offset": 13,
            "end_offset": 14,
            "type": "word",
            "position": 14
        }, {
            "token": " ",
            "start_offset": 14,
            "end_offset": 15,
            "type": "word",
            "position": 15
        }, {
            "token": "e",
            "start_offset": 15,
            "end_offset": 16,
            "type": "word",
            "position": 16
        }, {
            "token": "l",
            "start_offset": 16,
            "end_offset": 17,
            "type": "word",
            "position": 17
        }, {
            "token": "a",
            "start_offset": 17,
            "end_offset": 18,
            "type": "word",
            "position": 18
        }, {
            "token": "s",
            "start_offset": 18,
            "end_offset": 19,
            "type": "word",
            "position": 19
        }, {
            "token": "t",
            "start_offset": 19,
            "end_offset": 20,
            "type": "word",
            "position": 20
        }, {
            "token": "i",
            "start_offset": 20,
            "end_offset": 21,
            "type": "word",
            "position": 21
        }, {
            "token": "c",
            "start_offset": 21,
            "end_offset": 22,
            "type": "word",
            "position": 22
        }, {
            "token": "s",
            "start_offset": 22,
            "end_offset": 23,
            "type": "word",
            "position": 23
        }, {
            "token": "e",
            "start_offset": 23,
            "end_offset": 24,
            "type": "word",
            "position": 24
        }, {
            "token": "a",
            "start_offset": 24,
            "end_offset": 25,
            "type": "word",
            "position": 25
        }, {
            "token": "r",
            "start_offset": 25,
            "end_offset": 26,
            "type": "word",
            "position": 26
        }, {
            "token": "c",
            "start_offset": 26,
            "end_offset": 27,
            "type": "word",
            "position": 27
        }, {
            "token": "h",
            "start_offset": 27,
            "end_offset": 28,
            "type": "word",
            "position": 28
        }, {
            "token": "th",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 29
        }, {
            "token": "hi",
            "start_offset": 1,
            "end_offset": 3,
            "type": "word",
            "position": 30
        }, {
            "token": "is",
            "start_offset": 2,
            "end_offset": 4,
            "type": "word",
            "position": 31
        }, {
            "token": "s ",
            "start_offset": 3,
            "end_offset": 5,
            "type": "word",
            "position": 32
        }, {
            "token": " i",
            "start_offset": 4,
            "end_offset": 6,
            "type": "word",
            "position": 33
        }, {
            "token": "is",
            "start_offset": 5,
            "end_offset": 7,
            "type": "word",
            "position": 34
        }, {
            "token": "s ",
            "start_offset": 6,
            "end_offset": 8,
            "type": "word",
            "position": 35
        }, {
            "token": " a",
            "start_offset": 7,
            "end_offset": 9,
            "type": "word",
            "position": 36
        }, {
            "token": "a ",
            "start_offset": 8,
            "end_offset": 10,
            "type": "word",
            "position": 37
        }, {
            "token": " t",
            "start_offset": 9,
            "end_offset": 11,
            "type": "word",
            "position": 38
        }, {
            "token": "te",
            "start_offset": 10,
            "end_offset": 12,
            "type": "word",
            "position": 39
        }, {
            "token": "es",
            "start_offset": 11,
            "end_offset": 13,
            "type": "word",
            "position": 40
        }, {
            "token": "st",
            "start_offset": 12,
            "end_offset": 14,
            "type": "word",
            "position": 41
        }, {
            "token": "t ",
            "start_offset": 13,
            "end_offset": 15,
            "type": "word",
            "position": 42
        }, {
            "token": " e",
            "start_offset": 14,
            "end_offset": 16,
            "type": "word",
            "position": 43
        }, {
            "token": "el",
            "start_offset": 15,
            "end_offset": 17,
            "type": "word",
            "position": 44
        }, {
            "token": "la",
            "start_offset": 16,
            "end_offset": 18,
            "type": "word",
            "position": 45
        }, {
            "token": "as",
            "start_offset": 17,
            "end_offset": 19,
            "type": "word",
            "position": 46
        }, {
            "token": "st",
            "start_offset": 18,
            "end_offset": 20,
            "type": "word",
            "position": 47
        }, {
            "token": "ti",
            "start_offset": 19,
            "end_offset": 21,
            "type": "word",
            "position": 48
        }, {
            "token": "ic",
            "start_offset": 20,
            "end_offset": 22,
            "type": "word",
            "position": 49
        }, {
            "token": "cs",
            "start_offset": 21,
            "end_offset": 23,
            "type": "word",
            "position": 50
        }, {
            "token": "se",
            "start_offset": 22,
            "end_offset": 24,
            "type": "word",
            "position": 51
        }, {
            "token": "ea",
            "start_offset": 23,
            "end_offset": 25,
            "type": "word",
            "position": 52
        }, {
            "token": "ar",
            "start_offset": 24,
            "end_offset": 26,
            "type": "word",
            "position": 53
        }, {
            "token": "rc",
            "start_offset": 25,
            "end_offset": 27,
            "type": "word",
            "position": 54
        }, {
            "token": "ch",
            "start_offset": 26,
            "end_offset": 28,
            "type": "word",
            "position": 55
        }
    ]
}
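
To make the difference between the two outputs concrete, here is a minimal Python sketch that imitates both tokenizers outside of Elasticsearch. It is only an illustration: the helper name ngrams is made up for this example, splitting on whitespace is a rough stand-in for the standard tokenizer, and the defaults (min 1, max 2) are inferred from the output above rather than taken from the documentation.

```python
def ngrams(text, min_gram=1, max_gram=2):
    """Return all character n-grams of text, shorter lengths first.

    Hypothetical helper: it mimics the order of the nGram output shown
    above (all 1-grams, then all 2-grams), not the real implementation.
    """
    tokens = []
    for n in range(min_gram, max_gram + 1):
        for start in range(len(text) - n + 1):
            tokens.append(text[start:start + n])
    return tokens


text = "this is a test elasticsearch"

# Rough stand-in for the standard tokenizer: whole words only.
word_tokens = text.split()

# Same gram sizes as in the nGram output above.
grams = ngrams(text)

print(len(grams))                        # number of 1-grams plus 2-grams
print("elastic" in word_tokens)          # whole-word tokens: no match
print("elastic" in ngrams(text, 7, 7))   # 7-character grams: match
```

With whole-word tokens, 'elastic' is not one of the indexed terms, which is why the document containing only 'elasticsearch' is not returned. Once the field is indexed as n-grams that are long enough, 'elastic' becomes an indexed term and both documents can match.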

Here is a link with an example of how to set the proper analyzer/tokenizer in your index: How to setup a tokenizer in elasticsearch

Then your query should return the expected documents.
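
For reference, the index settings for such an analyzer might look roughly like the body built by the Python sketch below. This is an assumption-laden example, not the configuration from the linked answer: the names message_ngram and ngram_analyzer and the gram sizes are invented for illustration, the tokenizer type is spelled "nGram" in older releases and "ngram" in newer ones, and newer releases may additionally require raising index.max_ngram_diff when max_gram and min_gram are far apart. Check the documentation for your version before using it.

```python
import json


def build_ngram_settings(min_gram=3, max_gram=7):
    """Build a settings body declaring a custom n-gram analyzer.

    All names here (message_ngram, ngram_analyzer) are hypothetical.
    """
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    "message_ngram": {
                        "type": "ngram",  # "nGram" in older versions
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    "ngram_analyzer": {
                        "type": "custom",
                        "tokenizer": "message_ngram",
                        # lowercase makes matching case-insensitive, like ILIKE
                        "filter": ["lowercase"],
                    }
                },
            }
        }
    }


settings = build_ngram_settings()
# This JSON would be sent as the body when creating the index.
print(json.dumps(settings, indent=2))
```

The message field would then need to be mapped to use this analyzer; the exact mapping syntax depends on the Elasticsearch version, so it is left out here.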
