Problem description
I have a problem with a search that I just can't figure out. My docs have the following form:
{
  "timestamp": "2015-03-17T15:05:04.563Z",
  "session_id": "1",
  "user_id": "jan"
}
Let's say the first timestamp of a session ID is the "login" and the last timestamp is the "logout". I want to get the "login" and "logout" docs for all sessions (if possible, sorted by user_id). I managed to get the right timestamps with aggregations:
{
  "aggs": {
    "group_by_uid": {
      "terms": {
        "field": "user_id"
      },
      "aggs": {
        "group_by_sid": {
          "terms": {
            "field": "session_id"
          },
          "aggs": {
            "max_date": {
              "max": { "field": "timestamp" }
            },
            "min_date": {
              "min": { "field": "timestamp" }
            }
          }
        }
      }
    }
  }
}
But how do I get the corresponding docs? I also don't mind if I have to do two searches (one for the logins and one for the logouts). I tried some top hits aggregations and sorting, but I always get parse errors :/
I hope someone can give me a hint :)
Best regards,
Jan
Recommended answer
Here's a single-search solution based on the approach proposed by Sloan Ahrens. The advantage is that the start and end entries of each session end up in the same bucket.
{
  "aggs": {
    "group_by_uid": {
      "terms": {
        "field": "user_id"
      },
      "aggs": {
        "group_by_sid": {
          "terms": {
            "field": "session_id"
          },
          "aggs": {
            "session_start": {
              "top_hits": {
                "size": 1,
                "sort": [ { "timestamp": { "order": "asc" } } ]
              }
            },
            "session_end": {
              "top_hits": {
                "size": 1,
                "sort": [ { "timestamp": { "order": "desc" } } ]
              }
            }
          }
        }
      }
    }
  }
}
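To make the shape of the result easier to work with, here is a minimal Python sketch that builds the same request body and flattens the nested buckets into one row per session. Several parts are assumptions and not from the original post: the helper names `build_query` and `extract_sessions`, the `max_users` and `max_sessions` parameters, the top-level `"size": 0` (to skip the regular hits), and the sample response, whose logout timestamp is made up for illustration. A terms aggregation returns only 10 buckets by default, so the sketch exposes its `size` explicitly. Sorting by user_id is done on the client side here.

```python
from typing import Any, Dict, List


def build_query(max_users: int = 10, max_sessions: int = 10) -> Dict[str, Any]:
    """Build the nested terms/top_hits request body shown above."""

    def edge(order: str) -> Dict[str, Any]:
        # One document per bucket: the earliest ("asc") or latest ("desc").
        return {"top_hits": {"size": 1, "sort": [{"timestamp": {"order": order}}]}}

    return {
        "size": 0,  # assumption: only the aggregation results are needed
        "aggs": {
            "group_by_uid": {
                "terms": {"field": "user_id", "size": max_users},
                "aggs": {
                    "group_by_sid": {
                        "terms": {"field": "session_id", "size": max_sessions},
                        "aggs": {
                            "session_start": edge("asc"),
                            "session_end": edge("desc"),
                        },
                    }
                },
            }
        },
    }


def extract_sessions(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the aggregation response into rows sorted by user and session."""
    rows = []
    for user in response["aggregations"]["group_by_uid"]["buckets"]:
        for session in user["group_by_sid"]["buckets"]:
            rows.append({
                "user_id": user["key"],
                "session_id": session["key"],
                "login": session["session_start"]["hits"]["hits"][0]["_source"],
                "logout": session["session_end"]["hits"]["hits"][0]["_source"],
            })
    return sorted(rows, key=lambda r: (r["user_id"], r["session_id"]))


# Hypothetical response fragment, trimmed to the fields the sketch reads.
login_doc = {"timestamp": "2015-03-17T15:05:04.563Z", "session_id": "1", "user_id": "jan"}
logout_doc = {"timestamp": "2015-03-17T15:20:00.000Z", "session_id": "1", "user_id": "jan"}
sample_response = {
    "aggregations": {"group_by_uid": {"buckets": [{
        "key": "jan",
        "group_by_sid": {"buckets": [{
            "key": "1",
            "session_start": {"hits": {"hits": [{"_source": login_doc}]}},
            "session_end": {"hits": {"hits": [{"_source": logout_doc}]}},
        }]},
    }]}}
}

for row in extract_sessions(sample_response):
    print(row["user_id"], row["session_id"],
          row["login"]["timestamp"], row["logout"]["timestamp"])
```

The dictionary returned by `build_query()` can be sent as the search body with whatever client is in use. If a session contains a single document, `login` and `logout` will be that same document.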
Cheers,
Jan