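I want to compute the sum of the per-group maximum values in my Elasticsearch data. For example:

The data is (rows marked * hold the maximum cost for their gId):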

id  | gId | cost
----|-----|------
1   |  1  | 20
2   |  1  | 15
3   |  2  | 30
4   |  1  | 30   *
5   |  2  | 40   *
6   |  1  | 20
7   |  2  | 30
8   |  3  | 45   *
9   |  1  | 10

I use sum_bucket to sum the maximum of each group. Here is my query:
{
    "aggs": {
        "T1":{
            "terms": {
                "field": "gId",
                "size":3
            },
            "aggs":{
                "MAX_COST":{
                    "max": {
                        "field": "cost"
                    }
                }
            }
        },
        "T2":{
            "sum_bucket": {
                "buckets_path": "T1>MAX_COST"
            }
        }
    },
    "size": 0
}

The query response is:
"T1": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [                     |
        {                            |
            "key": 1,                |
            "doc_count": 5,          |
            "MAX_COST": {            |
                "value": 30          |
            }                        |
        },                           |
        {                            | How can I omit this part from the
            "key": 2,                | Elasticsearch query response?
            "doc_count": 3,          |
            "MAX_COST": {            |
                "value": 40          |
            }                        |
        },                           |
        {                            |
            "key": 3,                |
            "doc_count": 1,          |
            "MAX_COST": {            |
                "value": 45          |
            }                        |
        }                            |
    ]
},
"T2": {
    "value": 115
}

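T2.value is the desired result. However, my data set is very large, so for network performance reasons I want the response to leave out T1.buckets. Setting T1.terms.size to a specific number does not help, because then only that many top buckets contribute to T2.value. How can I write the query so that the response omits T1.buckets, or is there a better query for summing the per-group maxima?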
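
To make the target value and the effect of terms.size concrete, here is a small Python sketch that emulates the aggregation on the sample rows from the table. It is not Elasticsearch code: the function name `sum_of_group_max` and its `size` parameter are hypothetical, and it assumes the terms aggregation orders buckets by document count, largest first.

```python
from collections import defaultdict

# Sample rows from the table in the question: (id, gId, cost).
ROWS = [
    (1, 1, 20), (2, 1, 15), (3, 2, 30), (4, 1, 30), (5, 2, 40),
    (6, 1, 20), (7, 2, 30), (8, 3, 45), (9, 1, 10),
]


def sum_of_group_max(rows, size=None):
    """Sum the maximum cost of each gId group.

    `size` (hypothetical) mimics terms.size: only the `size` groups with
    the most documents are kept, assuming doc_count ordering.
    """
    costs = defaultdict(list)
    for _id, gid, cost in rows:
        costs[gid].append(cost)
    # Largest groups first; ties broken by key for a stable result.
    groups = sorted(costs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    if size is not None:
        groups = groups[:size]
    return sum(max(group_costs) for _gid, group_costs in groups)


print(sum_of_group_max(ROWS))          # all three groups: 30 + 40 + 45 = 115
print(sum_of_group_max(ROWS, size=2))  # only gId 1 and 2: 30 + 40 = 70
```

With every group included the result matches T2.value (115); with the emulated size of 2, gId 3 is dropped and the sum falls to 70, which is why shrinking terms.size is not a safe way to reduce the response.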

Best answer

You can use filter_path to return only part of the response:

var searchResponse = client.Search<Document>(s => s
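    // Paths to include in the response. Paths are resolved from the root of the
    // response, so this may need to be "aggregations.T2.value".
    .FilterPath(new[] { "T2.value" })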
    .Aggregations(a => a
        // ... rest of aggs here
    )
);
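
To illustrate what this kind of path filtering does to the response, here is a simplified Python stand-in. It is only a sketch: `filter_by_path` is a hypothetical helper, it handles a single dotted path with no wildcards or arrays (the real filter_path supports more), and the `response` literal is an abbreviated, made-up example that assumes aggregation results sit under a top-level "aggregations" key.

```python
def filter_by_path(doc, path):
    """Keep only the part of a nested dict reachable via a dotted path.

    Simplified emulation: one path, no wildcards, no list traversal.
    """
    key, _, rest = path.partition(".")
    if not isinstance(doc, dict) or key not in doc:
        return {}
    if not rest:
        return {key: doc[key]}
    inner = filter_by_path(doc[key], rest)
    return {key: inner} if inner else {}


# Hypothetical, abbreviated search response (values are illustrative).
response = {
    "took": 3,
    "hits": {"hits": []},
    "aggregations": {
        "T1": {"buckets": [
            {"key": 1, "doc_count": 5, "MAX_COST": {"value": 30.0}},
            {"key": 2, "doc_count": 3, "MAX_COST": {"value": 40.0}},
            {"key": 3, "doc_count": 1, "MAX_COST": {"value": 45.0}},
        ]},
        "T2": {"value": 115.0},
    },
}

# The buckets are dropped; only the requested leaf and its parents remain.
print(filter_by_path(response, "aggregations.T2.value"))
# A path that does not start at the root matches nothing in this emulation.
print(filter_by_path(response, "T2.value"))
```

In this emulation the first call keeps only the T2 value and the second returns an empty object, which is the reason a path may need to start at "aggregations" when the filter is applied to a full response.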

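Keep in mind that using filter_path with NEST can sometimes leave the internal serializer unable to deserialize the response, because the structure is not what it expects. In that case, you can use the low-level client exposed on the high-level client and handle the response yourself:
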
var searchDescriptor = new SearchDescriptor<Document>()
    .Aggregations(a => a
        // ... rest of aggs here
    );

var searchResponse = client.LowLevel.Search<StringResponse>(
    "index",
    "type",
    PostData.Serializable(searchDescriptor),
    new SearchRequestParameters
    {
        QueryString = new Dictionary<string, object>
        {
            ["filter_path"] = "T2.value"
        }
    });

// do something with JSON string response
var json = searchResponse.Body;
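
Once the filtered body is available as a string, pulling out the value is a matter of parsing the JSON. The sketch below uses Python rather than C# to stay dependency-free; `extract_t2_value` is a hypothetical helper, and the sample bodies assume the filtered response keeps T2 under a top-level "aggregations" key.

```python
import json


def extract_t2_value(body):
    """Return aggregations.T2.value from a JSON body, or None if absent."""
    node = json.loads(body)
    for key in ("aggregations", "T2", "value"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Assumed shape of a filtered response body.
print(extract_t2_value('{"aggregations": {"T2": {"value": 115.0}}}'))
# An empty object is what comes back when the filter matched nothing.
print(extract_t2_value('{}'))
```

Returning None for a missing path makes an empty filtered response easy to detect instead of raising a key error.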

Regarding "elasticsearch - sum of per-group maxima in Elasticsearch", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55793748/
