1. First analysis requirement: count the number of products under each tag
GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}
Running it returns the following result:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "ecommerce",
        "node": "urqovJ9yQPCO6fNM70Lc8w",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
    }
  },
  "status": 400
}
The error above means that the fielddata property of the text field tags must be set to true:
PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}
Once the mapping has been updated, the response is:
{
  "acknowledged": true
}
Then run the same search again:
GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}
Run it and look at the end of the result:
"aggregations": {
  "group_by_tags": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "fangzhu", "doc_count": 2 },
      { "key": "meibai", ... },
      { "key": "qingxin", "doc_count": 1 }
    ]
  }
}
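This shows that the documents were grouped into buckets by the terms in tags, and that the doc_count of each bucket is the number of products carrying that tag. Setting size to 0, as in the next request, drops the hits and returns only the aggregation result: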
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}
{
  "took": 20,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 4, "max_score": 0, "hits": [] },
  "aggregations": {
    "all_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "fangzhu", "doc_count": 2 },
        { "key": "meibai", ... },
        { "key": "qingxin", "doc_count": 1 }
      ]
    }
  }
}
2. Second aggregation requirement: for products whose name contains yagao, count the number of products under each tag
GET /ecommerce/product/_search
{
  "size": 0,
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
The result is:
{
  "took": 6,
  ...
  "aggregations": {
    "all_tags": {
      "doc_count_error_upper_bound": 0,
      ...
      "buckets": [
        ...
        { ..., "doc_count": 1 }
      ]
    }
  }
}
3. Third aggregation requirement: group first, then compute an average per group, that is, the average price of the products under each tag
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
{
  "took": 8,
  ...
  "aggregations": {
    "group_by_tags": {
      ...
      "buckets": [
        { ..., "doc_count": 2, "avg_price": { "value": 27.5 } },
        { ..., "avg_price": { "value": 40 } },
        { ..., "doc_count": 1, "avg_price": { "value": 40 } }
      ]
    }
  }
}
4. Fourth analysis requirement: compute the average price of the products under each tag, sorted by average price in descending order
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags", "order": { "avg_price": "desc" } },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
The clause below means: group by tags, then sort the buckets by the value of their avg_price sub-aggregation in descending order
"terms" : { "field" : "tags","order": { "avg_price": "desc" } }
The result of the request above is:
{
  "took": 3,
  ...
  "aggregations": {
    "all_tags": {
      ...
      "buckets": [
        { "key": "meibai", ... },
        ...
        { "key": "fangzhu", ..., "avg_price": { "value": 27.5 } }
      ]
    }
  }
}
5. Fifth analysis requirement: group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 20 },
          { "from": 20, "to": 40 },
          { "from": 40, "to": 50 }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "average_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
The final result:
{
  "took": 61,
  ...
  "aggregations": {
    "group_by_price": {
      "buckets": [
        {
          "key": "0.0-20.0",
          "from": 0,
          "to": 20,
          "doc_count": 0,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            ...
            "buckets": []
          }
        },
        {
          "key": "20.0-40.0",
          "from": 20,
          "to": 40,
          ...
          "group_by_tags": {
            ...
            "buckets": [
              { ..., "average_price": { "value": 27.5 } },
              { ..., "average_price": { "value": 30 } }
            ]
          }
        },
        {
          "key": "40.0-50.0",
          "from": 40,
          "to": 50,
          ...
          "group_by_tags": {
            ...
            "buckets": [
              { "key": "qingxin", ..., "average_price": { "value": 40 } }
            ]
          }
        }
      ]
    }
  }
}