大家平时都用过ELK做日志搜索和展示吧?kibana中有Visualize模块,能够生成各种各样的图表,但是只能对单一化表进行统计处理.
现在我们要面临一个这样的需求:
1.我有两张原始搜索表和订单表
2.现在要计算每天的搜索到下单的转化率并且在前端以报表的形式来进行展示
分析:
这个问题涉及到每张表根据时间做统计后再次做处理生产一组数据,传递到前端JS生成要展示的报表的流程.
按道理说做这个过程不是很难,但是需要写相应的sql,后台代码,前端JS代码,如果有需要这种报表,我们每次都要写类似的代码
很麻烦,现在想将这类需求的解决方案简化为某些查询语句,然后就可以自动化的出现相应的报表.
ELK官方文档地址 :https://www.elastic.co/guide/index.html
我是从ELK来尝试解决这个问题的,上面说到Visualize模块生成单表图表,查看了其发送的请求参数.发现其使用到了ES中的聚合(Aggregations).首先要验证下ES是否能支持上述的需求的解决,如果ES能直接进行多索引聚合后的计算工作,那么在看看前端时候
能够根据查询来画图即可.然后我将问题转化为求不同索引聚类后的再计算问题.带着问题查询官方问题,查找解决方案.
首先我们看一条基本的聚类语句(语句测试皆在ELK Dev Tools下):
- {
- "size": 0,"query": {
- "bool": {
- "must": [
- {
- "query_string": {
- "analyze_wildcard": true,"query": "*"
- }
- },{
- "range": {
- "@timestamp": {
- "gte": 1508401847040,"lte": 1508402747040,"format": "epoch_millis"
- }
- }
- }
- ],"must_not": []
- }
- },"aggs": {
- "out_field": {
- "date_histogram": {
- "field": "@timestamp","interval": "60s","time_zone": "Asia/Shanghai","min_doc_count": 1
- },"aggs": {
- "sum_time":{
- "sum":{
- "field" : "time"
- }
- }
- }
- }
- }
- }
- {
- "aggs" : {
- "red_products" : {
- "filter" : { "term": { "color": "red" } },"aggs" : {
- "avg_price" : { "avg" : { "field" : "price" } }
- }
- }
- }
- }
- {
- "bucket_script": {
- "buckets_path": {
- "my_var1": "the_sum","my_var2": "the_value_count"
- },"script": "params.my_var1 / params.my_var2"
- }
- }
这里看到,我们能根据聚合的桶的路径来获取聚合后的结果,并且能写脚本再次进行计算处理.到这里我们的理论知识算是理清楚了,是可行的,下面写了一个实验性的查询及其结果的展示:
- 查询语句:
- {
- "size": 0,"min_doc_count": 1,"order": {
- "sum_atime" : "desc"
- }
- },"aggs": {
- "sum_atime":{
- "sum":{
- "script": {
- "inline": "doc['atime'].value","lang": "expression"
- }
- }
- },"sum_time":{
- "sum":{
- "field" : "time"
- }
- },"opr_doc_two":{
- "bucket_script": {
- "buckets_path": {
- "tShirtSales": "sum_atime","totalSales": "sum_time"
- },"script": "params.tShirtSales / params.totalSales"
- }
- }
- }
- }
- }
- }
- 结果展示:
- {
- "took": 216,"timed_out": false,"_shards": {
- "total": 700,"successful": 265,"Failed": 435,"failures": [
- {
- "shard": 0,"index": ".kibana","node": "53Qqb82cRXSQ2upasHdfbA","reason": {
- "type": "script_exception","reason": "link error","caused_by": {
- "type": "parse_exception","reason": "parse_exception: Field [atime] does not exist in mappings"
- },"script_stack": [
- "doc['atime'].value"," ^---- HERE"
- ],"script": "doc['atime'].value","lang": "expression"
- }
- }
- ]
- },"hits": {
- "total": 313546,"max_score": 0,"hits": []
- },"aggregations": {
- "out_field": {
- "buckets": [
- {
- "key_as_string": "2017-10-19T16:41:00.000+08:00","key": 1508402460000,"doc_count": 25583,"sum_time": {
- "value": 969439
- },"sum_atime": {
- "value": 651787
- },"opr_doc_two": {
- "value": 0.6723342056591493
- }
- },{
- "key_as_string": "2017-10-19T16:40:00.000+08:00","key": 1508402400000,"doc_count": 22246,"sum_time": {
- "value": 845529
- },"sum_atime": {
- "value": 554232
- },"opr_doc_two": {
- "value": 0.6554855007929947
- }
- },{
- "key_as_string": "2017-10-19T16:42:00.000+08:00","key": 1508402520000,"doc_count": 20575,"sum_time": {
- "value": 778218
- },"sum_atime": {
- "value": 521955
- },"opr_doc_two": {
- "value": 0.6707053807544929
- }
- },{
- "key_as_string": "2017-10-19T16:34:00.000+08:00","key": 1508402040000,"doc_count": 21250,"sum_time": {
- "value": 767982
- },"sum_atime": {
- "value": 513100
- },"opr_doc_two": {
- "value": 0.6681146172696756
- }
- },{
- "key_as_string": "2017-10-19T16:31:00.000+08:00","key": 1508401860000,"doc_count": 19940,"sum_time": {
- "value": 775634
- },"sum_atime": {
- "value": 509484
- },"opr_doc_two": {
- "value": 0.6568613547111137
- }
- },{
- "key_as_string": "2017-10-19T16:35:00.000+08:00","key": 1508402100000,"doc_count": 21473,"sum_time": {
- "value": 793853
- },"sum_atime": {
- "value": 502308
- },"opr_doc_two": {
- "value": 0.6327468687527792
- }
- },{
- "key_as_string": "2017-10-19T16:43:00.000+08:00","key": 1508402580000,"doc_count": 20519,"sum_time": {
- "value": 786313
- },"sum_atime": {
- "value": 501412
- },"opr_doc_two": {
- "value": 0.6376748190606031
- }
- },{
- "key_as_string": "2017-10-19T16:38:00.000+08:00","key": 1508402280000,"doc_count": 20088,"sum_time": {
- "value": 730432
- },"sum_atime": {
- "value": 482921
- },"opr_doc_two": {
- "value": 0.661144363883291
- }
- },{
- "key_as_string": "2017-10-19T16:39:00.000+08:00","key": 1508402340000,"doc_count": 19921,"sum_time": {
- "value": 719913
- },"sum_atime": {
- "value": 479382
- },"opr_doc_two": {
- "value": 0.6658887948960499
- }
- },{
- "key_as_string": "2017-10-19T16:33:00.000+08:00","key": 1508401980000,"doc_count": 19884,"sum_time": {
- "value": 735780
- },"sum_atime": {
- "value": 479306
- },"opr_doc_two": {
- "value": 0.6514256978988284
- }
- },{
- "key_as_string": "2017-10-19T16:36:00.000+08:00","key": 1508402160000,"doc_count": 20203,"sum_time": {
- "value": 739149
- },"sum_atime": {
- "value": 473916
- },"opr_doc_two": {
- "value": 0.6411643660479822
- }
- },{
- "key_as_string": "2017-10-19T16:44:00.000+08:00","key": 1508402640000,"doc_count": 19370,"sum_time": {
- "value": 712834
- },"sum_atime": {
- "value": 469712
- },"opr_doc_two": {
- "value": 0.6589360215702393
- }
- },{
- "key_as_string": "2017-10-19T16:45:00.000+08:00","key": 1508402700000,"doc_count": 17464,"sum_time": {
- "value": 670617
- },"sum_atime": {
- "value": 466811
- },"opr_doc_two": {
- "value": 0.696091807991745
- }
- },{
- "key_as_string": "2017-10-19T16:37:00.000+08:00","key": 1508402220000,"doc_count": 19610,"sum_time": {
- "value": 742356
- },"sum_atime": {
- "value": 465973
- },"opr_doc_two": {
- "value": 0.627694798721907
- }
- },{
- "key_as_string": "2017-10-19T16:32:00.000+08:00","key": 1508401920000,"doc_count": 21206,"sum_time": {
- "value": 734534
- },"sum_atime": {
- "value": 456836
- },"opr_doc_two": {
- "value": 0.6219398965874963
- }
- },{
- "key_as_string": "2017-10-19T16:30:00.000+08:00","key": 1508401800000,"doc_count": 4214,"sum_time": {
- "value": 158243
- },"sum_atime": {
- "value": 103033
- },"opr_doc_two": {
- "value": 0.6511062100693238
- }
- }
- ]
- }
- }
- }
截图:
Timelione
Timelion is a time series data visualizer that enables you to combine totally independent data sources within a single visualization. It’s driven by a simple expression language you use to retrieve time series data,perform calculations to tease out the answers to complex questions,and visualize the results.
看看原始数据,自己算下,计算的比值是正确的: