Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

groupBy not grouping by a tag created with eval #1765

Open
petetnt opened this issue Jan 18, 2018 · 15 comments · May be fixed by #1773
Open

groupBy not grouping by a tag created with eval #1765

petetnt opened this issue Jan 18, 2018 · 15 comments · May be fixed by #1773

Comments

@petetnt
Copy link
Contributor

petetnt commented Jan 18, 2018

  • Full details of your operating system (or distribution) e.g. 64-bit Ubuntu 14.04.
    • Running InfluxDB/Kapacitor/Chronograf as Docker containers on MacOSX, latest Docker.
  • The version of Kapacitor you are running
    • Latest, 1.4.
  • Whether you installed it using a pre-built package, or built it from source.
    • Official Docker container

We are running into an issue with TICKscript and its groupBy behaviour.

We have two sets of measurements, indoor_temperatures and outdoor_temperatures, which we query with a batch.

The queries look as follows:

var out_temp = batch
    |query('SELECT mean(temperature) FROM yyyy')
        .every(10s)
        .period(120d)
        .groupBy(time(1h))
        .fill(0)

var in_temp = batch
    |query('SELECT mean(temperature) FROM xxxx')
        .every(10s)
        .period(120d)
        .groupBy(time(1h))
        .fill(0)

If we HTTP out both of them, they create the following sets of data:

{
  "series": [
    {
      "name": "outdoor_temperatures",
      "columns": [
        "time",
        "mean"
      ],
      "values": [
        [
          "2017-09-20T17:00:00Z",
          0
        ],
        [
          "2017-09-20T18:00:00Z",
          11.5
        ]
        ... the rest
      ]
    }
  ]
}

{
  "series": [
    {
      "name": "indoor_measurements",
      "columns": [
        "time",
        "mean"
      ],
      "values": [
        [
          "2017-09-20T17:00:00Z",
          585.44012944984
        ],
        [
          "2017-09-20T18:00:00Z",
          592.94890510949
        ]
        ... the rest
      ]
    }
  ]
}

Now we do a full join of them, which gives us expected results

out_temp
    |join(in_temp)
        .as('out_temp_mean', 'in_temp_mean')
        .tolerance(5m)
        .fill(0)

httpOut:

{
  "series": [
    {
      "name": "outdoor_temperatures",
      "columns": [
        "time",
        "in_temp_mean.mean",
        "out_temp_mean.mean"
      ],
      "values": [
        [
          "2017-09-20T17:00:00Z",
          586.10175438596,
          0
        ],
        [
          "2017-09-20T18:00:00Z",
          592.94890510949,
          11.5
        ]
        ... the rest
      ]
    }
  ]
}

Which looks perfect. The issue raises when we want to round the out_temp_mean.mean down and groupBy it

So we go ahead and extend the script

out_temp
    |join(in_temp)
        .as('out_temp_mean', 'in_temp_mean')
        .tolerance(5m)
        .fill(0)
    |eval(lambda: string(floor("out_temp_mean.mean")))
        .as('bucket')
        .tags('bucket')
        .keep('out_temp_mean.mean', 'in_temp_mean.mean')

After which the output STILL looks as it should:

{
  "series": [
    {
      "name": "outdoor_temperatures",
      "columns": [
        "time",
        "in_temp_mean.mean",
        "out_temp_mean.mean",
        "bucket"
      ],
      "values": [
        [
          "2017-09-20T17:00:00Z",
          586.99190283401,
          0,
          "0"
        ],
        [
          "2017-09-20T18:00:00Z",
          592.94890510949,
          11.5,
          "11"
        ]
      ]
    }
  ]
}

Now only thing left is to group the values by the new tag bucket:

out_temp
    |join(in_temp)
        .as('out_temp_mean', 'in_temp_mean')
        .tolerance(5m)
        .fill(0)
    |eval(lambda: string(floor("out_temp_mean.mean")))
        .as('bucket')
        .tags('bucket')
        .keep('out_temp_mean.mean', 'in_temp_mean.mean')
    |groupBy('bucket')

After which everything goes awry and we are greeted with series: null

{
  "series": null
}

Is this expected behaviour? A bug? Or something else?

@petetnt
Copy link
Contributor Author

petetnt commented Jan 18, 2018

It seems that the exact feature was added in #731.

@nathanielc does the usage look correct to you?

@nathanielc
Copy link
Contributor

@petetnt Your script and usage look fine. The groupBy node has to buffer some data in order to ensure it recreates the groups correctly. Could that be the issue? Does it still happen if you simplify the script and just have an eval|groupBy step?

@petetnt
Copy link
Contributor Author

petetnt commented Jan 18, 2018

Seems that you might be right @nathanielc, simplied the script to the following and the issue still persists:

var out_temp = batch
    |query('SELECT mean(temperature) FROM xxx')
        .every(10s)
        .period(120d)
        .groupBy(time(1h))
        .fill(0)
    |httpOut('1')
    |eval(lambda: string(floor("mean")))
        .as('bucket')
        .tags('bucket')
        .keep('mean')
    |httpOut('2')
    |groupBy('bucket')
    |httpOut('3')

Endpoints 1 and 2 give expected outputs, 3 is again null.

Also ruled out regressions by trying with Kapacitor 1.3.4 and 1.2.0 which both had the same behaviour (except 1.2.0 returned {} instead of { series: null }).

Not quite sure what to try next 🤔

@nathanielc
Copy link
Contributor

Can you change out the httpOut nodes for log nodes and share the logs for each step?

@petetnt
Copy link
Contributor Author

petetnt commented Jan 18, 2018

Here you go, it seems that ghissue3 is never being called either 🤔 (nothing with loglevel DEBUG either.)

var out_temp = batch
    |query('SELECT mean(temperature) FROM "xxxxx" LIMIT 5')
        .every(60s)
        .period(120d)
        .groupBy(time(1h))
        .fill(0)
    |log()
        .prefix('ghissue1')
    |eval(lambda: string(floor("mean")))
        .as('bucket')
        .tags('bucket')
        .keep('mean')
    |log()
        .prefix('ghissue2')
    |groupBy('bucket')
    |log()
        .prefix('ghissue3')
ts=2018-01-18T20:22:05.807Z lvl=debug msg="starting next batch query" service=kapacitor task_master=main task=Test node=log6 query="SELECT mean(temperature) FROM xxxxxxx WHERE time >= '2017-09-20T20:22:05.805615392Z' AND time < '2018-01-18T20:22:05.805615392Z' GROUP BY time(1h, 0s) fill(0) LIMIT 5"
ts=2018-01-18T20:22:05.835Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:05.835Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:22:05.836Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:22:05.837Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:22:05.838Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:22:05.839Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:05.840Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:05.841Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:05.842Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:22:05.844Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:22:05.845Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:22:05.846Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:22:05.847Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:05.847Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:22:10.023Z lvl=info msg="http request" service=http host=172.18.0.2 username=- start=2018-01-18T20:22:10.017520352Z method=POST uri=/write?consistency=&db=_internal&precision=ns&rp=monitor protocol=HTTP/1.1 status=204 referer=- user-agent=InfluxDBClient request-id=42d8b1a9-fc8d-11e7-80b8-000000000000 duration=5.763459ms
ts=2018-01-18T20:22:20.025Z lvl=info msg="http request" service=http host=172.18.0.2 username=- start=2018-01-18T20:22:20.022088758Z method=POST uri=/write?consistency=&db=_internal&precision=ns&rp=monitor protocol=HTTP/1.1 status=204 referer=- user-agent=InfluxDBClient request-id=48cf4521-fc8d-11e7-80b9-000000000000 duration=3.103017ms

@nathanielc
Copy link
Contributor

@petetnt Can you let the script run for a few minutes so that ghissue1 and 2 each have a few batches of data being processed? My guess at this point is that the groupBy is just buffering the data more than expected. After the run can you share the logs and the kapacitor show output? Thanks!

@petetnt
Copy link
Contributor Author

petetnt commented Jan 18, 2018

Sure thing. We thought about the buffering thing too, when couple of times the data seemed to show up after coming back from a coffee break or two only never to be seen again.

I let it run for 20 minutes just now, there's the results:

ID: Test
Error: 
Template: 
Type: batch
Status: enabled
Executing: true
Created: 18 Jan 18 12:44 UTC
Modified: 18 Jan 18 20:25 UTC
LastEnabled: 18 Jan 18 20:25 UTC
Databases Retention Policies: ["xxxxxxx"."autogen"]
TICKscript:
var out_temp = batch
    |query('SELECT mean(temperature) FROM xxxxxx LIMIT 5')
        .every(10s)
        .period(120d)
        .groupBy(time(1h))
        .fill(0)
    |log()
        .prefix('ghissue1')
    |eval(lambda: string(floor("mean")))
        .as('bucket')
        .tags('bucket')
        .keep('mean')
    |log()
        .prefix('ghissue2')
    |groupBy('bucket')
    |log()
        .level('DEBUG')
        .prefix('ghissue3')

DOT:
digraph Test {
graph [throughput="0.00 batches/s"];

query1 [avg_exec_time_ns="74.388397ms" batches_queried="104" errors="0" points_queried="520" working_cardinality="0" ];
query1 -> log2 [processed="104"];

log2 [avg_exec_time_ns="8.010612ms" errors="0" working_cardinality="0" ];
log2 -> eval3 [processed="104"];

eval3 [avg_exec_time_ns="61.799µs" errors="0" working_cardinality="1" ];
eval3 -> log4 [processed="104"];

log4 [avg_exec_time_ns="1.281644ms" errors="0" working_cardinality="0" ];
log4 -> groupby5 [processed="104"];

groupby5 [avg_exec_time_ns="41.629µs" errors="0" working_cardinality="2" ];
groupby5 -> log6 [processed="0"];

log6 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
}

The logs repeat showed nothing new, basically this for 15 minutes:


ts=2018-01-18T20:43:17.076Z lvl=debug msg="starting next batch query" service=kapacitor task_master=main task=Test node=log6 query="SELECT mean(temperature) FROM xxxxxxxx WHERE time >= '2017-09-20T20:43:17.074861855Z' AND time < '2018-01-18T20:43:17.074861855Z' GROUP BY time(1h, 0s) fill(0) LIMIT 5"
ts=2018-01-18T20:43:17.100Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:17.101Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:17.102Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:17.103Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:17.104Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:17.105Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:17.107Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:17.107Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:17.108Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:17.110Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:17.111Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:17.112Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:17.113Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:17.114Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:20.019Z lvl=info msg="http request" service=http host=172.18.0.2 username=- start=2018-01-18T20:43:20.015650704Z method=POST uri=/write?consistency=&db=_internal&precision=ns&rp=monitor protocol=HTTP/1.1 status=204 referer=- user-agent=InfluxDBClient request-id=37d30822-fc90-11e7-8162-000000000000 duration=3.994661ms
ts=2018-01-18T20:43:27.074Z lvl=debug msg="starting next batch query" service=kapacitor task_master=main task=Test node=log6 query="SELECT mean(temperature) FROM xxxxxxx WHERE time >= '2017-09-20T20:43:27.074198796Z' AND time < '2018-01-18T20:43:27.074198796Z' GROUP BY time(1h, 0s) fill(0) LIMIT 5"
ts=2018-01-18T20:43:27.102Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:27.103Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:27.103Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:27.105Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:27.105Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:27.106Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:27.107Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:27.108Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:27.108Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:27.110Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:27.111Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:27.112Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:27.112Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:27.113Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:28.092Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:28.075801533Z method=GET uri=/kapacitor/v1/tasks/Test?dot-view=attributes&replay-id=&script-format=formatted protocol=HTTP/1.1 status=200 referer=- user-agent=KapacitorClient request-id=3ca0e947-fc90-11e7-8163-000000000000 duration=16.176739ms
ts=2018-01-18T20:43:28.123Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:28.122792466Z method=GET uri=/kapacitor/v1/ping protocol=HTTP/1.1 status=204 referer=http://localhost:8888/sources/1/tickscript/Test user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36" request-id=3ca814f8-fc90-11e7-8165-000000000000 duration=47.399µs
ts=2018-01-18T20:43:28.134Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:28.122639667Z method=GET uri=/kapacitor/v1/ping protocol=HTTP/1.1 status=204 referer=http://localhost:8888/sources/1/tickscript/Test user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36" request-id=3ca814f8-fc90-11e7-8165-000000000000 duration=5.661244ms
ts=2018-01-18T20:43:28.147Z lvl=info msg="created log session" service=sessions id=256706e7-766e-4a44-a56f-798f68fade66 content-type=application/json tags_0=task=Test
ts=2018-01-18T20:43:30.019Z lvl=info msg="http request" service=http host=172.18.0.2 username=- start=2018-01-18T20:43:30.016077038Z method=POST uri=/write?consistency=&db=_internal&precision=ns&rp=monitor protocol=HTTP/1.1 status=204 referer=- user-agent=InfluxDBClient request-id=3dc8f94f-fc90-11e7-8167-000000000000 duration=3.208563ms
ts=2018-01-18T20:43:32.601Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:32.591282052Z method=GET uri=/kapacitor/v1/tasks/Test?dot-view=attributes&replay-id=&script-format=formatted protocol=HTTP/1.1 status=200 referer=- user-agent=KapacitorClient request-id=3f51eb50-fc90-11e7-8168-000000000000 duration=10.421183ms
ts=2018-01-18T20:43:32.616Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:32.605069418Z method=GET uri=/kapacitor/v1/tasks/Test?dot-view=attributes&replay-id=&script-format=formatted protocol=HTTP/1.1 status=200 referer=- user-agent=KapacitorClient request-id=3f5405cf-fc90-11e7-8169-000000000000 duration=11.393087ms
ts=2018-01-18T20:43:32.624Z lvl=info msg="http request" service=http host=172.18.0.4 username=- start=2018-01-18T20:43:32.623936192Z method=PATCH uri=/kapacitor/v1/tasks/Test protocol=HTTP/1.1 status=400 referer=- user-agent=KapacitorClient request-id=3f56e6c1-fc90-11e7-816a-000000000000 duration=439.875µs
ts=2018-01-18T20:43:37.075Z lvl=debug msg="starting next batch query" service=kapacitor task_master=main task=Test node=log6 query="SELECT mean(temperature) FROM xxxxxxxxxxx WHERE time >= '2017-09-20T20:43:37.074282598Z' AND time < '2018-01-18T20:43:37.074282598Z' GROUP BY time(1h, 0s) fill(0) LIMIT 5"
ts=2018-01-18T20:43:37.106Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:37.107Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:37.108Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:37.109Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:37.110Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:37.112Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:37.112Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:37.114Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:37.115Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T20:43:37.117Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T20:43:37.117Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T20:43:37.119Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T20:43:37.119Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T20:43:37.120Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T00:00:00Z

I'll let it run in the background with every(5m) for a longer while and see if ghissue3 ever shows up.

Thanks alot for your help, really appreciated!

@nathanielc
Copy link
Contributor

@petetnt Thanks for all these details. At this point it definitely looks like a bug. I should have enough to go on. I'll take a look at the code in a bit.

@petetnt
Copy link
Contributor Author

petetnt commented Jan 18, 2018

Awesome! Do ping back if you need any additional details or need something to be tested, much appreciated!

@petetnt petetnt changed the title groupBy not grouping by a tag created with eval after a join groupBy not grouping by a tag created with eval Jan 18, 2018
@petetnt
Copy link
Contributor Author

petetnt commented Jan 19, 2018

Okay, eventually the ghissue3 log showed up:


ts=2018-01-18T21:03:46.218Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.220Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=0 time=2017-09-20T21:00:00Z
ts=2018-01-18T21:03:46.220Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T21:03:46.221Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T21:03:46.222Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.223Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  field_mean=10.3 time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.224Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue1 name=outdoor_temperatures group=  time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.225Z lvl=info msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.226Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=0 field_mean=0 time=2017-09-20T21:00:00Z
ts=2018-01-18T21:03:46.227Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T21:03:46.228Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T21:03:46.230Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.231Z lvl=info msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group= tag_bucket=10 field_mean=10.3 time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.231Z lvl=info msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue2 name=outdoor_temperatures group=  time=2017-09-21T01:00:00Z
ts=2018-01-18T21:03:46.233Z lvl=debug msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=0 tag_bucket=0 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.237Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=0 tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T21:03:46.237Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=0 tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T21:03:46.239Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=0 tag_bucket=0 field_mean=0 time=2017-09-20T20:00:00Z
ts=2018-01-18T21:03:46.240Z lvl=debug msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=0 tag_bucket=0 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.242Z lvl=debug msg="begin batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.244Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T21:03:46.245Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T21:03:46.247Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-20T21:00:00Z
ts=2018-01-18T21:03:46.249Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T21:03:46.251Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T21:03:46.253Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.2 time=2017-09-20T22:00:00Z
ts=2018-01-18T21:03:46.255Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T21:03:46.256Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T21:03:46.257Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.3 time=2017-09-20T23:00:00Z
ts=2018-01-18T21:03:46.260Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.266Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.267Z lvl=debug msg="batch point" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 field_mean=10.4 time=2017-09-21T00:00:00Z
ts=2018-01-18T21:03:46.269Z lvl=debug msg="end batch" service=kapacitor task_master=main task=Test node=log6 prefix=ghissue3 name=outdoor_temperatures group=bucket=10 tag_bucket=10 time=2017-09-21T00:00:00Z

No idea how long the script was running for it to appear though? Some hours maybe 🤔 (Should have written the logs to a file 🗄 )

@petetnt
Copy link
Contributor Author

petetnt commented Jan 19, 2018

Possibly related to #1249 (comment). Not sure why though, the data does match always 🤔

@petetnt
Copy link
Contributor Author

petetnt commented Jan 19, 2018

Here's an output of the data when it's working. The grouping works correctly, but it also quadruples all the values for some reason??

{
	"series": [{
			"name": "outdoor_temperatures",
			"tags": {
				"bucket": "14"
			},
			"columns": [
				"time",
				"mean"
			],
			"values": [
				[
					"2017-09-21T11:00:00Z",
					14
				],
				[
					"2017-09-21T11:00:00Z",
					14
				],
				[
					"2017-09-21T11:00:00Z",
					14
				],
				[
					"2017-09-21T11:00:00Z",
					14
				],
				[
					"2017-09-21T12:00:00Z",
					14.1
				],
				[
					"2017-09-21T12:00:00Z",
					14.1
				],
				[
					"2017-09-21T12:00:00Z",
					14.1
				],
				[
					"2017-09-21T12:00:00Z",
					14.1
				],
				[
					"2017-09-21T14:00:00Z",
					14
				],
				[
					"2017-09-21T14:00:00Z",
					14
				],
				[
					"2017-09-21T14:00:00Z",
					14
				],
				[
					"2017-09-21T14:00:00Z",
					14
				]
			]
		},
		{
			"name": "outdoor_temperatures",
			"tags": {
				"bucket": "13"
			},
			"columns": [
				"time",
				"mean"
			],
			"values": [
				[
					"2017-09-21T13:00:00Z",
					13.8
				],
				[
					"2017-09-21T13:00:00Z",
					13.8
				],
				[
					"2017-09-21T13:00:00Z",
					13.8
				],
				[
					"2017-09-21T13:00:00Z",
					13.8
				]
			]
		},
		{
			"name": "outdoor_temperatures",
			"tags": {
				"bucket": "0"
			},
			"columns": [
				"time",
				"mean"
			],
			"values": [
				[
					"2017-09-21T10:00:00Z",
					0
				],
				[
					"2017-09-21T10:00:00Z",
					0
				],
				[
					"2017-09-21T10:00:00Z",
					0
				],
				[
					"2017-09-21T10:00:00Z",
					0
				]
			]
		}
	]
}

@valstu
Copy link

valstu commented Jan 22, 2018

We tried this with even smaller data sets period(1d) and and we're still facing the same issue 🤔

@petetnt
Copy link
Contributor Author

petetnt commented Jan 23, 2018

I have spent a while debugging this now with @valstu (first time figuring out Golang code at scale 😂 ) and here's some findings:

In there lines:

https://github.com/influxdata/kapacitor/blob/master/group_by.go#L135-L137

The time.Time t is only changes during exact hours, which means that

https://github.com/influxdata/kapacitor/blob/master/group_by.go#L137

is only called once an hour -> meaning that data comes through only once a hour.

The t comes from calling Time() on edge.BeginBatchPoint here:

https://github.com/influxdata/kapacitor/blob/master/group_by.go#L76-L80

Possible fix pending at #1773

@petetnt petetnt linked a pull request Jan 23, 2018 that will close this issue
3 tasks
@Hitesh-Agrawal
Copy link

Hitesh-Agrawal commented May 7, 2020

Spend hours debugging this and facing a similar issue the "groupBy not grouping by a tag created with eval"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants