Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added output as JSON #375

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Added output as JSON #375

wants to merge 4 commits into from

Conversation

ZuluPro
Copy link

@ZuluPro ZuluPro commented Jan 24, 2019

Hello I added JSON output option with -j / --json-output.

Other outputs are disabled if JSON is enabled ande vice-versa.

@Shuiei
Copy link

Shuiei commented Jan 25, 2019

This need to be merged! I need it!

@leGoupil
Copy link

Why has not that been done yet?

@ZuluPro
Copy link
Author

ZuluPro commented Jan 27, 2019

@wg Any news ?

@methane
Copy link

methane commented Jan 28, 2019

@ALL Please don't hurry.
@ZuluPro sample output helps to review output format.

src/wrk.c Outdated Show resolved Hide resolved
src/wrk.c Outdated Show resolved Hide resolved
src/wrk.c Outdated Show resolved Hide resolved
src/wrk.c Outdated Show resolved Hide resolved
src/wrk.c Outdated Show resolved Hide resolved
}
printf(" \"runtime\": %"PRIu64",\n", runtime_us);
printf(" \"bytes\": %"PRIu64",\n", bytes);
printf(" \"bytes_per_sec\": %Lf,\n", bytes_per_s);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON is for machine. Why bytes_per_sec is needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several reason, because:

  • It is declared in the original text output
  • it is internally calculated and without documentation, no one is expected to know how you get it
  • If the value could be given by the output instead of code the calculation elsewhere, it's better
  • There's no cost to give it to user

Copy link

@methane methane Jan 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • It is declared in the original text output

I don't think JSON output should be same to original text output. Text is mainly for human. But JSON is more for machine.

Too detailed information for human eye are omitted in text output, but it can be included in JSON.
On the other hand, values easily calculated by other values can be omitted in JSON.

People expect backward compatibility about JSON format.
Adding items later is easy. Removing or changing items later is difficult.
So you should be careful about choosing which value is included, and how to format it.

it is internally calculated and without documentation, no one is expected to know how you get it
If the value could be given by the output instead of code the calculation elsewhere, it's better
There's no cost to give it to user

All of these reasons seems not enough for inclusion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@methane You don't say why it shouldn't be included.

Otherwise, even you know the runtime and wrk knows the number of transfered bytes. The throughput is still a value researched by a lot of people.

printf(" \"latency_percentile_90\": %.2Lf,\n", lat_perc_90);
printf(" \"latency_percentile_90\": %.2Lf,\n", lat_perc_90);
printf(" \"latency_percentile_95\": %.2Lf,\n", lat_perc_95);
printf(" \"latency_percentile_99\": %.2Lf,\n", lat_perc_99);
Copy link

@methane methane Jan 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really beautiful? How about

"latency_percentile": {
  "50": ...
  "75": ...
...

Copy link
Author

@ZuluPro ZuluPro Jan 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful or not, I think it in several aspects:

  • It gives exactly what a user could need.
  • I kept code simple and readable

I'm not against a new format, but let @wg decide about it.

@methane
Copy link

methane commented Jan 28, 2019

This is duplicate of #72 and there are no describe about why report.lua is not enough.

Addtionally, I expect more detailed information for JSON. I don't expect "for human" information (like avg) in JSON. For example, someone may want histogram instead of percentile.

If wrk has -J option, people would expect backward compatibility. It's difficult to refactoring JSON format.
So I think report.lua is much better. It is customizable. It's easier to keep backward compatibility, because people can keep using old report even when wrk updated it's report.lua format.

long double lat_perc_95 = stats_percentile(statistics.latency, 95.0) / 1000000.0;
long double lat_perc_99 = stats_percentile(statistics.latency, 99.0) / 1000000.0;
printf("{\n");
printf(" \"url\": \"%s\",\n", url);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems not be escaped here. Is this %s secure?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by secure ?

If you think about code injection, this is just JSON. And a parser would expect only 1 object.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone do this.

#!/bin/sh
RESULT=$(wrk -j ...)
my-dommand $RESULT

With url is "} || sudo rm /etc/passwd #

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will give:

my-dommand {"url": ""} || sudo rm /etc/passwd", "script": null}

So the ending quote is missing

Feel free to propose an enhancement 👍

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look my code #

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which code are you talking about ?

If you're pointing me the ending #, the problem is still not here as the closing quote " is not here. In interactive terminal, it will wait for you to close quotes. And in non-interactive, it will raise a syntax error.

src/wrk.c Outdated Show resolved Hide resolved
@ZuluPro
Copy link
Author

ZuluPro commented Jan 28, 2019

Hello @methane

@ZuluPro sample output helps to review output format.

Here's an example of output:

{
    "url": "https://example.com",
    "threads": 1,
    "connections": 1,
    "duration": 2,
    "timeout": 1000,
    "script": null,
    "runtime_us": 2008747,
    "bytes": 241470,
    "bytes_per_sec": 120209.264780,
    "requests_count": 3,
    "requests_per_sec": 1.493468,
    "requests_mean": 1.666667,
    "requests_stdev": 1.527525,
    "requests_min": 0,
    "requests_max": 3,
    "requests_within_stdev": 66.67,
    "latency_mean": 0.399244,
    "latency_stdev": 0.095570,
    "latency_min": 0.329713,
    "latency_max": 0.508223,
    "latency_within_stdev": 66.67,
    "latency_percentile_50": 0.36,
    "latency_percentile_75": 0.51,
    "latency_percentile_90": 0.51,
    "latency_percentile_95": 0.51,
    "latency_percentile_99": 0.51,
    "errors_connect": 0,
    "errors_read": 0,
    "errors_write": 0,
    "errors_timeout": 0,
    "errors_status": 0
}

This is duplicate of #72 ...

Your link is pointing to an issue, not a pull request, but yes, there were several (closed) contributions about JSON output:

... and there are no describe about why report.lua is not enough.

report.lua gives only a poor latency distribution, if you look at my JSON output there are tons more:

  • Run configuration
  • Requests data
  • Latency data
  • Errors

Addtionally, I expect more detailed information for JSON. I don't expect "for human" information (like avg) in JSON. For example, someone may want histogram instead of percentile.

This is a first contribution of JSON output and it actually gives more than the existing text one with more accuracy. Your example of histogram is a totally other feature/PR. Otherwise it could be enabled with the --latency option.

If wrk has -J option, people would expect backward compatibility. It's difficult to refactoring JSON format.

It's not hard if you warn users. People could expect changement if they change major or minor version.

So I think report.lua is much better. It is customizable. It's easier to keep backward compatibility, because people can keep using old report even when wrk updated it's report.lua format.

report.lua is not comparable, as said, it's a basis to make custom report with LuA. But except programmers, people expect a command working out-of-the-box. If I am a sysadmin I don't expect to learn LuA just to have machine-readable data.

because people can keep using old report even when wrk updated it's report.lua format.

Solutions aren't mutually exclusive, the ability to use LuA isn't erased by the JSON option

@methane
Copy link

methane commented Jan 28, 2019

report.lua gives only a poor latency distribution, if you look at my JSON output there are tons more:

"report.lua" is just an example. You can propose adding "reportjson.lua" example.
Then people can customize it for their usage.

@methane
Copy link

methane commented Jan 28, 2019

Your link is pointing to an issue, not a pull request, but yes, there were several (closed) contributions about JSON output:

...

report.lua is not comparable, as said, it's a basis to make custom report with LuA. But except programmers, people expect a command working out-of-the-box. If I am a sysadmin I don't expect to learn LuA just to have machine-readable data.

...

Solutions aren't mutually exclusive, the ability to use LuA isn't erased by the JSON option

I meant you should describe these reasoning before sending PR with only "Hello I added JSON output option", and pinging. Especially for features already rejected in the past.
I don't oppose against this. (+0)

@wg
Copy link
Owner

wg commented Jan 28, 2019

@methane thank you for taking the time to review and comment on this PR!

@ZuluPro as methane points out it should be possible to generate a JSON report using the Lua scripting API. This is preferable to hard-coding reports in wrk itself because everyone will have different requirements for format, content, etc.

My recommendation would be to develop JSON reporting as a Lua script and publish it as a blog post or to your own GitHub repo. I've left this repo's wiki open and it'd be a great contribution if you wanted to curate a page about available scripts and including yours.

@ZuluPro
Copy link
Author

ZuluPro commented Jan 28, 2019

Hello @wg

Except the latency details, the output gives almost everything, so no one could ask more data. If someone has issue with format it would be JSON's keys or unit of measurement: nothing hard as JSON is typed and machine readable.

I would like to point the project's audience : Benchmarkers. You are asking for LuA programming just to be able to have machine-readable output. For a sysadmin, your project stay a handy tool, but just for human eyes.

@ZuluPro
Copy link
Author

ZuluPro commented Jan 28, 2019

@methane

"report.lua" is just an example. You can propose adding "reportjson.lua" example. Then people can customize it for their usage.

As said, people who aren't programmers won't use scripts. Seeing reaction on top of this PR, I think a lot of people are agreed.

I meant you should describe these reasoning before sending PR with only "Hello I added JSON output option", and pinging. Especially for features already rejected in the past.

I saw the other PR after did mine, seeing the popularity of this project I didn't think this feature could be already rejected. About reason, nowadays, it's obvious that a machine-readable output is a base requirements for a benchmark command.

@methane
Copy link

methane commented Jan 28, 2019

I tried to implement sample implementation of JSON output.

https://gist.github.com/methane/8bb4fedf27e4d13db80078302f8954da

As you can see, it's difficult to combine with tools like jq by pipe.
Both of text output and Lua output are written into stdout.

@ZuluPro
Copy link
Author

ZuluPro commented Jan 28, 2019

@methane

As you can see, it's difficult to combine with tools like jq by pipe.

Yes this is why my -j option disables the default output. Text is useless as we aim machines.

Does the lua environment allow to disable the default output ?

@wg
Copy link
Owner

wg commented Jan 29, 2019

I'd suggest writing JSON output to a file whose path is specified via a script arg. Most likely someone who wants machine-readable output is also invoking wrk with a script. Easy enough to record the command line parameters in that script, generate a temp file name, pass that to wrk and read the results if wrk completes successfully.

@ZuluPro
Copy link
Author

ZuluPro commented Jan 29, 2019

I'd suggest writing JSON output to a file whose path is specified via a script arg.

@wg But with your suggestion we lose the ability to just pipe to another command, i.e. a JSON parser.

As what I coded is an option, it doesn't change the default behavior, but the new output should be also be given in usual conditions.

@methane
Copy link

methane commented Jan 29, 2019

Yes this is why my -j option disables the default output. Text is useless as we aim machines.

If so, why didn't describe it from first?
You should describe "why this is needed" when (or before) sending pull request.

@wg But with your suggestion we lose the ability to just pipe to another command, i.e. a JSON parser.

Using pipe is not a strict requirement for me.

@methane
Copy link

methane commented Jan 29, 2019

@wg How can I use script args from done func?

@wg
Copy link
Owner

wg commented Jan 29, 2019

@methane good question, it's a little tricky since done runs in a different context than init, need to grab a reference to one thread in setup, store the arg in a global in init, then use thread:get.

I'd use some Lua JSON library like https://github.com/rxi/json.lua to avoid a lot of ugly manual JSON building and then a script like:

json = require "json"

function setup(thread)
   thread0 = thread0 or thread
end
                   
function init(args)
   file = args[1] or "/dev/null"
end

function done(summary, latency, requests)
   file = io.open(thread0:get("file"), "w")

   percentiles = {}
   
   for _, p in pairs({ 50, 90, 99, 99.999 }) do
      k = string.format("%g%%", p)
      v = latency:percentile(p)
      percentiles[k] = v
   end
   
   file:write(json.encode({
       duration = summary.duration,
       requests = summary.requests,
       bytes    = summary.bytes,
       errors   = summary.errors,
       latency  = {
          min         = latency.min,
          max         = latency.max,
          mean        = latency.mean,
          stdev       = latency.stdev,
          percentiles = percentiles,
       },
   }))
   file:close()
end

Run as wrk ... -s report.lua -- output.json

@methane
Copy link

methane commented Jan 29, 2019

Wow, good example!
Would you post it in Wiki? I think it is very helpful for many users.

@Nex-Otaku
Copy link

Nex-Otaku commented Mar 5, 2020

I also need this option.
It would really helped to make atomated benchmark I am working on now.

Lua script is highly customisable, but it not helps. It just makes everything more complicated (

Why messing with additional files, when you could add just a small parameter to command? It would be so much easier...

@yordis
Copy link

yordis commented Mar 25, 2021

Hey folks, any updates on this one?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants