
2217 add io module for profiler #2287


Open: wants to merge 4 commits into base: develop


@smedegaard smedegaard commented Jun 4, 2025

Proposed changes

As described in the issue, this is an effort to extend and improve the output of CK Profiler.

Discussion

This pull request is meant to start a conversation rather than to serve as a final implementation.

Disclaimer: the author is not experienced in C++. Input and feedback on how to do things in a more idiomatic way are welcome.

What's in the PR

This PR includes

  • adding argparse for easier parsing of arguments
  • io_profiler
    • support for output to:
      • console (the default; uses the same format as before)
      • newline-delimited JSON (jsonl)
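To make the output selection concrete, here is a minimal sketch of how a value such as `jsonl=result.jsonl` (the form passed to `-o` in the example further down) could be split into a format and a path. It uses only the standard library and does not reflect the PR's actual argparse integration; `OutputFormat`, `OutputSpec`, and `parse_output_spec` are hypothetical names introduced for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical types for illustration; not the PR's actual io_profiler API.
enum class OutputFormat { Console, Jsonl };

struct OutputSpec {
    OutputFormat format = OutputFormat::Console;
    std::string path; // empty for console output
};

// Parse the value passed to -o, e.g. "jsonl=result.jsonl" or "console".
// Returns std::nullopt for an unrecognised format or a jsonl spec without a path.
inline std::optional<OutputSpec> parse_output_spec(const std::string& arg)
{
    const auto eq          = arg.find('=');
    const std::string kind = arg.substr(0, eq); // whole string when '=' is absent
    const std::string path = (eq == std::string::npos) ? "" : arg.substr(eq + 1);

    if(kind == "console" && path.empty())
        return OutputSpec{OutputFormat::Console, ""};
    if(kind == "jsonl" && !path.empty())
        return OutputSpec{OutputFormat::Jsonl, path};
    return std::nullopt;
}
```

Returning `std::optional` keeps the "unknown format" case explicit, so the caller can decide whether to fall back to console output or report an error.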

Output Format

The output format is intended to capture the same data as the current implementation and to be flexible enough to support the different kinds of operations available.
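One way to get that flexibility is to keep a fixed set of metric fields and put everything operation-specific into a key/value list. The sketch below serialises such a record as one JSON object per line. It is an illustration under assumptions, not the PR's code: `ProfileRecord`, `ParamValue`, `json_escape`, and `to_jsonl` are hypothetical names, only a subset of the fields from the example output is included, and the escaping covers only the most common characters.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical record type; field names follow the example output, not the PR's code.
using ParamValue = std::variant<long, std::string>;

struct ProfileRecord {
    std::string operation;
    double time_ms    = 0;
    double tflops     = 0;
    double gb_per_sec = 0;
    bool is_best      = false;
    std::vector<std::pair<std::string, ParamValue>> params; // operation-specific
};

// Escape the most common characters that must not appear raw in a JSON string.
// A complete implementation would also escape the remaining control characters.
inline std::string json_escape(const std::string& s)
{
    std::string out;
    for(char c : s)
    {
        switch(c)
        {
        case '"': out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\t': out += "\\t"; break;
        default: out += c;
        }
    }
    return out;
}

// Serialise one record as a single JSON object on one line (no trailing newline).
inline std::string to_jsonl(const ProfileRecord& r)
{
    std::ostringstream os;
    os << "{\"operation\":\"" << json_escape(r.operation) << "\""
       << ",\"time_ms\":" << r.time_ms << ",\"tflops\":" << r.tflops
       << ",\"gb_per_sec\":" << r.gb_per_sec
       << ",\"is_best\":" << (r.is_best ? "true" : "false")
       << ",\"operation_params\":{";
    bool first = true;
    for(const auto& [key, value] : r.params)
    {
        os << (first ? "" : ",") << "\"" << json_escape(key) << "\":";
        if(const auto* n = std::get_if<long>(&value))
            os << *n; // integers are written unquoted
        else
            os << "\"" << json_escape(std::get<std::string>(value)) << "\"";
        first = false;
    }
    os << "}}";
    return os.str();
}
```

A GEMM run would fill `params` with entries such as `M`, `N`, `K`, and the layouts, while another operation could supply its own keys without changing the record type.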

What is NOT in the PR

The PR only updates profiler/include/profiler/profile_gemm_impl.hpp to use the new io_profiler functions.
If this approach to output is accepted, all the output logic will live in io_profiler, so all the other implementation headers will need to be updated to use it.

It does not capture metadata such as GPU name, architecture, CK version, etc.

Example

Tested on Radeon RX 7900 XT

./bin/ckProfiler gemm 1 2 1 2 0 1 3840 4096 4096 4096 4096 4096 5 10 -o jsonl=result.jsonl

cat result.jsonl

{"operation":"DeviceGemmDl<256, 128, 128, 16, 2, 4, 4, 1>","time_ms":3.20498,"tflops":40.2027,"gb_per_sec":30.0997,"is_best":false,"timestamp":"2025-06-04T11:12:33.466Z","N":4096,"M":3840,"layout_b":"RowMajor","output_datatype":"unknown","K":4096,"weight_datatype":"unknown","input_datatype":"unknown","layout_c":"RowMajor","layout_a":"ColumnMajor","datatype":"unknown","operation_type":"gemm","operation_params":{"N":4096,"M":3840,"layout_b":"RowMajor","output_datatype":"unknown","K":4096,"weight_datatype":"unknown","input_datatype":"unknown","layout_c":"RowMajor","layout_a":"ColumnMajor","datatype":"unknown","operation_type":"gemm"}}
{"operation":"DeviceGemmDl<256, 128, 128, 16, 2, 4, 4, 1>","time_ms":3.14917,"tflops":40.9153,"gb_per_sec":30.6332,"is_best":false,"timestamp":"2025-06-04T11:12:33.655Z","N":4096,"M":3840,"layout_b":"RowMajor","output_datatype":"unknown","K":4096,"weight_datatype":"unknown","input_datatype":"unknown","layout_c":"RowMajor","layout_a":"ColumnMajor","datatype":"unknown","operation_type":"gemm","operation_params":{"N":4096,"M":3840,"layout_b":"RowMajor","output_datatype":"unknown","K":4096,"weight_datatype":"unknown","input_datatype":"unknown","layout_c":"RowMajor","layout_a":"ColumnMajor","datatype":"unknown","operation_type":"gemm"}}
...

Output truncated for brevity

@smedegaard smedegaard self-assigned this Jun 4, 2025
@smedegaard smedegaard added the enhancement New feature or request label Jun 4, 2025
@smedegaard smedegaard linked an issue Jun 4, 2025 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Add IO module for profiler