Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sycl ext intel kernel queries extension #16834

Open
wants to merge 12 commits into
base: sycl
Choose a base branch
from
4 changes: 3 additions & 1 deletion llvm/include/llvm/SYCLLowerIR/DeviceConfigFile.td
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,7 @@ def AspectExt_oneapi_cuda_cluster_group : Aspect<"ext_oneapi_cuda_cluster_group"
def AspectExt_intel_fpga_task_sequence : Aspect<"ext_intel_fpga_task_sequence">;
def AspectExt_oneapi_atomic16 : Aspect<"ext_oneapi_atomic16">;
def AspectExt_oneapi_virtual_functions : Aspect<"ext_oneapi_virtual_functions">;
def AspectExt_intel_spill_mem_size : Aspect<"ext_intel_spill_mem_size">;
kurapov-peter marked this conversation as resolved.
Show resolved Hide resolved
// Deprecated aspects
def AspectInt64_base_atomics : Aspect<"int64_base_atomics">;
def AspectInt64_extended_atomics : Aspect<"int64_extended_atomics">;
Expand Down Expand Up @@ -150,7 +151,8 @@ def : TargetInfo<"__TestAspectList",
AspectExt_oneapi_queue_profiling_tag, AspectExt_oneapi_virtual_mem, AspectExt_oneapi_cuda_cluster_group,
AspectExt_intel_fpga_task_sequence,
AspectExt_oneapi_atomic16,
AspectExt_oneapi_virtual_functions],
AspectExt_oneapi_virtual_functions,
AspectExt_intel_spill_mem_size],
[]>;
// This definition serves the only purpose of testing whether the deprecated aspect list defined in here and in SYCL RT
// match.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,7 @@ attributes #2 = { nounwind }
!77 = !{!"ext_oneapi_bindless_images_sample_2d_usm", i32 79}
!78 = !{!"ext_oneapi_atomic16", i32 80}
!79 = !{!"ext_oneapi_virtual_functions", i32 81}
!79 = !{!"ext_intel_spill_mem_size", i32 82}
!80 = !{!"host", i32 0}
!81 = !{!"int64_base_atomics", i32 7}
!82 = !{!"int64_extended_atomics", i32 8}
Expand Down
2 changes: 1 addition & 1 deletion sycl/cmake/modules/FetchUnifiedRuntime.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -116,7 +116,7 @@ if(SYCL_UR_USE_FETCH_CONTENT)
CACHE PATH "Path to external '${name}' adapter source dir" FORCE)
endfunction()

set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
set(UNIFIED_RUNTIME_REPO "https://github.com/kurapov-peter/unified-runtime.git")
kurapov-peter marked this conversation as resolved.
Show resolved Hide resolved
include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules/UnifiedRuntimeTag.cmake)

set(UMF_BUILD_EXAMPLES OFF CACHE INTERNAL "EXAMPLES")
Expand Down
2 changes: 1 addition & 1 deletion sycl/cmake/modules/UnifiedRuntimeTag.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,4 @@
# Date: Mon Jan 27 10:40:02 2025 +0000
# Merge pull request #2551 from przemektmalon/przemek/bindless-images-host-usm
# Enable creation of bindless images backed by host USM
set(UNIFIED_RUNTIME_TAG 0bb6789f0113ea937d861fd67fd677b91ecdeb8b)
set(UNIFIED_RUNTIME_TAG e6b61c6768f9e12b46ed921a4bdf3283de53772f)
114 changes: 114 additions & 0 deletions sycl/doc/extensions/proposed/sycl_ext_intel_kernel_queries.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
= sycl_ext_intel_kernel_queries

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]
:endnote: &#8212;{nbsp}end{nbsp}note

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) 2025 Intel Corporation. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 9 specification. All
references below to the "core SYCL specification" or to section numbers in the
SYCL specification refer to that revision.


== Status

This is a proposed extension specification, intended to gather community
feedback. Interfaces defined in this specification may not be implemented yet
or may be in a preliminary state. The specification itself may also change in
incompatible ways before it is finalized. *Shipping software products should
not rely on APIs defined in this specification.*


== Backend support status

The APIs in this extension may be used only on a device that has
`aspect::ext_intel_kernel_queries`. The application must check that the device
has this aspect before using any of the APIs in this extension.
If the application fails to do this, the implementation throws a synchronous
exception with the `errc::feature_not_supported` error code when the info is
queried.


== Overview

Some low-level information about kernels help applications to optimize and
control the backend compilation. For example, an application may query the size
of the memory allocated for register spilling by the GPU backend compiler and
recompile a kernel with a different set of compilation flags to avoid associated
overheads.
This extension provides a way to access those details about compiled kernels for
Intel GPUs. It is useful for developers tuning for those devices.
kurapov-peter marked this conversation as resolved.
Show resolved Hide resolved


== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro `SYCL_EXT_INTEL_KERNEL_QUERIES` to one of the values defined in the table
below. Applications can test for the existence of this macro to determine if
the implementation supports this feature, or applications can test the macro's
value to determine which of the extension's features the implementation
supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|Initial version of this extension.
|===

=== Usage

The information about a kernel can be queried through existing APIs:

```c++
namespace extinfo = ext::intel::info::kernel;
auto kb = get_kernel_bundle<KernelName, bundle_state::executable>(ctx);
auto ret = kb.get_kernel(kernelID).get_info<extinfo::Param>(dev);
```
kurapov-peter marked this conversation as resolved.
Show resolved Hide resolved

_Constraints:_ Available only if `Param` is a valid information descriptor
introduced by this extension.

_Remarks:_ In addition to the default constraints and preconditions, exception
is thrown if a device does not have an appropriate aspect.

== Issues

None.
9 changes: 9 additions & 0 deletions sycl/include/sycl/detail/info_desc_helpers.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,15 @@ struct IsKernelInfo<info::kernel_device_specific::ext_codeplay_num_regs>
#include <sycl/info/kernel_device_specific_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

#define __SYCL_PARAM_TRAITS_SPEC(Namespace, DescType, Desc, ReturnT, UrCode) \
template <> \
struct is_##DescType##_info_desc<Namespace::info::DescType::Desc> \
: std::true_type { \
using return_type = Namespace::info::DescType::Desc::return_type; \
};
#include <sycl/info/ext_intel_kernel_info_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

#define __SYCL_PARAM_TRAITS_SPEC(DescType, Desc, ReturnT, UrCode) \
template <> \
struct is_##DescType##_info_desc<info::DescType::Desc> : std::true_type { \
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/info/aspects.def
Original file line number Diff line number Diff line change
Expand Up @@ -72,3 +72,4 @@ __SYCL_ASPECT(ext_oneapi_bindless_images_sample_1d_usm, 78)
__SYCL_ASPECT(ext_oneapi_bindless_images_sample_2d_usm, 79)
__SYCL_ASPECT(ext_oneapi_atomic16, 80)
__SYCL_ASPECT(ext_oneapi_virtual_functions, 81)
__SYCL_ASPECT(ext_intel_spill_mem_size, 82)
1 change: 1 addition & 0 deletions sycl/include/sycl/info/ext_intel_kernel_info_traits.def
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
__SYCL_PARAM_TRAITS_SPEC(ext::intel, kernel, spill_mem_size, uint32_t, UR_KERNEL_INFO_SPILL_MEM_SIZE)
1 change: 1 addition & 0 deletions sycl/include/sycl/info/info_desc.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -251,6 +251,7 @@ struct work_item_progress_capabilities;
} // namespace ext::oneapi::experimental::info::device
#include <sycl/info/ext_codeplay_device_traits.def>
#include <sycl/info/ext_intel_device_traits.def>
#include <sycl/info/ext_intel_kernel_info_traits.def>
#include <sycl/info/ext_oneapi_device_traits.def>
#include <sycl/info/ext_oneapi_kernel_queue_specific_traits.def>

Expand Down
5 changes: 5 additions & 0 deletions sycl/source/detail/device_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -785,6 +785,11 @@ bool device_impl::has(aspect Aspect) const {
BE == sycl::backend::opencl;
return (is_cpu() || is_gpu()) && isCompatibleBE;
}
case aspect::ext_intel_spill_mem_size: {
backend BE = getBackend();
bool isCompatibleBE = BE == sycl::backend::ext_oneapi_level_zero;
return is_gpu() && isCompatibleBE;
}
}

return false; // This device aspect has not been implemented yet.
Expand Down
15 changes: 15 additions & 0 deletions sycl/source/detail/kernel_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -383,6 +383,21 @@ inline typename syclex::info::kernel_queue_specific::max_num_work_groups::
DynamicLocalMemorySize);
}

template <>
inline typename ext::intel::info::kernel::spill_mem_size::return_type
kernel_impl::get_info<ext::intel::info::kernel::spill_mem_size>(
const device &Device) const {
if (!Device.has(aspect::ext_intel_spill_mem_size))
throw exception(
make_error_code(errc::feature_not_supported),
"This device does not have the ext_intel_spill_mem_size aspect");

return get_kernel_device_specific_info<
ext::intel::info::kernel::spill_mem_size>(
this->getHandleRef(), getSyclObjImpl(Device)->getHandleRef(),
getAdapter());
}

} // namespace detail
} // namespace _V1
} // namespace sycl
46 changes: 46 additions & 0 deletions sycl/source/detail/kernel_info.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,52 @@ uint32_t get_kernel_device_specific_info_with_input(ur_kernel_handle_t Kernel,

return Result;
}

template <>
inline ext::intel::info::kernel::spill_mem_size::return_type
get_kernel_device_specific_info<ext::intel::info::kernel::spill_mem_size>(
ur_kernel_handle_t Kernel, ur_device_handle_t Device,
const AdapterPtr &Adapter) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to check for the aspect here as well?

size_t ResultSize = 0;
ur_kernel_info_t PropName = UR_KERNEL_INFO_SPILL_MEM_SIZE;

// First call to get the number of device images
Adapter->call<UrApiKind::urKernelGetInfo>(Kernel, PropName, 0, nullptr,
&ResultSize);

size_t DeviceCount = ResultSize / sizeof(uint32_t);

// Second call to retrieve the data
std::vector<uint32_t> Result(DeviceCount);
Adapter->call<UrApiKind::urKernelGetInfo>(Kernel, PropName, ResultSize,
Result.data(), nullptr);

ur_program_handle_t Program;
Adapter->call<UrApiKind::urKernelGetInfo>(Kernel, UR_KERNEL_INFO_PROGRAM,
sizeof(ur_program_handle_t),
&Program, nullptr);
// Retrieve the associated device list
size_t URDevicesSize = 0;
Adapter->call<UrApiKind::urProgramGetInfo>(Program, UR_PROGRAM_INFO_DEVICES,
0, nullptr, &URDevicesSize);

std::vector<ur_device_handle_t> URDevices(URDevicesSize /
sizeof(ur_device_handle_t));
Adapter->call<UrApiKind::urProgramGetInfo>(Program, UR_PROGRAM_INFO_DEVICES,
URDevicesSize, URDevices.data(),
nullptr);
assert(Result.size() == URDevices.size());

// Map the result back to the program devices
for (size_t idx = 0; idx < URDevices.size(); ++idx) {
if (URDevices[idx] == Device)
return Result[idx];
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping to devices relies on UR providing the same order within a single application (see oneapi-src/unified-runtime#2614 for details). There was a problem inside UR implementation that demanded the API to return an array of values for spills instead of a single value (due to device pointer unavailability). This implementation hides that inconvenience from SYCL users.

throw exception(make_error_code(errc::runtime),
"ext::intel::info::kernel::spill_mem_size failed to retrieve "
"the requested value");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right type of exception?

}

} // namespace detail
} // namespace _V1
} // namespace sycl
9 changes: 9 additions & 0 deletions sycl/source/detail/ur_info_code.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,15 @@ template <typename T> struct UrInfoCode;
#include <sycl/info/ext_oneapi_device_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

#define __SYCL_PARAM_TRAITS_SPEC(Namespace, DescType, Desc, ReturnT, UrCode) \
template <> struct UrInfoCode<Namespace::info::DescType::Desc> { \
static constexpr ur_kernel_info_t value = \
static_cast<ur_kernel_info_t>(UrCode); \
};

#include <sycl/info/ext_intel_kernel_info_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

} // namespace detail
} // namespace _V1
} // namespace sycl
3 changes: 3 additions & 0 deletions sycl/test-e2e/Basic/aspects.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,9 @@ int main() {
if (plt.has(aspect::ext_oneapi_virtual_functions)) {
std::cout << " ext_oneapi_virtual_functions" << std::endl;
}
if (plt.has(aspect::ext_intel_spill_mem_size)) {
std::cout << " ext_intel_spill_mem_size" << std::endl;
}
}
std::cout << "Passed." << std::endl;
return 0;
Expand Down
9 changes: 9 additions & 0 deletions sycl/test-e2e/Basic/kernel_info.cpp
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a better way to test the thing. This would just return zero. Any suggestions?

Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,9 @@ int main() {
const cl_uint compileNumSg =
krn.get_info<info::kernel_device_specific::compile_num_sub_groups>(dev);
assert(compileNumSg <= maxNumSg);
const spillMemSz =
krn.get_info<ext::intel::info::kernel::spill_mem_size>(dev);
assert(spillMemSz >= 0);

// Use ext_oneapi_get_kernel_info extension and check that answers match.
const size_t wgSizeExt = syclex::get_kernel_info<
Expand All @@ -125,6 +128,12 @@ int main() {
dev);
assert(compileNumSgExt == compileNumSg);

const uint32_t spillMemSizeExt =
syclex::get_kernel_info<SingleTask,
ext::intel::info::kernel::spill_mem_size>(ctx,
dev);
assert(spillMemSizeExt == spillMemSz);

// Use ext_oneapi_get_kernel_info extension with queue parameter and check the
// result.
const size_t wgSizeExtQ =
Expand Down