support of __float2bfloat162_rn #3425

Closed
jinz2014 opened this issue Mar 1, 2024 · 5 comments
Comments

@jinz2014

jinz2014 commented Mar 1, 2024

error: use of undeclared identifier '__float2bfloat162_rn'; did you mean '__float22bfloat162_rn'?
return __float2bfloat162_rn(a);

Thanks

@cjatin
Contributor

cjatin commented Mar 12, 2024

Can you share some more information? Where are you seeing this error?

@jinz2014
Author

I hope this link will eventually help you add all the missing functions related to conversions among float, half, and bfloat16.

https://github.com/ROCm/HIPIFY/blob/amd-staging/docs/tables/CUDA_Device_API_supported_by_HIP.md

@houseroad

CUDA provides __float2bfloat162_rn, but HIP doesn't have a corresponding definition, so the hipified code cannot compile.
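
Until the SDK provides the missing function, one workaround is a local shim built from the conversions HIP already ships. Below is a minimal sketch, assuming __hip_bfloat162 and __float2bfloat16 are available from <hip/hip_bf16.h>; the HAVE_FLOAT2BFLOAT162_RN guard is a hypothetical project-level macro, not a real SDK flag, so the shim can be dropped once the SDK defines the function itself:

#include <hip/hip_bf16.h>

#ifndef HAVE_FLOAT2BFLOAT162_RN  // hypothetical guard, not an SDK macro
// Broadcast one rounded float into both halves of a __hip_bfloat162,
// mirroring the CUDA semantics of __float2bfloat162_rn.
__host__ __device__ inline __hip_bfloat162 __float2bfloat162_rn(const float a) {
  return __hip_bfloat162{__float2bfloat16(a), __float2bfloat16(a)};
}
#endif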

@yiakwy-xpu-ml-framework-team

yiakwy-xpu-ml-framework-team commented Sep 6, 2024

error: use of undeclared identifier '__float2bfloat162_rn'; did you mean '__float22bfloat162_rn'?
return __float2bfloat162_rn(a);

// The following math functions are included in the ROCm 6.2 SDK:
// __hmul: (bfloat16, bfloat16) -> bfloat16, __hmul2: (bfloat162, bfloat162) -> bfloat162,
// __floats2bfloat162_rn: (float, float) -> __hip_bfloat162,
// __float22bfloat162_rn: float2 -> __hip_bfloat162,
// __float2bfloat162_rn: float -> __hip_bfloat162,
// __bfloat1622float2: __hip_bfloat162 -> float2

Here is a simple implementation of __floats2bfloat162_rn:

__inline__ __device__ __nv_bfloat162 __floats2bfloat162_rn(const float a, const float b) {
  __nv_bfloat162 val;
  val = __nv_bfloat162(__float2bfloat16_rn(a), __float2bfloat16_rn(b));
  return val;
}

It accepts two floats, converts each of them to bf16, and packs them into a 32-bit number.
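
As a brief usage sketch (the kernel and buffer names below are just illustrative), each thread converts a pair of adjacent floats and packs them into one 32-bit value. This is the CUDA-side spelling with __nv_bfloat162 from <cuda_bf16.h>; the ROCm 6.2 version is the same call with __hip_bfloat162 from <hip/hip_bf16.h>:

#include <cuda_bf16.h>

__global__ void pack_pairs(const float* in, __nv_bfloat162* out, int n_pairs) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n_pairs) {
    // Round both floats to bf16 and pack them into one __nv_bfloat162.
    out[i] = __floats2bfloat162_rn(in[2 * i], in[2 * i + 1]);
  }
}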

HIP also implements '__float2bfloat162_rn'; it accepts a single float:

__HOST_DEVICE__ inline __hip_bfloat162 __float2bfloat162_rn(const float a) {
  return __hip_bfloat162{__float2bfloat16(a), __float2bfloat16(a)};
}
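
A quick contrast of the two conversions (the values here are illustrative): __float2bfloat162_rn duplicates one rounded float into both halves, while __floats2bfloat162_rn packs two independently rounded floats:

#include <hip/hip_bf16.h>

__device__ void convert_examples(float x, float y) {
  __hip_bfloat162 same = __float2bfloat162_rn(x);      // {bf16(x), bf16(x)}
  __hip_bfloat162 pair = __floats2bfloat162_rn(x, y);  // {bf16(x), bf16(y)}
  (void)same;
  (void)pair;
}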

And here is the implementation of __float2bfloat16 that it calls:

/**
 * \ingroup HIP_INTRINSIC_BFLOAT16_CONV
 * \brief Converts float to bfloat16
 */
__HOST_DEVICE__ inline __hip_bfloat16 __float2bfloat16(float f) {
  __hip_bfloat16 ret;
  union {
    float fp32;
    unsigned int u32;
  } u = {f};
  if (~u.u32 & 0x7f800000) {
    // When the exponent bits are not all 1s, then the value is zero, normal,
    // or subnormal. We round the bfloat16 mantissa up by adding 0x7FFF, plus
    // 1 if the least significant bit of the bfloat16 mantissa is 1 (odd).
    // This causes the bfloat16's mantissa to be incremented by 1 if the 16
    // least significant bits of the float mantissa are greater than 0x8000,
    // or if they are equal to 0x8000 and the least significant bit of the
    // bfloat16 mantissa is 1 (odd). This causes it to be rounded to even when
    // the lower 16 bits are exactly 0x8000. If the bfloat16 mantissa already
    // has the value 0x7f, then incrementing it causes it to become 0x00 and
    // the exponent is incremented by one, which is the next higher FP value
    // to the unrounded bfloat16 value. When the bfloat16 value is subnormal
    // with an exponent of 0x00 and a mantissa of 0x7F, it may be rounded up
    // to a normal value with an exponent of 0x01 and a mantissa of 0x00.
    // When the bfloat16 value has an exponent of 0xFE and a mantissa of 0x7F,
    // incrementing it causes it to become an exponent of 0xFF and a mantissa
    // of 0x00, which is Inf, the next higher value to the unrounded value.
    u.u32 += 0x7fff + ((u.u32 >> 16) & 1);  // Round to nearest, round to even
  } else if (u.u32 & 0xffff) {
    // When all of the exponent bits are 1, the value is Inf or NaN.
    // Inf is indicated by a zero mantissa. NaN is indicated by any nonzero
    // mantissa bit. Quiet NaN is indicated by the most significant mantissa
    // bit being 1. Signaling NaN is indicated by the most significant
    // mantissa bit being 0 but some other bit(s) being 1. If any of the
    // lower 16 bits of the mantissa are 1, we set the least significant bit
    // of the bfloat16 mantissa, in order to preserve signaling NaN in case
    // the bloat16's mantissa bits are all 0.
    u.u32 |= 0x10000;  // Preserve signaling NaN
  }

  ret.data = (u.u32 >> 16);
  return ret;
}
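
To make the round-to-nearest-even behavior described in the comments above concrete, here is a small host-side check. The values and expected encodings are worked out by hand, so treat them as illustrative; it also relies on the .data member shown in the implementation above and on compiling with hipcc so the __HOST_DEVICE__ function is callable on the host:

#include <cstdio>
#include <hip/hip_bf16.h>

int main() {
  // 1 + 2^-8 encodes as 0x3F808000, exactly halfway between the bfloat16
  // encodings 0x3F80 and 0x3F81, so round-to-nearest-even keeps the even 0x3F80.
  __hip_bfloat16 a = __float2bfloat16(1.00390625f);
  // 1 + 3*2^-8 encodes as 0x3F818000, also a tie, but the odd candidate 0x3F81
  // is bumped up to the even 0x3F82.
  __hip_bfloat16 b = __float2bfloat16(1.01171875f);
  printf("0x%04x 0x%04x\n", (unsigned)a.data, (unsigned)b.data);  // expect 0x3f80 0x3f82
  return 0;
}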

@houseroad @jinz2014 please update your SDK to ROCm 6.2 using this script

@jinz2014
Author

jinz2014 commented Sep 6, 2024

Thanks for the answer.

@jinz2014 jinz2014 closed this as completed Sep 6, 2024