The intrinsic code does not reflect the algorithm's original design; is there a plan to upgrade the intrinsic code and set vset to ta, mu? #5448

Open
ArthurLiu-jbl opened this issue May 7, 2024 · 0 comments

error log

Test_relu : test result: -0.067289 exception: 0.667084

ReLU_riscv::forward_inplace()
#if __riscv_vector

        int n = size;
        while (n > 0)
        {
            size_t vl = vsetvl_e32m8(n);

            vfloat32m8_t _p = vle32_v_f32m8(ptr, vl);
            vbool4_t _b = vmflt_vf_f32m8_b4(_p, .0f, vl);
            _p = vfmul_vf_f32m8_m(_b, _p, slope, vl); //slope: float(float32_t)
            vse32_v_f32m8(ptr, _p, vl);

            ptr += vl;
            n -= vl;
        }

#else // __riscv_vector
        for (int i = 0; i < size; i++)
        {
            if (*ptr < 0)
                *ptr *= slope;
            ptr++;
        }
#endif // __riscv_vector

context

clang 17
Assembly code:
vsetvli a3, a3, e32, m8, ta, ma
vle32.v v8, (a1)
vle32.v v16, (a0)
vmflt.vf v0, v8, fa5
vfmul.vv v8, v8, v16, v0.t
vse32.v v8, (a1)

how to reproduce

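  1. Build the RVV version of ncnn with clang 17.
  2. The intrinsic code does not explicitly select the mu policy, yet the values at mask=1 are still needed afterwards. Currently only the values at mask=0 are handled, while the values at mask=1 need to keep the values of the original matrix. So the intrinsic code needs to be upgraded, or replaced with inline assembly.
  3. The test environment is spike, with spike's vset instruction modified so that under ta, ma the data at mask=1 is set to 0 and only the values at mask=0 are processed.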
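
One possible shape of the upgrade mentioned in step 2, as a sketch only: the newer RVV intrinsics API has policy-suffixed variants, and `__riscv_vfmul_vf_f32m8_mu` takes an explicit pass-through operand for the inactive elements, so the compiler has to emit a mask-undisturbed `vsetvli`. The function name `leaky_relu_inplace_sketch` and the macro `SKETCH_HAS_RVV_POLICY` are hypothetical and not part of ncnn. The version check `__riscv_v_intrinsic >= 12000` is a conservative assumption; on toolchains that report an older version, or on non-RISC-V targets, the scalar fallback is compiled instead.

```cpp
#include <cstddef>

#if defined(__riscv_vector) && defined(__riscv_v_intrinsic) && __riscv_v_intrinsic >= 12000
#include <riscv_vector.h>
#define SKETCH_HAS_RVV_POLICY 1
#else
#define SKETCH_HAS_RVV_POLICY 0
#endif

// Hypothetical helper, not ncnn's actual implementation.
void leaky_relu_inplace_sketch(float* ptr, int size, float slope)
{
#if SKETCH_HAS_RVV_POLICY
    int n = size;
    while (n > 0)
    {
        size_t vl = __riscv_vsetvl_e32m8(n);

        vfloat32m8_t _p = __riscv_vle32_v_f32m8(ptr, vl);
        // Mask bit is 1 where the input is negative.
        vbool4_t _b = __riscv_vmflt_vf_f32m8_b4(_p, 0.f, vl);
        // _mu variant: elements whose mask bit is 0 take the value of the
        // second argument (_p), so non-negative inputs are preserved.
        _p = __riscv_vfmul_vf_f32m8_mu(_b, _p, _p, slope, vl);
        __riscv_vse32_v_f32m8(ptr, _p, vl);

        ptr += vl;
        n -= vl;
    }
#else
    // Scalar fallback with the same intended result.
    for (int i = 0; i < size; i++)
    {
        if (ptr[i] < 0)
            ptr[i] *= slope;
    }
#endif
}
```

Whether clang 17 generates `ta, mu` for this exact call has not been checked here; the generated `vsetvli` should be inspected in the disassembly before relying on it.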

more
