[Doc][Polish] avoid memory leak and explain some points #54

Merged
merged 1 commit into from
Nov 26, 2024

muyuuuu
Contributor

@muyuuuu muyuuuu commented Nov 25, 2024

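  1. Avoid a memory leak.
  2. The "one I/O, multiple computations" part reflects my own understanding. I also used this idea when optimizing OpenCL operators, so I consider it a general-purpose approach.
  3. A segmentation fault occurs when the matrix size is not a multiple of 32. The optimization I have seen is to first take the image region whose size is divisible by 32 and accelerate it with CUDA, then handle the boundary part in C. I am not sure how Baidu optimizes this, so I did not write up a solution.
  4. An explanation of TM. It took me a long time to understand how TM is used, so I took the liberty of adding an explanation.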
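
The split described in point 3 could be sketched roughly as follows. This is a CPU-only illustration in C++, not the code from this PR or from the documentation: `sgemm_aligned_region` is a hypothetical stand-in for the tiled CUDA kernel, and `sgemm_edges`, `sgemm_split`, and `kTile` are names invented for this example. It assumes row-major storage and only splits the output dimensions M and N; a real kernel that also tiles along K would need the same care for K.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kTile = 32;  // tile width the fast path assumes

// Plain dot product for one output element C[row][col] (row-major layout).
static float dot_row_col(const std::vector<float>& A, const std::vector<float>& B,
                         std::size_t N, std::size_t K,
                         std::size_t row, std::size_t col) {
    float acc = 0.0f;
    for (std::size_t k = 0; k < K; ++k) {
        acc += A[row * K + k] * B[k * N + col];
    }
    return acc;
}

// Hypothetical stand-in for the tiled GPU kernel: it only ever touches whole
// 32 x 32 output tiles, so Mt and Nt must be multiples of kTile.
void sgemm_aligned_region(const std::vector<float>& A, const std::vector<float>& B,
                          std::vector<float>& C, std::size_t N, std::size_t K,
                          std::size_t Mt, std::size_t Nt) {
    assert(Mt % kTile == 0 && Nt % kTile == 0);
    for (std::size_t tr = 0; tr < Mt; tr += kTile) {
        for (std::size_t tc = 0; tc < Nt; tc += kTile) {
            for (std::size_t r = tr; r < tr + kTile; ++r) {
                for (std::size_t c = tc; c < tc + kTile; ++c) {
                    C[r * N + c] = dot_row_col(A, B, N, K, r, c);
                }
            }
        }
    }
}

// Scalar fallback for the leftover right-hand strip and bottom rows.
void sgemm_edges(const std::vector<float>& A, const std::vector<float>& B,
                 std::vector<float>& C, std::size_t M, std::size_t N,
                 std::size_t K, std::size_t Mt, std::size_t Nt) {
    for (std::size_t r = 0; r < M; ++r) {
        // Rows inside the aligned region only need the columns past Nt.
        const std::size_t c_begin = (r < Mt) ? Nt : 0;
        for (std::size_t c = c_begin; c < N; ++c) {
            C[r * N + c] = dot_row_col(A, B, N, K, r, c);
        }
    }
}

// C (M x N) = A (M x K) * B (K x N), for any M and N.
void sgemm_split(const std::vector<float>& A, const std::vector<float>& B,
                 std::vector<float>& C, std::size_t M, std::size_t N,
                 std::size_t K) {
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    const std::size_t Mt = M / kTile * kTile;  // largest multiple of 32 <= M
    const std::size_t Nt = N / kTile * kTile;  // largest multiple of 32 <= N
    sgemm_aligned_region(A, B, C, N, K, Mt, Nt);
    sgemm_edges(A, B, C, M, N, K, Mt, Nt);
}
```

Calling `sgemm_split(A, B, C, 33, 35, 7)` would send the top-left 32 x 32 block through the tile-only path and the remaining one row and three columns through the scalar loop. The trade-off is an extra code path to maintain in exchange for keeping bounds checks out of the fast kernel.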

Collaborator

@AndSonder AndSonder left a comment

LGTM

@AndSonder
Collaborator

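Thanks for the additions to the documentation!

The issue that "a segmentation fault occurs when the matrix size is not a multiple of 32" actually comes up in many SGEMM optimization algorithms. The articles in the docs are mainly meant for learning these optimization techniques, and covering a lot of boundary cases would make the code very complicated.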

@AndSonder AndSonder merged commit 2d37c86 into PaddleJitLab:develop Nov 26, 2024
2 checks passed