hw08 GeLee #4

Open. Wants to merge 1 commit into base: main

@GeLee-Q GeLee-Q commented Feb 6, 2022

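Homework explanation

After fill_sin is changed to a grid-stride loop, how should the parameters inside the triple angle brackets be adjusted? (10 points)

Solution

  • With a grid-stride loop, the launch no longer has to cover n exactly, so gridDim can simply be set to 32.
  • Turn the relevant functions into template functions and pass lambda expressions as the functors.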
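The two points above can be sketched together as follows. This is a minimal illustration rather than the code in this PR: the names `parallel_for` and `fill_sin`, the block size of 128, and the use of `cudaMalloc` plus `cudaMemcpy` are all assumptions made for the example; only the grid size of 32 comes from the description. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Generic grid-stride kernel (hypothetical name): each thread starts at its
// global index and advances by the total number of threads in the grid.
template <class Func>
__global__ void parallel_for(int n, Func func) {
    for (int i = blockDim.x * blockIdx.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        func(i);
    }
}

// Hypothetical helper: fills n floats with sinf(i) on the GPU and copies them back.
std::vector<float> fill_sin(int n) {
    std::vector<float> host(n);
    if (n == 0) return host;
    float *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    // gridDim = 32 (from the description), blockDim = 128 (assumed).
    // The launch shape no longer depends on n.
    parallel_for<<<32, 128>>>(n, [dev] __device__ (int i) {
        dev[i] = sinf(static_cast<float>(i));
    });
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

Passing a `__device__` lambda to a kernel requires compiling with nvcc's `--extended-lambda` flag. Because the loop strides over the whole range, any n is covered, including values that are not a multiple of the block size.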

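The "leftover" approach here goes wrong when n is not an integer multiple of 1024. Why? Please fix it. (10 points)

Reason:

  • If n is not an integer multiple of 1024, the integer division n / 1024 rounds down, so the last few elements are never processed.

Fix:

  • Either round the number of blocks up (ceiling division) and guard against out-of-range indices in the kernel,
  • or use a grid-stride loop.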
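The round-up fix can be sketched like this. It is an illustrative example under assumptions, not the homework's actual kernel: `blocks_for`, `fill_index`, and `fill_index_host` are hypothetical names, and the kernel simply writes each index so that coverage is easy to check.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlockSize = 1024;

// Number of blocks needed to cover n elements: ceil(n / kBlockSize)
// computed in integer arithmetic.
constexpr int blocks_for(int n) {
    return (n + kBlockSize - 1) / kBlockSize;
}

__global__ void fill_index(int *arr, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= n) return;  // surplus threads in the last block do nothing
    arr[i] = i;
}

// Hypothetical host wrapper: runs the kernel and returns the result.
std::vector<int> fill_index_host(int n) {
    std::vector<int> host(n, -1);
    if (n == 0) return host;  // a launch with zero blocks is invalid
    int *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));
    fill_index<<<blocks_for(n), kBlockSize>>>(dev, n);
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

With n = 1025, plain n / 1024 launches a single block and leaves element 1024 untouched, while `blocks_for(1025)` launches two blocks. The bounds check is what makes the extra threads in the second block safe.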

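Which step was left out before the CPU accesses the data? Please add it. (10 points)

Call cudaDeviceSynchronize(). Kernel launches are asynchronous, so this call makes the CPU wait and returns only after the GPU has finished every task in its queue.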
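A minimal sketch of where the call belongs, assuming the data lives in unified memory allocated with `cudaMallocManaged`. The kernel `write_answer` and the helper `read_after_sync` are hypothetical names used only for this illustration.

```cuda
#include <cuda_runtime.h>

__global__ void write_answer(int *out) {
    *out = 42;
}

// Hypothetical helper: launches a kernel, waits for it, then reads on the CPU.
int read_after_sync() {
    int *value = nullptr;
    cudaMallocManaged(&value, sizeof(int));
    write_answer<<<1, 1>>>(value);  // returns immediately, kernel runs asynchronously
    cudaDeviceSynchronize();        // block until the GPU has finished
    int result = *value;            // now safe to read from the host
    cudaFree(value);
    return result;
}
```

Without the synchronize call, the host read may happen before the kernel has written the value.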

Difficulties encountered

  • MSVC bug

CudaAllocator.h fails to compile with the following error:

error: no suitable user-defined conversion from "std::_Rebind_alloc_t<CudaAllocator<int>, int>" to "std::_Rebind_alloc_t<std::_Rebind_alloc_t<CudaAllocator<int>, int>

Relevant documentation:

https://docs.microsoft.com/zh-cn/cpp/standard-library/allocators?view=msvc-170

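Adding the following members to the allocator in the header file fixes it.

    // Default constructor (the converting constructor below suppresses the implicit one).
    CudaAllocator() noexcept {}
    // Converting constructor: allows construction from a rebound CudaAllocator<U>.
    template<class U> CudaAllocator(const CudaAllocator<U>&) noexcept {}
    // The allocator is stateless, so any two instances compare equal.
    template<class U> bool operator==(const CudaAllocator<U>&) const noexcept
    {
        return true;
    }
    template<class U> bool operator!=(const CudaAllocator<U>&) const noexcept
    {
        return false;
    }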
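For context, here is a sketch of how these members might sit inside a complete allocator. The surrounding class is an assumption, not the course's actual CudaAllocator.h: it is assumed here that the allocator hands out unified memory via `cudaMallocManaged`, and `sum_managed` is a hypothetical helper that only exercises it through `std::vector`.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <new>
#include <vector>

template <class T>
struct CudaAllocator {
    using value_type = T;

    CudaAllocator() noexcept {}
    // Converting constructor used when the library rebinds the allocator.
    template <class U>
    CudaAllocator(const CudaAllocator<U> &) noexcept {}

    T *allocate(std::size_t n) {
        void *ptr = nullptr;
        // Assumed backing store: unified memory visible to CPU and GPU.
        if (cudaMallocManaged(&ptr, n * sizeof(T)) != cudaSuccess)
            throw std::bad_alloc();
        return static_cast<T *>(ptr);
    }

    void deallocate(T *ptr, std::size_t) noexcept {
        cudaFree(ptr);
    }

    // Stateless allocator: all instances are interchangeable.
    template <class U>
    bool operator==(const CudaAllocator<U> &) const noexcept { return true; }
    template <class U>
    bool operator!=(const CudaAllocator<U> &) const noexcept { return false; }
};

// Hypothetical helper: builds a vector on managed memory and sums it on the host.
int sum_managed(int n) {
    std::vector<int, CudaAllocator<int>> v(n, 1);
    int total = 0;
    for (int x : v) total += x;
    return total;
}
```

The converting constructor and the comparison operators are part of the standard allocator requirements, which is why a standard library that rebinds the allocator internally needs them.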
