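To make the speedups easy to compare, every `for` loop runs inside a task arena that limits the thread count to 4.

Each function contains only one `for` loop, so the default `auto_partitioner` is used to distribute the work automatically.

Except for `saxpy` and `scanner`, every function gets close to a 4x speedup, and `magicfilter` even reaches 8x.

`saxpy` and `scanner` both read from and write to arrays at the same time. In `scanner`, each thread has to finish computing its own `local_ret` before the next accumulation step can start, so threads also spend time waiting on one another. This part can probably still be optimized.

The serial `sqrdot` result is 5165.4, which suffers from floating-point error. The parallel version largely avoids that error, presumably because each thread accumulates a smaller partial sum, and gives 5792.62.

On going the extra mile (内卷): TBB is already powerful enough, so no more grinding for me. I still have earlier lessons to catch up on 😂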
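To illustrate why the serial and parallel `sqrdot` results can differ, here is a small sketch in plain C++ without TBB. It emulates a chunked reduction sequentially: one `float` partial sum per chunk, combined at the end. The function names and the `chunks` parameter are hypothetical, the way TBB actually splits the range is decided at run time, and the input used in the note below is made up for illustration, so it does not reproduce the 5165.4 and 5792.62 figures.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Serial sum of squares with a single float accumulator.
// Once the accumulator is large, small terms can be rounded away.
float sqrdot_serial(const std::vector<float>& x) {
    float sum = 0.0f;
    for (float v : x) {
        sum += v * v;
    }
    return sum;
}

// Sequential emulation of a chunked reduction. `chunks` stands in for the
// number of worker threads (an assumption made for this sketch).
float sqrdot_chunked(const std::vector<float>& x, std::size_t chunks) {
    if (chunks == 0) {
        chunks = 1;
    }
    const std::size_t n = x.size();
    const std::size_t step = (n + chunks - 1) / chunks;  // ceil(n / chunks)
    float total = 0.0f;
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t begin = std::min(n, c * step);
        const std::size_t end = std::min(n, begin + step);
        float partial = 0.0f;  // stays small, so fewer low-order bits are lost
        for (std::size_t i = begin; i < end; ++i) {
            partial += x[i] * x[i];
        }
        total += partial;
    }
    return total;
}
```

With the input `{4096, 1, 1, 1, 1, 1, 1, 1}` the exact sum of squares is 16777223. Under IEEE-754 single precision, `sqrdot_serial` returns 16777216 because every `+ 1` is rounded away once the accumulator reaches 2^24, while `sqrdot_chunked` with 4 chunks returns 16777222. Chunking reduces the rounding error but does not remove it entirely.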