speed vs numpy #522

Open
CuriousCat-7 opened this issue Aug 3, 2021 · 1 comment


CuriousCat-7 commented Aug 3, 2021

import nimpy
import times
import arraymancer

var 
  tic, toc: float

# for math
let np = pyImport("numpy")
tic = cpuTime()
for i in 0..<200:
  discard np.sqrt(np.cos(np.sin(np.linspace(0, 10, 1000))))
toc = cpuTime()
echo "np time: ", toc - tic

tic = cpuTime()
for i in 0..<200:
  discard sqrt(cos(sin(arraymancer.linspace(0, 10, 1000))))
toc = cpuTime()
echo "arraymancer time: ", toc - tic

Shell and output:

nim c -r npy
Hint: used config file '/home/neo/.choosenim/toolchains/nim-1.4.8/config/nim.cfg' [Conf]
Hint: used config file '/home/neo/.choosenim/toolchains/nim-1.4.8/config/config.nims' [Conf]
.................................................................................................................................................................................................................................CC: read
CC: write
CC: stdlib_times.nim
CC: stdlib_random.nim
CC: ../../../.nimble/pkgs/arraymancer-0.7.5/arraymancer/tensor/ufunc.nim
CC: npy.nim

Hint:  [Link]
Hint: 135625 lines; 1.764s; 180.102MiB peakmem; Debug build; proj: /home/neo/work/nim-projects/nim-learn/npy; out: /home/neo/work/nim-projects/nim-learn/npy [SuccessX]
Hint: /home/neo/work/nim-projects/nim-learn/npy  [Exec]
np time: 0.012997972
arraymancer time: 0.03080387

If I compile in release mode:

nim c -r -d:release npy

I get these timings:

np time: 0.01219863
arraymancer time: 0.007503163999999993

Could I improve the speed further?

Owner

mratsim commented Aug 4, 2021

You can fuse sqrt, cos and sin into a single pass over the data:

import nimpy
import times
import arraymancer

var
  tic, toc: float

# for math
let np = pyImport("numpy")
tic = epochTime()
for i in 0..<200:
  discard np.sqrt(np.cos(np.sin(np.linspace(0, 10, 1000))))
toc = epochTime()
echo "np time: ", toc - tic

tic = epochTime()
for i in 0..<200:
  discard sqrt(cos(sin(arraymancer.linspace(0, 10, 1000))))
toc = epochTime()
echo "arraymancer time: ", toc - tic

tic = epochTime()
for i in 0..<200:
  var t = arraymancer.linspace(0, 10, 1000)
  t.apply_inline():
    x.sin().cos().sqrt()
toc = epochTime()
echo "arraymancer fused time: ", toc - tic

$ nim c -d:danger --hints:off --warnings:off -r --outdir:build build/speedtest.nim
np time: 0.009390830993652344
arraymancer time: 0.005604982376098633
arraymancer fused time: 0.004479646682739258
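
To make the idea of fusing concrete, here is a minimal sketch on plain `seq[float]` using only the standard library. The helpers `linspaceSeq`, `unfused` and `fusedInPlace` are hypothetical names made up for illustration; this is not how Arraymancer implements `apply_inline`, only the general shape of the transformation. The unfused version walks the data three times and allocates two temporaries, while the fused version applies all three functions to each element in one pass.

```nim
import std/math

# Hypothetical helpers on plain seq[float]; not Arraymancer's API.

proc linspaceSeq(start, stop: float, n: int): seq[float] =
  ## n evenly spaced points from start to stop inclusive (assumes n >= 2).
  result = newSeq[float](n)
  let step = (stop - start) / float(n - 1)
  for i in 0 ..< n:
    result[i] = start + step * float(i)

proc unfused(xs: seq[float]): seq[float] =
  ## One pass per function: two temporaries plus the result.
  var a = newSeq[float](xs.len)
  for i in 0 ..< xs.len:
    a[i] = sin(xs[i])
  var b = newSeq[float](xs.len)
  for i in 0 ..< xs.len:
    b[i] = cos(a[i])
  result = newSeq[float](xs.len)
  for i in 0 ..< xs.len:
    result[i] = sqrt(b[i])

proc fusedInPlace(xs: var seq[float]) =
  ## Single pass: every element is read once and overwritten once.
  for x in xs.mitems:
    x = sqrt(cos(sin(x)))
```

Both versions compute the same values; the fused one simply touches memory less, which is where the gain in the timings above would be expected to come from.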

Depending on how many cores you have, compiling with -d:openmp might also speed things up. Unfortunately I have 36 cores, and OpenMP doesn't handle contention well with the unfused code (there is not enough work per item):

$  nim c -d:openmp --hints:off --warnings:off -d:danger -r --outdir:build build/speedtest.nim 
np time: 0.009420156478881836
arraymancer time: 0.04207587242126465
arraymancer fused time: 0.005712270736694336

Note: when benchmarking, CPU time (cpuTime) can give misleading figures for parallel code that runs on multiple CPUs, so the code above measures with epochTime instead.
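
The gap between the two clocks can be checked directly. The following is a minimal sketch that assumes only Nim's standard library: `cpuTime` from `std/times` and `getMonoTime` from `std/monotimes`. The `timeBoth` template is a hypothetical helper written for this example, not part of Arraymancer. For single-threaded work the two figures should be close; with multithreaded code the CPU figure is typically accumulated across threads and can exceed the elapsed wall time.

```nim
import std/[times, monotimes]

# Hypothetical helper: reports wall-clock and CPU time for a block of code.
template timeBoth(label: string, body: untyped) =
  let cpu0 = cpuTime()        # processor time used by this process
  let wall0 = getMonoTime()   # monotonic wall clock
  body
  let wallSecs = float((getMonoTime() - wall0).inNanoseconds) / 1e9
  let cpuSecs = cpuTime() - cpu0
  echo label, ": wall ", wallSecs, " s, cpu ", cpuSecs, " s"

# Single-threaded workload, so both figures are expected to be similar.
var acc = 0.0
timeBoth("sum"):
  for i in 0 ..< 1_000_000:
    acc += float(i)
```

A monotonic clock is used here for the wall time because it is not affected by system clock adjustments; `epochTime` as used in the thread is also a wall-clock measurement.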
