feat: add a page about compiling + parallel and GPU computing #252
Conversation
Amazing job! I don't think we need to remove or add much to it. I think it can be merged once the few comments I left are resolved.
# %%
_ = jit_f(data)

# %%
# %%timeit
jit_f(data)

# %% [markdown]
# Surprisingly, the JITted function was slower than the plain Python and NumPy
# implementations! Why did this happen? Numba does not provide valuable performance
# gains over pure Python or NumPy code for simple operations and small datasets.
# The JITted function turned out to be slower than the non-JIT implementation
# because of the compilation overhead. Let's try increasing the size of our
# data and performing a non-NumPy list comprehension on the data.
#
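For reference, the cells above assume jit_f and data were defined earlier in the lesson and are not part of this diff. A minimal sketch of what that setup might look like (the function body and data size here are assumptions, not the lesson's actual code):

# %%
import numpy as np
from numba import njit

@njit
def jit_f(arr):
    # Trivial element-wise work, so the JIT has little to optimise.
    total = 0.0
    for x in arr:
        total += x * x
    return total

data = np.random.rand(1_000)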
I'm confused: why do you run it first on line 94, and then again? Wouldn't the first one on line 94 generate the compiled code, and therefore the timed line (98) is just the running time?
Yes, I was just trying to time the running time and not the compile time. Should I not compile the function before timing it?
I think, in that case, it doesn't make any difference because, as you said, the optimisation is not going to be much better than what NumPy already does. We could leave it as it is and explain that measuring JIT performance is normally done that way. Though ideally, we should also measure the compilation to show that overhead.
So, I'd say, in all the jit examples, run it once with:
%timeit -n 1 -r 1
jit_function(data) # compilation and run
and then
%timeit
jit_function(data) # just run
This way we ensure that the compilation timing is measured separately and doesn't affect the average (though in these examples it is probably spread out). We need to note, however, that the result from the compilation run could be very noisy and could give a higher-than-real value (I think we mention that in the previous lesson when introducing timeit).
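In the lesson's jupytext format, that suggestion could look like the sketch below (jit_f and data being whatever the lesson already defines; the first cell must be the very first call so that it captures compilation):

# %%
# %%timeit -n 1 -r 1
jit_f(data)  # single run: includes the compilation overhead, so it can be noisy

# %%
# %%timeit
jit_f(data)  # repeated runs: steady-state execution time only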
Thank you! Working on this right now 🚀
# That does not work. The error might be solvable or it might just be out of Numba's
# scope. Numba does not distinguish between plain Python
It feels like we should give some insight into the error, but I don't know exactly where it is...
[Edit]: After playing with it, I think I've managed to do so by flattening the arrays (and taking only the real part in the comparison):
import numpy as np
from numba import njit

@njit
def mandel_numpy(position, limit=50):
    # Work on flattened copies so Numba can handle the boolean indexing below.
    value = position.flatten()
    diverged_at_count = np.zeros(position.shape).flatten()
    while limit > 0:
        limit -= 1
        value = value**2 + position.flatten()
        # Take only the real part for the comparison.
        diverging = (value * np.conj(value)).real > 4
        first_diverged_this_time = np.logical_and(diverging, diverged_at_count == 0)
        diverged_at_count[first_diverged_this_time] = limit
        # Clamp diverged values so they don't overflow on later iterations.
        value[diverging] = 2
    return diverged_at_count.reshape(position.shape)
ymatrix, xmatrix = np.mgrid[ymin:ymax:ystep, xmin:xmax:xstep]
values = xmatrix + 1j * ymatrix
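The grid bounds and step sizes (xmin, xmax, ymin, ymax, xstep, ystep) come from earlier in the lesson and are not shown in this diff; with assumed values such as the ones below, the flattened version runs end to end:

# %%
# Assumed bounds and steps purely for illustration; the lesson defines its own.
xmin, xmax, xstep = -2.0, 1.0, 0.005
ymin, ymax, ystep = -1.5, 1.5, 0.005
ymatrix, xmatrix = np.mgrid[ymin:ymax:ystep, xmin:xmax:xstep]
values = xmatrix + 1j * ymatrix
diverged_at = mandel_numpy(values)  # first call also triggers JIT compilation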
Thank you for looking into this! 😄
# %%
df
# The computation gave us a dask object and not the actual answer. Why is that?
# We can visualise the dask task graph using -
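If df here is a lazy Dask collection (e.g. a dask.dataframe built earlier in the lesson), the task graph and the concrete result can be obtained with Dask's standard calls; a minimal sketch:

# %%
df.visualize()  # render the task graph (needs the graphviz package)

# %%
df.compute()  # execute the graph and return the concrete result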
"using ..." - is that a stray "-"?
Ah I think I misplaced something here. Will fix it!
Co-authored-by: David Pérez-Suárez <[email protected]>
Awesome!!!!
Closes #239
I will need some help here to filter what content is underexplained and what is overexplained 🙂