Commit 6edb6da

committed
final suggestions from the review
1 parent 23037f4 commit 6edb6da

File tree

1 file changed: +52 -12 lines changed


ch08performance/030misc-libs.ipynb.py

Lines changed: 52 additions & 12 deletions
@@ -97,18 +97,21 @@ def jit_f(x):
 f(data)
 
 # %%
-_ = jit_f(data)
+# %%timeit -n 1 -r 1
+_ = jit_f(data)  # compilation and run
 
 # %%
 # %%timeit
-jit_f(data)
+jit_f(data)  # just run
 
 # %% [markdown]
 # Surprisingly, the JITted function was slower than plain Python and NumPy
 # implementation! Why did this happen? Numba does not provide valuable performance
 # gains over pure Python or NumPy code for simple operations and small dataset.
 # The JITted function turned out to be slower than the non-JIT implementation
-# because of the compilation overhead. Let's try increasing the size of our
+# because of the compilation overhead. Note that the result from the compilation
+# run can be very noisy and can give a higher value than the real one, as mentioned
+# in the previous lessons. Let's try increasing the size of our
 # data and perform a non-NumPy list comprehension on the data.
 #
 # The `jit` decorator with `nopython=True` is so widely used there exists an alias
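The `-n 1 -r 1` flags make `%%timeit` run the cell exactly once, so the first (compiling) call is measured separately from warmed-up calls. Outside a notebook, the same split can be sketched with the standard library; the function below is a plain stand-in for the JITted one, not the lesson's actual `jit_f`:

```python
import timeit

def f(x):
    # Stand-in for the function being JIT-compiled; any pure function
    # illustrates the measurement pattern.
    return [v * v for v in x]

data = list(range(10_000))

# What `%%timeit -n 1 -r 1` does: a single loop, a single repeat, so the
# measurement includes any one-time costs (for Numba, the compilation)
# and is correspondingly noisy.
single_run = timeit.timeit(lambda: f(data), number=1)

# What plain `%%timeit` approximates: many loops over several repeats with
# the best taken, so one-time costs and noise are averaged away.
warm_run = min(timeit.repeat(lambda: f(data), number=100, repeat=5)) / 100

print(f"single run: {single_run:.6f} s, warmed-up run: {warm_run:.6f} s")
```

The single-shot number is only an upper bound on the steady-state cost, which is why the commit labels it "compilation and run".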
@@ -130,9 +133,13 @@ def jit_f(x):
 # %%timeit
 f(data)
 
+# %%
+# %%timeit -n 1 -r 1
+_ = jit_f(data)  # compilation and run
+
 # %%
 # %%timeit
-jit_f(data)
+jit_f(data)  # just run
 
 # %% [markdown]
 # That was way faster than the non-JIT function! But, the result was still slower
@@ -168,9 +175,13 @@ def mandel1(position, limit=50):
 xs = [(xmin + (xmax - xmin) * i / resolution) for i in range(resolution)]
 ys = [(ymin + (ymax - ymin) * i / resolution) for i in range(resolution)]
 
+# %%
+# %%timeit -n 1 -r 1
+data = [[mandel1(complex(x, y)) for x in xs] for y in ys]  # compilation and run
+
 # %%
 # %%timeit
-data = [[mandel1(complex(x, y)) for x in xs] for y in ys]
+data = [[mandel1(complex(x, y)) for x in xs] for y in ys]  # just run
 
 # %% [markdown]
 # The compiled code already beats our fastest NumPy implementation! It is not
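For context, the `mandel1` being timed here is an escape-time function; its exact body lives outside this hunk, but a sketch consistent with its signature and usage is:

```python
def mandel1(position, limit=50):
    # Escape-time iteration: repeatedly square and add the starting point
    # until the value leaves the radius-2 disc, returning the remaining
    # iteration budget at the moment of divergence.
    value = position
    while abs(value) < 2:
        limit -= 1
        value = value ** 2 + position
        if limit < 0:
            return 0  # treated as "never diverged" within the budget
    return limit

print(mandel1(complex(0, 0)), mandel1(complex(3, 0)))  # 0 50
```

A tight scalar loop like this is exactly the shape of code Numba compiles well, which is why the JITted version beats the NumPy whole-array formulation here.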
@@ -203,7 +214,6 @@ def mandel_numpy(position,limit=50):
 # %%timeit
 mandel_numpy(values)
 
-
 # %% [markdown]
 # That does not work. The error might be solvable or it might just be out of Numba's
 # scope. Numba does not distinguish between plain Python
@@ -213,6 +223,39 @@ def mandel_numpy(position,limit=50):
 # a subset of Python and NumPy so it is possible that a NumPy snippet does not
 # work but a simplified Python loop does.
 
+# %% [markdown]
+# Let's make minor adjustments to fix the NumPy implementation and measure its
+# performance. We flatten the NumPy arrays and consider only the real part
+# while performing the comparison.
+# %%
+@njit
+def mandel_numpy(position, limit=50):
+    value = position.flatten()
+    diverged_at_count = np.zeros(position.shape).flatten()
+    while limit > 0:
+        limit -= 1
+        value = value**2 + position.flatten()
+        diverging = (value * np.conj(value)).real > 4
+        first_diverged_this_time = np.logical_and(diverging, diverged_at_count == 0)
+        diverged_at_count[first_diverged_this_time] = limit
+        value[diverging] = 2
+
+    return diverged_at_count.reshape(position.shape)
+
+ymatrix, xmatrix = np.mgrid[ymin:ymax:ystep, xmin:xmax:xstep]
+values = xmatrix + 1j * ymatrix
+
+# %%
+# %%timeit -n 1 -r 1
+mandel_numpy(values)  # compilation and run
+
+# %%
+# %%timeit
+mandel_numpy(values)  # just run
+
+# %% [markdown]
+# The code performs similarly to the plain Python example!
+
 # %% [markdown]
 # Numba also has functionalities to vectorize, parallelize, and strictly type check
 # the code. All of these functions boost the performance even further or helps
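The flatten-and-`.real` adjustments line up with two constraints: ordering complex numbers with `>` is not defined, and boolean-mask assignment is simplest on 1-D arrays (a plausible reading of why the 2-D version failed under Numba). The pattern itself can be checked in plain NumPy without `@njit`; the grid values below are made up for illustration:

```python
import numpy as np

grid = np.array([[1 + 1j, 3 + 0j], [0 + 2j, 0.5 + 0j]])

# Work on a 1-D copy of the data so boolean-mask assignment stays simple.
flat = grid.flatten()

# value * conj(value) is |value|^2 with zero imaginary part; taking .real
# before comparing avoids ordering complex numbers, which is undefined.
diverging = (flat * np.conj(flat)).real > 4

flat[diverging] = 2  # clamp diverged points, as in the lesson's loop
result = flat.reshape(grid.shape)  # restore the original 2-D shape
print(diverging, result)
```

Only `3 + 0j` has squared magnitude above 4 here, so only that entry is clamped.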
@@ -287,16 +330,13 @@ def mandel_numpy(position,limit=50):
 # %%
 df
 # The computation gave us a dask object and not the actual answer. Why is that?
-# We can visualise the dask task graph using -
-
-# %% [markdown]
 # Displaying the dataframe just displays the metadata of the variable, and not any
 # data. This is because of the "lazy" nature of dask. Dask has "lazy" execution,
 # which means that it will store the operations on the data and
 # create a task graph for the same, but will not execute the operations until a
 # user explicitly asks for the result. The metadata specifies `npartitions=10`,
 # which means that the dataframe is split into 10 parts that will be accessed parallely.
-# We can get explicitly tell dask to give us the dataframe values using `.compute()`.
+# We can explicitly tell dask to give us the dataframe values using `.compute()`.
 
 # %%
 df.compute()
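The lazy behaviour described above — record operations now, execute only on `.compute()` — can be illustrated with a toy stand-in (this sketches the idea, not Dask's actual task-graph machinery; all names here are invented):

```python
class Lazy:
    """Record an operation and its inputs instead of running it."""

    def __init__(self, func, *deps):
        self.func = func
        self.deps = deps  # upstream Lazy nodes or concrete values

    def compute(self):
        # Resolve dependencies recursively, then run the stored operation.
        args = [d.compute() if isinstance(d, Lazy) else d for d in self.deps]
        return self.func(*args)

# Building the graph executes nothing, just like defining a dask dataframe.
a = Lazy(lambda: list(range(10)))
b = Lazy(lambda xs: [x * 2 for x in xs], a)
total = Lazy(sum, b)

print(total.compute())  # only now does the work actually happen -> 90
```

Until `compute()` is called, `total` is just a description of work, which is why displaying a dask dataframe shows only metadata such as `npartitions=10`.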
@@ -316,14 +356,14 @@ def mandel_numpy(position,limit=50):
 dask.visualize(new_df, filename="visualization.png")
 
 # %% [markdown]
-# We can see that the task graph starts with 10 independent branches because out dataframe
+# We can see that the task graph starts with 10 independent branches because our dataframe
 # was split into 10 partitions at the start. Let's compute the answer.
 
 # %%
 new_df.compute()
 
 # %% [markdown]
-# Similarly, one can peform such computations on arrays and selected Python data structures
+# Similarly, one can perform such computations on arrays and selected Python data structures.
 #
 
 # #### Dask support in Scientific Python ecosystem
