@@ -97,18 +97,21 @@ def jit_f(x):
 f(data)
 
 # %%
-_ = jit_f(data)
+# %%timeit -n 1 -r 1
+_ = jit_f(data)  # compilation and run
 
 # %%
 # %%timeit
-jit_f(data)
+jit_f(data)  # just run
 
 # %% [markdown]
 # Surprisingly, the JITted function was slower than the plain Python and NumPy
 # implementation! Why did this happen? Numba does not provide significant performance
 # gains over pure Python or NumPy code for simple operations and small datasets.
 # The JITted function turned out to be slower than the non-JIT implementation
-# because of the compilation overhead. Let's try increasing the size of our
+# because of the compilation overhead. Note that the result from the compilation
+# run can be very noisy and report a higher-than-actual value, as mentioned
+# in the previous lessons. Let's try increasing the size of our
 # data and perform a non-NumPy list comprehension on the data.
 #
 # The `jit` decorator with `nopython=True` is so widely used there exists an alias
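The first-call compilation cost discussed above can be pictured without Numba itself. The toy decorator below is hypothetical (not Numba's API; the `time.sleep` stands in for real compilation work), but it mimics how a JIT caches a compiled version per argument-type signature, so the first call pays a one-off cost:

```python
import time


def toy_jit(func):
    """Toy stand-in for a JIT decorator: 'compile' on first call, then cache."""
    compiled = {}

    def wrapper(*args):
        key = tuple(type(a) for a in args)  # JITs specialize per argument types
        if key not in compiled:
            time.sleep(0.05)  # stand-in for real compilation work
            compiled[key] = func
        return compiled[key](*args)

    return wrapper


@toy_jit
def square(x):
    return x * x


t0 = time.perf_counter()
square(3)  # first call: pays the "compilation" cost
first = time.perf_counter() - t0

t0 = time.perf_counter()
square(4)  # later calls: cached, fast
second = time.perf_counter() - t0
assert first > second
```

Numba's real `jit`/`njit` compile to machine code via LLVM rather than sleeping, but the asymmetry between the first call and subsequent calls is the same, which is why the lesson times them separately.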
@@ -130,9 +133,13 @@ def jit_f(x):
 # %%timeit
 f(data)
 
+# %%
+# %%timeit -n 1 -r 1
+_ = jit_f(data)  # compilation and run
+
 # %%
 # %%timeit
-jit_f(data)
+jit_f(data)  # just run
 
 # %% [markdown]
 # That was way faster than the non-JIT function! But, the result was still slower
@@ -168,9 +175,13 @@ def mandel1(position, limit=50):
 xs = [(xmin + (xmax - xmin) * i / resolution) for i in range(resolution)]
 ys = [(ymin + (ymax - ymin) * i / resolution) for i in range(resolution)]
 
+# %%
+# %%timeit -n 1 -r 1
+data = [[mandel1(complex(x, y)) for x in xs] for y in ys]  # compilation and run
+
 # %%
 # %%timeit
-data = [[mandel1(complex(x, y)) for x in xs] for y in ys]
+data = [[mandel1(complex(x, y)) for x in xs] for y in ys]  # just run
 
 # %% [markdown]
 # The compiled code already beats our fastest NumPy implementation! It is not
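For reference, the escape-time iteration being timed here can be written as a short pure-Python function. This sketch assumes, consistent with the NumPy variant shown later in the lesson, that `mandel1` returns the remaining iteration budget at the first divergence and 0 if the point never diverges:

```python
def mandel(c, limit=50):
    """Escape-time iteration: z -> z**2 + c, diverged when |z|**2 > 4."""
    value = c
    while limit > 0:
        limit -= 1
        value = value * value + c
        if (value * value.conjugate()).real > 4:
            return limit  # iteration budget left at first divergence
    return 0  # never diverged within the limit


print(mandel(0 + 0j))  # 0: the origin never diverges
print(mandel(2 + 0j))  # 49: diverges on the first iteration
```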
@@ -203,7 +214,6 @@ def mandel_numpy(position,limit=50):
 # %%timeit
 mandel_numpy(values)
 
-
 # %% [markdown]
 # That does not work. The error might be solvable or it might just be out of Numba's
 # scope. Numba does not distinguish between plain Python
@@ -213,6 +223,39 @@ def mandel_numpy(position,limit=50):
 # a subset of Python and NumPy so it is possible that a NumPy snippet does not
 # work but a simplified Python loop does.
 
+# %% [markdown]
+# Let's make minor adjustments to fix the NumPy implementation and measure its
+# performance. We flatten the NumPy arrays and consider only the real part
+# while performing the comparison.
+# %%
+@njit
+def mandel_numpy(position, limit=50):
+    value = position.flatten()
+    diverged_at_count = np.zeros(position.shape).flatten()
+    while limit > 0:
+        limit -= 1
+        value = value**2 + position.flatten()
+        diverging = (value * np.conj(value)).real > 4
+        first_diverged_this_time = np.logical_and(diverging, diverged_at_count == 0)
+        diverged_at_count[first_diverged_this_time] = limit
+        value[diverging] = 2
+
+    return diverged_at_count.reshape(position.shape)
+
+ymatrix, xmatrix = np.mgrid[ymin:ymax:ystep, xmin:xmax:xstep]
+values = xmatrix + 1j * ymatrix
+
+# %%
+# %%timeit -n 1 -r 1
+mandel_numpy(values)  # compilation and run
+
+# %%
+# %%timeit
+mandel_numpy(values)  # just run
+
+# %% [markdown]
+# The code performs similarly to the plain Python example!
+
 # %% [markdown]
 # Numba also has functionalities to vectorize, parallelize, and strictly type check
 # the code. All of these functions boost the performance even further or help
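The masked-update bookkeeping in the `mandel_numpy` cell above can be sanity-checked on a tiny grid with plain NumPy, no `@njit` needed. The function name `mandel_numpy_plain` and the 2x2 `values` grid here are ours, made up for illustration:

```python
import numpy as np

# A tiny hypothetical grid of starting points c for z -> z**2 + c.
values = np.array([[0 + 0j, 2 + 0j], [1j, 1 + 1j]])


def mandel_numpy_plain(position, limit=50):
    # Same algorithm as the @njit version above, without the decorator.
    value = position.flatten()
    diverged_at_count = np.zeros(position.shape).flatten()
    while limit > 0:
        limit -= 1
        value = value**2 + position.flatten()
        diverging = (value * np.conj(value)).real > 4
        first_diverged_this_time = np.logical_and(diverging, diverged_at_count == 0)
        diverged_at_count[first_diverged_this_time] = limit  # record escape time once
        value[diverging] = 2  # clamp diverged points to avoid overflow
    return diverged_at_count.reshape(position.shape)


print(mandel_numpy_plain(values))
```

Points inside the set (such as 0 and 1j, which cycles without escaping) come back as 0, while 2 and 1+1j escape on the very first iteration and are stamped with the remaining budget, 49.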
@@ -287,16 +330,13 @@ def mandel_numpy(position,limit=50):
 # %%
 df
 # The computation gave us a dask object and not the actual answer. Why is that?
-# We can visualise the dask task graph using -
-
-# %% [markdown]
 # Displaying the dataframe only shows the metadata of the variable, and not any
 # data. This is because of the "lazy" nature of dask. Dask has "lazy" execution,
 # which means that it will store the operations on the data and
 # create a task graph for them, but will not execute the operations until a
 # user explicitly asks for the result. The metadata specifies `npartitions=10`,
 # which means that the dataframe is split into 10 parts that will be accessed in parallel.
-# We can get explicitly tell dask to give us the dataframe values using `.compute()`.
+# We can explicitly tell dask to give us the dataframe values using `.compute()`.
 
 # %%
 df.compute()
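Lazy execution itself is easy to model without Dask: a deferred object records operations and only runs them when `compute()` is called. The toy class below is an illustration of the idea only, not Dask's implementation (Dask builds a real task graph and can execute it in parallel):

```python
class Lazy:
    """Toy deferred computation: record operations, run them on compute()."""

    def __init__(self, value):
        self.value = value
        self.ops = []  # the recorded "task graph" (a simple chain here)

    def map(self, func):
        self.ops.append(func)  # store the operation, do not run it yet
        return self

    def compute(self):
        result = self.value
        for op in self.ops:  # only now is any work actually done
            result = op(result)
        return result


deferred = Lazy([1, 2, 3]).map(lambda xs: [x * 10 for x in xs])
print(deferred)            # a Lazy object, not the answer
print(deferred.compute())  # [10, 20, 30]
```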
@@ -316,14 +356,14 @@ def mandel_numpy(position,limit=50):
 dask.visualize(new_df, filename="visualization.png")
 
 # %% [markdown]
-# We can see that the task graph starts with 10 independent branches because out dataframe
+# We can see that the task graph starts with 10 independent branches because our dataframe
 # was split into 10 partitions at the start. Let's compute the answer.
 
 # %%
 new_df.compute()
 
 # %% [markdown]
-# Similarly, one can peform such computations on arrays and selected Python data structures
+# Similarly, one can perform such computations on arrays and selected Python data structures.
 #
 
 # #### Dask support in Scientific Python ecosystem
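The partitioned-computation idea behind Dask arrays can be sketched with plain NumPy: split the data into chunks, compute per-chunk partial results, and combine them. This shows the concept only, not Dask's API:

```python
import numpy as np

data = np.arange(100_000, dtype=float)
chunks = np.array_split(data, 10)  # 10 partitions, like npartitions=10 above

# Per-chunk partial results combine into the global mean.
total = sum(chunk.sum() for chunk in chunks)
count = sum(chunk.size for chunk in chunks)
assert total / count == data.mean()
```

A real `dask.array` does this chunking automatically and schedules the per-chunk work across threads, processes, or a cluster.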