|
89 | 89 | "\n",
|
90 | 90 | "References:<br>\n",
|
91 | 91 | "**An Introduction to the Bootstrap.**<br>\n",
|
92 |
| - "B. Efron and R. Tibshirani, *Chapman & Hall/CRC, (1993).\n", |
| 92 | + "B. Efron and R. Tibshirani, Chapman & Hall/CRC, (1993).\n", |
93 | 93 | "\n",
|
94 | 94 | "**[What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784504/)**<br>\n",
|
95 | 95 | "T. C. Hesterberg, *The American Statistician*, 69(4), 371–386, (2015).<br>\n",
|
|
106 | 106 | {
|
107 | 107 | "cell_type": "code",
|
108 | 108 | "execution_count": null,
|
109 |
| - "metadata": { |
110 |
| - "ExecuteTime": { |
111 |
| - "end_time": "2018-11-27T07:53:54.139086Z", |
112 |
| - "start_time": "2018-11-27T07:53:53.019320Z" |
113 |
| - } |
114 |
| - }, |
| 109 | + "metadata": {}, |
115 | 110 | "outputs": [],
|
116 | 111 | "source": [
|
117 | 112 | "import numpy as np\n",
|
|
130 | 125 | "cell_type": "code",
|
131 | 126 | "execution_count": null,
|
132 | 127 | "metadata": {
|
133 |
| - "ExecuteTime": { |
134 |
| - "end_time": "2018-11-27T07:54:26.080833Z", |
135 |
| - "start_time": "2018-11-27T07:54:24.422776Z" |
136 |
| - }, |
137 | 128 | "scrolled": true
|
138 | 129 | },
|
139 | 130 | "outputs": [],
|
|
152 | 143 | "cell_type": "markdown",
|
153 | 144 | "metadata": {},
|
154 | 145 | "source": [
|
155 |
| - "<div class=\"alert alert-warning\">**Exercise:**<br>\n", |
| 146 | + "<div class=\"alert alert-warning\">\n", |
| 147 | + "\n", |
| 148 | + "**Exercise:**<br>\n", |
156 | 149 | "Implement a Bootstrap algorithm to estimate the distribution of the empirical average and empirical median estimators on this data.<br>\n",
|
157 | 150 | "Plot the histogram of these distributions (use [`plt.hist`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html)).<br>\n",
|
158 | 151 | "Estimate the estimator's empirical average from the bootstrap samples and the mean of its distribution.\n",
|
|
162 | 155 | {
|
163 | 156 | "cell_type": "code",
|
164 | 157 | "execution_count": null,
|
165 |
| - "metadata": { |
166 |
| - "ExecuteTime": { |
167 |
| - "end_time": "2018-11-27T08:05:50.870129Z", |
168 |
| - "start_time": "2018-11-27T08:05:45.840992Z" |
169 |
| - } |
170 |
| - }, |
| 158 | + "metadata": {}, |
171 | 159 | "outputs": [],
|
172 | 160 | "source": [
|
173 | 161 | "# %load solutions/code1.py\n",
|
|
239 | 227 | "\n",
|
240 | 228 | "In practice, we never get $\\varphi_A$.\n",
|
241 | 229 | "\n",
|
242 |
| - "<div class=\"alert alert-success\">**Key results:**<br>\n", |
| 230 | + "<div class=\"alert alert-success\">\n", |
| 231 | + "\n", |
| 232 | + "**Key results:**<br>\n", |
243 | 233 | "*1st result:* $\\varphi_A$ is always at least as good as $\\varphi$; $e_A \\leq e$<br>\n",
|
244 | 234 | "<br>\n",
|
245 | 235 | "*2nd result:* The higher the variance of $\\varphi$ across training sets $\\mathcal{T}$, the more improvement $\\varphi_A$ produces.<br>\n",
|
|
277 | 267 | "cell_type": "code",
|
278 | 268 | "execution_count": null,
|
279 | 269 | "metadata": {
|
280 |
| - "ExecuteTime": { |
281 |
| - "end_time": "2018-11-27T08:30:50.285839Z", |
282 |
| - "start_time": "2018-11-27T08:30:47.046918Z" |
283 |
| - }, |
284 | 270 | "scrolled": true
|
285 | 271 | },
|
286 | 272 | "outputs": [],
|
|
307 | 293 | "\n",
|
308 | 294 | "Xblue = X[y==-1]\n",
|
309 | 295 | "Xred = X[y==1]\n",
|
310 |
| - "plt.figure()\n", |
| 296 | + "fig=plt.figure(figsize=(8, 8), dpi= 80, facecolor='w', edgecolor='k')\n", |
311 | 297 | "plt.scatter(Xblue[:,0],Xblue[:,1],c='b')\n",
|
312 |
| - "_=plt.scatter(Xred[:,0],Xred[:,1],c='r')" |
| 298 | + "plt.scatter(Xred[:,0],Xred[:,1],c='r');" |
313 | 299 | ]
|
314 | 300 | },
|
315 | 301 | {
|
|
325 | 311 | {
|
326 | 312 | "cell_type": "code",
|
327 | 313 | "execution_count": null,
|
328 |
| - "metadata": { |
329 |
| - "ExecuteTime": { |
330 |
| - "end_time": "2018-11-27T08:31:51.917021Z", |
331 |
| - "start_time": "2018-11-27T08:31:51.427330Z" |
332 |
| - } |
333 |
| - }, |
| 314 | + "metadata": {}, |
334 | 315 | "outputs": [],
|
335 | 316 | "source": [
|
336 | 317 | "from sklearn import tree\n",
|
|
378 | 359 | {
|
379 | 360 | "cell_type": "code",
|
380 | 361 | "execution_count": null,
|
381 |
| - "metadata": { |
382 |
| - "ExecuteTime": { |
383 |
| - "end_time": "2018-11-27T08:31:54.298461Z", |
384 |
| - "start_time": "2018-11-27T08:31:54.127632Z" |
385 |
| - } |
386 |
| - }, |
| 362 | + "metadata": {}, |
387 | 363 | "outputs": [],
|
388 | 364 | "source": [
|
389 | 365 | "### Generate data\n",
|
|
402 | 378 | "cell_type": "markdown",
|
403 | 379 | "metadata": {},
|
404 | 380 | "source": [
|
405 |
| - "<div class=\"alert alert-warning\">**Exercise:**<br>\n", |
| 381 | + "<div class=\"alert alert-warning\">\n", |
| 382 | + "\n", |
| 383 | + "**Exercise:**<br>\n", |
406 | 384 | "Implement a Bagging procedure that builds a forest of 101 trees.<br>\n",
|
407 | 385 | "Monitor the training and generalization error of individual trees and of the forest, along the forest growth.<br>\n",
|
408 | 386 | "Display and comment.\n",
|
|
412 | 390 | {
|
413 | 391 | "cell_type": "code",
|
414 | 392 | "execution_count": null,
|
415 |
| - "metadata": { |
416 |
| - "ExecuteTime": { |
417 |
| - "end_time": "2018-11-27T08:42:35.501698Z", |
418 |
| - "start_time": "2018-11-27T08:42:35.495866Z" |
419 |
| - } |
420 |
| - }, |
| 393 | + "metadata": {}, |
421 | 394 | "outputs": [],
|
422 | 395 | "source": [
|
423 | 396 | "from sklearn import tree\n",
|
|
435 | 408 | {
|
436 | 409 | "cell_type": "code",
|
437 | 410 | "execution_count": null,
|
438 |
| - "metadata": { |
439 |
| - "ExecuteTime": { |
440 |
| - "end_time": "2018-11-27T08:42:36.951140Z", |
441 |
| - "start_time": "2018-11-27T08:42:36.091230Z" |
442 |
| - } |
443 |
| - }, |
| 411 | + "metadata": {}, |
444 | 412 | "outputs": [],
|
445 | 413 | "source": [
|
446 | 414 | "# %load solutions/code2.py\n",
|
|
451 | 419 | {
|
452 | 420 | "cell_type": "code",
|
453 | 421 | "execution_count": null,
|
454 |
| - "metadata": { |
455 |
| - "ExecuteTime": { |
456 |
| - "end_time": "2018-11-27T08:42:43.119557Z", |
457 |
| - "start_time": "2018-11-27T08:42:40.526180Z" |
458 |
| - } |
459 |
| - }, |
| 422 | + "metadata": {}, |
460 | 423 | "outputs": [],
|
461 | 424 | "source": [
|
462 | 425 | "### Display\n",
|
|
498 | 461 | "name": "python",
|
499 | 462 | "nbconvert_exporter": "python",
|
500 | 463 | "pygments_lexer": "ipython3",
|
501 |
| - "version": "3.6.4" |
| 464 | + "version": "3.6.9" |
502 | 465 | },
|
503 | 466 | "toc": {
|
504 | 467 | "base_numbering": 1,
|
|
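The first exercise touched by this diff asks for a bootstrap estimate of the distribution of the empirical average and empirical median. A minimal sketch of that procedure follows, assuming the data is a 1-D NumPy array. The function name `bootstrap_distribution`, the synthetic `sample`, and the default of 1000 resamples are illustrative assumptions, not taken from `solutions/code1.py`.

```python
import numpy as np


def bootstrap_distribution(data, estimator, n_resamples=1000, seed=0):
    """Return the estimator evaluated on bootstrap resamples of `data`.

    Hypothetical helper for illustration; not the notebook's own solution.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    # Each row holds n indices drawn with replacement: one bootstrap sample.
    idx = rng.integers(0, n, size=(n_resamples, n))
    return np.array([estimator(data[row]) for row in idx])


# Synthetic data standing in for the notebook's dataset (an assumption).
data_rng = np.random.default_rng(42)
sample = data_rng.normal(loc=5.0, scale=2.0, size=200)

# Bootstrap distributions of the two estimators named in the exercise.
boot_means = bootstrap_distribution(sample, np.mean)
boot_medians = bootstrap_distribution(sample, np.median)
```

Passing `boot_means` or `boot_medians` to `plt.hist` should give the histograms the exercise asks for, and `boot_means.mean()` can be compared with `sample.mean()`.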