README.md (3 additions & 1 deletion)
@@ -1,7 +1,9 @@
 # MLCourse

 This repository contains teaching material for an introductory machine learning course.
-You can find an interactive preview of the Pluto notebooks of this course [here](https://bio322.epfl.ch) and you can run some notebooks on [mybinder](https://mybinder.org/v2/gh/jbrea/MLCourse/binder?urlpath=pluto/open?path%3D/home/jovyan/MLCourse/index.jl) (some notebooks will crash on mybinder when they hit the memory limit).
+
+**Students of my course do not need to pull this repository or follow the instructions
+below. This repository is mostly used to create the interactive websites on [https://bio322.epfl.ch](https://bio322.epfl.ch).**
 bikesharing.dropna(inplace=True) # remove rows with missing data
-bikesharing.dtypes.to_frame().reset_index()
 """
 ,
 cache =false
 )

 # ╔═╡ b9ba1df0-5086-4c0f-a2c9-200c2be27294
-md"Above we see that the `:count` column is detected as `Continuous`, whereas it should be `Count`. We will therefore coerce it to the correct scientific type in the first line of the cell below.
+mlstring(md"Above we see that the `:count` column is detected as `Continuous`, whereas it should be `Count`. We will therefore coerce it to the correct scientific type in the first line of the cell below.

-For count variables we can use Poisson regression. Following the standard recipe, we parametrize ``f(x) = \theta_0 + \theta_1 x_1 + \cdots +\theta_d x_d``, plug this into the formula of the Poisson distribution and fit the parameters ``\theta_0, \ldots, \theta_d`` by maximizing the log-likelihood. In `MLJ` this is done by the `CountRegressor()`."
+For count variables we can use Poisson regression. Following the standard recipe, we parametrize ``f(x) = \theta_0 + \theta_1 x_1 + \cdots +\theta_d x_d``, plug this into the formula of the Poisson distribution and fit the parameters ``\theta_0, \ldots, \theta_d`` by maximizing the log-likelihood. In `MLJ` this is done by the `CountRegressor()`.",
+md"")

 # ╔═╡ 81c55206-bf59-4c4e-ac5e-77a46e31bec7
 mlcode(
@@ -682,6 +691,7 @@ m4.fit(bikesharing[['temp', 'humidity']], bikesharing['count']) # Fitting the mo

 m4.coef_ # Retrieving the fitted parameters
 """
+,
 )

 # ╔═╡ 6ea40424-22a0-42b9-bfab-8d4903ab8d64
@@ -698,6 +708,7 @@ predict(m4)
 """
 m4.predict(bikesharing[['temp', 'humidity']])
 """
+,
 )

 # ╔═╡ aa96bbe7-49f4-4244-9c71-8d9b2b3ee065
@@ -788,7 +799,7 @@ In the multiple linear regression of the weather data set above we used all
 md"""
 #### Exercise 5
 - Read the section on [scientific types in the MLJ manual](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#Data-containers-and-scientific-types).
-- Coerce the `count` variable of the bike sharing data to `Continuous` and fit a linear model (`LinearRegressor`) with predictors `:temp` and `:humidity`.
+- Coerce the `count` variable of the bike sharing data to `Continuous` and fit a linear model (`LinearRegressor`) with predictors `:temp` and `:humidity`.
 Create a scatter plot with the true counts `bikesharing.count` on the x-axis and the predicted mode (`predict_mode`) of the counts for the linear regression model and the Poisson model on the y-axis. If the model perfectly captures the data, the plotted points should lie on the diagonal; you can add `plot!(identity)` to the figure to display the diagonal.
 Comment on the differences you see in the plot between the Poisson model and the linear regression model.
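
Note for readers of this diff: the Poisson regression recipe described in the notebook text above (a linear function f(x), plugged into the Poisson distribution, with parameters fitted by maximizing the log-likelihood) can be sketched in plain Python. This is a minimal illustration, not the notebook's code. It assumes the common log link, i.e. a Poisson rate of exp(f(x)), which the text does not state explicitly. The helper names (`linear`, `sample_poisson`, `log_likelihood`, `fit_poisson`), the synthetic data, the learning rate and the step count are all hypothetical choices made for the example.

```python
import math
import random


def linear(theta, x):
    """f(x) = theta_0 + theta_1 * x_1 + ... + theta_d * x_d."""
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


def sample_poisson(rate, rng):
    """Draw one Poisson sample with Knuth's multiplication method."""
    limit = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def log_likelihood(theta, xs, ys):
    """Mean Poisson log-likelihood, assuming rate = exp(f(x)) (log link)."""
    total = 0.0
    for x, y in zip(xs, ys):
        f = linear(theta, x)
        # log P(y | rate) = y * log(rate) - rate - log(y!), with log(rate) = f
        total += y * f - math.exp(f) - math.lgamma(y + 1)
    return total / len(ys)


def fit_poisson(xs, ys, lr=0.1, steps=2000):
    """Maximize the mean log-likelihood by plain gradient ascent."""
    n, d = len(ys), len(xs[0])
    theta = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for x, y in zip(xs, ys):
            # Derivative of the log-likelihood w.r.t. f is (y - rate).
            resid = y - math.exp(linear(theta, x))
            grad[0] += resid
            for j, xi in enumerate(x):
                grad[j + 1] += resid * xi
        theta = [t + lr * g / n for t, g in zip(theta, grad)]
    return theta


# Synthetic data (an assumption for illustration, not the bike sharing data).
rng = random.Random(0)
true_theta = [0.5, 1.0]
xs = [[rng.random()] for _ in range(500)]
ys = [sample_poisson(math.exp(linear(true_theta, x)), rng) for x in xs]

theta_hat = fit_poisson(xs, ys)
print("fitted parameters:", theta_hat)
print("mean log-likelihood:", log_likelihood(theta_hat, xs, ys))
```

In practice a library routine such as the `CountRegressor()` mentioned in the notebook text would replace the hand-written gradient loop; the sketch only makes the "plug in and maximize" step concrete.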