Create rule S6970: The Scikit-learn "fit" method should be called before methods yielding results #3869

github-actions · 2024-04-12T09:59:18Z

You can preview this rule here (updated a few minutes after each push).

Review

A dedicated reviewer checked the rule description successfully for:

* logical errors and incorrect information
* information gaps and missing content
* text style and tone
* PR summary and labels follow the guidelines

ghislainpiot · 2024-04-15T12:31:15Z

rules/S6970/python/rule.adoc

+
+== Why is this an issue?
+
+When using the Scikit-learn library it is crucial to train the model before


Maybe replace "model" with "estimator" or "transformer", or both.

ghislainpiot · 2024-04-15T12:32:07Z

rules/S6970/python/rule.adoc

+attempting to get results. Failing to do so can lead to incorrect results or runtime errors. 
+The training is done with the help of the `fit` method and retrieving results can be done for example with the `predict` method.
+
+If the `predict` method is called without a prior call to the `fit` method, a `NotFittedError` will be thrown.


Maybe specify that it's an exception?
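For reference, a minimal sketch of the behaviour under discussion, assuming scikit-learn is installed. It reuses the `KMeans` estimator from the rule's compliant example; the variable name `raised` is only illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.exceptions import NotFittedError

X = load_iris().data
kmeans = KMeans(n_clusters=3, random_state=42)

try:
    kmeans.predict(X)  # no prior call to fit
    raised = None
except NotFittedError as exc:
    raised = exc

# NotFittedError is an exception class that derives from both
# ValueError and AttributeError, so either can be used to catch it.
print(type(raised).__name__)
print(issubclass(NotFittedError, ValueError), issubclass(NotFittedError, AttributeError))
```

This is why calling it an "exception" in the description seems accurate: the call does not return a wrong value, it fails at runtime.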

ghislainpiot · 2024-04-15T12:38:19Z

rules/S6970/python/rule.adoc

+parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
+svc = svm.SVC()
+clf = GridSearchCV(svc, parameters)
+
+results = clf.cv_results_ # raises an AttributeError


We could have a more minimal example, like:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(1)
knn.n_samples_fit_ # Raises an AttributeError

Yes, and it made me notice that I did not include the list of attributes that would raise an error.
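To make the attribute case concrete, here is a small sketch based on the minimal example above, assuming scikit-learn is installed. The names `before_fit` and `after_fit` are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=1)

# Before fit, the learned attribute does not exist on the estimator yet,
# so reading it directly would raise an AttributeError.
before_fit = hasattr(knn, "n_samples_fit_")

knn.fit(iris.data, iris.target)

# After fit, the attribute holds the number of training samples.
after_fit = knn.n_samples_fit_
print(before_fit, after_fit)
```

Attributes with a trailing underscore are set during `fit`, which is the reason reading them earlier fails.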

ghislainpiot · 2024-04-15T12:38:47Z

rules/S6970/python/rule.adoc

+results = clf.cv_results_ # raises an AttributeError
+----
+
+In the example above failing to train the model on the iris dataset with the


Add a comma after "above".

ghislainpiot · 2024-04-15T12:39:29Z

rules/S6970/python/rule.adoc

+
+== How to fix it
+
+To fix the issue train the model by using the `fit` method.
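A possible compliant version of the fix, sketched with the same iris and `KMeans` setup that appears elsewhere in this PR (assuming scikit-learn is installed):

```python
from sklearn import datasets
from sklearn.cluster import KMeans

iris = datasets.load_iris()
X = iris.data

kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X)               # train the estimator first
labels = kmeans.predict(X)  # Compliant: fit was called before predict
print(labels.shape)
```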


ghislainpiot · 2024-04-15T14:45:24Z

rules/S6970/python/rule.adoc

@@ -75,16 +75,16 @@ kmeans.predict(X) # Compliant
 == Resources
 === Documentation

-* Scikit-learn Documentation - https://scikit-learn.org/stable/glossary.html#term-fit[term fit reference]
+* Scikit-learn Documentation - https://scikit-learn.org/stable/glossary.html#term-fit[Glossary fit reference]
 * Scikit-learn Documentation - https://scikit-learn.org/stable/modules/generated/sklearn.exceptions.NotFittedError.html#sklearn.exceptions.NotFittedError[NotFittedError reference]

 ifdef::env-github,rspecator-view[]

 Implementation details: 



Should we check whether the object is deserialized through pickle/joblib/...? Or are we going to lose the type anyway?

That's true, I don't think we would be able to get the proper type back if the object is deserialized, so we will not be able to raise the issue correctly in that case.
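To illustrate the limitation discussed here, below is a deliberately simplified sketch of such a check, written with Python's standard `ast` module. It is not the analyzer's actual implementation. The estimator and method lists and the function names (`called_name`, `find_unfitted_calls`) are hypothetical, and only top-level statements are inspected. The sample source is parsed, never executed.

```python
from __future__ import annotations

import ast

# Hypothetical, deliberately tiny lists; a real rule would cover many more names.
ESTIMATORS = {"KMeans", "KNeighborsClassifier", "SVC"}
FIT_METHODS = {"fit", "fit_predict", "fit_transform"}
RESULT_METHODS = {"predict", "transform", "score"}


def called_name(call: ast.Call) -> str | None:
    """Return the called name for `KMeans(...)` or `cluster.KMeans(...)`."""
    func = call.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_unfitted_calls(source: str) -> list[int]:
    """Line numbers where a result method is called on a variable that was
    assigned a known estimator constructor and has not been fitted since."""
    unfitted = set()  # variables holding a freshly constructed estimator
    issues = []
    for stmt in ast.parse(source).body:  # top-level statements, in source order
        # 1. Method calls on tracked variables inside this statement.
        for node in ast.walk(stmt):
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
                continue
            receiver = node.func.value
            if not (isinstance(receiver, ast.Name) and receiver.id in unfitted):
                continue
            if node.func.attr in FIT_METHODS:
                unfitted.discard(receiver.id)
            elif node.func.attr in RESULT_METHODS:
                issues.append(node.lineno)
        # 2. Assignments: only direct constructor calls give a known type.
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            name = stmt.targets[0].id
            value = stmt.value
            if isinstance(value, ast.Call) and called_name(value) in ESTIMATORS:
                unfitted.add(name)
            else:
                unfitted.discard(name)  # type unknown: stay silent
    return issues


SAMPLE = """\
import pickle
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3)
kmeans.predict(X)

fitted = KMeans(n_clusters=3)
fitted.fit(X)
fitted.predict(X)

with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)
loaded.predict(X)
"""

print(find_unfitted_calls(SAMPLE))
```

In this sketch only the `kmeans.predict(X)` call is reported. The object coming from `pickle.load` has no visible constructor, so its type is unknown to the check and no issue is raised on it, which matches the conclusion above.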

ghislainpiot · 2024-04-15T14:48:53Z

rules/S6970/python/rule.adoc

+iris = datasets.load_iris()
+X = iris.data
+
+kmeans = KMeans(n_clusters=3, random_state=42)


Maybe we should have a correct random_state.

In what sense is this not correct?

Sorry, my bad. I confused it with numpy.random.Generator.
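For context, `random_state=42` is an ordinary integer seed. The scikit-learn glossary lists `None`, an int, or a `numpy.random.RandomState` instance as accepted values, and `numpy.random.Generator` is a separate NumPy API. A small sketch, assuming scikit-learn and NumPy are installed, showing that the integer seed gives repeatable clustering:

```python
import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans

X = datasets.load_iris().data

# Two estimators built with the same integer seed.
first = KMeans(n_clusters=3, random_state=42).fit(X)
second = KMeans(n_clusters=3, random_state=42).fit(X)

# With an identical seed, both runs assign the same cluster labels.
same_labels = np.array_equal(first.labels_, second.labels_)
print(same_labels)
```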

ghislainpiot · 2024-04-16T07:19:22Z

rules/S6970/python/metadata.json

+ "sqKey": "S6970",
+ "scope": "All",
+ "defaultQualityProfiles": ["Sonar way"],
+ "quickfix": "unknown",


Nitpick: maybe set the quick fix to infeasible?

sonarqube-next · 2024-04-16T07:56:42Z

Quality Gate passed for 'rspec-frontend'

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

sonarqube-next · 2024-04-16T07:56:50Z

Quality Gate passed for 'rspec-tools'

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

github-actions bot assigned joke1196 Apr 12, 2024

github-actions bot added the python label Apr 12, 2024

joke1196 changed the title ~~Create rule S6970~~ Create rule S6970: The Scikit-learn "fit" method should be called before methods yielding results Apr 15, 2024

joke1196 and others added 2 commits April 15, 2024 11:12

Create rule S6970

ce57225

Create rule S6970: The Scikit-learn "fit" method should be called b…

249eea5

…efore methods yielding results

joke1196 force-pushed the rule/add-RSPEC-S6970 branch from 96c24ae to 249eea5 Compare April 15, 2024 09:12

joke1196 added 3 commits April 15, 2024 11:40

Fixed asciidoc error

a55d815

Fix documentation formatting

1da08c1

Fix description and list indent

1a774b4

joke1196 marked this pull request as ready for review April 15, 2024 11:59

joke1196 requested a review from ghislainpiot April 15, 2024 11:59

ghislainpiot reviewed Apr 15, 2024

joke1196 added 2 commits April 15, 2024 15:18

Added implementation details in the comment section

fb901b8

Fix after review

138fac4

ghislainpiot reviewed Apr 15, 2024

joke1196 requested a review from ghislainpiot April 16, 2024 07:03

ghislainpiot approved these changes Apr 16, 2024

joke1196 added 2 commits April 16, 2024 09:41

Fix quickfix metadata

b60539a

Improved implementation details

5df20e6

joke1196 requested a review from jean-jimbo-sonarsource April 16, 2024 09:00

jean-jimbo-sonarsource approved these changes Apr 23, 2024
