Warnings regarding scikit-learn version: Version mismatch might lead to invalid results #22
Comments
@mehdigolzadeh can you take care of this? Is there an easy way to convert the model to the more recent version of sklearn?
Is there any news regarding this issue?
I sent an email to the maintainer. I think he changed his email address, which would explain why he's not even aware of this issue :-) Let's wait a few days for him to react. As a side note, a researcher in my team is currently working on a new approach to detect bots in repositories hosted on GitHub, based on the activities they perform. The main difference compared to BoDeGHa is that the new model/tool will rely on a limited number of queries to the GitHub API, so it should be much faster at detecting bots in practice. So far, we have no insight into the accuracy of this approach, but we are confident it will be at least comparable to BoDeGHa's accuracy. That said, do not expect the tool to be released before October/November :-)
@mehdigolzadeh Any update?
Still no reaction from the maintainer? (And thanks for your side note. Nevertheless, I would like to stay with BoDeGHa, at least for a certain time, as it is already part of my toolchain, and changing tools always implies additional effort...)
He reacted by mail saying he would give some feedback "soon"... :-) I've just sent another email.
I apologize for the delayed response; I've been swamped with numerous tasks. Unfortunately, I couldn't find the time to train a new model, but I did come up with a quick temporary fix. The warning is still present, but I've ignored it because the model is functioning without any problems. I plan to retrain the model using the new version of scikit-learn as soon as I have some free time.
If the model is still working with the new version of sklearn, would it be possible to load it in the new version and export it in the new model format?
I did this. Now, the model is exported using the new version of scikit-learn. However, I couldn't resolve the warning because the parameter needs to be passed during training.
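The load-and-re-export step discussed in the last two comments can be sketched roughly as follows. This is a minimal sketch, assuming the pretrained model is stored as a plain pickle file; the function name resave_model and any file names passed to it are hypothetical and not taken from the BoDeGHa repository.

```python
import pickle
import warnings
from pathlib import Path


def resave_model(src: Path, dst: Path) -> list:
    """Load a pickled model and write it back using the installed library versions.

    Hypothetical helper: returns the messages of all warnings raised while
    unpickling, so a version-mismatch warning is reported instead of hidden.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        with src.open("rb") as fh:
            model = pickle.load(fh)
    # Serialize the loaded object again under the currently installed versions.
    with dst.open("wb") as fh:
        pickle.dump(model, fh)
    return [str(w.message) for w in caught]
```

Re-exporting this way only rewrites the serialized file. It does not retrain the model, so anything that has to be set at training time stays as it was, which matches the remaining warning described in the last comment.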
Up to now, I used BoDeGHa with scikit-learn version 0.22, as stated in requirements.txt (BoDeGHa/requirement.txt, line 3 in ac8a5d6).
However, when installing BoDeGHa freshly, it uses scikit-learn version 1.0.1, since this is the version given in setup.py (BoDeGHa/setup.py, line 34 in ac8a5d6).
But using 1.0.1 leads to warnings when running BoDeGHa, as the pretrained model was trained with 0.22.
So, as there is a mismatch of the scikit-learn versions in your repository, this needs to be fixed somehow: using a pretrained model that was not trained with the current scikit-learn version could lead to wrong results.
To fix this, either the scikit-learn version in setup.py needs to be set back to 0.22, or a new pretrained model for 1.0.1 needs to be provided in the repository.
I tried to set the version of scikit-learn in setup.py back to 0.22, but without success: scikit-learn 0.22 is not compatible with the current version of numpy any more (AttributeError: module 'numpy' has no attribute 'float'. `np.float` was a deprecated alias for the builtin 'float'). Downgrading numpy to version 1.19.5 (the version before the deprecation of np.float) was not possible, as numpy 1.19.5 does not work with Python 3.10. Installing numpy 1.21.2 (which is compatible with Python 3.10) results in another error (ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject). I also tried other versions of numpy between 1.19.5 and 1.21.2, also without success.
So, finally, I did not manage to install scikit-learn version 0.22, with which your pretrained model was trained, on Python 3.10.
Could you please update the pretrained model in this repository to work with scikit-learn 1.0.1? Or could you show that using your 0.22-pretrained model with 1.0.1 still gives correct results, and suppress the corresponding warnings somehow?
Thanks in advance! This would help a lot and increase the reliability of your tool, as such risk warnings would disappear 😉
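Until the model is retrained, one way to keep a version mismatch from passing silently is to turn the warning into an error at load time. The following is a minimal sketch, assuming the mismatch is reported as a UserWarning (or a subclass of it) while the model file is being unpickled; the function name load_strict and its loader parameter are hypothetical and not part of BoDeGHa.

```python
import pickle
import warnings


def load_strict(path, loader=pickle.load):
    """Load a model file, raising instead of warning on any UserWarning.

    Hypothetical helper: `loader` is any callable that takes an open binary
    file object and returns the deserialized model.
    """
    with warnings.catch_warnings():
        # Inside this block, a UserWarning is raised as an exception.
        warnings.simplefilter("error", UserWarning)
        with open(path, "rb") as fh:
            return loader(fh)
```

A caller could wrap load_strict in try/except UserWarning and stop with a clear message, rather than continuing with predictions whose validity is unknown.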