I think it would be good to include the datasets we used in the benchmarks and reported on in the paper. Perhaps an extra benchmarks folder could be created with scripts that people can use to test performance, and at least the datasets we used in the paper. I think that would be a good service for people interested in the library, and possibly in finding any remaining bottlenecks.
I think it is a good idea, but I would not put it directly in the main branch.
Maybe we could create a benchmark branch and put all the datasets we want there, what do you think?
The reason is that I think the data should not be present in the package itself, or at least not in its main branch.
> I think it is a good idea, but I would not put it directly in the main branch.
Hmm, I'm not sure I agree. Though I think I see why you say this ("only code in main"), including benchmark scripts and data is the approach in scikit-learn, for example: https://github.com/scikit-learn/scikit-learn (see the benchmarks folder with scripts, and the datasets subpackage with the data itself). Additionally, the data would be saved as text files, not as binary.
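To make the proposal concrete, a benchmark script in such a folder could be a small, self-contained timing harness. This is only a sketch: `run_workload` is a hypothetical stand-in for whatever library call is being measured, and the inline CSV sample stands in for a text dataset that would live next to the script.

```python
import csv
import io
import timeit

def run_workload(rows):
    # Hypothetical stand-in for the library entry point being
    # benchmarked; replace with the real call.
    return sum(float(r[0]) for r in rows)

# Datasets would be shipped as text files (e.g. CSV) in the
# benchmarks folder; an inline sample keeps this sketch runnable.
SAMPLE = "1.0\n2.5\n3.5\n"
rows = list(csv.reader(io.StringIO(SAMPLE)))

# Report the best of several repeats, which is less noisy than a
# single measurement.
best = min(timeit.repeat(lambda: run_workload(rows), number=1000, repeat=5))
print(f"best of 5 x 1000 runs: {best:.6f}s")
```

Keeping the data as plain text also means diffs stay reviewable, which addresses part of the "only code in main" concern.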