The evaluation engine is a server component that handles multiple tasks. It is currently implemented in Java, and we want to rebuild it in Python, compartmentalised by function, so that it is easier to maintain and more accessible to new contributors. One of its tasks is calculating meta-features over tabular datasets.
The engine should take tabular datasets and calculate a set of meta-features for them. Meta-features with an existing name should produce identical results, and as many of the currently available meta-features as possible should remain available. We probably want to work with PyMFE.
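To make the "identical results" requirement checkable, one option is a small comparison helper that runs over the old and new values for a dataset. The sketch below is only an illustration: the function name, the tolerances, and the dict-of-floats layout are assumptions, not part of the existing engine.

```python
import math


def compare_meta_features(old, new, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two {name: value} dicts of meta-features.

    Returns (mismatched, missing): names whose values differ beyond the
    tolerance, and names present in `old` but absent from `new`.
    The tolerances are illustrative defaults, not agreed project values.
    """
    mismatched = []
    missing = []
    for name, old_value in old.items():
        if name not in new:
            missing.append(name)
            continue
        new_value = new[name]
        # Treat NaN on both sides as a match: an undefined meta-feature
        # that stays undefined is not a regression.
        both_nan = math.isnan(old_value) and math.isnan(new_value)
        if not both_nan and not math.isclose(
            old_value, new_value, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatched.append(name)
    return mismatched, missing
```

Running this over a sample of datasets would give a list of meta-features that changed value and a list that disappeared, which are the two things the requirement above asks to avoid.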
@joaquinvanschoren you are assigned to this issue and it is listed as "in progress". Could you write down what progress there is, if any? Then unassign yourself (assuming you are not working on this).
@NathanFCarvalho worked on this from March to June. He has written a script to compute meta-features with PyMFE which works on almost all datasets (tested on about 5000 datasets, but slow on the very large ones). It's only a script because PyMFE does most of the work.
The remaining task would be to store the computed meta-features in OpenML, and rework the code so it can run as a cronjob.
Sidenote: PyMFE uses different names for the meta-features, and they can be quite cryptic. Nathan made a mapping to more understandable names. However, these are not 100% the same as the existing meta-features. We need to decide whether we want to keep the old meta-features, or exclusively use the new ones for consistency.
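For illustration, a name mapping of this kind could be applied as in the sketch below. This is not Nathan's actual mapping: the three entries in `PYMFE_TO_OPENML` and the helper `rename_meta_features` are hypothetical, and the sketch does not check whether a mapped pair is computed the same way on both sides. It assumes the input is a pair of parallel lists of names and values.

```python
# Hypothetical mapping from PyMFE-style names to OpenML-style names.
# The entries are examples only; the real mapping may differ.
PYMFE_TO_OPENML = {
    "nr_inst": "NumberOfInstances",
    "nr_attr": "NumberOfFeatures",
    "nr_class": "NumberOfClasses",
}


def rename_meta_features(names, values, mapping=PYMFE_TO_OPENML):
    """Rename meta-features given as parallel lists of names and values.

    Returns (renamed, unmapped): a {name: value} dict using the mapped
    names where one exists, and the list of names that had no mapping
    and were kept unchanged.
    """
    renamed = {}
    unmapped = []
    for name, value in zip(names, values):
        if name in mapping:
            renamed[mapping[name]] = value
        else:
            # Keep the original name so no computed value is dropped.
            renamed[name] = value
            unmapped.append(name)
    return renamed, unmapped
```

Returning the unmapped names separately makes it easy to see which new meta-features have no existing counterpart, which is the input needed for the keep-old-or-use-new decision.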
I unassigned myself since I have a lot on my plate already, but this should be a very doable and well-contained task.