-
(I converted the issue into a Discussion, I hope that's okay.) To better understand your question: I thought imblearn was supposed to help with imbalanced classes, not with feature selection. I couldn't find anything on this, could you provide a link? The issue is not that skorch would not generally work with imblearn, is it? I'm not aware of any existing solutions, but if I had this problem, here are some things I would try (in order of how practical I think they are):
-
Hi, thank you so much for the thoughtful response! I'll try to see if I can fool sklearn using the Dataset class. I am using time-series data, so I'm not sure how feasible this will be, but I haven't looked into it before, so hopefully this will solve my problem!

So sorry, I totally mixed up the package name I was using - I meant mlxtend: https://rasbt.github.io/mlxtend/api_subpackages/mlxtend.feature_selection/#sequentialfeatureselector

Thanks again for your help, I appreciate it!
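In case it helps anyone else reading, here is a rough sketch of one way 3D time-series data could be made to pass through sklearn's 2D-only selectors: keep the data flattened to 2D and reshape it back inside the module. This is an assumption of mine, not something verified in this thread, and it is a different variant from the Dataset-class idea above; `FlattenedTSNet`, the channel/time sizes, and the backbone are all made up.

```python
# Rough sketch (hypothetical FlattenedTSNet, illustrative sizes): sklearn's
# selectors only accept 2D input, so the data is kept flattened as
# (n_samples, n_channels * n_times) and reshaped back inside forward().
import torch
from torch import nn

N_CHANNELS, N_TIMES = 22, 1500   # assumed dimensions of the time series

class FlattenedTSNet(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        # Stand-in backbone for whatever time-series model is actually used.
        self.backbone = nn.Sequential(
            nn.Conv1d(N_CHANNELS, 16, kernel_size=7),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(16, n_classes),
        )

    def forward(self, X):
        # Undo the flattening that made sklearn accept the data.
        X = X.reshape(-1, N_CHANNELS, N_TIMES)
        return self.backbone(X)
```

Note that after flattening, a selector operates on individual (channel, time) columns rather than on whole channels, so this only helps when that granularity is acceptable.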
-
Hi! Have you had any success with this? I'm currently working with EEG data whose dimensionality is (number of channels x number of time points), and I'm trying to implement a wrapper feature (channel) selection method based on the ATCNet network, which takes input data of the dimensionality described above. So far, I have adapted some code from the internet for that purpose:

```python
from itertools import combinations

import numpy as np
import torch.nn as nn
from sklearn.model_selection import ShuffleSplit, cross_val_score
from skorch import NeuralNetClassifier
from skorch.callbacks import EpochScoring

from braindecode.models import ATCNet  # assuming braindecode's ATCNet; adjust to wherever yours is defined


class SequentialBackwardSearch:
    """Backward feature (channel) selection down to a given number of features."""

    def __init__(self, k_features):
        self.k_features = k_features

    def fit(self, X, y):
        dim = X.shape[1]  # number of channels
        # Baseline score with all channels.
        # NOTE: passing an already instantiated module to NeuralNetClassifier
        # means its weights are reused across fits -- this is the
        # re-initialization problem described below.
        self.module = ATCNet(n_channels=dim, n_classes=3, input_size_s=3.0,
                             sfreq=500, tcn_activation=nn.ELU(), n_windows=5)
        self.wrapped = NeuralNetClassifier(
            self.module,
            max_epochs=10,
            criterion=nn.CrossEntropyLoss(),
            lr=0.02,
            iterator_train__shuffle=True,
            callbacks=[EpochScoring(scoring='accuracy', on_train=True, name='train_acc')],
            device='cuda',
        )
        self.indices_ = tuple(range(dim))
        self.subsets_ = [self.indices_]
        score = self._calc_score(X, y, self.indices_, estimator=self.wrapped)
        self.scores_ = [score]
        del self.module
        del self.wrapped

        # Iterate until only k_features channels remain; each pass drops one channel.
        while dim > self.k_features:
            scores = []
            subsets = []
            # Candidate subsets have dim - 1 channels, so the network must be
            # built for dim - 1 input channels.
            self.module = ATCNet(n_channels=dim - 1, n_classes=3, input_size_s=3.0,
                                 sfreq=500, tcn_activation=nn.ELU(), n_windows=5)
            self.wrapped = NeuralNetClassifier(
                self.module,
                max_epochs=10,
                criterion=nn.CrossEntropyLoss(),
                lr=0.02,
                iterator_train__shuffle=True,
                callbacks=[EpochScoring(scoring='accuracy', on_train=True, name='train_acc')],
                device='cuda',
            )
            # Try every combination that leaves one channel out and record its score.
            for p in combinations(self.indices_, r=dim - 1):
                score = self._calc_score(X, y, p, self.wrapped)
                scores.append(score)
                subsets.append(p)
            # Keep the subset with the best score.
            best_score_index = np.argmax(scores)
            self.scores_.append(scores[best_score_index])
            # Record the indices of the channels that gave the best score.
            self.indices_ = subsets[best_score_index]
            self.subsets_.append(self.indices_)
            dim -= 1  # dimension is reduced by 1
        print(self.scores_)
        return self

    def transform(self, X):
        # Reduce the data to the channels that gave the best score.
        return X[:, list(self.indices_)]

    def _calc_score(self, X, y, indices, estimator):
        # Cross-validate the estimator on the selected channels only.
        cv = ShuffleSplit(2, test_size=0.3, random_state=42)
        scores = cross_val_score(estimator, X[:, list(indices)], y, cv=cv)
        return scores.mean()
```

The problem is that the model doesn't reinitialize (doesn't reset its weights) between fits. Does anyone have suggestions about this?
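One thing I have not tried yet: passing the uninstantiated `ATCNet` class plus `module__*` parameters to `NeuralNetClassifier`. As far as I understand skorch, it then builds a fresh module on every `fit()` call (and `cross_val_score` clones the classifier for each split), so the weights would be re-initialized each time. A minimal sketch under that assumption (`make_net` is just a placeholder name):

```python
# Sketch under the assumption above: the module *class* is passed, so every
# fit() re-instantiates ATCNet with freshly initialized weights.
def make_net(n_channels):
    return NeuralNetClassifier(
        ATCNet,                          # class, not an instance
        module__n_channels=n_channels,
        module__n_classes=3,
        module__input_size_s=3.0,
        module__sfreq=500,
        module__tcn_activation=nn.ELU(),
        module__n_windows=5,
        max_epochs=10,
        criterion=nn.CrossEntropyLoss,
        lr=0.02,
        iterator_train__shuffle=True,
        callbacks=[EpochScoring(scoring='accuracy', on_train=True, name='train_acc')],
        device='cuda',
    )
```

In `fit` above, `self.wrapped = make_net(dim - 1)` (and `make_net(dim)` for the baseline) would replace the manual `ATCNet(...)` / `NeuralNetClassifier(...)` construction, and `cross_val_score` would then train a freshly initialized network for every split.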
-
Hello,
I have a relatively small dataset and was wondering if there is a way to do backward sequential feature selection using skorch.
Sorry, I know this is a fairly niche question and probably outside the scope of this package. I was just hopeful that maybe someone has done this and has a suggested package that works well with skorch for this. I am also definitely open to other ways of doing feature selection that work well with skorch.
I'm used to using imblearn with sklearn classifiers, since it can select the number of features with the "parsimonious" setting and provides detailed information via the get_metric_dict() function, but imblearn requires 2D data. I also tried sklearn's SequentialFeatureSelector, which has fewer settings, but it also seems to require 2D data.
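For reference, the kind of setup I have in mind for the plain 2D case is roughly the sketch below. Everything here is illustrative (`MyModule`, the layer sizes, and the random data are made up); `nn.LazyLinear` is used only so the module accepts whatever number of features the selector happens to pass in.

```python
# Illustrative sketch: sklearn's backward SequentialFeatureSelector wrapped
# around a skorch classifier on 2D/tabular data.
import numpy as np
import torch
from torch import nn
from skorch import NeuralNetClassifier
from sklearn.feature_selection import SequentialFeatureSelector

class MyModule(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        # LazyLinear infers the input size on the first forward pass,
        # so the same module works for any feature subset the selector tries.
        self.net = nn.Sequential(
            nn.LazyLinear(32),
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, X):
        return self.net(X)

net = NeuralNetClassifier(
    MyModule,                      # pass the class, not an instance
    max_epochs=10,
    lr=0.02,
    criterion=nn.CrossEntropyLoss,
    iterator_train__shuffle=True,
    verbose=0,
)

sfs = SequentialFeatureSelector(
    net, n_features_to_select=5, direction="backward",
    scoring="accuracy", cv=3,
)

# Dummy data just to show the call pattern.
X = np.random.rand(200, 20).astype(np.float32)
y = np.random.randint(0, 2, size=200).astype(np.int64)
sfs.fit(X, y)
print(sfs.get_support())
```

My question is essentially how to get something like this working when the input is not 2D.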