Shuffling labels and coordinates #136

Open · wants to merge 15 commits into base: main

Conversation

vucinick

Implementation of shuffling labels (closes issue #80) and shuffling coordinates (closes issue #79).

Notes

  • I created a shuffle directory in the preprocessing directory as it seemed the most logical place for this.
  • Both scripts are together in this directory. Is this OK, or should I put each of them in a separate directory?
  • Both scripts were tested with my own input data as well as with the LIBD_DLPFC dataset (sample Br5292_151507).

This was linked to issues Dec 14, 2023
@niklasmueboe
Collaborator

@naveedishaque what do you think regarding the structure?

@@ -0,0 +1,6 @@
channels:
- conda-forge
- defaults
Collaborator

I would remove defaults; it shouldn't be needed here.

}

# Randomize labels
df_randomized <- data.frame(label = sample(df$label))
Collaborator

We might have an additional column in this data frame that splits the labels into high and low confidence. Should that be shuffled too?
@naveedishaque

Contributor

I think you mean "if we shuffle the label we should also shuffle the confidence"?
My feeling is to keep a low confidence spot as a low confidence spot even after label shuffling.
Does that make sense?

Collaborator

Yeah, good point. In that case the code still needs to be adjusted to keep the additional columns untouched. Make sure that only the labels are shuffled and that the row names still match all the other existing columns.
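
As a rough illustration of that requirement (not the PR's actual code), shuffling only the label column could look something like the sketch below. The example data, the `confidence` column name, and the spot IDs are assumptions made up for this illustration; only `df`, `df_randomized`, and `label` appear in the script under review.

```r
# Hypothetical input: the `confidence` column and the spot IDs are assumed
# for illustration and are not taken from the PR.
df <- data.frame(
  label = c("L1", "L2", "L3", "WM", "L1"),
  confidence = c("high", "low", "high", "high", "low"),
  row.names = c("spot_1", "spot_2", "spot_3", "spot_4", "spot_5"),
  stringsAsFactors = FALSE
)

set.seed(42)  # fixed seed so the permutation is reproducible

# Copy the full data frame, then permute only the label column.
# Indexing with sample.int() avoids the special case where sample()
# treats a single number as the range 1:x.
df_randomized <- df
df_randomized$label <- df$label[sample.int(nrow(df))]
```

Because the data frame is copied rather than rebuilt from the label vector, the row names and every other column stay aligned with the original spots, and a low-confidence spot remains low-confidence after shuffling.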

Author

Sounds good. I'll add the changes :)

@naveedishaque
Contributor

> @naveedishaque what do you think regarding the structure?

They should be separate

@niklasmueboe
Collaborator

We need to figure out how to make use of it, i.e. how to connect this to the metrics and trace back that it is a simulation. I think this probably requires a bit more thought on the workflow side.

Also wondering if we need a separate folder for simulations @naveedishaque

Development

Successfully merging this pull request may close these issues.

shuffling_labels shuffling_coordinates
3 participants