Look at distributions of features in each batch

The question was raised as to whether the feature composition of each batch is the same. I think this will be hard to check at the per-feature level because of the random dropout of highly correlated features, but there are a few metrics we can quite easily generate for each batch based just on the columns present in each CSV:

- [ ] How many total features did this batch use?
- [ ] What percent of features are Cells vs Nuclei vs Cytoplasm? (These should add to 100)
- [ ] What percent of features are Texture vs Neighbors vs AreaShape etc? (These should add to 100)
- [ ] What percent of features are RNA vs DNA vs ER vs Mito vs AGP vs BF? (These will not typically add to 100, though they may by coincidence, since AreaShape features have no channel and Colocalization features have 2)
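
A rough sketch of how these metrics could be computed from the column names alone. It assumes feature columns follow a `<Compartment>_<FeatureGroup>_...` naming pattern with channel names appearing as underscore-separated tokens later in the name, and that anything not starting with one of the three compartments (e.g. metadata columns) is not a feature. The function names `read_header` and `summarize_features` and the exact channel spellings are placeholders for illustration, not something taken from the pipeline.

```python
import csv
from collections import Counter

# Assumed naming conventions; adjust to match the actual CSV headers.
COMPARTMENTS = ("Cells", "Nuclei", "Cytoplasm")
CHANNELS = ("RNA", "DNA", "ER", "Mito", "AGP", "BF")


def read_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle))


def summarize_features(columns):
    """Summarize feature composition using only the column names."""
    # Keep only columns whose first token is a known compartment.
    features = [c for c in columns if c.split("_", 1)[0] in COMPARTMENTS]
    total = len(features)

    compartments = Counter()
    groups = Counter()
    channels = Counter()
    for name in features:
        tokens = name.split("_")
        compartments[tokens[0]] += 1
        groups[tokens[1] if len(tokens) > 1 else "Unknown"] += 1
        # A feature can mention zero, one, or two channels, so these
        # percentages are not expected to sum to 100.
        for channel in CHANNELS:
            if channel in tokens[2:]:
                channels[channel] += 1

    def as_percent(counter):
        if total == 0:
            return {}
        return {key: 100.0 * count / total for key, count in counter.items()}

    return {
        "total": total,
        "compartment_pct": as_percent(compartments),
        "group_pct": as_percent(groups),
        "channel_pct": as_percent(channels),
    }
```

Running `summarize_features(read_header(path))` once per batch CSV would give one summary per batch that can then be compared side by side. Matching channels on whole tokens rather than substrings avoids counting `ER` inside unrelated words.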
