Matched label returns `NaN` in metric calculation #215

Jieran-S · 2024-02-19T14:55:37Z

SpaceHack2023/metric/jaccard/jaccard.py

Lines 60 to 63 in 2e81727

    
    if not args.matched_labels:
        contingency_table = pd.crosstab(domains, groundtruth)
        row_ind, col_ind = linear_sum_assignment(contingency_table, maximize=True)
        domains = domains.map(dict(zip(row_ind, col_ind)))

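Some metrics (MCC, Jaccard) require matched labels. If the labels are not pre-matched, the script runs a matching algorithm (above). But when the number of domain labels exceeds the number of ground-truth labels, `linear_sum_assignment` can only pair as many domain labels as there are ground-truth labels, so every unpaired domain label is missing from the mapping and `domains.map(...)` turns it into `NaN`. The resulting `domains` object then contains many `NaN` values, which leads to downstream errors.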

Can any metric people look into it and propose a potential fix?
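One possible direction, as a minimal sketch rather than the repository's actual fix: keep the Hungarian matching, but give every domain label that is left unmatched a fresh label that does not occur in the ground truth, so it counts as a mismatch instead of becoming `NaN`. The helper name `match_labels` is hypothetical, and the sketch assumes integer labels. It also maps through the crosstab's index and column labels instead of the positional indices returned by `linear_sum_assignment`.

```python
import pandas as pd
from scipy.optimize import linear_sum_assignment


def match_labels(domains: pd.Series, groundtruth: pd.Series) -> pd.Series:
    """Hypothetical helper: relabel `domains` to best match `groundtruth`.

    Assumes integer labels. Unmatched domain labels receive new labels
    that are not present in the ground truth, so no NaN is produced.
    """
    contingency_table = pd.crosstab(domains, groundtruth)
    row_ind, col_ind = linear_sum_assignment(
        contingency_table.to_numpy(), maximize=True
    )

    # Translate positional indices back to the actual label values.
    mapping = dict(
        zip(contingency_table.index[row_ind], contingency_table.columns[col_ind])
    )

    # With more domain labels than ground-truth labels, some rows stay
    # unassigned. Give each of them a fresh integer label.
    next_label = int(max(contingency_table.columns)) + 1
    for label in contingency_table.index:
        if label not in mapping:
            mapping[label] = next_label
            next_label += 1

    return domains.map(mapping)


# Three predicted domains against two ground-truth classes.
domains = pd.Series([0, 0, 1, 1, 2, 2])
groundtruth = pd.Series([0, 0, 0, 1, 1, 1])
matched = match_labels(domains, groundtruth)
print(matched.isna().any())
```

Whether unmatched domains should be penalized as mismatches like this, or merged into their best-overlapping ground-truth label, is a design choice for the metric maintainers.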


shdam · 2024-02-20T09:53:58Z

Could this be fixed with `pd.crosstab(dropna=False)`? https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html

Also, isn't the intention to prevent over-clustering with a resolution optimization or similar, which would keep the number of domain labels from exceeding the number of ground-truth labels?

Jieran-S · 2024-02-20T10:27:41Z

Yeah, agreed. The issue also arises from models that don't convert resolution to n_cluster. But in case we want to investigate the robustness of the clustering methods in the future, it would be good to have this option, imo.

Jieran-S added the bug, help wanted, and metric labels Feb 19, 2024

Jieran-S changed the title ~~Matched label returns Nan in metric calculation~~ Matched label returns NaN in metric calculation Feb 19, 2024
