Commit f01578e

fix permutation correction for clustering scores
this fixes a bug whereby permutation-corrected clustering scores could appear to be "perfect" (i.e., have a value of 1) if a feature dimension had the same value for every word in the list. with this correction, these "invalid" clustering scores now have a corrected value of 0.5 (i.e., exactly equal to "chance").
1 parent 6c847a4 commit f01578e

1 file changed: +6 -1 lines changed

quail/analysis/clustering.py

Lines changed: 6 additions & 1 deletion
@@ -220,5 +220,10 @@ def compute_feature_weights(pres_list, rec_list, feature_list, distances):
 def _permute(egg, feature, distdict, func, n_perms=100):
     perms = [func(shuffle_egg(egg), feature, distdict, False, None) for i in range(n_perms)]
     real = func(egg, feature, distdict, False, None)
-    bools = [perm < real for perm in perms]
+
+    # permuted values that are *less* than the
+    # observed value contribute a score of 1; permuted
+    # values that are *equal* to the observed value contribute 0.5;
+    # all others (strictly greater than) contribute 0.
+    bools = [1 if perm < real else 0.5 if perm == real else 0 for perm in perms]
     return np.sum(np.array(bools), axis=0) / n_perms
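
The tie-handling rule introduced in this diff can be tried in isolation. The following is a minimal sketch, not the quail implementation: `permutation_corrected_score` is a hypothetical helper, and the observed and permuted scores are made-up numbers standing in for the values that `func` would return for the real and shuffled eggs.

```python
import numpy as np


def permutation_corrected_score(real, perms):
    """Hypothetical stand-in for the scoring rule in the diff.

    Each permuted score contributes 1 if it is below the observed
    score, 0.5 if it ties with it, and 0 if it is above it. The
    result is the mean contribution across permutations.
    """
    perms = np.asarray(perms, dtype=float)
    credit = np.where(perms < real, 1.0, np.where(perms == real, 0.5, 0.0))
    return credit.sum() / len(perms)


# Degenerate case: the feature is constant across the list, so
# shuffling cannot change the score and every permutation ties.
all_ties = permutation_corrected_score(1.0, [1.0] * 100)
print(all_ties)  # 0.5, i.e. chance

# Ordinary case: the observed score beats 3 of 4 permutations.
no_ties = permutation_corrected_score(0.8, [0.2, 0.5, 0.7, 0.9])
print(no_ties)  # 0.75

# Mixed case: one lower, two ties, one higher.
mixed = permutation_corrected_score(0.5, [0.2, 0.5, 0.5, 0.9])
print(mixed)  # 0.5
```

Giving ties half credit means a score that cannot be distinguished from any of its permutations lands exactly at 0.5, rather than at either extreme of the scale.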
