Looking at the code here:
```python
shared_counts = (
    self.store.select(
        "/main/{}/counts".format(label), "columns in ['c_0', c_last]"
    )
    .sum(axis="index")
    .values
    + 0.5
)
```
I notice that for the denominator of the enrichment ratio (`shared_counts`), the code sums all the values for `c_0` and for `c_last` and then adds a single pseudo-count of 0.5 to each sum. Later, for the numerator, the code adds a pseudo-count of 0.5 to each individual count.
Wouldn't this skew the ratios so that they no longer sum to 1? For instance, let's say `c_0 = [1, 3, 1, 2]`.
Then shared_counts = (1 + 3 + 1 + 2) + 0.5 = 7.5
And then the ratios would be:
1.5/7.5 = 0.2
3.5/7.5 = 0.467
1.5/7.5 = 0.2
2.5/7.5 = 0.333
and the sum of the ratios = 1.2 instead of 1.
I would have thought that for `shared_counts` the code would add the 0.5 pseudo-count to each count before summing (or, equivalently, add a total pseudo-count of `0.5 * len(c_0)` after summing).
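
To make the arithmetic concrete, here is a minimal standalone sketch of the two denominators for the example above. It is not Enrich2 code: the names `pseudo`, `denom_single`, and `denom_scaled` are hypothetical and exist only for this illustration.

```python
# Toy counts from the example above; not real Enrich2 data.
c_0 = [1, 3, 1, 2]
pseudo = 0.5  # pseudo-count value used in the snippet

# Numerator: the pseudo-count is added to every individual count.
numerators = [c + pseudo for c in c_0]

# Denominator as I read the current code: one pseudo-count added to the sum.
denom_single = sum(c_0) + pseudo

# Alternative: one pseudo-count per element, i.e. 0.5 * len(c_0) in total.
denom_scaled = sum(c_0) + pseudo * len(c_0)

ratios_single = [n / denom_single for n in numerators]
ratios_scaled = [n / denom_scaled for n in numerators]

print(sum(ratios_single))  # about 1.2, so the ratios do not sum to 1
print(sum(ratios_scaled))  # about 1.0
```

With the scaled denominator the ratios sum to 1 (up to floating-point rounding), which is what I would have expected.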
I may be misreading things though.
Code reference: Enrich2/enrich2/selection.py, lines 533 to 540 in bb31cfd.