maayanlab_bioinformatics.normalization package
Submodules
maayanlab_bioinformatics.normalization.cpm module
- maayanlab_bioinformatics.normalization.cpm.cpm_normalize(mat)
- maayanlab_bioinformatics.normalization.cpm.cpm_normalize(mat: ndarray)
- maayanlab_bioinformatics.normalization.cpm.cpm_normalize(mat: DataFrame)
Compute the counts-per-million (CPM) value of counts: each column is divided by the total sum of its counts and the result is multiplied by 10^6.
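The description above can be illustrated with a short NumPy sketch. This is not the library's implementation; `cpm_sketch` is a hypothetical name, and the sketch assumes genes are in rows and samples in columns.

```python
import numpy as np

def cpm_sketch(counts):
    """Hypothetical helper: divide each column by its total, then scale by 10^6."""
    counts = np.asarray(counts, dtype=float)
    # Per-column totals, kept 2-D so they broadcast against the matrix.
    totals = counts.sum(axis=0, keepdims=True)
    return counts / totals * 1e6

# Assumed layout: rows are genes, columns are samples.
counts = np.array([[10, 0],
                   [30, 50],
                   [60, 50]])
cpm = cpm_sketch(counts)
```

After this transformation every column sums to one million, so samples with different sequencing depths become comparable.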
maayanlab_bioinformatics.normalization.filter module
- maayanlab_bioinformatics.normalization.filter.filter_by_expr(mat, design: Series | None = None, group: DataFrame | None = None, min_count=10, min_total_count=15, large_n=10, min_prop=0.7, tol=1e-14)
Ported from the R package edgeR (filterByExpr): https://rdrr.io/bioc/edgeR/src/R/filterByExpr.R
- maayanlab_bioinformatics.normalization.filter.filter_by_var(mat: DataFrame, top_n=2500, axis=1)
Select the rows with the most variable expression across all samples. Takes a dataframe and returns a filtered dataframe in the same orientation, e.g.

       | condition_1 | condition_2 |
gene_1 | 1           | 1           |
gene_2 | 0           | 10          |

Here gene_1 is not variable at all, while gene_2 is very variable, so gene_1 will be dropped and gene_2 kept.
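The gene_1 / gene_2 example can be reproduced with a small pandas sketch. `filter_by_var_sketch` is a hypothetical stand-in rather than the library function: it assumes genes are rows, ranks them by per-row variance, and keeps the original row order. The real function's handling of `axis` and of output ordering may differ.

```python
import pandas as pd

def filter_by_var_sketch(df, top_n=2500):
    """Hypothetical helper: keep the top_n rows with the highest variance."""
    # Variance of each row (gene) across the columns (conditions).
    variances = df.var(axis=1)
    keep = variances.nlargest(top_n).index
    # Boolean mask so the surviving rows stay in their original order.
    return df.loc[df.index.isin(keep)]

df = pd.DataFrame(
    {"condition_1": [1, 0], "condition_2": [1, 10]},
    index=["gene_1", "gene_2"],
)
filtered = filter_by_var_sketch(df, top_n=1)
```

With `top_n=1` only gene_2 survives, since gene_1 has zero variance across the two conditions.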
maayanlab_bioinformatics.normalization.log module
- maayanlab_bioinformatics.normalization.log.log10_normalize(mat, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log10_normalize(mat: ndarray, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log10_normalize(mat: DataFrame, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log10_normalize(mat: Series, offset=1.0)
Compute the log normalization of a matrix: a simple log10(x + offset). The offset is usually set to 1 because log(0) is undefined.
- maayanlab_bioinformatics.normalization.log.log2_normalize(mat, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log2_normalize(mat: ndarray, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log2_normalize(mat: DataFrame, offset=1.0)
- maayanlab_bioinformatics.normalization.log.log2_normalize(mat: Series, offset=1.0)
Compute the log normalization of a matrix: a simple log2(x + offset). The offset is usually set to 1 because log(0) is undefined.
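Both log transforms documented above reduce to a one-line NumPy expression. The sketch below calls NumPy directly rather than the library and uses made-up input values. It shows why the offset matters: a zero count maps to 0 instead of negative infinity.

```python
import numpy as np

offset = 1.0

# Illustrative counts, including a zero that would break a plain log.
counts_a = np.array([0.0, 1.0, 9.0, 99.0])
log10_values = np.log10(counts_a + offset)

counts_b = np.array([0.0, 1.0, 3.0, 7.0])
log2_values = np.log2(counts_b + offset)
```

Here `log2_values` comes out as 0, 1, 2, 3 because the shifted inputs are exact powers of two.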
maayanlab_bioinformatics.normalization.quantile module
maayanlab_bioinformatics.normalization.quantile_legacy module
- maayanlab_bioinformatics.normalization.quantile_legacy.quantile_normalize(mat)
- maayanlab_bioinformatics.normalization.quantile_legacy.quantile_normalize(mat: ndarray)
- maayanlab_bioinformatics.normalization.quantile_legacy.quantile_normalize(mat: DataFrame)
Perform quantile normalization on the values of a matrix. In the case of a pd.DataFrame, the index is preserved on the output frame. See: https://en.wikipedia.org/wiki/Quantile_normalization
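For orientation, the textbook procedure from the linked article can be sketched as follows. This is a minimal illustration, not the library's code: `quantile_normalize_sketch` is a hypothetical name, columns are assumed to be samples, and ties are broken by position rather than averaged.

```python
import numpy as np

def quantile_normalize_sketch(mat):
    """Hypothetical helper: give every column the same value distribution."""
    mat = np.asarray(mat, dtype=float)
    # Rank of each value within its column (0 = smallest).
    ranks = np.argsort(np.argsort(mat, axis=0), axis=0)
    # Reference distribution: mean of the sorted columns at each rank.
    reference = np.sort(mat, axis=0).mean(axis=1)
    # Replace each value with the reference value at its rank.
    return reference[ranks]

mat = np.array([[5, 4, 3],
                [2, 1, 4],
                [3, 6, 6],
                [4, 2, 8]])
normalized = quantile_normalize_sketch(mat)
```

After normalization each column contains the same set of values (here 2, 3, 14/3 and 19/3), arranged according to the original within-column ranks.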
maayanlab_bioinformatics.normalization.zscore module
- maayanlab_bioinformatics.normalization.zscore.zscore_normalize(mat, ddof=0)
- maayanlab_bioinformatics.normalization.zscore.zscore_normalize(mat: ndarray, ddof=0)
- maayanlab_bioinformatics.normalization.zscore.zscore_normalize(mat: DataFrame, ddof=0)
- maayanlab_bioinformatics.normalization.zscore.zscore_normalize(mat: Series, ddof=0)
Compute the z-score of each value in the sample, relative to the sample mean and standard deviation. In the case of a pd.DataFrame, the index is preserved on the output frame.
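The z-score computation can be sketched for a single sample vector. `zscore_sketch` is a hypothetical helper, not the library function. It treats the whole 1-D input as one sample and makes no claim about which axis the library uses for 2-D input. The `ddof` argument is passed to NumPy's standard deviation, where 0 gives the population estimate.

```python
import numpy as np

def zscore_sketch(values, ddof=0):
    """Hypothetical helper: (x - mean) / std for a 1-D sample."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=ddof)

sample = np.array([2.0, 4.0, 6.0])
z = zscore_sketch(sample)
```

With `ddof=0` the result has mean 0 and standard deviation 1, and the middle value, which equals the sample mean, maps to exactly 0.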
Module contents
This module contains functions relating to data normalization.