Software Programs from Hongzhe Li's Group

Supported by NIH grants ES009911, CA127334, GM097505, GM129781 and GM123056, we have developed statistical and computational methods for the analysis of genetic, genomic and microbiome data. We list here the programs that implement some of our latest statistical methods for these data sets. Most are written in R, Matlab or C, and are available as CRAN R packages. Some data sets can also be found here, along with documentation about the programs and software installation. Note that these programs are constantly being updated. Some packages are provided as stand-alone packages and are available on other public sites.

For questions and comments, please email Dr. Hongzhe Li (hongzhe@pennmedicine.upenn.edu).

Standalone Software Developed by Our Group and Available on Public Sites (GitHub)


TransLasso (Li, Cai and Li, 2022 JRSS(B))

TransLasso package implementing transfer learning methods based on high-dimensional linear regression models.
- Sai Li, Tony Cai and Hongzhe Li
- Source code language: R

TransCLIME (Li, Cai and Li, 2022 JASA)

TransCLIME package implementing transfer learning methods for Gaussian graphical models.
- Sai Li, Tony Cai and Hongzhe Li
- Source code language: R

PermRecover (Ma, Cai and Li, 2021 JASA)

PermRecover package implementing the spectral permutation recovery method for the permuted monotone matrix model, used for estimating the bacterial growth rate.
- Rong Ma, Tony Cai and Hongzhe Li
- Source code language: R

HighD-logistic (Ma, Cai and Li, 2021 JASA)

HighD-logistic package implementing the global and simultaneous hypothesis testing methods for high-dimensional logistic regression models.
- Rong Ma, Tony Cai and Hongzhe Li
- Source code language: R

MaxBlock (Wang, Wang and Li, 2022 AOAS)

MaxBlock package implementing the sparse block signal detection and identification methods for studying shared genetic variants between two traits.
- Jianqiao Wang and Hongzhe Li
- Source code language: R

GeneCorr (Wang and Li, 2021 Biometrika)

GeneCorr package implementing the moment estimation of genetic correlation based on summary GWAS statistics.
- Jianqiao Wang and Hongzhe Li
- Source code language: R

DAFOT (Wang, Cai and Li, 2020 Biometrika)

DAFOT package implementing the detector of active flow, which can be used to identify bacterial taxa along the phylogenetic tree.
- Shulei Wang and Hongzhe Li
- Source code language: R and Python

DEMIC (Gao and Li, 2018 Nature Methods)

DEMIC package for estimating the bacterial growth rates based on genome assemblies.
- Yuan Gao, Hongzhe Li and Scott Daniel
- Source code language: Python

Micropower (Kelly et al., 2015 Bioinformatics)

Micropower package for microbiome power calculation based on PERMANOVA.
- Brandon Kelly et al.
- Source code language: R

ZIBR (Chen and Li, 2016 Bioinformatics)

ZIBR package for fitting zero-inflated mixed-effects models for repeated-measures microbiome data.
- Eric Zhang Chen and Hongzhe Li
- Source code language: R

MATCHCLIP

Program for identifying breakpoints and CNVs from next-generation sequencing data using CIGAR strings.
- Wu Y, Tian L, Pirastu M, Stambolian D, Li H (2013, Frontiers in Genetics): MATCHCLIP: Locate precise break points for copy number variation using CIGAR strings.

CRAN R Packages - Programs and Data Sets Used in Our Papers

GUniFrac

This program is available as a CRAN R package that implements a generalized UniFrac distance for the analysis of microbiome data.
- Chen J and Li H (2012).
- Source code language: R

MiRKAT (Zhao et al., 2015 Am J of Hum Genetics)

MiRKAT package for testing the association between the microbiome and an outcome, adjusting for possible covariates.
- Ni Zhao, Michael Wu et al.
- Source code language: R

R/Matlab Code for Download - Programs and Data Sets Used in Our Papers


HighD-Mixed

This .R file includes R code for inference in high-dimensional linear mixed-effects models.
- Sai Li, Tony Cai and Hongzhe Li (2022, JASA).
- Source code language: R

CompReg

This .zip file includes R code for compositional data regression from the paper "Variable selection in regression with compositional covariates".
- Wei Lin and Li H (2014, Biometrika).
- Source code language: R

GLM-constraints

This .zip file includes Matlab code and real data sets used in the paper "Generalized Linear Models with Linear Constraints for Microbiome Compositional Data".
- Jiaru Lu, Pixu Shi and Li H (2019, Biometrics).
- Source code language: Matlab

FDE

This .zip file includes Matlab code and real data sets used in the paper "Optimal estimation of genetic relatedness in high-dimensional linear models".
- Guo Z, Wang W, Cai TT and Li H (2019, JASA).
- Source code language: Matlab

Modified Rank Tests

This .zip file includes R code and real data sets used in the paper on modified rank tests for data with excessive zeros.
- Wang W, Chen EZ and Li H.
- Source code language: R