GMAT is available at https://github.com/chaoning/GMAT
Contact
Chao Ning
ningchao(at)sdau(dot)edu(dot)cn
ningchao91(at)gmail(dot)com
Install
GMAT is updated regularly. Please uninstall any older version first to obtain the latest functions. The easiest way to uninstall:
> pip uninstall gmat
Dependencies
- numpy>=1.16.0
- pandas>=0.19.0
- scipy>=1.1.1
- cffi>=1.12.0
- pandas_plink>=2.0.0
- tqdm>=4.43.0
We recommend using a Python distribution such as Anaconda (Python 3.7 version). This distribution can be used on Linux and Windows and is free. It is the easiest way to get all the required package dependencies.
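The minimum versions listed above can be checked before installing GMAT. The following is a minimal sketch, not part of GMAT: the helper names `version_tuple`, `meets_minimum` and `check_dependencies` are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib
import re

# Minimum versions copied from the dependency list above.
MINIMUM_VERSIONS = {
    "numpy": "1.16.0",
    "pandas": "0.19.0",
    "scipy": "1.1.1",
    "cffi": "1.12.0",
    "pandas_plink": "2.0.0",
    "tqdm": "4.43.0",
}


def version_tuple(version):
    """Turn '1.16.0' into (1, 16, 0); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return {package: (installed_version_or_None, is_recent_enough)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", None)
        ok = installed is not None and meets_minimum(installed, minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_dependencies().items():
        print(package, installed, "OK" if ok else "missing or too old")
```

Packages reported as missing or too old can then be installed or upgraded with pip or conda.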
Quick install
> pip install gmat
Detailed Package Install Instructions
(1) Install the dependent packages
(2) Go to the GMAT directory and run
> python setup.py install
REMMAX function
Rapid Epistatic Mixed Model Association Studies
Cite:
- Dan Wang, Hui Tang, Jian-Feng Liu, Shizhong Xu, Qin Zhang and Chao Ning. Rapid Epistatic Mixed Model Association Studies by Controlling Multiple Polygenic Effects. BioRxiv, 2020. doi: https://doi.org/10.1101/2020.03.05.976498
- Chao Ning, Dan Wang, Huimin Kang, Raphael Mrode, Lei Zhou, Shizhong Xu, Jian-Feng Liu. A rapid epistatic mixed-model association analysis by linear retransformations of genomic estimated values. Bioinformatics, 2018, 34(11): 1817-1825.
Format of the input file
Plink binary file including *.bed, *.bim and *.fam.
We recommend imputing missing genotypes with Beagle or other software before the analysis; genotypes that are still missing will be imputed locus by locus according to the frequency of occurrence.
Phenotypic file:
(1) Delimited by blanks or tabs;
(2) All individuals in the plink file must have phenotypic values. If not, please remove these individuals from the plink binary file;
(3) The first column is the family id and the second column is the individual id. These two columns must match those of the plink fam file, but the order of the individuals can be different;
(4) The last column contains the phenotypic values. Missing values are not allowed;
(5) The covariates (including the population mean) are placed before the phenotypic column. A column of 1's must be included.
An example phenotypic file with four covariates (population mean, sex, age, treated or untreated) is as follows:
12659 14462 1 0 126 0 0.58
12659 14463 1 0 91 1 0.39
12659 14464 1 1 126 0 0.37
12659 14465 1 0 91 1 0.9
12659 14466 1 0 91 1 0.84
12659 14467 1 0 91 1 0.61
12659 14468 1 1 91 1 0.84
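The rules above can be checked before running an analysis. The following is a minimal sketch with pandas, not part of GMAT: the function name `check_phenotype` is hypothetical, and the fam lines in the demonstration are made up for illustration (only the phenotype rows come from the example above).

```python
import io

import pandas as pd


def check_phenotype(pheno_file, fam_file):
    """Return a list of problems found in a phenotypic file (empty if none).

    Both inputs are blank- or tab-delimited files without a header row.
    """
    # Read the ids as strings so they compare equal across the two files.
    id_types = {0: str, 1: str}
    pheno = pd.read_csv(pheno_file, sep=r"\s+", header=None, dtype=id_types)
    fam = pd.read_csv(fam_file, sep=r"\s+", header=None, dtype=id_types)

    problems = []

    # Rule (2): every individual in the plink fam file needs a phenotype row.
    fam_ids = set(zip(fam[0], fam[1]))
    pheno_ids = set(zip(pheno[0], pheno[1]))
    without_record = fam_ids - pheno_ids
    if without_record:
        problems.append(
            "%d individual(s) in the fam file have no phenotype record"
            % len(without_record)
        )

    # Rule (4): missing values are not allowed.
    if pheno.isna().any().any():
        problems.append("phenotype file contains missing values")

    # Rule (5): the covariates sit between the ids and the last column,
    # and one of them must be a column of 1's.
    covariates = pheno.iloc[:, 2:-1]
    if not (covariates == 1).all(axis=0).any():
        problems.append("no covariate column consisting entirely of 1's")

    return problems


# Demonstration: the first three phenotype rows from the example above and
# a made-up fam file listing the same individuals in a different order.
example_pheno = io.StringIO(
    "12659 14462 1 0 126 0 0.58\n"
    "12659 14463 1 0 91 1 0.39\n"
    "12659 14464 1 1 126 0 0.37\n"
)
example_fam = io.StringIO(
    "12659 14464 0 0 2 -9\n"
    "12659 14462 0 0 1 -9\n"
    "12659 14463 0 0 1 -9\n"
)
problems = check_phenotype(example_pheno, example_fam)
print(problems or "phenotype file looks consistent")
```

With real data, pass the paths of the phenotypic file and the `*.fam` file instead of the in-memory examples.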
Exhaustive additive by additive epistasis
Data: mouse data in the directory GMAT/examples/data/mouse
Include additive and additive by additive genomic relationship matrices
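For orientation, the two relationship matrices named here can be sketched with NumPy. This is one common textbook construction (a VanRaden-type additive matrix, and the element-wise square of it for additive by additive), given under the assumption of complete 0/1/2 genotypes; the function names are hypothetical and GMAT's own routines may use a different coding or scaling.

```python
import numpy as np


def additive_grm(genotypes):
    """Additive genomic relationship matrix from 0/1/2 genotype codes.

    genotypes: array of shape (n_individuals, n_snps), no missing values.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    freq = genotypes.mean(axis=0) / 2.0          # allele frequency per locus
    centered = genotypes - 2.0 * freq            # subtract the expected code
    scale = 2.0 * np.sum(freq * (1.0 - freq))    # sum of 2pq over loci
    return centered @ centered.T / scale


def additive_by_additive_grm(grm_a):
    """Additive by additive matrix as the element-wise square of grm_a,
    rescaled so that its mean diagonal equals 1 (an assumed convention)."""
    hadamard = grm_a * grm_a
    return hadamard / (np.trace(hadamard) / hadamard.shape[0])


# Small simulated example: 10 individuals, 50 loci.
rng = np.random.RandomState(0)
toy_genotypes = rng.randint(0, 3, size=(10, 50))
grm_a = additive_grm(toy_genotypes)
grm_aa = additive_by_additive_grm(grm_a)
print(grm_a.shape, grm_aa.shape)
```

The sketch only illustrates what the matrices represent; for an actual analysis, use the matrices built by GMAT itself.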
(1) Exact test (for small data)
import logging
(2) Parallel exact test (for small data)
The analysis can be subdivided with remma_epiAA_cpu_parallel and run in parallel on different machines.
# Steps 1-3 are the same as above
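The idea of subdividing an exhaustive pairwise scan can be illustrated independently of GMAT. The sketch below assigns SNP pairs to parts by striding over the first SNP index; the function `pairs_for_part` and its arguments are hypothetical and do not describe how remma_epiAA_cpu_parallel splits the work internally.

```python
from itertools import combinations


def pairs_for_part(n_snps, n_parts, part):
    """Yield the SNP index pairs (i, j), i < j, handled by one part.

    part is 1-based. The first index i is taken every n_parts-th SNP, so
    early (long) and late (short) rows of the pair triangle are mixed and
    the parts receive a similar number of pairs.
    """
    if not 1 <= part <= n_parts:
        raise ValueError("part must be between 1 and n_parts")
    for i in range(part - 1, n_snps, n_parts):
        for j in range(i + 1, n_snps):
            yield (i, j)


# Sanity check on a toy example: 10 SNPs split into 4 parts.
n_snps, n_parts = 10, 4
collected = []
for part in range(1, n_parts + 1):
    collected.extend(pairs_for_part(n_snps, n_parts, part))

# Every pair is tested exactly once across all parts.
assert sorted(collected) == list(combinations(range(n_snps), 2))
```

Each machine would then process only the pairs of its own part, and the result files are merged afterwards.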
(3) Approximate test (recommended for big data)
import logging
(4) Parallel approximate test (recommended for big data)
The analysis can be subdivided with remma_epiAA_eff_cpu_c_parallel and run in parallel on different machines.
# Steps 1-6 are the same as above
Include additive, dominance and additive by additive genomic relationship matrices
(1) Exact test (for small data)
import logging
(2) Parallel exact test (for small data)
The analysis can be subdivided with remma_epiAA_cpu_parallel and run in parallel on different machines.
# Steps 1-3 are the same as above
(3) Approximate test (recommended for big data)
import logging
(4) Parallel approximate test (recommended for big data)
The analysis can be subdivided with remma_epiAA_eff_cpu_c_parallel and run in parallel on different machines.
# Steps 1-6 are the same as above
Include additive, dominance and three kinds of epistatic genomic relationship matrices
Additive, dominance, additive by additive, additive by dominance and dominance by dominance genomic relationship matrices
(1) Exact test (for small data)
import logging
(2) Parallel exact test (for small data)
The analysis can be subdivided with remma_epiAA_cpu_parallel and run in parallel on different machines.
# Steps 1-3 are the same as above
(3) Approximate test (recommended for big data)
import logging
(4) Parallel approximate test (recommended for big data)
The analysis can be subdivided with remma_epiAA_eff_cpu_c_parallel and run in parallel on different machines.
# Steps 1-6 are the same as above