
Efficient Identification of Complex Diseases Through Epistasis Computational Models: A Review

R Manavalan, S Priya



Genome-Wide Association Studies (GWAS) identify and characterize genes associated with human diseases. A significant line of ongoing GWAS research is the identification of disease-susceptibility genes through epistasis. Epistasis is the phenomenon in which one gene masks or modifies the effect of another gene; such interactions between genes are known as epistatic or gene-gene interactions (GGIs). GWAS identify genetic variants in the form of Single Nucleotide Polymorphisms (SNPs), as well as interactions between SNPs, to determine disease susceptibility. Manual analysis of thousands of SNP interactions is impractical for physicians. Hence, various statistical approaches and machine learning techniques have been proposed to identify genetic interactions in complex diseases such as Rheumatoid Arthritis (RA), Crohn’s disease, Bipolar Disorder (BD), Coronary Artery Disease and diabetes. This paper surveys the data mining techniques, machine learning methods and statistical approaches used to identify GGIs from the Wellcome Trust Case Control Consortium (WTCCC) seven-disease dataset and from Crohn’s disease, RA, BD, Type I Diabetes (T1D) and Type II Diabetes (T2D) datasets. The issues facing computational approaches that identify diseases through epistatic effects, and the parameters used by various researchers, are also analyzed.


Keywords: Disease, epistasis, genes, genetic variations, GWAS, interactions, SNPs

Cite this Article

R Manavalan, S Priya. Efficient Identification of Complex Diseases Through Epistasis Computational Models: A Review. Research & Reviews: A Journal of Bioinformatics. 2020; 7(2): 21–32p.





Copyright (c) 2020 Research & Reviews: A Journal of Bioinformatics