Chen, Zhishan; Song, Wenqiang; Li, Qing; Li, Chao; Wen, Wanqing; Huyghe, Jeroen R.; Law, Philip J.; Fernandez-Rozadilla, Ceres; Timofeeva, M. N.; Thomas, Minta; Schmit, Stephanie L.; Martin, Vicente; Devall, Matthew A. M.; Dampier, Christopher Heaton; Moratalla-Navarro, Ferran; Cai, Qiuyin; Wang, Jifeng; Shi, Jiajun; Kweon, Sun-seog; Tanikawa, Chizu; Jia, Weihua; Shu, Xiang; Long, Jirong; Gao, Jing; Kim, Jeongseon; Shin, Aesun; Matsuo, Keitaro; Jee, Sun-ha; Jung, Keum-ji; Wang, Nan; Kim, Dong-hyun; Ping, Jie; Yang, Gong; Shin, Minho; Ren, Zefang; Oh, Jae-hwan; Oze, Isao; Ahn, Yoon-ok; Gao, Yutang; Pan, Zhizhong; Kamatani, Yoichiro; van Kaer, Luc; Wu, Lan; Li, Bingshan; Matsuda, Koichi; Shu, Xiaoou; Hsu, Li; Dunlop, Malcolm G.; Gruber, Stephen Bernard; Houlston, Richard S.; Tomlinson, Ian P. M.; Li, Li; Lau, Ken S.; Moreno, Victor R.; Casey, Graham R.; Peters, Ulrike; Zheng, Wei; & Guo, Xingyi. (2026). Mixed-model and transcriptome-wide association analyses identify transcription factors and genes associated with colorectal cancer susceptibility. Nature Communications, 17(1), 1377. https://doi.org/10.1038/s41467-025-68127-z
Transcription factors (TFs) are proteins that bind to DNA and turn genes on or off. Some genetic variants linked to colorectal cancer (CRC) may change how these transcription factors bind to DNA, but the specific TFs involved are not well understood. In this study, researchers analyzed 218 TF ChIP-Seq datasets, which map where transcription factors bind across the genome, together with large genome-wide association study (GWAS) data from more than 100,000 people with CRC and over 150,000 people without CRC from East Asian and European populations. They identified 51 transcription factors and TF–cofactor interactions, including cofactors of the vitamin D receptor (VDR), as important regulators of CRC risk.
To better understand how these regulatory changes affect genes, the researchers combined their findings with transcriptome-wide association studies (TWAS), which estimate how genetically predicted gene expression relates to disease risk. They also examined alternative splicing (different ways RNA messages are assembled) and alternative polyadenylation (differences in how RNA molecules are finalized) using RNA sequencing data from individuals of Asian and European ancestry. This multi-ancestry TWAS identified 222 genes associated with CRC risk, including 95 newly discovered genes and 48 genes that are potential drug targets. Single-cell RNA sequencing provided additional biological support for about 45 percent of these genes, and laboratory experiments confirmed that three genes (RHPN2, IRS2, and TXN) have cancer-promoting, or oncogenic, roles. Overall, this study maps important transcription factor–gene regulatory networks and uncovers new genes that contribute to colorectal cancer risk.

Fig. 1: Associations between TFs and CRC risk using generalized linear mixed models.
A A flow chart illustrating the integrative analysis of ChIP-seq data (n = 218) for 84 TFs and CRC GWAS summary statistics from 100,204 cases and 154,587 controls of European and East Asian ancestry. B The 51 identified TFs for which genetic variation in TF-DNA binding was significantly associated with CRC risk. P-values were determined by a two-sided Wald Z test. The dashed line represents a Bonferroni-corrected P < 0.05. C The host motifs of the identified TFs were enriched in their ChIP-seq peaks. D Analysis of co-occupied binding regions of the top 10 CRC risk-associated TFs. Venn diagrams in the upper-right triangle show the number of genetic variants (multiplied by 1000) that are occupied by specific TFs or co-occupied by two TFs in each TF pair. Bar plots in the lower-left triangle show the association strengths (regression coefficients) for the genetic variants occupied by both TFs, only the first TF, and only the second TF, respectively, from left to right. TF pairs with significant interactions at the Bonferroni threshold of P < 3.92 × 10⁻⁵ (0.05/1,275 TF pairs from 51 TFs) are highlighted in red. P-values were determined by a two-sided Wald Z test.
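
The pairwise threshold quoted in the caption can be checked with simple arithmetic: 51 TFs give 51 × 50 / 2 = 1,275 unordered pairs, and dividing 0.05 by that count gives the Bonferroni cutoff. The short Python sketch below reproduces that number and shows how a two-sided Wald Z test turns a regression coefficient and its standard error into a P-value. It is only an illustration, not the authors' code. The function names and the example coefficient and standard error are hypothetical and are not taken from the study.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cutoff under a Bonferroni correction."""
    return alpha / n_tests


def wald_two_sided_p(beta: float, se: float) -> float:
    """Two-sided P-value for a Wald Z test of beta against zero.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)),
    where Phi is the standard normal CDF.
    """
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


# Number of unordered TF pairs among the 51 identified TFs.
n_tfs = 51
n_pairs = math.comb(n_tfs, 2)

# Pairwise interaction cutoff quoted in the caption.
pair_threshold = bonferroni_threshold(0.05, n_pairs)
print(n_pairs, f"{pair_threshold:.2e}")  # 1275 3.92e-05

# Hypothetical coefficient and standard error, for illustration only.
example_p = wald_two_sided_p(beta=0.12, se=0.02)
print(example_p < pair_threshold)
```

The same cutoff function applies to the single-TF analysis in panel B, with the number of tests set to the number of TFs tested.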