Identification of lncRNA Signature Associated With Pan-cancer Prognosis

Published in IEEE Journal of Biomedical and Health Informatics, 2020

Recommended citation: Guoqing Bao, Ran Xu, Xiuying Wang, Jianxiong Ji, Linlin Wang, Wenjie Li, Qing Zhang, Bin Huang, Anjing Chen, Beihua Kong, Qifeng Yang, Xinyu Wang, Jian Wang, Xingang Li. (2020). "Identification of lncRNA Signature Associated With Pan-cancer Prognosis" IEEE Journal of Biomedical and Health Informatics. doi: 10.1109/JBHI.2020.3027680. https://doi.org/10.1109/JBHI.2020.3027680

Long noncoding RNAs (lncRNAs) have emerged as potential prognostic markers in various human cancers because they participate in many malignant behaviors. However, the value of lncRNAs as prognostic markers across diverse human cancers is still under investigation, and a systematic signature based on these transcripts that relates to pan-cancer prognosis has yet to be reported. In this study, we proposed a framework that incorporates statistical power, biological rationale, and machine learning models for pan-cancer prognosis analysis. The framework identified a 5-lncRNA signature (ENSG00000206567, PCAT29, ENSG00000257989, LOC388282, and LINC00339) from TCGA training studies (n=1,878). The identified lncRNAs are significantly associated (all P≤1.48E-11) with overall survival (OS) in the TCGA cohort (n=4,231). The signature stratified the cohort into low- and high-risk groups with significantly distinct survival outcomes (median OS of 9.84 years versus 4.37 years, log-rank P=1.48E-38) and achieved a time-dependent ROC/AUC of 0.66 at 5 years. When routine clinical factors were included, the signature performed better for long-term prognostic estimation (AUC of 0.72). Moreover, the signature was further evaluated on two independent external cohorts (TARGET, n=1,122; CPTAC, n=391; National Cancer Institute), which yielded similar prognostic values (AUC of 0.60 and 0.75; log-rank P=8.6E-09 and P=2.7E-06). An indexing system was developed to map the 5-lncRNA signature to the prognoses of pan-cancer patients. In silico functional analysis indicated that the lncRNAs are associated with common biological processes driving human cancers. The five lncRNAs, especially ENSG00000206567, ENSG00000257989, and LOC388282, which have not been reported before, may serve as viable molecular targets common among diverse cancers.
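The stratification step described in the abstract (score each patient, split into low- and high-risk groups, compare survival with a log-rank test) can be illustrated with a minimal sketch. This is not the authors' implementation: the weights in `WEIGHTS`, the median cutoff, the synthetic cohort, and all function names are assumptions made purely for illustration, and the paper's actual model and indexing system are not reproduced here.

```python
import math
import random

# The five signature lncRNAs named in the abstract.
SIGNATURE = ["ENSG00000206567", "PCAT29", "ENSG00000257989",
             "LOC388282", "LINC00339"]

# HYPOTHETICAL weights, for illustration only (not taken from the paper).
WEIGHTS = {"ENSG00000206567": 0.5, "PCAT29": -0.4, "ENSG00000257989": 0.3,
           "LOC388282": 0.6, "LINC00339": 0.2}


def risk_score(expr):
    """Weighted sum of the five expression values (assumed linear form)."""
    return sum(WEIGHTS[g] * expr[g] for g in SIGNATURE)


def split_by_median(scores):
    """Return True for patients whose score is above the cohort median."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])
    return [s > median for s in scores]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                      # at risk
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def simulate_cohort(n, seed=0, follow_up=10.0):
    """SYNTHETIC cohort: hazard grows with the risk score; censor at follow_up."""
    rng = random.Random(seed)
    times, events, scores = [], [], []
    for _ in range(n):
        expr = {g: rng.gauss(0.0, 1.0) for g in SIGNATURE}
        score = risk_score(expr)
        t = rng.expovariate(0.1 * math.exp(score))
        times.append(min(t, follow_up))
        events.append(t <= follow_up)
        scores.append(score)
    return times, events, scores


if __name__ == "__main__":
    times, events, scores = simulate_cohort(200)
    high_risk = split_by_median(scores)
    chi2, p = logrank_test(times, events, high_risk)
    print(f"log-rank chi2={chi2:.2f}, p={p:.3g}")
```

In practice a dedicated survival-analysis library would normally replace the hand-written log-rank function; it is spelled out here only to keep the sketch dependency-free.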

Access the full paper at https://doi.org/10.1109/JBHI.2020.3027680

Open-source Code in GitHub: link