Ongoing Research Projects

Cancer Genomics

I have contributed to MuSE, our lab's somatic mutation calling software and pipeline. I have also worked with clinicians and specialists to define the mutational landscapes of anaplastic thyroid cancer and esophageal cancer.

miRNAs associated with TmS

miRNAs are key regulators of gene expression. Our lab has recently published TmS, a metric for quantifying the total mRNA expression of a tumor sample. Using data from TCGA, I am working to better understand TmS by identifying cancer-specific and pan-cancer associations with miRNAs.

Computational Deconvolution

Computational deconvolution has been a revolutionary tool for identifying cell-type-specific and tumor-specific patterns from bulk RNA-seq data. However, no tool has yet been created or benchmarked for deconvolution of miRNA data. My main thesis project aims to address this need.

Tumor Evolution & Subclonal Reconstruction

Cancers originate from a single cell that acquires oncogenic mutations. The descendants of that cell share these clonal mutations, and subpopulations then arise that carry additional, subclonal mutations. Identifying clonal and subclonal mutations is key to studying tumor evolution and heterogeneity, and will be critical to precision oncology. Our method, CliP, is a fast and accurate tool for this kind of analysis.

Publications

Estimation of tumor cell total mRNA expression in 15 cancer types predicts disease progression

Single-cell RNA sequencing studies have suggested that total mRNA content correlates with tumor phenotypes. Technical and analytical challenges, however, have so far impeded at-scale pan-cancer examination of total mRNA content. Here we present a method to quantify tumor-specific total mRNA expression (TmS) from bulk sequencing data, taking into account tumor transcript proportion, purity and ploidy, which are estimated through transcriptomic/genomic deconvolution. We estimate and validate TmS in 6,590 patient tumors across 15 cancer types, identifying significant inter-tumor variability. Across cancers, high TmS is associated with increased risk of disease progression and death. TmS is influenced by cancer-specific patterns of gene alteration and intra-tumor genetic heterogeneity as well as by pan-cancer trends in metabolic dysregulation. Taken together, our results indicate that measuring cell-type-specific total mRNA expression in tumor cells predicts tumor phenotypes and clinical outcomes.
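As a rough illustration of how tumor transcript proportion, purity, and ploidy might combine into a single score, the sketch below assumes a simple two-component mixture: if pi is the fraction of bulk mRNA coming from tumor cells and rho is the fraction of cells that are tumor cells, then pi = rho * S_T / (rho * S_T + (1 - rho) * S_N), where S_T and S_N are the total mRNA per tumor and non-tumor cell. Rearranging gives S_T / S_N, and dividing each side by ploidy gives a per-haploid-genome ratio. The function names, parameter names, and the default normal ploidy of 2 are assumptions made for illustration; this is one plausible reading of the description above, not the published implementation.

```python
def total_mrna_ratio(pi: float, rho: float) -> float:
    """Per-cell total mRNA ratio S_T / S_N under a two-component mixture.

    pi  : fraction of bulk mRNA from tumor cells (hypothetical parameter name)
    rho : tumor purity, the fraction of cells that are tumor cells
    """
    if not (0.0 < pi < 1.0 and 0.0 < rho < 1.0):
        raise ValueError("pi and rho must lie strictly between 0 and 1")
    # From pi / (1 - pi) = rho * S_T / ((1 - rho) * S_N)
    return (pi * (1.0 - rho)) / ((1.0 - pi) * rho)


def tms_sketch(pi: float, rho: float, tumor_ploidy: float,
               normal_ploidy: float = 2.0) -> float:
    """Ploidy-adjusted ratio: (S_T / tumor_ploidy) / (S_N / normal_ploidy).

    The default normal ploidy of 2 is an assumption of this sketch.
    """
    if tumor_ploidy <= 0.0 or normal_ploidy <= 0.0:
        raise ValueError("ploidy values must be positive")
    return total_mrna_ratio(pi, rho) * normal_ploidy / tumor_ploidy
```

Under these assumptions, a sample in which tumor cells contribute a larger share of the mRNA than of the cells (pi greater than rho) yields a per-cell ratio above 1, and a higher tumor ploidy lowers the adjusted score for the same pi and rho.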

Impact of Somatic Mutations on Survival Outcomes in Patients With Anaplastic Thyroid Carcinoma

Anaplastic thyroid carcinoma (ATC) uniformly presents as aggressive disease, but the mutational landscape of tumors varies. We aimed to determine whether tumor mutations affect survival outcomes in ATC. Patients who underwent mutation sequencing using targeted gene panels between 2005 and 2019 at a tertiary referral center were included. Associations between mutation status and survival outcomes were assessed using Cox proportional hazards models. A total of 202 patients were included, of whom 122 (60%) died of ATC. The median follow-up was 31 months (interquartile range, 18-45 months). The most common mutations were in TP53 (59%), BRAF (41%), TERT promoter (37%), and the RAS gene family (22%). Clinicopathologic characteristics and overall survival (OS) significantly correlated with mutations in BRAF V600E and RAS, which were mutually exclusive. Mutation analysis provides prognostic information in ATC and should be incorporated into routine clinical care.

MuSE: A Novel Approach to Mutation Calling with Sample-Specific Error Modeling

Shuangxi Ji, Matthew D. Montierth & Wenyi Wang. In: Variant Calling, Methods in Molecular Biology (MIMB), volume 2493, pp 21–27. First online: 26 June 2022.

Accurate detection of somatic mutations in genetically heterogeneous tumor cell populations using next-generation sequencing remains challenging. We have developed MuSE, Mutation calling using a Markov Substitution model for Evolution, a novel approach for modeling the evolution of the allelic composition of tumor and normal tissue at each reference base. It adopts a sample-specific error model to depict inter-tumor heterogeneity, which greatly improves the overall accuracy. Here, we describe the method and provide a tutorial on the installation and application of MuSE.

CliP: subclonal architecture reconstruction of cancer cells in DNA sequencing data using a penalized likelihood model

Subpopulations of tumor cells characterized by mutation profiles may confer differential fitness and consequently influence prognosis of cancers. Understanding subclonal architecture has the potential to provide biological insight into tumor evolution and advance precision cancer treatment. Recent methods comprehensively integrate single nucleotide variants (SNVs) and copy number aberrations (CNAs) to reconstruct subclonal architecture using whole-genome or whole-exome sequencing (WGS, WES) data from bulk tumor samples. However, the commonly used Bayesian methods require a large amount of computational resources, a priori knowledge of the number of subclones, and extensive post-processing. A regularized likelihood modeling approach, not previously explored for subclonal reconstruction, can inherently address these drawbacks. We therefore propose a model-based method, Clonal structure identification through pair-wise Penalization, or CliP, for clustering subclonal mutations without prior knowledge or post-processing. The CliP model is applicable to genomic regions with or without CNAs. CliP demonstrates high accuracy in subclonal reconstruction through extensive simulation studies. Utilizing the well-established regularized likelihood framework, CliP takes only 16 hours to process WGS data from 2,778 tumor samples in the ICGC-PCAWG study, and 38 hours to process WES data from 9,564 tumor samples in the TCGA study. In summary, a penalized likelihood framework for subclonal reconstruction will help address intrinsic drawbacks of existing methods and expand the scope of computational analysis for cancer evolution in large cancer genomic studies.

A pedigree-based prediction model identifies carriers of deleterious de novo mutations in families with Li-Fraumeni syndrome

De novo mutations (DNMs) are increasingly recognized as rare disease causal factors. Identifying DNM carriers will allow researchers to study the likely distinct molecular mechanisms of DNMs. We developed Famdenovo to predict DNM status (DNM or familial mutation [FM]) of deleterious autosomal dominant germline mutations for any syndrome. We introduce Famdenovo.TP53 for Li-Fraumeni syndrome (LFS) and analyze 324 LFS family pedigrees from four US cohorts: a validation set of 186 pedigrees and a discovery set of 138 pedigrees. The concordance index for Famdenovo.TP53 prediction was 0.95 (95% CI: [0.92, 0.98]). Forty individuals (95% CI: [30, 50]) were predicted as DNM carriers, increasing the total number from 42 to 82. We compared clinical and biological features of FM versus DNM carriers: (1) cancer and mutation spectra along with parental ages were similarly distributed; (2) ascertainment criteria like early-onset breast cancer (age 20–35 yr) provide a condition for an unbiased estimate of the DNM rate: 48% (23 DNMs vs. 25 FMs); and (3) hotspot mutation R248W was not observed in DNMs, although it was as prevalent as hotspot mutation R248Q in FMs. Furthermore, we introduce Famdenovo.BRCA for hereditary breast and ovarian cancer syndrome and apply it to a small set of family data from the Cancer Genetics Network. In summary, we introduce a novel statistical approach to systematically evaluate deleterious DNMs in inherited cancer syndromes. Our approach may serve as a foundation for future studies evaluating how new deleterious mutations can be established in the germline, such as those in TP53.

Get in touch