Load MDTToolset

library(MDTToolset)

Prepare Input File

The following datasets ship with MDTToolset. To see how they were prepared, see the tutorial: Introduction to Most Dominant Transcripts.

cond1_mdts <- MDTToolset::cond1_mdts
cond2_mdts <- MDTToolset::cond2_mdts
cond1_ENST_ENSG_nonredundant <- MDTToolset::cond1_ENST_ENSG_nonredundant
cond2_ENST_ENSG_nonredundant <- MDTToolset::cond2_ENST_ENSG_nonredundant
# read MANE Select
ensg_enst_ensp_mane_v104 <- MDTToolset::ensg_enst_ensp_mane_v104
# prepare the sequences and the MANE Select transcripts data frame
ensg_enst_seq <- prepare_Seq(MDTToolset::Homo_sapiens.GRCh38.cdna.all, ensg_enst_ensp_mane_v104)

## Select transcripts having MANE ids
mane_id <- ensg_enst_ensp_mane_v104$RefSeq.match.transcript..MANE.Select.
enst_mane_iso <- ensg_enst_ensp_mane_v104[!(is.na(mane_id) | mane_id == ''), c(1, 2, 3, 4)]

# Select only the ENSTs that have an ENSP protein ID.
ensp_only <- ensg_enst_ensp_mane_v104[ensg_enst_ensp_mane_v104$Protein.stable.ID != "", ]
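
The filter above compares Protein.stable.ID against the empty string. If the column could also contain NA (an assumption; the shipped table may never have missing values there), a comparison with "" yields NA and base R returns an all-NA row for it. The following sketch on a hypothetical toy table (`toy` and its values are made up for illustration) shows a variant that drops both empty and missing IDs:

```r
# Hypothetical toy table; the real data is ensg_enst_ensp_mane_v104.
toy <- data.frame(
  Transcript.stable.ID = c("ENST_A", "ENST_B", "ENST_C"),
  Protein.stable.ID    = c("ENSP_A", "", NA),
  stringsAsFactors     = FALSE
)

# Keep rows whose protein ID is neither NA nor an empty string.
has_protein   <- !is.na(toy$Protein.stable.ID) & nzchar(toy$Protein.stable.ID)
toy_ensp_only <- toy[has_protein, ]

print(toy_ensp_only)
```

Only the first toy row survives the filter, because it is the only one with a non-empty, non-missing protein ID.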

Calculate MDT Switching Events in the Data

The switch_calculator() function identifies MDT switching events in each sample by comparing it with a “Normal” cohort or another specified cohort, thereby defining dMDTs (disease-specific MDTs). It applies the following cutoffs:

  • cutoff: The maximum percentage of samples in the comparison group (cond2 in our case) in which the candidate dMDT may itself be observed as the MDT. A cutoff of 0 is the most stringent setting and ensures that the dMDT is not observed in any cond2 sample.
  • cutoff_rate: The enrichment rate required between Transcript 1 and Transcript 2 in the cond1_mdts and cond2_mdts datasets.
  • cutoff_other_MDTs: The percentage of cond2 samples that have another MDT for the same gene.
# Calculate switches in condition 1 compared to condition 2
Cond1_Cond2_Switches <- switch_calculator(
  cancer_ori = cond1_mdts,
  gtex_ori = cond2_mdts,
  exp_values_pcawg = cond1_ENST_ENSG_nonredundant,
  exp_values_gtex = cond2_ENST_ENSG_nonredundant,
  ensp_sequences = ensg_enst_seq,
  ensp_only = ensp_only,
  cutoff = 0,
  cutoff_rate = 2,
  cutoff_other_MDTs = 50
)

# visualize the results. Note that the knitr R package must be installed.
knitr::kable(head(Cond1_Cond2_Switches,5), caption = 'MDT Switching Events')
MDT Switching Events
SampleID ENSG dMDT ENST2_cancer TPM1_cancer TPM2_cancer enrichment p_value relative_cancer_exp relative_gtex_exp MDT_GTEx adj_p
tpm_kallisto_macular_retina-1 ENSG00000020256 ENST00000346617 ENST00000216923 3.342410 1.644050 2.033034 0.0002441 0.3602014 0.1078287 ENST00000216923 0.0002712
tpm_kallisto_macular_retina-1 ENSG00000062282 ENST00000605099 ENST00000228027 0.936847 0.345076 2.714901 0.0002441 0.6704946 0.0000000 ENST00000376262,ENST00000228027 0.0002712
tpm_kallisto_macular_retina-1 ENSG00000070669 ENST00000422745 ENST00000455086 13.171600 6.408810 2.055233 0.0002441 0.2935822 0.0937156 ENST00000455086 0.0002712
tpm_kallisto_macular_retina-1 ENSG00000083454 ENST00000345901 ENST00000552456 5.319620 1.283410 4.144911 0.0002441 0.6319887 0.0000000 ENST00000552456 0.0002712
tpm_kallisto_macular_retina-1 ENSG00000089597 ENST00000356638 ENST00000346178 48.838600 19.508100 2.503504 0.0002441 0.5505801 0.2352723 ENST00000346178 0.0002712
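
In the rows shown above, the enrichment column matches the ratio TPM1_cancer / TPM2_cancer to within rounding, and every value is at least the cutoff_rate of 2 used in the call. The sketch below recomputes that ratio from the first three rows of the table. The data frame name `first_rows` is hypothetical, and whether switch_calculator() derives the column in exactly this way internally is an assumption based only on these numbers.

```r
# First three rows of the 'MDT Switching Events' table, copied by hand.
first_rows <- data.frame(
  dMDT        = c("ENST00000346617", "ENST00000605099", "ENST00000422745"),
  TPM1_cancer = c(3.342410, 0.936847, 13.171600),
  TPM2_cancer = c(1.644050, 0.345076, 6.408810),
  enrichment  = c(2.033034, 2.714901, 2.055233)
)

# Recompute the expression ratio of the dMDT to the second transcript.
first_rows$ratio <- first_rows$TPM1_cancer / first_rows$TPM2_cancer

# Compare with the reported enrichment and with the cutoff_rate used above.
cutoff_rate <- 2
first_rows$matches_reported <- abs(first_rows$ratio - first_rows$enrichment) < 1e-4
first_rows$passes_cutoff    <- first_rows$ratio >= cutoff_rate

print(first_rows)
```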

Calculate Functional Impact of Transcript Switches by Integrating the Transcript Interaction Network

Isoform Interaction Network

Different transcripts of a gene can contain different exons, which changes the protein domains they encode. Such changes may disrupt interactions with protein interaction partners. To assess the functional impact of the switching events, we integrate the transcript interaction network with the switching results. MDTToolset provides an isoform network, IsoNetwork_v12, which is based on STRING v12. The network has the following columns:

  • ENSG: Ensembl gene ID
  • ENSP: Ensembl protein ID
  • ENST: Ensembl transcript ID
  • STRINGensp: STRING protein ID, considered canonical
  • STRINGintN: Number of proteins that STRINGensp interacts with in the STRING database
  • ExistIntN: Number of interactions the ENST retains
  • MissIntN: Number of interactions the ENST has lost
  • RelMissIntN: Proportion of interactions lost (0 to 1)
  • ExistInts: Existing interactions of the ENST
  • MissInts: Lost interactions of the ENST

IsoNetwork_v12 <- MDTToolset::IsoNetwork_v12_subsetData
knitr::kable(head(IsoNetwork_v12,5), caption = 'Isoform Interaction Network')
Isoform Interaction Network
ENSG ENSP ENST STRINGensp STRINGintN ExistIntN MissIntN RelMissIntN ExistInts MissInts
2507 ENSG00000017260 ENSP00000432956 ENST00000533801 ENSP00000421326 39 6 1 0.143 PF00122:E1-E2_ATPase:167-362:PF03381:CDC50:ENSP00000230461:402:1,PF00122:E1-E2_ATPase:167-362:PF00122:E1-E2_ATPase:ENSP00000263519:599:1,PF00122:E1-E2_ATPase:167-362:PF00122:E1-E2_ATPase:ENSP00000349877:425:1,PF00122:E1-E2_ATPase:167-362:PF00122:E1-E2_ATPase:ENSP00000350310:600:1,PF00122:E1-E2_ATPase:167-362:PF00122:E1-E2_ATPase:ENSP00000353414:629:1,PF00122:E1-E2_ATPase:167-362:PF00122:E1-E2_ATPase:ENSP00000392043:631:1 PF00689:Cation_ATPase_C:760-932:PF05366:Sarcolipin:ENSP00000435380:413:1
2573 ENSG00000018625 ENSP00000376066 ENST00000392233 ENSP00000354490 102 7 13 0.650 PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000218548:725:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000332756:414:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000353414:421:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000357060:936:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000440045:873:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000444688:978:1,PF00122:E1-E2_ATPase:161-352:PF00122:E1-E2_ATPase:ENSP00000445306:936:1 PF00689:Cation_ATPase_C:796-1005:PF00287:Na_K-ATPase:ENSP00000218008:960:1,PF00689:Cation_ATPase_C:796-1005:PF00287:Na_K-ATPase:ENSP00000250111:993:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000270310:599:1,PF00689:Cation_ATPase_C:796-1005:PF00287:Na_K-ATPase:ENSP00000286371:978:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000292079:978:1,PF00689:Cation_ATPase_C:796-1005:PF00287:Na_K-ATPase:ENSP00000334216:770:1,PF00689:Cation_ATPase_C:796-1005:PF04272:Phospholamban:ENSP00000350132:860:1,PF00689:Cation_ATPase_C:796-1005:PF00287:Na_K-ATPase:ENSP00000356790:996:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000442756:709:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000473361:558:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000473929:583:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000481244:946:1,PF00689:Cation_ATPase_C:796-1005:PF02038:ATP1G1_PLM_MAT8:ENSP00000482442:453:1
91915 ENSG00000204427 ENSP00000418331 ENST00000477462 ENSP00000379282 56 0 12 1.000 - PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000247706:448:1,PF00561:Abhydrolase_1:280-410:PF12146:Hydrolase_4:ENSP00000250974:560:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000268129:571:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000273359:541:1,PF00561:Abhydrolase_1:280-410:PF12697:Abhydrolase_6:ENSP00000273596:461:1,PF00561:Abhydrolase_1:280-410:PF12697:Abhydrolase_6:ENSP00000289119:649:1,PF00561:Abhydrolase_1:280-410:PF12146:Hydrolase_4:ENSP00000302657:415:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000326491:501:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000336693:577:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000365063:589:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000365725:855:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000414558:545:1
1 ENSG00000000003 ENSP00000362111 ENST00000373020 ENSP00000362111 17 0 0 - - -
2 ENSG00000000003 ENSP00000482130 ENST00000612152 ENSP00000362111 17 0 0 - - -
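
For the three example rows above that have interaction data, RelMissIntN is consistent with MissIntN / (ExistIntN + MissIntN), rounded to three decimals. The sketch below checks this on values copied from the table. The formula is inferred from these rows only and is an assumption about how the column was built; `iso_rows` is a hypothetical name.

```r
# ExistIntN, MissIntN and RelMissIntN copied from the example rows above.
iso_rows <- data.frame(
  ENST        = c("ENST00000533801", "ENST00000392233", "ENST00000477462"),
  ExistIntN   = c(6, 7, 0),
  MissIntN    = c(1, 13, 12),
  RelMissIntN = c(0.143, 0.650, 1.000)
)

# Share of domain-level interactions that the transcript has lost.
total      <- iso_rows$ExistIntN + iso_rows$MissIntN
recomputed <- ifelse(total > 0, iso_rows$MissIntN / total, NA_real_)

iso_rows$recomputed <- round(recomputed, 3)
print(iso_rows)
```

Rows with neither kept nor lost interactions (shown as "-" in the table) have a zero denominator, which is why the sketch returns NA for that case.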

# read the ENSG/ENST/ENSP to Pfam annotation table
ENSG_ENST_ENSP_Pfam <- MDTToolset::ENSG_ENST_ENSP_Pfam

Calculate the Functional Impact

To check whether the interaction partners of an ENST (lost or kept) are expressed in the sample, we implemented the function buildIsoNet(), which takes four inputs:

  • dMDT_network_info: The switching results merged with the Isoform Network
  • ensembl_data: ENSG_ENST_ENSP_Pfam
  • kallisto_counts: cond1_ENST_ENSG_nonredundant
  • iso_network: IsoNetwork_v12

# we first merge the switching results with Isoform Network
dMDTs_IsoNetwork <- merge(Cond1_Cond2_Switches, IsoNetwork_v12, by.x='dMDT', by.y='ENST')
dMDTs_IsoNetwork_int <- dMDTs_IsoNetwork[dMDTs_IsoNetwork$MissInts != '-' | dMDTs_IsoNetwork$ExistInts != '-',]

Switching_Events_Network <- buildIsoNet(
  dMDT_network_info = dMDTs_IsoNetwork_int,
  ensembl_data = ENSG_ENST_ENSP_Pfam,
  kallisto_counts = cond1_ENST_ENSG_nonredundant,
  iso_network = IsoNetwork_v12
)
knitr::kable(head(Switching_Events_Network,2), caption = 'Interaction Lost of dMDTs')
Interaction Lost of dMDTs
dMDT SampleID ENSG.x ENST2_cancer TPM1_cancer TPM2_cancer enrichment p_value relative_cancer_exp relative_gtex_exp MDT_GTEx adj_p ENSG.y ENSP STRINGensp STRINGintN ExistIntN MissIntN RelMissIntN ExistInts MissInts ensp_expressed_missed ensp_expressed_kept int_losts_in_mdt
3 ENST00000477462 tpm_kallisto_macular_retina-2 ENSG00000204427 ENST00000496579 0.517667 0.171691 3.015109 0.0002441 0.609764 0 ENST00000395952,ENST00000482224 0.0002712 ENSG00000204427 ENSP00000418331 ENSP00000379282 56 0 12 1.000 - PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000247706:448:1,PF00561:Abhydrolase_1:280-410:PF12146:Hydrolase_4:ENSP00000250974:560:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000268129:571:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000273359:541:1,PF00561:Abhydrolase_1:280-410:PF12697:Abhydrolase_6:ENSP00000273596:461:1,PF00561:Abhydrolase_1:280-410:PF12697:Abhydrolase_6:ENSP00000289119:649:1,PF00561:Abhydrolase_1:280-410:PF12146:Hydrolase_4:ENSP00000302657:415:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000326491:501:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000336693:577:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000365063:589:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000365725:855:1,PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000414558:545:1 ENSP00000247706,ENSP00000250974,ENSP00000268129,ENSP00000273359,ENSP00000273596,ENSP00000289119,ENSP00000302657,Not_expressed,ENSP00000336693,ENSP00000365063,ENSP00000365725,ENSP00000414558 No kept int MDT_is_not_found
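
In the row above, ensp_expressed_missed lists one entry per lost interaction in MissInts: either the partner's ENSP ID or Not_expressed. The sketch below works on a shortened excerpt of that row (the first, second and eighth entries) and pairs each lost partner with its expression status. The positional pairing of the two columns is an assumption inferred from this single row, and the variable names are hypothetical.

```r
# Shortened excerpt of MissInts from the row above (entries 1, 2 and 8).
miss_ints <- paste(
  "PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000247706:448:1",
  "PF00561:Abhydrolase_1:280-410:PF12146:Hydrolase_4:ENSP00000250974:560:1",
  "PF00561:Abhydrolase_1:280-410:PF00561:Abhydrolase_1:ENSP00000326491:501:1",
  sep = ","
)
# Matching excerpt of ensp_expressed_missed.
ensp_expressed_missed <- "ENSP00000247706,ENSP00000250974,Not_expressed"

# Pull the partner protein IDs out of the interaction strings.
partners <- regmatches(miss_ints, gregexpr("ENSP[0-9]+", miss_ints))[[1]]

# Split the status column and flag partners that are expressed in the sample.
status    <- strsplit(ensp_expressed_missed, ",", fixed = TRUE)[[1]]
expressed <- status != "Not_expressed"

lost_partners <- data.frame(partner = partners, expressed = expressed)
print(lost_partners)
```

Counting the TRUE values in `expressed` gives the number of lost interactions whose partner is actually expressed in the sample, which is the subset most likely to matter functionally.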

Visualize the Network of Lost Interactions

The visualize_network() function takes two inputs: an ENSP ID and the Isoform Network (IsoNetwork_v12). Using the igraph R package, it draws the interaction network of the protein encoded by the transcript of interest.

Example 1: All Interactions are Lost

visualize_network('ENSP00000418331', IsoNetwork_v12)

#> NULL

Example 2: Some interactions are lost while some are kept

visualize_network('ENSP00000432956', IsoNetwork_v12)

#> NULL
visualize_network('ENSP00000376066', IsoNetwork_v12)

#> NULL

Check Expression Level of dMDT:MDT in the sample

plotExpression('ENST00000346617', 'ENST00000216923', 'tpm_kallisto_macular_retina-1', cond1_ENST_ENSG_nonredundant, cond2_ENST_ENSG_nonredundant)