The Dependency Map (DepMap) is a genome-wide pooled CRISPR-Cas9 knockout proliferation screen conducted in more than 700 cancer cell lines spanning many different tumor lineages. Each cell line in the DepMap contains a unique barcode, and each gene knockout is assigned a “dependency score” on a per-cell-line basis that quantifies the rate of CRISPR-Cas9 guide dropout. Genes with similar DepMap scores across cell lines, known as co-dependent genes, have been found to have closely related biological functions. This can include activity in the same or parallel pathways or membership in the same protein complex.
We identified the seven strongest co-dependent genes (“Symbol”) for DUBs and ran GO enrichment analysis. We used the BioGRID, IntAct, Pathway Commons, and NURSA protein-protein interaction databases (PPIDs) to determine whether co-dependent genes interact with one another. The “Evidence” column lists the PPIDs in which the interaction appears, as well as whether the association is supported by an INDRA statement. As another approach to identifying potential interactors, we looked at proteomics data from the Broad Institute's Cancer Cell Line Encyclopedia (CCLE) for proteins whose expression across ~375 cell lines strongly correlated with the abundance of each DUB; it has previously been observed that proteins in the same complex are frequently significantly co-expressed. The correlations and associated p-values in the CCLE proteomics dataset are provided. Finally, we determined whether co-dependent genes yield similar transcriptomic signatures in the Broad Institute's Connectivity Map (CMap). A CMap score greater than 90 is considered significantly similar.
| Symbol | Name | DepMap Correlation | Evidence | CCLE Correlation | CCLE Z-score | CCLE p-value (adj) | CCLE Significant | CMAP Score | CMAP Type | 
|---|---|---|---|---|---|---|---|---|---|
| SUPT20H | SPT20 homolog, SAGA complex component | 0.542 | BioGRID IntAct Pathway Commons INDRA (1) Reactome (3) | 0.06 | 0.24 | 3.83e-01 | |||
| TADA1 | transcriptional adaptor 1 | 0.509 | BioGRID IntAct INDRA (1) Reactome (3) | 0.12 | 0.59 | 5.20e-02 | |||
| TAF5L | TATA-box binding protein associated factor 5 like | 0.496 | BioGRID IntAct INDRA (1) Reactome (3) | 0.11 | 0.50 | 8.95e-02 | |||
| TADA2B | transcriptional adaptor 2B | 0.472 | BioGRID IntAct INDRA (1) Reactome (7) | 0.19 | 0.95 | 2.41e-03 | |||
| TAF6L | TATA-box binding protein associated factor 6 like | 0.462 | BioGRID IntAct INDRA (1) Reactome (3) | 0.34 | 1.77 | 9.52e-10 | |||
| TADA3 | transcriptional adaptor 3 | 0.444 | BioGRID IntAct INDRA (1) Reactome (7) | 0.07 | 0.30 | 2.92e-01 | |||
| ATXN7 | ataxin 7 | 0.438 | BioGRID INDRA (6) Reactome (7) | 0.11 | 0.52 | 8.76e-02 | |||
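The co-dependency ranking summarized in the table above can be illustrated with a minimal sketch. It assumes the “DepMap Correlation” is a Pearson correlation of dependency scores across cell lines, which the text does not state explicitly; the gene names `GENE_A` to `GENE_C`, the toy scores, and the helper names `pearson` and `top_codependent` are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_codependent(target, scores, k=7):
    """Rank every other gene by |r| against the target and keep the top k."""
    ranked = [
        (gene, pearson(scores[target], vec))
        for gene, vec in scores.items()
        if gene != target
    ]
    ranked.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]

# Hypothetical toy dependency scores: one list per gene, one value per cell line.
toy_scores = {
    "USP22": [-0.9, -0.2, -0.7, -0.1],
    "GENE_A": [-0.8, -0.3, -0.6, -0.2],  # tracks USP22 closely
    "GENE_B": [-0.1, -0.9, -0.2, -0.8],  # strongly anti-correlated
    "GENE_C": [-0.5, -0.5, -0.4, -0.6],  # weaker relationship
}

top = top_codependent("USP22", toy_scores, k=2)
for gene, r in top:
    print(f"{gene}\t{r:.3f}")
```

Ranking by absolute value means strongly anti-correlated genes are retained alongside positively correlated ones, matching the "highest absolute value" criterion used for the enrichment analysis.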
Gene set enrichment analysis was performed on the genes correlated with USP22 using terms from the Gene Ontology and gene sets derived from the Gene Ontology Annotations database via MSigDB.
Several gene set enrichment analyses were considered, using the biological processes and other Gene Ontology terms from well-characterized DUBs as a positive control. Threshold-less methods like GSEA had relatively poor results. Over-representation analysis with a threshold of the top 7 highest absolute value Dependency Map correlations yielded the best results and is reported below.
| GO Identifier | GO Name | GO Type | p-value | p-value (adj.) | q-value | 
|---|---|---|---|---|---|
| GO:0070461 | SAGA-type complex | Cellular Component | 4.07e-17 | 3.70e-15 | 1.07e-15 | 
| GO:0030914 | STAGA complex | Cellular Component | 8.12e-16 | 7.38e-14 | 1.07e-14 | 
| GO:0031248 | protein acetyltransferase complex | Cellular Component | 8.77e-14 | 7.98e-12 | 7.69e-13 | 
| GO:0000124 | SAGA complex | Cellular Component | 8.84e-12 | 8.04e-10 | 5.81e-11 | 
| GO:0034212 | peptide N-acetyltransferase activity | Molecular Function | 1.64e-11 | 1.49e-09 | 8.64e-11 | 
| GO:0008080 | N-acetyltransferase activity | Molecular Function | 4.22e-11 | 3.84e-09 | 1.85e-10 | 
| GO:0016407 | acetyltransferase activity | Molecular Function | 1.13e-10 | 1.03e-08 | 3.90e-10 | 
| GO:0016410 | N-acyltransferase activity | Molecular Function | 1.19e-10 | 1.08e-08 | 3.90e-10 | 
| GO:0018394 | peptidyl-lysine acetylation | Biological Process | 9.99e-10 | 9.09e-08 | 2.92e-09 | 
| GO:0016569 | covalent chromatin modification | Biological Process | 1.48e-09 | 1.34e-07 | 3.89e-09 | 
| GO:0006473 | protein acetylation | Biological Process | 2.49e-09 | 2.27e-07 | 5.96e-09 | 
| GO:0043966 | histone H3 acetylation | Biological Process | 2.73e-09 | 2.49e-07 | 5.99e-09 | 
| GO:0043543 | protein acylation | Biological Process | 6.80e-09 | 6.19e-07 | 1.38e-08 | 
| GO:0016746 | transferase activity, transferring acyl groups | Molecular Function | 9.55e-09 | 8.69e-07 | 1.79e-08 | 
| GO:1905368 | peptidase complex | Cellular Component | 1.64e-08 | 1.49e-06 | 2.87e-08 | 
| GO:0003713 | transcription coactivator activity | Molecular Function | 2.58e-08 | 2.35e-06 | 4.24e-08 | 
| GO:0018205 | peptidyl-lysine modification | Biological Process | 7.26e-08 | 6.61e-06 | 1.12e-07 | 
| GO:0016591 | RNA polymerase II, holoenzyme | Cellular Component | 2.79e-06 | 2.54e-04 | 4.08e-06 | 
| GO:0030880 | RNA polymerase complex | Cellular Component | 6.70e-06 | 6.10e-04 | 9.29e-06 | 
| GO:0033276 | transcription factor TFTC complex | Cellular Component | 1.06e-05 | 9.64e-04 | 1.39e-05 | 
| GO:0061695 | transferase complex, transferring phosphorus-containing groups | Cellular Component | 8.16e-05 | 7.43e-03 | 9.34e-05 | 
| GO:0070646 | protein modification by small protein removal | Biological Process | 1.29e-04 | 1.18e-02 | 1.42e-04 | 
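The over-representation analysis behind the table above can be sketched with a one-sided hypergeometric test, a common choice for this kind of analysis. The report does not specify which test or multiple-testing correction was used, so the choice of test, the function name `hypergeom_sf`, and every number in the example (background size, term size, overlap) are assumptions for illustration only.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """Return P(X >= k) for a hypergeometric draw.

    M: number of genes in the background
    n: number of background genes annotated with the GO term
    N: number of selected genes (here, the top co-dependent genes)
    k: number of selected genes annotated with the GO term
    """
    upper = min(n, N)
    favorable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favorable / comb(M, N)

# Hypothetical example: 7 selected genes, 6 of which fall in a GO term
# that annotates 20 genes out of an assumed background of 20,000.
p_value = hypergeom_sf(k=6, M=20000, n=20, N=7)
print(f"p = {p_value:.3g}")
```

With only seven selected genes, a small and specific term can reach a very low p-value from a handful of hits, which is consistent with complex-level terms dominating the top of the table.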
The following table shows the significantly differentially expressed genes after knocking out USP22 using CRISPR-Cas9.
| Symbol | Name | log2-fold-change | p-value | p-value (adj.) | 
|---|---|---|---|---|
| USP22 | ubiquitin specific peptidase 22 | -1.60e+00 | 7.36e-14 | 1.62e-09 | 
| DNAJB1 | DnaJ heat shock protein family (Hsp40) member B1 | 1.02e+00 | 9.34e-10 | 1.03e-05 | 
There were too few differentially expressed genes to run a meaningful GSEA.
INDRA was used to automatically assemble known mechanisms related to USP22 from literature and knowledge bases. The first section shows only DUB activity and the second shows all other results.