Rapid growth in publicly available “functional genomic” datasets affords an opportunity for extensive analysis of large gene families such as the human DUBs. We mined a total of nine public resources measuring the following, in most cases starting with a CRISPR-Cas9 knockout or small-molecule signature that we collected in our laboratories using high-throughput DGE RNA-seq:
A summary table lists the data and insights gained from each resource.
Comparison of enriched gene sets or GO terms made it possible to bridge different types of data. Overall, we observed substantial and encouraging consistency among datasets. For example, genes identified as co-dependent with DUBs in DepMap data frequently exhibited similar transcript signatures, were co-expressed across cell lines, and encoded proteins that were physically associated.
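To make the notion of co-dependency concrete, the sketch below ranks genes by how closely their dependency scores track those of a query gene across cell lines. It assumes that co-dependency is scored as a Pearson correlation of per-cell-line knockout effects. The gene labels (`DUB_X`, `GENE_A`, `GENE_B`), the score values, and the helper `rank_codependencies` are hypothetical and serve only as illustration; they are not data or code from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / (norm_x * norm_y)


def rank_codependencies(query, scores):
    """Rank every other gene by its correlation with the query gene.

    `scores` maps a gene label to its dependency scores, one value per
    cell line, with cell lines in the same order for every gene.
    """
    ranked = [
        (gene, pearson(scores[query], vector))
        for gene, vector in scores.items()
        if gene != query
    ]
    # Highest (most positive) correlation first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy dependency scores for five hypothetical cell lines (illustrative only).
toy_scores = {
    "DUB_X": [-1.0, -0.2, -0.8, 0.1, -0.5],
    "GENE_A": [-0.9, -0.1, -0.7, 0.0, -0.6],
    "GENE_B": [0.1, -0.9, 0.0, -1.0, -0.3],
}

ranking = rank_codependencies("DUB_X", toy_scores)
for gene, r in ranking:
    print(f"{gene}\t{r:.2f}")
```

In this toy example `GENE_A` tracks `DUB_X` closely and ranks first, whereas `GENE_B` is anti-correlated. In a real analysis, the top-ranked genes would then be compared against transcript signatures, co-expression, and protein interaction data.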
Our analysis yielded three types of information:
Our studies provide a diverse set of data on the DUB family as a whole, as well as new insight into many individual DUBs, including several that have been studied intensively. One theme that emerges is that for genes with multiple proposed functions (for example, USP7 and UCHL5), combining profiling of CRISPR-Cas9 knockouts or drug-induced perturbations with systematic mining of functional genomic databases makes it possible to distinguish between essential and non-essential phenotypes. A second is that more DUBs than anticipated have non-redundant roles in tumor suppressor and oncogenic pathways, most notably TP53 regulation, suggesting new approaches to undruggable targets. The approaches described in this work are directly applicable to other gene families and therapeutic targets.