Natasa Przulj

Barcelona Supercomputing Center - Centro Nacional de Supercomputación

Advances in omics technologies have revolutionized cancer research by producing massive datasets. A common approach to deciphering these complex data is to embed molecular interaction networks: embedding algorithms find a low-dimensional space in which similarities between the network nodes are best preserved. Currently available embedding approaches mine the gene embeddings directly to uncover new cancer-related knowledge. However, these gene-centric approaches produce incomplete knowledge, since they do not account for the functional implications of genomic alterations. We propose a new, function-centric perspective and approach to complement the knowledge obtained from omic data.

We introduce our Functional Mapping Matrix (FMM) to explore the functional organization of different tissue-specific and species-specific embedding spaces generated by a Non-negative Matrix Tri-Factorization algorithm. We also use our FMM to define the optimal dimensionality of these molecular interaction network embedding spaces. For this optimal dimensionality, we compare the FMMs of the most prevalent cancers in humans to the FMMs of their corresponding control tissues. We find that cancer alters the positions of cancer-related functions in the embedding space, while it preserves the positions of non-cancer-related ones. We exploit this spatial ‘movement’ to predict novel cancer-related functions. Finally, we predict novel cancer-related genes that currently available gene-centric methods cannot identify; we validate these predictions by literature curation and retrospective analyses of patient survival data.
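
To make the notion of an embedding space produced by Non-negative Matrix Tri-Factorization more concrete, the following is a minimal sketch on a toy network. It assumes plain multiplicative updates with no orthogonality constraints, which may differ from the variant used in this work, and it does not attempt to reproduce the FMM itself. The function name `nmtf`, its parameters, the toy adjacency matrix, and the choice of `G @ S` as node coordinates are all hypothetical and chosen only for illustration.

```python
import numpy as np

def nmtf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (n x m) as G @ S @ P.T.

    Illustrative assumption: unconstrained multiplicative updates that
    minimize the squared Frobenius error; G, S and P stay non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G = rng.random((n, k)) + eps   # row factor    (n x k)
    S = rng.random((k, k)) + eps   # core matrix   (k x k)
    P = rng.random((m, k)) + eps   # column factor (m x k)
    for _ in range(n_iter):
        # Each factor is rescaled element-wise by the ratio of the negative
        # and positive parts of the gradient of the reconstruction error.
        G *= (X @ P @ S.T) / (G @ S @ P.T @ P @ S.T + eps)
        P *= (X.T @ G @ S) / (P @ S.T @ G.T @ G @ S + eps)
        S *= (G.T @ X @ P) / (G.T @ G @ S @ P.T @ P + eps)
    return G, S, P

# Toy stand-in for a molecular interaction network: two disconnected
# triangles, written as an adjacency matrix with self-loops.
A = np.kron(np.eye(2), np.ones((3, 3)))

G, S, P = nmtf(A, k=2)
embedding = G @ S   # one k-dimensional row per node (assumed convention)
rel_error = np.linalg.norm(A - G @ S @ P.T) / np.linalg.norm(A)
```

Here `k` plays the role of the dimensionality of the embedding space, and `rel_error` gives a rough sense of how well a given `k` reconstructs the input. This reconstruction error is only a generic diagnostic and should not be confused with the FMM-based criterion for optimal dimensionality described above.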