A multi-task machine learning pipeline for the classification and analysis of cancers from gene expression data