Ovarian-Cancer-Subtypes-Identification
10 Nov, 2020
Our model is a logistic regression classifier trained on subtype labels obtained by k-means clustering. We followed the paper's specifications and techniques as closely as possible, changing certain values to improve the accuracy score. The patients' multi-omics data are fed into a denoising autoencoder to generate a representation z. The patients are then clustered on z using k-means. The optimal number of clusters was determined using the silhouette score: we tested k in the range [2, 8] and chose k=2, which had the highest silhouette score. Using the cluster labels from k-means, we then built a lightweight mRNA model with logistic regression to reduce the number of genes needed to identify cancer subtypes.
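The clustering and classification steps described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the autoencoder output z and the mRNA matrix are replaced by synthetic arrays, the range [2, 8] is treated as inclusive, the helper name pick_k_by_silhouette is hypothetical, and ranking genes by absolute coefficient is only one possible way to shrink the gene set.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import silhouette_score


def pick_k_by_silhouette(z, k_values=range(2, 9), seed=0):
    """Run k-means for each candidate k and keep the k with the best silhouette score."""
    scores, labelings = {}, {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(z)
        scores[k] = silhouette_score(z, labels)
        labelings[k] = labels
    best_k = max(scores, key=scores.get)
    return best_k, labelings[best_k], scores


# Synthetic stand-ins (assumption: NOT the project's data).
rng = np.random.default_rng(0)
n_per_group = 60
# "z" plays the role of the denoising autoencoder output: two separated groups.
z = np.vstack([
    rng.normal(-3.0, 1.0, (n_per_group, 8)),
    rng.normal(3.0, 1.0, (n_per_group, 8)),
])
# Hypothetical mRNA matrix: 200 genes, of which the first 5 differ between groups.
mrna = rng.normal(0.0, 1.0, (2 * n_per_group, 200))
mrna[n_per_group:, :5] += 2.0

# Step 1: choose k by silhouette score and take the k-means labels.
best_k, labels, scores = pick_k_by_silhouette(z)

# Step 2: fit logistic regression on all genes to predict the cluster labels.
full_clf = LogisticRegression(max_iter=1000).fit(mrna, labels)

# Step 3 (assumption): keep the genes with the largest absolute coefficients
# and refit a smaller model on those genes only.
n_keep = 5
importance = np.abs(full_clf.coef_).max(axis=0)
top_genes = np.argsort(importance)[::-1][:n_keep]
light_clf = LogisticRegression(max_iter=1000).fit(mrna[:, top_genes], labels)

print("chosen k:", best_k)
print("genes kept:", sorted(top_genes.tolist()))
```

On real data, z would come from the trained denoising autoencoder, and the reduced model should be evaluated on held-out patients rather than on the data it was fitted to.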
Publisher: https://github.com/garimasingh128/Ovarian-Cancer-Subtypes-Identification