Genetic models to predict the development of colorectal cancer. (2021)
Type of Content: Theses / Dissertations
Thesis Discipline: Biological Sciences
Degree Name: Master of Science
Publisher: University of Canterbury
Background: Survival rates for colorectal cancer are highest when the cancer is diagnosed at an early stage, but very few cancers are diagnosed before they progress to later stages. A model that could predict who will develop colorectal cancer from genetic information would allow targeted screening of high-risk individuals. Genome-wide association studies (GWAS) have identified ~100 genetic variants (single-nucleotide polymorphisms, SNPs) that are individually associated with the development of colorectal cancer, but models built using these SNPs do not identify all high-risk individuals (AUC of 0.629).
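As an illustration of the quantities mentioned above, the following is a minimal sketch of a polygenic risk score (a weighted sum of risk-allele counts) and of the AUC used to judge it. The function names, weights, and toy data are hypothetical and are not taken from the thesis; the thesis's actual models and software are not described in this record.

```python
from typing import Sequence


def polygenic_risk_score(allele_counts: Sequence[int],
                         weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP).

    The weights stand in for per-SNP effect sizes (e.g. log odds ratios
    from a GWAS); the values used below are invented for illustration.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("one weight is required per SNP")
    return sum(c * w for c, w in zip(allele_counts, weights))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) scores
    higher than a randomly chosen control (label 0); ties count as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data: 4 individuals x 3 SNPs, with hypothetical effect sizes.
toy_weights = [0.10, 0.20, 0.30]
toy_genotypes = [[0, 1, 0], [1, 0, 1], [2, 1, 1], [1, 2, 2]]
toy_labels = [0, 0, 1, 1]  # 1 = developed colorectal cancer
toy_scores = [polygenic_risk_score(g, toy_weights) for g in toy_genotypes]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls, which is why a value such as 0.629 leaves many high-risk individuals unidentified.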
Methods: To improve the performance of polygenic risk score models, three methods were tested: first, the use of rare allele principal components; second, the identification of clusters of colorectal cancer patients with the same underlying genetic causes of cancer; third, the incorporation of interactions within gradient boosted tree models.
Results: Both rare and common allele principal components were found to identify population groups, but this did not improve the performance of models to predict the development of colorectal cancer. Clusters representing similar underlying genetic causes of colorectal cancer could not be identified, although models that predict the location of colorectal cancer performed significantly better than models built with linear discriminant analysis (p-value = 0.022). The use of gradient boosted tree models significantly improved the performance of models to predict the development of colorectal cancer, compared with linear models for the same dataset (p-value = 0.0258). However, there was only weak evidence of interactions in the gradient boosted tree models. When variables were selected with random forests or gradient boosted trees, some of the SNPs selected had missing genotypes that were highly favourable or unfavourable for colorectal cancer (odds ratios of 0.446 and 1.77).
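To make the odds ratios for missing genotypes concrete, the sketch below computes an odds ratio from a 2x2 table of cases and controls split by whether a SNP's genotype is missing. The function name and all counts are invented for illustration; they are not the thesis's data and do not reproduce its reported values.

```python
def odds_ratio(cases_missing: int, cases_called: int,
               controls_missing: int, controls_called: int) -> float:
    """Odds ratio of disease for 'genotype missing' versus 'genotype called'.

    Values above 1 mean a missing genotype is more common among cases
    (unfavourable); values below 1 mean it is more common among controls
    (favourable).
    """
    if min(cases_missing, cases_called,
           controls_missing, controls_called) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_in_cases = cases_missing / cases_called
    odds_in_controls = controls_missing / controls_called
    return odds_in_cases / odds_in_controls


# Hypothetical counts for one SNP: 100 cases and 100 controls.
example_or = odds_ratio(cases_missing=30, cases_called=70,
                        controls_missing=20, controls_called=80)
```

With these invented counts the odds ratio is (30/70) / (20/80), roughly 1.71. Tree-based models can split on a "missing" category directly, which is one way such a signal could enter variable selection.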
Conclusion: The performance of models that identify individuals at high risk of developing colorectal cancer may be improved through the use of gradient boosted tree models. The treatment of missing genotypes warrants further study because of the strong odds ratios attached to some missing genotypes.
Rights: All Rights Reserved