Genetic models to predict the development of colorectal cancer.

Ainsworth, Rachel

dc.contributor.author	Ainsworth, Rachel
dc.date.accessioned	2021-10-21T22:53:38Z
dc.date.available	2021-10-21T22:53:38Z
dc.date.issued	2021	en
dc.description.abstract	Background: Survival rates for colorectal cancer are highest when the cancer is diagnosed at an early stage, but very few cancers are diagnosed before they progress to later stages. A model that could predict who will develop colorectal cancer from genetic information would allow targeted screening of high-risk individuals. Genome-wide association studies (GWAS) have identified ~100 genetic variants (SNPs) that are individually associated with the development of colorectal cancer, but models built using these SNPs do not identify all high-risk individuals (AUC of 0.629). Methods: To improve the performance of polygenic risk score models, three methods were tested: first, the use of rare allele principal components; second, the identification of clusters of colorectal cancer patients with the same underlying genetic causes of cancer; third, the incorporation of interactions within gradient boosted tree models. Results: Both rare and common allele principal components identified population groups, but this did not improve the performance of models to predict the development of colorectal cancer. Clusters representing similar underlying genetic causes of colorectal cancer could not be identified, although models that predict the location of colorectal cancer performed significantly better than models built with linear discriminant analysis (p-value=0.022). The use of gradient boosted tree models significantly improved the performance of models to predict the development of colorectal cancer, compared with linear models for the same dataset (p-value=0.0258). However, there was only weak evidence of interactions in the gradient boosted tree models. When variables were selected with random forests or gradient boosted trees, some of the SNPs selected had missing genotypes that were highly favourable or unfavourable for colorectal cancer (odds ratios of 0.446 and 1.77).
Conclusion: The performance of models to identify individuals at high risk of developing colorectal cancer may be improved through the use of gradient boosted tree models. The treatment of missing genotypes warrants further study because of the strong odds ratios attached to some missing genotypes.	en
dc.identifier.uri	https://hdl.handle.net/10092/102763
dc.identifier.uri	http://dx.doi.org/10.26021/11897
dc.language	English
dc.language.iso	en
dc.publisher	University of Canterbury	en
dc.rights	All Rights Reserved	en
dc.rights.uri	https://canterbury.libguides.com/rights/theses	en
dc.title	Genetic models to predict the development of colorectal cancer.	en
dc.type	Theses / Dissertations	en
thesis.degree.discipline	Biological Sciences	en
thesis.degree.grantor	University of Canterbury	en
thesis.degree.level	Masters	en
thesis.degree.name	Master of Science	en
uc.bibnumber	3103197
uc.college	Faculty of Science	en

Files

Original bundle

Name: Ainsworth, Rachel_final Master's Thesis.pdf
Size: 3.56 MB
Format: Adobe Portable Document Format

Collections

Science: Theses and Dissertations