Show simple item record

dc.identifier.uri: http://hdl.handle.net/11401/76527
dc.description.sponsorship: This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. (en_US)
dc.format: Monograph
dc.format.medium: Electronic Resource (en_US)
dc.language.iso: en_US
dc.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dc.type: Dissertation
dcterms.abstract: A gene pathway typically refers to a group of genes and small molecules that work together to control one or more cell functions. In systems biology, pathway analysis is of paramount importance, and recent studies have revealed that the malfunction of gene pathways can induce disease manifestations such as cancer. Usually, a gene pathway consists of two components: the upstream factors, which are signaling molecules that transmit stimuli from the cell surface to the nucleus, and the downstream factors, which respond to cell signaling through changes in their expression levels. Although several methods have been reported for the analysis of gene pathways, almost all of them focus on the upstream factors of a pathway and ignore the rich information in the downstream factors. In this thesis work, we first investigated and compared the existing gene pathway analysis methods, focusing on the three most popular ones: Gene Set Enrichment Analysis (GSEA), Principal Component Analysis (PCA), and Canonical Discriminant Analysis (CDA). We then proposed an innovative method that integrates the statistical information from both upstream and downstream factors to infer differential gene pathways. More specifically, the Relax Intersection-Union Test (RIUT) framework was employed to combine evidence from upstream and downstream factors. We performed intensive simulation studies with GSEA, PCA, and CDA. We identified both the limitations and the strengths of these methods under various data structures, as well as the scenarios in which each method can outperform the others. Furthermore, we demonstrated that our proposed combining method outperforms these existing methods in terms of both statistical power and biological interpretability. We applied the combining method to two real data sets: the p53 data set and the essential thrombocythaemia data set. The results suggest that, in the combining method, GSEA is more appropriate for the upstream subgroup and CDA is more powerful for the downstream subgroup, owing to their distinct data structures.
dcterms.available: 2017-09-20T16:50:33Z
dcterms.contributor: Zhu, Wei (en_US)
dcterms.contributor: Wu, Song (en_US)
dcterms.contributor: Yang, Jie (en_US)
dcterms.contributor: Cao, Jian. (en_US)
dcterms.creator: Zhang, Qiao
dcterms.dateAccepted: 2017-09-20T16:50:33Z
dcterms.dateSubmitted: 2017-09-20T16:50:33Z
dcterms.description: Department of Applied Mathematics and Statistics. (en_US)
dcterms.extent: 130 pg. (en_US)
dcterms.format: Application/PDF (en_US)
dcterms.format: Monograph
dcterms.identifier: http://hdl.handle.net/11401/76527
dcterms.issued: 2015-12-01
dcterms.language: en_US
dcterms.provenance: Made available in DSpace on 2017-09-20T16:50:33Z (GMT). No. of bitstreams: 1 Zhang_grad.sunysb_0771E_12189.pdf: 4079656 bytes, checksum: 353080c853f2104ff34665a65c32a4ed (MD5) Previous issue date: 1 (en)
dcterms.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subject: gene pathway, microarray
dcterms.subject: Statistics
dcterms.title: Identification of Differential Gene Pathways in Microarray Data
dcterms.type: Dissertation

