
dc.identifier.uri: http://hdl.handle.net/11401/76571
dc.description.sponsorship: This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. [en_US]
dc.format: Monograph
dc.format.medium: Electronic Resource [en_US]
dc.language.iso: en_US
dc.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dc.type: Dissertation
dcterms.abstract: This thesis proposes a novel statistical method, based on the generalized linear errors-in-variables model, for comparing two measurement platforms, one with discrete and one with continuous outcomes. The method overcomes a limitation of the classical platform comparison approach, which relies on linear models and can accommodate only two continuous outcome measures. The method was applied to two gene expression measurement platforms: Microarray (continuous) and RNA-Seq (discrete). The comparison result is further validated by differentially expressed gene analysis and biological pathway analysis. The proposed approach can play a significant role in 1) systematically assessing emerging platforms against existing ones and 2) serving as a foundation for integrating data sets generated on different platforms. To perform the platform comparison, a model relating Microarray and RNA-Seq gene expression profiles is built on established distributional assumptions, with the purpose of estimating fixed and proportional biases. From both a biological and a technical point of view, the variation and dispersion in the measured expression profiles are considered gene-specific. Realistic models of whole-genome expression profile data sets therefore contain a large number of nuisance parameters, while each platform features only a limited number of replicates because of the high cost of measuring samples on both platforms. Consequently, substituting the usual estimates of these parameters, obtained from the limited replicates, into the model's likelihood function often proves unreliable: the estimates have large variances and do not lead to appropriate estimates of the biases. Additionally, because the number of parameters in the model is often in the tens of thousands, estimating the nuisance parameters by maximum likelihood is computationally infeasible.
To overcome these limitations, we developed a customized estimation method for the proposed generalized linear errors-in-variables model based on unbiased estimating equations (UEE), which yield estimators in analytical form, in lieu of maximum likelihood estimation. Under suitable distributional assumptions for the platforms, the new estimator is proven theoretically to converge to the underlying truth up to a small bias, which is due to the inherent low counts in the discrete platform. The performance of the proposed method was first evaluated on simulated data sets with a modest number (three, five and ten) of replicates, and the method was subsequently applied to compare published Microarray and RNA-Seq data sets.
dcterms.available: 2017-09-20T16:50:40Z
dcterms.contributor: Zhu, Wei [en_US]
dcterms.contributor: Wu, Song [en_US]
dcterms.contributor: Wang, Xuefeng [en_US]
dcterms.contributor: Li, Ellen. [en_US]
dcterms.creator: Zhang, Yuanhao
dcterms.dateAccepted: 2017-09-20T16:50:40Z
dcterms.dateSubmitted: 2017-09-20T16:50:40Z
dcterms.description: Department of Applied Mathematics and Statistics. [en_US]
dcterms.extent: 100 pg. [en_US]
dcterms.format: Application/PDF [en_US]
dcterms.format: Monograph
dcterms.identifier: http://hdl.handle.net/11401/76571
dcterms.issued: 2015-08-01
dcterms.language: en_US
dcterms.provenance: Made available in DSpace on 2017-09-20T16:50:40Z (GMT). No. of bitstreams: 1 Zhang_grad.sunysb_0771E_12186.pdf: 2596069 bytes, checksum: e2c359b6d07bcb54a1267400f47038d9 (MD5) Previous issue date: 2014 [en]
dcterms.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subject: errors-in-variables model, generalized linear model, platform comparison, unbiased estimating equation
dcterms.subject: Statistics
dcterms.title: Statistical Comparison of Measurement Platforms
dcterms.type: Dissertation

