Show simple item record

dc.identifier.uri | http://hdl.handle.net/1951/59794
dc.identifier.uri | http://hdl.handle.net/11401/71350
dc.description.sponsorship | This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. | en_US
dc.format | Monograph
dc.format.medium | Electronic Resource | en_US
dc.language.iso | en_US
dc.publisher | The Graduate School, Stony Brook University: Stony Brook, NY.
dc.type | Dissertation
dcterms.abstract | Hidden Markov models (HMMs) are now widely used to analyze biological data, for both smoothing and clustering. However, because the classical HMM characterizes each hidden state by a single distribution, it can be limited on data whose hidden states are composed of a mixture of distributions (Heng Lian et al., 2006). To address this issue, we propose a new stochastic segmentation model and an associated estimation procedure with attractive analytical and computational properties. We combine the forward and backward filters via Bayes' theorem to calculate the posterior mean and variance, and we develop an expectation-maximization (EM) algorithm to estimate the hyperparameters. We further use a bounded complexity mixture (BCMIX) approximation whose computational complexity is linear in the sequence length. Another important feature of this segmentation model is that it yields explicit formulas for the posterior means and the probabilities of the categorical states, which can be used to make inferences about both categorical and continuous aspects of the data. Other quantities relating to the posterior distribution that are useful for assessing confidence in any given segmentation can also be estimated with our method. We perform intensive simulation studies (1) to compare the Bayes and BCMIX estimates and (2) to evaluate the BCMIX estimates in terms of sum of squared errors, Kullback-Leibler divergence, and the identification ratio of true segments. We also apply our model to two biological data sets: (1) reduced representation bisulfite sequencing (RRBS) data (A. Molaro et al., 2011) and (2) ENCODE NimbleGen tiling arrays (Sabo et al., 2006). Our model performs well in segmenting both sequential data sets. In the RRBS data it can further help identify differentially methylated regions (DMRs), while in the microarray data it can discover DNase I hypersensitive sites (DHSs).
dcterms.available | 2013-05-22T17:35:17Z
dcterms.available | 2015-04-24T14:47:09Z
dcterms.contributor | Wu, Song | en_US
dcterms.contributor | Zhang, Michael Q., Xing, Haipeng | en_US
dcterms.contributor | Fang, Yixin. | en_US
dcterms.creator | Mo, Yifan
dcterms.dateAccepted | 2013-05-22T17:35:17Z
dcterms.dateAccepted | 2015-04-24T14:47:09Z
dcterms.dateSubmitted | 2013-05-22T17:35:17Z
dcterms.dateSubmitted | 2015-04-24T14:47:09Z
dcterms.description | Department of Applied Mathematics and Statistics | en_US
dcterms.extent | 107 pg. | en_US
dcterms.format | Monograph
dcterms.format | Application/PDF | en_US
dcterms.identifier | Mo_grad.sunysb_0771E_11168 | en_US
dcterms.identifier | http://hdl.handle.net/1951/59794
dcterms.identifier | http://hdl.handle.net/11401/71350
dcterms.issued | 2012-12-01
dcterms.language | en_US
dcterms.provenance | Made available in DSpace on 2013-05-22T17:35:17Z (GMT). No. of bitstreams: 1 Mo_grad.sunysb_0771E_11168.pdf: 2312485 bytes, checksum: e1bd3ad6a7ea1e5f92bb7d2a5c1ba5f9 (MD5) Previous issue date: 1 | en
dcterms.provenance | Made available in DSpace on 2015-04-24T14:47:09Z (GMT). No. of bitstreams: 3 Mo_grad.sunysb_0771E_11168.pdf.jpg: 1894 bytes, checksum: a6009c46e6ec8251b348085684cba80d (MD5) Mo_grad.sunysb_0771E_11168.pdf.txt: 133179 bytes, checksum: befc4ea4c02b98ec6929e65f9db60143 (MD5) Mo_grad.sunysb_0771E_11168.pdf: 2312485 bytes, checksum: e1bd3ad6a7ea1e5f92bb7d2a5c1ba5f9 (MD5) Previous issue date: 1 | en
dcterms.publisher | The Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subject | Statistics
dcterms.title | A Stochastic Segmentation Model for Categorical and Continuous Features of various biological sequential
dcterms.type | Dissertation


