
dc.identifier.uri: http://hdl.handle.net/1951/55675
dc.identifier.uri: http://hdl.handle.net/11401/72711
dc.description.sponsorship: This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. [en_US]
dc.format: Monograph
dc.format.medium: Electronic Resource [en_US]
dc.language.iso: en_US
dc.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dc.type: Dissertation
dcterms.abstract: Whole genome sequencing and whole exome sequencing are developing techniques for exploring the associations between rare variants and complex diseases. The number of variants that appear in a randomly selected group but not in a different group randomly selected from the same population has unknown mean and variance. Expressions for these quantities are derived here. Numerical values are calculated assuming that the frequency of a rare variant has a beta distribution, using parameters estimated for four populations. Extensions are given to the number of variants that appear in r (r > 1) members of a randomly selected group with none in the comparison group. These calculations suggest that a genome-wide study of rare variants would generate an extremely large number of false positives. An exome-wide search would generate a smaller but still overwhelming number of false positives. A search restricted to variants in a specified gene would not generate excessive numbers of false positives. The expectations under the beta model fit a SNP database well when the underlying beta distribution was restricted to variant frequencies greater than 0.001.
dcterms.available: 2012-05-15T18:07:23Z
dcterms.available: 2015-04-24T14:53:18Z
dcterms.contributor: Nancy R. Mendell [en_US]
dcterms.contributor: Gouma, Perena [en_US]
dcterms.contributor: Haipeng Xing [en_US]
dcterms.contributor: Eli Hatchwell. [en_US]
dcterms.creator: Xu, Wenjie
dcterms.dateAccepted: 2012-05-15T18:07:23Z
dcterms.dateAccepted: 2015-04-24T14:53:18Z
dcterms.dateSubmitted: 2012-05-15T18:07:23Z
dcterms.dateSubmitted: 2015-04-24T14:53:18Z
dcterms.description: Department of Applied Mathematics and Statistics [en_US]
dcterms.format: Monograph
dcterms.format: Application/PDF [en_US]
dcterms.identifier: http://hdl.handle.net/1951/55675
dcterms.identifier: Xu_grad.sunysb_0771E_10343.pdf [en_US]
dcterms.identifier: http://hdl.handle.net/11401/72711
dcterms.issued: 2010-12-01
dcterms.language: en_US
dcterms.provenance: Made available in DSpace on 2012-05-15T18:07:23Z (GMT). No. of bitstreams: 1 Xu_grad.sunysb_0771E_10343.pdf: 662719 bytes, checksum: af83c1d39fe8548eedee384d0e336816 (MD5) Previous issue date: 1 [en]
dcterms.provenance: Made available in DSpace on 2015-04-24T14:53:18Z (GMT). No. of bitstreams: 3 Xu_grad.sunysb_0771E_10343.pdf.jpg: 1894 bytes, checksum: a6009c46e6ec8251b348085684cba80d (MD5) Xu_grad.sunysb_0771E_10343.pdf.txt: 183889 bytes, checksum: a42b349114e42ac9d12980c0671156b6 (MD5) Xu_grad.sunysb_0771E_10343.pdf: 662719 bytes, checksum: af83c1d39fe8548eedee384d0e336816 (MD5) Previous issue date: 1 [en]
dcterms.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subject: beta distribution, case-control study, exome sequencing, false positives, rare variant, whole genome sequencing
dcterms.subject: Statistics -- Genetics
dcterms.title: Distribution of Number of Rare Variants Appearing in Cases but Not Controls in Genome-wide Studies
dcterms.type: Dissertation

