Show simple item record

dc.identifier.urihttp://hdl.handle.net/11401/77300
dc.description.sponsorshipThis work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.en_US
dc.formatMonograph
dc.format.mediumElectronic Resourceen_US
dc.language.isoen_US
dc.publisherThe Graduate School, Stony Brook University: Stony Brook, NY.
dc.typeDissertation
dcterms.abstractThere is an increasing need for data structures that efficiently manipulate huge amounts of data. When the data to be represented does not fit into main memory all at once, data structures designed without taking this constraint into account fail to perform as expected. Data structures conceived specifically for this purpose fall into two, not necessarily exclusive, categories. The first consists of data structures designed and analyzed under a model of computation where the cost is given by the amount of data transferred between external and internal memory. The second comprises data structures that make the most out of the internal memory by representing the data using as few space as possible, without hurting the performance of the operations they support. This dissertation falls in the intersection of both categories: It studies representations that use space close to the information-theoretic lower bound while, at the same time, provide good performance guarantees in terms of the number of memory transfers. The resulting data structures are not only of theoretical interest, but can also be (and in most cases have been) easily turned into robust, efficient, well tested, and usable implementations, where the constants and low order terms hidden by Big O notation do not take over the running time in practice. The focus of this dissertation is on data structures that dynamically maintain a set in order to efficiently answer range and membership queries. Specifically, this work presents two compact data structures: the level-based packed memory array, for range and membership queries, and the quotient filter, for approximate membership queries. Furthermore, three extension to this last data structure are also described: the buffered quotient filter, the cascade filter, and the counting quotient filter. Finally, this dissertation shows that the standard B-tree data structure is asymptotically optimal for the batched predecessor problem in external memory if the space is a constrain. The tradeoff that quantifies the minimum space required for a given query cost is also presented.
dcterms.available2017-09-20T16:52:23Z
dcterms.contributorBender, Michael Aen_US
dcterms.contributorJohnson, Roben_US
dcterms.contributorGao, Jieen_US
dcterms.contributorFarach-Colton, Martin.en_US
dcterms.creatorMontes Arango, Pablo
dcterms.dateAccepted2017-09-20T16:52:23Z
dcterms.dateSubmitted2017-09-20T16:52:23Z
dcterms.descriptionDepartment of Computer Science.en_US
dcterms.extent108 pg.en_US
dcterms.formatApplication/PDFen_US
dcterms.formatMonograph
dcterms.identifierhttp://hdl.handle.net/11401/77300
dcterms.issued2014-12-01
dcterms.languageen_US
dcterms.provenanceMade available in DSpace on 2017-09-20T16:52:23Z (GMT). No. of bitstreams: 1 MontesArango_grad.sunysb_0771E_12013.pdf: 1155908 bytes, checksum: bd63830998516b943029c509036f901b (MD5) Previous issue date: 1en
dcterms.publisherThe Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subjectAlgorithms, Data structures, Lower bounds, Membership
dcterms.subjectComputer science
dcterms.titleTheoretical and Practical Aspects of Compact Data Structures for Range and Membership Queries in Sparse Sets
dcterms.typeDissertation


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record