dc.identifier.uri: http://hdl.handle.net/11401/77315
dc.description.sponsorship: This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. (en_US)
dc.format: Monograph
dc.format.medium: Electronic Resource (en_US)
dc.language.iso: en_US
dc.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dc.type: Dissertation
dcterms.abstract: Modern computer systems produce and process an overwhelming amount of data at an increasing rate. The performance of storage hardware components, however, cannot keep up with the required speed at a practical cost. To mitigate this discrepancy, storage vendors incorporate many workload-driven optimizations in their products. Emerging applications cause workload patterns to change rapidly and significantly. One of the prominent examples is a rapid shift towards virtualized environments. Virtual machines mix I/O streams from different applications and perturb applications' access patterns. In addition, modern users demand more convenience features, such as deduplication, snapshotting, and encryption. Stringent performance requirements, changing I/O patterns, and the growing feature list increase the complexity of storage systems. The complexity of design, in turn, makes the evaluation of storage systems a difficult task. The storage community needs practical evaluation tools and techniques to accomplish this task in a timely and efficient manner. This thesis first explores the complexity of evaluating storage systems. Second, the thesis proposes Multi-Dimensional Histogram (MDH) workload analysis as a basis for designing a variety of evaluation tools. I/O traces are good sources of information about real-world workloads but are inflexible in representing more than the exact system conditions at the point the traces were captured. We demonstrate how MDH techniques can accurately convert I/O traces to workload models. Historically, most I/O optimizations focused on the metadata: e.g., I/O access patterns, arrival times, read/write sizes. Increasingly, storage systems must also consider the data and not just the metadata. For example, deduplication systems eliminate duplicates in the data to increase logical storage capacity. We use MDH techniques to generate realistic datasets for deduplication systems.
The shift from physical to virtual clients drastically changes the I/O workloads seen by Network Attached Storage (NAS). Using MDH techniques, we study workload changes caused by virtualization and synthesize a set of versatile NAS benchmarks. It is our thesis that MDH techniques are powerful for both workload analysis and synthesis. MDH analysis bridges the gap between the complexity of storage systems and the availability of practical evaluation tools.
dcterms.available: 2017-09-20T16:52:29Z
dcterms.contributor: Porter, Donald (en_US)
dcterms.contributor: Zadok, Erez (en_US)
dcterms.contributor: Kuenning, Geoffrey (en_US)
dcterms.contributor: Ferdman, Michael. (en_US)
dcterms.creator: Tarasov, Vasily
dcterms.dateAccepted: 2017-09-20T16:52:29Z
dcterms.dateSubmitted: 2017-09-20T16:52:29Z
dcterms.description: Department of Computer Science. (en_US)
dcterms.extent: 94 pg. (en_US)
dcterms.format: Application/PDF (en_US)
dcterms.format: Monograph
dcterms.identifier: http://hdl.handle.net/11401/77315
dcterms.issued: 2013-12-01
dcterms.language: en_US
dcterms.provenance: Made available in DSpace on 2017-09-20T16:52:29Z (GMT). No. of bitstreams: 1 Tarasov_grad.sunysb_0771E_11602.pdf: 899235 bytes, checksum: 5dcd6e4f08e2e877f870b169b2da38cf (MD5) Previous issue date: 1 (en)
dcterms.publisher: The Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subject: Benchmarking, Characterization, Performance, Storage, Workload
dcterms.subject: Computer science
dcterms.title: Multi-dimensional Workload Analysis and Synthesis for Modern Storage Systems
dcterms.type: Dissertation

