dc.identifier.urihttp://hdl.handle.net/11401/77299
dc.description.sponsorshipThis work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.en_US
dc.formatMonograph
dc.format.mediumElectronic Resourceen_US
dc.language.isoen_US
dc.publisherThe Graduate School, Stony Brook University: Stony Brook, NY.
dc.typeDissertation
dcterms.abstractCluster analysis is an important tool for unsupervised learning. It is commonly used for pattern recognition and dimension reduction. Traditional clustering algorithms include hierarchical clustering and k-means clustering, as well as model-based approaches such as group trajectory analysis. A major drawback of traditional clustering analysis is that it considers only a single objective (one dissimilarity measurement), whereas in practice one usually holds several criteria for the classification. Therefore, in this thesis, we strive to develop novel multiple-objective clustering methods, with a focus on the more approachable dual-objective ones. This thesis consists of two parts. In the first part, we introduce the framework of multiple-objective clustering methods. We then introduce biclustering, an existing dual-objective clustering method that classifies a data matrix on its rows and columns simultaneously. Biclustering has been used in gene expression analysis to identify interpretable biological patterns involving a subset of genes and a subset of conditions. Our novel contribution lies in generalizing and extending the objective function used in biclustering in the form of compound clustering, where the objective is a linear combination of the objective functions with respect to the rows and columns. We also compared the generalized biclustering to the original biclustering algorithm using a microarray gene expression data set and a simulation study. Both demonstrated that, overall, the generalized biclustering performs better than the original biclustering algorithm. Subsequently, we apply both the dual-objective biclustering algorithms and the classic clustering algorithms to understanding stock market movements. We attempted to detect patterns in bear and bull stock markets using the biclustering and the generalized biclustering techniques.
We then summarize the pros and cons of dual-objective clustering in a time-domain application. Next, we used the classic clustering method to identify historical stock market periods resembling the current market, in an effort to infer the trend of the current market, especially whether we are approaching a recession. We conclude the thesis by analyzing intraday patterns in high-frequency trading data for stocks traded on the NYSE, aggregated at one-minute and five-minute intervals, using a model-based clustering approach.
dcterms.available2017-09-20T16:52:23Z
dcterms.contributorZhu, Weien_US
dcterms.contributorWang, Xuefengen_US
dcterms.contributorWu, Songen_US
dcterms.contributorShroyer, Annie Laurie.en_US
dcterms.creatorRuan, Tingjun
dcterms.dateAccepted2017-09-20T16:52:23Z
dcterms.dateSubmitted2017-09-20T16:52:23Z
dcterms.descriptionDepartment of Applied Mathematics and Statisticsen_US
dcterms.extent152 pg.en_US
dcterms.formatMonograph
dcterms.formatApplication/PDFen_US
dcterms.identifierhttp://hdl.handle.net/11401/77299
dcterms.issued2016-12-01
dcterms.languageen_US
dcterms.provenanceMade available in DSpace on 2017-09-20T16:52:23Z (GMT). No. of bitstreams: 1 Ruan_grad.sunysb_0771E_12798.pdf: 8443144 bytes, checksum: d9b0afc1108ea33a19c8dccef219679e (MD5) Previous issue date: 1en
dcterms.publisherThe Graduate School, Stony Brook University: Stony Brook, NY.
dcterms.subjectStatistics
dcterms.titleMulti-Objective Clustering Analysis
dcterms.typeDissertation

