Gutenberg Open Science: Metaheuristics for Pattern Mining in Big Sequence Data

Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-5740

Full metadata record

DC Field	Value	Language
dc.contributor.author	Raza, Atif	-
dc.date.accessioned	2021-04-19T13:19:58Z	-
dc.date.available	2021-04-19T13:19:58Z	-
dc.date.issued	2021	-
dc.identifier.uri	https://openscience.ub.uni-mainz.de/handle/20.500.12030/5749	-
dc.description.abstract	An ever-growing list of human endeavors in a variety of domains results in the generation of time-series data, i.e., data that are time-resolved and measured at equidistant time intervals. Continued developments in sensor and storage technology, and the availability of database systems specifically designed for time-series data, have also made it possible to record enormous amounts of such data. This vast yet readily available data places ever-increasing demands on data mining methods for fast and efficient knowledge discovery, which establishes the need for exceedingly fast algorithms. The data mining research community has been actively investigating various avenues to develop algorithms for time-series classification. Most research has focused on optimizing accuracy or error rate, although runtime performance and broad applicability are just as important in practice. The result is a plethora of algorithms with quadratic or higher computational complexity. Consequently, these algorithms are of little to no use for large-scale deployment. This thesis addresses the complexity issue by introducing several time-series classification methods based on metaheuristics and randomized approaches to improve the state of the art in time-series mining. We introduce three subsequence-based time-series classification algorithms and an approximate distance measure for time-series data. One subsequence-based classifier explicitly employs random sampling for subsequence discovery. The other two subsequence-based classifiers employ discretized time-series data coupled with (i) a linear time and space string mining algorithm for extracting frequent patterns and (ii) a novel pattern sampling approach for discovering frequent patterns. The frequent patterns are translated back to subsequences for model induction. Both of these algorithms are up to two orders of magnitude faster than previous state-of-the-art algorithms. An extensive set of experiments establishes the effectiveness and classification accuracy of these methods against established and recently proposed methods.	en_GB
dc.language.iso	eng	de
dc.rights	InCopyright	*
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	*
dc.subject.ddc	004 Informatik	de_DE
dc.subject.ddc	004 Data processing	en_GB
dc.title	Metaheuristics for Pattern Mining in Big Sequence Data	de_DE
dc.type	Dissertation	de
dc.identifier.urn	urn:nbn:de:hebis:77-openscience-ae187893-7759-4c7f-a534-7c96c085efcf9	-
dc.identifier.doi	http://doi.org/10.25358/openscience-5740	-
jgu.type.dinitype	doctoralThesis	en_GB
jgu.type.version	Original work	de
jgu.type.resource	Text	de
jgu.date.accepted	2021-04-13	-
jgu.description.extent	xix, 148 Seiten, Illustrationen, Diagramme	de
jgu.organisation.department	FB 08 Physik, Mathematik u. Informatik	de
jgu.organisation.number	7940	-
jgu.organisation.name	Johannes Gutenberg-Universität Mainz	-
jgu.rights.accessrights	openAccess	-
jgu.organisation.place	Mainz	-
jgu.subject.ddccode	004	de
jgu.organisation.ror	https://ror.org/023b0x485
Appears in collections:	JGU-Publikationen

Files in This Item:

	File	Description	Size	Format
	raza_atif-metaheuristics-20210414152914334.pdf		5.53 MB	Adobe PDF
