
A framework for redescription set construction

Mihelčić, Matej; Džeroski, Sašo; Lavrač, Nada; Šmuc, Tomislav (2017) A framework for redescription set construction. Expert Systems with Applications, 68 . pp. 196-215. ISSN 0957-4174

PDF - Submitted Version - article

Abstract

Redescription mining is a field of knowledge discovery that aims at finding different descriptions of similar subsets of instances in the data. These descriptions are represented as rules inferred from one or more disjoint sets of attributes, called views. As such, they support the knowledge discovery process and help domain experts formulate new hypotheses or construct new knowledge bases and decision support systems. In contrast to previous approaches, which typically create one smaller set of redescriptions satisfying a pre-defined set of constraints, we introduce a framework that creates a large and heterogeneous redescription set from which a user or expert can extract compact sets with differing properties, according to their own preferences. Construction of the large and heterogeneous redescription set relies on the CLUS-RM algorithm and a novel conjunctive refinement procedure that facilitates the generation of larger and more accurate redescription sets. The work also introduces the variability of redescription accuracy when missing values are present in the data, which significantly extends the applicability of the method. A crucial part of the framework is redescription set extraction, based on a heuristic multi-objective optimization procedure that allows the user to assign importance levels to one or more redescription quality criteria. We provide both a theoretical and an empirical comparison of the novel framework against current state-of-the-art redescription mining algorithms and show that it represents a more efficient and versatile approach for mining redescriptions from data.
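
To make the extraction step in the abstract more concrete, the following is a minimal sketch and not the authors' procedure. It assumes that redescription accuracy is measured by the Jaccard index of the two queries' support sets (a common choice in redescription mining that the abstract itself does not state), and it replaces the heuristic multi-objective optimization with a plain weighted-sum ranking. The function names, the "novelty" criterion, and the candidate records are hypothetical.

```python
def jaccard(support_a, support_b):
    """Jaccard index of two support sets (sets of instance ids).

    Assumption: accuracy of a redescription is the overlap of the instances
    described by its two queries, one query per view.
    """
    union = support_a | support_b
    if not union:
        return 0.0
    return len(support_a & support_b) / len(union)


def scalarized_score(criteria, weights):
    """Weighted sum of quality criteria.

    The weights play the role of user-defined importance levels; criteria
    without a weight are ignored.
    """
    return sum(weights.get(name, 0.0) * value for name, value in criteria.items())


def extract_top_k(candidates, weights, k):
    """Return the k candidates with the highest scalarized score.

    Simplification: a one-shot ranking, with no redundancy handling between
    the selected redescriptions.
    """
    ranked = sorted(
        candidates,
        key=lambda c: scalarized_score(c["criteria"], weights),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical candidates: supports of the two queries plus one extra criterion.
raw = [
    {"id": "r1", "left": {1, 2, 3, 4}, "right": {1, 2, 3, 4, 5}, "novelty": 0.1},
    {"id": "r2", "left": {6, 7, 8}, "right": {7, 8, 9}, "novelty": 0.9},
]
candidates = [
    {
        "id": r["id"],
        "criteria": {
            "accuracy": jaccard(r["left"], r["right"]),
            "novelty": r["novelty"],
        },
    }
    for r in raw
]

accuracy_first = extract_top_k(candidates, {"accuracy": 1.0, "novelty": 0.0}, k=1)
balanced = extract_top_k(candidates, {"accuracy": 0.5, "novelty": 0.5}, k=1)
```

Changing the weights changes which compact set is extracted from the same large candidate set, which is the kind of preference-driven selection the abstract describes. How the framework handles missing values and redundancy between redescriptions is not covered by this sketch.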

Item Type: Article
Additional Information: The authors would like to acknowledge the European Commission's support through the MAESTRA project (Gr. no. 612944) and the MULTIPLEX project (Gr. no. 317532), and the support of the Croatian Science Foundation (Pr. no. 9623: Machine Learning Algorithms for Insightful Analysis of Complex Data Structures).
Uncontrolled Keywords: knowledge discovery ; redescription mining ; predictive clustering trees ; redescription set construction ; scalarization ; conjunctive refinement ; redescription variability
Subjects: TECHNICAL SCIENCES > Computing
Divisions: Division of Electronics
Projects:
Project title | Project leader | Project code | Project type
Machine Learning Algorithms for Insightful Analysis of Complex Data Structures - DescriptiveInduction | Dragan Gamberger | IP-2013-11-9623 | HRZZ
Learning from Massive, Incompletely annotated, and Structured Data - MAESTRA | Sašo Džeroski | 612944 | EK
Foundational Research on MULTIlevel comPLEX networks and systems - MULTIPLEX | Guido Caldarelli | 317532 | EK
Depositing User: Phd Tomislav Šmuc
Date Deposited: 11 Dec 2017 09:19
Last Modified: 11 Dec 2017 09:19
URI: http://fulir.irb.hr/id/eprint/3716
DOI: 10.1016/j.eswa.2016.10.012
