7 November 2016
IT4Innovations
Europe/Prague timezone

Linguistic Characterization of Natural Data by Applying Intermediate Quantifiers on Fuzzy Association Rules

7 Nov 2016, 16:20
20m
atrium (IT4Innovations)

Studentská 1B 708 33 Ostrava - Poruba
Plenary Talks Afternoon sessions Plenary talks 2

Speaker

Dr Petra Murinová (Institute for Research and Application of Fuzzy Modelling)

Description

Extended Abstract

The main goal of this talk is to combine theoretical results on intermediate quantifiers, proposed in several papers (see e.g. [1, 2, 3, 4]), with the Fuzzy GUHA method [5], and to introduce a linguistic characterization of natural data using generalized intermediate quantifiers. The theory of intermediate quantifiers was introduced by Novák in [3] and is now a constituent of the theory of Fuzzy Natural Logic (FNL), a mathematical counterpart of the concept of Natural Logic introduced by Lakoff [6]. This theory is based on Łukasiewicz fuzzy type theory (Ł-FTT) [4], one of the existing higher-order fuzzy logics.

Fuzzy GUHA is a method for the automated search for association rules in numerical data. Generally, the associations obtained have the form A ∼ B, which means that the occurrence of A is associated with the occurrence of B, where A and B are formulae created from the objects’ attributes. The original GUHA method, as proposed by Hájek et al. [5], allowed only Boolean attributes. Some parts of this approach were independently re-invented by Agrawal [7] many years later and are also known as the mining of association rules or market basket analysis. A detailed book on the GUHA method is [8], where one can find a variety of statistically justified associations between attributes of given objects. Fuzzy GUHA is an extension of the classical GUHA method to fuzzy data.

In this paper, we work with associations in the form of IF-THEN rules composed of evaluative linguistic expressions, which allow quantities to be characterized with vague linguistic terms such as “very small”, “big”, “medium”, etc. To measure the interestingness of a rule, many numerical characteristics or indices have been proposed (see [9, 10] for an overview).
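To make the idea of a numerical interestingness index concrete, the following minimal Python sketch computes two widely used indices, support and confidence, for a fuzzy rule A ∼ B. It is an illustration only: the abstract does not say which indices Fuzzy GUHA uses, so the function names `fuzzy_support` and `fuzzy_confidence`, the choice of the minimum as conjunction, and the toy membership degrees are all assumptions.

```python
from typing import Sequence


def fuzzy_support(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Average degree to which A and B hold together (minimum as conjunction)."""
    if len(antecedent) == 0 or len(antecedent) != len(consequent):
        raise ValueError("both sequences must be non-empty and of equal length")
    joint = sum(min(a, b) for a, b in zip(antecedent, consequent))
    return joint / len(antecedent)


def fuzzy_confidence(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Degree of 'A and B' relative to the degree of 'A' alone."""
    if len(antecedent) != len(consequent):
        raise ValueError("both sequences must have equal length")
    total_a = sum(antecedent)
    if total_a == 0:
        raise ValueError("the antecedent never holds, confidence is undefined")
    joint = sum(min(a, b) for a, b in zip(antecedent, consequent))
    return joint / total_a


# Made-up membership degrees for four objects (illustrative, not real data).
smoker = [1.0, 0.5, 0.0, 1.0]
asthma = [0.5, 0.5, 0.0, 1.0]

support = fuzzy_support(smoker, asthma)
confidence = fuzzy_confidence(smoker, asthma)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Other conjunctions, such as the Łukasiewicz t-norm, could be substituted for `min` without changing the structure of the computation.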
As a supplement to these interestingness measures, we use the theory of intermediate quantifiers to characterize the intensity of association, which allows linguistic characterizations such as “almost all”, “most”, “some”, or “few”. As a result, we may automatically obtain sentences such as the following from numerical bio-statistical data:

Almost all people who suffer from atopic tetter, live in an area affected by heavy industry and smoke, also suffer from asthma.
Most people who smoke and suffer from respiratory diseases also suffer from ischemic disease of the leg.

In practice, some data are often unavailable, e.g. due to measurement errors, missing results, or respondents who are unwilling to answer or have no opinion on the given subject. We can remove the cases with missing values entirely to obtain clean data, but this can result in an excessive loss of information. Alternatively, we can handle missing values using fuzzy partial logics, which were proposed by Běhounek and Novák in [11]. These provide a formal apparatus for several types of missing information, such as an “unknown” or “undefined” (i.e. not meaningful) value. Basically, the semantics of these logics, formed by algebras of truth values, is extended by a special value “”.
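The linguistic characterization of association intensity, together with the simple strategy of removing cases with missing values, can be roughly sketched as follows. This is a deliberately crude stand-in: the theory itself is formulated in Ł-FTT, whereas the sketch uses crisp cut-offs. The thresholds in `QUANTIFIER_THRESHOLDS`, the function names, the use of `None` for a missing value, and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical crisp cut-offs, checked from the strongest quantifier down.
QUANTIFIER_THRESHOLDS = [(0.9, "almost all"), (0.7, "most"), (0.3, "some")]


def drop_missing(
    antecedent: Sequence[Optional[float]], consequent: Sequence[Optional[float]]
) -> List[Tuple[float, float]]:
    """Keep only the cases where both degrees are available (complete removal)."""
    return [
        (a, c)
        for a, c in zip(antecedent, consequent)
        if a is not None and c is not None
    ]


def quantifier_label(confidence: float) -> str:
    """Map a confidence in [0, 1] to a linguistic quantifier."""
    for threshold, label in QUANTIFIER_THRESHOLDS:
        if confidence >= threshold:
            return label
    return "few"


def describe_rule(
    antecedent: Sequence[Optional[float]], consequent: Sequence[Optional[float]]
) -> str:
    """Label a rule by its fuzzy confidence computed on complete cases only."""
    pairs = drop_missing(antecedent, consequent)
    total_a = sum(a for a, _ in pairs)
    if total_a == 0:
        raise ValueError("the antecedent never holds on the complete cases")
    confidence = sum(min(a, c) for a, c in pairs) / total_a
    return quantifier_label(confidence)


# Made-up membership degrees; None marks a missing measurement.
smoker = [1.0, 1.0, None, 1.0, 0.5]
asthma = [1.0, 1.0, 1.0, None, 0.0]

label = describe_rule(smoker, asthma)
print(f"{label} smokers suffer from asthma")
```

In this toy data set two of the five cases are discarded, which illustrates the loss of information that motivates handling missing values inside the logic instead.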

References
[1] P. Murinová, V. Novák, A formal theory of generalized intermediate syllogisms, Fuzzy Sets and Systems 186 (2013) 47–80.
[2] P. Murinová, V. Novák, The structure of generalized intermediate syllogisms, Fuzzy Sets and Systems 247 (2014) 18–37.
[3] V. Novák, A formal theory of intermediate quantifiers, Fuzzy Sets and Systems 159 (10) (2008) 1229–1246.
[4] V. Novák, On fuzzy type theory, Fuzzy Sets and Systems 149 (2005) 235–273.
[5] P. Hájek, The question of a general concept of the GUHA method, Kybernetika 4 (1968) 505–515.
[6] G. Lakoff, Linguistics and natural logic, Synthese 22 (1970) 151–271.
[7] R. Agrawal, R. Srikant, Fast algorithms for mining association rules, in: Proc. 20th Int. Conf. on Very Large Databases, AAAI Press, Chile, 1994, pp. 487–499.
[8] P. Hájek, T. Havránek, Mechanizing hypothesis formation: Mathematical foundations for a general theory, Springer-Verlag, Berlin/Heidelberg/New York, 1978.
[9] P.-N. Tan, V. Kumar, J. Srivastava, Selecting the right objective measure for association analysis, Information Systems 29 (4) (2004) 293–313.
[10] L. Geng, H. J. Hamilton, Interestingness measures for data mining: A survey, ACM Computing Surveys 38 (3) (2006) 9.
[11] L. Běhounek, V. Novák, Towards fuzzy partial logic, in: Proceedings of the IEEE 45th International Symposium on Multiple-Valued Logics (ISMVL 2015) (2015) 139–144.

Primary author

Dr Petra Murinová (Institute for Research and Application of Fuzzy Modelling)

Co-author

Dr Michal Burda (Institute for Research and Application of Fuzzy Modelling)
