College of Public Health

Permanent URI for this community

https://hdl.handle.net/1920/2852

The College of Public Health provides education, research, and service opportunities to improve the health of our communities.

An Adjustable Description Quality Measure for Pattern Discovery in Large Databases Using the AQ Methodology
(2000-03) Kaufman, Kenneth A.; Michalski, Ryszard S.
In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses or patterns characterizing the input data. If one can assume that training data contain no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and insisting on full completeness and consistency of the hypothesis is no longer appropriate. In such situations, the problem is to determine a hypothesis that represents the best trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a rule quality criterion that combines the rule coverage (a measure of completeness) and training accuracy (a measure of consistency). These factors are combined into a single rule quality measure through a lexicographical evaluation functional (LEF). The method has been implemented in the AQ18 learning system for natural induction and pattern discovery, and compared with several other methods. Experiments have shown that the proposed method can be easily tailored to different problems and can simulate different rule learners by modifying the parameter of the rule quality criterion.
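The trade-off described in this abstract can be pictured with a small sketch. The abstract does not give the formula, so the weighted product below, the function name `rule_quality`, and the parameter `w` are assumptions made for illustration only, not the AQ18 implementation.

```python
def rule_quality(pos_covered, neg_covered, total_pos, w=0.5):
    """Hypothetical score: coverage**w * accuracy**(1 - w), with 0 <= w <= 1.

    coverage = fraction of all positive examples the rule covers (completeness)
    accuracy = fraction of covered examples that are positive (consistency)
    The multiplicative form is an assumption, not taken from the abstract.
    """
    if total_pos <= 0:
        raise ValueError("total_pos must be positive")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    covered = pos_covered + neg_covered
    if covered == 0:
        return 0.0  # a rule that covers nothing has no value
    coverage = pos_covered / total_pos
    accuracy = pos_covered / covered
    return coverage ** w * accuracy ** (1 - w)


# Two made-up rules over 100 positive training examples.
broad_rule = (90, 10)   # covers 90 positives and 10 negatives
strict_rule = (40, 0)   # covers 40 positives and no negatives

for w in (0.0, 0.5, 1.0):
    print(w, rule_quality(*broad_rule, 100, w), rule_quality(*strict_rule, 100, w))
```

With `w` near 1 the broad rule scores higher, and with `w` near 0 the strict rule does, which is one way a single parameter could make a learner behave like different rule learners.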
Generating Alternative Hypotheses in AQ Learning
(2004-12) Michalski, Ryszard S.
In many areas of application of machine learning and data mining, it is desirable to generate alternative inductive hypotheses from the given data. The Aq-ALT or, briefly, ALT method, presented in this paper, generates alternative hypotheses in two phases. The first phase proceeds according to the standard Aq algorithm, but each star generation process produces not just one best complex, but rather a collection of complexes, called the elite. This phase ends when the union of best complexes constitutes a complete and consistent cover of the target set, called the primary hypothesis. The second phase derives alternative hypotheses by multiplying out the disjunctions of symbols representing complexes in each elite, and creating an irredundant DNF expression. Individual terms in this expression determine alternative hypotheses. These hypotheses are ranked according to a given hypothesis evaluation criterion, LEFh, and the alt best hypotheses are selected, where alt is a parameter provided to the program. The method is extended to inconsistent covering problems by introducing an event membership probability function. The selected hypotheses can be used as alternative generalizations of data, or arranged into an ensemble of classifiers to perform a form of boosting. The ALT method is general, and can thus be employed not only in concept learning, but also for generating alternative solutions to any general covering problem.
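The second phase described in this abstract (multiplying out the elites and keeping an irredundant DNF) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `alternative_hypotheses`, the complex labels, and the ranking by number of complexes (standing in for the LEFh criterion, which the abstract does not define) are all assumptions.

```python
from itertools import product


def alternative_hypotheses(elites, alt=None):
    """Multiply out elites and keep irredundant terms.

    elites: list of collections of complex labels; each elite is read as a
            disjunction, and a hypothesis picks one complex from every elite.
    alt:    optional number of best hypotheses to return.
    """
    # Multiplying out: every way of choosing one complex per elite gives a term.
    # A frozenset merges repeated complexes (c AND c == c).
    terms = {frozenset(choice) for choice in product(*elites)}
    # Absorption: drop any term that strictly contains another term.
    irredundant = [t for t in terms if not any(other < t for other in terms)]
    # Assumed stand-in for LEFh: prefer hypotheses with fewer complexes.
    ranked = sorted(irredundant, key=lambda t: (len(t), sorted(t)))
    return ranked if alt is None else ranked[:alt]


# Made-up elites: complex "c2" appears in both, so it covers both on its own.
elites = [["c1", "c2"], ["c2", "c3"]]
for hypothesis in alternative_hypotheses(elites):
    print(sorted(hypothesis))
```

Here the terms `{c1, c2}` and `{c2, c3}` are absorbed by `{c2}`, leaving two alternative hypotheses: `{c2}` and `{c1, c3}`.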
Modeling User Behavior by Integrating AQ Learning with a Database: Initial Results
(2002-06) Cervone, Guido; Michalski, Ryszard S.
The paper describes recent results from developing and testing the LUS methodology for user modeling. LUS employs AQ learning for automatically creating user models from datasets representing activities of computer users. The datasets are stored in a relational database and employed in the learning process through an SQL-style command that automatically executes the AQ20 rule learning program and generates user models. The models are in the form of attributional rulesets that are more expressive than conventional decision rules, and are easy to interpret and understand. Early experiments testing the LUS method gave highly encouraging results.
Multitype Pattern Discovery Via AQ21: A Brief Description of the Method and Its Novel Features
(2006-06) Wojtusiak, Janusz; Michalski, Ryszard S.; Kaufman, Kenneth A.; Pietrzykowski, Jaroslaw
The AQ21 program seeks different types of patterns in data and represents them in human-oriented forms resembling natural language descriptions. Because of the latter feature it is called a natural induction program. This feature is achieved by employing a highly expressive representation language, Attributional Calculus, that combines aspects of propositional, predicate and multi-valued logic for the purpose of supporting pattern discovery and inductive learning. This paper briefly describes the pattern discovery mode in AQ21, and several novel abilities seamlessly integrated in it, specifically, to discover different types of attributional patterns depending on the parameter settings, to optimize patterns according to a large number of different pattern quality criteria, to learn rules with exceptions, to determine optimized sets of alternative hypotheses generalizing the same data, and to handle data with missing, irrelevant and/or not-applicable meta-values. The discovered patterns are expressed in the form of attributional rules that are directly interpretable in natural language and are visualized using either general logic diagrams or concept association graphs. The described program features are illustrated by a sample of pattern discovery problems.
Reasoning with Meta-values in AQ Learning
(2005-06) Michalski, Ryszard S.; Wojtusiak, Janusz
This paper describes methods for reasoning with missing, irrelevant and not-applicable meta-values in AQ attributional rule learning. The methods address issues of handling these values in datasets both for rule learning and rule testing. In rule learning, the presence of these values affects the extension-against generalization operator in star generation, and the rule matching operator. In rule testing, these values affect the execution of the rule matching operator. The presented methods have been implemented in the AQ21 learning program and tested on four datasets.
Semantic and Syntactic Attribute Types in AQ Learning
(2007-11-18) Michalski, Ryszard S.; Wojtusiak, Janusz
AQ learning strives to perform natural induction that aims at deriving general descriptions from specific data and formulating them in human-oriented forms. Such descriptions are in forms closely corresponding to simple natural language statements, or are transformed to such statements in order to make computer-generated knowledge easy to interpret and understand. An important feature of natural induction is that it employs a wide range of types of attributes to guide the process of generalization. Attribute types constitute problem domain knowledge, and are provided by the user, or are inferred by the learning program from the data. This paper makes a distinction between semantic and syntactic attribute types in AQ learning, explains their relationships and provides their classifications. Semantic types depend solely on the structure of attribute domains and help to create plausible generalizations, while syntactic types depend also on physical properties of attribute domains, and are used to efficiently implement semantic types.

Browsing College of Public Health by Subject "AQ learning"