Creator: Marko Bohanec
Donors to UCI ML Repository: Marko Bohanec,
Blaz Zupan
Date: June, 1997
The hierarchical decision model, from which this dataset is derived,
was first presented in
M. Bohanec and V. Rajkovic: Knowledge acquisition and explanation for multi-attribute
decision making. In 8th Intl Workshop on Expert Systems and their Applications,
Avignon, France. pages 59-78, 1988.
Within machine learning, this dataset was used for the evaluation of HINT
(Hierarchy INduction Tool). The results are presented in
B. Zupan, M. Bohanec, I. Bratko, J. Demsar (1997) Machine
learning by function decomposition. In (D. Fisher, ed.) Proc.
ICML-97, pages 421-429. Morgan-Kaufmann.
and show that HINT is able to completely reconstruct the original hierarchical model. The paper further compares the generalization capability of HINT and C4.5 by means of learning curves, in which p is the percentage of examples used for learning and the y axis shows the classification accuracy obtained when all remaining examples are classified.
The Car Evaluation Database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990). The model evaluates cars according to a hierarchical attribute structure.
The features used in the structure are listed below; each leading dot marks one level of nesting:

Feature | Description |
---|---|
CAR | car acceptability |
. PRICE | overall price |
. . buying | buying price |
. . maint | price of the maintenance |
. TECH | technical characteristics |
. . COMFORT | comfort |
. . . doors | number of doors |
. . . persons | capacity in terms of persons to carry |
. . . lug_boot | the size of luggage boot |
. . safety | estimated safety of the car |

The features can take the following sets of values:

Feature | Values |
---|---|
CAR | unacc, acc, good, v-good |
. PRICE | v-high, high, med, low |
. . buying | v-high, high, med, low |
. . maint | v-high, high, med, low |
. TECH | poor, satisf, good, v-good |
. . COMFORT | bad, acc, good, v-good |
. . . doors | 2, 3, 4, 5-more |
. . . persons | 2, 4, more |
. . . lug_boot | small, med, big |
. . safety | low, med, high |

The model includes three intermediate concepts (PRICE, TECH, COMFORT). In the original model, every higher-level feature is related to its lower-level descendants by a set of examples. The sets of examples that define the three intermediate concepts and the target concept CAR are listed in the tables at the end of this description.
The Car Evaluation Database contains examples with the structural information
removed, i.e., it directly relates CAR to the six input attributes buying,
maint, doors, persons, lug_boot, and safety. Because the underlying concept
structure is known, this database may be particularly useful for testing
constructive induction and structure discovery methods.
Number of Instances: 1728 (instances completely cover the attribute
space)
Number of Attributes: 6
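
The statement that the 1728 instances completely cover the attribute space can be checked by enumerating every combination of the six input attributes. The following is a minimal Python sketch that uses the value sets listed in this description; the variable names are illustrative and not part of the dataset.

```python
from itertools import product

# Value sets of the six input attributes, as listed in this description.
domains = {
    "buying": ["v-high", "high", "med", "low"],
    "maint": ["v-high", "high", "med", "low"],
    "doors": ["2", "3", "4", "5-more"],
    "persons": ["2", "4", "more"],
    "lug_boot": ["small", "med", "big"],
    "safety": ["low", "med", "high"],
}

# Every combination of attribute values is one point of the attribute space.
attribute_space = list(product(*domains.values()))
print(len(attribute_space))  # 4 * 4 * 4 * 3 * 3 * 3 = 1728
```

Since the number of combinations equals the number of instances, each combination of attribute values appears in the database exactly once.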
Class distribution:
Class | N | N[%] |
---|---|---|
unacc | 1210 | 70.023% |
acc | 384 | 22.222% |
good | 69 | 3.993% |
v-good | 65 | 3.762% |
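
The percentages in the table follow directly from the class counts. As a quick check, the short Python sketch below recomputes them; the variable names are illustrative.

```python
# Class counts as given in the class distribution table.
class_counts = {"unacc": 1210, "acc": 384, "good": 69, "v-good": 65}

total = sum(class_counts.values())  # 1728 instances in total
percentages = {label: round(100 * n / total, 3) for label, n in class_counts.items()}

for label, pct in percentages.items():
    print(f"{label:8s}{class_counts[label]:6d}{pct:9.3f}%")
```

The distribution is strongly skewed: about 70% of the instances belong to the class unacc, while good and v-good together account for less than 8%.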

CAR as defined by PRICE and TECH:

PRICE | TECH | CAR |
---|---|---|
v-high | poor | unacc |
high | poor | unacc |
med | poor | unacc |
low | poor | unacc |
v-high | satisf | unacc |
high | satisf | unacc |
med | satisf | acc |
low | satisf | acc |
v-high | good | unacc |
high | good | acc |
med | good | acc |
low | good | good |
v-high | v-good | unacc |
high | v-good | acc |
med | v-good | v-good |
low | v-good | v-good |

COMFORT as defined by doors, persons, and lug_boot:

doors | persons | lug_boot | COMFORT |
---|---|---|---|
2 | 2 | small | bad |
3 | 2 | small | bad |
4 | 2 | small | bad |
5-more | 2 | small | bad |
2 | 4 | small | acc |
3 | 4 | small | acc |
4 | 4 | small | acc |
5-more | 4 | small | acc |
2 | more | small | bad |
3 | more | small | acc |
4 | more | small | acc |
5-more | more | small | acc |
2 | 2 | med | bad |
3 | 2 | med | bad |
4 | 2 | med | bad |
5-more | 2 | med | bad |
2 | 4 | med | acc |
3 | 4 | med | acc |
4 | 4 | med | good |
5-more | 4 | med | v-good |
2 | more | med | acc |
3 | more | med | good |
4 | more | med | v-good |
5-more | more | med | v-good |
2 | 2 | big | bad |
3 | 2 | big | bad |
4 | 2 | big | bad |
5-more | 2 | big | bad |
2 | 4 | big | good |
3 | 4 | big | good |
4 | 4 | big | v-good |
5-more | 4 | big | v-good |
2 | more | big | good |
3 | more | big | v-good |
4 | more | big | v-good |
5-more | more | big | v-good |

PRICE as defined by buying and maint:

buying | maint | PRICE |
---|---|---|
v-high | v-high | v-high |
high | v-high | v-high |
med | v-high | high |
low | v-high | high |
v-high | high | v-high |
high | high | high |
med | high | high |
low | high | med |
v-high | med | high |
high | med | high |
med | med | med |
low | med | low |
v-high | low | high |
high | low | high |
med | low | low |
low | low | low |

TECH as defined by COMFORT and safety:

COMFORT | safety | TECH |
---|---|---|
bad | low | poor |
acc | low | poor |
good | low | poor |
v-good | low | poor |
bad | med | poor |
acc | med | satisf |
good | med | good |
v-good | med | good |
bad | high | poor |
acc | high | good |
good | high | v-good |
v-good | high | v-good |
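
The four tables above are sufficient to evaluate any car bottom-up through the hierarchy: buying and maint give PRICE; doors, persons, and lug_boot give COMFORT; COMFORT and safety give TECH; and PRICE and TECH give CAR. The following Python sketch illustrates this under the assumption that each table is encoded by its output column only, in the row order shown above (first column varying fastest). The helper names `make_table` and `evaluate` are hypothetical and are not part of DEX or HINT.

```python
from collections import Counter
from itertools import product

LEVELS4 = ["v-high", "high", "med", "low"]   # buying, maint, PRICE
DOORS = ["2", "3", "4", "5-more"]
PERSONS = ["2", "4", "more"]
LUG_BOOT = ["small", "med", "big"]
SAFETY = ["low", "med", "high"]
COMFORT_LEVELS = ["bad", "acc", "good", "v-good"]
TECH_LEVELS = ["poor", "satisf", "good", "v-good"]


def make_table(outputs, *domains):
    """Build a lookup table from the output column of one table above.

    `domains` are the input columns in table order; rows are generated so
    that the first column varies fastest, matching the tables' row order.
    """
    rows = list(product(*reversed(domains)))
    outs = outputs.split()
    assert len(rows) == len(outs), "one output per input combination"
    return {row[::-1]: out for row, out in zip(rows, outs)}


# (buying, maint) -> PRICE
PRICE = make_table(
    "v-high v-high high high  v-high high high med "
    "high high med low  high high low low",
    LEVELS4, LEVELS4)

# (doors, persons, lug_boot) -> COMFORT
COMFORT = make_table(
    "bad bad bad bad  acc acc acc acc  bad acc acc acc "
    "bad bad bad bad  acc acc good v-good  acc good v-good v-good "
    "bad bad bad bad  good good v-good v-good  good v-good v-good v-good",
    DOORS, PERSONS, LUG_BOOT)

# (COMFORT, safety) -> TECH
TECH = make_table(
    "poor poor poor poor  poor satisf good good  poor good v-good v-good",
    COMFORT_LEVELS, SAFETY)

# (PRICE, TECH) -> CAR
CAR = make_table(
    "unacc unacc unacc unacc  unacc unacc acc acc "
    "unacc acc acc good  unacc acc v-good v-good",
    LEVELS4, TECH_LEVELS)


def evaluate(buying, maint, doors, persons, lug_boot, safety):
    """Evaluate one car bottom-up through the concept hierarchy."""
    price = PRICE[buying, maint]
    comfort = COMFORT[doors, persons, lug_boot]
    tech = TECH[comfort, safety]
    return CAR[price, tech]


print(evaluate("low", "low", "4", "4", "big", "high"))  # v-good

# Evaluate every point of the attribute space and count the classes.
counts = Counter(
    evaluate(*car)
    for car in product(LEVELS4, LEVELS4, DOORS, PERSONS, LUG_BOOT, SAFETY)
)
print(dict(counts))
```

Enumerating all 1728 attribute combinations in this way yields 1210 unacc, 384 acc, 69 good, and 65 v-good cars, which agrees with the class distribution reported for the database.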