Classification Rule Extraction by Ant-Miner for Weed Risk Assessment
Weed risk assessment (WRA) models developed by Pheloung et al. (1999) and Daehler et al. (2004) allow an informed decision to be made before potentially invasive plant species are introduced into a country. In this study, Ant-Miner, a data mining tool, is used to develop classification rules for the WRA models of Australia and of Hawaii and the Pacific. Ant-Miner (Parpinelli et al., 2002) is based on Ant Colony Optimization (ACO), a metaheuristic inspired by the foraging behaviour of ant colonies. Its objective is to solve discrete optimisation problems and extract classification rules by simulating the behaviour of ants. In this study, Ant-Miner identifies the shortest pathway through a set of nodes, i.e., the 50 questions of the WRA, towards the destination, i.e., a single decision, described by yes, no or blank answers, that assigns a species to a class in the WRA models: low risk, high risk (reject), or evaluate (more information required). During the search it has to overcome ant-behaviour problems such as dead ends, loops, returning to the root and the evaporation of pheromones (Figure 1). The purposes of detecting the dominant pathway are: 1) to understand how plant risk is decided from the answers to the questions in the current WRA model, and 2) to understand how the decision process in the WRA criteria differs among regions and climates, e.g., Australia versus Hawaii and the Pacific. Ant-Miner is found to be an effective alternative data mining tool, since it obtained reasonably high classification accuracy under 10-fold cross-validation, in particular for the Hawaii and Pacific Island WRA model (81±1.24%) and, to a lesser extent, for the Australia WRA model (71±2.26%).
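The pheromone mechanics mentioned above can be illustrated with a minimal sketch. It assumes the general scheme described by Parpinelli et al. (2002), in which a term (here a question/answer pair) is chosen with probability proportional to pheromone times a heuristic value, rule quality is sensitivity times specificity, and evaporation is simulated by normalisation. The function names, the term encoding and the data structures are illustrative assumptions, not the implementation used in this study.

```python
import random


def rule_quality(tp, fp, fn, tn):
    """Rule quality as sensitivity x specificity (0.0 when a rate is undefined)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity * specificity


def choose_term(pheromone, heuristic, used_questions, rng):
    """Pick one (question, answer) term with probability proportional to
    pheromone * heuristic, skipping questions already used in the rule."""
    candidates = [t for t in pheromone if t[0] not in used_questions]
    weights = [pheromone[t] * heuristic[t] for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def update_pheromone(pheromone, rule_terms, quality):
    """Reinforce the terms used in a rule, then normalise all values.
    Normalisation lowers the share of unused terms, which plays the
    role of pheromone evaporation."""
    updated = dict(pheromone)
    for term in rule_terms:
        updated[term] += updated[term] * quality
    total = sum(updated.values())
    return {term: value / total for term, value in updated.items()}


if __name__ == "__main__":
    # Hypothetical terms: two WRA-style questions, each answered yes or no.
    terms = [("q1", "yes"), ("q1", "no"), ("q2", "yes"), ("q2", "no")]
    pheromone = {t: 1.0 / len(terms) for t in terms}
    heuristic = {t: 1.0 for t in terms}  # uniform heuristic, for illustration only
    rng = random.Random(0)
    first = choose_term(pheromone, heuristic, set(), rng)
    quality = rule_quality(tp=8, fp=2, fn=2, tn=8)
    pheromone = update_pheromone(pheromone, [first], quality)
    print(first, pheromone)
```

Terms that appear in good rules accumulate pheromone and are chosen more often by later ants, which is how a dominant pathway through the questions can emerge.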
The rules extracted by Ant-Miner suggest that high-risk species are assessed mostly on the following key factors: for Australia, whether the species has naturalized beyond its native range and reproduces by vegetative propagation; for Hawaii and the Pacific, whether the species has naturalized beyond its native range and is congeneric, but not parasitic. Ant-Miner detects that the dispersal mechanism is an important factor for the classes low and evaluate in both the Australia and the Hawaii and Pacific WRA models. In contrast, in both WRA models the question about plant type was found to be less significant for plant risk assessment. Overall, the reproduction process for Australia and the location of the weed for Hawaii and the Pacific are detected to be important factors for plant risk assessment. Identifying influential factors in weed risk helps improve cost-effective biosecurity assessment by highlighting important questions and by modifying, or perhaps removing, unimportant questions in the current WRA model to increase overall accuracy. This study should encourage further investigation with larger data sets from different regions to add knowledge that helps improve the WRA models.
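As an illustration of how extracted rules of this kind could be applied, the sketch below evaluates an ordered rule list, the general output form of Ant-Miner, against the answers for one species. The question keys, the two example rules (paraphrased from the factors summarised above) and the default class are hypothetical and are not the actual rules produced in this study.

```python
def classify(answers, rules, default):
    """Return the class of the first rule whose conditions all match.

    answers: dict mapping a question key to "yes", "no" or None (blank).
    rules:   ordered list of (conditions, label) pairs, where conditions
             maps question keys to the required answer.
    default: class returned when no rule fires.
    """
    for conditions, label in rules:
        if all(answers.get(q) == a for q, a in conditions.items()):
            return label
    return default


# Hypothetical rule lists; the keys are invented names for WRA questions.
AUSTRALIA_RULES = [
    ({"naturalized_beyond_native_range": "yes",
      "vegetative_propagation": "yes"}, "reject"),
]
HAWAII_PACIFIC_RULES = [
    ({"naturalized_beyond_native_range": "yes",
      "congeneric_weed": "yes",
      "parasitic": "no"}, "reject"),
]

if __name__ == "__main__":
    species = {
        "naturalized_beyond_native_range": "yes",
        "vegetative_propagation": "yes",
        "congeneric_weed": "no",
        "parasitic": "no",
    }
    # "evaluate" as the fallback class is an assumption made for this sketch.
    print(classify(species, AUSTRALIA_RULES, default="evaluate"))
    print(classify(species, HAWAII_PACIFIC_RULES, default="evaluate"))
```

Because the rules are checked in order and the first match decides the class, a short rule list of this kind makes it easy to see which questions actually drive a decision in each region.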