2002 AAAI Spring Symposium on Information Refinement and Revision for Decision Making:

Modeling for Diagnostics, Prognostics, and Prediction

Software and Data

Current software

Several commercial and shareware software packages support classification and segmentation techniques that can be used in the context of M&D. This (highly incomplete) list includes decision tree approaches, rule-based approaches, neural approaches, and Bayesian Belief Net approaches.


Decision Tree Approaches

commercial:

  • AC2, provides graphical tools for data preparation and building decision trees.
  • Alice d'Isoft 6.0, a streamlined version of ISoft's decision-tree-based AC2 data-mining product, is designed for mainstream business users. 
  • Business Miner, data mining product positioned for the mainstream business user. 
  • C5.0/See5, constructs classifiers in the form of decision trees and rulesets. Includes latest innovations such as boosting. 
  • CART 4.0 decision-tree software, from winners in the KDD Cup 2000. Advanced facilities for data mining, data pre-processing and predictive modeling including bagging and arcing. 
  • Cognos Scenario, allows you to quickly identify and rank the factors that have a significant impact on your key business measures. 
  • Decisionhouse, provides data extraction, management, pre-processing and visualization, plus customer profiling, segmentation and geographical display. 
  • KnowledgeSEEKER, high performance interactive decision tree analytical tool. 
  • Neusciences aXi.DecisionTree, ActiveX Control for building a decision tree. Handles discrete and continuous problems and can extract rules from the tree. 
  • PolyAnalyst, includes an Information Gain decision tree among its 11 algorithms.
  • SPSS AnswerTree, easy to use package with CHAID and other decision tree algorithms. Includes decision tree export in XML format. 
  • XpertRule Miner (Attar Software), provides graphical decision trees with the ability to embed as ActiveX components. 
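Several of the packages above select splits by information gain or a related entropy measure. The following is a minimal sketch of that criterion only, not the implementation used by any listed product. The toy monitoring records and the attribute names (`vibration`, `temperature`) are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on one discrete attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical monitoring records (illustrative values only).
rows = [
    {"vibration": "high", "temperature": "hot"},
    {"vibration": "high", "temperature": "cool"},
    {"vibration": "low", "temperature": "hot"},
    {"vibration": "low", "temperature": "cool"},
]
labels = ["fault", "fault", "ok", "ok"]

for name in ("vibration", "temperature"):
    print(name, information_gain(rows, labels, name))
```

In this toy data `vibration` separates the two classes perfectly (a gain of 1 bit), while `temperature` carries no information (a gain of 0), so a tree builder using this criterion would split on `vibration` first.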

free:

  • C4.5, the "classic" decision-tree tool, developed by J. R. Quinlan, (restricted distribution) 
  • EC4.5, a more efficient version of C4.5, which uses the best among three strategies when constructing each node.
  • IND, provides Gini and C4.5 style decision trees and more. Publicly available from NASA but with export restrictions. 
  • LMDT, builds Linear Machine Decision Trees (based on Brodley and Utgoff papers). 
  • ODBCMINE, shareware data-mining tool that analyzes ODBC databases using C4.5 and outputs simple IF..ELSE decision rules in ASCII.
  • OC1, decision tree system for continuous feature values; builds decision trees with linear combinations of attributes at each internal node; these trees then partition the space of examples with both oblique and axis-parallel hyperplanes.
  • PC4.5, a parallel version of C4.5 built with Persistent Linda (PLinda) system. 
  • PLUS, Polytomous Logistic regression trees with Unbiased Splits, (Fortran 90). 
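The difference between axis-parallel tests (as in C4.5-style trees) and oblique tests (as in OC1 or LMDT) can be illustrated with a small sketch. This is not code from any of the listed tools. The sample points, the weights, and the helper names are assumptions made for the example.

```python
def axis_parallel_test(x, feature, threshold):
    """Test on a single attribute: the boundary is parallel to an axis."""
    return x[feature] <= threshold


def oblique_test(x, weights, bias):
    """Test on a linear combination of attributes: an arbitrary hyperplane."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias <= 0


def errors(predictions, labels):
    """Number of predictions that disagree with the labels."""
    return sum(p != y for p, y in zip(predictions, labels))


def best_axis_parallel_errors(points, labels):
    """Fewest errors achievable by any single axis-parallel test."""
    best = len(points)
    for feature in range(len(points[0])):
        for threshold in sorted({p[feature] for p in points}):
            preds = [axis_parallel_test(p, feature, threshold) for p in points]
            e = errors(preds, labels)
            # Either side of the split may be assigned to the positive class.
            best = min(best, e, len(points) - e)
    return best


# Hypothetical two-feature examples; the class depends on x0 + x1 <= 1.
points = [(0.2, 0.2), (0.9, 0.0), (0.0, 0.9), (0.9, 0.9), (0.6, 0.6)]
labels = [True, True, True, False, False]

oblique_errors = errors([oblique_test(p, (1.0, 1.0), -1.0) for p in points], labels)
print("oblique:", oblique_errors)
print("best axis-parallel:", best_axis_parallel_errors(points, labels))
```

On these points a single oblique test separates the classes, whereas no single axis-parallel test does, so an axis-parallel tree would need additional levels to represent the same boundary.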

Bayesian Belief Net approaches:

commercial:

  • Analytica, influence diagram-based, visual environment for creating and analyzing probabilistic models (Win/Mac). 
  • AT-Sigma Data Chopper, for analysis of databases and finding causal relationships. 
  • Bayesware Discoverer 1.0, an automated modeling tool able to extract a Bayesian network from data by searching for the most probable model 
  • Data Digest Business Navigator 5, combines Bayesian networks, graphical UI, and data preparation tools.
  • DXpress, Windows based tool for building and compiling Bayes Networks. 
  • Ergo(tm), Bayesian Network Editor and Solver (Win and Mac demos available) 
  • HUGIN, full suite of Bayesian Network reasoning tools 
  • KnowledgeMiner, uses self-organizing neural networks to discover problem structure (Mac platform)
  • Netica, Bayesian network tools (Win 95/NT), demo available.
  • PrecisionTree, an add-in for Microsoft Excel for building decision trees and influence diagrams directly in the spreadsheet 

free:

  • BAYDA 1.0
  • Bayesian belief network software (Win95/98/NT/2000), including 
    • BN PowerConstructor: An efficient system for learning BN structures and parameters from data. Constantly updated since 1997. 
    • BN PowerPredictor: A data mining program for data modeling/classification/prediction. It extends BN PowerConstructor to BN based classifier learning. 
  • FDEP, induces functional dependencies from a given input relation. (GNU C). 
  • GeNIe, decision modeling environment implementing influence diagrams and Bayesian networks (Windows). Has over 2000 users. 
  • JavaBayes 
  • MSBN: Microsoft Belief Network Tools, tools for creation, assessment and evaluation of Bayesian belief networks. Free for non-commercial research users. 
  • Pulcinella, tool for Propagating Uncertainty through Local Computations based on the Shenoy and Shafer framework. (Common Lisp) 
  • SPI, Probabilities, Local Expression Language Utilities, Explanation, Dynamic Models, GUI. (Common Lisp). 
  • RoC (Robust Bayesian Classifier) v 1.0, for MS Windows 9x/NT


Rule-based approach

commercial:

  • AIRA, a rule discovery, data and knowledge visualization tool. AIRA for Excel extracts rules from MS-Excel spreadsheets.
  • Datamite, enables rules and knowledge to be discovered in ODBC-compliant relational databases. 
  • DataDowser, finds IF[AND] THEN association rules; uses fuzzy logic. 
  • PolyAnalyst, builds fuzzy logic classification rules with PolyNet Predictor, SKAT, or Linear Regression.
  • SuperQuery, business intelligence tool; works with Microsoft Access and Excel and many other databases.
  • WizWhy, automatically finds all the if-then rules in the data and uses them to summarize the data, identify exceptions, and generate predictions for new cases. 
  • XpertRule Miner (Attar Software) provides association rule discovery from any ODBC data source. 

free:

  • CBA, mines association rules and builds accurate classifiers using a subset of association rules. 
  • Claudien, a clausal discovery engine 
  • CN2, inductively learns a set of propositional if...then... rules from a set of training examples by performing a general-to-specific beam search through rule-space. 
  • DBPredictor 
  • KINOsuite-PR extracts rules from trained neural networks. 
  • RIPPER, a system that learns sets of rules from data. Fast -- asymptotically O(n*logn*logn), where n is the number of cases. ANSI C, Unix. For research purposes only.

Neural Net Approaches:

commercial:

free and shareware:


Hybrid and other approaches 

commercial:

  • Affinium Model Suite, includes linear regression, logistic regression, CHAID, neural networks, and genetic algorithms.
  • Clementine from SPSS, leading visual rapid modeling environment for data mining. Now includes Clementine Server. 
  • Darwin (now part of Oracle), high-performance data mining software, optimized for parallel servers 
  • KINOsuite PR, extracts rules from trained neural networks.
  • Knowledge Studio, featuring multiple data mining models in a visual, easy-to-use interface. 
  • MarketMiner automatically selects the best mining technique using: Statistical Networks, Logistic and linear regression, K nearest neighbors, and Decision trees (C4.5). 
  • PolyAnalyst, features multiple classification algorithms: Decision Trees, Fuzzy Logic, and Memory Based Reasoning.
  • PredictionWorks, includes decision tree (gini, entropy, C4.5), logistic regression, k nearest-neighbor, naive Bayes and linear regression. Free test over the web!
  • Previa Classpad, provides an interactive environment for classification using neural networks, decision trees, and bayesian networks. 
  • prudsys DISCOVERER: non-linear decision trees (NDTs) and sparse grid methods for classification
  • Datalogic, professional tool for knowledge acquisition, classification, predictive modelling based on rough sets.
  • KXEN, components based on Vapnik's work on SVM.
  • K-DYS, rough set approach
  • Discipulus, uses genetic programming for engineering and business data mining problems.
  • Evolver, genetic programming
  • MARS, J. Friedman's automated logistic regression for binary classification problems. Automatic missing value handling, interaction detection, variable transformation.
  • WINROSA, automatically generates fuzzy rules, based on the fuzzy ROSA method (Rule Oriented Statistical Analysis).

free:

  • BSVM, a decomposition method for support vector machines (SVM) for large classification problems. 
  • LIBSVM, a support vector machines (SVM) library for classification
  • Kernel Machines and related methods website
  • Grobian, user-friendly software to analyse data with rough set technology (C++).
  • Rosetta, a toolkit for analyzing tabular data within the framework of rough set theory. Runs under Windows NT/98/95/2000.
  • Rough Enough, data mining tool that supports the entire process; developed under Paradox DBMS for Windows.
  • PEBLS, a nearest-neighbor learning system designed for symbolic feature values; has been applied to bioinformatic and genetic problems.
  • TiMBL 2.0, nearest neighbour approach.
  • MLC++, a library of C++ classes for supervised machine learning, including multiple classification algorithms.
  • JAM, Java Agents for Meta Learning (applications to fraud and intrusion detection).
  • SIPINA-W, produces decision graphs and trees. Includes several classification methods. (Win). Shareware
  • ROC Convex Hull Program for comparing classifiers, (perl), under GPL.

Useful datasets

Some of the standard datasets used for benchmarking (such as the IRIS dataset) are available from the machine learning data set repository at UCI, accessible at ftp://ftp.ics.uci.edu/pub/machine-learning-databases/
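Many files in the repository are plain comma-separated text with the class label in the last column. The sketch below parses that layout. It assumes the Iris data file follows this convention and works on a few embedded sample rows rather than downloading anything. The names `SAMPLE` and `load_examples` are hypothetical helpers for this example.

```python
import csv
import io

# A few rows in the assumed layout: four numeric features, then the class label.
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def load_examples(text):
    """Parse comma-separated rows into (feature_list, label) pairs."""
    examples = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines, e.g. at the end of the file
        *features, label = row
        examples.append(([float(value) for value in features], label))
    return examples


examples = load_examples(SAMPLE)
for features, label in examples:
    print(label, features)
```

The same function could be applied to the contents of a downloaded data file, provided that file uses the layout assumed here.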

Version 1.0
Updated 9/19/01
