CRISP-DM (CRoss-Industry Standard Process for Data Mining) is a non-proprietary, documented and freely available data mining
Selecting just one of the definitions is not as important as realizing that people will use the term data mining in at least the four ways described in the sidebar.
While data mining tools can identify information that is useful for future planning, data mining can also be very useful for detecting fraudulent activities (theft or invoice manipulation) or high-risk activities (bad loans or investments).
For instance, data mining can help them identify such key elements in a case or series of events as patterns of time and location. By forecasting future events based on this historical data, agencies could potentially anticipate strategic locations for deployment.
Written in C for Windows, Linux or Solaris platforms, NAG Data Mining Components can extract data from flat files or ODBC-compliant databases.
One of the key challenges is integrating tons of data in the mainframe data warehouse with business-specific data in data marts on the open systems client-server platform, said Kent Bauer, director of data mining and analytics at Axa Financial.
Provides best practices for performing data mining using simple tools such as Excel
s enterprise-strength data mining workbench, helps businesses improve the profitability of customer relationships through in-depth understanding of data.
Data mining tools are especially important for marketing departments because marketing efforts succeed when they single out customer segments with high profit potential.
Clients can now fully leverage their IT investments by using the parallelism in high-performance hardware and software (such as multiprocessor or multicore systems), as well as data mining algorithms provided as part of the database systems by IBM, Microsoft and Oracle.
Pentaho was an obvious choice, based on the strength of its data mining technology, its team, and its clear leadership in open source BI.
The author provides a new statistical methodology specifically designed for evaluating the performance of rules that are discovered by data mining, a process in which many rules are back-tested and the best-performing rule or rules are selected.