Data Mining as a Practical Science


Data mining sits at the intersection of several disciplines. Its roots lie in the data analysis techniques that were originally the main subject of statistics. The fundamental ideas behind estimation theory, classification, clustering, and sampling theory are still among the major ingredients of data mining. Other methods and techniques have since been added to the data analyst's toolbox, extending classical parametric statistics with more complex models; these have matured into the current state of knowledge on decision trees, neural networks, and support vector machines, to mention just a few. In addition, the need to organize and manage large bodies of data has required computer science techniques for database management, query optimization, efficient coding of algorithms, and other tasks concerned with storing information in computer memory and executing algorithms efficiently.

A common trait of modern approaches is the formalization of the estimation and classification problems arising in data mining as mathematical optimization problems, and the use of sound algorithmic techniques to determine optimal solutions to them. This methodological framework has been strongly supported by applied mathematics and operations research (OR), a scientific discipline characterized by a deep integration of mathematical theory and practical problems. Significant evidence of the role of OR in data mining is the contribution that nonlinear and integer optimization methods have made to minimizing the error functions used to train neural networks and support vector machines. Similarly, integer programming and combinatorial optimization have been widely used to solve problems arising in the identification of concise rule-based classification models and in the selection of optimal subsets of features in large datasets.
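To make the idea of "classification as optimization" concrete, here is a minimal sketch, not taken from the article: a linear classifier trained by minimizing a logistic error function with plain gradient descent. The toy dataset, the choice of logistic loss, the learning rate, and all function names are assumptions made purely for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, b, x):
    # Linear score w.x + b for one example.
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def mean_log_loss(w, b, X, y):
    # The error function to be minimized (labels are 0 or 1).
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(sigmoid(score(w, b, xi)), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def train(X, y, lr=0.5, steps=500):
    # Gradient descent on the mean logistic loss (hypothetical settings).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Made-up, linearly separable toy data with two features.
X = [[0.0, 0.2], [0.3, 0.1], [0.2, 0.4], [0.9, 0.8], [0.8, 1.0], [1.0, 0.7]]
y = [0, 0, 0, 1, 1, 1]

loss_before = mean_log_loss([0.0, 0.0], 0.0, X, y)
w, b = train(X, y)
loss_after = mean_log_loss(w, b, X, y)
predictions = [1 if score(w, b, xi) > 0 else 0 for xi in X]
print(loss_before, loss_after, predictions)
```

Training here is nothing more than searching for the weights that minimize an error function. Neural networks and support vector machines pose larger and harder versions of the same kind of problem, which is where the nonlinear optimization methods mentioned above come in.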

Despite its strong methodological character, data mining cannot be applied successfully without a deep understanding of the semantics of each specific problem, which often requires customizing existing methods or developing ad hoc techniques that build partly on existing algorithms. To some extent, the real challenge facing the data mining practitioner is to select, among many different methods and approaches, the one that best serves the purpose of the task at hand, often striking a compromise between the complexity of the chosen model and its generalization capability.
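The compromise between model complexity and generalization can be sketched as follows. This example is an illustration under stated assumptions, not a method described in the article: it uses a k-nearest-neighbor classifier on synthetic one-dimensional data, where a small k gives a more complex model that follows the training data closely and a large k gives a smoother one. The data generator, the noise level, and the candidate values of k are all hypothetical.

```python
import random

def knn_predict(train, x, k):
    # Majority vote among the k training points nearest to x.
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if 2 * votes > k else 0

def accuracy(train, data, k):
    # Fraction of examples in `data` classified correctly.
    hits = sum(1 for x, label in data if knn_predict(train, x, k) == label)
    return hits / len(data)

def make_data(n, rng, noise=0.15):
    # Synthetic data: the true label is 1 when x > 0.5, flipped with
    # probability `noise` to imitate measurement error.
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

rng = random.Random(0)            # fixed seed for reproducibility
train_set = make_data(60, rng)
valid_set = make_data(60, rng)    # held out, never used for fitting

candidates = [1, 3, 7, 15, 31]    # odd values avoid tied votes
train_scores = {k: accuracy(train_set, train_set, k) for k in candidates}
valid_scores = {k: accuracy(train_set, valid_set, k) for k in candidates}
best_k = max(candidates, key=lambda k: valid_scores[k])

for k in candidates:
    print(k, train_scores[k], valid_scores[k])
print("selected k:", best_k)
```

With k = 1 the model reproduces its own training labels perfectly, noise included, so training accuracy alone would always favor the most complex candidate. Scoring each candidate on held-out data is one common way to estimate generalization and choose among them.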



This article was sent to us by: Ralph Dawson at 11302007
