Sunday, June 15, 2014

Not all data mining packages are created equal (Comparison of 6 major free tools)

Anyone looking to compare the characteristics, pros and cons of the six most commonly used free data mining software tools should refer to a great expert paper titled "An overview of free software tools for general data mining" by A. Jović, K. Brkić and N. Bogunović (Faculty of Electrical Engineering and Computing, University of Zagreb, Department of Electronics, Microelectronics, Computer and Intelligent Systems, Zagreb, Croatia). The six tools covered extensively in the paper are as follows:
  • RapidMiner
  • R
  • Weka
  • KNIME
  • Orange
  • scikit-learn
The paper compares the algorithms each tool implements across all areas of data mining, such as:
  • classification
  • regression
  • clustering
  • association rules
  • feature selection
  • evaluation criteria
  • visualization
The paper also covers advanced and specialized research topics, such as:
  • big data
  • data streams
  • text mining
In short, it is a treasure trove of information for novices and experts alike who might be wondering whether they chose the right tool.

Comparison of free data mining tools (R, RapidMiner, Weka and more)