RapidMiner is ready-made, open-source software that provides advanced analytics with no coding required. It incorporates data mining functions such as data pre-processing, visualisation and predictive analysis.
RapidMiner is free to download (you will need to register for a new account).
Online video tutorials are available on the RapidMiner website.
Weka is a free-to-download collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualisation.
Weka is free to download.
View free online courses on data mining with machine learning techniques.
Orange is a powerful, open-source, Python-based tool for both novices and experts. It has components for machine learning, text mining and data analytics, supports visual programming, and offers add-ons for bioinformatics.
Orange is free to download.
View the YouTube training video.
R, a free software environment for statistical computing and graphics, is one of the leading tools for data mining. Hundreds of packages built specifically for data mining are available for it, and it has strong community support.
Training resources available:
- online tutorials
- ITLC taught courses (enter 'R data' in the filter box)
- online LinkedIn Learning tutorials.
Knime is an open-source data analytics, reporting and integration platform. It covers all three main components of data pre-processing: extraction, transformation and loading. Its GUI lets you assemble nodes for data processing and integrates various components for machine learning and data mining.
Free to download (register for help and updates).
Online training resources are available on the Knime website.
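The three stages mentioned above (extraction, transformation and loading) can also be illustrated outside Knime. The following is a minimal Python sketch using only the standard library; it is not how Knime itself works, and the sample data, table name and function names are all hypothetical, chosen purely for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real source file.
RAW_CSV = """name,score
alice,82
bob,
carol,91
"""


def extract(text):
    """Extraction: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: drop rows with a missing score, tidy names, cast types."""
    cleaned = []
    for row in rows:
        if not row["score"]:
            continue  # skip incomplete records
        cleaned.append((row["name"].title(), int(row["score"])))
    return cleaned


def load(records, conn):
    """Loading: write the cleaned records into a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS scores (name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", records)
    conn.commit()


def run_pipeline():
    """Chain the three stages and return what ended up in the table."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

In a visual tool such as Knime, each of these functions would roughly correspond to one or more nodes connected in a workflow rather than to hand-written code.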
Rattle, ‘R Analytical Tool To Learn Easily’, has been developed using the R statistical programming language. The software runs on Linux, Mac OS and Windows, and features statistics, clustering, modelling and visualisation with the computing power of R. Rattle is currently used in business and commercial enterprises, and for teaching in Australian and American universities.
- A PDF training brochure is available.
Tanagra is free, open-source data mining software for academic and research purposes. It offers several data mining methods drawn from exploratory data analysis, statistical learning, machine learning and databases. It includes some supervised learning methods, as well as other paradigms such as clustering, factorial analysis, parametric and non-parametric statistics, association rules, and feature selection and construction algorithms.
The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that follows current norms for software development in this domain (especially in the design of its GUI and the way it is used) and allows them to analyse either real or synthetic data.
Guidance is available on the tutorial blog.
XLMiner is a comprehensive data mining add-in for Excel, with neural nets, classification and regression trees, logistic regression, linear regression, Bayes classifier, k-nearest neighbours, discriminant analysis, association rules, clustering, principal components, and more. XLMiner lets you sample data from many sources, including PowerPivot, Microsoft/IBM/Oracle databases, or spreadsheets. You can explore and visualise your data with multiple linked charts, pre-process and ‘clean’ your data, fit data mining models, and evaluate your models’ predictive power.
The main drawback of XLMiner is that it is a paid add-in, although a 15-day free trial is available. It is feature-rich, and its integration with Excel makes it convenient to use.
Download a 15-day free trial.
- Online tutorials are available.
- You can learn Excel data-mining with LinkedIn Learning.