Text and data mining (TDM) refers to a range of methodologies developed to work with very large volumes of text or data. Content is copied and then analysed programmatically to reveal patterns, links, trends and other insights. Programming languages such as Python and R are commonly used for this analysis.
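As a rough illustration of what programmatic analysis can look like, the Python sketch below counts how often each word appears in a small corpus and in how many documents it occurs. The three sample documents and the function names are invented for this example; a real project would load content that has been copied with the owner's permission, and would usually add steps such as stop-word removal.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
corpus = {
    "doc1": "Text mining reveals patterns in large text collections.",
    "doc2": "Data mining reveals trends in large data sets.",
    "doc3": "Researchers use mining tools to study text and data.",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(documents):
    """Count how often each term occurs across all documents."""
    counts = Counter()
    for text in documents.values():
        counts.update(tokenize(text))
    return counts


def document_frequencies(documents):
    """Count the number of documents in which each term occurs."""
    counts = Counter()
    for text in documents.values():
        # A set ensures each term is counted once per document.
        counts.update(set(tokenize(text)))
    return counts


tf = term_frequencies(corpus)
df = document_frequencies(corpus)

# Show total occurrences and document coverage for a few terms.
for term in ("mining", "text", "data", "reveals"):
    print(term, tf[term], df[term])
```

Simple counts like these are often a first step before looking for links or trends, for example by comparing frequencies across time periods or sources.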
For the purposes of academic research there are two types of TDM:
- The first type is where researchers are mainly interested in finding and accessing content whose owners allow the use of mining techniques. Content needs to be located, questions of copyright permission resolved, and the content then copied to create a corpus. After that stage, the choice of tools is up to the researcher. Many academic journals allow this type of access as long as permission is sought in advance.
- The second type of TDM is where owners of content not only allow access but also provide their own specially developed tools to support mining. These tools often take the form of an application programming interface (API). Examples of this type of mining resource currently subscribed to by the Bodleian include:
Other examples of data mining software are listed in the Data Analysis section of this site. A team within the Bodleian is responsible for developing and managing support for researchers, and a TDM library guide has recently been published. If you have suggestions for additional mining products or would like support for a research project, contact the Bodleian Data Librarian.