Text Mining

What is text mining?

Text mining and text analysis are blanket terms for analyzing documents (books, tweets, news reports, etc.) with the aid of software. Text analysis is a methodological approach and is discipline-agnostic. It is performed on corpora: collections of machine-readable text that are designed to answer specific kinds of questions.
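As a rough illustration of what "analyzing documents with the aid of software" can look like, here is a minimal sketch that counts word frequencies across a tiny corpus. The function name `term_frequencies`, the sample sentences, and the simple tokenization rule are all assumptions made for this example; real projects typically use larger corpora and more careful text processing.

```python
import re
from collections import Counter

def term_frequencies(corpus):
    """Count how often each lowercased word appears across all documents."""
    counts = Counter()
    for text in corpus:
        # Very simple tokenizer (an assumption): runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Hypothetical two-document corpus, for illustration only.
corpus = [
    "The library licenses many databases.",
    "Some databases permit text mining; some do not.",
]
freqs = term_frequencies(corpus)
print(freqs.most_common(3))
```

Word counts are only one of many possible techniques; the same corpus could feed other kinds of analysis, depending on the question being asked.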

A corpus can cover: 

Do I need permission to text mine?

A library subscription DOES NOT imply that text mining is permitted. Some licenses include text mining language; others require you to request permission from the provider.

Regardless of licensing permissions, some text mining techniques can create server issues for providers. Make sure the method you plan to use follows the provider's preferences. Some preferred methods may also require assistance from the provider.
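One common way to reduce load on a provider's servers is to pause between requests. The sketch below shows that idea only; the function name `fetch_politely`, the one-second default delay, and the stand-in `fake_fetch` function are assumptions for illustration, not any provider's stated requirement. Always confirm the acceptable request rate, or whether the provider prefers an API or bulk delivery, before collecting anything.

```python
import time

def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` is any caller-supplied function that retrieves one document.
    `delay_seconds` is an assumed default; use the rate the provider specifies.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results.append(fetch(url))
    return results

# Usage with a stand-in fetch function, so no network access occurs.
def fake_fetch(url):
    return "contents of " + url

docs = fetch_politely(["doc-1", "doc-2"], fake_fetch, delay_seconds=0.1)
print(docs)
```

Passing the fetch function in as a parameter keeps the throttling logic separate from whatever access method the provider actually approves.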

Details to communicate:

  • Define the types of information being mined.
  • Define the method to be used.
  • Is it a one-time occurrence or ongoing? (If ongoing, at what frequency?)

We are happy to advise on the permission letter or to explain what our license with a specific provider permits.

How we can help

The UCSC Library can provide consultations about software and about the copyright permissions you may need to request.