The Cognitive Computation Group at the University of Illinois has a number of Natural Language Processing (NLP) demos. They are fun to browse, and all of them do interesting things with plain text.
Data-Intensive Text Processing with MapReduce is a free online (PDF) textbook about text processing on large amounts of data. The 1st edition has been available for a couple of years, and a 2nd edition is in the works. Here is a quick overview of some of the topics.
- Graph Algorithms
- Text Processing
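
To give a flavor of the MapReduce style, here is a minimal word-count sketch in plain Python. It is not taken from the textbook and it does not use Hadoop or any real MapReduce framework: it only simulates the map, shuffle, and reduce steps in a single process. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are made up for illustration.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, 1) pair for every whitespace-separated token."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(docs):
    """Run the three steps over a dict of {doc_id: text} (hypothetical input shape)."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    sample = {"d1": "a b a", "d2": "B c"}
    print(word_count(sample))
```

In a real cluster the map and reduce functions would run on many machines and the framework would handle the shuffle. The sketch only shows the shape of the computation.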
Happy Reading (and Text Processing)!