The Theory and Practice of Secure Data Mining

Data mining isn't always about structured data. Text mining -- or text data mining -- is about comprehending natural language and extracting high-quality information from it. Natural languages have structure, too, but that structure is generally more complex than a schema, especially one designed for data mining.

As you read a sentence, its meaning may be clear even before you reach its end. This illustrates our topic. Our minds process text sequentially. As we read, the context an author presents develops in our minds. What precedes clarifies what follows, and vice versa.

This phenomenon is a result of efficiency. It's how language works. Reducing the number of symbols we use simplifies communication in one sense, but it also forces us to adopt complications like words and grammar. Few of us write with hieroglyphs anymore.

Consequently, we render our thoughts in the form of longer streams of consciousness, like this paragraph. The more reading we do, the better we are at predicting what's looming ahead. Yet, we still must let our author's picture become complete in our mind before we are sure we "get" the meaning.
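The idea that more reading makes us better at predicting what comes next can be sketched in a few lines of Python. This is only an illustration, not a technique named in this article: it counts which word follows which in a tiny made-up corpus (a simple bigram count) and guesses the most frequent follower. The function names and the sample corpus are hypothetical.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    follows = defaultdict(Counter)
    tokens = tokenize(text)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, invented purely for illustration.
corpus = "the cat sat on the mat. the cat ate the fish. the cat slept."
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often here
print(predict_next(model, "dog"))  # never seen, so there is no prediction
```

With so little text the guesses are crude, and a word the model has never seen yields no guess at all. Feeding it more text sharpens the counts, which loosely mirrors the point above about reading more.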

