Rice University
Department of Computer Science
presents
Hisham Al-Mubaid
University of Houston-Clear Lake
Machine Learning for Natural Language Processing
Abstract
Most efficient natural language processing (NLP) systems and methods
are based on machine learning approaches. We have devised a new
learning architecture and applied it to a number of NLP problems, including
detection and correction of context-based spelling errors, word prediction,
and text classification. The learning architecture is based on Lsquare, a
classification technique based on learning logic rules that has achieved
competitive performance in a number of classification-based NLP
and data mining problems. The learning method has the following new
properties: (1) A new way of encoding the representation of words. The
features of words are extracted such that, for a given occurrence of a word
w, the structure of the neighborhood of w is recorded in connection with
other occurrences of w and their neighborhoods. (2) A new way of learning
how correct word occurrences can be recognized from prior, correct text.
This step uses the encoding of (1) and the data-mining algorithm Lsquare. It
acquires knowledge (trained classifiers) that contains insight into the
context-based syntactic (and, to some extent, semantic) representation of text
words. (3) A new way of employing the knowledge of (2) in the target NLP
task. The learning method will be explained along with a number of example
applications from the NLP domain.
Tuesday, May 20 at 2:00 p.m. in DH 1049
About Hisham Al-Mubaid
Hisham Al-Mubaid received his master's
and Ph.D. degrees in computer science from
the University of Texas at Dallas in 1996
and 2000, respectively. He is now an assistant
professor in the Department of Computer Science
at the University of Houston-Clear Lake. His
research interests include natural language
processing, machine learning, and operating
and distributed systems.