Language Technology: Speech and Language Processing

This course is cross-listed between the Linguistics Graduate Program and the Computer Science Doctoral Program.
The course number for Linguistics is 83600, CRN code 17382.
The course number for Computer Science is 84010, CRN code 18281.

Spring 2012
Fridays, 11:45am to 1:45pm
Graduate Program in Linguistics and Doctoral Program in Computer Science
CUNY Graduate Center

Instructor

Dr. Matt Huenerfauth
Associate Professor
Computer Science, CUNY Queens College
Computer Science, CUNY Graduate Center
http://eniac.cs.qc.cuny.edu/matt

Description

Applications of speech and language processing are found everywhere today. Automated telephone systems, for example, incorporate voice recognition and synthesis. This course will explore how computers deal with natural language in such areas as speech recognition, speech generation, and machine translation. Intended as an introduction to the field, the course will survey a range of methodologies in speech and language processing and will cover the basic components of natural language systems, including the lexicon, syntax and parsing, semantic analysis and representation, discourse processing, and pragmatics.

Research Interests

This course would be excellent for students who may be interested in research in computational linguistics or natural language processing (NLP). Specific research areas surveyed briefly in this course will include: Natural Language Processing, Natural Language Generation, Statistical Parsing, Speech Technologies, Machine Translation, Information Extraction, Automatic Summarization, and others.

The NLP Research Community at CUNY

There is a vibrant community of researchers in natural language processing at CUNY; this includes several laboratories, funded research projects, students, and seminar series. Taking this course could be a good way to become more involved in this community by learning about the key concepts in the field. You can visit the NLP at CUNY website for more information.

Textbook

We will be using the following textbook for the course: Speech and Language Processing, Second Edition, by Daniel Jurafsky and James H. Martin, Pearson / Prentice Hall, 2008. ISBN: 978-0131873216. Retail Price: $90.

Assignments and Grading

There will be short homework assignments every 1 to 2 weeks. There will also be a term project in two parts: a bibliography and proposal due in the middle of the semester, and a final report or deliverable due at the end. Depending on the number of students and how quickly the lecture schedule progresses, we may also have in-class presentations at the end of the semester.

Course Topics

This is a tentative list of the topics to be covered in this course:

Prerequisites

The course is open to graduate students with a solid background in either linguistics or computing; knowledge of both is not required. Students with minimal computer programming background are advised to take Linguistics-73600 first to learn programming skills. Linguistics-83800 is a recommended co-requisite for students with minimal programming skills. Computer science students, or students with more programming skills, will have the option of doing a programming-based rather than a research-based term project.