Main teacher, theory classes: Marta Gatius Vila
Problems and laboratory classes: Horacio Rodríguez Hontoria
Office hours
Marta Gatius: Wednesdays from 11 to 13. Omega Building. Office 218
email: gatius at cs.upc.edu
Horacio Rodríguez: Omega Building. Office 316
email: horacio at cs.upc.edu
Brief description of the contents
This course is an introduction to the most relevant problems in Natural Language Processing (NLP), the most relevant techniques and resources used to address them, and the theories they are based on. The course includes an overview of natural language applications.
The course focuses on the two most relevant approaches to NLP: knowledge-based and empirical (both statistical and machine learning).
Students will learn the fundamental concepts of NLP, the best-known techniques and theories, and the most relevant existing resources. They will also learn about the most relevant applications of NLP and the theories, techniques and resources involved in them.
Students will reason (possibly in groups) about several NLP problems and about the techniques and resources that can help solve them. They will also design and develop programs to solve specific problems, which requires selecting the most appropriate techniques and existing resources.
1. Introduction.
2. Applications.
3. Language models.
4. Basic levels of linguistic description.
5. Syntactic processing.
6. Semantic and pragmatic processing.
7. Generation.
1. Introduction
Computational Linguistics and Natural Language Processing (NLP). History, motivation, applications. Main challenges in NLP. Levels of linguistic description.
2. Applications
Multilingual systems. Dialogue. NL interfaces and multi-modal interfaces. Question answering.
Information extraction. Summarization. Information retrieval. Translation.
3. Language models
Statistical language models. Finite-state techniques. Markov models, hidden Markov models and their application to tagging.
4. Basic levels of linguistic description
Basic levels of linguistic description: textual, lexical and morphology processing. Dictionaries and lexicons.
5. Syntactic processing
Formal languages. Grammars. Syntagmatic grammars. Contextual grammars. Logic grammars. Feature grammars. Other recent grammars (GPSG, HPSG).
Basic techniques of syntactic processing.
Obtaining syntactic knowledge. Grammar induction.
6. Semantic and pragmatic processing
Representation of semantic knowledge. Semantic dictionaries. Ontologies.
Semantic interpretation. Semantic disambiguation (WSD).
Discourse. Dialogue. Dialogue grammars. Pragmatics.
7. Generation
Natural language generation. Symbolic and statistical methods.
There are three types of sessions: theory, exercise and laboratory.
In the theory sessions we will introduce new concepts together with the challenges they present and the approaches used to address them.
In the exercise sessions we will work on the concepts, techniques and algorithms introduced in the theory sessions.
In the laboratory sessions students will carry out short practical assignments with appropriate NLP tools, to practice and reinforce the material from the theory sessions.
There will be two exams: a mid-term exam, worth 15% of the final grade, and an end-of-term exam, worth 45%.
Assignments done by the student during the course are worth 40% of the final grade.
The end-of-term exam will cover all the course contents. For students who fail (or do not take) the mid-term exam, the end-of-term exam is worth 60% of their final grade.
In particular, the final grade of the course is calculated as follows:
Course grade = max(mid-term exam grade * 0.15 + end-of-term exam grade * 0.45, end-of-term exam grade * 0.6) + assignments grade * 0.4
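As an illustration only, the grading formula can be sketched in Python. The function name `course_grade`, the parameter names, the 0 to 10 scale used in the example, and treating a mid-term that was not taken as a grade of 0 are all assumptions made for this sketch, not part of the official rules.

```python
def course_grade(midterm, final, assignments):
    """Combine the three component grades (all on the same scale).

    Assumption for this sketch: a mid-term that was not taken is passed as 0.
    """
    # Exam part: the better of (15% mid-term + 45% final) and (60% final alone).
    exams = max(0.15 * midterm + 0.45 * final, 0.60 * final)
    # Assignments always contribute 40% of the final grade.
    return exams + 0.40 * assignments


if __name__ == "__main__":
    # Hypothetical grades on a 0-10 scale.
    print(round(course_grade(midterm=8, final=6, assignments=7), 2))
    print(round(course_grade(midterm=3, final=6, assignments=7), 2))
```

Because of the `max`, a mid-term grade lower than the end-of-term grade never lowers the course grade: in that case the end-of-term exam simply counts for 60%.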
Frequently asked questions about the course grading (with answers).
[6] Allen, J., Natural Language Understanding, Benjamin/Cummings Publishing Company, 1995.
[7] NLTK tutorials, http://nltk.sourceforge.net/lite/doc/en/
[8] Kenneth R. Beesley and Lauri Karttunen, Finite State Morphology, CSLI Publications, 2003.
[9] Peter Jackson and Isabelle Moulinier, Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization, John Benjamins, 2007 (2nd edition).
[10] Andras Kornai, Mathematical Linguistics, Springer Verlag, 2008.
[11] Robert B. Kaplan (ed.), The Oxford Handbook of Applied Linguistics, 2nd edition, Oxford Handbooks in Linguistics, OUP USA, 2010.
NLTK, Natural Language Toolkit
Association for Computational Linguistics (ACL)
ACL Anthology
Python
The Python Papers Anthology
Information Society Technology (IST)
Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)
Oficina del Español en la Sociedad de la Información (OESI)
TALP (UPC)
UPC NLP group
OpenNLP
Stanford University NLP resources page
Mallet, a Java toolbox for statistical NLP developed by Andrew McCallum
WEKA, an integrated machine learning package
LingPipe
1 | 2001 | Luis Alfonso Ureña López | Resolución de la Ambigüedad Léxica en Tareas de Clasificación Automática de Documentos.
2 | 2002 | Jose Luis Vicedo González | Recuperación de información de alta precisión: los sistemas de búsqueda de respuestas.
3 | 2003 | Montserrat Civit Torruella | Criterios de etiquetación y desambiguación morfosintáctica de corpus en español.
4 | 2004 | Anselmo Peñas Padilla | Técnicas lingüísticas aplicadas a la búsqueda textual multilingüe. Ambigüedad, variación terminológica y multilingüismo.
5 | 2005 | Iulia Nica | El conocimiento lingüístico en la desambiguación semántica automática.
6 | 2007 | David Martínez Iraola | Supervised Word Sense Disambiguation: Facing Current Challenges
7 | 2008 | Enrique Amigó | Síntesis de Información: Desarrollo y evaluación de un modelo interactivo
8 | 2009 | Jesús Ángel Giménez Linares | Empirical Machine Translation and its Evaluation
9 | 2010 | Miguel Ángel García Cumbreras | BRUJA: Un sistema de Búsqueda de Respuestas Multilingüe
10 | 2011 | Isabel Segura Bedmar | Application of Information Extraction techniques to pharmacological domain: Extracting drug-drug interactions
11 | 2012 | Fermín L. Cruz Mata | Extracción de Opiniones sobre Características: Un Enfoque Práctico Adaptable al Dominio
12 | 2013 | F. Javier Ortega Rodríguez | Detection of Dishonest Behaviours in On-Line Networks Using Graph-based Ranking Techniques
MIT (Michael Collins) MIT
Toronto (Gerald Penn) Toronto
Johns Hopkins (Jason Eisner) JHU
Massachusetts Amherst (Andrew McCallum) UMass
http://en.wikipedia.org/wiki/User:Stevenbird/List_of_NLP_Courses
session | T/P/L | date | content | material | Recommended readings
1 | T | 15/09/16 | Introduction. Applications of NLP (I) | Introduction, Applications 1 | Part III from [4], Chapters 9 to 14 from [1], [9]
2 | P/L | 15/09/16 | Introduction | |
3 | T | 22/09/16 | Applications of NLP (II). Interfaces. Statistical Models of Language | Applications 2 | Part III from [4], Chapters 9 to 14 from [1], [9], Chapter 4 from [2], [3]
4 | P/L | 22/09/16 | | |
5 | T | 29/09/16 | Lexical Processing. Finite State Models | | Chapter 10 from [4], Chapter 21 from [4], Chapter 2 from [2], Chapter 18 from [4]
6 | P | 29/09/16 | | |
7 | T | 06/10/16 | Morphology | Introduction morphology, Morphology (I) | Chapter 2 from [4], Chapter 3 from [2], [8]
8 | P/L | 06/10/16 | | |
9 | T | 13/10/16 | Tagging | Introduction POS tagging, Tagging | Chapters 5 and 6 from [2]
10 | P/L | 13/10/16 | | |
11 | T | 20/10/16 | Hidden Markov models. Syntax | |
12 | P/L | 20/10/16 | | |
13 | T | 27/10/16 | Syntactic parsing | Parsing1 | Chapter 4 from [1], Chapters 12-13 from [2], Chapter 3 from [3], Chapter 4 from [4]
14 | P/L | 27/10/16 | | |
15 | T | 03/11/16 | Midterm Exam | |
16 | L | 03/11/16 | | |
17 | T | 10/11/16 | Statistical parsing | Parsing4 | Chapter 22 from [1], Chapter 14 from [2]
18 | P/L | 10/11/16 | | |
21 | T | 17/11/16 | Parsing. Problems | |
22 | P/L | 17/11/16 | | |
23 | T | 24/11/16 | Semantics. Problems | Semantics 1 | Chapter 17 from [2]
24 | P/L | 24/11/16 | | |
25 | T | 01/12/16 | Semantics. Problems | Semantics 2 |
26 | P/L | 01/12/16 | | |
27 | T | 15/12/16 | Discourse and pragmatics. Problems | Pragmatics and discourse | Chapter 21 from [2]
28 | P/L | 15/12/16 | | |
29 | T | 22/12/16 | Generation. Problems | Generation |
30 | P/L | 22/12/16 | | |
Solution to the final exam 2012
Solution to the final exam 2014