Master in Data Science (MDS)

Mining Unstructured Data - Spring 2026

Instructors


Sessions

Date Content Lecturer
Week 1 February 9th Introduction to MUD
Jordi
February 10th Lab Session 1
Introduction to laboratory
Task data & other stuff
Carlos
Week 2 February 16th Document structure and language
Jordi
February 17th Lab Session 2
Guides and code for task 1 (Document structure and language)
Carlos
Week 3 February 23rd Distributional Embeddings
+ Exercises about Distributional Embeddings
Salvador
February 24th Lab Session 3
Link to the Colab code
Carlos
Week 4 March 2nd Distributional embeddings for word classification
+ Exercises about Word Classification
Salvador
March 3rd Lab Session 4
Guidelines and code for task 2 (NN-based NERC)
+ Colab link
Carlos
Week 5 March 9th Contextual embeddings: Recurrent NN Language Models
+ Exercises about RNNs
Salvador
March 10th Lab Session 5
Guidelines and code for task 3 (NN-based DDI)
+ Colab link
Carlos
Week 6 March 16th Contextual embeddings: Transformers
+ Exercises about transformers
Salvador
March 17th Lab Session 6
Carlos
Week 7 March 23rd Large language models
Salvador
March 24th Lab Session 7
Carlos
March 30th and 31st No class (Bank Holidays)
April 6th No class (Bank Holidays)
April 7th No class (FIB midterm exams)
Week 8 April 13th No class (FIB midterm exams)
April 14th Lab Session 8
Carlos
Week 9 April 20th Words: PoS Tagging
Jordi
April 21st Lab Session 9
Carlos
Week 10 April 27th Words: Lexical Semantics
Jordi
April 28th Lab Session 10
Exercise of WordNet similarities
Link to the Colab code
Carlos
Week 11 May 4th Word Sequence: Named Entities and Noun Phrases
+ Exercises about features for word sequence recognition
Jordi
May 5th Lab Session 11
Guidelines for task 4 (ML-based NERC)
Code for task 4
Carlos
Week 12 May 11th Sentence: Constituent Parsing
+ Exercises about constituent parsing
Jordi
May 12th Lab Session 12
Guidelines for task 5 (ML-based DDI)
Code for task 5
Carlos
Week 13 May 20th Sentence: Dependency Parsing
+ Exercises about dependency parsing
Jordi
May 21st Lab Session 13
Carlos
Week 14 May 25th No class (Holiday)
May 26th Lab Session 14
Carlos

Important dates

March 2nd: Delivery of Document structure and language report (lab task 1)
April 28th: Delivery of NN-based NERC+DDI report (lab tasks 2 and 3)
May 26th: Delivery of NN-based vs. ML-based NERC+DDI comparison report (lab tasks 4 and 5)
June 15th: Final exam

Solved Exercises

Exercises about Word Embeddings
Exercises about Word Classification
Exercises about RNNs
Exercises about Transformers
Exercises about Features for Word Sequence Recognition
Exercises about Constituent Parsing
Exercises about Dependency Parsing

Resources

Complementary Readings

Software