Ahmed Sabir

I'm a postdoctoral researcher interested in language grounding, visual semantics, and semantic similarity. I obtained my Ph.D. in 2020 from the Department of Computer Science at Universitat Politècnica de Catalunya BarcelonaTech, Spain, where I worked with the TALP Research Group under the supervision of Prof. Lluís Padró and Dr. Francesc Moreno Noguer (CSIC-UPC). I did my MSc at the Masuda AI lab at the Kanagawa Institute of Technology under the supervision of Prof. Michiko Matsuda and Hiroshi Tanaka. I'm now a member of the TALP Research Center, and my advisor is Prof. Lluís Padró.

My previous research was on gesture recognition, sign languages, and OCR correction via visual semantics. Now, I'm working on language grounding in image caption generation. Earlier, I worked as a measurement electronic specialist (Class 3/A), handling programming, service maintenance, and testing of reservoir mapping-while-drilling tools.

Email  /  CV  /  Scholar  /  Github  /  Huggingface  /  Blog  /  Twitter  /  LinkedIn


I'm interested in Natural Language Processing, Computer Vision, and Machine Learning, especially the intersection between language, vision, and scene understanding. My current research focuses on adding visual context information to enhance image captioning without re-training.

Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
CVPRW 2023 (Spotlight)
paper / project page / code / video / slide / poster / demo / huggingface

COCO-based textual dataset for visual context information.

Word to Sentence Visual Semantic Similarity: Lessons learned
Ahmed Sabir
MVA 2023
paper / project page / poster / slide / video / appendix / blog post

Investigating BERT+GloVe for visual grounding via textual semantic similarity for image captioning.

Belief Revision based Caption Re-ranker with Visual Semantic
Ahmed Sabir, Francesc Moreno Noguer, Pranava Madhyastha, Lluis Padro
paper / project page / slide / code / poster / demo

Visual semantic re-ranker for image caption generation

Context-aware Text Recognition in the Wild
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
arxiv / appendix / BERT code / sense embedding code

An extended version of the neural approach and comparison with BERT and knowledge-base-embedding (sense embedding).

Textual Visual Semantic Dataset for Text Spotting
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
CVPRW 2020
paper / project page / slide / youtube / code

Textual visual dataset for text spotting

Semantic Relatedness Based Re-ranker for Text Spotting
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
EMNLP 2019
paper / project page / blog post / code / colab

A neural approach that learns semantic relatedness between word-to-word and word-to-sentence pairs, used as a visual re-ranker for text spotting (OCR in the wild).

Visual Re-ranking with Natural Language Understanding for Text Spotting
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
ACCV 2018 (Oral Presentation) (top 4.5% submission)
paper / project page / code / colab

Word-level Natural Language Understanding based visual re-ranker for Text Spotting

Visual Semantic Re-ranker for Text Spotting
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
arxiv 2018

Word-level Visual Semantic Re-ranker

Enhancing Text Spotting with a Language Model and Visual Context Information
Ahmed Sabir, Francesc Moreno Noguer, Lluis Padro
CCIA 2018 (Oral Presentation), (National Conference)
arxiv / slide / ICDAR2017 Challenge

Language and visual context based re-rankers as a simple and fast post-processing approach to improve scene text recognition for any pre-trained OCR model.

Enhancing Text Spotting with Visual Context Information
Ahmed Sabir
International Conference on Document Analysis and Recognition (ICDAR17)
arxiv / poster

ICDAR 2017 Doctoral Consortium

Ahmed Sabir
Preprint 2013  
arxiv / blog / poster

Developing a Japanese semaphore recognition system.

手旗信号認識への Kinect 適用の検討とその評価
Ahmed Sabir, Yoshiki Ito, Yasuhiro Sudo, Hiroshi Tanaka, Michiko Matsuda
IEICE 2012 (Oral Presentation)
slide / bibtex / blog / poster

Investigation and evaluation of applying Kinect to hand-flag signaling (semaphore) recognition for the Japanese language.


Hackathon Mobility BCN
Ahmed Sabir, Jaime Sendra, José Umberto Gamboa, Juan Pedro, Carlos Valencia, 2017
Driving Experience Hardware Winner, Team Vegeta  
slide / cm / video / picture

Seat/Carnet Mobility Hackathon (48hrs).

Useful links

Unix for Poets by Kenneth Ward Church

AWK cheat sheets

LaTeX for Linguists

My LaTeX Ph.D. template

This guy makes nice websites 👉HERE👨‍💻