Hi everyone. I plan to apply to the Erasmus Mundus IPCVAI programme this year. I have finished my motivational letter and would greatly appreciate a review and suggestions for improvement.
Motivational Letter:
I am Khawaja Abdul Mujeeb, a software engineer and AI researcher from Peshawar, Pakistan, applying to the IPCVAI Erasmus Mundus Joint Master's programme. IPCVAI stands out to me for its deliberate progression from image processing foundations through deep visual learning to applied computer vision, exactly the path my research in zero-shot video retrieval has shown I need. With a research paper under peer review and over a year of production AI engineering, I am confident I will contribute to this programme as much as I gain from it.
During my final year at IMSciences, I led the development of VidEx and co-authored a corresponding research paper under the supervision of Prof. Dr. Awais Adnan. VidEx is a cinematic video instance retrieval system requiring no task-specific training, no labelled data, and no GPU hardware. The core contribution was Temporal Majority Voting (TMV): rather than collapsing frame embeddings into a single mean vector, each extracted keyframe independently queries a FAISS index built on CLIP ViT-B/32 embeddings and casts a film-identity vote, and the votes are aggregated into a confidence score. Evaluated on 150 clips across 50 films spanning thirteen languages and multiple cinematic eras, it achieved 98.0% Rank-1 accuracy, zero false accepts, and a 5.80-second average processing time on commodity CPU hardware. More importantly, stress testing exposed its limitations: CLIP's colour-dependent feature space degrades on monochrome and low-key content, and the 50-film evaluation scale leaves scalability fundamentally untested. These are the open problems that drove this application: questions that require a grounding in visual representation theory that software engineering alone cannot provide.
Alongside this research, I spent over a year building and deploying production AI systems at DevBlock Studios: semantic search pipelines, LLM routing systems, and fine-tuned language models, delivering measurable outcomes including 92% routing accuracy, a 60% reduction in training cost, and a 40% reduction in API costs. That engineering experience fed directly into a project I built independently: the Afghan Civil Law Legal AI system, a retrieval-augmented pipeline over 9,000 legal provisions of the Afghan Civil Code that achieves 93% answer accuracy and is designed to make legal knowledge accessible to diaspora communities without legal counsel. That project clarified something for me: the most consequential AI systems are not those that set benchmark records, but those that function reliably where infrastructure is absent.
IPCVAI addresses the precise theoretical gaps my work has exposed. The Deep Learning for Computer Vision 2 module at UAM, specifically its focus on vision-language models, contrastive representation learning, and zero-shot classification, will help me formalise the theoretical foundations behind models like CLIP, which I have worked with experimentally but not yet at a deeper research level. I am particularly eager to engage with Prof. Jesús Bescós at VPULab, whose research on content-based video indexing and video sequence analysis belongs to the academic field my work entered before I knew its name; his publication on semantic-aware scene recognition speaks directly to the representational challenges VidEx encountered. At Bordeaux, Prof. Jenny Benois-Pineau's work on deep learning for video analysis and multimedia AI at LaBRI represents the direction I want to develop toward in Semester 3. The programme's progression, from PPKE's mathematical and algorithmic foundations through UAM's deep learning specialisation to UBx's application-focused advanced modules, is precisely the learning path I need: not to be introduced to visual AI, but to understand it at a depth where I can extend it.
For the Semester 4 thesis, I intend to build directly on VidEx's open problems, investigating robustness to monochrome and low-key cinematography and evaluating zero-shot retrieval at database scales beyond 50 films, ideally through a research placement at VPULab. More broadly, I want to develop IPCV systems that are deployable in resource-constrained environments, not only in well-equipped laboratories. Designing VidEx to reach 98% accuracy on a standard CPU was a deliberate choice rooted in a conviction I have held since building the Afghan Legal AI system: these tools should work where the problem exists, not only where the infrastructure is.
Moving between Budapest, Madrid, and Bordeaux will require genuine cultural and academic adaptability. As an Afghan refugee raised in Pakistan who speaks Dari, Pashto, Urdu, and English fluently, I have navigated diverse cultural landscapes my entire life. IPCVAI is the right next step because my research has already led me toward this field, with real results, clearly identified limitations, and open questions that only this programme can help me answer. I bring both research output and production deployment experience to this cohort, and I am committed to contributing to the IPCVAI research community in return. I am grateful for the committee's consideration.
