About me

I am a Master’s candidate in Visual Computing at Saarland University, with a focus on computer vision and machine learning. My work centers on building intelligent systems that connect visual data with structured language, particularly in real-world applications.

For my thesis, titled "Report Generation for Eye Surgeries: Bridging Video Understanding and Language Models", I developed a pipeline for automated German-language insurance report generation from ophthalmic surgery data. The work combines surgical video understanding, large language models, and retrieval-augmented generation (RAG), with a focus on generating clinically structured and reliable reports.

I also worked as a Research Assistant at the German Research Center for Artificial Intelligence (DFKI) in the Cognitive Assistants department. There, I contributed to the EEGain project on EEG-based emotion recognition and supported MLOps and deployment workflows for the NFDI-MatWerk project, focusing on scalable and reproducible research pipelines.

My interests lie in applied machine learning and computer vision, particularly in developing robust and scalable systems that bridge perception and language.

What I like to do

  • 3D modelling in Blender and Autodesk

    I love 3D modelling and stylistic scene creation in Blender. I also do some CAD work for designing mechanical keyboards as a personal hobby.

  • Coding

    I like to write code related to art, whether it's cartoon-like effects or a basic 3D renderer.

  • Gaming

    I like to play visually stunning open-world games to get creative ideas for my 3D modelling. I also like to play FPS games.

  • Photography

    I love to capture the world with my beloved Pixel.

Resume

Education

  1. Saarland University, Germany

    2023 — Present

    Master’s candidate in Visual Computing (thesis submitted, March 2026), with a focus on computer vision and machine learning. My studies emphasized image processing, machine learning, and high-level computer vision, providing a strong foundation in visual data analysis and modeling.

  2. University of Petroleum and Energy Studies, India

    2017 — 2021

    Completed my undergraduate degree in Computer Science with a specialization in Business Analytics and Optimization, graduating with a grade of 8.91. I was awarded the Silver Medal for being the top student in my specialization.

Experience

  1. Research Assistant (DFKI)

    October, 2024 — March, 2026

    At DFKI, I worked in the Cognitive Assistants department on applied machine learning projects. I contributed to the EEGain project, a one-stop framework for emotion recognition from EEG data, and co-authored a research paper that is currently under review (pre-print). I also worked on the NFDI-MatWerk project, focusing on MLOps and deployment workflows, including automating notebook-based pipelines using Docker and Voila.

  2. Tutor - Digital Signal Processing (Saarland University)

    April, 2025 — July, 2025

    As a tutor for the Digital Signal Processing course, I conducted weekly sessions to help students understand complex concepts and solve problem sets. I was responsible for designing and grading assignments, as well as supporting students in their exam preparation through targeted guidance and additional learning resources.

  3. Summer Intern (Rhocron)

    May, 2020 — July, 2020

    During my internship at Rhocron, I worked on the project "Low-Resolution Face Recognition in the Wild." Using YOLOv3 and SRGANs, I built a system that recognizes faces in low-resolution images and supports user registration. I used the DeepFace library for this purpose.

Extra-curricular activities

  1. Head of Events Committee, Computer Society of India, UPES

    2019 — 2020

    As the head of the events committee, I was responsible for organizing technical and recreational events at my college.

Certifications

My skills

  • C
    75%
  • C++
    85%
  • Python
    90%
  • HTML
    60%
  • Blender
    70%
  • Graphic Design
    60%

Portfolio

  • Report Generation for Eye Surgeries: Bridging Video Understanding and Language Models

    In my Master’s thesis, I developed a pipeline for automated German-language insurance report generation from ophthalmic surgery videos (pars plana vitrectomy), combining video understanding, large language models, and retrieval-augmented generation (RAG). I evaluated both end-to-end multimodal and structured phase-driven approaches, with the latter proving more robust and interpretable, especially under limited paired data.
    Link to the thesis report.

    Graduate

  • Studying various training approaches for the MoLFormer model on the Lipophilicity dataset

    In this collaborative project, I explored and implemented from scratch various data selection methods (influence scores, etc.) and fine-tuning strategies (LoRA, BitFit, iA3) to adapt the pre-trained chemical language model MoLFormer to the regression task of predicting lipophilicity values on the MoleculeNet Lipophilicity dataset.
    Link to the project report.

    Graduate

  • EEG for Emotion Recognition

    In this collaborative project, I explored emotion classification from films using low-level audio-visual features instead of EEG signals. Using datasets such as DREAMER, XGBoost models predicted arousal and valence with notable accuracy under leave-one-subject-out (LOSO) validation, showing the potential of low-level features for emotion recognition.
    Link to the project report.

    Graduate

  • Analysis of Self-Supervised Learning Methods for Urban Scene Segmentation with Adverse Weather Conditions

    In this collaborative project, I explored self-supervised learning for urban scene segmentation in low-visibility conditions, evaluating U-Net and DeepLabv3 models on Cityscapes and Foggy Cityscapes. The self-supervised approach achieved accuracy comparable to fully supervised methods while using less labeled data.
    Link to the project report.

    Graduate

  • EIGEN - A renderer based on physically based rendering

    In this collaborative project, I built a rendering engine using the Lightwave framework.
    Link to the project webpage.

    Graduate

  • BRDF implementation using SHADERed

    In this project, I implemented the Phong and Cook-Torrance BRDF models to better understand their mathematical foundations.

    Graduate

  • Comparative Analysis of Epsilon-Greedy, UCB & Thompson Sampling Algorithms

    In this project, I programmed agents for the Epsilon-Greedy, UCB, and Thompson Sampling algorithms and compared their performance on a simple 3-arm bandit problem over both short and long time horizons.

    Under-graduate

  • COVID Safety Tracker

    In this project, I built a system that detects violations of social distancing and mask norms. It captures the violator's face so that facial recognition can be performed if needed, and it builds a dashboard from the captured statistics. The system uses perspective transformation to improve accuracy when detecting social distancing violations.
    Link to the GitHub repo.

    Under-graduate

  • Vocal Psychiatric Simulator

    In this project, I built a system that serves as a virtual psychiatrist. It asks the subject a set of questions, evaluates their condition based on their responses and facial expressions, and generates an evaluation report. The system uses the Flask framework as a basic UI backbone.
    Link to the GitHub repo.

    Under-graduate

  • Digit Recognition using Machine Learning in C

    In this project, I built a neural network that classifies handwritten digits from the MNIST database. It also draws a live accuracy-vs-epochs graph using the graphics.h library in C.
    Link to the GitHub repo.

    Under-graduate

Contact