About

I am a Machine Learning expert specializing in R&D of Large Language Models (LLMs) and Multimodal LLMs (MLLMs) that integrate text, images, and video. I design scalable transformer-based architectures for advanced cross-modal reasoning, aligning language with visual content. My work spans fine-tuning, optimization, and deployment of LLMs/MLLMs for tasks like image captioning, visual question answering, video-language grounding, and document intelligence.

Education

Toronto Metropolitan University

Toronto, Canada

Degree: Master of Engineering in Electrical and Computer Engineering
CGPA: 3.63 / 4.33

Relevant Coursework:

  • Neural Information Processing
  • Machine Learning
  • Topics in Data Science
  • Deep Learning

Rajshahi University of Engineering & Technology (RUET)

Rajshahi, Bangladesh

Degree: Bachelor of Science in Electrical & Electronic Engineering
CGPA: 3.62 / 4.0

Relevant Coursework:

  • Introduction to Programming Language
  • Engineering Mathematics I–V
  • Electromagnetic Fields & Waves
  • Introduction to Digital System & Design
  • Microprocessor & Microcomputer System
  • Advanced Computer Programming

Certifications

  • How Google does Machine Learning – Google Cloud Training (Coursera)
  • Launching into Machine Learning – Google Cloud Training (Coursera)
  • Convolutional Neural Networks in TensorFlow (Coursera)
  • Introduction to TensorFlow for AI, Machine Learning, and Deep Learning (Coursera)
  • Introduction to Containers w/ Docker, Kubernetes & OpenShift (Coursera)
  • Containers & Kubernetes Essentials (IBM)
  • Design Thinking for Innovation – University of Virginia (Coursera)

Experience

Concordia University

Research Assistant

Conducted research in 3D vision, point cloud video compression, and 3D video data processing, developing and optimizing novel point cloud completion algorithms for real-time video streaming. Published and presented findings at academic conferences, contributing to advancements in real-time 3D video streaming.

Focus Areas: Point Cloud Completion, Point Cloud Video Compression

Jan 2023 – Dec 2024 | Montreal, Quebec

CINTIQS

Senior Artificial Intelligence Engineer

Worked on AI innovation projects, building end-to-end AI software pipelines for military and defense industry platforms using Flask, OCR, and Docker.

Tools: Python, Flask, SQLite, OCR

Jan 2022 – Sep 2022 | Ottawa, Ontario, Canada

Atelesys

Software Developer

Developed end-to-end applications using Python and React to efficiently meet project requirements.

Tools: Python, Flask, React

Oct 2021 – Dec 2022 | Toronto, Ontario, Canada

Intelense

Artificial Intelligence Developer

Developed real-time video analytics for public safety applications, including anomaly detection (GAN, WGAN, VAE), accident detection, fall detection (pose estimation), fight detection, and fire and smoke detection.

Integrated AI solutions with real-time camera feeds (RTSP, HTTP) using deep learning and computer vision techniques.

Built modular video analytics apps with Flask and OpenCV for multi-camera tracking using perspective transformation, object detection, and tracking.

Conducted R&D, reviewed literature, and built pipelines for various projects.

Developed alert systems that send notifications when anomaly detection thresholds are exceeded.

Tools: Python, Flask, JavaScript

Jun 2020 – Sep 2021 | Toronto, Canada

Polaris Transport

Machine Learning Developer

Worked on classifying customer emails for process automation, along with related machine learning tasks.

Tools: Python, Scikit-learn, NLTK

Aug 2018 – Nov 2018 | Toronto, Canada

Dutch-Bangla Bank

Data Analyst

Conducted data mining and retrieval using MySQL to identify critical business areas. Identified top customers using clustering algorithms.

Delivered a POC solution with 96% accuracy on company data.

Tools: Python, MySQL

Jun 2010 – Apr 2016 | Dhaka, Bangladesh

Publications

GESA: Exploring Loss-based Adversarial Attacks in Volumetric Media Streaming

IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR) 2024

We propose the Gilbert-Elliott Shape Attack algorithm to generate an incomplete 3D point cloud dataset, namely GESA, that mimics the impact of packet loss. This dataset represents real-world point cloud streaming scenarios, allowing completion models to be evaluated under realistic conditions.

LightSeg: Efficient Yet Effective Medical Image Segmentation

IEEE International Symposium on Biomedical Imaging (ISBI) 2022

Reducing the number of parameters is crucial for affordable and applicable model development. LightSeg is a lightweight, fast, and effective approach to medical image segmentation suitable for limited storage and computation.

Pay Attention for COVID-19 Detection Using Efficient Convolution

IEEE International Symposium on Biomedical Imaging (ISBI) 2022

The lightweight COVIDAT-Net model with lower computational cost is an effective and robust solution for COVID-19 case classification.

Awards

  • Split Graduate Fellowship GCS – 2023
  • Concordia Conference and Exposition Allowance – 2024

Service

  • Volunteer, Toronto AI Meetup Group
  • Organizer, University AI Symposium 2023