About
I am a Machine Learning expert specializing in R&D of Large Language Models (LLMs) and Multimodal LLMs (MLLMs) that integrate text, images, and video. I design scalable transformer-based architectures for advanced cross-modal reasoning, aligning language with visual content. My work spans fine-tuning, optimization, and deployment of LLMs/MLLMs for tasks like image captioning, visual question answering, video-language grounding, and document intelligence.
Education
Toronto Metropolitan University
Toronto, Canada
Degree: Master of Engineering in Electrical and Computer Engineering
CGPA: 3.63 / 4.33
Relevant Coursework:
- Neural Information Processing
- Machine Learning
- Topics in Data Science
- Deep Learning
Rajshahi University of Engineering & Technology (RUET)
Rajshahi, Bangladesh
Degree: Bachelor of Science in Electrical & Electronic Engineering
CGPA: 3.62 / 4.0
Relevant Coursework:
- Introduction to Programming Language
- Engineering Mathematics I-V
- Electromagnetic Fields & Waves
- Introduction to Digital System & Design
- Microprocessor & Microcomputer System
- Advanced Computer Programming
Certifications
- How Google does Machine Learning - Google Cloud Training (Coursera)
- Launching into Machine Learning - Google Cloud Training (Coursera)
- Convolutional Neural Networks in TensorFlow (Coursera)
- Introduction to TensorFlow for AI, Machine Learning, and Deep Learning (Coursera)
- Introduction to Containers w/ Docker, Kubernetes & OpenShift (Coursera)
- Containers & Kubernetes Essentials (IBM)
- Design Thinking for Innovation - University of Virginia (Coursera)
Experience
Concordia University
Research Assistant
Conducted research in 3D vision and point cloud video compression, developing and optimizing novel algorithms for point cloud completion and 3D video data processing for real-time video streaming. Published and presented findings at academic conferences, contributing to advancements in real-time 3D video streaming.
Focus Areas: Point Cloud Completion, Point Cloud Video Compression
Jan 2023 - Dec 2024 | Montreal, Quebec
CINTIQS
Senior Artificial Intelligence Engineer
Worked on AI innovation projects, building end-to-end AI software pipelines for military and defense industry platforms using Flask, OCR, and Docker.
Tools: Python, Flask, SQLite, OCR
Jan 2022 - Sep 2022 | Ottawa, Ontario, Canada
Atelesys
Software Developer
Developed end-to-end applications using Python and React to meet project requirements.
Tools: Python, Flask, React
Oct 2021 - Dec 2022 | Toronto, Ontario, Canada
Intelense
Artificial Intelligence Developer
Developed real-time video analytics for public safety applications, including anomaly detection (GAN, WGAN, VAE), accident detection, fall detection (pose estimation), fight detection, and fire and smoke detection.
Integrated AI solutions with real-time camera feeds (RTSP, HTTP) using deep learning and computer vision techniques.
Built modular video analytics apps with Flask and OpenCV for multi-camera tracking using perspective transformation, object detection, and tracking.
Conducted R&D, reviewed literature, and built pipelines for various projects.
Developed alert systems that send notifications when anomaly detection thresholds are exceeded.
Tools: Python, Flask, JavaScript
Jun 2020 - Sep 2021 | Toronto, Canada
Polaris Transport
Machine Learning Developer
Worked on classification of customer emails in automation processes and other relevant machine learning tasks.
Tools: Python, Scikit-learn, NLTK
Aug 2018 - Nov 2018 | Toronto, Canada
Dutch-Bangla Bank
Data Analyst
Conducted data mining and retrieval using MySQL to identify critical business areas. Identified top customers using clustering algorithms.
Delivered a POC solution with 96% accuracy on company data.
Tools: Python, MySQL
Jun 2010 - Apr 2016 | Dhaka, Bangladesh
Publications
GESA: Exploring Loss-based Adversarial Attacks in Volumetric Media Streaming
IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR) 2024
We propose the Gilbert-Elliott Shape Attack algorithm to generate an incomplete 3D point cloud dataset, named GESA, that mimics the impact of packet loss. This dataset represents real-world point cloud streaming scenarios, allowing performance evaluation of completion models under realistic conditions.
LightSeg: Efficient Yet Effective Medical Image Segmentation
IEEE International Symposium on Biomedical Imaging (ISBI) 2022
Reducing the number of parameters is crucial for affordable and applicable model development. LightSeg is a lightweight, fast, and effective approach to medical image segmentation suitable for limited storage and computation.
Pay Attention for COVID-19 Detection Using Efficient Convolution
IEEE International Symposium on Biomedical Imaging (ISBI) 2022
The lightweight COVIDAT-Net model with lower computational cost is an effective and robust solution for COVID-19 case classification.
Awards
- Split Graduate Fellowship GCS - 2023
- Concordia Conference and Exposition Allowance - 2024
Service
- Volunteer, Toronto AI Meetup Group
- Organizer, University AI Symposium 2023