amramer - Overview

👋 About Me

I build intelligent visual systems that see, understand, and generate, from multimodal 3D avatar generation and real-time sports analytics to medical imaging and autonomous driving perception.

My work lives at the intersection of computer vision, deep learning, and generative modeling, with a focus on pushing research-grade systems toward scalable, real-world solutions.

🚀 Featured Projects

🎓 Master's Thesis: Personality-Aware Non-verbal Behavior Generation

A multimodal generative model for synthesizing realistic listener avatars in dyadic conversations, conditioned on personality traits. The model predicts facial expressions, head motion, and upper-body gestures of a listener from the speaker's audio and motion signals.

Tech Stack: PyTorch · Vision Transformers · VQ-VAE · SMPL-X (PIXIE) · Librosa · OpenCV · CUDA · TensorBoard · SLURM · Enroot · Multi-GPU Training (A100)

🔗 Thesis Website | 📄 Full Thesis PDF

🏸 Badminton-VisionAI

An end-to-end computer vision pipeline for badminton match analysis that converts raw match footage into structured performance analytics for players and coaches. The system tracks players and the shuttlecock, detects shot events, projects motion onto a mini-court representation, and generates a downloadable coach-style performance report.

Tech Stack: Python · PyTorch · OpenCV · YOLO · ByteTrack · TrackNet · SAM · Streamlit · Plotly · Docker · CUDA · Homography · Body Pose Estimation · 3D Motion Trajectories · CI/CD Pipeline

🔗 Live Demo | 🎬 CV Pipeline Video | 📊 Dashboard Video
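
The mini-court projection step mentioned above relies on a homography, a 3x3 matrix that maps image pixels to court-plane coordinates. The snippet below is a minimal sketch of how a single tracked position could be projected. It is not the project's actual code: `project_point` and the matrix `H_DEMO` are hypothetical names and values chosen for illustration. In practice the matrix would be estimated from court keypoints, for example with OpenCV's `cv2.findHomography`.

```python
from typing import List, Tuple

Matrix3x3 = List[List[float]]


def project_point(H: Matrix3x3, x: float, y: float) -> Tuple[float, float]:
    """Map an image point (x, y) to court coordinates using homography H."""
    # Multiply H by the homogeneous point (x, y, 1).
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return from homogeneous to 2D coordinates.
    return u / w, v / w


# Illustrative matrix only (assumption): scale by 0.5, then shift by (10, 20).
H_DEMO: Matrix3x3 = [
    [0.5, 0.0, 10.0],
    [0.0, 0.5, 20.0],
    [0.0, 0.0, 1.0],
]

court_x, court_y = project_point(H_DEMO, 100.0, 200.0)
print(court_x, court_y)
```

Applying the same function to a player's foot position in every frame would yield a trajectory in mini-court coordinates, which is the kind of data a movement heatmap or distance statistic can be built on.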

🚗 Semantic Segmentation for Autonomous Vehicles

A semantic segmentation system built on the BDD100K dataset — a large-scale, diverse benchmark for autonomous driving. The project accurately segments and identifies objects in street scenes to improve the AI perception capabilities of autonomous vehicles.

Tech Stack: PyTorch · Fastai · Semantic Segmentation · Hyperparameter Tuning · Weights & Biases

🩺 3D Brain Tumor Segmentation (MRI)

An end-to-end workflow for multi-label brain tumor segmentation from 3D multimodal MRI scans. The project targets segmentation of glioma subregions — tumor core, whole tumor, and enhancing tumor — using a 3D SegResNet model.

Tech Stack: PyTorch · MONAI · 3D SegResNet · Multi-modal MRI · 3D Medical Image Transforms · Sliding Window Inference · Weights & Biases
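
Sliding window inference, listed in the stack above, splits a 3D volume that is too large for one forward pass into overlapping patches, runs the model on each patch, and blends the predictions. The sketch below only illustrates the tiling part in plain Python. The function names, the volume shape, the patch size, and the 50% overlap are assumptions made for this example, and the exact tiling used by MONAI's `sliding_window_inference` may differ.

```python
from itertools import product
from typing import List, Tuple


def window_starts(size: int, roi: int, overlap: float) -> List[int]:
    """Start indices of overlapping windows along one axis."""
    if roi >= size:
        return [0]  # a single window covers the whole axis
    step = max(1, int(roi * (1.0 - overlap)))
    starts = list(range(0, size - roi + 1, step))
    # Add a final window flush with the border so no voxels are skipped.
    if starts[-1] != size - roi:
        starts.append(size - roi)
    return starts


def patch_origins(
    shape: Tuple[int, int, int], roi: Tuple[int, int, int], overlap: float
) -> List[Tuple[int, int, int]]:
    """Corner coordinates of every 3D patch needed to cover the volume."""
    per_axis = [window_starts(s, r, overlap) for s, r in zip(shape, roi)]
    return list(product(*per_axis))


# Hypothetical volume and patch sizes, chosen only for illustration.
origins = patch_origins((240, 240, 155), (96, 96, 96), overlap=0.5)
print(len(origins), origins[0], origins[-1])
```

A higher overlap gives smoother blending at patch borders but increases the number of forward passes, so it is a trade-off between segmentation quality and inference time.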

🎥 Realtime Vision Captioning

A curated set of Jupyter notebooks demonstrating core computer vision and vision–language capabilities using VLM foundation models. The notebooks progress from offline image understanding tasks to a real-time webcam application that performs live image captioning and image classification.

Tech Stack: PyTorch · Torchvision · Hugging Face Transformers · Gradio · PIL · VLMs · Real-time Inference

🎙️ AI Conversational Agent (In Progress)

An AI agent that goes beyond a standard chatbot — combining real-time voice and text interaction, LLM-powered reasoning, and a digital human conversational interface.

Tech Stack: Python · OpenAI GPT · Speech-to-Text · Text-to-Speech · HTML · CSS · JavaScript · Bootstrap · jQuery

🧠 Technical Skills

Area	Skills & Tools
Computer Vision	Object Detection · Segmentation · Tracking · Video Motion Analysis · Digital Human Models
Models & Frameworks	PyTorch · TensorFlow · OpenCV · YOLO · Vision Transformers (ViT) · Supervision
Generative AI	VQ-VAE · Diffusion Models · VLMs · Multimodal Generation
Training & Evaluation	Transfer Learning · Hyperparameter Optimization · Weights & Biases · TensorBoard
Deployment	Docker · AWS (EC2 GPU) · Streamlit · Batch & Real-time Inference
Software Engineering	Python · C++ · Object-Oriented Design · Git · Unit Testing · CI/CD

🌐 Portfolio

For a full overview of my work, visit amramer.github.io.

Building systems that see and understand the world — one model at a time.