Final Project Reports

Please see the Project page for details regarding the final project.

Showing 178 of 291 projects (113 requested to remain private).


Title Authors
Highlight: PocketNeRF: Fast-Converging Neural Radiance Fields for Indoor Reconstruction from Few-Shot Mobile Images Aaron Jin, Ryan Joonwon Suh, Lucas Emmanuel Brennan-Almaraz
Highlight: CURVLM: Circuit Understanding Via Group Relative Policy Optimization (GRPO) on Vision Language Models Diego Porras, Mike Zhao, Adhi Daiv
Highlight: Bridging Vision and Language for Sign Language Understanding Ali Sartaz Khan, Dat Tran, Sufen Fong
Highlight: Video Style Transfer with Reinforcement Learning Amelia Kuang, Coco Xu, Ariel Chen
Highlight: 3D Semantic Segmentation with 3D LiDAR Point Clouds and 2D Camera Images for Autonomous Driving Anze Liu
Highlight: Two-Step Deep Learning Approach for Classifying Bridges using Street Imagery Carmen Edna Andrade von Hillebrandt
Highlight: Improved Mineral Detection via Spectral Attention U-Net with a Novel Hapke Layer: Accelerating Discovery of Critical Minerals for the Green Transition Ryan Wang, Chandra Suda
Highlight: Fast Acoustic Wave Simulation with Neural Operators Fangjun Zhou, Kevin Liu, Hong Nhu Ngoc Vo
Highlight: Total Variation Loss for Compact Objects’ Segmentation Applications to Satellite Images Giacomo Fraccaroli
Highlight: Replicate FaceApp Effect and Enable Real-time Performance Based on GANs Jacky Yin
Highlight: Enhancing Segment Anything Model (SAM) for Brain Tumor Image Segmentation June Zheng, Yi Jing, Komei Ryu
Highlight: Adaptive Contrastive Masked Autoencoders for Structured Representation Learning Karan Singh
Highlight: Chess Position (FEN) generation using Chessboard and Piece Recognition Matt S Yang, Satya Prakash Biswal, Samruddhi Yashwant Kahu
Highlight: Rock Image Super-resolution: From CT to micro-CT Zitong Huang, Minghui Xu
Highlight: Deep Learning based Segmentation of Focal Adhesions from Immunofluorescence Microscopy Images Xingyuan Zhang
Blurred Lines: Automated Video Obfuscation With Computer Vision James Ashton Varah, Tobias Charles Moser
PalmPilot: Drone Control using Live Hand Signal Detection Chris Kay, Carlos Daniel Joseph Hernandez, Matt Mahowald
Beyond Janus: Enhancing 3D Consistency in Text-to-3D Generation Through Foundation Model Reasoning J Yim, Soumyadeep Bhattacharjee, Antonio Llano
AstroDINO: Self-Supervised Learning on Astronomical Images Sagar Kapare
Video Understanding by A Sequence Model: A Case Study on Golf Swing Sequencing Yanming Zhu
Skin Cancer Detection from Smartphone Camera Images via Transfer Learning Darren Chan, Jack Zhang, Flora Fenghua Yuan
Dynamic Sparse Voxel Attention for Efficient Transformers Veljko Skarich, Manel Bermad
Exploring LoRA Merging Techniques in Virtual Try-On Steven Songqi Pu, Joey McCoy
Finding the Fit: Vision-Language Models for Clothing Retrieval Henry Payne Palmer, Upamanyu Dass-Vattam
Finetuning Pretrained Models for Compressed Dermatology Image Analysis Eric Cui, Sonnet H Xu
Investigating the Functional Role of Learned Channel Suppression in Diffusion VAEs Oleg Roshka, Eugenie Shi, Herry Wang
Reviving an Endangered Script: Optical Character Recognition for Syriac Erick Angelo Ramirez
Comparing Vision Generative Models on Talking-Head Synthesis Songyu Han, Mustafa Abdelrahim Haroun Fadl, Hannah Rachel Levin
Exploring the Integration of Gaussian Splatting in Formula 1 Racing Colette Bao-Quan Do, Harviel Kyle Arcilla Arcilla
GeoVision: Fine-Grained Urban Geolocation in San Francisco via Distribution-Aware Visual Models Mathijs Ammerlaan, Nils Frederik Kuhn, Raul Molina Gomez
Computer Vision for Financial Statement Analysis Ryan Samuel Samadi
Obstacle Detection for Autonomous Driving Using Semantic, Instance, and Transformer Models on the Waymo Open Dataset Carlos Henrique Aranguren
Improving 3D Scene Segmentation with Prior 3D Object Knowledge Laszlo Szilagyi
Beyond ViT: Accelerating ASL Recognition with Convolution-Attention Fusion Hengjin Tan, Kai Liu, Jingshu Liu
Image Segmentation For Wildfire Prediction Nevin George, Anthony Maltsev
OmniLoc: Towards Leveraging Multiple Perspectives for Probabilistic Visual Geolocalization Xiyan Shao, Charlie John Haywood
StoryCrafter: Comic-Style Storyboarding Meets 3D Camera Animation Alex Gu, Isaac Hongzhe Kan, Jack P Le
Understanding and comparing learned features in CLIP models via Sparse Autoencoders Logan Ivins Graves, Franklin Sheng Zhu
Predicting MLB Pitch Outcomes From Video Data Ohm Patel, Ishan Chirag Mehta
WEBSIGHT: A Vision-First Architecture for Robust Web Agents Tanvir Bhathal, Asanshay Gupta
Eye in the Sky: Live Blackjack Card Counting via Real-Time Video Analysis Mini Rawat, Cameron James Heskett, Ankur Jai Sood
3D Model Reconstruction from Image(s) Zhen Lin
CS231n Project: Particle Identification and Energy Reconstruction in Calorimeter Readout Using dSiPM Bo Liu, Qihua Wang, Liangyu Wu
A Computer Vision Approach to Monitoring Beehive Activity Maxime de Belloy
Diffusion-Based Super-Resolution of Micro–CT to SEM for Porous Media Lisa Li
Gaming the Video Against Yourself Joshua Boisvert
Learning Stethoscope Placement for Heartbeat Detection Using Convolutional Neural Networks Bryan Jang Yit Tiang, Cristian Gabriel Galdeano
ProtMAE: Masked Autoencoding of Protein Distance Maps for Structure-Aware Representation Learning Emi Maria Mathew, Aya Aburous, Jad Bitar
Modeling Urban Food Insecurity with Google Street View Images David Li
A Multi-Modality End-to-End Autonomous Driving System Bryan Alexis Pineda, Ziyu Li, Kamal Mohammed ElMallah
Image Label Disambiguation for Rich Semantic Representations in Language-Prompted Segmentation Models Justin Anthony Hall, Walter Lopez Chavez
Interpretable Multimodal Deep Learning Model on MIMIC-CXR Dataset Zhenghui Chen
Exploring the Effects of Contrastive Pretraining and Test-Time Post-training on Semantic Segmentation of Waste Andy Ouyang, Nishikar Paruchuri, James Cheng
Leveraging Transfer Learning with Swin Transformer to Identify Coronary Artery Disease using Cardiac MRI Natasha Banga
ScreenShield: Computer Monitor Tracking and Blurring for Video Aniket Mahajan, Niall Thomas Kehoe
Fluorescent Neuronal Cell Counting using a modified ResUnet Model with Attention Gates Aanya Tashfeen
GeorAIn: Demystifying GeoGuessr Fayez Navid Anwar
Investigating Explanation Stability under Distribution Shifts Laura Paola Gomezjurado Gonzalez
Spatial Spectral Deep Learning for Tumor Detection in Colorectal Cancer Nikhiya Shamsher
Generalizing Discretization Representations for the Physical Solutions via Flexible Spatial Image Data Structures Zi Wang
Breaking Bots: Enhancing CAPTCHA Design with Neural Style Transfer Rinnara Sangpisit
Bridged Clustering for Computer Vision Ellie Tanimura, Pierre R Labroche
Towards A Unified Deep Learning Architecture for Extraterrestrial Surface Perception Brian Y Wu
Parameter Estimation of Digital Audio Effects from Spectrograms Ethan L Buck, Wesley Kavin Larlarb, Richard Thompson Lee
Evaluating a residual learning framework for 3D computed tomography data Evan Vincent Maestri
Hoops Radar: Player Tracking with NBA Broadcast footage Tomas Coghlan
Diffusion Board: An End to End Chess Move Prediction Pipeline Leveraging Discrete Diffusion for Long Horizon Prediction Prerit Choudhary, Akshay Gupta
Predicting Image Geolocation Using Feature-Based Fine-Tuning Kai Qi Wu
From Mode Collapse to State-of-the-Art: Engineering Robust Vision-Based 3D Hand-Object Manipulation Understanding Bryan Dong, Howard Ji
Cap4Art: Improving Image Captioning Capabilities Through Multi-Task Learning Sunny Yu, Willy Chan, Xiaofei Yan
Visual Age Estimation of Infant Photographs Using Deep Neural Networks Marcelo Bernardo Fernandez, Edgar Omar Leon
VAE Matters: Latent Compression Choices for DiT Architectures Sherry Xie, Eric Liu, Artur Barbosa Carneiro
Parameter-Efficient Fine-Tuning of BiomedCLIP for Diabetic Retinopathy Detection Kevina Wang, Adi Badlani
Trash Into Treasure: Classifying Garbage from Drone Imagery Using Image Classification Algorithms Alyssa Fong
Egocentric RGB-D Perception for High-Level Locomotion Planning in Humanoid Robots Tae Yang
AI-based Acoustic Defect Detection for Speaker Manufacturing Yitong Lu
You Only Dive Once: Real-Time Pose Scoring in Competitive Diving Agnes Liang, Renee Zbizika, Yoshi George Nakachi
Distributed 3D Reconstruction of Aerial Footage Igor Barakaiev
Glimpse Attention Models Victor Ng
report Harshvardhan Singh
Real-Time Video Segmentation for Autonomous Robotic Manipulation Chetan Reddy Narayanaswamy, Vakula Venkatesh
Video-Based Prediction of VO Gustavo D Martinez
Lightweight Model Adaptation for Mitigating Bias in Deep Learning Models for Chest X-Ray Analysis Clemence Marie Mottez
Using Transfer Learning to Adapt MobileNet for General Plant Disease Detection on Irregular Images Medhya Goel
Fair Enough to Diagnose: Reducing Gender Bias in Pneumonia Detection with Swin Transformers Yiting Shen, Dante Serafino Koffler
Enabling Rapid Disaster Response: Multimodal Remote Sensing Coregistration Evan John Twarog
Improving Adversarial Robustness of Image Classification Through Pretraining on Neural Data Shenghua Liu
ChessMates: How good are VLLMs at Chess? Rahul Chand
Video Caption Generation Ting Fu
Real2Code2Real: Articulated Full-Scene Reconstruction with 3D Asset Generation Eric Liang, Jacob Nathan Goldberg
Posterior Sampling using Diffusion Models for HDR Reconstruction Jamin Jia-Ming Xie
AI-Powered Dance Coaching via Pose Estimation, Vision Transformers and Dynamic Time Warping Arnold Tianyi Yang, Henry Jingsong Zhou, Roshen Sanjay Nair
Detecting Abnormalities in Musculoskeletal X-Rays: Project Milestone Adisa Kruayatidee
Nonrigid Motion Correction in MRI Using Neural Space-Time Modelling Jaehyeok Bae, Aizada Nurdinova, Yimeng Lin
Single-view 3D Human Reconstruction Using Generative Prior Zhengmao Liu
Addressing Class Imbalance in Deepfake Detection through ResNet-50 Ensemble with Specialist Models and Threshold Optimization Jiheng Zhang, Victor Chen, Madhuhaas Gottimukkala
Context-Aware Augmentation for Semantic Segmentation in Low-Data Regimes Ariel Tian Wang
Modeling the Margins: Edge-Aware E2E Driving Kevin James Selig
Chair Generation Model (CGM): Utilizing Fine-tuned and Multi-view diffusion with Shape Generation for text-to-3D Chair Model Generation Gabriel K Bo, Ian Yue-Ran Chen, Marc Bernardino
Language-Driven Primitive-Based 3D Scene Generation with Infinigen Eyrin Kim, Michelle Borg Yan Lau
Building An AI-Powered Fashion Application: Virtual Clothing Try-On Shawn Zhang
Robust Depth Estimation in Adverse Visual Conditions Wenfu Lei, Jiamin Sun
Gradient-Based Image and Protein Generation Jun Woo Kim
CS231N Project: Organic Waste Quantification in Public Trash Bins in Urban Areas Using Thermal Videos Varun Sahay, Pin Li, Seoyoung Oh
Transfer Learning Under the Surface: Explainable Coral Bleaching Classification Across Datasets Samantha Estrada
FloodscapeDiffuser: Low-Rank Conditioning for Diffusion-Based Post-Flood Satellite Imagery Simulation Martin Scott Pollack, Khadijah Anwar, Tony Yu
Learning 3D Structure in Irradiated Lithium Fluoride via Masked Autoencoders Piper Fleming, Carolyn Hellerqvist Smith
Multi-Modal Large Language Models for Historical Handwritten Text Recognition (HTR) and Data Augmentation Yuanhao Zou
Event Retrieval for Driving Scenarios Highlighting Bingqing Zu, John Ren
PrivacyGuard: Real-Time Detection and Redaction of Sensitive Visual Information Mutyala Naidu Kannuru
Deep Learning to Predict Lithium-Ion Battery State-of-Health from Partial Discharge Data: Comparing 1-D Temporal Models and Novel Convolutional Curve-Image CNNs Steven D. Liu
Learning Predictive Candlestick Patterns: Vision Transformers for Technical Analysis Arnav Gupta
Agentic Retrieval and Editing System for Image Generation Berwyn Berwyn
Diagnose and Defend: Lightweight Behavior-Aware Attention Gating for Robust Vision Transformers Natalie Si-Chi Kuo, Yanny Gao, Sara Kothari
Exploration of Visual Speech Recognition with LipNet Manan Sheth
What Do You See In a Poem? Image Generation from English Romantic Poetry Md Ahsanur Rashid, Shaoxiong Zhang, Funing Yang
Slippify: Parsing Super Smash Bros. Melee Frames Matthew George Lee, William Hu, Samuel A Do
NIGnets and Neural ODEs for Representing Non-Self-Intersecting Geometry Atharva Aalok
From Lyrics to Visuals: A Conditional GAN Framework for Album Cover Generation Laura Wu, Juliana Ma
Full-Page Chinese Calligraphy Generation via LoRA Fine-Tuning of Stable Diffusion Huici Pan, Jieshu Huang, Zhiyin Pan
Fast Inference for Vision-Language Model Image Captioning Taeuk Kang, Andrew C Shi, Nash Brown
Jack Gross-Whitaker, Dev Narasimhan Gopal
Real-Time American Sign Language (ASL) Recognition with Visual and Pose-Based Classification Elisabeth A Holm, Armando Alejandro Borda
Accelerated MRI Reconstruction with SwinUNet: Enhancing Image Quality through Transformer-Based Architecture Olufeolu Oluwapelumi Kolawole, Yogesh Seenichamy-Venkatesan, Kesavan Ramakrishnan
Diffusion-Guided Gaussian Splatting for Autonomous Driving Tao Wang, Maxton Huff
Investigate Transfer Learning For Pre-Trained Visual Foundation Encoder on Robot Manipulation Policy Yu Chi Hsu, Yu Wei Lin, Wei-Lin Pai
High-Fidelity Traffic Simulation with Camera Embeddings Yina Jian, Jerry Gu, Ryan Zhijie Rong
BenchPRISM: Benchmarking Physical Relationship Understanding In Segmentation Models Leo Li, Tahmid Jamal
ByeBye: A Zero-Shot Human Removal and Replacement Pipeline with Stylized Character Insertion Ashwin Mahendran, Arihan Varanasi, Caleb Youngjae Whang Choe
Evaluating SubCell Foundation Vision Transformer on Yeast Cell-Cycle and Protein Localization Tasks Mihajlo Stojkovic
Skin Cancer Detection with Deep Learning Keyan Azbijari
Do Hero Images Perpetuate Gender Bias? Anika Fuloria
Self-supervised Denoising Techniques for Diffusion Tensor Imaging Irmak Sivgin, Kamyar Rajabali Fardi
Guided by Style: Fine-Grained Modulation in Multi-Style Artistic Transfer Christina Ba, Catherine M Zhang
Weakly Supervised Learning via Relational Comparisons Junha Lee, Sina Mollaei
Lightweight 3D Inpainting for Cultural Heritage Restoration Using Diffusion Models Aarya Sumuk
Deep Learning-Based Pose Estimation and Boundary-Aware Mouse Brain Slice Registration for ABBA Cherry Chen
Multi-Agent Deep Learning for Visual T Cell Behavioral Modeling Joseph Li, Sean Tsung, Adrian Sadik Molofsky
Training CoCo: Continuity and Consistency in Subject-Driven Diffusion Models Eric Lee
Why Are CNNs The Model of Choice for Simulated Robotic Picking? Doug Ian Fulop, Olivia Kelly Taylor, Josh James Citron
Bridging the Reality Gap: Synthetic Data Generation for Food Portion Estimation Ben Shlomo Gur
Structured Radiology Report Summarization with Fine-tuned BLIP-2 Nahome Gebremariam Hagos
Enhancing Visual Question Answering for Smart Glasses Using Vision-Language Models Xinxi Chen, Tianyang Chen
DETRmining the Cosmos: A Transformer-Based Approach to Galaxy Morphology Detection Kumar Chandra, Renn Su, Max Luis Rodriguez
Indoor Scene Understanding via 2D and 3D Semantic Segmentation: Integrating Depth for Geometry-Aware Reconstruction Karthik Pythireddi
Engagement-weighted and style-aware scoring for fashion compatibility Melissa H Liu, Sally Lee
3D Human Hand Reconstruction Using Gaussian Splatting with Deep Implicit Anatomical Shape Priors Jonathan Hui Wen, Elijah Song, Allen Kiriroath Chau
DashGuard: Hierarchical Attention for Dashcam Video Accident Detection Kory Zifeng Yang, Luca Mondonico
Waste Classification and Management Using Computer Vision Annie Fan, Jason Sun, Sue Deng
Visual scrolling detection to enhance GUI agent training Chena Lee
Parking Spot Detection Using Deep Learning Computer Vision Applied to Satellite Imagery: Applications for Solar Carport Potential Estimation Renee Duarte White, Peiyu Li, Josh Chad Neutel
Laparoscopic Surgical Image Segmentation - CS231N Final Project Report Brian Jonathan Sutjiadi
L Onyinyechi Nichole Okoye
STAGED: Spatio-temporal Tracking and Analysis for Ground-level Event Detection Joshua Logan Shunk
JetVision-Mamba: Selective State Space Models for Jet Classification in High Energy Physics Dimitris Ntounis
The Not-So-Secret Life of Dogs Bea Lai Kuan Lim
Leveraging Captions for Context-Aware Image Colorization Sheena Lai, Haoming Song
A.I.R.G.T.R. – Artificial Intelligence for Real-time Gesture-based Tonal Rendering Jacob Alan Rubenstein, Shane Robinson Mion
Ultimate Vision: A System to Autonomously Track An Ultimate Frisbee in Video Frames Mallika Parulekar, Yash Suvidh Kankariya
Aligning Text-to-Image Diffusion Models using Human Utility Optimization and Low-Rank Adaptation Yiwen Zhang, Wendy Yin, Yicheng Zhang
Enhancing Bearing Quality Control: A CNN-Based Approach for bearing defect classification. Mengyuan Huang
MLLM-Driven Highlight Reel Generation for Ultimate Frisbee Games Heather Szczesniak, Megan Ja, Farah Shahbaz
Benchmarking ML-Based Antarctic Sea Ice Forecasting in a Data-Rich Setting Yuchen Li
Segmenting the Earth: Challenges in Land Cover Classification Shirley Cheng
Deception classification from video input Stephanie Vezich Tamayo, Lillian Ma
Fail Fast, Run Faster: Shape Safe Deep Learning in Rust on Apple Silicon Jai Krishna Agrawal, Taylor Dosia Tam
This is The Way: Vision-Based End-to-End Planning for Autonomous Driving Arpit Dwivedi, Purushotham Mani, Anishalakshmi Venkata Palaparthi
CS231N Final Project Baptiste Brugerolle, Nael Ghoundale, Erika MacDonald
FDSA-GAN: A Frequency-Domain Self-Attention GAN For Improved Line Art Generation Of Anime Faces Richard Wu
A Systematic Evaluation of Independent Strategies for Enhancing Text-to-Image Semantic Alignment in Stable Diffusion Xinxie Wu
Exploring Dance Expression Through Self-Supervised Transformer-Based Contrastive Representation Learning Samuel Alexander Tong
DeepSneak: Deepfake Video Detection Christine Tung, David Hung Tung
Detecting Pedestrian Hazards on Urban Sidewalks in Low-Visibility Conditions Adrian Adesola Adegbesan, Sathvik Nori
Story Augmentation with Generative AI (SAGANets): Investigating Multi-Image Story-Generation Pipelines Bradley Konane Moon, Connor William Janowiak, Sade U Ried
Machine Vision Based Scoring of Coronary Calcium Ibrahim Kecoglu
CustomFX: A Lightweight Hand Tracking Model for Musical Instruments Gayatridevi Dinar Kamat Tarcar, Sid Yu, Owen Jung
Toward Accessible, Lightweight, At Home Dermatological Screening Nikhil R Lyles
Zero-Shot vs. Few-Shot CLIPSeg: Efficient Urban Feature Segmentation Yun-Dam Ko