Cristina Mata, Ph.D. Research Proficiency Presentation: 'Domain Adversarial Learning with Unlabeled Videos for Video Segmentation'

Tuesday, September 21, 2021 - 10:30am to 11:30am
Zoom - contact for Zoom info.
Event Description: 
Abstract: We study video object segmentation and video semantic segmentation, in which a model must output object labels for the foreground or for all pixels in a video. Segmentation annotations are time-consuming and expensive to obtain for video, so datasets for these tasks remain relatively small, and models struggle to produce good segmentations unless an annotation of the initial frame is provided. Unlabeled video data and annotated image data are abundant, and we propose to exploit both in an adversarial fashion by treating images and videos as two domains. We apply an adversarial loss to a network so that it learns object features invariant to the video artifacts that arise from domain differences. We find that 2D networks perform well with this training method, but 3D space-time networks based on convolutions and self-attention struggle. We therefore turn to vision transformers for video semantic segmentation and formulate a method that applies our adversarial loss per token.
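The abstract does not specify how the adversarial loss is implemented; a common realization of domain-adversarial training is a gradient-reversal layer feeding a small domain discriminator, so that features are pushed to be indistinguishable between the image and video domains. The sketch below is an illustrative PyTorch assumption, not the speaker's actual method; the names `GradReverse` and `DomainDiscriminator` are hypothetical.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by -lamb on backward,
    so minimizing the discriminator loss maximizes domain confusion upstream."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class DomainDiscriminator(nn.Module):
    """Predicts whether a feature vector came from the image or video domain."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats, lamb=1.0):
        return self.net(GradReverse.apply(feats, lamb))

# Toy usage: image features labeled 0, video features labeled 1.
disc = DomainDiscriminator(dim=16)
feats = torch.randn(8, 16, requires_grad=True)   # stand-in for backbone features
domain_labels = torch.cat([torch.zeros(4, 1), torch.ones(4, 1)])
loss = nn.functional.binary_cross_entropy_with_logits(disc(feats), domain_labels)
loss.backward()  # gradients reaching `feats` are sign-flipped by GradReverse
```

For the per-token variant mentioned in the abstract, the same discriminator could in principle be applied to each transformer token embedding independently (treating the token dimension as part of the batch), though the talk's exact formulation may differ.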