Space-Time Memory Network for Sounding Object Localization in Videos


Leveraging temporal synchronization and association within sight and sound is an essential step towards robust localization of sounding objects. To this end, we propose a space-time memory network for sounding object localization in videos.

BMVC, 2021


Sizhe Lester Li
My research interests span robot learning, vision, and physics simulation. Currently, I develop methods for robots to learn to interact with deformable objects that exhibit challenging dynamics.