Researchers identify a property that helps computer vision models learn to represent the visual world in a more stable, predictable way.
Imagine sitting on a park bench, watching someone stroll by. While the scene may constantly change as the person walks, the human brain can transform that dynamic visual information into a more stable representation over time. This ability, known as perceptual straightening, helps us predict the walking person’s trajectory.
Unlike humans, computer vision models don’t typically exhibit perceptual straightness, so they learn to represent visual information in a highly unpredictable way. But if machine-learning models had this ability, it might enable them to better estimate how objects or people will move.
MIT researchers have discovered that a specific training method can help computer vision models learn more perceptually straight representations, like humans do. Training involves showing a machine-learning model millions of examples so it can learn a task.
Read more at Massachusetts Institute of Technology
Photo Credit: geralt via Pixabay