Introduction to high-level vision

With this blog post, we begin a new and challenging topic in human vision: high-level vision. Before we go any further, we need to address an important question: How is this different from what we've been talking about before?

To date, we’ve spent a lot of time talking about some of the basic principles of how your visual system measures the light in your environment. For example, you have photoreceptors that transduce light and provide a signal carrying information about the wavelength of the light you saw. Further along in your visual system, cells in the LGN and V1 measure various patterns of light and pass on information about the dots, edges, and lines you may have seen. These kinds of processes, which are mostly concerned with making measurements that describe the incoming light, are what I’d refer to as low-level vision.


Figure 1 - Measuring the wavelength content of a light or the pattern of light that we're seeing are examples of low-level visual processes.

Next, we encountered another set of processes carried out in parts of your visual system that are even further along. The difference between these processes and the low-level measurements that came before them was that in each case, we were interested in recovering some property of the things in the world that gave rise to the pattern of light you could measure with low-level tools. For example, we’ve talked about using that raw information to make good guesses about the color or lightness of the objects in a scene, the position of those objects in 3D space, or, most recently, the way things are moving around you. All of these processes are what I’d refer to as mid-level vision. They rely on the information we measure about light, but they use that information to start making inferences about what things there are out in the world.

Figure 2 - Looking at complex patterns of light (upper left) and using low-level information to estimate the depth (upper right), object reflectance (lower left), and motion (lower right) are examples of mid-level processes.

Now that we’re moving on, what are we moving on to? When we talk about high-level vision, we’re talking about processes that attach labels to parts of our visual world, or that use the information we get from low- and mid-level processes to plan behaviors involving the things we see. This encompasses a lot of different visual tasks and many interesting computational problems. We’ll spend our time discussing just three aspects of high-level vision: (1) recognizing objects, (2) visual search, and (3) using vision to plan actions. In each case, I have to point out something important: We don’t really know how these processes work! That is, while I can give you some ideas about how they might work, these are all active areas of research that remain fairly open, even regarding some basic questions. This isn’t to say that we’ve completely figured out low-level and mid-level vision, but I think it’s fair to say that we have more specific models of how those processes work, built on mathematical concepts that aren’t so hard to understand. For high-level problems, we’re still working toward good behavioral, neural, and computational evidence that points to a clearly described mechanism supporting what you see and do. This means that I can’t give you detailed procedures describing how we do these things, but I can tell you about some good candidates. That’s the plan from here on in, and in each case, I’m going to do my best to tell you about the strengths and weaknesses of particular ideas while being as specific as I can about how they would actually work.

Let’s begin by considering a question that’s easy to ask, but hard to answer: How do you recognize the things that you see?
