Introduction to high-level vision

With this blog post, we begin to talk about a new and challenging topic in human vision: high-level vision. Before we go any further, we need to address an important question: How is this different from what we've been talking about before?

To date, we’ve spent a lot of time talking about some of the basic principles regarding the way your visual system measures the light in your environment. For example, you have photoreceptors that transduce light and provide a signal that carries information about the wavelength of the light you saw. Further on in your visual system, cells in the LGN and V1 measure various patterns of light and pass along information about the dots, edges, and lines you may have seen. These kinds of processes, which are mostly concerned with making measurements that describe the incoming light, are what I’d refer to as low-level vision.


Figure 1 - Measuring the wavelength content of a light or the pattern of light that we're seeing are examples of low-level visual processes.

Next, we encountered another set of processes that we said were carried out in parts of your visual system that were even further along. The difference between these processes and the low-level measurements that came before them was that in each case, we were interested in recovering some property of the things in the world that gave rise to the pattern of light you could measure with low-level tools. For example, we’ve talked about using that raw information to make good guesses about the color or lightness of the objects in a scene, or the position of those objects in 3D space, or most recently, the way in which things are moving around you. All of these processes are what I’d refer to as mid-level vision. These processes rely on the information we measure about light, but then we use that information to start making inferences about what things there are out in the world.

Figure 2 - Looking at complex patterns of light (upper left) and using low-level information to estimate the depth (upper right), object reflectance (lower left), and motion (lower right) are examples of mid-level processes.

Now that we’re moving on, what are we moving on to? When we talk about high-level vision, we’re talking about processes that involve attaching labels to parts of our visual world, or using the information we get from low- and mid-level processes to plan behaviors involving the things we see. This encompasses a lot of different visual tasks and many different interesting computational problems. We’ll spend our time discussing just three aspects of high-level vision: (1) recognizing objects, (2) visual search, and (3) using vision to plan actions. In each case, I have to point out something important: We don’t really know how these processes work! That is, while I can give you some ideas about how they might work, these are all active areas of research that remain fairly open even regarding some basic questions.

This isn’t to say that we’ve completely figured out low-level and mid-level vision, but I think it’s fair to say that we have more specific models of how those processes work, models that rely on mathematical concepts that aren’t so hard to understand. For high-level problems, we’re still working toward behavioral, neural, and computational evidence that points to a clearly described mechanism supporting what you see and do. This means that I can’t give you detailed procedures describing how we do these things, but I can tell you about some good candidates. This is the plan from here on in, and in each case, I’m going to do my best to tell you about the strengths and weaknesses of particular ideas while being as specific as I can about how they would actually work.

Let’s begin by considering a question that’s easy to ask, but hard to answer: How do you recognize the things that you see?
