Spatial vision in V1
In the lateral geniculate nucleus (or LGN), we saw the
beginnings of what we’ll call spatial
vision, or the measurement of patterns of light by the visual system.
Unlike the photoreceptors, which simply report how much light falls at a single location, LGN cells (and their predecessors in the visual system, the retinal ganglion cells) respond best when light is in specific spatial arrangements. Within the receptive
field of an LGN cell, there are both excitatory
and inhibitory regions arranged
into a center-surround structure. This ends up meaning that LGN cells respond
most vigorously when there is either a spot of light surrounded by a darker
region (an on-center cell) or a
darker spot surrounded by light (an off-center
cell). In the magnocellular layers, these cells only measure light/dark contrast. In the parvocellular layers, the excitatory and inhibitory regions are
also wavelength-specific: An LGN cell may respond best when there is red light
in the surround and green light in the center, for example. These receptive
field structures enable these cells to measure basic luminance and chromatic
contrast – parts of an image that differ from what is around them. If this is our first set of measurements of the patterns of light in an image, what comes next? How
does your visual system build on these measurements to encode more complex
aspects of an image?
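Before we follow that thread, it may help to see the center-surround computation written out. Here's a minimal sketch in Python – the tiny 3×3 grid and the specific weights are my own toy choices for illustration, not measurements from a real cell:

```python
import numpy as np

# Toy on-center receptive field: an excitatory center (+) surrounded by
# an inhibitory ring (-). The weights sum to zero, so a uniformly lit
# patch (no contrast) produces no response.
on_center_rf = np.array([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]) / 8.0

uniform_patch = np.ones((3, 3))   # evenly lit region: no contrast
spot_patch = np.zeros((3, 3))
spot_patch[1, 1] = 1.0            # bright spot on a dark surround

def response(rf, patch):
    """Weighted sum (dot product) of receptive field and pixel intensities."""
    return float(np.sum(rf * patch))

print(response(on_center_rf, uniform_patch))  # ~0: uniform light cancels out
print(response(on_center_rf, spot_patch))     # 1.0: the spot this cell likes
```

Flipping the signs of all the weights gives you the off-center version of the same cell.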
To answer this question, we’re going to continue following
the anatomical connections from your retina onward to your brain. Our next stop
on this route is at the back of your occipital
lobe, in a region called primary visual cortex or V1. There is a great deal that we could
say about the way this region is laid out (which is pretty cool), but we’re
going to focus on trying to understand what the cells in this part of the brain
are measuring using the same tools we introduced to study what LGN cells
responded to best: single-unit recordings. Remember, this is a handy way to
identify the receptive field of a particular cell and also describe the layout
of excitatory and inhibitory regions within that receptive field. If we listen
to single units in V1, what do we learn about the kinds of stimuli that they
like? What we’ll find are patterns of excitatory/inhibitory regions that might
look something like what you see below:
Figure 1 - A typical pattern of excitatory and inhibitory regions in a V1 cell's receptive field.
Never mind the hexagons here – I’m just using them as a
handy way to divide up a 2D region into cells that are closer to circular than squares would be.
The real question is what this pattern of pluses and minuses means for the kind of stimulus that this cell likes best. What I hope is fairly clear from looking at this pattern is that this V1 cell is going to like something like a line that is tilted at a particular angle. Another way to say this is that this cell has a preference for a specific orientation of a line, or that it is tuned to that orientation. Other cells in V1 will have receptive
fields like this, too, meaning that there are cells in V1 that respond to
lines, edges, and bars tilted at different orientations. Beyond the spots of light that are being measured in the LGN, V1 appears to be measuring something
more complex: Boundaries that can bend to follow the contours of objects and
surfaces. These cells, unlike those that we found in the LGN, are capable not just of telling us that they’ve found a place where the image changes, but also of telling us the specific orientation of that border between light and dark parts of a picture.
I hope it’s fairly clear that all this is just a by-product
of having a different pattern of excitatory and inhibitory regions inside of a
receptive field. So far, we haven’t introduced any kind of new computation: To
predict what a V1 cell does in response to a picture, we’d use the same tools
that we established for the LGN – we’d need a description of the image in terms
of pixel intensities, a description of the receptive field in terms of
excitatory and inhibitory regions, and we’d need to take a dot product between
those two things to calculate a response. Doing this for a particular V1 cell,
we can easily see how a cell with a preferred orientation responds to edges or
lines that are oriented at different angles. The graph below shows the response we’d calculate for such a cell as we rotate a line that it likes a lot away from its preferred orientation (Figure 2). We call this
graph the tuning curve for a cell.
Figure 2 - An orientation tuning curve for a V1 cell. This cell prefers a vertical orientation, but gives intermediate responses to lines tilted near vertical.
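If you'd like to see where a curve like this comes from, here's a toy sketch that builds an oriented receptive field out of excitatory and inhibitory regions and then applies the same dot-product computation to lines at different tilts. The patch size, the flanking inhibition, and all the weights are my own illustrative assumptions, not a model fit to real V1 data:

```python
import numpy as np

size = 15
center = size // 2
ys, xs = np.mgrid[0:size, 0:size] - center

def line_image(angle_deg, thickness=1.0):
    """A bright line through the middle of the patch, tilted angle_deg
    away from vertical, on a dark background."""
    theta = np.deg2rad(angle_deg)
    dist = xs * np.cos(theta) - ys * np.sin(theta)  # distance from the line
    return (np.abs(dist) <= thickness).astype(float)

# Toy receptive field: excitatory along a vertical strip, inhibitory in
# flanking strips, so that broad or uniform images largely cancel out.
excite = line_image(0.0)
inhibit = (np.abs(np.abs(xs) - 3) <= 1).astype(float)
rf = excite - 0.5 * inhibit

# Same computation as in the LGN: dot product of receptive field weights
# with pixel intensities, here for lines at increasing tilts.
for angle in [0, 15, 30, 45, 60, 75, 90]:
    resp = np.sum(rf * line_image(angle))
    print(f"{angle:3d} deg tilt -> response {resp:5.1f}")
```

Plotting these responses against tilt traces out a tuning curve like the one in Figure 2: a big response at the preferred orientation that falls off as the line rotates away from it.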
Because all of this is stuff we’ve seen before, we’re going
to concentrate on something new: How do we put information from multiple V1
cells together? We’re going to think about this in two different ways, using a different kind of computation in each case. In the first case, we’re going to explore how population coding lets a relatively small number of different kinds of cells measure line/edge orientation precisely. In the second case, we’re going to see how to
use logic gates to combine the
outputs of small groups of V1 cells to achieve something called invariance to certain transformations of
the image. Both of these are important ways to measure things we’d like to know
about images, and both of them involve using a set of simple measurements to do
something more complex. This is going to be a continuing theme as we move
through the visual system: How can we keep elaborating on the measurements
we’re making to achieve goals that are more and more useful/interesting? We’ll
begin by talking about listening to populations of cells in V1.
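As a preview, here's one common recipe for the population-coding idea (a response-weighted vector average of each cell's preferred orientation), plus a one-line flavor of the logic-gate idea. The preferred orientations, response numbers, and threshold are all made-up values for illustration:

```python
import numpy as np

# Four hypothetical V1 cells with different preferred orientations, and
# made-up responses to some line in the image.
preferred = np.array([0.0, 45.0, 90.0, 135.0])   # degrees of tilt
responses = np.array([0.2, 0.9, 0.6, 0.1])       # firing rates (arbitrary units)

# Population vector average: each cell votes for its preferred orientation,
# weighted by how strongly it fired. Orientation repeats every 180 deg, so
# we double the angles before averaging on the circle, then halve the result.
doubled = np.deg2rad(2 * preferred)
x = np.sum(responses * np.cos(doubled))
y = np.sum(responses * np.sin(doubled))
estimate = (np.rad2deg(np.arctan2(y, x)) / 2) % 180
print(f"decoded orientation: {estimate:.1f} deg")  # ~58 deg: between the
# 45- and 90-deg preferences, pulled toward the stronger 45-deg response

# And a taste of the logic-gate idea: an OR over two cells with the same
# preferred orientation at two nearby positions fires whenever either one
# does; a first step toward invariance to where the line appears.
def position_invariant(resp_here, resp_nearby, threshold=0.5):
    return (resp_here > threshold) or (resp_nearby > threshold)
```

Notice that the population gives us a precise orientation estimate even though no single cell's preference matches it exactly.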