Talk from Kalanit Grill-Spector – Stanford University / Department of Psychology
Short abstract:
The human visual system is highly organized across multiple spatial scales. At the submillimeter scale, feature maps in primary visual cortex show a regular organization, including pinwheels for orientation, blobs for color, and maps of spatial frequency. At the millimeter-to-centimeter scale, functional regions align with macroanatomical landmarks, cytoarchitecture, and long-range white matter connections, both in primary visual cortex and in higher-level regions such as category-selective regions in the ventral stream. At the centimeter-to-decimeter scale, there is a large-scale organization into three parallel processing streams (ventral, lateral, and dorsal) spanning occipitotemporal and occipitoparietal cortices. However, why this functional organization arises remains a mystery. Here, we developed a new class of topographic deep artificial neural network (TDANN): a unified end-to-end system that is trainable from natural visual inputs and in which the model units of each layer are placed on a simulated cortical sheet. In addition, we developed an optimization method that directly matches model units to brain voxels, which not only allows us to evaluate both the function and the spatial arrangement (topography) of the model, but also provides a more stringent test of computational models of the brain than prior methods. We find that a TDANN trained in an unsupervised way under a spatial constraint that minimizes wiring length by encouraging nearby units to have correlated responses predicts the function and spatial topography of the visual system at multiple scales: within a visual area (e.g., orientation, spatial frequency, and color maps across primary visual cortex), across a cortical expanse (e.g., clustered organization of category selectivity in ventral temporal cortex), and across the visual system (e.g., parallel processing streams).
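The spatial constraint described above (nearby units on the simulated cortical sheet should have correlated responses, as a proxy for short wiring) can be illustrated with a minimal sketch. This is not the authors' actual loss function; the `spatial_correlation_loss` helper and the assumed fall-off of target correlation with distance are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_correlation_loss(responses, positions):
    """Illustrative wiring-length proxy (not the TDANN's exact loss):
    penalize mismatch between pairwise response correlations and an
    assumed target correlation that falls off with cortical distance."""
    r = np.corrcoef(responses)                            # (n, n) response correlations
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    target = 1.0 / (1.0 + d)                              # assumed distance-to-correlation curve
    mask = ~np.eye(len(responses), dtype=bool)            # ignore self-pairs
    return float(np.mean((r[mask] - target[mask]) ** 2))

# toy example: 8 units on a 2-D sheet responding to 50 stimuli
responses = rng.standard_normal((8, 50))
positions = rng.uniform(0, 10, size=(8, 2))
loss = spatial_correlation_loss(responses, positions)
```

In the actual model this term would be minimized jointly with a task or self-supervised objective during training, so that response similarity comes to mirror cortical proximity.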
Surprisingly, the TDANN predicts not only the spatial organization but also the functional responses better than models trained on the key visual behaviors associated with each stream. This suggests an intriguing new idea: the functional organization of the visual system arises from a single principle, balancing general representation learning from the statistics of visual input with local spatial constraints.
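The abstract's unit-to-voxel matching can be sketched as a one-to-one assignment that maximizes response similarity between model units and measured voxels. The cost matrix, the equal unit/voxel counts, and the brute-force permutation search below are simplifying assumptions for illustration (a real implementation would use a scalable solver such as the Hungarian algorithm), not the authors' actual optimization method.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

n, n_stim = 5, 20
# hypothetical data: model-unit responses, and voxels that are a
# permuted, slightly noisy copy of them (ground-truth match is known)
model_units = rng.standard_normal((n, n_stim))
perm = [2, 0, 4, 1, 3]
voxels = model_units[perm] + 0.01 * rng.standard_normal((n, n_stim))

# cost[i, j] = 1 - correlation(unit i, voxel j)
corr = np.corrcoef(model_units, voxels)[:n, n:]
cost = 1.0 - corr

# brute-force one-to-one assignment minimizing total cost; best[i] is
# the voxel matched to model unit i
best = min(itertools.permutations(range(n)),
           key=lambda p: sum(cost[i, p[i]] for i in range(n)))
```

Because each matched unit must reproduce its specific voxel's responses, this kind of matching is a stricter test than fitting a pooled regression from model features to voxel data.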