Neural Networks for Segregation of Multiple Objects: Visual Figure-Ground Separation and Auditory Pitch Perception.
Abstract
An important component of perceptual object recognition is the segmentation into coherent perceptual units of the "blooming buzzing confusion" that bombards the senses. The work presented herein develops neural network models of some key processes of pre-attentive vision and audition that serve this goal. A neural network model, called an FBF (Feature -Boundary-Feature) network, is proposed for automatic parallel separation of multiple figures from each other and their backgrounds in noisy images. Figure-ground separation is accomplished by iterating operations of a Boundary Contour System (BCS) that generates a boundary segmentation of a scene, and a Feature Contour System (FCS) that compensates for variable illumination and fills-in surface properties using boundary signals. A key new feature is the use of the FBF filling-in process for the figure-ground separation of connected regions, which are subsequently more easily recognized. The new CORT-X 2 model is a feed-forward version of the BCS that is designed to detect, regularize, and complete boundaries in up to 50 percent noise. It also exploits the complementary properties of on-cells and off -cells to generate boundary segmentations and to compensate for boundary gaps during filling-in. In the realm of audition, many sounds are dominated by energy at integer multiples, or "harmonics", of a fundamental frequency. For such sounds (e.g., vowels in speech), the individual frequency components fuse, so that they are perceived as one sound source with a pitch at the fundamental frequency. Pitch is integral to separating auditory sources, as well as to speaker identification and speech understanding. A neural network model of pitch perception called SPINET (SPatial PItch NETwork) is developed and used to simulate a broader range of perceptual data than previous spectral models. The model employs a bank of narrowband filters as a simple model of basilar membrane mechanics, spectral on-center off-surround competitive interactions, and a "harmonic sieve" mechanism whereby the strength of a pitch depends only on spectral regions near harmonics. The model is evaluated using data involving mistuned components, shifted harmonics, complex tones with varying phase relationships, and continuous spectra such as rippled noise and narrow noise bands.
- Publication:
-
Ph.D. Thesis
- Pub Date:
- 1994
- Bibcode:
- 1994PhDT.......106W
- Keywords:
-
- Physics: Acoustics; Psychology: General; Engineering: Biomedical