A self-organizing neural system for learning to recognize textured scenes.

A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, using multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that change gradually across space, as well as abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, the class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which to classify the novel data. ARTEX is compared with psychophysical data and is benchmarked on the classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers.
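The ART match/mismatch-and-search cycle described here can be illustrated with a minimal fuzzy-ART-style sketch. The sketch omits complement coding and the ART choice function, and its parameter values and inputs are illustrative assumptions, not the actual ARTEX implementation:

```python
def fuzzy_and(a, b):
    # Component-wise minimum ("fuzzy AND"), as in fuzzy ART
    return [min(x, y) for x, y in zip(a, b)]

def art_step(x, weights, rho=0.75, beta=1.0):
    """One ART category-choice step for an input vector x (values in [0, 1]).

    weights: list of learned category prototype vectors.
    rho:     vigilance; a category resonates only when the match
             |x AND w| / |x| meets this threshold.
    beta:    learning rate for updating the winning prototype.
    Returns (category index, weights); a new category is created when
    no existing prototype matches well enough (memory search fails).
    """
    for j, w in enumerate(weights):
        overlap = fuzzy_and(x, w)
        if sum(overlap) / sum(x) >= rho:          # resonance: good match
            weights[j] = [beta * o + (1 - beta) * wi
                          for o, wi in zip(overlap, w)]
            return j, weights
    weights.append(list(x))                        # mismatch: new category
    return len(weights) - 1, weights

w = []
c0, w = art_step([0.9, 0.1, 0.8], w)    # novel input: creates category 0
c1, w = art_step([0.85, 0.15, 0.8], w)  # close match: refines category 0
c2, w = art_step([0.1, 0.9, 0.1], w)    # poor match: search opens category 1
```

Two similar feature vectors fall into one category, while a vector that matches no prototype above the vigilance level triggers the creation of a new category, mirroring the memory search described in the abstract.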

Mechanics of accommodation of the human eye.

The classical Helmholtz theory of accommodation has, over the years, not gone unchallenged and most recently has been opposed by Schachar et al. (1993) (Annals of Ophthalmology, 25(1), 5-9), who suggest that increasing the zonular tension increases rather than decreases the power of the lens. This view is supported by a numerical analysis of the lens based on a linearised form of the governing equations. In this paper we propose an alternative numerical model in which the geometrically non-linear behaviour of the lens is explicitly included. Our results differ from those of Schachar et al. (1993) and are consistent with the classical Helmholtz mechanism.

A visual evoked potential correlate of global figure-ground segmentation.

Human observers discriminated the global orientation of a texture-defined figure which segregated from a texture surround. Global figure discriminability was manipulated through within-figure collinearity, figure-surround interaction, and figure connectedness, while the local orientation contrast at the edges between figure and surround was kept constant throughout all the experiments. Visual evoked potentials (VEPs) were recorded during onset-offset stimulation in which the figure cyclically appeared and disappeared from a uniform texture background. A difference component was obtained by subtracting the offset-VEP from the onset-VEP. Two negative peaks of the difference component were found, with latencies around 140-160 and 200-260 ms, respectively. Enhanced discriminability of the global figure reduced the latency of the second peak by 11-25 ms, indicating that the 200-260 ms component was produced by global figure-ground segmentation.

Seeing better at night: life style, eye design and the optimum strategy of spatial and temporal summation.

Animals which need to see well at night generally have eyes with wide pupils. This optical strategy to improve photon capture may be supplemented neurally by summing the outputs of neighbouring visual channels (spatial summation) or by increasing the length of time a sample of photons is counted by the eye (temporal summation). These summation strategies come only at the cost of spatial and temporal resolution. A simple analytical model is developed to investigate whether the improved photon catch afforded by summation really improves vision in dim light, or whether the losses in resolution actually make vision worse. The model, developed for both vertebrate camera eyes and arthropod compound eyes, calculates the finest spatial detail perceivable by a given eye design at a specified light intensity and image velocity. Visual performance is calculated for the apposition compound eye of the locust, the superposition compound eye of the dung beetle and the camera eye of the nocturnal toad. The results reveal that spatial and temporal summation is extremely beneficial to vision in dim light, especially in small eyes (e.g. compound eyes), which have a restricted ability to collect photons optically. The model predicts that, using optimum spatiotemporal summation, the locust can extend its vision to light intensities more than 100,000 times dimmer than if it relied on its optics alone. The relative amounts of spatial and temporal summation predicted to be optimal in dim light depend on the image velocity. Animals which are sedentary and rely on seeing small, slow images (such as the toad) are predicted to rely more on temporal summation and less on spatial summation. The opposite strategy is predicted for animals which need to see large, fast images. The predictions of the model agree very well with the known visual behaviours of nocturnal animals.
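The optimum-summation trade-off can be caricatured in a few lines of Python: Poisson photon noise gives a signal-to-noise ratio proportional to the square root of the photon count, and pooling n channels multiplies the count while dividing linear resolution by sqrt(n). All constants below (photon rate, SNR criterion, Nyquist limit, pooling set) are illustrative stand-ins, not values from the analytical model in the abstract:

```python
import math

def finest_resolvable_cycles(intensity, n_pool=1, dt=0.05,
                             base_rate=1e4, snr_needed=5.0,
                             nyquist=60.0):
    """Toy version of the summation trade-off (all constants illustrative).

    Pooling n_pool channels for dt seconds multiplies the Poisson
    photon count, raising SNR as sqrt(N), but cuts linear spatial
    resolution by sqrt(n_pool). Returns the finest resolvable spatial
    frequency (cycles/deg), or 0.0 when the SNR criterion is unmet.
    """
    n_photons = base_rate * intensity * n_pool * dt
    snr = math.sqrt(n_photons)            # Poisson-limited SNR
    if snr < snr_needed:
        return 0.0
    return nyquist / math.sqrt(n_pool)    # resolution lost to pooling

def best_summation(intensity, pools=(1, 4, 16, 64, 256)):
    # Pick the pooling that maximizes resolvable detail at this light level
    return max(pools, key=lambda n: finest_resolvable_cycles(intensity, n))
```

With these toy numbers, bright light favours no pooling (full resolution), while dim light favours heavy pooling, because coarse vision beats no vision; this is the qualitative prediction the abstract describes.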

Temporal sensitivity of human luminance pattern mechanisms determined by masking with temporally modulated stimuli.

Target contrast thresholds were measured using vertical spatial Gabor targets in the presence of full field maskers of the same spatial frequency and orientation. In the first experiment both target and masker were 2 cpd. The target was modulated at a frequency of 1 or 10 Hz and the maskers varied in temporal frequency from 1 to 30 Hz and in contrast from 0.03 to 0.50. In the second experiment both target and masker had a spatial frequency of 1, 5 or 8 cpd. The target was modulated at 7.5 Hz and the same set of maskers was used as in the first experiment. The results are not consistent with a widely used model that is based on mechanisms in which excitation is summed linearly and the sum is transformed by an S-shaped nonlinear excitation-response function. A new model of human pattern vision mechanisms, which has excitatory and divisive inhibitory inputs, describes the results well. Parameters from the best fit of the new model to the results of the first experiment show that the 1 Hz and 10 Hz targets were detected by mechanisms with temporal low-pass and band-pass excitatory sensitivity, respectively. Fits to the second experiment suggest that at 1 cpd, the excitatory tuning of the detecting mechanism is band-pass. At 5 and 8 cpd, the mechanisms are excited by a broad range of temporal frequencies. Mechanism sensitivity to divisive inhibition depends on temporal frequency in the same general way as sensitivity to excitation. Mechanisms are more broadly tuned to divisive inhibition than to excitation, except when the target temporal frequency is high.
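A mechanism with excitatory and divisive inhibitory inputs can be sketched as a response function in which excitation raised to a power p is divided by a constant plus masker-driven inhibition raised to a power q. The parameter values below are illustrative assumptions, not the paper's fitted mechanisms:

```python
def response(c_target, c_mask, se=100.0, si=60.0, p=2.4, q=2.0, z=10.0):
    """Divisive-inhibition mechanism response (illustrative parameters).

    Excitation driven by the target is divided by a pool containing a
    constant z plus inhibition driven by the masker; p > q gives the
    characteristic accelerating-then-compressive masking behaviour.
    """
    excitation = (se * c_target) ** p
    inhibition = z + (si * c_mask) ** q
    return excitation / inhibition

def threshold(c_mask, criterion=1.0):
    # Smallest target contrast whose response reaches the criterion,
    # found by bisection on [0, 1] (response is monotone in c_target)
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if response(mid, c_mask) >= criterion:
            hi = mid
        else:
            lo = mid
    return hi
```

In this sketch a high-contrast masker inflates the divisive pool and so elevates the target threshold, which is the basic masking effect the experiments measure; making the inhibitory weight depend on masker temporal frequency would reproduce the tuning the abstract reports.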

Context-dependent changes in visual sensitivity induced by Müller-Lyer stimuli.

We measured the detectability of a single line (target) flanked by high-contrast inward- or outward-pointing arrowheads (context). We show that, as a function of target contrast, context angle, and context position, there is a continuum of contextual modulations of target detectability varying from strong inhibition (target detection is impaired) to strong excitation (target detection is facilitated); target detection is not affected when the context is presented at low contrast. The results show striking correlations with the perceived length distortions in the Müller-Lyer illusion: an inward-pointing arrowhead results in improved target detectability and increased perceived length of the bar, whereas an outward-pointing arrowhead results in diminished target detectability and decreased perceived length of the bar. Both suppressive and facilitatory effects diminish as target contrast, arrowhead angle, and line-arrowhead spatial separation are increased. At larger distances between line and arrowhead the suppressive effects become facilitatory (the Müller-Lyer illusion reverses). In concurrent Müller-Lyer extent experiments, we found that the perceived length of the target stimulus is overestimated or underestimated when it is flanked by high-contrast inward- or outward-pointing arrowheads, with the magnitude of the length distortion diminishing as target contrast increases. To explain the nature of both context-induced suppression and facilitation in contrast detection, we present a population model of orientation detectors in visual cortex that relies on short- and long-range horizontal cortical connections, and suggest that the same type of mechanism that accounts for contrast detection may account for perceived extent.
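The distance-dependent switch from suppression to facilitation can be caricatured with a toy lateral-weight function: suppressive at short range (short-range connections), mildly facilitatory at longer range (long-range connections), with the modulation scaling with context contrast. The functional form, crossover distance, and all constants are invented for illustration and are not the population model itself:

```python
import math

def lateral_weight(distance, near_suppression=-0.6, far_facilitation=0.2,
                   crossover=1.0, spread=0.5):
    """Toy contextual weight as a function of line-arrowhead distance.

    Negative (suppressive) within the crossover distance, positive
    (facilitatory) beyond it; both effects decay with distance.
    All constants are illustrative assumptions.
    """
    if distance < crossover:
        return near_suppression * math.exp(-distance / spread)
    return far_facilitation * math.exp(-(distance - crossover) / spread)

def detectability(target_contrast, context_contrast, distance):
    # Contextual modulation scales with context contrast, so a
    # low-contrast context leaves target detection unaffected
    modulation = lateral_weight(distance) * context_contrast
    return target_contrast * (1.0 + modulation)
```

A nearby high-contrast context lowers detectability, a distant one raises it, and a zero-contrast context leaves it unchanged, matching the qualitative pattern reported in the abstract.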

One spatial filter limits speed of detecting low and middle frequency gratings.

Reaction times for detecting sinusoidal gratings depend jointly on grating contrast and spatial frequency. We examine whether the effect of spatial frequency results from low-pass filtering in a single channel or reflects processing of different frequencies by two or more different processing streams. Observers performed a speeded two-alternative spatial forced-choice detection task. Errors and reaction times were measured. Contrasts varied from 0.05 to 0.67, and spatial frequencies from 0.72 to 6.51 cpd. No effect of uncertainty about spatial frequency was found, arguing against multiple channels. The data are well fit by a single-channel model driven by a low-pass filter.
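A single-channel account of this kind can be sketched as a Piéron-style latency law driven by one low-pass filter: the filter attenuates contrast as spatial frequency rises, and latency grows as effective contrast falls. The exponential filter shape and all parameter values are illustrative assumptions, not the fitted model:

```python
import math

def filter_gain(freq_cpd, f0=4.0):
    # Low-pass spatial filter: gain falls off with frequency
    # (exponential form and corner constant f0 are illustrative)
    return math.exp(-freq_cpd / f0)

def reaction_time(contrast, freq_cpd, rt0=0.25, k=0.02, beta=1.0):
    """Piéron-style detection latency from a single filtered channel:
    RT = rt0 + k / (effective contrast)^beta, where effective contrast
    is stimulus contrast scaled by the channel's low-pass gain."""
    c_eff = contrast * filter_gain(freq_cpd)
    return rt0 + k / (c_eff ** beta)
```

One filter thus captures both empirical trends at once: raising contrast speeds detection, and raising spatial frequency slows it, with no need for multiple channels.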

The role of perspective information in the recovery of 3D structure-from-motion.

When investigating the recovery of three-dimensional structure-from-motion (SFM), vision scientists often assume that scaled-orthographic projection, which removes effects due to depth variations across the object, is an adequate approximation to full perspective projection. This is so even though SFM judgements can, in principle, be improved by exploiting perspective projection of scenes on to the retina. In an experiment, pairs of rotating hinged planes (open books) were simulated on a computer monitor, under either perspective or orthographic projection, and human observers were asked to indicate which they perceived had the larger dihedral angle. For small displays (4.6 x 6.0 degrees) discrimination thresholds were found to be similar under the two conditions, but diverged for all larger stimuli. In particular, as stimulus size was increased, performance under orthographic projection declined and by a stimulus size of 32 x 41 degrees performance was at chance for all subjects. In contrast, thresholds decreased under perspective projection as stimulus size was increased. These results show that human observers can use the information gained from perspective projection to recover SFM and that scaled-orthographic projection becomes an unacceptable approximation even at quite modest stimulus sizes. A model of SFM that incorporates measurement errors on the retinal motions accounts for performance under both projection systems, suggesting that this early noise forms the primary limitation on 3D discrimination performance.
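The contrast between the two projection models is easy to state in code: under pinhole perspective, image position depends on depth, while scaled-orthographic projection discards depth entirely, so the discrepancy between them grows with the lateral extent of the stimulus. The viewing distance d and the example points are arbitrary illustrations:

```python
def perspective(x, y, z, d=5.0):
    # Pinhole projection: image position scales with 1 / (viewing distance + depth)
    return (d * x / (d + z), d * y / (d + z))

def scaled_orthographic(x, y, z, d=5.0):
    # Scaled-orthographic approximation: one global scale, depth z ignored
    # (with unit global scale here for simplicity)
    return (x, y)

def projection_error(x, z, d=5.0):
    # Horizontal discrepancy between the two projections of the point (x, 0, z)
    px, _ = perspective(x, 0.0, z, d)
    ox, _ = scaled_orthographic(x, 0.0, z, d)
    return abs(px - ox)
```

For a point at depth z the discrepancy works out to x·z/(d+z), which increases with lateral position x; this is the geometric reason the approximation holds for small displays but fails for large ones, consistent with the psychophysics reported above.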