Textons

Textons are the fundamental micro-structures in generic natural images and the basic elements of early (pre-attentive) visual perception. In practice, the study of textons has important implications for several problems. First, decomposing an image into its constituent components reduces information redundancy and thus leads to better image coding algorithms. Second, the decomposed representation often has far fewer dimensions and less dependence between variables (coefficients), which facilitates the image modeling needed for image segmentation and recognition. Third, in biological vision, the micro-structures in natural images provide an ecological cue for understanding the functions of neurons in the early stages of the biological visual system.

One related mathematical theory for studying image components is harmonic analysis, which is concerned with decomposing certain classes of mathematical functions. It includes Fourier transforms, wavelet transforms, and, more recently, wedgelets, ridgelets, and the various image pyramids used in image analysis. In recent years, a widespread consensus has emerged that the optimal set of image components should be learned from the ensemble of natural images. This ensemble is known to be very different from the classical function classes for which the Fourier and wavelet transforms were originally derived.

This consensus has led to a vast body of work on natural image statistics and image micro-structures, in which two streams stand out. One stream studies the statistical regularities of natural images, including scale invariance, the joint density (histograms) of small image patches (e.g. 3 × 3 pixels), and the joint histograms or correlations of filter responses; probabilistic models are then derived to account for these spatial statistics. The other stream learns an over-complete basis from natural images under the general idea of sparse coding.
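As a rough illustration of what coding a patch over an over-complete basis means, the sketch below runs plain matching pursuit on a 3 × 3 patch. It is a minimal example under stated assumptions, not the learning procedure used in the work described here: the dictionary is random rather than learned from natural images, and the function name `matching_pursuit`, the dictionary size, and the test patch are all hypothetical choices made for illustration.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter=10):
    """Greedy sparse coding: signal ~= dictionary @ coeffs with few non-zero coeffs.

    `dictionary` has shape (n_pixels, n_atoms) with unit-norm columns. It is
    over-complete when n_atoms > n_pixels.
    """
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        # Correlate every base with what is still unexplained.
        correlations = dictionary.T @ residual
        k = int(np.argmax(np.abs(correlations)))
        # Assign that correlation to the best-matching base and subtract it.
        coeffs[k] += correlations[k]
        residual -= correlations[k] * dictionary[:, k]
    return coeffs, residual

# Assumption: a random dictionary stands in for a basis learned from images.
rng = np.random.default_rng(0)
n_pixels, n_atoms = 9, 27  # flattened 3x3 patch, three times over-complete
dictionary = rng.standard_normal((n_pixels, n_atoms))
dictionary /= np.linalg.norm(dictionary, axis=0)  # unit-norm columns

# A synthetic patch built from two bases, then coded with at most five.
patch = 2.0 * dictionary[:, 3] - 1.5 * dictionary[:, 17]
coeffs, residual = matching_pursuit(patch, dictionary, n_iter=5)
```

Because the bases are correlated rather than orthogonal, the greedy code is not guaranteed to pick exactly the two bases used to build the patch; the point is only that a handful of non-zero coefficients out of 27 account for most of the patch.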
In contrast to the orthogonal bases or tight frames of the Fourier and wavelet transforms, the learned bases are highly correlated, and a given image is coded by a sparse population of bases in the over-complete set. While the over-complete basis represents major progress in the pursuit of fundamental image elements, one may wonder what image structures lie beyond bases. By analogy to physics, if we compare the image bases in sparse or ICA coding to protons, neutrons, and electrons, then what are the "atoms", "molecules", and "polymers" in natural images? How do we learn such structures from generic images?

We present one step towards this goal. We first examine the generative model in the sparse coding scheme. One basic assumption of this scheme is that the bases are independent and identically distributed. To relax this assumption, we study the spatial structures of the bases under a generative model and define a texton as a mini-template that consists of a varying number of image bases in some geometric and photometric configuration. Like an atom in physics, a couple of bases in the texton have relatively large coefficients (heavy weights) and thus form the "nucleus", which is augmented by bases with small coefficients (light weights), like electrons. A small number of textons can then be learned from training images as repeating micro-structures. Figure 2 illustrates the progression from bases to textons using a star pattern as an example.
b) The texton template for the star pattern.
c) How bases compose the image of a star.
Figure 2. An illustration of the progression from bases to textons, using a star pattern as an example. (Zhu, Guo, Wu and Wang 2002)
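The idea of a texton as a mini-template of bases can be sketched in a few lines of code. The example below is only an illustration under assumptions: the elongated Gaussian bars stand in for whatever base functions are actually used, and the names `elongated_base`, `render_texton`, and `STAR_TEXTON`, along with every coefficient, orientation, and scale in the template, are hypothetical values chosen to produce a star-like shape. Each entry pairs a photometric attribute (the coefficient) with geometric attributes (offset from the texton centre, orientation, and scale).

```python
import numpy as np

def elongated_base(size, x0, y0, theta, sigma_long, sigma_short):
    """Gaussian bar centred at (x0, y0), elongated along angle theta (radians)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    dx, dy = xs - x0, ys - y0
    u = dx * np.cos(theta) + dy * np.sin(theta)   # coordinate along the bar
    v = -dx * np.sin(theta) + dy * np.cos(theta)  # coordinate across the bar
    return np.exp(-0.5 * ((u / sigma_long) ** 2 + (v / sigma_short) ** 2))

# Hypothetical star texton: one heavy "nucleus" base plus light "electron" bases.
# Each row: (coefficient, dx, dy, theta, sigma_long, sigma_short)
STAR_TEXTON = [
    (1.00, 0.0, 0.0, 0.0,           2.0, 2.0),  # nucleus: isotropic blob
    (0.30, 0.0, 0.0, 0.0,           6.0, 1.0),  # horizontal ray
    (0.30, 0.0, 0.0, np.pi / 2,     6.0, 1.0),  # vertical ray
    (0.15, 0.0, 0.0, np.pi / 4,     5.0, 1.0),  # diagonal ray
    (0.15, 0.0, 0.0, 3 * np.pi / 4, 5.0, 1.0),  # anti-diagonal ray
]

def render_texton(texton, size, cx, cy):
    """Place a texton at (cx, cy) and sum its weighted bases into an image."""
    image = np.zeros((size, size))
    for coeff, dx, dy, theta, s_long, s_short in texton:
        image += coeff * elongated_base(size, cx + dx, cy + dy, theta, s_long, s_short)
    return image

star = render_texton(STAR_TEXTON, size=21, cx=10, cy=10)
```

Storing the bases relative to the texton centre means the same template can be placed anywhere in an image by changing only `cx` and `cy`, which is one simple way to express the notion of a repeating micro-structure.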