
Feature Computations

The first step in computing bottom-up saliency is to generate an image pyramid for each feature, so that computations can be carried out at different scales. Three features are considered: intensity, orientation, and color. For the intensity feature, we convert the input image to gray-scale and generate a Gaussian pyramid with 5 scales $s_0$ to $s_4$ by successively low-pass filtering and subsampling the input image, i.e., scale $s_{i+1}$ has half the width and height of scale $s_i$.
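
The pyramid construction can be sketched as follows. This is a minimal illustration, not the original implementation: the 5-tap binomial kernel, the edge-replicating border handling, and the function names `blur` and `gaussian_pyramid` are assumptions, since the text only specifies low-pass filtering followed by subsampling.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian low-pass
# filter (an assumption; the text does not name the filter).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(image):
    """Separable low-pass filter with edge replication at the borders."""
    h, w = image.shape
    padded = np.pad(image, 2, mode="edge")
    # Filter along the rows first, then along the columns.
    rows = sum(wgt * padded[:, k:k + w] for k, wgt in enumerate(KERNEL))
    return sum(wgt * rows[k:k + h, :] for k, wgt in enumerate(KERNEL))


def gaussian_pyramid(gray, levels=5):
    """Return [s_0, ..., s_{levels-1}]; each scale halves width and height."""
    pyramid = [np.asarray(gray, dtype=float)]
    for _ in range(levels - 1):
        # Low-pass filter, then keep every second pixel in both directions.
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid


pyramid = gaussian_pyramid(np.random.rand(64, 96))
print([p.shape for p in pyramid])
```

Filtering before subsampling suppresses the high frequencies that would otherwise alias when every second pixel is dropped.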

The intensity maps are created by center-surround mechanisms, which compute the intensity differences between image regions and their surroundings. We compute two kinds of maps: the on-center maps $I''_{\mbox{\scriptsize on}}$ for bright regions on a dark background, and the off-center maps $I''_{\mbox{\scriptsize off}}$ for dark regions on a bright background. Each pixel in these maps is computed as the difference between a center $c$ and a surround $\sigma$ ($I''_{\mbox{\scriptsize on}}$) or vice versa ($I''_{\mbox{\scriptsize off}}$). Here, $c$ is a pixel in one of the scales $s_2$ to $s_4$, and $\sigma$ is the average of the surrounding pixels, computed for two different radii. This yields $2 \times 3 \times 2 = 12$ intensity scale maps $I''_{i,s,\sigma}$ with $i \in \{\mbox{on}, \mbox{off}\}$, $s \in \{s_2, s_3, s_4\}$, and $\sigma \in \{3,7\}$.
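
A center-surround computation along these lines might look like the sketch below. Several details are assumptions rather than facts from the text: the surround is taken as a square neighbourhood of side $2\sigma + 1$ without the center pixel, borders are handled by edge replication, and negative differences are clamped to zero. The names `surround_mean`, `center_surround`, and `intensity_scale_maps` are hypothetical.

```python
import numpy as np


def surround_mean(image, radius):
    """Mean of the square neighbourhood of side 2*radius+1, center excluded.

    The square shape and the edge-replicating border are assumptions.
    """
    h, w = image.shape
    padded = np.pad(image, radius, mode="edge")
    size = 2 * radius + 1
    total = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return (total - image) / (size * size - 1)


def center_surround(image, radius):
    """Return (on_map, off_map) for one scale and one surround radius."""
    diff = image - surround_mean(image, radius)
    # Clamping negative differences to zero is an assumption.
    on_map = np.maximum(diff, 0.0)    # bright center, darker surround
    off_map = np.maximum(-diff, 0.0)  # dark center, brighter surround
    return on_map, off_map


def intensity_scale_maps(pyramid, scales=(2, 3, 4), radii=(3, 7)):
    """Map (kind, scale, radius) -> map; 2 * 3 * 2 = 12 entries by default."""
    maps = {}
    for s in scales:
        for radius in radii:
            on_map, off_map = center_surround(pyramid[s], radius)
            maps[("on", s, radius)] = on_map
            maps[("off", s, radius)] = off_map
    return maps


# A single bright pixel on a dark background responds in the on-center map.
spot = np.zeros((16, 16))
spot[8, 8] = 1.0
on_map, off_map = center_surround(spot, 3)
print(on_map[8, 8], off_map[8, 8])
```

Keeping on-center and off-center responses in separate maps means that a bright spot and a dark spot do not cancel each other when the maps are summed later.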

The maps for each $i$ are summed up by inter-scale addition $\bigoplus$, i.e., all maps are resized to scale $s_2$ and then added up pixel by pixel, yielding the intensity feature maps $I'_i = \bigoplus_{s,\sigma} I''_{i,s,\sigma}$.
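
Inter-scale addition can be illustrated with a short sketch. The text does not say which interpolation is used for resizing, so nearest-neighbour resizing is assumed here, and the helper names `resize_nearest` and `inter_scale_add` are hypothetical.

```python
import numpy as np


def resize_nearest(image, shape):
    """Nearest-neighbour resize to shape = (rows, cols); an assumed method."""
    rows = np.arange(shape[0]) * image.shape[0] // shape[0]
    cols = np.arange(shape[1]) * image.shape[1] // shape[1]
    return image[np.ix_(rows, cols)]


def inter_scale_add(maps, target_shape):
    """Resize every map to target_shape and add them pixel by pixel."""
    result = np.zeros(target_shape)
    for m in maps:
        result += resize_nearest(m, target_shape)
    return result


# Three maps standing in for scales s_2, s_3, s_4 of a 64 x 64 input.
maps = [np.full((16, 16), 1.0), np.full((8, 8), 2.0), np.full((4, 4), 3.0)]
feature_map = inter_scale_add(maps, maps[0].shape)
print(feature_map.shape, feature_map[0, 0])
```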

To obtain the orientation maps, four oriented Gabor pyramids are created, detecting bar-like features of the orientations $\theta \in \{0^{\circ}, 45^{\circ}, 90^{\circ}, 135^{\circ}\}$. Scales $s_2$ to $s_4$ of each pyramid are summed up by inter-scale addition, yielding 4 orientation feature maps $O'_\theta$.
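
As a rough illustration of oriented Gabor filtering at a single pyramid scale, consider the sketch below. None of the filter parameters (kernel size, wavelength, envelope width, aspect ratio) come from the text; they are placeholder assumptions, as are the helper names and the choice of the response magnitude as the map value. Here $\theta$ is taken as the orientation of the bar itself, measured in image coordinates with $y$ pointing down.

```python
import numpy as np

ORIENTATIONS = (0, 45, 90, 135)


def gabor_kernel(theta_deg, size=9, wavelength=4.0, sigma=2.0, gamma=0.5):
    """Even Gabor kernel tuned to bars of orientation theta_deg.

    All parameter defaults are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    along = x * np.cos(theta) + y * np.sin(theta)    # along the bar
    across = -x * np.sin(theta) + y * np.cos(theta)  # perpendicular to it
    envelope = np.exp(-((gamma * along) ** 2 + across ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * across / wavelength)
    # Remove the mean so that uniform regions produce no response.
    return kernel - kernel.mean()


def filter_same(image, kernel):
    """Correlate image with kernel; output keeps the input size."""
    half = kernel.shape[0] // 2
    h, w = image.shape
    padded = np.pad(image, half, mode="edge")
    out = np.zeros((h, w))
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def orientation_maps(scale_image):
    """Map theta -> magnitude of the Gabor response at one scale."""
    return {t: np.abs(filter_same(scale_image, gabor_kernel(t)))
            for t in ORIENTATIONS}


# A horizontal bar should respond most strongly to the 0 degree filter.
bar = np.zeros((16, 16))
bar[7:9, :] = 1.0
maps = orientation_maps(bar)
print(maps[0][8, 8] > maps[90][8, 8])
```

Applying `orientation_maps` to each scale of an image pyramid would give one pyramid per orientation, which is the structure the paragraph above describes.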

To compute the color feature maps, the color image is first converted into the uniform CIE LAB color space [2], which represents colors in a way similar to human perception. The three parameters of the model represent the luminance of the color (L), its position between red and green (A), and its position between yellow and blue (B). From the LAB image, a color image pyramid $P_{\mbox{\scriptsize LAB}}$ is generated, from which four color pyramids $P_R$, $P_G$, $P_B$, and $P_Y$ are computed for the colors red, green, blue, and yellow. The maps of these pyramids show to which degree a color is represented in an image, e.g., the maps in $P_R$ show the brightest values at red regions and the darkest values at green regions. Luminance is already considered in the intensity maps, so we ignore this channel here. The pixel value $P_{R,s}(x,y)$ in map $s$ of pyramid $P_R$ is obtained from the distance between the corresponding pixel $P_{\mbox{\scriptsize LAB}}(x,y)$ and the prototype for red $r = (r_a,r_b) = (255,127)$. Since $P_{\mbox{\scriptsize LAB}}(x,y)$ is of the form $(p_a,p_b)$, this yields: $P_{R,s}(x,y) = \Vert (p_a,p_b) - (r_a,r_b) \Vert = \sqrt{(p_a - r_a)^2 + (p_b - r_b)^2}.$
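
The distance formula translates directly into array code. The sketch below assumes that the A and B channels are stored as values in $[0, 255]$, which is what the prototype $(255,127)$ suggests, and the name `color_distance_map` is hypothetical. The distance is smallest at the prototype color; how distances are mapped to map brightness is not spelled out in this section, so the sketch returns the raw distances.

```python
import numpy as np

# Prototype for red as given in the text: (r_a, r_b) = (255, 127).
RED = (255.0, 127.0)


def color_distance_map(ab, prototype):
    """Euclidean distance of every (a, b) pixel to a prototype color.

    ab: array of shape (H, W, 2) holding the A and B channels,
        assumed to lie in [0, 255].
    """
    diff = np.asarray(ab, dtype=float) - np.asarray(prototype, dtype=float)
    return np.sqrt((diff ** 2).sum(axis=-1))


# One pixel exactly at the red prototype, one at the other end of the A axis.
ab = np.array([[[255, 127], [0, 127]]])
dist = color_distance_map(ab, RED)
print(dist)
```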

On these pyramids, the color contrast is computed by on-center-off-surround differences, yielding $4 \times 3 \times 2 = 24$ color scale maps $C''_{\gamma,s,\sigma}$ with $\gamma \in \{\mbox{red}, \mbox{green}, \mbox{blue}, \mbox{yellow}\}$, $s \in \{s_2, s_3, s_4\}$, and $\sigma \in \{3,7\}$. The maps of each color are inter-scale added into 4 color feature maps $C'_\gamma = \bigoplus_{s,\sigma} C''_{\gamma,s,\sigma}$.

root 2005-01-27