[IEEE Fourth International Conference on Computational Intelligence and Multimedia Applications....
Query by Image Similarity Using a Fuzzy Logic Approach
Anchizes do E. L. Gonçalves Filho Electrical Engineering Department, PUC-RIO - Pontifical Catholic University of Rio de Janeiro [email protected]
Guilherme L. A. Mota Electrical Engineering Department, PUC-RIO - Pontifical Catholic University of Rio de Janeiro [email protected]
Marley M. B.R. Vellasco Electrical Engineering Department, PUC-RIO - Pontifical Catholic University of Rio de Janeiro and
Systems Engineering Department, UERJ - State University of Rio de Janeiro,
Brazil. [email protected]
Marco A. C. Pacheco Electrical Engineering Department, PUC-RIO - Pontifical Catholic University of Rio de Janeiro and
Systems Engineering Department, UERJ - State University of Rio de Janeiro,
Brazil [email protected]
Abstract
In this paper we propose a new model for query by image similarity. The model uses a fuzzy logic approach to cluster intrinsic image characteristics, which are extracted from subregions of the image. The clustering process provides a set of parameters that are used to compare a target image with a group of images. As a result, the system returns the images in the data set that are similar to the target image. As an example, we present some queries by similarity on an image database composed of 20 types of animals. The main objective of this model is to develop an intelligent image query system that can be applied to the web and to image databases.
1. Introduction
In modern society, the use of digital images has greatly increased in a broad variety of fields, particularly on account of the worldwide diffusion of the Internet. There is therefore an urgent need for robust methods to search for such images on the web. Since manual searches by topic of interest, which require human intervention, became an impossible task, query sites capable of performing such searches automatically were created in order to provide faster and more direct access to the desired information. Building on the many techniques already developed for applications in vision and robotics, queries by keywords recently began to be combined with pattern recognition techniques for the purpose of recognizing visual attributes [2].
In this paper, a new methodology for comparing images by similarity is presented. The method simultaneously considers three classes of attributes that are relevant to image characterization: color, texture and shape. Based on these parameters, a measure of similarity is determined, which makes it possible to identify the images that are most similar to the target image.
2. Image query system
Generally speaking, the desirable functionalities in an image query system may be described as follows [3]:
1. allow queries by keywords of the image description, date and generator;
2. allow queries by color, shape and formal attributes;
3. search on the internal databases of a specific site;
4. display the image as part of the query results;
5. provide the means to find information on copyrights; and
6. provide the terms for authorization and permission to use the system.
Among the existing image query systems, the following are worthy of note: IBM's QBIC [4], MIT's Photobook [5], UC Berkeley's Chabot [6], C-BIRD [7], by Canada's Simon Fraser University, and Excalibur Technologies' Excalibur Retrieval Ware [8]. These systems, however, only meet the functionality requirements in items 1, 2 and 4 described above. In addition, they do not allow the user to supply an image that is unknown to the system and use it as input for a query. Regardless of the type of database that is being used, the existing systems provide in their initial interface a list of image classes from which the user chooses the target image for the query. A system such as the one proposed in this paper, which allows the user to supply a previously unclassified image as input for a query, has not yet been provided.
3. Methodology
The modeling proposed for queries by image similarity is based on the theory of fuzzy sets. The fuzzy clustering algorithm fuzzy c-means (FCM) [9] was applied, and a methodology capable of measuring the similarity between two images was proposed. In this way, the similarity of a given target image is measured against each image in the database, and the algorithm returns the images that correspond to the highest similarity values.
3.1. FCM Algorithm
In non-fuzzy clustering, the boundaries between different groups (clusters) are precise; in other words, each pattern belongs to a single cluster only. In fuzzy clustering, each pattern belongs to several clusters simultaneously, with different degrees of membership. The FCM algorithm is one such fuzzy clustering algorithm. Fuzzy clustering models of this kind may be described as follows: let $X = \{x_1^T, x_2^T, \ldots, x_r^T\}$ be a data matrix composed of elements $x_i$ in a Euclidean space of dimension p. The objective is to partition the elements into c fuzzy clusters by supplying the following matrix U:
$$U = [u_{ij}], \quad i = 1, \ldots, c, \quad j = 1, \ldots, r \qquad \text{(Eq. 1)}$$
where $u_{ij}$ is a value in [0, 1] that indicates the degree of membership of element $x_j$ in cluster i. During the clustering process, the FCM algorithm minimizes the objective function defined in Eq. 2,
$$J(U, \upsilon) = \sum_{i=1}^{c} \sum_{j=1}^{r} u_{ij}^{m} \, \lVert x_j - \upsilon_i \rVert^{2}, \quad m \in [1, \infty) \qquad \text{(Eq. 2)}$$
where υi represents the centroid of cluster i and m represents the fuzzy coefficient that is responsible for the degree of fuzziness of the elements of the U matrix. The higher the fuzzy coefficient, the fuzzier the U matrix, and for m=1, the objective function is reduced to the crisp case.
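As a small numeric illustration of Eq. 2, the sketch below evaluates the objective for a toy data set. The function name `fcm_objective` and all data values are hypothetical and chosen only for illustration; the layout `U[i][j]` (cluster i, element j) follows Eq. 1.

```python
def fcm_objective(X, U, centroids, m):
    """Evaluate Eq. 2: sum over clusters i and elements j of u_ij^m * ||x_j - v_i||^2."""
    total = 0.0
    for i, v in enumerate(centroids):
        for j, x in enumerate(X):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
            total += (U[i][j] ** m) * sq_dist
    return total


# Toy data (hypothetical): r = 3 elements, p = 2 dimensions, c = 2 clusters.
X = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
centroids = [[0.5, 0.0], [4.0, 0.0]]

# Crisp memberships: every element belongs to exactly one cluster.
U_crisp = [[1.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
# Fuzzy memberships: every column still sums to 1.
U_fuzzy = [[0.9, 0.8, 0.1],
           [0.1, 0.2, 0.9]]

# With m = 1 and a crisp U, Eq. 2 is the usual sum of squared distances.
J_crisp = fcm_objective(X, U_crisp, centroids, m=1)
J_fuzzy = fcm_objective(X, U_fuzzy, centroids, m=2)
```

With a crisp U and m = 1, each element contributes only its squared distance to its own centroid, which is the crisp case mentioned above.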
In order to minimize the objective function, the FCM algorithm goes through the 4 stages below:
1. Choosing the number of clusters k (2 ≤ k ≤ r; k corresponds to c in Eq. 1 and Eq. 2), the fuzziness coefficient m, the stopping coefficient ε, and the initial matrix U(0). Setting the iteration index l = 0.
2. Calculating the centers of the clusters {υi(l) | i = 1, 2, …, k}.
3. Calculating the new matrix U(l+1).
4. Calculating ∆ = || U(l+1) - U(l) || = max over i, j of | uij(l+1) - uij(l) |. If ∆ > ε, then setting l = l+1 and returning to step 2; otherwise, ending.
3.2. Methodology employed for comparison between images
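Since the comparison procedure in this section applies FCM repeatedly, the four stages listed in section 3.1 can be sketched as follows. This is a minimal sketch, not the authors' implementation: the paper does not spell out the update formulas, so the standard fuzzy c-means centroid and membership updates are assumed here, and the function name `fcm`, its default parameters and the toy data are hypothetical.

```python
import random


def fcm(X, k, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means sketch. Returns (U, centroids), where U[i][j] is the
    membership of element j in cluster i. Assumes m > 1."""
    rng = random.Random(seed)
    r, p = len(X), len(X[0])
    # Stage 1: random initial matrix U(0) whose columns sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(r)] for _ in range(k)]
    for j in range(r):
        col = sum(U[i][j] for i in range(k))
        for i in range(k):
            U[i][j] /= col
    for _ in range(max_iter):
        # Stage 2: centroids as means weighted by u_ij^m (assumed standard update).
        centroids = []
        for i in range(k):
            w = [U[i][j] ** m for j in range(r)]
            sw = sum(w)
            centroids.append([sum(w[j] * X[j][d] for j in range(r)) / sw
                              for d in range(p)])
        # Stage 3: new memberships from squared distances (assumed standard update).
        U_new = [[0.0] * r for _ in range(k)]
        for j in range(r):
            d2 = [sum((X[j][d] - c[d]) ** 2 for d in range(p)) for c in centroids]
            zeros = [i for i in range(k) if d2[i] == 0.0]
            if zeros:
                # Element coincides with one or more centroids: share membership.
                for i in zeros:
                    U_new[i][j] = 1.0 / len(zeros)
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                s = sum(inv)
                for i in range(k):
                    U_new[i][j] = inv[i] / s
        # Stage 4: stop when the largest membership change is at most eps.
        delta = max(abs(U_new[i][j] - U[i][j]) for i in range(k) for j in range(r))
        U = U_new
        if delta <= eps:
            break
    return U, centroids


# Toy data (hypothetical): two well-separated groups of points.
X_demo = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
U_demo, centroids_demo = fcm(X_demo, k=2)
```

In the methodology described here, `X` would be the transformed matrix of an image and the returned centroids would be the k centroids stored for it.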
The algorithm proposed for comparison between images is based on a representation that subdivides each image into non-overlapping regions of 16 × 16 pixels. Texture, shape and color parameters that are invariant to rotation are extracted from each region. The set of all the parameters extracted from a region composes an attributes vector xi = {x1, x2, …, xp} of dimension p. The concatenation of the attributes vectors extracted from all r regions of an image forms the attributes matrix of the image, X = [x1, x2, …, xr]T, of dimension r×p. In this matrix, whose rows are observations in a p-dimensional space, an image is represented by its intrinsic characteristics in terms of texture, shape and color.
The methodology for queries by similarity comprises two stages. In the first stage, the attributes matrices that correspond to the images in the database are calculated. Next, considering this set of observations, the whitening transformation matrix [10] is calculated and applied to each attributes matrix; as a result, a transformed matrix (TM) is produced for each image. Each TM is clustered using the FCM algorithm. The resulting k centroids are then stored, on the assumption that they represent the essence of the intrinsic characteristics of each image in the database.
In the second stage, the target image is compared with an image from the database. Once a target image is given, its attributes matrix is calculated and submitted to the same whitening transformation that was obtained previously. The k centroids of one of the images in the database are appended to the transformed matrix and the resulting matrix is clustered with the FCM algorithm.
Part of the U matrix that results from this clustering process supplies the degrees of membership of the centroids of the database image to the clusters of the target image, thus forming the membership matrix M of dimension k×k, where the element mij provides the membership of centroid j of the database image to cluster i of the target image. By associating each centroid with a single cluster, a measure of similarity is extracted from the membership matrix (see section 3.5). The procedure described in the second stage is repeated for each image in the database and the measure of similarity is stored. In the end, the algorithm returns the images that were considered the most similar.
3.3. Composition of the attributes vector
The attributes vector is made up of three classes of parameters: color, texture and shape. For color, the HSV (Hue, Saturation and Value) representation of the average color of the pixels in the region was chosen, because this model contains an intuitive notion of color associated with the hue component, which represents the dominant color as perceived by an observer [11]. For texture, R. Haralick et al. showed in [12] that the texture contained in an image or image region may be represented by the co-occurrence matrices in 4 directions (d): 0°, 45°, 90° and 135°. Considering grayscale images quantized in N tones, each of these matrices is used to calculate 13 measures, which are known as Haralick's parameters.
Only 11 of these parameters have been considered in this paper, and each parameter $h_i$ is the average of the Haralick measure $f_i$ over the four directions ($f_i^{d}$):
$$h_i = \frac{1}{4} \sum_{d \in D} f_i^{d}, \quad D = \{0^{\circ}, 45^{\circ}, 90^{\circ}, 135^{\circ}\}, \quad i = 1, \ldots, 11 \qquad \text{(Eq. 3)}$$
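The averaging in Eq. 3 can be illustrated with a short sketch that builds a co-occurrence matrix for each of the four directions and averages two of Haralick's measures (angular second moment and contrast) over them. The pixel offsets chosen for each direction, the distance of one pixel, the symmetric counting and all function names are assumptions made for this illustration; the paper does not state these details, and only 2 of the 11 parameters are shown.

```python
# Assumed (row, column) step for each direction at a distance of one pixel.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def cooccurrence(img, n_levels, offset):
    """Symmetric, normalized gray-level co-occurrence matrix for one direction."""
    dr, dc = offset
    P = [[0.0] * n_levels for _ in range(n_levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = img[r][c], img[r2][c2]
                P[a][b] += 1.0
                P[b][a] += 1.0  # count each pixel pair in both orders
    total = sum(map(sum, P))
    return [[v / total for v in row] for row in P]


def angular_second_moment(P):
    return sum(v * v for row in P for v in row)


def contrast(P):
    return sum((i - j) ** 2 * v for i, row in enumerate(P) for j, v in enumerate(row))


def averaged_features(img, n_levels):
    """Eq. 3: average each measure over the four directions."""
    measures = [angular_second_moment, contrast]
    per_dir = {d: cooccurrence(img, n_levels, off) for d, off in OFFSETS.items()}
    return [sum(f(per_dir[d]) for d in OFFSETS) / len(OFFSETS) for f in measures]


# Toy regions (hypothetical), quantized to N = 2 gray tones.
flat = [[0] * 4 for _ in range(4)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
h_flat = averaged_features(flat, 2)
h_checker = averaged_features(checker, 2)
```

A uniform region gives the maximum angular second moment and zero contrast, while the checkerboard region has contrast only along the horizontal and vertical directions.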
For shape, the theory of moment invariants, proposed by Ming-Kuei Hu [13] in 1962, makes use of an analogy to algebraic invariants. This method allows the extraction of characteristics based on the description of surfaces according to their inertial characteristics, which are represented in terms of moments of order (p, q) [14].
3.4. Whitening transformation
In order to equalize the influence of each attribute, their ranges are normalized before the clustering process begins by applying the whitening transformation defined in Eq. 4:
$$y_i = \Gamma x_i \qquad \text{(Eq. 4)}$$
where $x_i$ represents the attributes vector, $\Gamma$ corresponds to the whitening transformation matrix [10] of dimension p×p, and $y_i$ corresponds to the transformed vector.
After the whitening transformation is applied, the influence of each attribute on the clustering process tends to be identical and the clusters obtained may reveal the essence of the intrinsic characteristics of the image.
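One possible way to obtain a matrix Γ for Eq. 4 is sketched below. The paper does not say how Γ is computed, so this sketch assumes a Cholesky-based whitening (Γ is the inverse of the lower-triangular Cholesky factor of the sample covariance), which is only one of several valid whitening transformations. All function names and data values are hypothetical.

```python
def covariance(rows):
    """Sample covariance matrix of observations stored as rows."""
    n, p = len(rows), len(rows[0])
    mean = [sum(row[d] for row in rows) / n for d in range(p)]
    return [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def cholesky(S):
    """Lower-triangular L with S = L L^T (S symmetric positive definite)."""
    p = len(S)
    L = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a + 1):
            s = S[a][b] - sum(L[a][t] * L[b][t] for t in range(b))
            L[a][b] = s ** 0.5 if a == b else s / L[b][b]
    return L


def invert_lower(L):
    """Inverse of a lower-triangular matrix by forward substitution."""
    p = len(L)
    inv = [[0.0] * p for _ in range(p)]
    for col in range(p):
        for a in range(col, p):
            rhs = 1.0 if a == col else 0.0
            s = rhs - sum(L[a][t] * inv[t][col] for t in range(col, a))
            inv[a][col] = s / L[a][a]
    return inv


def whitening_matrix(rows):
    """Assumed choice: Gamma = L^-1, so that Gamma * Cov * Gamma^T = I."""
    return invert_lower(cholesky(covariance(rows)))


def apply_whitening(gamma, x):
    """Eq. 4: y = Gamma x."""
    return [sum(g * v for g, v in zip(g_row, x)) for g_row in gamma]


# Toy attributes (hypothetical): 5 observations with p = 2 correlated attributes.
data = [[1.0, 2.0], [2.0, 4.5], [3.0, 5.5], [4.0, 9.0], [5.0, 9.5]]
gamma = whitening_matrix(data)
whitened = [apply_whitening(gamma, x) for x in data]
cov_after = covariance(whitened)
```

After the transformation, the sample covariance of the whitened observations is the identity matrix, so every attribute has unit variance and the attributes are uncorrelated.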
3.5. Similarity measure
The FCM algorithm considers that each pattern belongs to all the clusters with different degrees of membership by supplying the membership matrix M, which reveals the similarities of the images. The mij element of this matrix corresponds to the degree of membership of centroid j of the database image to cluster i of the target image. The proposed similarity measure is described in Eq. 5 and corresponds to the sum of the degrees of membership of each centroid to a cluster with each cluster being considered only once.
$$f_{M}(c_1, c_2, \ldots, c_k) = m_{c_1 1} + m_{c_2 2} + m_{c_3 3} + \cdots + m_{c_k k} \qquad \text{(Eq. 5)}$$
where $c_j$ denotes the cluster associated with centroid j, and no cluster is used more than once.
4. Case studies
In order to carry out the experiments, a database was constructed with 70 images extracted from the CorelDRAW [15] image database. Each image contains the photo of one out of a total of 20 types of animals against a white background. The attributes vector used to represent each image contains a total of 21 parameters: 11 for texture, 7 for shape and 3 for color. Upon each query, the algorithm returns the 3 images that were considered the most similar. Two experiments were carried out for the evaluation of the proposed method. The first evaluates the performance of the method by making use of the k value that was defined in the previous experiment; the second aims to confirm the invariance to rotation of the proposed methodology.
4.1. Queries for different target images
This experiment verifies the potential of the proposed similarity-based method for image queries. Four queries were performed with k = 5, using 4 distinct target images, as presented in Table 1.
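To make the ranking step of such a query concrete, the sketch below evaluates the measure of Eq. 5 for each database image and returns the best matches. The paper does not state how the association between centroids and clusters is chosen, so this sketch assumes the association that maximizes the sum, found by brute force over all permutations (120 candidates for k = 5). The names `similarity` and `top_matches`, the image labels and the membership values are hypothetical.

```python
from itertools import permutations


def similarity(M):
    """Eq. 5 under an assumed rule: pick the one-to-one association of
    centroids to clusters that maximizes the summed memberships.
    M[i][j] = membership of database centroid j in target cluster i."""
    k = len(M)
    return max(sum(M[c[j]][j] for j in range(k)) for c in permutations(range(k)))


def top_matches(membership_by_image, n=3):
    """Rank database images by similarity and return the n best names."""
    scores = {name: similarity(M) for name, M in membership_by_image.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy membership matrices (hypothetical), k = 2 for brevity.
memberships = {
    "image_a": [[0.9, 0.1], [0.1, 0.9]],  # centroids fall clearly into distinct clusters
    "image_b": [[0.5, 0.5], [0.5, 0.5]],  # centroids are ambiguous
    "image_c": [[0.2, 0.8], [0.8, 0.2]],  # clear, but with the clusters swapped
}
best_two = top_matches(memberships, n=2)
```

Because each cluster is used only once, a database image scores highly only when its centroids spread over all clusters of the target image instead of concentrating on a single one.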
It must be stressed that the 4 target images provided in this experiment were not part of the previously classified database. The images returned by the algorithm bear similarities to the input image that are coherent from the cognitive viewpoint; in other words, an observer would find similarities in terms of color, texture and shape. The similarity in terms of shape is less obvious, since the common observer would tend not to see a lynx (Table 1, line 3, column 1) as similar to a bison (Table 1, line 3, column 3); the resemblance only becomes clear when the outlines of the two animals are considered. It is difficult to judge to what extent texture influences the response of the system because it is a very abstract concept, and on this account color is the feature that the observer is able to distinguish most easily. Since the number of images in the database is quite limited, it is advisable to run new tests with databases that contain more images in order to evaluate the algorithm more accurately.
4.2. Invariance to rotation
The objective of this experiment is to confirm the invariance to rotation of the proposed methodology. To achieve this goal, 4 queries were performed with k = 5, using 4 distinct target images, which are images from the database rotated 30° anti-clockwise, as presented in the first column of Table 2. Table 2 shows the three most similar images from the database for each reference image. It can be observed from the results that the majority of the returned images are very similar to the target images provided as input. In addition, in all queries performed, the most similar image returned is exactly the original image, which demonstrates the robustness of the proposed method with regard to invariance to rotation.
5. Conclusions
This paper has proposed a model based on fuzzy logic for queries by image similarity. The images are divided into small subregions from which several texture, shape and color parameters are measured. The intrinsic characteristics of the images are extracted by means of the FCM algorithm and are used as a basis for their comparison. A given target image is compared with those in the database and, in the end, the algorithm returns those that are considered the most similar. The experiments were carried out with a database that contained 70 images. The results obtained were quite satisfactory and proved to be cognitively coherent. In the future, this model of intelligent query by similarity may be used to enhance the management of image databases that are available on the internet, especially after the system is expanded with the insertion of new attributes and similarity-based measuring methods. Another potential contribution is the possibility of controlling the influence of the most important attributes, automatically or not, and allowing the user to specify the characteristics that are most relevant to the desired similarity.
6. References
[1] Alta Vista: http://www.altavista.digital.com.
[2] WebSEEK: http://disney.ctr.columbia.edu/webseek/.
[3] The Big Picture: http://www.onlineinc.com/onlinemag/OL1998/beristein5.html.
[4] http://www.qbic.almaden.ibm.com/
[5] A. Pentland, R.W. Picard, S. Sclaroff, "Photobook: Content-Based Manipulation of Image Databases", SPIE Storage and Retrieval Image and Video Databases II, No. 2185, Feb. 6-10, 1994, San Jose.
[6] http://www.cs.berkeley.edu/~ginger/chabot.html
[7] M.S. Drew, J. Wei and Z.N. Li, "Illumination-invariant color object recognition via compressed chromaticity histograms of color-channel-normalized images", Proc. Int. Conf. on Computer Vision (ICCV 98), pp. 533-540, 1998.
[8] http://www.excalib.com/
[9] A. Flores-Sintas, José M. Cadenas, F. Martin, "Membership Functions in the Fuzzy C-Means Algorithm", Elsevier Science B.V., Fuzzy Sets and Systems 101 (1999) 49-58.
[10] Whitening: http://erin.mit.jyu.fi/projetcs/nda/usrman.old/node46.html.
[11] R. C. Gonzalez, R. E. Woods, "Digital Image Processing", 1992.
[12] R. Haralick, K. Shanmugam and Its'hak Dinstein, "Textural Features for Image Classification", IEEE Transactions SMC, Vol. SMC-3, No. 6, Nov. 1973.
[13] M. K. Hu, "Pattern Recognition by Moment Invariants", IRE Transactions on Information Theory, Vol. IT-8, No. 2, pp. 179-187, Feb. 1962.
[14] A. P. Reeves et al., "Three-Dimensional Shape Analysis Using Moments and Fourier Descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10, No. 6, Nov. 1988.
[15] CorelDRAW 9 - Corel Corporation - http://www.corel.com.
Table 1. Result of 4 queries by image similarity. (Image table; columns: Target Image, 1st, 2nd, 3rd.)
Table 2. The three most similar images returned for each reference image. (Image table; columns: Target Image, 1st, 2nd, 3rd.)