---- Google Images indexing - how do these guys get so many?
tedster - 7:10 am on Jul 31, 2008 (gmt 0)
Nice find on that paper, piney. I notice that it focuses on "Product images" - I wonder if the algo is different for general images. I can see how it might make sense to create separate taxonomies with algo differences.
I also want to point to an earlier thread and the following information:
Google recently filed a patent application that offers a lot of clues on how images can be automatically processed for search. Here's the patent application [appft1.uspto.gov]. Notice these possibilities, especially when there is little or no data/metadata directly associated with an image:
Images can be auto-tagged according to shapes, colors, and textures. This may involve breaking down images into smaller tiles and tagging those tiles.
Images can be compared to other indexed images from around the web that have similar extracted features. Then keywords that are semantically related to those other images may be imported and used to tag the image that is being classified.
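To make those two ideas concrete, here is a rough Python sketch of tile-based features plus keyword borrowing from similar images. It is only my illustration, not the method in the patent application: the function names (tile_features, propose_tags), the mean-colour-per-tile feature, and the nearest-neighbour vote are all assumptions, and it assumes every image is a same-sized grid of (r, g, b) tuples.

```python
from collections import Counter

def tile_features(pixels, tile):
    """Split a 2-D grid of (r, g, b) pixels into tile x tile blocks and
    return the mean colour of each block, flattened into one vector."""
    height, width = len(pixels), len(pixels[0])
    feats = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            block = [pixels[y][x]
                     for y in range(top, min(top + tile, height))
                     for x in range(left, min(left + tile, width))]
            n = len(block)
            # One mean value per colour channel for this tile.
            feats.extend(sum(p[c] for p in block) / n for c in range(3))
    return feats

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def propose_tags(query, index, k=2, tile=2):
    """index is a list of (pixels, tags) pairs for already-tagged images.
    Borrow tags from the k most similar images, most frequent tag first."""
    q = tile_features(query, tile)
    ranked = sorted(index,
                    key=lambda item: distance(q, tile_features(item[0], tile)))
    votes = Counter(tag for _, tags in ranked[:k] for tag in tags)
    return [tag for tag, _ in votes.most_common()]
```

A real system would use far richer features (shape, texture) and an index built for fast lookup, but the flow is the same: extract features per tile, find look-alike images, then import their keywords.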
Google's challenge here is how to associate accurate keywords with images. They run a two-person game, Google Image Labeler, to try to enhance this data, and they do some pretty adventurous keyword expansion across different domains. I think that's also part of the webmaster's challenge - making the mark-up extremely clear as to which text relates directly to the image.
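As a quick way to check your own pages, here is a small Python sketch that lists each image along with the text the mark-up attaches to it directly (alt and title). It is a self-audit helper under my own assumptions, not a model of how Google's crawler works, and the names ImageTextCollector and image_text are made up for this example. It uses only the standard library's html.parser.

```python
from html.parser import HTMLParser

class ImageTextCollector(HTMLParser):
    """Collect src, alt and title for every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            a = dict(attrs)
            # Attributes written without a value come through as None.
            self.images.append({
                "src": a.get("src") or "",
                "alt": a.get("alt") or "",
                "title": a.get("title") or "",
            })

def image_text(html):
    """Return one dict per image found in the given HTML string."""
    parser = ImageTextCollector()
    parser.feed(html)
    parser.close()
    return parser.images
```

Any image that comes back with an empty alt and title is one where the only clues are the file name and whatever text happens to sit nearby.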