tedster - 6:49 pm on May 8, 2008 (gmt 0)
I doubt that all of this is currently in place, but some of it may well be getting folded into the mix. The main focus of the patent is breaking down images into tiles, scoring those tiles according to shape, color and texture - and then looking for similarly scored images around the web, using probability models. The keywords associated with those similar images can then be used to help score the original image.

So there's a chance that an image might be filtered even though the real issue is with sites that host technically similar images. The process outlined in this patent has a kind of safety valve built in:

[0038] The process then expands the set of keywords associated with each identified image by adding synonyms for the set of keywords (step 114). In one embodiment of the present invention, intelligent thesaurus tools are used to add synonyms for each keyword in the set. For example, keywords "sea" and "ocean" may appear in two sets of keywords for two identified images, respectively. After expanding the keywords this way, both images will be associated with both keywords.

[0039] Next, the process performs comparisons between sets of keywords for those identified images, to identify intersecting keywords (step 116). Note that adding synonyms to the keywords increases the probability of identifying such intersections.

So a technical similarity of images alone would not cause the filter to kick in - but add in some intersection of the keyword sets and that could do it. I'd suggest a close look at keywords, image names, etc., with an eye to "synonym surprise." For example, it sounds like a photo filled with a pig's pink flesh and tail, coupled with a comment about the pig's "kinky" tail, just might tip the balance with this technology.
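To make the synonym step concrete, here is a rough Python sketch of how I read steps 114 and 116. It is only an illustration, not Google's implementation: the SYNONYMS table, the function names and the sample keyword sets are all made up, and I'm assuming "intersecting keywords" means keywords shared by every expanded set, which the patent text doesn't spell out.

```python
# Hypothetical thesaurus. The patent only mentions "intelligent thesaurus
# tools", so this tiny lookup table is an assumption for illustration.
SYNONYMS = {
    "sea": {"ocean"},
    "ocean": {"sea"},
    "curly": {"kinky"},
    "kinky": {"curly"},
}


def expand_keywords(keywords):
    """Step 114 as I read it: add synonyms for every keyword in the set."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def intersecting_keywords(keyword_sets):
    """Step 116 as I read it: keywords shared by all of the expanded sets."""
    expanded = [expand_keywords(s) for s in keyword_sets]
    if not expanded:
        return set()
    return set.intersection(*expanded)


# Keyword sets for two visually similar images (made-up data).
image_a = {"sea", "beach"}
image_b = {"ocean", "beach", "sunset"}

print(sorted(intersecting_keywords([image_a, image_b])))
# ['beach', 'ocean', 'sea']
```

Without the expansion, the two sets above would share only "beach". With it, "sea" and "ocean" both land in the intersection, which is exactly the kind of "synonym surprise" worth checking your own keywords for.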
In addition to hotlinks from adult websites, another place I'd pay some attention to is the new technical approach outlined in Google's Jan 2008 patent application, Method and apparatus for automatically annotating images [searchengineworld.com]:

[0037] After the multiple similar images are identified, the process obtains text surrounding these images...