Forum Moderators: Robert Charlton & goodroi

Blocking googlebot but not google image bot

         

Sgt_Kickaxe

2:12 pm on Feb 4, 2012 (gmt 0)



This is a continuation of [webmasterworld.com...]

I've run into a situation for a client where the text will be mostly duplicate but the images are his own artwork based on that text. Think of a cartoonist who wants to mock political speeches (in a good-natured way).

He wants to rank well in Google Images but does not want to rank in Google's text-based search, and he does not want a duplicate-content penalty or ban because of the text when his images are the only unique content.

The previous thread points out that Google loads the page behind the image in a frame, so I don't know if this is even possible. Has anyone tested it yet? I can't test it on his site, and it will take some time to test it elsewhere. So HOW would you rank images in Google Images while not being indexed at all in traditional search?

lucy24

9:01 pm on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would be asking two concrete questions:

If the page itself is marked "noindex" but not "nofollow", would g### skip over the page and still grab its images?

In this situation, would it read the alt attributes of the images? There has to be some way for it to know what it's a picture of, beyond the color-and-shape matching discussed elsewhere.

OK, rewind.

As I was typing, I realized that I myself have a page that fits this description: it's no-indexed, but it contains a great big picture. If I go to Google Image Search, enter the appropriate search term and ask for results constrained to my site, the picture does not come up-- although a bunch of less appropriate pictures do.

There's one glitch, though. All links to the page in question are marked nofollow, which is supposed to mean that law-abiding robots can't get there from here.

If nobody else has a ready-made test case, I'll take away the "nofollow" (but not the "noindex", which I really need), wait a while, and see if it makes any difference.
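
For anyone setting up a similar test, the page-level signals in question can be read out with a short script. This is only a sketch: the `RobotsAndAltParser` class and the sample markup are made up for illustration, and it says nothing about how g### itself treats a noindexed page. It just reports the robots meta directives and the image alt text that are present in the HTML.

```python
from html.parser import HTMLParser


class RobotsAndAltParser(HTMLParser):
    """Collect robots meta directives and <img> alt text from one page."""

    def __init__(self):
        super().__init__()
        self.directives = set()   # e.g. {"noindex", "follow"}
        self.images = []          # list of (src, alt) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)       # attribute names arrive lowercased
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "img":
            self.images.append((attrs.get("src") or "", attrs.get("alt") or ""))


# Hypothetical page: noindexed, but links may be followed.
SAMPLE_PAGE = """
<html><head>
<meta name="robots" content="noindex, follow">
</head><body>
<img src="/art/speech-cartoon.jpg" alt="Cartoon of a political speech">
</body></html>
"""

parser = RobotsAndAltParser()
parser.feed(SAMPLE_PAGE)
print(sorted(parser.directives))
print(parser.images)
```

Running it against the real page before and after removing the "nofollow" would at least confirm that the markup says what you think it says while you wait for the crawl.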

1script

10:52 pm on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



lucy24, I think all it takes to break this logical equilibrium is someone hotlinking to your great picture from a forum site that has neither "noindex" nor "nofollow" (AFAIK nofollow is not even valid in an <img src=""> context). I would not be surprised if, once you noindex your pages, theirs come up instead, happily serving the hotlinked picture from your site, which would make you lose twice on one deal.

lucy24

12:13 am on Feb 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oops, you've misunderstood my point ;) I don't care about the picture one way or the other. My hotlinking routine will deal with it.* I just don't want the page itself to be indexed. (I've got an ironclad reason: The text consists almost entirely of search queries I've seen. So, almost by definition, it would be useless if it came up in response to a search.)

But that's an interesting added point for the OP and his client to consider: In the worst-case scenario, his own original pictures might show up in searches only if they are hotlinked.

I guess an even more important question is: How exactly do image searches work? I have one photo of a public figure that comes up in searches periodically. The searches are for "firstname lastname". The filename is "firstname.jpg" and the alt is simply "firstname". The full name only appears in the page text. So there has to be some cross-referencing between text and the image.


* In a pretty gratifying way, because the image is huge by my standards-- both in filesize and pixels-- so the NO HOTLINKS substitute graphic would be correspondingly vast. Even at thumbnail size it's effective (green and magenta on a black background).
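
The thread does not show the hotlinking routine itself, so the following is not lucy24's code, just a minimal sketch of the general idea: look at the Referer of an image request and serve a substitute graphic when it comes from another site. The host names, the `is_hotlink` and `choose_image` helpers, and the substitute path are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical host names; replace with the site's real ones.
OWN_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, own_hosts=OWN_HOSTS):
    """Return True when an image request appears to come from another site.

    An empty or missing Referer is treated as "not a hotlink", because
    many clients do not send the header at all.
    """
    if not referer:
        return False
    host = (urlsplit(referer).hostname or "").lower()
    return host not in own_hosts


def choose_image(requested_path, referer):
    """Pick the file to serve: the real image or the substitute graphic."""
    if is_hotlink(referer):
        return "/images/no-hotlinks.png"   # hypothetical substitute graphic
    return requested_path


print(choose_image("/art/big.jpg", "http://forum.example.org/thread/1"))
print(choose_image("/art/big.jpg", "http://www.example.com/gallery.html"))
```

Letting requests with no Referer through is a design choice in this sketch; a stricter rule would also swap the image for those requests.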

Sgt_Kickaxe

6:02 pm on Feb 6, 2012 (gmt 0)



Not worried about hotlinking. I just need to rank his art images in G image search while not being indexed in G regular search.

robots.txt doesn't seem to do the job even if I allow the image bot but not Googlebot, probably because Google wants to load the page behind the image?

Does ANYONE have images indexed from pages that are not indexed?
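
For reference, the robots.txt setup mentioned above (allow the image bot, block Googlebot) might look like the rules in this sketch, which uses Python's standard `urllib.robotparser` to check what each user agent is permitted to fetch. The URLs are hypothetical, and the check only covers crawl permission under these rules. It does not show how Google Images ranks anything, and as noted above this setup did not seem to do the job in practice.

```python
from urllib.robotparser import RobotFileParser

# The image-bot group is listed first on purpose: Python's parser uses the
# first group whose name is a substring of the user agent, so a "Googlebot"
# group placed first would also catch "Googlebot-Image" in this parser.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "http://example.com/cartoons/speech.html"   # hypothetical URLs
image = "http://example.com/cartoons/speech.jpg"

for agent in ("Googlebot", "Googlebot-Image"):
    print(agent, parser.can_fetch(agent, page), parser.can_fetch(agent, image))
```

Under these rules Googlebot is refused both the page and the image, while the image bot may fetch both, which is the configuration described in the post.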

tedster

7:25 pm on Feb 6, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This January change that Google just announced sounds like it will work against what you are hoping for.

  • Improved image search quality. [launch codename "endearo", project codename "Image Search"] This is a small improvement to our image search ranking algorithm. In particular, this change helps images with high-quality landing pages rank higher in our image search results.

    [insidesearch.blogspot.com...]

If I had this challenge, I think I'd let go of worries about duplicate text content and just see how Google sorts things out naturally. Duplicate text is not the boogie man that some webmasters like to conjure up.

tangor

9:28 pm on Feb 6, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Duplicate text is not the boogie man that some webmasters like to conjure up.

Quite true. In fact, in some situations (book lists, for example) SOME duplication is EXPECTED... and failure to find what is expected might raise a flag.

Sgt_Kickaxe

7:50 am on Feb 7, 2012 (gmt 0)



Looks like I won't have much choice but to allow both, or deny both if he chooses to use social (Facebook/Twitter) to promote his images.

That change isn't an improvement, tedster. When I use image search I am looking at IMAGES, not text. Now I can expect it to be harder to find exceptional images if they happen to be on the type of site I'm working with right now, where the artist doesn't care about SEO and such and just wants his artwork online.

Actually, the change is moronic. Think about it for a second... Google is making changes that disable frame busting and prevent people who use image search from even seeing the website the image is on without a bunch of extra clicks (including closing the image, which is not a natural reaction). WHY would you make the image ranking depend on something you will not easily show? If they applied the same logic to natural results and made it hard to rank text because of image quality, the non-artist webmasters would not be happy either.

I think I'm going to recommend he go with Twitter/Facebook at this point. I was hoping Google had thought this through and there was a solution. Google's loss.

lucy24

9:41 am on Feb 7, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When I use image search I am looking at IMAGES, not text.

You may think so, but while you're looking, g### has already loaded up the entire accompanying page, whether the user asked for it or not. (That's google specifically. Yandex just shows the image. Don't know how others work; I'm just going by what comes through in logs.)

And you've still got the question: how does g### find the images if it can't get to the page they're on? You could link to image directories, but the images would have to have some pretty eloquent filenames. Even the Imagebot can't look at a picture in isolation and tell the searcher what it's a picture of.