

Google SEO News and Discussion Forum

This 31 message thread spans 2 pages.
Google noindex and images
Naveenn




msg:4484011
 9:10 am on Aug 12, 2012 (gmt 0)

Hi,

I have an image gallery that opens large images in a "new page" with little or no content other than the large image. So, for a better user experience and to avoid spamming search engines, I recently added the meta tag "noindex, follow" to these pages.

Within one month all these large image pages were deindexed from the Google index, which is good. However, I see that all the images attached to these pages are also missing from the index. I even submitted an image sitemap to Google, but it won't index my large images.

Also, directory listing of these image folders is disabled. Could that be a reason? I don't want directory listing.

Please advise on ways to get the images indexed while keeping the pages out of the index.

regards,
Naveen

 

tedster




msg:4484275
 12:58 pm on Aug 13, 2012 (gmt 0)

This is an interesting report. It does seem to indicate that noindex meta tags apply to all page elements - something I never thought too much about before.

Have you thought about removing the noindex meta tag and letting Google make that determination on their own? Alternatively, you can link to just the large image itself, using "lightbox" technology or one of its clones such as slimbox.

Naveenn




msg:4484280
 1:10 pm on Aug 13, 2012 (gmt 0)

I removed the "image pages" from the index because my site's ranking fell badly after the "panda" update(s). This was an experiment to improve the ranking of the category pages; I thought the large image pages had been identified as "thin content pages", hence the penalty.

The thumbnail images are still indexed, as those are on the category pages, but the "large images" have disappeared, which I think is related to the "noindex" tag on the large image pages.

I have even submitted an image sitemap which contains each category page link and all the large images under it, but Google won't index them.

If I enable directory listing, would the images then get indexed?

tedster




msg:4484291
 2:01 pm on Aug 13, 2012 (gmt 0)

It looks like the noindex pages you created were the only link to the large images. If you create another link directly to those images, they should get indexed again.

Naveenn




msg:4484296
 2:12 pm on Aug 13, 2012 (gmt 0)

This is why I created a sitemap. If I create an "on site" image map, won't that be treated as "duplicate content", a "thin page" or a "doorway page"?

Also, if there is already a tag on the site to "noindex" a page (all elements, as we have learned), won't that content already be filtered even if I link those images from another page/map?

indyank




msg:4484300
 2:32 pm on Aug 13, 2012 (gmt 0)

Since you have used "noindex, follow", they should still follow the direct image links. Did you also "nofollow" all internal links to those pages that only have the individual large images? Have you ruled out all other reasons?

This example increases my suspicion that they treat pages with "noindex" or "noindex, follow" meta tags as equivalent to "noindex, nofollow".

lucy24




msg:4484349
 4:29 pm on Aug 13, 2012 (gmt 0)

Since you have used "noindex, follow", they should still follow the direct image links.

I suspect it depends on whether the images are linked as <img src -- which they would be, if they're on pages of their own-- or as <a href -- no text, just the image file. I've got similar gallery pages myself. And I've seen robots treating the two differently, even when there's a jpg at the far end of both. Not just evil Ukrainians, but respectable* crawlers like the bing/msn imagebot.

Now, if you had a supplementary page that was tucked away somewhere, but human-accessible, and that page included <a href links to all the images that are normally linked with <img src ...

I've never heard about Duplicate Content in the context of images. Only text.


* Yes, OK, for a given definition of "respectable" ;)

phranque




msg:4484491
 3:30 am on Aug 14, 2012 (gmt 0)

perhaps this is a candidate for using the link rel canonical element to specify the image file as the canonical url for the image page and then allow the image page to be crawled and indexed.

rango




msg:4484614
 12:14 pm on Aug 14, 2012 (gmt 0)

Among other things, we have tried a similar thing to get out of panda. Noindexed several million image pages from our site. Only a select ~30,000 of the featured ones are now allowed to be indexed.

It made no difference wrt panda. Though it does mean less activity from googlebot on those pages which is a good thing for server load. So it was still worthwhile.

Naveenn




msg:4484630
 1:37 pm on Aug 14, 2012 (gmt 0)


perhaps this is a candidate for using the link rel canonical element

This seems like a good option but it will also mean indexing of the large image pages, which might be a cause for my panda penalty. Also I believe that even the "rel canonical" will come under "noindex" as it is present on the same page.

Currently I have the image embedded as <img src=>. If I wrap it in a link to the image file itself, will that make Google index it, given that the page is "noindex, follow"?

indyank




msg:4484718
 4:31 pm on Aug 14, 2012 (gmt 0)

My guess is you are somehow blocking the directory that contains the large images through robots.txt or .htaccess.

It might also be that the problem is something else, say copied images etc.

As for the image only pages, you may remove the noindex meta tag and specify the parent page as the canonical url.

phranque




msg:4484742
 5:32 pm on Aug 14, 2012 (gmt 0)

Also I believe that even the "rel canonical" will come under "noindex" as it is present on the same page.


not if you...
allow the image page to be crawled and indexed

Naveenn




msg:4484902
 4:56 am on Aug 15, 2012 (gmt 0)

As for the image only pages, you may remove the noindex meta tag and specify the parent page as the canonical url.


I guess this is the right way of doing it, will this also mean that images pages are not indexed?

not if you...
allow the image page to be crawled and indexed

The sole purpose of this experiment is to keep the image pages out of the index, so I wonder if that would help.

phranque




msg:4484910
 6:35 am on Aug 15, 2012 (gmt 0)

google "should" take your "preference/suggestion" in the link rel canonical element and index that in place of the crawled element.

if you specify a meta robots noindex i assume google will ignore the entire document, including the link rel canonical element.

john mu's post from this webmaster central help forum thread is 3 years old but might give you some things to think about.

Canonical conflicts with NOINDEX?:
http://productforums.google.com/forum/#!category-topic/webmasters/crawling-indexing--ranking/0sqRrolO_Ss%5B1-25%5D
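
To see which of these signals a given page actually sends, one could inspect its head section. The following is a minimal Python sketch, not anything Google publishes: it uses the standard library's html.parser to collect the robots meta directives and the rel=canonical URL, and flags the combination discussed above. The page markup and the example.com URL are made up for illustration.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the robots meta directives and the rel=canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.robots = set()      # e.g. {"noindex", "follow"}
        self.canonical = None    # href of <link rel="canonical">, if any

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            for directive in attrs.get("content", "").split(","):
                if directive.strip():
                    self.robots.add(directive.strip().lower())
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None


def indexing_signals(html):
    """Return (set of robots directives, canonical URL or None) for an HTML string."""
    parser = IndexingSignals()
    parser.feed(html)
    return parser.robots, parser.canonical


# Hypothetical image page that carries both signals at once.
page = """
<html><head>
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="http://www.example.com/gallery/cat123/">
</head><body><img src="/images/media/cat123/photo1.jpg"></body></html>
"""

robots, canonical = indexing_signals(page)
if "noindex" in robots and canonical:
    print("page sends both noindex and a canonical URL:", canonical)
```

A check like this only reports what the page declares; how Google weighs the two signals against each other is exactly the open question in this thread.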

indyank




msg:4485044
 1:51 pm on Aug 15, 2012 (gmt 0)

I guess this is the right way of doing it, will this also mean that images pages are not indexed?


Yes. You might also want to make sure that the pages for images carry the same title tag as their parent. That will be an added hint to Google.

Naveenn




msg:4485064
 2:37 pm on Aug 15, 2012 (gmt 0)

Thank you guys for all the help and discussion. I have now allowed image pages to be indexed and pointed the canonical URL to the "category page". I have around 350 categories and 12000+ images. I believe 12000+ image pages pointing to their 350 respective categories should boost the category pages.

I will update you about anything positive/negative out of this.

Thanx,
Naveen

Nostalgic Dave




msg:4487455
 4:28 am on Aug 23, 2012 (gmt 0)

I have this exact same problem. I added meta noindex to my image attachment pages because my entire site was in Pandaville, and once those pages were out of the index, my site popped back up and has stayed there... but my images are also gone from the index. I don't want to have Google linking directly to the image URL because then my site's template/ads are not showing. I'll gladly accept no images in the index if it's what keeps me out of Pandaville!

I'm looking forward to seeing if this works for you Naveenn!

Naveenn




msg:4487458
 4:51 am on Aug 23, 2012 (gmt 0)

Hi Dave,

I have seen a smooth upward curve in Google traffic over the last few days. It is only around 100-200 visitors, but so far it seems positive.

I want my images to be indexed, as they bring a lot of traffic to the site. I wonder why you are OK with images not being indexed?

Also thanx for pointing out that this approach helps with Panda!

Nostalgic Dave




msg:4487459
 5:00 am on Aug 23, 2012 (gmt 0)

Thanks for the update Naveenn.

I'd love to have my images indexed, as long as the page they are associated with is my image attachment page, and not just the image URL all by itself. Having Google index from an image site-map, wouldn't that result in the image URL's being the ones used? Meaning... a Google visitor comes to the *.jpg url, where there is no HTML at all?

indyank




msg:4487474
 5:47 am on Aug 23, 2012 (gmt 0)

I would like to add that there is something terrible going on with Images where they wrongly credit them to scraper sites. The scenario is like this:

1) You have images in those separate image attachment pages. You have the .htaccess restriction in place for all sites except the ones you desire like google.

2) Someone else, in this case a blogspot blog, is scraping the entire set of images and displaying them on their blog by linking directly to those images through the src attributes of img tags. They don't even give credit by linking back to the image attachment pages or the direct images using href. Google displays these scraper sites as the source for those images.


Coming back to what is in discussion, Google does have a separate "noimageindex" robots meta tag to tell Google images bot that the image attachment page should not be indexed as a reference page. See this - [support.google.com...]

But I am not sure why Google does not index images when you simply add "noindex" and not "noimageindex". Are you sure that is the reason, or is it that you have blocked images for others (except Google) using .htaccess? Can you two please confirm?

[edited by: indyank at 6:35 am (utc) on Aug 23, 2012]

Naveenn




msg:4487480
 6:16 am on Aug 23, 2012 (gmt 0)


But I am not sure why Google does not index images when you simply add "noindex" and not "noimageindex". Are you sure that is the reason, or is it that you have blocked images for others (except Google) using .htaccess? Can you two please confirm?


My images were well indexed when I did not have the "noindex" tag on large image pages. It was my mistake that I waited too long to check the image index; by then all the images were gone! However, the thumbnail images are still indexed, and they sit under the same parent directory (images/thumbnails/cat123/ versus images/media/cat123/).

Also, a few large images are indexed in bing/yahoo. I also ran a google bot test from webmaster tools and for direct image links I get "Allowed" for googlebot, even fetch as google worked.


Having Google index from an image site-map, wouldn't that result in the image URL's being the ones used? Meaning... a Google visitor comes to the *.jpg url, where there is no HTML at all?


Image sitemaps do not work with simple links to the images. You need to define an HTML page which is taken as the parent of the images. You can have up to 1000 images per page. Search for "google image sitemap" and follow the Google link.
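
As a rough illustration of the parent-page structure described above, here is a minimal Python sketch that builds an image sitemap with the standard library. The namespaces are the ones used by the sitemaps.org protocol and Google's image sitemap extension; the example.com URLs and the build_image_sitemap helper are hypothetical, and the 1000-image limit is simply the figure quoted in this post.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_PAGE = 1000  # limit as quoted in the post above


def build_image_sitemap(pages):
    """pages maps a parent HTML page URL to the list of image URLs shown on it."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        if len(image_urls) > MAX_IMAGES_PER_PAGE:
            raise ValueError(f"too many images listed for {page_url}")
        # Each <url> entry names the parent page, not the image file.
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # The images hang off the parent page as <image:image> children.
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical category page with two large images.
print(build_image_sitemap({
    "http://www.example.com/gallery/cat123/": [
        "http://www.example.com/images/media/cat123/photo1.jpg",
        "http://www.example.com/images/media/cat123/photo2.jpg",
    ],
}))
```

Because every image is listed under an HTML page URL, a search visitor would be sent to that page rather than to the bare .jpg file.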

indyank




msg:4487484
 6:32 am on Aug 23, 2012 (gmt 0)

My images were well indexed when I did not have the "noindex" tag on large image pages


Yes, that is in fact my question to you. Why should the Google Images bot obey the "noindex" tag when they have a separate "noimageindex" meta tag? My understanding is that "noindex" only keeps the image attachment pages from being indexed and displayed in Google search, and does not apply to Google image search.

You have not answered my question on whether you have blocked referrers other than Googlebot in your .htaccess file?

I also ran a google bot test from webmaster tools and for direct image links I get "Allowed" for googlebot, even fetch as google worked.


WMT tool only tests crawlability and not indexability.

indyank




msg:4487505
 7:24 am on Aug 23, 2012 (gmt 0)

I'd love to have my images indexed, as long as the page they are associated with is my image attachment page, and not just the image URL all by itself. Having Google index from an image site-map, wouldn't that result in the image URL's being the ones used? Meaning... a Google visitor comes to the *.jpg url, where there is no HTML at all?


Google always shows a referring HTML page as the source for the image, not the direct image. If you don't have a referring HTML page on your site, or if you have added "noimageindex" to such pages, Google will ignore them as referring pages. But you then run the risk of Google crediting some scraper site that gives your images a referral HTML page on its own domain by linking directly to the images hosted on your server!

Google will still show the large images in their index, but the referral pages will be those of the scrapers. When people land on those pages after clicking the close button of the image, they will not find any such image on the landing page, because you have restricted display on foreign domains (excluding Google Images) using .htaccess. Google is very selfish here: they don't care about scraping or about the user's bad experience on the landing page, as long as they find a referral page for the image. They will always show every image in their index. So never assume that your large images are out of their index. The fact is they are only being credited to the wrong sources.

Google Images is clearly a case where Google acts as a pure scraper and doesn't care who owns the images. Maybe they feel that since they are already showing the full image as an overlay by scraping it, the user has no reason to visit the landing page by clicking the close button.

This is exactly what happened in the example scenario that I had explained above. But the only difference is I had earlier added only "noindex" meta tag to the image attachment pages and not "noimageindex". I am guessing google is using "noindex" meta tag as a directive for both Google search and google image search. So they have credited the scrapers pages as referral pages. I then went on to remove "noindex" meta tag on all those image attachment pages and instead used canonical tag to specify the parent page url as the canonical link. I am not yet sure on how this will work for google images.

Does anyone else have any other ideas on how to handle this?

Nostalgic Dave




msg:4487720
 6:52 pm on Aug 23, 2012 (gmt 0)

Thanks guys! I'll get to work on an image sitemap. Using rel canonical instead of meta noindex on the image attachment pages sounds interesting, but at the same time a bit risky. I REALLY don't want those thin pages back in the index where they could put me back into Pandaville.

lucy24




msg:4487780
 9:34 pm on Aug 23, 2012 (gmt 0)

I've started to deal with some image hotlinkers a different way. Like the OP, I've got a bunch of images that exist in parallel: a thumbnail version on a gallery page, and a full-size version that's either free-standing or linked from a separate page with minimal text.

The naming is absolutely consistent, so instead of the generic No Hotlinks image, they get a rewrite in the form

(paintings/\w+/)blowups/large(\w+\.jpg) /$1thumbs/small$2

The thumbnail is hardly bigger in filesize than the hotlinks image-- and it's physically too small to bother scraping (126px in the longer dimension). The practical benefit is that I don't have to decide which search engines go on the Allowed Exceptions list. Users can always see the tiny picture; to get the full-size version they have to go to the page.

Postscript: It goes without saying that the first time this new rule was deployed, it landed squarely on the only picture that doesn't match the naming pattern. So I had to go back and write an additional, more specific rule
:(
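
For readers who want to see what the pattern above does, here is a small Python sketch that applies the same regular expression outside Apache. Only the pattern and the substitution come from the post; the thumbnail_for helper, the full-string anchoring, and the sample file names are assumptions for illustration.

```python
import re

# Pattern copied from the rule quoted above. Apache's $1 and $2 back-references
# are written as \1 and \2 in Python.
PATTERN = re.compile(r"(paintings/\w+/)blowups/large(\w+\.jpg)")


def thumbnail_for(path):
    """Map a full-size image path to its thumbnail path, or return None if the name does not fit."""
    match = PATTERN.fullmatch(path)  # anchoring to the whole path is an assumption
    if match is None:
        return None
    return match.expand(r"/\1thumbs/small\2")


# Hypothetical file names: one follows the naming scheme, one does not.
print(thumbnail_for("paintings/landscapes/blowups/largesunset.jpg"))
print(thumbnail_for("paintings/landscapes/blowups/sunset-final.jpg"))
```

The second call returns None, which is the situation from the postscript: a picture that does not match the naming pattern needs its own, more specific rule.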

indyank




msg:4487846
 2:17 am on Aug 24, 2012 (gmt 0)

the question really is on this...

or linked from a separate page with minimal text


So how are you handling those pages (image attachment pages)? Are you adding "noindex" or using canonical tag pointing to the parent page?

The practical benefit is that I don't have to decide which search engines go on the Allowed Exceptions list. Users can always see the tiny picture; to get the full-size version they have to go to the page.


so you aren't even allowing the search engine bots to see those large images and that would mean you are preventing them from indexing those large images.

lucy24




msg:4487916
 8:40 am on Aug 24, 2012 (gmt 0)

so you aren't even allowing the search engine bots to see those large images and that would mean you are preventing them from indexing those large images

No, where are you getting that from? Most robots come in without a referer, so they can index to their heart's content. The rewrite-- not redirect-- only kicks in when humans view the image in a search. That's when you get a named referer.

Why should they hog my bandwidth loading up a full-size picture? The thumbnail is identical in every way, just smaller. If they're interested they can go to the page and see the real thing in its intended environment.

There are some areas I do have blocked because I find that most visitors are simply looking for hotlink fodder. But those aren't anywhere near the galleries.
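
The referer logic described in this post can be sketched as a small decision function. This is a Python illustration of the idea only, not the actual .htaccess rules in use; the function name, the example.com host, and the choice to exempt the site's own pages are assumptions.

```python
from urllib.parse import urlparse


def should_serve_thumbnail(referer, own_host):
    """Decide whether a request for a full-size image gets the thumbnail instead."""
    if not referer:
        # No referer header: typical of robots, so the full-size image is served.
        return False
    host = urlparse(referer).hostname or ""
    # A named referer from another site (for example an image search result)
    # gets the thumbnail; the site's own pages get the real thing.
    return host.lower() != own_host.lower()


# Hypothetical requests against a site hosted at www.example.com.
print(should_serve_thumbnail(None, "www.example.com"))
print(should_serve_thumbnail("http://www.example.com/paintings/page.html", "www.example.com"))
print(should_serve_thumbnail("http://images.google.com/imgres?x=1", "www.example.com"))
```

Keying on the referer rather than the user agent is what removes the need for a list of allowed search engines.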

Naveenn




msg:4491673
 6:16 pm on Sep 5, 2012 (gmt 0)

So far none of the images have been indexed. I'm not sure whether Google takes longer than this to update the image index for a site. Most of my pages are updated in the Google cache within 5-7 days.


WMT tool only tests crawlability and not indexability.

I also tested the fetch tool, which loads the file. I hope the "fetch tool" also follows robots.txt.


You have not answered my question on whether you have blocked referrers other than Googlebot in your .htaccess file?

I have not blocked any referrers via .htaccess. I have only disabled "directory listing" of the image folder.

Naveenn




msg:4496283
 3:15 pm on Sep 17, 2012 (gmt 0)

It's been a month and Google still has not indexed a single large image, even after I pointed the canonical URL of the image pages to the category pages. Google is indexing newly updated thumbnail images but not the large images.

I believe that I have to let google index my large image pages, to get these images indexed. :(

Naveenn




msg:4504369
 8:31 am on Oct 5, 2012 (gmt 0)

Final Update

So out of desperation I just allowed all large image pages to be indexed, each with its own canonical URL and not the gallery page's. No images had been indexed until then.

It's been 2 days and I already see more than 200 images indexed in Google Images.

Is this pretty lame from Google?

I'm happy yet sad!
