Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Google ignoring robots.txt and No index meta tags
cabbie

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4594042 posted 11:54 pm on Jul 17, 2013 (gmt 0)

A few months ago, I received an unnatural link warning from Webmaster Tools for one of my better sites.
It has many freely given links from .edu sites and also plenty of other, probably lower-quality links.

I have never bought, begged or traded any links to this site, so I assume I was penalised for giving it links from other sites I run that are niche related.
Google can tell these sites belong to me as they all run the same AdSense code.
I sent them a reinclusion request for this site, explaining the situation and that I have now nofollowed all links (6) from my other sites.

They sent me a reply stating I was still in breach of their guidelines.
I sent some more reinclusion requests and got the same response.

This ticked me off, so I decided to remove my AdSense ($60 a day) from the site and block all Googlebots in my robots.txt. I also put noindex, nofollow in my meta tags.
By all rights, Google should no longer be indexing my site, as far as I know.
However, 2 months later they are still showing my site for some main keyword searches, with the snippet "A description for this result is not available because of this site's robots.txt".

To me, this is another example of Google playing by their own rules and screw everyone else.

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4594042 posted 12:57 am on Jul 18, 2013 (gmt 0)

Sometimes you can answer a post by its subject line alone, without even reading the question. This is one of those times :(

If google can't crawl a page, it can't see the "noindex" tag.

That's assuming it really said "noindex" as in the subject line, rather than "nofollow" as in the body of the post.
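For anyone who wants to check what a given robots.txt actually tells a crawler, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URL are made up for illustration and are not the original poster's actual file. It only shows that a "Disallow: /" rule for Googlebot means the crawler is not supposed to fetch the page at all, which is why an on-page noindex tag would go unread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks Google's crawler but nobody else.
# This is an assumed example, not the file from the thread.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page.html"
    # Googlebot is told not to fetch the page, so it never reads the HTML.
    print("Googlebot may crawl:", may_crawl("Googlebot", page))
    # Other crawlers fall under the catch-all section and may fetch it.
    print("bingbot may crawl:", may_crawl("bingbot", page))
```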

netmeg

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 4594042 posted 1:30 am on Jul 18, 2013 (gmt 0)

Leave the NOINDEX on each page, and get rid of the block in robots.txt, so Google can see the NOINDEX.

(You do realize that this will also take you out of Bing and Yahoo)

NOINDEX is the surest way to stay out of Google. But you'll also stay out of the other search engines.

If you use robots.txt, you could still be indexed (only without a meta description). You're just telling Google not to crawl your pages, but they could still discover them other ways, for example via links to you.

NOINDEX deals with indexing. robots.txt deals with crawling. They're not quite the same thing.
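The crawling versus indexing distinction can be written down as a tiny decision function. This is a toy model of the behaviour described in this thread, not Google's actual pipeline; the function name and the three result strings are invented for the example, and it assumes the crawler eventually visits any page it is allowed to fetch.

```python
def expected_listing(blocked_in_robots_txt: bool,
                     page_has_noindex: bool,
                     url_known_from_links: bool) -> str:
    """Toy model of how a page might show up in search results."""
    if blocked_in_robots_txt:
        # The page is never fetched, so any noindex tag on it is never read.
        if url_known_from_links:
            # The URL can still be listed, just without a description.
            return "listed without description"
        return "not listed"
    # Crawling is allowed, so the page is fetched and its meta tags are read.
    if page_has_noindex:
        return "not listed"
    return "listed with description"


if __name__ == "__main__":
    # The original poster's setup: robots.txt block plus noindex, with inbound links.
    print(expected_listing(True, True, True))
    # The suggested fix: remove the block and keep the noindex.
    print(expected_listing(False, True, True))
```

The first call corresponds to the "A description for this result is not available" listing from the first post; the second corresponds to unblocking the crawler so it can see the NOINDEX.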

aakk9999

WebmasterWorld Administrator 5+ Year Member



 
Msg#: 4594042 posted 3:15 am on Jul 18, 2013 (gmt 0)

Leave the NOINDEX on each page, and get rid of the block in robots.txt, so Google can see the NOINDEX.

(You do realize that this will also take you out of Bing and Yahoo)

Good point, netmeg!

To remove pages from Google's index only, use meta name="googlebot" instead of meta name="robots".
Directing a robots meta tag specifically at Googlebot
To provide instruction for all search engines, set the meta name to "ROBOTS". To provide instruction for only Googlebot, set the meta name to "GOOGLEBOT".
[googlewebmastercentral.blogspot.co.uk...]
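As a rough illustration of the "robots" versus "googlebot" meta names, here is a small sketch using Python's standard html.parser. The helper names (RobotsMetaCollector, is_noindexed_for) are invented for this example, and the matching rule (a crawler obeys the generic "robots" tag plus any tag carrying its own name) is a simplification of what real search engines do.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect directives from <meta name="robots"> and one named crawler."""

    def __init__(self, crawler: str):
        super().__init__()
        self.names = {"robots", crawler.lower()}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.names:
            # content looks like "noindex, nofollow"; split it into tokens.
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def is_noindexed_for(html: str, crawler: str) -> bool:
    """Return True if the page asks this crawler not to index it."""
    collector = RobotsMetaCollector(crawler)
    collector.feed(html)
    return "noindex" in collector.directives


if __name__ == "__main__":
    google_only = '<head><meta name="googlebot" content="noindex"></head>'
    everyone = '<head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head>'
    print(is_noindexed_for(google_only, "googlebot"))  # targeted at Googlebot
    print(is_noindexed_for(google_only, "bingbot"))    # other crawlers unaffected
    print(is_noindexed_for(everyone, "bingbot"))       # generic tag applies to all
```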

dethfire

5+ Year Member



 
Msg#: 4594042 posted 3:21 am on Jul 18, 2013 (gmt 0)

It takes a long time for google to process noindex. I added noindex to a bunch of my pages over a month ago and they still are in the index. I have no robots block either.

ken_b

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4594042 posted 3:27 am on Jul 18, 2013 (gmt 0)

aakk9999
Directing a robots meta tag specifically at Googlebot

Thanks for that, it answers a question I've been wondering about the last couple days.

.

Robert Charlton

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4594042 posted 3:32 am on Jul 18, 2013 (gmt 0)

Here's a related thread that covers this topic over and over and over, until I think it finally made sense to most participants of the thread. It covers many aspects of the question, and I highly recommend it....

Pages are indexed even after blocking in robots.txt
http://www.webmasterworld.com/google/4490125.htm

cabbie

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4594042 posted 3:51 am on Jul 18, 2013 (gmt 0)

Thanks for the replies.
That probably explains it.
I have <meta name="googlebot" content="noindex"> and Bing still crawls and indexes the site just fine,
but I guess I have to unblock Googlebot in robots.txt.
Cheers

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4594042 posted 3:54 am on Jul 18, 2013 (gmt 0)

It takes a long time for google to process noindex. I added noindex to a bunch of my pages over a month ago and they still are in the index.

You may want to go into GWT (Google Webmaster Tools) and remove them explicitly, especially if you're dealing with whole directories that can easily be block-removed.

Here's a related thread that covers this topic over and over and over, until I think it finally made sense to most participants of the thread.

For a given definition of "made sense", at least. "I don't like it, I don't understand it, but I accept it as fact."

viral



 
Msg#: 4594042 posted 5:48 am on Jul 18, 2013 (gmt 0)

I agree with dethfire. It takes Google ages to process these noindex and nofollow tags. I did this to a user-generated part of my site and 3 months later Google still have them in the index.

Also, there is plenty of evidence that even though Google says they won't follow a "nofollowed" link, they still do but just don't use it in the ranking algo. Even Wikipedia mentions this effect on its nofollow page. Of course it is hard to prove this, as there may be sites out there that you don't know about that have a followed link to this particular page. However, some of these experiments blocked everything but Googlebot in the Apache config when the test pages were published, and Googlebot still crawled those pages.

I personally haven't done these experiments. Has anyone here done any like this?

Robert Charlton

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4594042 posted 7:08 am on Jul 18, 2013 (gmt 0)

For a given definition of "made sense", at least. "I don't like it, I don't understand it, but I accept it as fact."

lucy24 - And here I thought you'd come around after your first post above. ;)

I highly recommend that thread to anyone with lingering questions about the Google crawling or indexing process.

It takes a long time for google to process noindex.

dethfire - How often does Googlebot visit the pages you noindexed? That would affect implementation speed.

Also, regarding how fast noindex is implemented, getcooking makes a very interesting comment in this crawl allocation discussion about Google's crawling behavior on noindexed pages. It is more about removing noindex than about first implementing it, but there might be related behavior at the head end...

Crawl allocation and duplicate content
http://www.webmasterworld.com/google/4593402.htm

I track all googlebot activity on my site. On my noindexed pages, google slows down the crawls over time. So, once it picks up the noindex tag and removes the page from the index it starts to spider that page less and less frequently (once a day, then once a week, then once a month, etc). It will still eat some of the crawl budget but Google seems to be good at reducing how much effort it puts into those pages. It's probably also why once you noindex a page it can be a long time before you can get it reindexed. That's been my experience anyway.
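For anyone wanting to track Googlebot visits the way getcooking describes, a rough sketch along these lines can count hits per page per day from a server access log. It assumes the Apache "combined" log format and simply matches "Googlebot" in the user-agent string, which can be spoofed; the sample log lines and the function name are made up for illustration and this is not getcooking's actual method.

```python
import re
from collections import Counter

# Assumed Apache combined log format: host, identity, user, [time],
# "request", status, size, "referer", "user-agent".
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per (path, day) whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), match.group("day"))] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical log lines using documentation-only IP addresses.
    sample = [
        '192.0.2.1 - - [18/Jul/2013:03:15:00 +0000] "GET /page.html HTTP/1.1" '
        '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"',
        '192.0.2.2 - - [18/Jul/2013:04:00:00 +0000] "GET /page.html HTTP/1.1" '
        '200 512 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/22.0"',
    ]
    for (path, day), count in sorted(googlebot_hits(sample).items()):
        print(day, path, count)
```

Comparing the per-day counts for a noindexed page over several weeks would show whether the crawl rate really tapers off as described.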

netmeg

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 4594042 posted 12:29 pm on Jul 18, 2013 (gmt 0)

My noindex stuff tends to get dropped right away most of the time, but if it doesn't, I go take it out in GWT.

I think you used to be able to remove your entire site in GWT, but I haven't checked recently to see if that's still possible.

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4594042 posted 5:34 pm on Jul 18, 2013 (gmt 0)

Alright, get the flamethrowers out now, but I gotta ask:

Is that site making you more money now that you have gotten rid of adsense and are trying to get it unlisted from google?

I mean, what is the point of this except to tilt at windmills?

dethfire

5+ Year Member



 
Msg#: 4594042 posted 6:11 pm on Jul 18, 2013 (gmt 0)

I have 175k pages I need to noindex. I'm not doing that by hand :D

cabbie

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4594042 posted 8:06 pm on Jul 18, 2013 (gmt 0)

Is that site making you more money now that you have gotten rid of adsense and are trying to get it unlisted from google?

Nah.
It's making bugga all with alternative ads and has lost 60% of its traffic, but I feel strongly that, first of all, I was innocent of their accusations, and that my site is the best on the subject and Google's SERPs look silly (imo) without my site.
I am a patient man and I can wait till Google come cap in hand and beg me forgiveness.
;)

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4594042 posted 10:12 pm on Jul 18, 2013 (gmt 0)

till Google come cap in hand and beg me forgiveness

Look! Up in the sky! It's a flying pig!

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved