
Google SEO News and Discussion Forum

    
Domain reappearing in results after I blocked it
Google appears to ignore meta tags and robots file
wintercornuk




msg:3559005
 9:18 am on Jan 27, 2008 (gmt 0)

I've got a domain which, for legal reasons, needs to be removed from all search engines. This was successfully done three months ago. Now it has suddenly appeared in Google's index when searching for its single-word domain name. Also, the page title has changed to a series of words I've never used.

The meta tags are:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOARCHIVE">

The robots.txt file is:

User-agent: Googlebot
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /

My questions are:

Could a third party re-insert it into the index with some kind of link bombing?

Why does Google ignore the request not to index?

[edited by: Robert_Charlton at 9:47 am (utc) on Jan. 27, 2008]

 

Robert Charlton




msg:3559009
 10:00 am on Jan 27, 2008 (gmt 0)

Why does Google ignore the request not to index?

The robots.txt disallow is fighting the robots meta tag noindex. By disallowing all (well-behaved) bots, you're preventing the engines from crawling the page, so they never see the noindex robots meta.

Unless Google sees the noindex robots meta, it will do its best to index "good references" it can find to a page, even if it hasn't crawled the page. So, if there are active links out there still pointing to your page, Google will index the URL in the link, and it will sometimes rank it.
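
To make the conflict concrete, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. It feeds in the robots.txt from the first post and asks whether a compliant crawler would be allowed to request the page at all. The URL `http://example.com/` and the agent name `SomeOtherBot` are placeholders, and this only models a generic well-behaved crawler, not Google's actual pipeline.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the first post, reproduced as posted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-compliant crawler may request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/"  # placeholder for the blocked domain
    for agent in ("Googlebot", "ia_archiver", "SomeOtherBot"):
        allowed = may_fetch(ROBOTS_TXT, agent, page)
        # If the page may not be fetched, its HTML (and any noindex
        # meta tag inside it) is never read by that crawler.
        print(agent, "may fetch the page:", allowed)
```

Every agent is refused, so the page's HTML is never downloaded and the noindex tag inside it goes unread. With an empty robots.txt the same check allows the fetch, which is the situation the meta tag needs in order to be seen.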

Also, the page title has changed to a series of words I've never used.

Does this look like it might be text in any way related to your page, as in a link anchor that might have been linking to you?

[edited by: Robert_Charlton at 10:06 am (utc) on Jan. 27, 2008]

wintercornuk




msg:3559041
 12:41 pm on Jan 27, 2008 (gmt 0)

So the best way forward is to remove the robots.txt file and just let the meta tags work?

The page title is in lower case (something I've never used on this site) and I don't think anyone has linked to it using that specific text. Very odd.

Halfdeck




msg:3559070
 2:00 pm on Jan 27, 2008 (gmt 0)

Yeah, remove the robots.txt disallow so Googlebot can crawl the pages and see the noindex meta tag. Once they've been recrawled, those pages should disappear from the SERPs.

londrum




msg:3559073
 2:07 pm on Jan 27, 2008 (gmt 0)

Maybe the second meta tag is overriding the first one. I don't think it's supposed to do that, but it pays to be safe.
I'd change

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOARCHIVE">

to
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW, NOARCHIVE">

or
<META NAME="ROBOTS" CONTENT="NONE">

You could also send the same directives as an HTTP header from PHP (call it before the page sends any output)...

header('X-Robots-Tag: noindex, nofollow, noarchive', TRUE);
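
If you want to check what a page actually declares once it is crawlable, a small script can gather the directives from every robots meta tag plus the X-Robots-Tag header value. This is only a sketch in Python: the names `RobotsMetaCollector`, `split_directives`, and `declared_directives` are made up for illustration, it ignores agent-specific forms such as `googlebot: noindex`, and it reports what is declared rather than how any particular engine will interpret it.

```python
from html.parser import HTMLParser


def split_directives(value: str) -> set:
    """Split a comma-separated directive string into a lowercase set."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaCollector(HTMLParser):
    """Collect directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            self.directives |= split_directives(attr.get("content", ""))


def declared_directives(html: str, x_robots_tag: str = "") -> set:
    """Union of the meta-tag directives and an X-Robots-Tag header value."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    collector.close()
    return collector.directives | split_directives(x_robots_tag)


if __name__ == "__main__":
    two_tags = (
        '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">\n'
        '<META NAME="ROBOTS" CONTENT="NOARCHIVE">'
    )
    print(sorted(declared_directives(two_tags)))
```

Run against the two-tag version and the single combined tag, the declared set is the same in both cases. Note that `NONE` is usually documented as shorthand for noindex, nofollow only, so on its own it would not carry noarchive.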

jd01




msg:3559141
 5:21 pm on Jan 27, 2008 (gmt 0)

If you want to make sure it does not get indexed and have access to mod_rewrite, try:

RewriteEngine on
RewriteRule !^no-index\.html http://example.com/no-index.html [R=301,L]

Then make no-index.html the following:

<html>
<head>
<title>&nbsp;</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="robots" content="noindex,nofollow,noarchive" />
</head>
<body>
</body>
</html>

The RewriteRule will redirect any request that is not for example.com/no-index.html to example.com/no-index.html. The meta tag will prevent no-index.html from being indexed, followed, or archived, and the site will disappear from the SERPs.

You should be able to safely remove the robots.txt.
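
To see which requests the redirect rule above would catch, here is a rough emulation in Python. It assumes the rule lives in an .htaccess file at the site root, where mod_rewrite matches against the path without its leading slash; the function name `is_redirected` is made up for this sketch, and it models only the negated pattern, not the rest of Apache's processing.

```python
import re

# The rule's pattern with the leading "!" removed; the "!" means the rule
# fires when this pattern does NOT match.
PATTERN = re.compile(r"^no-index\.html")


def is_redirected(path: str) -> bool:
    """True if the rule would fire for `path`.

    `path` is the per-directory form seen in .htaccess, i.e. without a
    leading slash ("index.html", not "/index.html").
    """
    # mod_rewrite patterns are unanchored unless they say otherwise,
    # which corresponds to re.search rather than re.fullmatch.
    return PATTERN.search(path) is None


if __name__ == "__main__":
    for path in ("", "index.html", "robots.txt", "no-index.html"):
        print(repr(path), "redirected:", is_redirected(path))
```

Two things stand out: a request for robots.txt gets redirected as well, so the old disallow rules stop being served either way, and because the pattern has no trailing `$`, anything that merely starts with no-index.html is left alone too.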

Justin

<added>
Another mod_rewrite alternative is:
RewriteEngine on
RewriteRule .? - [F]

The preceding rule will serve a 403 'Forbidden' error any time any page is requested. Basically, it says to *everyone* 'You do not have permission to access the site', and will cause it to be dropped from the indexes. (This one might be the easiest / most effective.)

DO NOT use either of these suggestions if you need to allow access to the site, because they will not allow any visitor (SE or human) to see it.
</added>

g1smd




msg:3559191
 7:23 pm on Jan 27, 2008 (gmt 0)

I have found that when using "forbidden", the pages take a very long time to drop out of the index.

