Forum Moderators: Robert Charlton & goodroi


Removing 302 Redirected Pages From Google

Using the "NOINDEX" Tag (even if they're not your pages)

         

JeffOstroff

3:04 am on Jul 5, 2006 (gmt 0)

10+ Year Member



Ok, let me clarify why you need to do the NOINDEX meta tag.

When you submit a URL to Google's Urgent URL Removal tool, the tool will INSTANTLY parse the URL in question, and if it sees a NOINDEX command or a 404 error, it will remove that URL from the index. Man I love this, and I have a lot of fun with this tool.

THIS IS, IN MY OPINION, THE SINGLE MOST POWERFUL TOOL GOOGLE HAS GIVEN US WEBMASTERS, AND SO FEW PEOPLE KNOW ABOUT IT.

By the way, this also works great when you're doing searches on Google: if you find dead links, submit them to this tool and they'll be out of the index in 2 days. Have you ever found dead links ranking higher than you? If so, quit whining about them on forums and just remove them! It’s that easy! I do this all the time when checking my rank, and I often find sites listed on the SERP that are no longer functional. So submit them when you find them. Like they sing in day care, “Clean Up, Clean Up, everybody do their share…”.

Ok, so we know how Google’s URL removal tool works on 404 error pages and dead links. But what about on 302 redirects to your site from a scammer site?

Here is the beauty of it, folks. You submit the 302 URL string that appears on the scammer's web site and links to your site, so that it is removed from the Google index and Google no longer sees a 302 pointing to your home page. Google will then no longer penalize you for what looks like black hat SEO cloaking techniques. It's like God "remembering your sins no longer" after you repent!

Here is an example of the form it can take, this is one I found this week:

http://www.example.com/cgi-bin/jump.cgi?ID=6051

So all those phony directory sites that appear to be linking to your site, do a view source on them, grab the URL that links to your site, and use a good header checking tool to verify if it is a 302 redirect.
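For anyone without a header checking tool handy, here is a minimal Python sketch of what such a tool does: it sends one request without following redirects and reports the status code and Location header. The function names `fetch_headers` and `describe_response` are made up for this illustration, and some redirect scripts may not answer HEAD requests the same way they answer GET.

```python
import http.client
from urllib.parse import urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {
    301: "permanent",
    302: "temporary",
    303: "see other",
    307: "temporary",
    308: "permanent",
}


def describe_response(status, location):
    """Summarise one HTTP response the way a header checker would."""
    if status in REDIRECT_CODES and location:
        return f"{status} ({REDIRECT_CODES[status]} redirect) -> {location}"
    return f"{status} (no redirect)"


def fetch_headers(url, timeout=10):
    """Send a single HEAD request; http.client never follows redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example use (needs network access):
# status, location = fetch_headers("http://www.example.com/cgi-bin/jump.cgi?ID=6051")
# print(describe_response(status, location))
```

If the output shows a 302 whose Location is a page on your own site, that is the kind of URL string discussed in this post.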

BUT. Wait! The link is on the scammer's site, not your site, how can we get his link removed from Google? We have no control over his site, and we cannot FTP a NOINDEX meta tag on his site!

Follow along young grasshopper!

When you submit his 302 redirect string to the Google removal tool, Google spiders his page, sees the redirect and lands on your page instead! Thus, Google is expecting to see a NOINDEX command. So, make sure this meta tag is placed on your home page, seconds before you submit the URL to the tool for immediate scanning:

<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">

This tells Google, please do not index my URL. Only in this case, “MYURL” is the scammer’s 302 redirect string that you entered into the tool, not your home page.

Then you submit the scammer’s 302 string URL, and Google goes to crawl the scammer’s string, looking for that NOINDEX command. Google then sees the 302, says “oh, ok, I’m supposed to be over here” and bounces over to your page, where it sees the NOINDEX meta tag that you just placed, and is thus satisfied that it should remove the URL you submitted from the index.

Then Google queues up the scammer's link for removal. Once you have submitted, immediately take this NOINDEX meta tag back off your page, so that Google can continue to index your site. Remember, you only added that meta tag to trick Google into removing the other guy’s 302 redirect.

You see, we tricked Google into thinking the 302 redirect URL was on our page, even though it's on the scammer's page. Actually the scammer shot himself in the foot; I did not really trick Google. It’s a byproduct of how Google operates.

This is safe if used as directed. Don’t forget to remove the NOINDEX meta tag from your site AS SOON AS YOU SUBMIT THE URL. Otherwise, the next time Google comes to your site for a normal crawl, it will see the NOINDEX and remove you from the index for 6 months.
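Since leaving the tag in place is the risky part, one way to double-check is to scan your page's HTML for a NOINDEX directive before and after you submit. The following is only a sketch using Python's standard `html.parser`; the names `RobotsMetaFinder` and `has_noindex` are invented for this example, and it only looks at meta tags named "robots" or "googlebot".

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether any robots/googlebot meta tag contains NOINDEX."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        meta_name = attr.get("name", "").lower()
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if meta_name in ("robots", "googlebot") and "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML text carries a NOINDEX meta directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex
```

Running `has_noindex` on the saved source of your page should give True just before you submit and False once you have taken the tag back off.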

Sorry for the length, but this is a lot of careful info to parse out.

trillianjedi

2:50 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Very interesting although I have to point out the boiler-plate warning here:-

Don't try this unless you know what you're doing and you don't mind risking 6 months out of the index just in case it goes wrong.

TJ

Iguana

3:09 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I used this same method when there was all the fuss about 302 hijacks. It didn't work for all of them though (I think some 'hijacks' redirected to a meta refresh page and then to my page).
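A meta refresh hop of the kind suspected here does not show up in the HTTP headers, so it has to be found in the page source. As a rough sketch (the names `MetaRefreshFinder` and `meta_refresh_target` are made up, and the regex is a simplification that stops at the first quote or semicolon in the URL), the intermediate page could be scanned like this:

```python
import re
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Captures the URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # Content looks like "0; url=http://..." with optional quotes.
        match = re.search(
            r"url\s*=\s*['\"]?([^'\";]+)", attr.get("content", ""), re.IGNORECASE
        )
        if match:
            self.target = match.group(1).strip()


def meta_refresh_target(html):
    """Return the meta refresh destination, or None if there is none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target
```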

JeffOstroff

3:14 pm on Jul 5, 2006 (gmt 0)

10+ Year Member



That's why you need to use a good header checker tool to see what URL the re-direct is going to.

At first we found the Google URL removal tool was rejecting some of the 302 URLs because it said it could not find a NOINDEX metatag, even though we knew we had placed the metatag on our home page. Problem was, we had it on the wrong page! It does not always go on the home page; it might need to be placed on other pages on your site.

Most of the time the 302 redirects will point to your home page. But we have also found them attacking sub pages on our site too, different chapters of our web site. So you have to put the metatag on the correct target page of the 302.

For example you might find a 302 redirect pointing to www.example.com, so you put the meta tag on the index page and submit.

Or, you might find a 302 redirect pointing to www.example.com/page112.htm

Then you would put the metatag on page112.htm, submit the 302 URL to the Google removal tool, then remove the metatag from page112.htm again.

The key is knowing which page to put the metatag on. I prefer to use the metatag method, because I don't think the robots.txt method works with Google's URL removal tool.
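To make the "which page" step concrete, here is a small sketch that takes the redirect URL and the Location header reported by a header checker and returns the path on your own site that the 302 lands on. The function name `redirect_target_page` and the host `directory.example.net` are hypothetical; only `www.example.com` and `page112.htm` come from the examples above.

```python
from urllib.parse import urljoin, urlsplit


def redirect_target_page(redirect_url, location, my_host):
    """Return the path on my_host that the redirect points at, else None."""
    # Location may be relative, so resolve it against the redirecting URL.
    target = urlsplit(urljoin(redirect_url, location))
    if target.hostname is None or target.hostname.lower() != my_host.lower():
        return None  # the redirect goes somewhere other than your site
    return target.path or "/"


# Hypothetical redirect URL on a directory site:
page = redirect_target_page(
    "http://directory.example.net/cgi-bin/jump.cgi?ID=6051",
    "http://www.example.com/page112.htm",
    "www.example.com",
)
print(page)
```

A result of "/" means the metatag goes on the index page; any other path names the sub page that needs it, and None means the redirect is not pointing at your site at all (for example, when it goes to another page on the same directory site first).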

Iguana

3:46 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Like I say, I couldn't get rid of one, even though I had NOINDEX on my home page, because it wasn't a 302 straight to my home page but to another of its own pages and then to my home page.

I know everyone was worried about any 302 link such as furl/Alexa/Blogger profiles - but I got the distinct impression (lurking on darker forums than this) that a real hijack involved another step beyond a 302 redirect.

JeffOstroff

4:18 pm on Jul 5, 2006 (gmt 0)

10+ Year Member



OK, what about the 2nd link on the offending site then? If you were sent to another link on his site, then surely that 2nd link would be the 302 which then points to your site.

The header checking tool should be able to help you capture that other redirect if it goes by too quickly.

Iguana

4:25 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I found the page it was 302 redirecting to but that wasn't a second 302. It was something else and I couldn't figure out what (even turning meta refreshes off). It may well have involved some cloaking so I wouldn't see what was really going on without requesting from a Googlebot IP and/or changing my UserAgent.
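For the User-Agent half of that idea, a rough sketch is to fetch the same URL twice, once with a browser-like User-Agent and once with Googlebot's, and compare what comes back. The function names `fetch_as` and `cloaks_by_user_agent` are invented for this example. It cannot detect cloaking keyed on Google's IP addresses, and pages with dynamic content may differ between two requests for innocent reasons.

```python
import urllib.request

BROWSER_UA = "Mozilla/5.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_as(url, user_agent, timeout=10):
    """Fetch url with the given User-Agent; returns (final_url, body)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # urlopen follows HTTP redirects, so final_url is where we ended up.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.geturl(), resp.read()


def cloaks_by_user_agent(url, fetch=fetch_as):
    """Return True if the page looks different to Googlebot's User-Agent."""
    browser_view = fetch(url, BROWSER_UA)
    bot_view = fetch(url, GOOGLEBOT_UA)
    return browser_view != bot_view
```

The `fetch` parameter exists only so the comparison can be tried with a stand-in function instead of real network requests.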

If I get some time later I'll take a second look at it. In any case the home page is back and unhijacked - so I think that BH method no longer works.

I still have my doubts that simple 302s (as I mentioned earlier re: Blogger profiles/Alexa/furl links) were ever responsible for hijacking.

incrediBILL

5:08 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FWIW, you'll save yourself a lot of trouble if you bounce all incoming requests that claim to be Googlebot, mediabot, etc. but come from outside of Google's IP ranges. These tend to be proxy servers that appear to be cloaking a directory to Google and deliberately enticing Google to crawl thru their proxy for the purpose of hijacking pages to generate traffic, so bouncing requests from Google that come through these sites from the start will solve a lot of this.

Why Google is so stupid it can't tell your page from a redirected page on a proxy server is a whole different debate they seem to have no interest in fixing.

JeffOstroff

5:12 pm on Jul 5, 2006 (gmt 0)

10+ Year Member



incrediBILL,

where can we read up more on how to block these IP addresses from these other crawlers?

incrediBILL

11:31 pm on Jul 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"from these other crawlers"

It's not OTHER crawlers, it's GOOGLE themselves crawling thru a proxy.

If you're on Linux with Apache you can use an .htaccess file to validate the Google IPs for Googlebot and block fakes, examples may be found in WebmasterWorld in the Spider ID, Apache or Robots.txt forums.
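The .htaccess examples themselves are in those forums. As a different illustration of the same goal, here is a Python sketch that checks a claimed Googlebot by reverse DNS followed by a confirming forward lookup, rather than by a list of IP ranges. That swap is an assumption of this example, not what the post describes. The function name `is_real_googlebot` is made up, the lookups are passed in as parameters so the logic can be tried without DNS, and the default forward lookup only handles IPv4.

```python
import socket

# Hostname suffixes treated as Google's in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to Google and resolves back to ip."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False  # reverse DNS does not name a Google host
        # The forward lookup must lead back to the same address,
        # otherwise the reverse record could simply be forged.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False
```

A request whose User-Agent claims to be Googlebot but whose address fails this check would be the kind of proxy traffic worth bouncing.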