Forum Moderators: Robert Charlton & goodroi

Google's 302 Redirect Problem

         

ciml

4:17 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



(Continuing from Google's response to 302 Hijacking [webmasterworld.com] and 302 Redirects continues to be an issue [webmasterworld.com])

Sometimes, an HTTP status 302 redirect or an HTML META refresh causes Google to replace the redirect's destination URL with the redirect URL. The word "hijack" is commonly used to describe this problem, but redirects and refreshes are often implemented for click counting, and in some cases lead to a webmaster "hijacking" his or her own URLs.

Normally in these cases, a search for cache:[destination URL] in Google shows "This is G o o g l e's cache of [redirect URL]" and oftentimes site:[destination domain] lists the redirect URL as one of the pages in the domain.

Also link:[redirect URL] will show links to the destination URL, but this can happen for reasons other than "hijacking".

Searching Google for the destination URL will show the title and description from the destination URL, but the title will normally link to the redirect URL.
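
The two redirect mechanisms named above (an HTTP 302 with a Location header, and an HTML META refresh) can be told apart from a response's status, headers and body. The following is only a minimal sketch; the function name `redirect_kind` and the sample URLs are made up for illustration, and it says nothing about how Google itself processes these responses.

```python
from html.parser import HTMLParser


class _RefreshParser(HTMLParser):
    """Collects the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content usually looks like: "0; url=http://www.example.com/page.html"
        _, _, rest = attr.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value:
            self.target = value.strip().strip("'\"")


def redirect_kind(status, headers, body):
    """Return (kind, target) for a 302 or META refresh response, else None."""
    location = {k.lower(): v for k, v in headers.items()}.get("location")
    if status == 302 and location:
        return ("http-302", location)
    parser = _RefreshParser()
    parser.feed(body)
    if parser.target:
        return ("meta-refresh", parser.target)
    return None


# Hypothetical responses, for illustration only.
print(redirect_kind(302, {"Location": "http://www.example.com/page.html"}, ""))
print(redirect_kind(
    200, {},
    '<meta http-equiv="refresh" content="0; url=http://www.example.com/page.html">',
))
```

Feeding it the responses of your own click-counting URLs is one way to see which of them could "hijack" your own pages.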

There has been much discussion on the topic, as can be seen from the links below.

How to Remove Hijacker Page Using Google Removal Tool [webmasterworld.com]
Google's response to 302 Hijacking [webmasterworld.com]
302 Redirects continues to be an issue [webmasterworld.com]
Hijackers & 302 Redirects [webmasterworld.com]
Solutions to 302 Hijacking [webmasterworld.com]
302 Redirects to/from Alexa? [webmasterworld.com]
The Redirect Problem - What Have You Tried? [webmasterworld.com]
I've been hijacked, what to do now? [webmasterworld.com]
The meta refresh bug and the URL removal tool [webmasterworld.com]
Dealing with hijacked sites [webmasterworld.com]
Are these two "bugs" related? [webmasterworld.com]
site:www.example.com Brings Up Other Domains [webmasterworld.com]
Incorrect URLs and Mirror URLs [webmasterworld.com]
302's - Page Jacking Revisited [webmasterworld.com]
Dupe content checker - 302's - Page Jacking - Meta Refreshes [webmasterworld.com]
Can site with a meta refresh hurt our ranking? [webmasterworld.com]
Google's response to: Redirected URL [webmasterworld.com]
Is there a new filter? [webmasterworld.com]
What about those redirects, copies and mirrors? [webmasterworld.com]
PR 7 - 0 and Address Nightmare [webmasterworld.com]
Meta Refresh leads to ... Replacement of the target URL! [webmasterworld.com]
302 redirects showing ultimate domain [webmasterworld.com]
Strange result in allinurl [webmasterworld.com]
Domain name mixup [webmasterworld.com]
Using redirects [webmasterworld.com]
redesigns, redirects, & google -- oh my [webmasterworld.com]
Not sure but I think it is Page Jacking [webmasterworld.com]
Duplicate content - a google bug? [webmasterworld.com]
How to nuke your opposition on Google? [webmasterworld.com] (January 2002 - when Google's treatment of redirects and META refreshes was worse than it is now)

Hijacked website [webmasterworld.com]
Serious help needed: Is there a rewrite solution to 302 hijackings? [webmasterworld.com]
How do you stop meta refresh hijackers? [webmasterworld.com]
Page hijacking: Beta can't handle simple redirects [webmasterworld.com] (MSN)

302 Hijacking solution [webmasterworld.com] (Supporters' Forum)
Location: versus hijacking [webmasterworld.com] (Supporters' Forum)
A way to end PageJacking? [webmasterworld.com] (Supporters' Forum)
Just got google-jacked [webmasterworld.com] (Supporters' Forum)
Our company Lisiting is being redirected [webmasterworld.com]

This thread is for further discussion of problems due to Google's 'canonicalisation' of URLs, when faced with HTTP redirects and HTML META refreshes. Note that each new idea for Google or webmasters to solve or help with this problem should be posted once to the Google 302 Redirect Ideas [webmasterworld.com] thread.

<Extra links added from the excellent post by Claus [webmasterworld.com]. Extra link added thanks to crobb305.>

[edited by: ciml at 11:45 am (utc) on Mar. 28, 2005]

claus

11:34 am on Apr 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Joeduck, you should ask for removal of these links from Google.

You will probably have to re-enter them in your robots.txt (*), but it will be easier if they have a generic component, because when you request removal from Google with the URL-console, there's a limit to the size of your robots.txt file.

Also, you might have to enter them one-by-one in the URL-console, which will take some time.

---
(*) That is, if you can't make them return html with the meta tag <meta name="robots" content="noindex">
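
Whether a page actually carries a robots noindex directive can be checked with a short script. This is a minimal sketch, assuming the standard form `<meta name="robots" content="noindex">`; the class and function names are invented for the example, and it only inspects the HTML you hand it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # content may hold several comma-separated directives, e.g. "noindex, nofollow"
        for part in attr.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, for illustration only.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```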

joeduck

5:00 pm on Apr 4, 2005 (gmt 0)

10+ Year Member



Thanks Claus - this makes sense, though I'm worried these are a symptom rather than the problem itself? The links all appear to be from our cgi and cf directories that are referenced to send people to our major affiliates. Hopefully we can use wildcards in the Google process, but I have not checked yet. We'd excluded these directories when the problem started; now we allow them.

vincentg

5:31 pm on Apr 4, 2005 (gmt 0)

10+ Year Member



Reid

I would be interested in seeing it

Email me the info so I can take a look at it.
I think if I can look at a few cases I may be able to find out whether there is anything we can do to either stop it or identify it.

Vin

claus

5:34 pm on Apr 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah, they are symptoms, but the problem is not on your end, it's on Google's. In this case they see a URL (your redirect URL) and assume that this equals a document, even though it does not.

Once a URL is indexed it will not be removed by putting it in "robots.txt" - this will only keep the spider from revisiting the URL. In order to get it removed you must specifically request removal.

If you've got your redirect script in some folder, like, say:

example.com/redir/redir.php?id=1234567890

... then you can just put "/redir/" (or "/redir/redir.php") in your "robots.txt", you don't need to put in every single redirect.
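
As a rough check of the folder-level rule described above, Python's standard `urllib.robotparser` can confirm that a single Disallow line covers every redirect URL under that folder. This is only a sketch: the robots.txt content and the `is_blocked` helper are assumptions built around the example path, and a blocked URL is merely not crawled, which is not the same as being removed from the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/redir/" is the example folder from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /redir/
"""


def is_blocked(url, user_agent="Googlebot"):
    """Return True if ROBOTS_TXT forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


# One rule covers every id value, so individual redirects need no entries.
print(is_blocked("http://example.com/redir/redir.php?id=1234567890"))
print(is_blocked("http://example.com/index.html"))
```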

(i have removed such URLs from my own sites a few times, so i know the process)

joeduck

5:45 pm on Apr 4, 2005 (gmt 0)

10+ Year Member



Excellent and thanks for any advice. Claus - do you think these bogus "link pages" replace other legitimate pages?

g1smd

7:04 pm on Apr 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



They can do, I suspect, if Google sees that they are duplicates of something else (which may well happen with the screwy way that they treat some sorts of redirects these days).

claus

7:47 pm on Apr 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm with g1smd here: If SE's are allowed to follow those links you might be "hijacking" some of the target pages before you know it - i've got all mine robots.txt'ed for the same reason

(and the additional reason being that i like to have control over what is indexed - i especially don't want "internal" things or "errors" to be indexed. All i want in the index is my real pages and nothing more - one URI per page. For that reason i do remove all kinds of different stuff that should not be there whenever i see it. I like to keep things clean, as this helps me avoid "surprises" of many different kinds.)

Reid

12:30 am on Apr 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Claus this could be a real problem for people running adsense because you're not allowed to exclude googlebot from any part of your site.
If you run a robots.txt file you have to allow google full access in the first line.

g1smd

12:49 am on Apr 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What about the rel="nofollow" (or was it rel="noindex") attribute that Google "invented" just a few months ago...

Can you use that? Would it work?

joeduck

1:10 am on Apr 5, 2005 (gmt 0)

10+ Year Member



Reid why are you saying that? We've had several excluded directories and have run adsense for some time. To Google's credit (but our frustration) our adsense reps have been nice talking about this but unable to help with our problems because they are very separated from the search side of things.

RE: Nofollow - we've been discussing that and I favor placing them at most of our outbound links.
