Forum Moderators: Robert Charlton & goodroi


Google's 302 Redirect Problem

         

ciml

4:17 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



(Continuing from Google's response to 302 Hijacking [webmasterworld.com] and 302 Redirects continues to be an issue [webmasterworld.com])

Sometimes, an HTTP status 302 redirect or an HTML META refresh causes Google to replace the redirect's destination URL with the redirect URL. The word "hijack" is commonly used to describe this problem, but redirects and refreshes are often implemented for click counting, and in some cases lead to a webmaster "hijacking" his or her own URLs.

Normally in these cases, a search for cache:[destination URL] in Google shows "This is Google's cache of [redirect URL]" and oftentimes site:[destination domain] lists the redirect URL as one of the pages in the domain.

Also link:[redirect URL] will show links to the destination URL, but this can happen for reasons other than "hijacking".

Searching Google for the destination URL will show the title and description from the destination URL, but the title will normally link to the redirect URL.
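
For anyone who wants to check how a suspect URL answers before drawing conclusions, here is a minimal Python sketch. It assumes you already have the status code and response headers in hand (from a HEAD request, say). The function name `describe_redirect` and the example URLs are hypothetical, and the function only reports what the server said, not what Google does with it.

```python
from urllib.parse import urljoin, urlsplit

# Status codes that carry a Location header pointing somewhere else.
REDIRECT_CODES = {301: "permanent", 302: "temporary", 303: "see other", 307: "temporary"}

def describe_redirect(request_url, status, headers):
    """Return None if the response is not a redirect, else a small summary dict."""
    if status not in REDIRECT_CODES:
        return None
    # Header names are case-insensitive, so compare them in lower case.
    location = next((v for k, v in headers.items() if k.lower() == "location"), None)
    if location is None:
        return None
    # A Location value may be relative; resolve it against the requested URL.
    target = urljoin(request_url, location)
    cross_domain = urlsplit(request_url).hostname != urlsplit(target).hostname
    return {"kind": REDIRECT_CODES[status], "target": target, "cross_domain": cross_domain}
```

A temporary redirect whose `cross_domain` flag is true is the case most of the linked threads are about.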

There has been much discussion on the topic, as can be seen from the links below.

How to Remove Hijacker Page Using Google Removal Tool [webmasterworld.com]
Google's response to 302 Hijacking [webmasterworld.com]
302 Redirects continues to be an issue [webmasterworld.com]
Hijackers & 302 Redirects [webmasterworld.com]
Solutions to 302 Hijacking [webmasterworld.com]
302 Redirects to/from Alexa? [webmasterworld.com]
The Redirect Problem - What Have You Tried? [webmasterworld.com]
I've been hijacked, what to do now? [webmasterworld.com]
The meta refresh bug and the URL removal tool [webmasterworld.com]
Dealing with hijacked sites [webmasterworld.com]
Are these two "bugs" related? [webmasterworld.com]
site:www.example.com Brings Up Other Domains [webmasterworld.com]
Incorrect URLs and Mirror URLs [webmasterworld.com]
302's - Page Jacking Revisited [webmasterworld.com]
Dupe content checker - 302's - Page Jacking - Meta Refreshes [webmasterworld.com]
Can site with a meta refresh hurt our ranking? [webmasterworld.com]
Google's response to: Redirected URL [webmasterworld.com]
Is there a new filter? [webmasterworld.com]
What about those redirects, copies and mirrors? [webmasterworld.com]
PR 7 - 0 and Address Nightmare [webmasterworld.com]
Meta Refresh leads to ... Replacement of the target URL! [webmasterworld.com]
302 redirects showing ultimate domain [webmasterworld.com]
Strange result in allinurl [webmasterworld.com]
Domain name mixup [webmasterworld.com]
Using redirects [webmasterworld.com]
redesigns, redirects, & google -- oh my [webmasterworld.com]
Not sure but I think it is Page Jacking [webmasterworld.com]
Duplicate content - a google bug? [webmasterworld.com]
How to nuke your opposition on Google? [webmasterworld.com] (January 2002 - when Google's treatment of redirects and META refreshes was worse than it is now)

Hijacked website [webmasterworld.com]
Serious help needed: Is there a rewrite solution to 302 hijackings? [webmasterworld.com]
How do you stop meta refresh hijackers? [webmasterworld.com]
Page hijacking: Beta can't handle simple redirects [webmasterworld.com] (MSN)

302 Hijacking solution [webmasterworld.com] (Supporters' Forum)
Location: versus hijacking [webmasterworld.com] (Supporters' Forum)
A way to end PageJacking? [webmasterworld.com] (Supporters' Forum)
Just got google-jacked [webmasterworld.com] (Supporters' Forum)
Our company Lisiting is being redirected [webmasterworld.com]

This thread is for further discussion of problems due to Google's 'canonicalisation' of URLs, when faced with HTTP redirects and HTML META refreshes. Note that each new idea for Google or webmasters to solve or help with this problem should be posted once to the Google 302 Redirect Ideas [webmasterworld.com] thread.

<Extra links added from the excellent post by Claus [webmasterworld.com]. Extra link added thanks to crobb305.>

[edited by: ciml at 11:45 am (utc) on Mar. 28, 2005]

vincentg

11:39 pm on Mar 30, 2005 (gmt 0)

10+ Year Member



So what are you saying exactly?

Are you saying that a person has altered his server to produce a redirect which is not normal?

You are correct in that most people do not understand the codes and really do not need to since the server is producing them and not the website owner's web page.

The only time a person might instruct the server to issue a specific code is if they want, let's say, a redirect to the root, such as www.domain.com versus domain.com.

When a redirect takes place, it is rare that anyone will code it any way other than to just instruct the server to redirect to the requested URL.

If this is what you are saying - that in fact a website owner has to alter the redirect request by substituting the request code then only Google can check for this.

But I am still not 100% convinced that this is a problem since we do not know 100% for sure the Google Bot is being tripped up by an alteration such as this.

This is a big topic and it seems many are starting to believe it. Google, on the other hand, has not commented on it and most likely will not.

One thing I do feel is important is for Major Search engines to at least provide a report if requested.
Now they do not have to give out info that will compromise the algo scripts.
But they should give out basic info such as penalties issued and show a break down of some sort.
They could charge for this service and I don't think anyone would mind paying for it.

With the present system they are the Judge and Jury of websites and worse the court is held behind closed doors not open to the public.

Vin

geff

12:35 am on Mar 31, 2005 (gmt 0)



> With the present system they are the Judge and Jury of websites and worse the court is held behind closed doors not open to the public.

Well put

Reid

1:29 am on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Are you saying that a person has altered his server to produce a redirect which is not normal?

Google seems well able to handle normal 302 redirects. These are commonly used by tracking scripts to count the number of clicks made on an outgoing link. I'm not into the mechanics of tracking scripts, but apparently they use a 302 redirect to increment a counter and send the browser on its way.

I think the problem is not in server response codes but in the interpretation of the META refresh. This is interpreted like a 302 redirect from the page itself.
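
To make the click-counting case concrete, here is a rough Python sketch of such a tracking script. It is only an illustration of the general idea, not any particular product: the `OUTGOING` table, the `id` query parameter and the function and class names are all made up for the example.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlsplit

# Hypothetical table of outgoing links; a real script would use a database.
OUTGOING = {"1": "http://www.example.com/"}
clicks = Counter()

def count_and_redirect(path):
    """Record the click and return (status, headers) for the response."""
    link_id = parse_qs(urlsplit(path).query).get("id", [""])[0]
    target = OUTGOING.get(link_id)
    if target is None:
        return 404, {}
    clicks[link_id] += 1
    # The 302 tells the client the destination lives elsewhere "temporarily".
    return 302, {"Location": target}

class ClickHandler(BaseHTTPRequestHandler):
    """Thin wrapper so the function above can be served over HTTP."""
    def do_GET(self):
        status, headers = count_and_redirect(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

To try it locally, pass `ClickHandler` to `http.server.HTTPServer(("127.0.0.1", 8000), ClickHandler)` and call `serve_forever()`. A request for `/out?id=1` is then counted and answered with a 302 whose Location header is the destination URL.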

Typical use of a META refresh.
In Canada there are many bilingual websites. There is an 'intro' page which has 2 links, French or English. These intro pages also have a META refresh tag which will send the browser to a default location.

Google interprets this META refresh page as the 'intro' page and, rather than sending the browser to a deeper page within the site, will send it to the 'intro' page for the choice to be made (i.e. French or English, browser type, Shockwave, Flash player). This is part of Google's algo to deliver the best entry point into that website. In other words, to Google the META refresh means "the information you are searching for can be found on the other side of this page".

Now if another website has a link with your name on it that points to a blank page with a META refresh pointing at your home page, Google interprets this as "any information found on that website should be directed through this page".
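
A small Python sketch of how a page like that could be spotted, assuming you have its HTML source as a string. The class and function names are hypothetical, and the regular expression only covers the usual `content="N; URL=..."` form of the tag.

```python
import re
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Remember the first <meta http-equiv="refresh"> found in a page."""
    def __init__(self):
        super().__init__()
        self.delay = None
        self.target = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        match = re.match(r"\s*(\d+)\s*[;,]\s*url\s*=\s*['\"]?([^'\"]+)",
                         attrs.get("content") or "", re.I)
        if match:
            self.delay = int(match.group(1))
            self.target = match.group(2).strip()

def find_meta_refresh(html):
    """Return (delay_seconds, target_url), or (None, None) if there is none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.delay, finder.target
```

A near-empty page on someone else's domain with a zero-second delay and your home page as the target matches the pattern described above.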

This is only one variation of the 302 redirect problem.

A year ago this was only understood by a few, but it is now being marketed as an SEO technique, a 'get rich quick with AdSense' scheme. That is why we decided to bring it out into the open and cause an outcry for Google to find a solution. Too many webmasters have lost years of hard work and patience to this undefendable attack. The alternative is that we ALL start doing it to each other, and then the web will be just one big nightmare.

theBear

2:10 am on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not too smart, but since Google has in all cases the actual location of the target, it could (pardon me being a thick headed old fool) just index the target under the target's URL.

End of problem.

I don't really care what Google does with the original URL, provided they do not credit it with the content of the target or place it into the target's site view.

As far as I'm concerned Google doesn't even have to credit the target site with a link.

Reid

5:27 am on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I agree, bear - whether the browser or the language is compatible should be the webmaster's problem.
If you want to run a bilingual site or a Flash site, or whatever, then you should be responsible for directing all traffic to the intro page or dealing with it in some other ingenious way.

It shouldn't be the search engine's problem; they should just serve up the page and let the webmaster worry about compatibility issues.

claus

9:04 am on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Great post accidentalGeek - you summed up the HTTP issues in a nice way there :) And you're right that most webmasters don't think about these issues, as normally they don't even have to know they exist.

>> Keep the target URL

This is the solution that i have also mentioned a couple of times (but only for 302's that cross to another domain). Yahoo does the exact same thing (with cross-domain 302's).

Google in particular is famous for very frequent spidering. If a URL is temporarily moved to another location, keeping the target URL will actually be an option for Google, as the googlebot is likely to re-visit both URLs relatively quickly. With the spidering frequency of Google, this should pose very little problem, if any.

vincentg

5:01 pm on Mar 31, 2005 (gmt 0)

10+ Year Member



Reid

How can you be sure Google is doing this?
Is this not just speculation?
Are we saying maybe Google is doing this?
Thing is I am still treating this as a rumor.

In the past, writers came out with stuff presented as documentary, like The Devil's Triangle or aliens from outer space building the Pyramids.
These types of stories which are based on creative facts are science fiction and nothing more.
Of course you will have people believing it and a story such as this can't be disproved.

It's much the same as someone claiming there are aliens on the other side of the moon waiting to attack us.
The only way to prove this is false is to go to the other side of the moon.
The most recent one was a fellow from France claiming the World Trade Center Attack was fake.

Now this rumor has gotten to the point where people are reacting to it. You have website owners that are amateur programmers writing scripts for Bots to hunt down such sites.
You have people requesting their links be removed.
This can turn into a mess and it can be a big mess with more harm than good coming out of it.

Many are creating quite a stir based on what?
Oh there will be those that will be quick to give us a lecture on HTML codes.

But not one person can tell us what the Google bot is doing. And the reason is only a worker from Google can tell us that info.

Yes I know a few will say that I don't know what I am talking about.

But I remain convinced this is nothing more than a rumor until one person can show a list of sites with proof that this is taking place or a statement from someone in Google that this was or is a problem.

What is taking place here reminds me of these old western movies where a mob tries to hang a person for a suspected crime.

What I am saying is think about what you are posting.
Many are posting this as if it's fact where no proof exists at all.

In my book you can state we believe this is a problem.
We are not sure, since we have no input from Google, but it may be true.

You can further ask those that feel they have been affected by such a possibility to allow those on this board to document it.

Show the print outs before and after.

It's not enough to just have people claim they have had a problem.

Let's try to avoid a Virtual Riot is what I am saying.

Hope this helps.

Vincent G.

Reid

8:41 pm on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> How can you be sure Google is doing this?
> Is this not just speculation?
> Are we saying maybe Google is doing this?
> Thing is I am still treating this as a rumor.

There is definitely a problem with some sites. I did see a few webmasters go overboard and talk about removing all 'strange looking' links that appear in an allinurl: search. I totally discouraged them from doing so.

Some webmasters actually did find 302 hijacking URLs appearing in a site: search, and I (or we) encouraged them to remove those ones (after all, isn't this search revealing Google's view of a particular domain?). Lo and behold, after months of wondering why their website was non-existent in the Google database, it re-appeared 7-14 days after removing the foreign URLs associated with the said domain.

True, we don't know for sure how Google works, but since we do see a recovery of these mysteriously penalized websites on a fairly consistent basis, we can make some assumptions based on cause and effect.

You can't 'see' wind or electricity but it is still a scientific fact.

This thread started a year ago; see all the threads listed at the beginning of this one. There is a mysterious problem sending websites into Google oblivion, and although this may not be the cause of every one of them, this is definitely a problem.

We have determined how to fix this particular problem and have seen more than a few recoveries of websites suffering from it. The reason Google needs to fix it is because (based on our assumptions) there is no way of preventing this from happening again.

Tell me - how does an adult or pharmaceutical site become a part of a travel domain just by linking to it?
How does that travel domain lose all its PR and fall so far in the SERPs that even a search for the specific domain name turns up 'no results'? How does this domain recover everything a week after removing these associations? You tell me.

Part of this thread has been used in discouraging a 'witch hunt' of 302 links, but there are some authentic cases and some pretty happy webmasters who have managed to regain their lost SERPs.

Reid

9:04 pm on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Show the print outs before and after.

You can find many cut and paste examples and descriptions of results here. The real result is traffic loss and recovery.

I can show you some directories built upon this exploit that gather domain names and have thousands of them in 'categories'. They look like real directories too, but don't cry to me about the drive-by installations you get from visiting them, how they replaced your Google toolbar with their own or how you found a hidden server installed on your system.
While you're there you can purchase the e-book on how to get rich quick by f*king other webmasters out of their SERPs.

ciml

9:05 pm on Mar 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



accidentalGeek, thank you for a super introduction to HTTP and the redirect problem.

> any arbitrary URL that the attacker designates

Generally, the PageRank needs to be higher to accomplish this. While PageRank remains a useful trust metric on large data samples, it can be manipulated and I don't think it is the right answer for individual decisions such as in canonicalisation.

<side point: Mostly there is no 'attacker' as the Google 302 'hijack' is not intended>

> Because this is a protocol-level problem, I believe that effective solutions are to be found on the protocol level.

Though there have been reports of successfully using Google's canonicalisation to trick the manual removal tool, I agree with you fully that it makes sense to deal with this in HTTP.

The two main protocol-level solutions suggested are to keep the target URL (my favourite thus far) or to keep the target URL for 302's that point to a URL on another domain.

How about "Keep the target URL if the 302 points to a URL on another domain, and keep the URL with higher PageRank if the 302 points to a URL on the same domain"?

This would largely solve the '302 hijack' problem, but would use PageRank to make the solution better than Yahoo!'s solution. Could that not keep both Google and webmasters happy? :-)
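
Spelled out as a Python sketch, the rule proposed above might look like this. It is only an illustration of the suggestion, not a description of what Google does. The `pagerank` dictionary is a hypothetical stand-in for whatever score the engine holds, comparing exact hostnames is a simplification of "same domain", and sending ties to the target is an assumption of this sketch.

```python
from urllib.parse import urlsplit

def choose_canonical(redirect_url, target_url, pagerank):
    """Pick the URL to list for a 302 from redirect_url to target_url.

    pagerank maps URL -> score; URLs missing from it count as 0 (assumption).
    """
    same_host = urlsplit(redirect_url).hostname == urlsplit(target_url).hostname
    if not same_host:
        # Cross-domain 302: always keep the target URL.
        return target_url
    # Same domain: keep whichever URL has the higher score; ties go to the target.
    if pagerank.get(redirect_url, 0) > pagerank.get(target_url, 0):
        return redirect_url
    return target_url
```

Under this rule a high-PageRank tracking URL on another domain can never replace the page it points at, while a site redirecting its own root to an inner page keeps whichever of its own URLs is stronger.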

vincentg:
> only a worker from Google can tell us that info

When someone goes to http://www.google.com [google.com], enters a URL and then sees its title listed with the link going to some other URL (that happens to be a 302 redirect to the URL he or she just searched for), then that person will feel "sure" that something has happened - even if he or she is not a Google employee.
