Google's 302 Redirect Problem

   
4:17 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



(Continuing from Google's response to 302 Hijacking [webmasterworld.com] and 302 Redirects continues to be an issue [webmasterworld.com])

Sometimes, an HTTP status 302 redirect or an HTML META refresh causes Google to replace the redirect's destination URL with the redirect URL. The word "hijack" is commonly used to describe this problem, but redirects and refreshes are often implemented for click counting, and in some cases lead to a webmaster "hijacking" his or her own URLs.

Normally in these cases, a search for cache:[destination URL] in Google shows "This is G o o g l e's cache of [redirect URL]" and oftentimes site:[destination domain] lists the redirect URL as one of the pages in the domain.

Also link:[redirect URL] will show links to the destination URL, but this can happen for reasons other than "hijacking".

Searching Google for the destination URL will show the title and description from the destination URL, but the title will normally link to the redirect URL.
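
For anyone who wants to check which of the two mechanisms a suspect URL is using, a short Python sketch along these lines (standard library only; the example.com URL is just a placeholder for the redirecting URL you want to test) will print the HTTP status, the Location header of a 301/302, and whether the returned HTML carries a META refresh:

    # Sketch: report whether a URL answers with a 3xx redirect or a META refresh.
    import re
    import urllib.error
    import urllib.request

    class NoRedirect(urllib.request.HTTPRedirectHandler):
        # Returning None makes urllib raise the 3xx as an HTTPError instead of following it.
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None

    def inspect(url):
        opener = urllib.request.build_opener(NoRedirect)
        try:
            resp = opener.open(url, timeout=10)
            print("HTTP status:", resp.getcode())
            body = resp.read(20000).decode("utf-8", errors="replace")
            if re.search(r'http-equiv=["\']?refresh', body, re.I):
                print("META refresh found in the HTML body")
        except urllib.error.HTTPError as err:
            # 301/302 and other non-2xx responses land here once redirects are blocked.
            print("HTTP status:", err.code)
            print("Location:", err.headers.get("Location"))

    inspect("http://www.example.com/out.cgi?id=123")   # placeholder for the suspect redirect URL

A 302 whose Location header points at your page, or a 200 whose body contains a META refresh to your page, is the pattern described above.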

There has been much discussion on the topic, as can be seen from the links below.

How to Remove Hijacker Page Using Google Removal Tool [webmasterworld.com]
Google's response to 302 Hijacking [webmasterworld.com]
302 Redirects continues to be an issue [webmasterworld.com]
Hijackers & 302 Redirects [webmasterworld.com]
Solutions to 302 Hijacking [webmasterworld.com]
302 Redirects to/from Alexa? [webmasterworld.com]
The Redirect Problem - What Have You Tried? [webmasterworld.com]
I've been hijacked, what to do now? [webmasterworld.com]
The meta refresh bug and the URL removal tool [webmasterworld.com]
Dealing with hijacked sites [webmasterworld.com]
Are these two "bugs" related? [webmasterworld.com]
site:www.example.com Brings Up Other Domains [webmasterworld.com]
Incorrect URLs and Mirror URLs [webmasterworld.com]
302's - Page Jacking Revisited [webmasterworld.com]
Dupe content checker - 302's - Page Jacking - Meta Refreshes [webmasterworld.com]
Can site with a meta refresh hurt our ranking? [webmasterworld.com]
Google's response to: Redirected URL [webmasterworld.com]
Is there a new filter? [webmasterworld.com]
What about those redirects, copies and mirrors? [webmasterworld.com]
PR 7 - 0 and Address Nightmare [webmasterworld.com]
Meta Refresh leads to ... Replacement of the target URL! [webmasterworld.com]
302 redirects showing ultimate domain [webmasterworld.com]
Strange result in allinurl [webmasterworld.com]
Domain name mixup [webmasterworld.com]
Using redirects [webmasterworld.com]
redesigns, redirects, & google -- oh my [webmasterworld.com]
Not sure but I think it is Page Jacking [webmasterworld.com]
Duplicate content - a google bug? [webmasterworld.com]
How to nuke your opposition on Google? [webmasterworld.com] (January 2002 - when Google's treatment of redirects and META refreshes were worse than they are now)

Hijacked website [webmasterworld.com]
Serious help needed: Is there a rewrite solution to 302 hijackings? [webmasterworld.com]
How do you stop meta refresh hijackers? [webmasterworld.com]
Page hijacking: Beta can't handle simple redirects [webmasterworld.com] (MSN)

302 Hijacking solution [webmasterworld.com] (Supporters' Forum)
Location: versus hijacking [webmasterworld.com] (Supporters' Forum)
A way to end PageJacking? [webmasterworld.com] (Supporters' Forum)
Just got google-jacked [webmasterworld.com] (Supporters' Forum)
Our company Lisiting is being redirected [webmasterworld.com]

This thread is for further discussion of problems due to Google's 'canonicalisation' of URLs, when faced with HTTP redirects and HTML META refreshes. Note that each new idea for Google or webmasters to solve or help with this problem should be posted once to the Google 302 Redirect Ideas [webmasterworld.com] thread.

<Extra links added from the excellent post by Claus [webmasterworld.com]. Extra link added thanks to crobb305.>

[edited by: ciml at 11:45 am (utc) on Mar. 28, 2005]

12:18 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello GG:

Thanks for a fast, long-awaited response.

It's good that other sites are removed from the site: search,
but it also kills our main way of finding malicious 302 redirects.

"We are also changing some of the core heuristics for the results for 302s."

That, I think, is the burning question!

Will the heuristics changes prevent a malicious site from scoring for my original content?
Will these changes pass PR (etc.) thru to the actual pages with the content?
Will the 302-jackers be derated if not penalized?

Since there remain legitimate uses for the 302 redirect,
is there a simple way to only credit those redirected to the same domain?
That alone might solve most of the problem. -Larry
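
Purely to illustrate Larry's suggestion, and not as a description of anything Google actually does, a "same-domain only" rule could look something like the sketch below. The function name and URLs are invented for the example, and a real rule would compare registrable domains rather than bare hostnames:

    # Hypothetical rule: honour a 302 (keep showing the redirecting URL) only when
    # source and destination sit on the same site; otherwise index the destination
    # under its own URL and give the redirecting URL no credit for the content.
    from urllib.parse import urlsplit

    def strip_www(host):
        return host[4:] if host.startswith("www.") else host

    def canonical_for_302(source_url, destination_url):
        src = strip_www(urlsplit(source_url).hostname or "")
        dst = strip_www(urlsplit(destination_url).hostname or "")
        return source_url if src == dst else destination_url

    print(canonical_for_302("http://www.example.com/old-page", "http://www.example.com/new-page"))
    print(canonical_for_302("http://counter.example.org/out?id=7", "http://www.example.com/new-page"))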

12:24 am on Apr 19, 2005 (gmt 0)

10+ Year Member



I agree about the core changes, and I also agree that taking away the ability to see those other sites with the site: command might not be the best solution, as many of us have used it, to Google's benefit as well as our own, to find and remove people who were scraping.

I also noticed that the algo for the site: command may have a little glitch: you can get the total number of results (pages for the site) on the first page (e.g. 125 pages), and that number reduces itself as you go deeper into the results, losing 1 or 2 pages from the total at the top for every 10 results. I've tried it and replicated it across a few domains, but of course others may not see the same thing.

I was also wondering why the &filter=0 parameter was changed - it no longer seems to have the effect it was originally designed for and discussed here a little over a year ago. It seems that Google still has a similar filter working, but it's not accessible using that parameter anymore? It was very useful for seeing whether Google considered something from your site a duplicate result - again helping us find scrapers and report them.
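
For what it's worth, the paging observation above can be replicated by hand in a browser, or roughly sketched like this. The query parameters (q, start, num, filter) and the "of about N" wording reflect the 2005-era result pages and are assumptions, and automated querying may run against Google's terms, so treat this as a description of the manual test rather than a tool:

    # Rough sketch: page through a site: query with &start= and &filter=0 and
    # compare the total count reported on each page.
    import re
    import time
    import urllib.parse
    import urllib.request

    def reported_total(query, start):
        params = {"q": query, "start": start, "num": 10, "filter": 0}
        url = "http://www.google.com/search?" + urllib.parse.urlencode(params)
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        html = urllib.request.urlopen(req, timeout=10).read().decode("latin-1", "replace")
        match = re.search(r"of about ([\d,]+)", html)      # assumed page wording
        return int(match.group(1).replace(",", "")) if match else None

    for start in range(0, 130, 10):                        # first 13 result pages
        print(start, reported_total("site:www.example.com", start))
        time.sleep(5)                                      # pause between requests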

12:31 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



>> We changed things so that site: won't return results from other sites in the supplemental results. <<

Why only in supplemental results? What about normal results too?

Not showing the sites in the search results is not the same as not having the rogue URLs in the database. Not the same by a very long way.

Much of the stuff that has been seen in recent results should not have even made it into your database.

Why can't this be fixed by going back to how things were a few years ago? And, dare I mention the logic that Yahoo applies to 301 and 302 redirects, and the different way that they treat onsite and offsite redirects now?

As for the search I mentioned above, is it really true that out of 1.2 million pages, Google only has 950 pages indexed for the ODP now (but says there are 11 million when you first look)? This filtering of the results served, rather than cleaning of the internal dataset, seems to have some flaws perhaps?

12:58 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



(copied from a dictionary site. -LH)

HEURISTICS - This describes a set of rules developed to attempt to solve problems when a specific algorithm cannot be designed.
For example, if the problem is "When do you eat food?" and your answer is the fixed rule "When I'm hungry," then you would have to eat immediately every single time you were hungry.
Instead, we follow heuristics to determine when to eat by gauging our hunger level, the situation we are in, and our ability to get food. As you can imagine, heuristics are very important for solving artificial intelligence problems.

1:06 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



g1smd, the changes had already been applied for the main results. The web has changed over time, but URL canonicalization is definitely on our radar now--contacting us at google.com/support with the title "canonicalpage" will make it to engineers who read the reports and suggestions.

larryhatch, I believe the answers are yes, I'm not sure given the current heuristics, and yes. Marval, if someone is doing 302s to your site, you might be able to find the redirecters by looking in your server logs for unusual referrers. I'll ask about filter=0. There have been some index changes lately, but I hadn't heard about any changes with filter=0.
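
As a starting point for the log-scanning idea GoogleGuy mentions, a minimal sketch might look like this; the log path, domain lists, and combined-log regex are placeholders to adapt to your own server setup:

    # Tally referrer hosts in a combined-format access log that are neither
    # your own site nor a known search engine; the frequent unfamiliar ones
    # are worth visiting by hand to see whether they reach your pages
    # through a 302 or a META refresh.
    import re
    from collections import Counter
    from urllib.parse import urlsplit

    LOG_PATH = "access.log"                                # placeholder path
    OWN_HOSTS = {"www.example.com", "example.com"}         # placeholder: your own domains
    KNOWN = {"www.google.com", "search.yahoo.com", "search.msn.com"}

    # Combined log format: ... "request" status bytes "referrer" "user-agent"
    referrer_re = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)"')

    counts = Counter()
    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = referrer_re.search(line)
            if not match or match.group(1) in ("", "-"):
                continue
            host = urlsplit(match.group(1)).hostname or ""
            if host and host not in OWN_HOSTS and host not in KNOWN:
                counts[host] += 1

    for host, hits in counts.most_common(20):
        print(hits, host)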

1:15 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



GoogleGuy,

I was looking at a site today where, as near as I could tell, every page was marked supplemental. Until recently this site had a large number of 302 leeches within its site: view.

It doesn't look like the owner of that site stands a chance.

Assuming (and I know what that does) that the problem was "duplicate content due to the 302's", how long will it take for his site to recover? Will it recover?

1:50 am on Apr 19, 2005 (gmt 0)

10+ Year Member



GG - I've been using that term in messages to the engineers for almost 6 months now, trying to provide information on something that seems to have been "missed" by an algo tweak performed back in August last year. I do get a lot of "thanks, we'll forward the message" replies, but it seems to be a black hole. I'm not asking for specific or even "personal" feedback, but it certainly would be nice to at least hear that some of the information we are providing is making it to the right ears.

1:54 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member crobb305 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Googleguy,

The unrelated URLs that were once showing in the site: search are gone. But how long will dup penalties last? Are we still looking at 90-day penalties from this point on? As Dayo_UK says:

steveb raises a point I am concerned with also.
If Google do sort the problems with the index, will previously established sites be sandboxed?

Are provisions being made to allow sites to return to the serps after being penalized for problems beyond their control? :)

C

3:02 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



crobb305, in many of the cases that I've examined, a spam penalty comes first. That spam penalty causes the PageRank of a site to decrease. Since one of the heuristics to pick a canonical site was to take PageRank into account, the declining PageRank of a site was usually the root cause of the problem. That's what happened with your site, crobb305. So the right way forward for people who still have problems is to send us a reinclusion report.
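
To make the heuristic GoogleGuy describes concrete, here is a deliberately over-simplified sketch; the scores and URLs are made up, and real canonicalisation clearly uses many more signals than this:

    def pick_canonical(duplicates):
        # duplicates: list of (url, pagerank_like_score) pairs for URLs that
        # were judged to carry the same content.
        return max(duplicates, key=lambda item: item[1])[0]

    group = [
        ("http://www.example.com/article.html", 5.0),    # the real page
        ("http://other.example.org/out?id=42", 6.2),     # a redirecting URL with more PageRank
    ]
    print(pick_canonical(group))    # prints the redirecting URL in this made-up case

If a spam penalty drags the real page's score below the redirecting URL's, the redirecting URL wins the canonical slot, which is exactly the failure mode being discussed.
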
3:09 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member steveb is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I don't know about usually, but I do know that part of the problem is that Google apparently believes "one of the heuristics to pick a canonical site was to take PageRank into account" when that is seldom the case. The key problem with every instance of hijacking I've seen is that PageRank is ignored in judging the canonical page.
3:20 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member crobb305 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



GG, thanks for the info. The odd thing is, my PageRank dropped to zero last September but was back to PR7 in December (at least based on the toolbar). If declining PageRank is the root of the problem, would one expect a reversal of the spam penalty as PageRank returns?

If the home page is still indexed with title and description, do you still need to file a reinclusion request? It seems they will say "your site is already indexed".

C

3:34 am on Apr 19, 2005 (gmt 0)

10+ Year Member



GG - I am curious to know what you at Google think of the search results lately. Many of the searchers and webmasters I know have been complaining about the results being of lower quality.

Many, many, many webmasters have been adversely affected by the recent updates as well. Many sites have gone supplemental or are even losing bulk pages. Is this a short-term side effect of the recent changes and 302 fixes, or should those webmasters look to their own sites as the cause of the problem? We don't want to try to fix something that isn't broken.

I am also curious why, with old 301 redirects, new content is being cached and displayed under the old redirected URL and not the new one. Could this cause any problems?

3:35 am on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Silly question: is a "reinclusion report" the same as Add URL?

My story:
Redirect dropped 2k pages - was showing the other site - now returning no results

(I had the redirect removed)

What do I do?

3:36 am on Apr 19, 2005 (gmt 0)

10+ Year Member



GG - BTW, thank you so much for a long-awaited response. These explanations will be of great help to us all.
4:01 am on Apr 19, 2005 (gmt 0)

10+ Year Member



As soon as I saw a massive drop in my traffic, that was the first day that inurl: showed I had been hijacked by 302 redirects. My PR was rising, so what you are saying, GG, doesn't make sense to me. I used the URL removal tool, but I have done this with other sites and it doesn't seem to help. If I had only been using site:, I wouldn't have found the 302 hijackers.

As soon as a site gets to around 6,000-plus page views it gets hijacked and killed? Nothing can ever do well in this environment, and maybe that is the idea?

This 467 message thread spans 32 pages.