Forum Moderators: Robert Charlton & goodroi


Google's 302 Redirect Problem

         

ciml

4:17 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



(Continuing from Google's response to 302 Hijacking [webmasterworld.com] and 302 Redirects continues to be an issue [webmasterworld.com])

Sometimes, an HTTP status 302 redirect or an HTML META refresh causes Google to replace the redirect's destination URL with the redirect URL. The word "hijack" is commonly used to describe this problem, but redirects and refreshes are often implemented for click counting, and in some cases lead to a webmaster "hijacking" his or her own URLs.

Normally in these cases, a search for cache:[destination URL] in Google shows "This is G o o g l e's cache of [redirect URL]" and oftentimes site:[destination domain] lists the redirect URL as one of the pages in the domain.

Also link:[redirect URL] will show links to the destination URL, but this can happen for reasons other than "hijacking".

Searching Google for the destination URL will show the title and description from the destination URL, but the title will normally link to the redirect URL.
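The checks described above can be collected into a small helper. This is only a sketch: the function name `diagnostic_queries` and the example URLs are made up for illustration, and it just builds the query strings to paste into Google; it does not contact Google.

```python
from urllib.parse import urlsplit


def diagnostic_queries(destination_url, redirect_url):
    """Build the Google queries used to look for the symptoms above.

    destination_url: the page that may have been replaced in the index.
    redirect_url: the URL that 302-redirects (or META-refreshes) to it.
    """
    destination_domain = urlsplit(destination_url).netloc
    return {
        # Does the cache header name the redirect URL instead of the destination?
        "cache": f"cache:{destination_url}",
        # Does the redirect URL show up as a page of the destination domain?
        "site": f"site:{destination_domain}",
        # Does the redirect URL show links that point at the destination?
        "link": f"link:{redirect_url}",
    }


# Placeholder URLs, not taken from the thread.
queries = diagnostic_queries(
    "http://www.example.com/page.html",
    "http://redirector.example.net/go?id=1234",
)
for name, query in queries.items():
    print(name, "->", query)
```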

There has been much discussion on the topic, as can be seen from the links below.

How to Remove Hijacker Page Using Google Removal Tool [webmasterworld.com]
Google's response to 302 Hijacking [webmasterworld.com]
302 Redirects continues to be an issue [webmasterworld.com]
Hijackers & 302 Redirects [webmasterworld.com]
Solutions to 302 Hijacking [webmasterworld.com]
302 Redirects to/from Alexa? [webmasterworld.com]
The Redirect Problem - What Have You Tried? [webmasterworld.com]
I've been hijacked, what to do now? [webmasterworld.com]
The meta refresh bug and the URL removal tool [webmasterworld.com]
Dealing with hijacked sites [webmasterworld.com]
Are these two "bugs" related? [webmasterworld.com]
site:www.example.com Brings Up Other Domains [webmasterworld.com]
Incorrect URLs and Mirror URLs [webmasterworld.com]
302's - Page Jacking Revisited [webmasterworld.com]
Dupe content checker - 302's - Page Jacking - Meta Refreshes [webmasterworld.com]
Can site with a meta refresh hurt our ranking? [webmasterworld.com]
Google's response to: Redirected URL [webmasterworld.com]
Is there a new filter? [webmasterworld.com]
What about those redirects, copies and mirrors? [webmasterworld.com]
PR 7 - 0 and Address Nightmare [webmasterworld.com]
Meta Refresh leads to ... Replacement of the target URL! [webmasterworld.com]
302 redirects showing ultimate domain [webmasterworld.com]
Strange result in allinurl [webmasterworld.com]
Domain name mixup [webmasterworld.com]
Using redirects [webmasterworld.com]
redesigns, redirects, & google -- oh my [webmasterworld.com]
Not sure but I think it is Page Jacking [webmasterworld.com]
Duplicate content - a google bug? [webmasterworld.com]
How to nuke your opposition on Google? [webmasterworld.com] (January 2002 - when Google's treatment of redirects and META refreshes was worse than it is now)

Hijacked website [webmasterworld.com]
Serious help needed: Is there a rewrite solution to 302 hijackings? [webmasterworld.com]
How do you stop meta refresh hijackers? [webmasterworld.com]
Page hijacking: Beta can't handle simple redirects [webmasterworld.com] (MSN)

302 Hijacking solution [webmasterworld.com] (Supporters' Forum)
Location: versus hijacking [webmasterworld.com] (Supporters' Forum)
A way to end PageJacking? [webmasterworld.com] (Supporters' Forum)
Just got google-jacked [webmasterworld.com] (Supporters' Forum)
Our company Lisiting is being redirected [webmasterworld.com]

This thread is for further discussion of problems due to Google's 'canonicalisation' of URLs, when faced with HTTP redirects and HTML META refreshes. Note that each new idea for Google or webmasters to solve or help with this problem should be posted once to the Google 302 Redirect Ideas [webmasterworld.com] thread.

<Extra links added from the excellent post by Claus [webmasterworld.com]. Extra link added thanks to crobb305.>

[edited by: ciml at 11:45 am (utc) on Mar. 28, 2005]

Atticus

9:30 pm on Mar 31, 2005 (gmt 0)



vincentg,

As discussed in the "New Google Patent Details Many Google Techniques" thread...

A Google patent application contains this: "A large spike in the quantity of back links may signal a topical phenomenon ...or signal attempts to spam a search engine ... by exchanging links, purchasing links, or gaining links from documents without editorial discretion on making links. Examples of documents that give links without editorial discretion include guest books, referrer logs, and "free for all" pages that let anyone add a link to a document."

So there it is, straight from the horse's mouth: links from sites without editorial discretion can hurt you. If Google is further confused into thinking that your content is now part of an editorially challenged site, you're going to get hit even harder, I would suspect.

In fact it seems that any link, if added in a manner that is seen as unnatural, can possibly lead to problems. Something to think about (or not, if you wish to remain sane).

vincentg

4:22 am on Apr 2, 2005 (gmt 0)

10+ Year Member



Reid

Is it possible to somehow make public the URLs that have been noted so far as possible bad guys?

If there is enough evidence that a few websites have been affected by this then we have to assume there may well be a problem.

Can we not track the offending website to see if it's taking place over and over with other sites?

I know the web is becoming a jungle.
We have hackers, spammers, and now, from your experience (which sounds reasonable), this new problem.

So the question now is why so few sites are affected.
If this is a coding mistake by Google, why is it not more widespread?

Vincent G

Reid

6:26 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know whether 'few' sites are affected or whether many are affected and just don't know it.
I saw a few that just figured they were 'sandboxed' when in reality they had this type of link holding them down.
Also, I think high PR makes you immune or less susceptible, so the majority of affected sites belong to inexperienced webmasters who just don't get it, or to experienced webmasters who see one of their fledgling sites 'stuck in the sandbox' or 'penalized somehow'.

There are many sites that have fallen out of the SERPs or remained in the sandbox for various reasons. All I can say is to check for this symptom:

site:yoursite should not show backlinks; if it does, they are being associated with your domain for some reason. Also, if they are showing in site: and you are able to remove these links by making YOUR page generate a 404, then get rid of them. Some sites have recovered by doing this.

Others tout the allinurl: test, but this only shows results where the search term appears in the URL.
When I do this I get the .com equivalent of my .net domain, which has nothing to do with me other than having the same name.
Allinurl: can also reveal some appended-type links such as somesite.yoursite.com, but I don't know if these are harmful or not. All I can say is they should not show in site:yoursite; otherwise there is a problem with domain association.

Reid

6:46 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If there is enough evidence that a few websites have been affected by this then we have to assume there may well be a problem.

As far as further investigation of these directories I found, I had a very bad experience there and spent the next day cleaning my system. They replaced my Google toolbar and installed a server on my hard drive. I was running virus protection, but it only found these things after a few reboots and after checking out 'files not accessed' because of denied permissions. I only caught it because it was one of my own files doing this.
I don't want to go there again, at least not on a Windows system. These guys are very elite dark-hats; at least that's what I was told, and what I found out by going there.
After that I just left it to Google to figure out. There's a network of them, and I had problems with a similar network of black-hats before. My philosophy now is to just avoid them unless I find them in my backyard.

buckworks

6:53 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Reid, let me guess ... was that "directory" you mentioned on a .cz domain? And the site: command shows about 1.4 million pages?

larryhatch

7:34 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I can definitely affirm that there is one black hat site doing 302 redirects,
just as described, and for the obvious purpose of SERPS ratings.
I went to the trouble of calling HEAD on the many many URLs for pages affected.
The URLs look like www.badguy.com/sites/site1234.htm

HEAD for that site leads directly to a page of mine.
Googling for snippets in my text brought up /site1234 before mine.
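For anyone wanting to repeat this kind of check, here is a minimal sketch using Python's standard `http.client`, which does not follow redirects on its own. The helper names `head` and `classify` are invented for illustration and are not the tool used in this post.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def classify(status, location):
    """Summarise a HEAD response: what kind of redirect it is, and to where."""
    if status in REDIRECT_STATUSES and location:
        kind = "permanent" if status in (301, 308) else "temporary"
        return f"{status} {kind} redirect to {location}"
    return f"{status} (no redirect)"


def head(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Offline example with made-up values; a live check would be
# classify(*head("http://www.badguy.com/sites/site1234.htm")).
print(classify(302, "http://www.example.com/mypage.htm"))
```

A 302 whose Location header names one of your own pages is the pattern described here. A META refresh will not show up this way, because it sits in the HTML body rather than in the response headers.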

This is not just one page, but several on my site alone.
Badguy did the same thing with numerous other high ranking sites in my niche.
He must have hijacked content-credit for a hundred pages, maybe 200.

Digging further on Badguy.com, I even found a few straight <a href= links,
but even those were buggered to prevent "PR leak". It looks very 'professional' indeed.

Badmaster has a weasel worded copyright statement absolving himself
since his ads are non profit! Hahahahahahaaaah.
"If you want your materials removed, email badguy@badsite.com"
No response of course, and no action taken on his end.

Badguy has TWO sites doing the same thing in the same field,
carefully linked and set up so one does not harm the other, just everybody else.

So, somebody wants proof? I have all the evidence I need right
in my arcane back yard. I can only guess how rampant this is
in the more heavily traveled parts of the web. - Larry

minesite

9:20 am on Apr 2, 2005 (gmt 0)

10+ Year Member



I have a similar problem with a site using a 302 redirect to my site, and was wondering what would happen if you set up an HTTP_REFERER check or .htaccess file to keep the redirect going.

Redirect the redirect to either a lower-ranked site or page, or back to the offending site.

Is this feasible?

Reid

9:48 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually I don't remember the URL, but it was a .com.

There's a bunch of them out there. They look like a directory, but if you look closer you see that it is auto-generated: pages and pages and pages of links. Lots of get-rich-quick, casinos, adult content and pharmaceuticals featured on the home page and the first few levels, but as you go deeper there are hundreds of thousands of categories, like it's auto-generated.
The real giveaway is the 'bad neighborhoods' featured up front (the big sell), with the clean stuff buried beneath it and the drive-by installations making your CPU do overtime.
These guys seem to have no limit.

Reid

10:06 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Redirect the redirect to either a lower-ranked site or page, or back to the offending site.

This has been brought up many times.

a --302--> b --301--> c = a --302--> c
Also, how can you redirect visitors from your high-rank home page to a low-rank page?
If b is your high-PR page, how will Googlebot index it after you bypass it with a redirect?
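To make the chain argument concrete, here is a toy sketch that follows a table of redirects to its final destination. The `resolve` function and the one-letter URLs are purely illustrative; this shows the reasoning in this post, not Google's actual behaviour.

```python
def resolve(start, redirects, max_hops=10):
    """Follow redirects from start; return (final_url, list of statuses seen).

    redirects maps a URL to a (status, target) pair. Loops and overly
    long chains stop the walk instead of running forever.
    """
    url = start
    statuses = []
    seen = {start}
    while url in redirects and len(statuses) < max_hops:
        status, target = redirects[url]
        statuses.append(status)
        if target in seen:  # redirect loop, give up here
            break
        seen.add(target)
        url = target
    return url, statuses


# a is the hijacker's URL, b is your page, c is where you send the redirect on.
redirects = {"a": (302, "b"), "b": (301, "c")}
final_url, statuses = resolve("a", redirects)
print(final_url, statuses)
```

Starting from a, the walk ends at c having passed through a 302 first, so the chain still begins with the other site's temporary redirect and b is simply skipped.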

The best way to deal with this issue right now (until Google fixes it) is to keep an eye on site:yoursite and deal with them as they come, using the removal tool.
It's not that often, depending on your niche.

Googlebot sends no referer string and does not follow the link anyway. It simply notes the 302 link and lists it as part of the hijacker's page; the damage is done. Later, another Googlebot fetches your page in order to index the other page.

Reid

10:30 am on Apr 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is one solution if you are getting this type of link a lot. But I think it requires a high PR to begin with, and it takes a bit of ingenuity to set up.

It works like this:
Each time a user-agent visits your home page it is 302 redirected to a randomly generated page which will only exist once; after that it will return a 410 Gone.

This way, when Googlebot comes to fetch a previously generated page that it got from a 302 on another domain, it will get a 410 Gone. If Googlebot asks for your home page, it will get a newly generated valid page.
I think this would require a high PR to begin with, to pass to the newly generated pages.

This would thwart the auto-generated hijacks which are after your home page but would not prevent one of your other pages from being victimized.
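A rough sketch of that idea, with everything kept in memory. The class name `OneTimePages`, the `/p/` path prefix and the token format are all assumptions for illustration. A real setup would need persistent storage and a web server in front, and whether search engines would treat it kindly is untested.

```python
import secrets


class OneTimePages:
    """Home page 302s to a random page that is served once, then is 410 Gone."""

    def __init__(self):
        self.pending = set()  # generated pages not yet fetched
        self.used = set()     # pages already served once

    def handle(self, path):
        """Return (status, Location or None) for a requested path."""
        if path == "/":
            # Every home page hit gets a fresh, randomly named page.
            new_path = "/p/" + secrets.token_hex(8)
            self.pending.add(new_path)
            return 302, new_path
        if path in self.pending:
            # First fetch of a generated page: serve it and retire it.
            self.pending.remove(path)
            self.used.add(path)
            return 200, None
        if path in self.used:
            # Any later fetch, e.g. via a stale 302 elsewhere, gets 410 Gone.
            return 410, None
        return 404, None


site = OneTimePages()
status, location = site.handle("/")
first_fetch = site.handle(location)
second_fetch = site.handle(location)
print(status, first_fetch, second_fetch)
```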
