302 Redirects continues to be an issue

     
6:23 pm on Feb 27, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 27, 2005
posts:93
votes: 0


recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low to midrange PageRank sites by causing Googlebot to pick up a 302 redirect, issued by a script (PHP, ASP, CGI, etc.) and backed by an unseen, randomly generated meta refresh page, that points to an unsuspecting site. In many cases the encroaching site actually writes your website's URL with a 302 redirect inside its own server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies, which hold your website's details in their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
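For anyone wondering what a header interrogation tool actually looks for, here is a minimal sketch in Python. It assumes you have already fetched the suspect URL with redirects switched off and have the status code, the response headers and the HTML body in hand. The function name inspect_response and the example host names are made up for illustration; this is not any particular tool.

```python
from html.parser import HTMLParser


class _MetaRefreshFinder(HTMLParser):
    """Collects the target URLs of <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/"
        _, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.targets.append(rest[4:].strip("'\" "))


def inspect_response(status, headers, body):
    """Return a list of (kind, target) pairs for every redirect found."""
    findings = []
    location = {k.lower(): v for k, v in headers.items()}.get("location")
    if status in (301, 302) and location:
        findings.append((f"http-{status}", location))
    finder = _MetaRefreshFinder()
    finder.feed(body)
    findings.extend(("meta-refresh", target) for target in finder.targets)
    return findings
```

A page that answers 200 but carries a meta refresh pointing at your domain would show up here as a ("meta-refresh", ...) entry, which a status-only check would miss.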

Googlebot and MSNBOT follow these PHP scripts to either an internal sub-domain containing the 302 redirect or a server-side redirect, and “BANG”, down goes your site if it has a PageRank below that of the offending site. Your index page is crippled because Googlebot and MSNBOT now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, that is, without the WWW, which makes detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short-URL problem, relieving Google of deciding which of your site's two URLs to index higher (more often the one with the higher linked PageRank).
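As a rough illustration of the 301 fix for the short URL, the sketch below (Python, chosen only for illustration) sends any request that arrives on another host name to one canonical host, so that only one URL per page is left to index. The host www.yoursite.example and the function name canonical_redirect are assumptions, and the sketch ignores ports and https.

```python
def canonical_redirect(host, path, canonical_host="www.yoursite.example"):
    """Return (301, location) for a non-canonical host, or None if no redirect is needed."""
    if host.lower() == canonical_host:
        return None  # already on the canonical host, serve the page normally
    # Permanent redirect: tells the robot the www URL is the one to keep
    return 301, f"http://{canonical_host}{path}"
```

On a real server the same rule would normally live in the web server configuration rather than in application code.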

Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of the targets. This is compounded by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target of revenue. I am sure, though, that Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months, only now it is escalating at an alarming rate. All sites with a PageRank of 5 or below are susceptible; if your site is a 3 or 4, then be very alarmed. A skyscraper site need only create child-page links to reach PageRank 4 or 5, without having to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts are nearly able to convert daily.

Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So, in essence, a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages containing php, asp or cgi redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google’s URL console to detect what it needs: either the site or the damaging link.

I hope this has been informative and helps anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links.... Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

3:52 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 12, 2001
posts:1150
votes: 0


<<After the site removed their 302 redirect it was still listed as my site until today.>>

And I think this is because once a url is in Google's database, googlebot continues to go DIRECTLY back to that url for spidering.

IOW, if the original url is something like...
"foo.com/cgi-bin/linkto.pl?target=yourdomain.com"

...even though the offending site removes the "target=yourdomain.com," Google will continue to go back to the full original url containing your url.

Google will continue to see the original url as an actual page as long as that "linkto.pl" script is in place.
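To make the shape of such a url concrete, here is a small Python sketch that pulls apart a redirect-script url and reports any query parameter carrying your domain. The function name find_embedded_domain is made up for illustration, and new.example.co.uk in the usage lines is a placeholder host; the parameter names (target, path) are the ones quoted in this thread.

```python
from urllib.parse import parse_qsl, urlsplit


def find_embedded_domain(url, my_domain):
    """Return (parameter, decoded value) pairs whose value mentions my_domain."""
    query = urlsplit(url).query
    # parse_qsl also decodes escapes such as %2F, so "yoursite.com%2F" becomes "yoursite.com/"
    return [
        (name, value)
        for name, value in parse_qsl(query)
        if my_domain.lower() in value.lower()
    ]
```

For example:

```python
find_embedded_domain("foo.com/cgi-bin/linkto.pl?target=yourdomain.com", "yourdomain.com")
# [('target', 'yourdomain.com')]
find_embedded_domain("new.example.co.uk/goto.php?path=yoursite.com%2F", "yoursite.com")
# [('path', 'yoursite.com/')]
```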

Anyone agree?

noobie34

4:43 pm on Mar 10, 2005 (gmt 0)

Inactive Member
Account Expired

 
 


Is this an example of a page jacker?

[1-hit.com...]

It has basically only one link on the page, and the link is of the form:
[1-hit.com...]

5:51 pm on Mar 10, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 10, 2003
posts:351
votes: 0


I guess I'm having a dense day...

Can somebody answer these questions:

1. Is the problem that the "scraper" site is redirecting to a variation of your url that returns a "page cannot be displayed" error?

2. Is the problem that they are stealing your content and putting it on their own site?

3. Is the problem that Google considers info at "http://domain.com/page.htm" and "http://www.domain.com/page.htm" to be duplicate content because one is missing the "www." in front of the address?

Am I anywhere close to describing the problem?

5:59 pm on Mar 10, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Dec 20, 2003
posts:268
votes: 0


"Am I anywhere close to describing the problem?"

If you don't read anything else read post #54. It is a pretty good summary of how this thing works.

6:03 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 3, 2003
posts:1246
votes: 0


I have no need for traffic from india, china, taiwan, korea, south america, or eastern europe. Therefore, I would have no problems simply banning them all from my current and future sites.

Does anyone have a list of ranges by country to accommodate this?
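No such list appears in this thread, but the mechanics of range-based banning can be sketched in a few lines of Python once you have the ranges in CIDR form. The two ranges below are reserved documentation addresses used purely as placeholders, not real country allocations, and the names BLOCKED_RANGES and is_blocked are assumptions.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation blocks), NOT real country data
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(remote_addr, ranges=BLOCKED_RANGES):
    """Return True if the visitor's address falls inside any banned range."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in ranges)
```

A request handler would call is_blocked() with the visitor's address and answer with a 403 when it returns True.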

6:12 pm on Mar 10, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:May 13, 2003
posts:147
votes: 0


As for spyware stealing affiliate links,!?!, not heard of this one yet, please give a link to a thread or pm me a website with more info on how to deal with this.

Just do a search on Google for 'affiliate commission theft'

6:28 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 8, 2004
posts:865
votes: 0


Thanks wayne.

As for the hijackers using a snapshot of your site, wouldn't simply adding a date at the top of your page or page date show that your site was the newer updated one?

6:32 pm on Mar 10, 2005 (gmt 0)

System Operator from US 

incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14664
votes: 99


GO-PHP REDIRECTOR. A stoic and completely merciless script that can easily be modified and optimized to create havoc to googlebot

Please explain what they can possibly do with the script different than any other 302 redirect?

Based on your description, any unknowing person that slaps up a directory using off-the-shelf PHP software that uses this redirector is a black hat? I don't think so.

The problem is obviously not with the scripts or redirects, it's obviously Google's interpretation of the redirect. If people are exploiting the 302 bug, whether on purpose or inadvertently, you can't blame the technology they're using as it has never been the problem, the problem is Google.

So everyone should stop chasing Google to remove this link or that link, which burns lots of Google resources, and instead hammer on them to fix the global issue with their stinking 302 handling algorithm.

7:28 pm on Mar 10, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 9, 2004
posts:98
votes: 0


Spot on. The problem is Google. This has nothing to do with the RFC for 302s. The RFC doesn't tell a search engine how to credit page rank and discount duplicate content.

That's entirely Google and shame on them. And forget about the webmasters hurt by this, it is a shame for the users of Google who get cloaked to sites that Google recommends based on someone else's content.

Either Google can fix this issue immediately or they are morons. Simple: no credit to 302s when the supposed temporary URL has one non-302 link to it on its own site. That comports with the RFC, gives owners control over their domains and content and is the right thing to do for users.

Can't you just feel the class action lawsuit building?

7:30 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 16, 2004
posts:693
votes: 0


In layman's terms the problem is this.

Many websites have an intro page; some automatically check what type of browser you have, whether you have the plugins you need to view the website, your language preference, etc.
This way the website can send you the right page for you. For example, depending on whether or not you have Shockwave or Flash software, the intro page will direct the browser to the compatible page. These types of intro pages typically use an automatic redirect: if you don't click on a link within a certain time, the browser is automatically directed to the proper page.
When you do a search in Google, the search engine does not want to send you directly inside the website to an incompatible page. Therefore, when Google sees a page with the automatic redirect code, it assumes this is the intro page and sends surfers to the intro page for the content they are searching for. This way the website can check your browser for the appropriate software and provide you with the best possible surfing experience.
The Google bug happens when websites use this same type of redirect code to point to other websites. Most people do this for various legitimate reasons; usually it is a simple tracking method so they can record which links are being used and which are not. Google mistakes the linking page for the other website's intro page in these cases.
Some less scrupulous webmasters ('hijackers') took notice of this Google bug and are exploiting it to the fullest, effectively hijacking other websites' positions within Google search results. Google has gotten better recently, apparently fixing most of the accidental hijacks, but the real hijackers have become very aware of this weakness and are taking it to another level using illegal or unethical methods. This is called 'google jacking'.
Although Google has made changes to its secret algorithm (the mathematical process used to determine which page is most relevant to a search query), webmasters have become very concerned about this issue. Many have lost their livelihood to 'google jackers' who unscrupulously lure the surfer, under the false pretense of delivering the content described in Google's search results, into any number of less than ideal surfing experiences.

I think that pretty much sums up the issue in layman's language. Remember, most surfers don't know what <a href= even means, much less care.

This is just a draft, anyone care to make adjustments or should we send it off to the press?

edited - another speeling mistake

[edited by: Reid at 7:40 pm (utc) on Mar. 10, 2005]

7:32 pm on Mar 10, 2005 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:July 17, 2003
posts:717
votes: 27


Would it be possible to hijack the hijackers?!
7:41 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 10, 2003
posts:929
votes: 13


As an AdSense publisher being hijacked, I thought G might have had an interest in fixing this ASAP, as they would be losing money too. But I've now realized that most of the hijackers ARE AdSense publishers too, so G has probably only seen it as a shift or JUMP in revenue.
We just had another domain fall prey. At the rate this is going, it looks like G will soon be obsolete, as the one dominant 'virus of a hijacker' grows big enough PR-wise to knock EVERYONE off and become the dominant scraper site. There will be no reason left to use Google. At least then we'll ALL know who the culprits are.

[edited by: MikeNoLastName at 8:05 pm (utc) on Mar. 10, 2005]

7:42 pm on Mar 10, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 4, 2005
posts:70
votes: 0


Ok, I'm seeing a lot of misunderstandings here, and for some reason an inability for anyone to describe the problem in simple terms.

Here it is.

When you click on a hyperlink, your web browser asks the server for the page.

A "302" redirect is simply the server telling the browser to look elsewhere for the page that was requested. Your browser then automatically looks at the address which accompanied the "302" and asks that server for the page.

For example, in your browser you type in abc.com.
At abc.com, the server sends a 302 which says to your browser "the page you're looking for is actually at def.com"
Your browser then goes to def.com to get the page.

There's no copying of content.

Many web sites use 302 redirects to count how many people have clicked on a link, otherwise there is no way to know when someone clicks a link on your site. Also, there are many php scripts for building directories which use 302s, and many people using asp or asp.net who use built-in redirects in those systems are also using 302s. There are many many reasons for doing so, most of which are not evil.
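A click-counting redirect of the kind just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of any particular directory script; the names clicks and track_and_redirect are made up.

```python
from collections import Counter

# Running tally of outbound clicks, keyed by destination URL
clicks = Counter()


def track_and_redirect(target_url):
    """Record the click, then tell the browser where the page really is."""
    clicks[target_url] += 1
    # 302 = temporary redirect; the browser follows the Location header
    return 302, {"Location": target_url}
```

The link on the page points at the script rather than at the destination, so every click passes through the counter before the visitor lands on the other site.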

The problem is that Google, when it follows a link that returns a 302, files the destination page under the original url. The end result is that *anyone* who links to you using a 302 gets their link added to google using *your* page's content as their content. This can cause problems with google's duplicate content filter, and can end up causing your original page to be demoted in the listing, while the page linking to yours gets recognized as the "original".
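To make that filing behaviour concrete, here is a toy model in Python. It is only a sketch of the behaviour as described in this post, not Google's actual algorithm, which nobody outside Google has seen. The tiny fake "web" dictionary, the function name index_link and the flag file_under_original are all assumptions for illustration.

```python
def index_link(url, web, file_under_original=True, max_hops=5):
    """Follow 302s through a fake web and return (index key, page content).

    web maps a URL to (status, payload): payload is the Location for a 302
    and the page content for a 200.
    """
    requested = url
    status, payload = web[url]
    hops = 0
    while status == 302:
        if hops == max_hops:
            raise ValueError("too many redirects")
        url = payload  # follow the Location
        status, payload = web[url]
        hops += 1
    # The whole issue: which URL does the fetched content get filed under?
    key = requested if file_under_original else url
    return key, payload


web = {
    "http://abc.example/go?to=def": (302, "http://def.example/"),
    "http://def.example/": (200, "def's page content"),
}
```

With file_under_original=True the content of def.example ends up listed under the abc.example url, which is the situation described above. With file_under_original=False it is listed under its real address.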

The problem is one that only google can fix, and some indications are that they are already working on a fix, but because no one there is talking, the speculation still runs wild.

How's that?

8:06 pm on Mar 10, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 8, 2002
posts:2335
votes: 0


Has googleguy or any official posted about this lately? I remember at some point Google finally addressed the issue somewhat. But clearly not well enough.
8:07 pm on Mar 10, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Jan 10, 2005
posts:236
votes: 0


Does google actually fetch the page twice? So, it fetches it once on the real URL, and it fetches it again by following the 302 redirect? Or does it see that the 302 points to a URL it has already harvested and just note that, without a second fetch?

If it does fetch twice you may be able to defend against this attack by randomly varying the content on your page. Say, randomly different quotes come up, or rotate news stories, etc., somewhere on the page.

The idea would be to defeat Google's duplicate content filter. If it doesn't think the pages are duplicates, it won't replace one with the other.

This 713 message thread spans 48 pages: 713
 
