Forum Moderators: Robert Charlton & goodroi
Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website's details in their databases feed your page snippets, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
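For anyone without a header interrogation tool to hand, here is a rough Python sketch of the kind of check meant above. It requests one URL without following redirects and reports either the status code and Location header or a meta refresh found in the page. The function names, the User-Agent string and the regular expression are my own assumptions for illustration, not any particular tool.

```python
import http.client
import re
from urllib.parse import urlsplit

# Matches <meta http-equiv="refresh" content="0; url=..."> in an HTML body.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)


def describe_response(status, headers, body):
    """Summarise how a response redirects, without following it."""
    location = headers.get("Location")
    if status in (301, 302, 303, 307) and location:
        return f"HTTP {status} redirect to {location}"
    match = META_REFRESH.search(body)
    if match:
        return f"meta refresh to {match.group(1)}"
    return f"HTTP {status}, no redirect found"


def fetch_raw(url):
    """Fetch one URL with http.client, which never follows redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The User-Agent value is a placeholder chosen for this sketch.
    conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
    resp = conn.getresponse()
    headers = {name.title(): value for name, value in resp.getheaders()}
    body = resp.read(65536).decode("utf-8", errors="replace")
    conn.close()
    return resp.status, headers, body
```

Calling describe_response(*fetch_raw(url)) on a suspect link shows whether it answers with a 301, a 302 or a meta refresh. A script that serves a different page on each request would need checking several times.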
Googlebot and MSNbot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or a server-side redirect, and “BANG”, down goes your site if it has a lower PageRank than the offending site. Your index page is crippled because Googlebot and MSNbot now consider your home page, at best, a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing the link for a long time. Note that these scripts mostly use your URL stripped, without the WWW, which makes detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short-URL problem, relieving Google from having to decide which of your site's two URLs to index higher, which is more often the one with the higher linked PageRank.
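The 301 fix for the short URL can be sketched as below. This is only an illustration in Python of the decision a server would make; the function name and the www.yoursite.com default are assumptions, and on a real server the same rule would normally live in the server's own rewrite configuration.

```python
def canonical_redirect(host, path, canonical_host="www.yoursite.com"):
    """Return (status, location) if the request should be 301-redirected
    to the canonical host, or None if no redirect is needed."""
    # Normalise case and drop any port, e.g. "YourSite.com:80".
    host = host.lower().split(":")[0]
    if host == canonical_host:
        return None
    # The "short" form is the canonical host without its leading "www.".
    if canonical_host.startswith("www."):
        bare = canonical_host[4:]
    else:
        bare = canonical_host
    if host == bare:
        # Permanent redirect, so only one of the two URLs stays indexed.
        return 301, f"http://{canonical_host}{path}"
    return None
```

A request for yoursite.com/page.html gets a 301 to http://www.yoursite.com/page.html, while requests that already use the www host pass through untouched.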
Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of them. This is compounded by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target for revenue. I am sure, though, that Google does not approve of its AdSense program being used in such a manner.
Many such offending sites have no e-mail contact, a hidden WHOIS record and no telephone number. Even if you do contact them, you will find in most cases that the owner or webmaster cannot remove your links from their site, because the feeds come from affiliate databases.
There is no point in contacting GOOGLE or MSN, because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is a 3 or 4, be very alarmed. A skyscraper site need only create child-page linking to reach PageRank 4 or 5, without needing to strip other sites.
Caution: trying to exclude them via robots.txt will not help, because these scripts are able to convert almost daily.
Trying to remove through Google a link that looks like new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So, in essence, it is a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages using PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed this way can have its server totally occupied by a single efficient spider that requests pages every split second throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. The offending site only needs to return a 404 or a 403 for a few seconds for Google’s URL console to detect what it needs, whether that is the site or the damaging link.
I hope I have been informative and helped anybody who has a hijacked site whose natural revenue has been unfairly affected. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links... Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
I need a website to upload it. Perhaps a folder called "google-302-hijacking-website" in an existing site.
Why don't you register a domain with an appropriate name, activate it, upload to the index.htm, post the news in this thread of what's underway, and ask anyone who is interested to link to the new site, (info via sticky in respect to TOS). It might stay sandboxed in G, no matter how many inbound links, but it would show-up in MSN soon enough, and Y a little later. The whole setup would cost peanuts.
Once it's rocking we can all see if we can kill it, as an experiment.
<edit> typo </edit>
You are close to being correct but minus a very subtle point. We are not saying that the 302 status code is the problem.
It is Googlebot and MSNbot.
These bots are caching other sites' pages and penalizing the legitimate site. Hence the duplicate pages and penalties.
So the problem is google's reticent reaction. They should say something about this matter. Better still, they should fix it.
Their only hope is to find a way for their bots to follow a definitive code, so that they understand an existing site must not be cached twice.
Google has the resources to find a solution. Hey, I would be happy to leave this alone if only I knew they were working on it.
Japanese - I will guarantee you get a lot of links, just keep the page clear, no ads
Yeah, man. Make it sole purpose, no ads, just a few links out to DMOZ or something so it isn't a bot dead-end. I'll give you a decent link and I'm sure a lot of the others here would too. Do a bit of simple SEO on it, text heavy, maybe a few pages... who knows, maybe we'll even get it into G? If so, we try to disappear it using suspect 302 methods.
<added>
I just registered an appropriate domain name.
Was typing when you posted that. Good stuff. Let me know the URL when things are set to go and I'll donate a solid link.
<added>
[edited by: Stefan at 12:37 am (utc) on Mar. 13, 2005]
Any volunteer much appreciated.
I will upload it and amend it a few times to be as easy to understand as possible.
I refuse to put adsense in it or any affiliate links.
I may also use the folder for potentially unethical and destructive methods of linking via a go-php or CGI redirector to cause the unpredictable status code conditions and multiple choice 300 301 302 303 redirects.
Why not set up a second domain that will be for 302 hijacking purposes? That is, one site will serve as the target, that we all link to, and a second site, that the more brave and/or foolish of us link to, will be used for the hijack. The hijacking site is going to have to be strong enough to pull it off, so it will need inbounds.
(Mods: I will totally understand if this post gets edited out of the thread. It's the scientist in me... I want to see experimental proof.)
There are several redirect types; two of them are "301 - Permanently Moved" and "302 - Page Temporarily Moved". Now suppose someone puts up some links to your site using a PHP redirect script to track clicks. The link looks something like this: redirectsite.com/go.php?1056, and it sends a 302 (temporary) redirect to yoursite.com. Google says, "Oh, yoursite.com really belongs to redirectsite.com, so I'm going to give redirectsite.com all the PageRank from yoursite.com". redirectsite.com now gets your search ranking, and Google removes your site from the rankings for having "duplicate content" of your own site.
Now I'm not 100% sure that it gives away your site's PageRank, but it surely seems like it; it definitely assigns a duplicate content penalty to the home page. In my case filter=0 brings up my home page on the old search terms, while the sub pages only come up with a quoted search on unique text.
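To make the go.php?1056 example concrete, here is a rough Python sketch of what such a click-tracking redirector does on each request. The real script would be PHP; the link table, the click counter and the function name are all hypothetical, and only the ID 1056 and the two site names come from the example above.

```python
# Hypothetical link table: the ID 1056 comes from the example URL above,
# the mapping itself is an assumption made for this sketch.
LINKS = {"1056": "http://yoursite.com/"}

# Per-link click counts, which is the stated purpose of such scripts.
CLICKS = {}


def go(link_id):
    """Return the (status, headers) a tracking redirector would send."""
    target = LINKS.get(link_id)
    if target is None:
        return 404, {}
    # Count the outbound click before redirecting.
    CLICKS[link_id] = CLICKS.get(link_id, 0) + 1
    # 302 tells the client the move is temporary, so a crawler may keep
    # the redirecting URL as the listed address of the target's content.
    return 302, {"Location": target}
```

The only part a crawler ever sees is the status code and the Location header; the click counting happens on the server and is invisible from outside.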
The evidence pointing to lost PR in my case is that the DMOZ listing shows a PR1 now where it used to be a PR3. Not a lot I know but it used to get some traffic.
Yes, you may be right.
Our member is putting up a hurriedly typed process of hijacking as we speak. Keep an eye on this thread.
We must help people understand the process. Action is now a requisite. All talk and no action will not get results.
Google must fix this problem... Period
They did not, so now we put up the method for everybody to benefit if they want to. It is not illegal to do 302 redirects.
We will describe how to do it as efficiently as possible, with the maximum effect.
Thanks.
We will see a glimpse of its construction in a few minutes. Keep your eyes peeled.
We will be working on its construction and terminology until it looks as good as we can get it to be.
The process is exactly as I described in the main post of this thread, with slight amendments and more details so that Joe Public understands it.
To top it off the "link" in the cached version was in the format: http ://www. [sitename] .com/dynamic-frameset.html?http ://www. [mysite] .com/[mypage].html
I wonder if they're trying to do something similar to the php 302 to my page with a frame.
If nothing else it looks like they're cloaking googlebot.
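For spotting links like the frameset one above, or the goto.php?path= kind mentioned earlier in the thread, a small Python sketch along these lines can pull your own address out of someone else's URL. The function name and the matching rules are my assumptions; it only looks at the query string and only handles the two shapes seen in this thread.

```python
from urllib.parse import urlsplit, unquote


def embedded_target(url, my_host):
    """Return the URL of my_host embedded in another site's query string,
    or None if the query string does not point at my_host."""
    query = unquote(urlsplit(url).query)
    # The query may be a bare URL ("?http://...") or key=value ("?path=...").
    candidates = [query] + [part.split("=", 1)[-1] for part in query.split("&")]
    for candidate in candidates:
        if "://" not in candidate:
            # Scripts often strip the scheme, so add one back for parsing.
            candidate = "http://" + candidate
        host = urlsplit(candidate).hostname or ""
        # Treat the www and non-www forms of the host as the same site.
        if host == my_host or host == "www." + my_host or "www." + host == my_host:
            return candidate
    return None
```

Run over the URLs from a cached page or a referrer log, this flags any link that carries your address as its destination, with or without the WWW.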