Forum Moderators: Robert Charlton & goodroi
Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website's details in their databases feed your page snippets, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
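For anyone without a header interrogation tool to hand, here is a minimal Python sketch of the kind of check one performs. It is only an illustration under stated assumptions: the names `describe_response`, `MetaRefreshFinder` and `fetch` are made up for this example, and it looks only for an HTTP redirect status with a Location header or a meta refresh tag in the page body.

```python
import http.client
import re
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the target URL of any <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                          attr.get("content", ""), re.I)
        if match:
            self.targets.append(match.group(1).strip())


def describe_response(status, headers, body=""):
    """Return (kind, target) describing how a response redirects, if at all."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if status in (301, 302, 303, 307, 308) and "location" in lowered:
        return ("http-%d" % status, lowered["location"])
    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.targets:
        return ("meta-refresh", finder.targets[0])
    return ("none", None)


def fetch(host, path="/"):
    """Fetch one URL without following redirects (hypothetical helper)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
    resp = conn.getresponse()
    body = resp.read().decode("latin-1")
    headers = dict(resp.getheaders())
    conn.close()
    return resp.status, headers, body
```

Calling `describe_response(*fetch(host, path))` on a suspect link reports whether it answers with a 302, a 301, or a meta refresh, and where it points. `http.client` does not follow redirects by itself, which is why the raw status is visible here.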
Googlebot and MSNBot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or to a server-side redirect, and “BANG”, down goes your site if it has a lower pagerank than the offending site. Your index page is crippled because Googlebot and MSNBot now consider your home page to be, at best, a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, that is, without the WWW, which makes detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google of deciding which of your site's two URLs to index higher (more often than not, the one with the higher linked pagerank).
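To make the 301 idea concrete, here is a small Python sketch of the decision a server makes when it forces one hostname. It is an illustration only: the function name `canonical_redirect` and the default host `www.example.com` are assumptions for the example, and on a real site the same rule would normally live in the server configuration rather than in application code.

```python
def canonical_redirect(host, path, canonical_host="www.example.com"):
    """Return (301, url) if the request should be redirected, else None.

    host           -- the Host header of the incoming request
    path           -- the requested path, e.g. "/page.html"
    canonical_host -- the single hostname the site should be indexed under
                      (placeholder value, replace with your own)
    """
    requested = host.strip().lower().rstrip(".")
    if requested == canonical_host:
        # Already on the preferred hostname: serve the page normally.
        return None
    if not path.startswith("/"):
        path = "/" + path
    # Any other hostname (including the short, non-WWW one) is sent to the
    # preferred hostname with a permanent redirect, keeping the same path.
    return (301, "http://" + canonical_host + path)
```

With this rule in place, a request for the short URL gets a permanent redirect to the WWW version, so only one of the two URLs is left for a spider to list.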
Your only hope is that your pagerank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-pagerank sites within its system on the off chance that it strips at least one of them. This is reinforced by hundreds of other hidden 301 permanent redirects to sites of pagerank 7 or above, again in the hope of stripping a high-pagerank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam. They know it is going on, and Google AdWords is probably the main target for revenue, though I am sure Google does not approve of its AdSense program being used in such a manner.
Many such offending sites have no e-mail contact, a hidden WHOIS record and no telephone number. Even if you do manage to contact them, you will find in most cases that the owner or webmaster cannot remove your links from their site, because the feeds come from affiliate databases.
There is no point in contacting GOOGLE or MSN, because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of pagerank 5 or below are susceptible; if your site is a 3 or 4, be very alarmed. A skyscraper site need only create child page linking to reach pagerank 4 or 5, without needing to strip other sites.
Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.
Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.
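For spotting links of this shape in a referrer log or a list of backlinks, a short Python sketch follows. It is only a rough aid under stated assumptions: the function names, the list of query parameter names, and the host `tracker.example` in the usage note are all made up for illustration; only the `goto.php?path=` pattern comes from the link quoted above.

```python
from urllib.parse import parse_qs, urlsplit


def redirect_targets(url, param_names=("path", "url", "u", "to")):
    """Return the decoded values of query parameters that may hold a target.

    param_names is a guess at common parameter names; only "path" is taken
    from the link pattern quoted in the post.
    """
    query = parse_qs(urlsplit(url).query)
    targets = []
    for name in param_names:
        targets.extend(query.get(name, []))
    return targets


def points_at(url, my_host):
    """True if a redirect-script URL names my_host, with or without WWW."""
    my_host = my_host.lower()
    bare = my_host[4:] if my_host.startswith("www.") else my_host
    for target in redirect_targets(url):
        candidate = target.lower()
        if "://" in candidate:
            # Drop a leading scheme such as "http://".
            candidate = candidate.split("://", 1)[1]
        host = candidate.split("/", 1)[0]
        if host in (bare, "www." + bare):
            return True
    return False
```

For example, `points_at("http://tracker.example/goto.php?path=yoursite.com%2F", "www.yoursite.com")` is true, because the `%2F` is decoded to a slash and the stripped, non-WWW host is matched as well.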
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So, in essence, a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages containing PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. Google’s URL console only needs to see a few seconds of 404 or 403 from the offending site to remove what you need: either the site or the damaging link.
I hope I have been informative and have helped anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in denial that they are causing problems; they say that they are only counting outbound clicks. And they seem reluctant to remove your links.... Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
Unless you can prove it, it's not going anywhere. With so many other penalty possibilities you can't prove anything.
I believe you could be right Walkman
IMHO G is already working on it (and has been since Allegra) hence all the DC activity.
All we can do is try and speed up the process by making a bit of noise ;o)
Dazz
"oh no, those guys don't know what they're talking about. Rankings are not affected at all..."
Sounds like the email I received from Google on this issue! Google acting like there is no penalty on my sites that were GoogleJacked, duplicated, and penalized.
Something has got to be done and I think a press release would be the best way to go or a good PR optimized Blog that reporters tune into... Reporters seem to be reading lots of blogs :)
the only thing i can think of is that this is a way for the hijacker to boot my site off the index- but it doesn't make so much sense, because i see hijackers in completely different areas hijacking certain sites.
any ideas?
what kind of value? definitely not clicks. hell, the page isn't even counted as their page- ie, if you do site:hijacker you won't even see the hijacked pages in it.
regarding pr: let's say (even though it sounds far-fetched to me) that they get the pr of the hijacked page. what value does it give to them? the page doesn't belong to them, they can't monetize its pr or value- unless they stopped the 302 and set up a regular page instead.
can you shed more light?
People kill other people for a few hundred dollars; linking with a 302 is perfectly legal.
---
"I have to say: i dont believe in theory of hijackers. "
Unless you can prove it, it's not going anywhere. With so many other penalty possibilities you can't prove anything.
I can't think of any scenario where a site that MENTIONS another would have more relevance than the actual site. Even if a small hobby site named handmadebluewidgets.com was mentioned on CNN, a search for handmade blue widgets should still return the site itself, not CNN.
Just my opinion, I suppose everyone has a different expectation of what's relevant and what's not, but when I search for a site by name I expect to find that site, not sites that mention it.
The question I have is WHY is Google sticking with their present system of handling 302s?
Personally, I don't see any value in a 302 redirect and I think the HTTP spec is flawed, but there may be some legitimate uses that I simply haven't encountered. Technically speaking, www.example.com is a subdomain of example.com, and Google is simply indexing both domains. When you register example.com, you are free to use either. It appears Google is unwilling to assume that both versions belong to the same site; to Google, two unique domains = two unique sites.
If webmasters don't take steps to control BOTH as I suggested earlier, they have left the door open for hijackers.
I changed my htaccess just as soon as I saw your message. However, from what I'm seeing in the serps, it may be too late for my particular site.
I am truly amazed at how fast a 10,000+ page site can be brought down.
And I have to say that this appears to be the first time in history that one could say that Yahoo is more technically advanced than Google. I don't see ANY evidence of this problem anywhere in Yahoo.