Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready-to-go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies hold your website's details in their databases and feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted PHP redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
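For anyone wondering what a header interrogation tool actually looks at, here is a minimal Python sketch. It assumes you have already fetched the suspect URL yourself and have the status code, headers and body in hand; the names MetaRefreshFinder and inspect_response are my own for illustration, not part of any existing tool.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the target of any <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        _, _, rest = attr.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip().strip("'\"")


def inspect_response(status, headers, body=""):
    """Return (kind, target) for a response that was fetched elsewhere."""
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("location")
    # A redirect status with a Location header is an HTTP-level redirect.
    if status in (301, 302, 303, 307, 308) and location:
        return (f"http-{status}", location)
    # Otherwise look for a meta refresh hidden in the page body.
    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.target:
        return ("meta-refresh", finder.target)
    return ("none", None)
```

If the target that comes back is your own domain while the URL you requested belongs to someone else, that is the pattern described above.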
Googlebot and MSNbot follow these PHP scripts either to an internal sub-domain containing the 302 redirect or to a server-side one, and “BANG”, down goes your site if it has a lower PageRank than the offending site. Your index page is crippled because Googlebot and MSNbot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be much help in tracing it for a long time. Note that these scripts mostly use your URL stripped or without the WWW, making detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google of deciding which of your site's two URLs to index higher, which is more often the one with the higher linked PageRank.
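The 301 fix for the short URL amounts to a simple rule: any request that arrives on a host other than your preferred one gets a permanent redirect to the same path on the preferred host. A rough Python sketch of that rule follows; the function name canonical_redirect and the default host www.yoursite.com are placeholders for illustration, and in practice you would normally do this in your web server configuration rather than in application code.

```python
def canonical_redirect(host, path, canonical_host="www.yoursite.com"):
    """Return (301, location) if the request used a non-canonical host, else None."""
    # Compare host names case-insensitively and ignore any port suffix.
    bare_host = host.lower().split(":")[0]
    if bare_host == canonical_host:
        return None  # already on the preferred host, serve the page normally
    # Every other host name is sent permanently to the preferred host.
    return (301, f"http://{canonical_host}{path}")
```

So a request for yoursite.com/page.html would be answered with a 301 pointing at http://www.yoursite.com/page.html, leaving only one URL per page for the spiders to index.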
Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of the targets. This is reinforced by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target for revenue. Though I am sure Google itself does not approve of its AdSense program being used in such a manner.
Many such offending sites have no e-mail contact, a hidden WHOIS and no telephone number. Even if you were to contact them, you would find in most cases that the owner or webmaster cannot remove your links from their site because the feeds come from affiliate databases.
There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months, only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is 3 or 4 then be very alarmed. A skyscraper site need only create child-page links to get PageRank 4 or 5, without the need to strip other sites.
Caution: trying to exclude them via robots.txt will not help, because these scripts are able to convert almost daily.
Trying to remove a link through google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that period.
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence it is a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages containing PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google’s URL console to detect what it needs, either the site or the damaging link.
I hope I have been informative and have helped anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
I've just seen this thread today, as well as this one [webmasterworld.com] and this one [webmasterworld.com], which are, i believe, all three threads you have posted in so far - all on the same subject (at least with that name - no offence intended).
I must say that you have done your homework, which is rare for a new poster. It's a subject i've been following for a few years, and i do appreciate that awareness is increased. No doubt you have seen some posts of mine on this subject, otherwise i can point to a few:
Here's one from December 2003 [webmasterworld.com] (#36) - note Brett Tabke's comment in msg #31, he's 100% right there and has been proven even more right later.
Here's another, from May 2004 [webmasterworld.com] (#1) - at the bottom you will find a collection of no less than 24 different related threads dating back to June 2003.
So, this is not new, and there have been plenty of discussions and complaints, literally for years now. So far Google has done absolutely nothing to remedy it in all that time.
Anyway, i just wanted to say hi, and that i do appreciate that you raise more awareness on these issues. I haven't read all posts of the mentioned three threads yet, but i'm getting there.
Many thanks.
This is a major topic that affects many site owners. I thank you and the other senior members here for allowing me to make sure that the problem yet to be solved is still very much on the agenda.
A democratic exchange of ideas is by far the best method to seek answers to a major problem such as this. It is a dilemma that has left thousands of website owners in a quandary. If only for the people who cannot understand why their sites have disappeared into total oblivion, we should raise the stakes.
You and I know almost exactly why. But I can assure you that most website owners simply cannot work out the intricacies and the illusory and deceptive methods deployed against their vulnerable websites.
Claus, this problem has really killed off many website owners' aspirations regarding the internet. They are at a complete loss as to why their site is in total oblivion. Google is a very secretive company and very reticent with it. Google sheds no tears for the lady who spent £10,000 on her site and many, many hours building it, and complied with every ethical doctrine Google outlines, only to be swallowed up by a simple 302 directive that adulterates her dreams of being a website owner.
I know why her site is in oblivion and I know exactly how it got there. I am not prepared to let this issue go. I will debate things on her behalf.
there are plenty of copyright lawsuits currently being won against google. here's what i'm proposing
a company (there are now many) whose site no longer shows up for their trademarked company name has a few offshore friends buy adwords for that name. a copyright lawsuit follows and costs google a few hundred thousand. not only are they making money off of your brand name but they are allowing for confusion by not ranking your site even in the organic results.
there are undoubtedly a few hundred, if not thousand, people in this situation. this could make quite a dent in the bottom line, and would probably force google to rectify this situation.
* I did notice one thing that makes it look like it would be so easy for google to figure out how to know which page is the original. *
When you click on the Google cache of the redirect site's link and check the properties of the images or links on the page, they resolve to referringsite/uniqueimage.jpg or referringsite/dir/link.html.
Those links would all return 404 page-not-found errors when Google tried to index them, so it seems to me that an easy fix for Googlebot would be a formula like this:
if page.links = mostly 404s
then
    page = not original
    don't credit referring url to content
else
    page = probably the original
Even if googlebot didn't keep the variables needed you could run some sort of process over the index to check for invalid links. I don't see why a search engine would want to list pages with tons of 404s anyway.
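To make the formula concrete, here is a small Python sketch of the same check. It assumes the crawler already holds the HTTP status code for each link and image found on the cached page; the function name looks_like_original and the 0.5 cut-off for "mostly" are my own guesses, not anything Google has published.

```python
def looks_like_original(link_statuses, threshold=0.5):
    """Guess whether a page is the original, given the status of each link on it.

    link_statuses: HTTP status codes returned when each link or image on the
    page is resolved against the URL the page was indexed under.
    """
    if not link_statuses:
        return True  # nothing to judge by, so give the page the benefit of the doubt
    broken = sum(1 for status in link_statuses if status == 404)
    # "Mostly 404s" is read here as more than `threshold` of all links.
    return broken / len(link_statuses) <= threshold
```

A hijacking URL whose relative links mostly fail would come back False, so the content would not be credited to the referring URL.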
A less process intensive fix would be to just give us a new robots.txt entry that says "contentdomain=validsite.com".
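As a rough idea of how little work that would be on the spider's side, here is a Python sketch. Bear in mind that "contentdomain" is only the entry proposed above and is not a real robots.txt directive, and the helper names declared_content_domain and may_credit are made up for this example.

```python
def declared_content_domain(robots_txt):
    """Return the domain named by a hypothetical 'contentdomain=' line, if any."""
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "contentdomain":
            return value.strip().lower() or None
    return None


def may_credit(indexed_host, robots_txt):
    """Decide whether content may be credited to the host it was indexed under."""
    domain = declared_content_domain(robots_txt)
    if domain is None:
        return True  # site has not opted in, so behave as before
    host = indexed_host.lower()
    # Accept the declared domain itself and any of its sub-domains.
    return host == domain or host.endswith("." + domain)
```

The spider would read the robots.txt of the site that actually served the content, and refuse to credit that content to any URL on a host outside the declared domain.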
Is it just me or wouldn't it be just that easy?
I'm sure that every webmaster who's been affected by this problem would be more than happy to add a line to robots.txt.
My 2 Cents
The only way to force Google to fix this is with bad publicity. If this story made it onto CNN or whatever, the problem would be fixed within 72 hours. But this story by itself is not newsworthy. Perhaps some sort of protest-type action could get enough publicity and draw attention, maybe a sit-in or something at the Googleplex?
Google's public engine, software and hardware are tools being used to assist users in violating copyright laws.
I am certainly far from a legal rep, but my question: Would Google not be an accessory in helping someone violate copyright laws?
If I went to the public library and used their computers and hardware to reproduce copyright material in mass quantities for profit, don't you think action taken against the library would result in action against myself?
It would be an interesting argument to engage, however I don’t have deep pockets so I will sit back for the ride.
:)
Agree... seems Google is only worried about their bottom line, and not search results...
Need a weather forecast, map/address, or some silly blog - then Google's your site. Outside of that, I'd personally focus my attention more on MSN or Yahoo.
Kind of ironic GoogleGuy can comment on the Google Desktop search ( quickly might I add ) - but seems to be incognito when it comes to Google faults ... especially since this subject has been discussed for six or more months now - and yet there appears to be nothing done.
But heck... GoogleGuy says "People have already written some cool plug-ins" for Desktop Search... how about some cool code to fix the 301/302 problem?
but this story by itself is not newsworthy
This is actually not the case. If the story is put in a way that doesn't mention robots.txt, skyscraper sites, 301 & 302 redirects, PHP scripts etc., then the journalist would decide whether it is a story.
To make it a story, some people from different disciplines who know what the story consists of could outline it for the relevant journalists.
There is a story in everything and everything is newsworthy somewhere!
The info given to journalists needs to give a clear and easily understandable overview of the problem, some examples and consequences of the problem, along with what happened when Google was notified. Bear in mind, the tech reporter might be unavailable and someone else without the needed skills might be filling in.
The person sending the press release needs to include his or her full name, address, phone number and email info in case the reporter wants more info. It would also be a good idea to provide Google's phone number so the reporter can contact them for their side of the story.
Info on how to write a press release: [lunareclipse.net...]
Let's see where this goes.
To me this means I don't see any solution to this problem, so maybe we must face the music and start building pages with some redirecting scripts to good sites plus our own content; that way the scripts will be a form of SEO.
or
I will try to look for some tech news outlets that have posted something about Google, and which have also been shown on CNBC and other business news channels. I will start now, then we could give them the links to the WebmasterWorld threads on this topic and some good text, which should be written by some here who are better with text than I am.
I could post some names here that could be worth emailing.
As I said, I don't think we will see a solution, so we could also go to the Google complex with some banners.