Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website's details in their databases feed snippets of your pages, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted variant of a PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
Googlebot and msnbot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or a server-side redirect, and “BANG”, down goes your site if its pagerank is below that of the offending site. Your index page is crippled because googlebot and msnbot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that google does not reveal all links pointing to your site and takes a couple of months to update, and thus an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, that is without the WWW, making detection harder. This also causes googlebot to generate another URL listing for your site that can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving google of deciding which of the two URLs of your site to index higher, which is more often the one with the higher linked pagerank.
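To make the 301 point concrete, here is a minimal Python sketch of the decision a server makes when it folds the short URL into the www version. The host names and the function name `canonical_redirect` are assumptions for illustration only; on a real server this would normally be done in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www form is the one you want indexed.
CANONICAL_HOST = "www.yoursite.com"
BARE_HOST = "yoursite.com"


def canonical_redirect(url):
    """Return (301, canonical_url) for the short URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != BARE_HOST:
        # Already canonical, or not one of our hosts at all.
        return None
    # Keep path and query, swap only the host.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return 301, target


print(canonical_redirect("http://yoursite.com/page?x=1"))
```

With this in place both spellings of the address end up at a single URL, so there is only one candidate for the index.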
Your only hope is that your pagerank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher pagerank sites within its system on the off chance that it strips at least one of them. This is reinforced by hundreds of other hidden 301 permanent redirects to sites of pagerank 7 or above, again in the hope of stripping a high pagerank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big name affiliates are involved in this scam. They know it is going on, and google adwords is probably the main target for revenue. Though I am sure google does not approve of its adsense program being used in such a manner.
Many such offending sites have no e-mail contact, a hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links from their site because the feeds come from affiliate databases.
There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months, only now it is escalating at an alarming rate. All sites of pagerank 5 or below are susceptible; if your site is a 3 or 4 then be very alarmed. A skyscraper site needs only to create child page linking to get pagerank 4 or 5, without the need to strip other sites.
Caution: trying to exclude via robots.txt will not help because these scripts are nearly able to convert daily.
Trying to remove a link through google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence a programme in perpetual motion, creating millions of 302 redirects so long as it stays on. As every page is a unique URL, the script will hopefully continue to create and bombard a site that produces dynamically generated pages that possess php, asp or cgi redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via google's URL removal tool. You only need a few seconds of 404 or 403 responses from the offending site for google’s url console to detect what it needs: either the site or the damaging link.
I hope I have been informative and of help to anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
I need a "Joe Public" description of the problem.
I'm an experienced webmaster and seo, and although I understand this thread, I'm not able to explain this problem to my employees (much less my wife) at this point.
I would also like to submit it to cnet, drudgereport, other threads, etc, but again, would anyone understand the nature of the problem?
Can anyone summarize this in layman's terms, maybe with some examples?
joined:Dec 29, 2003
"I would also like to submit it to cnet, drudgereport, other threads, etc, but again, would anyone understand the nature of the problem?"
This has nothing to do with 302s. You have penalty X and it's because you did Y. How do you answer that?
I think it's easy to say a site doesn't rank because they've done something to earn a penalty, then disregard any of the other factors that might be a consideration. Your average searcher doesn't care about Google penalties, if they even know such a thing exists. All they know is someone told them about this great site about hand made blue widgets, but they couldn't remember the URL. So, they go to Google and type in "hand made blue widgets". What do they get in the SERPs? Do they get handmadebluewidgets.com? No, they get 50 or 60 directories that have scraped sentences from handmadebluewidgets.com, then 30-40 other sites that might be related that mention or link to handmadebluewidgets.com, or they get several dozen sites that mention blue, widgets, hand made, hand, etc.
That is NOT a relevant search. Especially when the site being searched for is buried at #256 for its own name. It's only a matter of time before the public notices this is happening; even now I've had people tell me they feel like they're wasting their time at Google. How long before the switch to a new SE happens?
Face it, there are some very basic SE criteria that Google is failing to deliver right now. And if Google is going to penalize sites for some unknown factor, and replace the penalized site with a hundred or so sites that MENTION that site instead, how is that relevant? Google isn't helping people to find what they want, they're throwing up detours and barriers instead.
It sounds interesting to me. My years-old site has just started dropping like it's hot, one term at a time, as of a few weeks ago. After reading your msg, I realized that a change I made around that same time eliminated a randomly-generated list of products that appeared on each page. I just put this code back in and will report what happens, if anything.
Please excuse the rushed explanation, I will post a mega example soon in more detail. This is for joe public.
Let me explain again how it is done. This time in more layman's terms.
1. A blackhat reciprocates your link with a syntax on his page. At first it may appear to you to be something like this:
<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F">Target Site</a>
Do not be fooled by the above URL as being a link that points to your website. IT DOES NOT……Think
Look at where the href begins: it is pointing to the blackhat webmaster's domain. The above link will look totally innocent to you when viewed from a browser. Take another look after the .com/…..This is the killer. It points to a variant of the NukeModule GO-PHP REDIRECTOR, a stoic and completely merciless script that can easily be modified and optimized to wreak havoc on googlebot.
Look closer still and you will see target-site as the destination of the innocent looking link. Wrong again, that is cosmetics. The version below will do exactly the same job as the one above.
<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2FI-AM-GOING-TO-DESTROY-YOUR-SITE-BECAUSE-I-WANT-YOU-DESTROYED-IN-GOOGLE-AND-YOU-CAN-KISS-YOUR-RANKING-GOODBY-HA-HA-HA-HA-HA%2F">Target Site</a>
There is no difference at all between the two URLs above. They will both look like this in a browser: Target Site
The 2 URLs above are identical so far as an end user is concerned. And they are totally harmless when an end user browses the page and totally harmless when they are clicked. But it is a trip wire….Don’t forget….IT IS A TRIP WIRE FOR ROBOTS, especially for googlebot and msnbot.
Your browser does not collect information on a website to present it to google’s databases, and your browser is unable to make an exact copy of the page limited to 101K in size. So no harm can come of the above links when people click away at them all day long.
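The point about where the href really leads can be checked mechanically. The short Python sketch below splits each of the two example links into the host that is actually requested and the decoded text after the "?=". The function name `link_anatomy` is an assumption made for this illustration.

```python
from urllib.parse import urlsplit, unquote


def link_anatomy(href):
    """Split a link into the host that is really requested and the decoded query payload."""
    parts = urlsplit(href)
    return {
        "host": parts.hostname,                       # where the request really goes
        "path": parts.path,                           # the redirector script
        "payload": unquote(parts.query.lstrip("=")),  # cosmetic text after "?="
    }


LINK_A = "http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F"
LINK_B = (
    "http://the-killer-site.blackhat.com/go-php?=%2F%2FI-AM-GOING-TO-DESTROY-YOUR-SITE-"
    "BECAUSE-I-WANT-YOU-DESTROYED-IN-GOOGLE-AND-YOU-CAN-KISS-YOUR-RANKING-GOODBY-"
    "HA-HA-HA-HA-HA%2F"
)

for link in (LINK_A, LINK_B):
    print(link_anatomy(link))
```

Both links report the same host and the same script path. Only the payload differs, and what the redirector does with that payload is decided entirely on the blackhat's server.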
THE TRIP WIRE
The blackhat webmaster has tweaked his go-php redirector to have no mercy on the site it points to, by creating conditions in the server-side directive that confuse googlebot.
A link is placed at another strategic page, pointing to the blackhat's page that contains the 2 syntaxes above. And do not forget, the 2 URLs above are not links that point out. No sir, do not get confused into thinking it is an outbound link…The 2 URLs above point internally to where the blackhat's go-php redirector resides. It is here that a surreptitious and dastardly sinister script goes to work. The trip wires for googlebot are the innocent looking links that look like “Target Site” to the average end user. But when googlebot follows one, it goes into action.
The go-php redirector tells googlebot that this is the target site's URL… [the-killer-site.blackhat.com...]
The go-php redirector tells googlebot, via the server-side response, that the URL has moved only temporarily and that its current location is given in the Location header. Hence the 302 header information.
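For anyone without a header interrogation tool to hand, this is roughly what such a tool reports. The Python sketch below parses a raw response head and pulls out the status code and the Location header. The sample response is made up for illustration and is not captured from any real site, and `redirect_info` is a hypothetical helper name.

```python
def redirect_info(raw_head):
    """Return (status, location) for a 301/302 response head, or None otherwise."""
    lines = raw_head.strip().splitlines()
    status = int(lines[0].split()[1])  # "HTTP/1.1 302 Found" -> 302
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if status in (301, 302):
        return status, headers.get("location")
    return None


# Hypothetical response of the kind described above.
SAMPLE = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://yoursite.com/\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

print(redirect_info(SAMPLE))
```

A 302 with your address in the Location line, served from somebody else's URL, is the signature to look for.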
Remember this… “Target Site”… Its real syntax should have been <a href="http://www.yoursite.com">Target Site</a>
So now googlebot has enough information to leave the blackhat's site and deposit the gathered info in a temporary holding place at the googleplex, or wherever it dumps the info. All of this occurred in a split second. But “WAIT”: the doom of the Target Site has not yet been sealed. Another, more sinister event took place at the same moment the 302 directive was dished out to googlebot, an unbelievably dirty trick played to the detriment of the target site. As the go-php redirector kicked into action it also generated, in unison with the redirect, a META REFRESH pointing to Target Site, and this has a residual effect that will not go away no matter what you do: a solid HTML page whose sole purpose is to refresh in ZERO seconds to your site.
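A zero-second meta refresh is easy to spot once you have the page source in hand. The Python sketch below uses the standard library HTML parser to list every meta refresh in a page along with its delay and target. The sample page is an assumption about what such a generated page might contain, and the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects (delay_seconds, target_url) for every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0; url=http://www.yoursite.com/"
        delay, _, rest = (attr.get("content") or "").partition(";")
        rest = rest.strip()
        url = rest[4:].strip() if rest.lower().startswith("url=") else ""
        try:
            seconds = float(delay.strip())
        except ValueError:
            return  # malformed delay, ignore this tag
        self.refreshes.append((seconds, url))


def find_meta_refresh(html):
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.refreshes


# Hypothetical generated page.
PAGE = (
    '<html><head><meta http-equiv="refresh" '
    'content="0; url=http://www.yoursite.com/"></head><body></body></html>'
)

print(find_meta_refresh(PAGE))
```

Any entry with a delay of 0 and your own address as the target is the kind of page described above.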
Google’s databases now have a new URL waiting to be processed and it is <a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F">
Its LOCATION is [yoursite.com...]
One of the googlebots is given instructions to go fetch a snapshot of <a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F"> and that the location is [yoursite.com...]
The bot goes to find a short url version of [yousite.com...]; in other words it is looking for [yoursite.com...]. After its normal procedure to verify that the domain exists, the bot finds that it does not resolve to the www version, but it exists, so it proceeds to approach the apache server with the request. Its GET is answered with a 200, it takes a snapshot of the page, and it returns the info for indexing in google.
The title of your index page goes here. And its url is the blackhat's
The snippets of textual content go here.
the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F cache similar pages
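Why the listing above carries your title under the blackhat's URL can be shown with a toy model. Under HTTP/1.1 a 302 is temporary, so a client is expected to keep using the URL it originally requested, whereas a 301 tells it to switch to the new one. The Python sketch below applies only that rule to a made-up two-page web. It is an illustration of the reasoning in this thread, not a description of how google's crawler actually works, and every URL and name in it is hypothetical.

```python
HIJACK_URL = "http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F"

# Toy web: URL -> (status, Location header or None, page title or None).
WEB = {
    HIJACK_URL: (302, "http://yoursite.com/", None),
    "http://old.example.com/": (301, "http://yoursite.com/", None),
    "http://yoursite.com/": (200, None, "The title of your index page"),
}


def crawl(url, max_hops=5):
    """Follow redirects and return (url_the_page_is_listed_under, title)."""
    listed_as = url
    for _ in range(max_hops):
        status, location, title = WEB[url]
        if status == 200:
            return listed_as, title
        if status == 301:
            listed_as = location  # permanent move: the new URL replaces the old one
        # 302 is temporary: the originally requested URL is kept
        url = location
    raise RuntimeError("too many redirects")


print(crawl(HIJACK_URL))
print(crawl("http://old.example.com/"))
```

In the toy model the 302 leaves your content filed under the redirector's address, while the 301 hands the listing to your own URL.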
Assuming the patented duplicate content filter of google has detected that another page in the index has identical content, your site is in all sorts of trouble, and this process is totally unpredictable, with dire consequences for the established site. Google has never declared whether it can penalize a page that does not exist. The hijacked example above certainly does not exist; it was generated by manipulating a loophole in google that needs urgent modification.
Thousands of websites have disappeared from the google results because of the above procedure. A meta refresh is not always produced, but the results are often the same.
The variables are too immense to contemplate as to why very high pagerank sites are not affected. No amount of defence will protect your site from this kind of sabotage.
Neither .htaccess, robots.txt, NO-INDEX, nor any other known defense mechanism is able to stop the above process. Doing a 301 to resolve the short url will not help.
The past year has seen thousands of the short urls appear in google’s index with no title or content, and no snapshot to help find out what caused it. Google's patented duplicate content filter can be tied in very closely to this anomaly, given that the patent is not that old and the explosion in the number of short urls is almost the same age.
Most such redirection methods are not done with malice in mind.
But getting a few competitors to be demoted seems a simple procedure so long as google does not implement modifications to block the loophole.
Googlebot is a virtual camera.
Now you have an awesome tool that is destructive because google's patented duplicate content filter does not seem to work in harmony with the bot for the betterment of its results, but seems to favour the newer page that does not exist.
After the site removed their 302 redirect it was still listed as my site until today.
I figured that since nothing was pointing to their redirect code anymore that it would take a while to drop the duplicate page from the index so I submitted it via addurl last night.
This morning a search for "my unique text in site" no longer shows the redirect!
I'm still filter=0'd but I think that will correct itself in time.
Too bad that in my panic I managed to remove about 300 backlinks from my own site :(
I didn't take the time to read this whole thread, so maybe this has been mentioned already, but a lot of affiliates use redirects to hide their affiliate links, to stop all of the spyware on people's computers from stealing their commissions. An affiliate almost has to use a redirect nowadays if they want to get any commissions; there is no harmful intent. If they just use their regular affiliate link to the merchant, they will have a lot of their commissions stolen from them.
I only think the problem exists like japanese said, not when someone uses a 302 to link to your site but when they steal a snapshot of your site and try to say it's at some new location.
As for spyware stealing affiliate links,!?!, not heard of this one yet, please give a link to a thread or pm me a website with more info on how to deal with this.