302 Redirects continues to be an issue


japanese - 3:04 pm on Mar 10, 2005 (gmt 0)


HIJACKING IS INDEED GOING ON

Please excuse the rushed explanation; I will post a mega example in more detail soon. This one is for Joe Public.

Let me explain again how it is done, this time in layman's terms.

1. A blackhat reciprocates your link with a piece of syntax on his page. At first glance it may look to you something like this:

<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F">Target Site</a>

Do not be fooled into thinking the above URL is a link that points to your website. IT DOES NOT... Think.

Look at where the href begins: it points to the blackhat webmaster's domain. The link will look totally innocent to you when viewed in a browser. Now take another look at what comes after the .com/... This is the killer: it points to a variant of the NukeModule GO-PHP REDIRECTOR, a stoic and completely merciless script that can easily be modified and optimized to wreak havoc on googlebot.

Look closer still and you will see target-site as the apparent destination of the innocent looking link. Wrong again, that is cosmetic. The version below will do exactly the same job as the one above.

<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2FI-AM-GOING-TO-DESTROY-YOUR-SITE-BECAUSE-I-WANT-YOU-DESTROYED-IN-GOOGLE-AND-YOU-CAN-KISS-YOUR-RANKING-GOODBY-HA-HA-HA-HA-HA%2F">Target Site</a>

There is no difference at all between the two URLs above. They will both look like this in a browser:

Target Site

The 2 URLs above are identical so far as an end user is concerned. They are totally harmless when an end user browses the page and totally harmless when clicked. But it is a trip wire... Don't forget... IT IS A TRIP WIRE FOR ROBOTS, especially for googlebot and msnbot.

Your browser does not collect information on a website to present to google's databases, and your browser does not make an exact copy of the page, limited to 101K in size, the way the bot does. So no harm can come of the above links when people click away at them all day long.
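For anyone who wants to check a link page themselves, here is a minimal sketch in Python (standard library only) that pulls each link apart the way described above: the visible text, the host the href really hits, and the decoded query string. The names AnchorCollector and inspect_links are my own for illustration, and the second sample link uses a shortened made-up query in place of the long one above.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, unquote


class AnchorCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def inspect_links(html):
    """Return (visible text, host the href points at, decoded query) per link."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    report = []
    for href, text in parser.links:
        parts = urlsplit(href)
        # The query in these links has an empty key ("?=..."), so drop the "=".
        decoded = unquote(parts.query.lstrip("="))
        report.append((text, parts.hostname, decoded))
    return report


page = (
    '<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F">Target Site</a>'
    '<a href="http://the-killer-site.blackhat.com/go-php?=%2F%2FANYTHING-AT-ALL%2F">Target Site</a>'
)
for text, host, decoded in inspect_links(page):
    print(text, "|", host, "|", decoded)
```

Both links report the same visible text and the same host, the blackhat's. Only the query string differs, and the end user never sees it.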

THE TRIP WIRE
The blackhat webmaster has tweaked his go-php redirector to have no mercy on the site it points to, by creating conditions that confuse googlebot through the server-side response.

A link is placed on another strategic page, pointing to the blackhat's page that contains the 2 links above. And do not forget, the 2 URLs above are not links that point out. No sir, do not be confused into thinking they are outbound links... The 2 URLs above point internally, to where the blackhat's go-php redirector resides. It is here that a surreptitious and sinister script goes to work. The trip wires for googlebot are the innocent looking links that read "Target Site" to the average end user. But when googlebot follows one, the script goes into action.
The go-php redirector tells googlebot that this is the target site's URL... [the-killer-site.blackhat.com...]

The go-php redirector tells googlebot, via the server's response headers, that the location of the URL is temporary and that the content currently resides at the address given in the Location header. Hence the 302 header information.

LOCATION= [yoursite.com...]
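To make the header exchange concrete, here is a rough Python sketch that reads the head of an HTTP response and reports what kind of redirect it is. The sample response is written by hand for illustration, not captured from any real redirector, and the Location value simply reuses the www.yoursite.com placeholder. The function names are my own.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status code, headers dict)."""
    lines = raw.split("\r\n")
    # Status line looks like "HTTP/1.1 302 Found"; the code is the second field.
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def describe(raw):
    """Say in plain words what the response tells a crawler to do."""
    status, headers = parse_response_head(raw)
    if status == 302 and "location" in headers:
        return "temporary redirect to " + headers["location"]
    if status == 301 and "location" in headers:
        return "permanent redirect to " + headers["location"]
    return "status %d, no redirect" % status


# Hand-written sample of the kind of reply described above (an assumption).
sample = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://www.yoursite.com\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)
print(describe(sample))  # temporary redirect to http://www.yoursite.com
```

The point is the word "temporary": a 302 tells the bot the requested URL is still the one to remember, and that the content merely lives at the Location address for now.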

Remember this... "Target Site"... Its real syntax should have been <a href="http://www.yoursite.com">Target Site</a>

So now googlebot has enough information to leave the blackhat's site and deposit what it gathered in a temporary holding place at the googleplex, or wherever it dumps such info. All of this occurred in a split second.

But WAIT. The doom of the Target Site has not yet been sealed. Another, more sinister event took place at the same moment the 302 directive was dished out to googlebot, an unbelievably dirty trick enacted to the detriment of the target site. As the go-php redirector kicked into action, it also generated, in unison, another deceptive device: a META REFRESH pointing to Target Site. That is, a solid HTML page whose sole purpose is to refresh in ZERO seconds to your site. This is a residual effect that will not go away no matter what you do.
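If you want to test a suspect page body for that kind of zero-second refresh, something along these lines would do it. This is only a sketch using Python's standard html.parser; the sample body is my own guess at what such a page might contain, not output taken from the actual go-php script, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Record (delay in seconds, target URL) for each <meta http-equiv="refresh">."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "0;url=http://example.com"
        delay, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:]
        try:
            seconds = float(delay.strip())
        except ValueError:
            return  # malformed content attribute, ignore it
        self.refreshes.append((seconds, rest.strip("'\" ")))


def zero_second_refreshes(html):
    """Return the target URLs of every meta refresh with a delay of zero."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return [url for seconds, url in finder.refreshes if seconds == 0]


# Assumed example body: a bare page that refreshes straight to the target.
body = (
    '<html><head>'
    '<meta http-equiv="refresh" content="0;url=http://www.yoursite.com">'
    '</head><body></body></html>'
)
print(zero_second_refreshes(body))  # ['http://www.yoursite.com']
```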

Google’s databases now have a new URL waiting to be processed and it is <a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F">

Its LOCATION is [yoursite.com...]

One of the googlebots is given instructions to go fetch a snapshot of <a href="http://the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F"> and that the location is [yoursite.com...]

The bot goes to find a short url version of [yoursite.com...], in other words it is looking for [yoursite.com...]. After its normal procedure to verify that the domain exists, the bot finds that the short url does not resolve to the www version, but it does exist. So the bot proceeds to approach the Apache server with a GET request, is given a 200 response, takes a snapshot of the page and returns the info for indexing in google.

RESULT
The title of your index page goes here. And its url is the blackhat's
The snippets of textual content go here.
the-killer-site.blackhat.com/go-php?=%2F%2Ftarget-site.com%2F cache similar pages

If google's patented duplicate content filter detects that another page in the index has identical content, your site is in all sorts of trouble. The process is totally unpredictable, with dire consequences for the established site. Google has never declared whether it can penalize a page that does not exist. The hijacked page in the example above certainly does not exist; it was generated by exploiting a loophole in google that urgently needs fixing.

Thousands of websites have disappeared from the google results because of the above procedure. A meta refresh is not always produced, but the results are often the same.
There are too many variables to work out why very high pagerank sites are not affected. No amount of defence will protect your site from this kind of sabotage.

Neither .htaccess, robots.txt, NOINDEX nor any other known defence mechanism is able to stop the above process. Doing a 301 to resolve the short url will not help.

The past year has seen thousands of these short urls appear in google's index with no title or content, and no snapshot to help find out what caused it. Google's patented duplicate content filter may be closely tied to this anomaly: the patent is not that old, and the explosion in the number of short urls is almost the same age.


Thread source:: http://www.webmasterworld.com/google/28329.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com