Forum Moderators: Robert Charlton & goodroi


302 Redirects continue to be an issue

         

japanese

6:23 pm on Feb 27, 2005 (gmt 0)

10+ Year Member






It is now 100% certain that any site can destroy low- to mid-range PageRank sites by causing Googlebot to snap up a 302 redirect via scripts (PHP, ASP, CGI, etc.) supported by an unseen, randomly generated meta refresh page pointing to an unsuspecting site. In many cases the encroaching site actually writes your website's URL as the Location of a 302 redirect on its own server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website info in their databases feed your page snippets, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
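As a rough illustration of what a header interrogation tool looks for, here is a minimal sketch in JavaScript. It assumes you have already fetched the suspect URL without following redirects and have the status code, the response headers (with lower-cased names) and the body in hand. The function name `describeRedirect` and the shape of its result are made up for this example.

```javascript
// Sketch only: classify one HTTP response the way a header-checking tool might.
// `status`, `headers` (lower-cased names) and `body` are supplied by the caller.
function describeRedirect(status, headers, body) {
  const findings = [];

  // A 3xx status with a Location header is a server-side redirect (302, 301, ...).
  if (status >= 300 && status < 400 && headers.location) {
    findings.push({ kind: 'http-' + status, target: headers.location });
  }

  // Look for <meta http-equiv="refresh" content="0; url=..."> in the markup.
  const metaTags = body.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    if (!/http-equiv\s*=\s*["']?refresh/i.test(tag)) continue;
    const m = tag.match(/url\s*=\s*['"]?([^'">\s]+)/i);
    if (m) findings.push({ kind: 'meta-refresh', target: m[1] });
  }

  return findings;
}
```

If either kind of finding points at your own domain from a host you do not control, that is the pattern described above. The regular expressions are deliberately simple and will miss unusual markup, so treat an empty result as "nothing obvious", not as proof.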

Googlebot and MSNBot follow these PHP scripts either to an internal sub-domain containing the 302 redirect or to a server-side redirect, and “BANG”, down goes your site if it has a PageRank below that of the offending site. Your index page is crippled because Googlebot and MSNBot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, and thus an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, without the WWW, which makes detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google of deciding which of your site's two URLs to index higher (more often the one with the higher linked PageRank).
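The 301 fix for the www / non-www split can be sketched as below. This is only an illustration in JavaScript of the decision a server has to make, not a drop-in configuration: `CANONICAL_HOST` and the function name `canonicalRedirect` are assumptions for the example, and it assumes the server answers for this one site only.

```javascript
// Sketch: decide whether a request should be answered with a 301 to the canonical host.
// CANONICAL_HOST is an assumed example value; use your own preferred hostname.
const CANONICAL_HOST = 'www.yoursite.com';

function canonicalRedirect(hostHeader, path) {
  // Normalise the Host header: lower-case it and drop any ":port" suffix.
  const host = (hostHeader || '').toLowerCase().replace(/:\d+$/, '');
  if (host === CANONICAL_HOST) return null; // already canonical, serve the page

  // Any other hostname (e.g. the short "yoursite.com") gets a permanent redirect.
  return { status: 301, location: 'http://' + CANONICAL_HOST + path };
}
```

In a request handler you would call `canonicalRedirect(hostHeader, path)` first and, if it returns a result, send that status with a `Location` header instead of the page. Most people would do the same thing in their web server's rewrite rules; the logic is identical.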

Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher PageRank sites within its system on the off chance that it strips at least one of them. This is compounded by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high PageRank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target for revenue. Though I am sure Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting GOOGLE or MSN, because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is 3 or 4, then be very alarmed. A skyscraper site need only create child-page links to get PageRank 4 or 5, without having to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.

Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google's index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that period.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that serves dynamically generated pages containing PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google's URL console to detect what it needs: either the site or the damaging link.

I hope this has been informative and helps anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

hdpt00

12:13 am on Mar 12, 2005 (gmt 0)



I think the only thing google can do (and should do) about this is treat cross-domain 302's differently.
Sure there are a lot of people with multiple domains but they would just have to adapt, at least they have control of their own domains.

I like this idea and it is something that could be easily implemented I imagine. On a further update I got a canned response after complaining about three (3) 302s that said follow the google guidelines for good site design. They think I'm mental or something, that is an insult to send me that. I would have preferred no response.

mblair

1:37 am on Mar 12, 2005 (gmt 0)

10+ Year Member



I've been reading through this thread with interest and am still trying to get my hands around it. What I am confused about is: wouldn't some kind of standards-based or Google-proprietary extension of the robots.txt protocol help to achieve hijacking protection?

Imagine if search engines by default ignored all 302 redirects entirely, unless there was an entry in robots.txt at the destination of the redirect that specifically permitted spidering via a listed set of 302-redirecting domains?

Something like:

302Authorized: www.valid-redirecting-domain.com

I apologize if this has already been shot down in this thread -- I read through it and didn't pick up on it.
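To make the proposal concrete, here is a minimal sketch in JavaScript of how a crawler might apply it. Note that `302Authorized` is purely the hypothetical directive suggested in this post; it is not part of any robots.txt standard and no search engine is known to support it. The function names are invented for the example.

```javascript
// Hypothetical: "302Authorized" is the directive proposed in the post above,
// not a real robots.txt field. This only illustrates how it could be evaluated.
function authorizedRedirectors(robotsTxt) {
  const hosts = new Set();
  for (const line of robotsTxt.split(/\r?\n/)) {
    const m = line.match(/^\s*302Authorized\s*:\s*(\S+)/i);
    if (m) hosts.add(m[1].toLowerCase());
  }
  return hosts;
}

// Decide whether a crawler should follow a 302 from redirectingUrl to targetUrl,
// given the robots.txt text fetched from the *target* site.
function shouldHonor302(redirectingUrl, targetUrl, targetRobotsTxt) {
  const from = new URL(redirectingUrl).hostname.toLowerCase();
  const to = new URL(targetUrl).hostname.toLowerCase();
  if (from === to) return true; // same-host redirects are unaffected
  return authorizedRedirectors(targetRobotsTxt).has(from);
}
```

The point of the design is that the decision sits with the destination site: a cross-domain 302 counts only if the site being pointed at has listed the redirecting host.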

japanese

2:15 am on Mar 12, 2005 (gmt 0)

10+ Year Member



mblair,

Your suggestion, like all the suggestions made in this thread, is better than Google's.

Google has rarely said anything about how its bots handle 302 redirects, except in rare cases at seminars and press gatherings, where they proudly and sonorously pontificate that their bots are a marvel of human engineering of gargantuan proportions, rivalling any of the eight wonders of the world, or at least they mean it that way. Nor has it indicated how its bot behaves when it meets a 302. In particular, it has never stated what happens immediately after its bot passes through a server-side 302.

Does it for sure deposit the instruction at a database/centre? Does it go directly to the target, and if so, what will it decide and do? Will it uphold the hijacker's URL, take a snapshot of your index page and attribute the contents to the hijacker?

This and many more questions remain unanswered.

For sure, and without a shadow of a doubt, you can manipulate Google's bots by simply pointing your own internal link at your chosen redirector script. There, with Googlebot carrying your internal URL, you can indeed instruct Googlebot that the URL it is carrying is the one to keep, but that the contents of the URL temporarily reside at another site.

The stupid bot is now a virtual Polaroid camera at the behest of the hijacker. It will assign the contents of the location page to the hijacker's URL. Hence the contents of your index page showing in Google's cache under the hijacker's URL.

To add salt to the wounds, Google's so-called patented filter now has to apply a penalty. Google has never disclosed whether its algo is capable of applying a penalty to a page that does not exist, in this case the snapshot of your index page that the hijacker has produced. It does not take a rocket scientist to work this one out. The penalty can only be applied to a figure in hand that is capable of absorbing a reduction: your actual index page.

I am now beginning to think that more money is being lost by websites than Google is worth on the stock exchange. The number of sites in oblivion is far greater than we can imagine. How many website owners understand how a website hijacking works?

surfgatinho

2:39 am on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google did a rollback yesterday (march 10) see posts in this thread #198 to #270

So I didn't just imagine that.
I have a fairly clear idea of where my sites "should be" and they were there momentarily but are gone again.
I'm sick of being told my SEO skills are lacking or it must have been something I did to offend.
Can anyone else confirm this?!

japanese

2:53 am on Mar 12, 2005 (gmt 0)

10+ Year Member



"""""OH MY GOD"""""

I cannot believe my eyes. Google has really done some very naughty things indeed.

Another reader/member asked me to look at their site.

A URL that should not be in Google's index contains the poor chap's index page.

It was produced in Google by exactly the method I describe in this thread's initial post. This is a true hijacking of another site's page, possibly intentional, possibly unintentional, but certainly inconsiderate and disgraceful, and it is now 100% evidenced, cast-iron fact.

I hope the poor chap makes a post here and discloses his plight. He has truly been hijacked, and it will not be long until he gets a penalty for duplicate content.

yankee

3:25 am on Mar 12, 2005 (gmt 0)

10+ Year Member



How can I tell if this is happening to my sites?

Emmett

3:30 am on Mar 12, 2005 (gmt 0)

10+ Year Member




How can I tell if this is happening to my sites?

Do an "allinurl:yoursitename" search on google. Then look for sites that aren't your domain name that have your content in the cache.
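If the result list is long, the first pass can be automated. The sketch below, in JavaScript, assumes you have copied the result URLs into an array yourself (it does not query Google); the function names `foreignListings` and `safeDecode` are made up for this example. It keeps only the listings that sit on someone else's host but carry your domain somewhere in the path or query string.

```javascript
// Decode percent-escapes where possible; fall back to the raw text if malformed.
function safeDecode(text) {
  try { return decodeURIComponent(text); } catch (e) { return text; }
}

// Sketch: from a list of search-result URLs, return those hosted elsewhere
// that embed `myDomain` in their path or query (e.g. goto.php?path=yoursite.com%2F).
function foreignListings(resultUrls, myDomain) {
  const mine = myDomain.toLowerCase().replace(/^www\./, '');
  return resultUrls.filter((raw) => {
    // Result listings are often shown without a scheme; add one so URL can parse them.
    const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(raw) ? raw : 'http://' + raw;
    let parsed;
    try { parsed = new URL(withScheme); } catch (e) { return false; }

    const host = parsed.hostname.toLowerCase();
    const isMine = host === mine || host.endsWith('.' + mine);
    const rest = safeDecode(parsed.pathname + parsed.search).toLowerCase();
    return !isMine && rest.includes(mine);
  });
}
```

Anything this returns is only a candidate. You still have to open the cached copy of each listing and see whether it shows your content, as described above.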

yankee

3:40 am on Mar 12, 2005 (gmt 0)

10+ Year Member



Thanks...it is happening to me. This explains why my traffic dropped 90%. What can I do to fix this?

Emmett

3:47 am on Mar 12, 2005 (gmt 0)

10+ Year Member



yankee,

Have a look at post #310 and some others in this thread.

Trawler

3:55 am on Mar 12, 2005 (gmt 0)

10+ Year Member



Quick Note on 302's

I have quite a few that I have setup to direct traffic. I have watched them for months.

Google's bot picks up the parent domain (the "302e") either from existing links on the web or the whois, follows it to the target, and indexes the target page under the "302e" domain. The target gets the penalty, and here is the kicker: if the "302e" has good backlinks, the now stolen page ranks at the top.

This is insane; it just goes to show how screwed up they are. They need to get off their kick that just because you have a lot of links you're great. What about site content relevant to the user?

Most users could care less who links to you, they only care that you provide what they are seeking. If you don't, they walk with their mouse.

GOOGLE IS BROKEN and will remain so until they simply treat a link for what it really is. A shortcut to another destination. Nothing more. No big pie in the sky about this site is better than that site because they got links.

Here is a simple example. I have a domain that has 18 backlinks from a PR8 site. It points to a one-page site of mine in an area totally different from the site linking to my domain. I can hold the number 1 spot out of 3 million for any search term I want.

If I were to point that domain at anyone's domain, I could hit the top for anything out there.

The links? Oh I forgot to tell you, they are from pages that were published in 1996. By a person who dropped out and traveled the world on a motorcycle writing his memoirs. Apparently, he was good because the links stick.

Now if that isn't just crazy I don't know what is.

Trawler

tombo

3:55 am on Mar 12, 2005 (gmt 0)

10+ Year Member



Hello,

The following method of implementing 302 redirects seems to work well for us.

The stats/redirect script is called using the onClick javascript method and the href attribute of the anchor tag has a direct link. The spiders will follow the direct link but javascript enabled browsers will follow the redirect.

We originally implemented this so that the spiders would not create false clicks.

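<script language="JavaScript">
<!--
// Record the click via the stats script in a new window.
// Spiders ignore onClick and follow the plain href on the link instead.
function redirect(href)
{
var nw = window.open('http://www.mysite.com/cgi-bin/stats.cgi?where=' + encodeURIComponent(href));
}
// -->
</script>

<a onClick="redirect(this.href); return false;" href="http://www.sitelink.com">click to visit website</a>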

Tom

Ledfish

4:06 am on Mar 12, 2005 (gmt 0)

10+ Year Member



I think it admirable that so many are trying to find a solution to this problem, and I hope that if you do, you make it freely available to all who want to implement it, so that those who use these redirects for their own financial greed at the expense of others get a taste of the wrath of PO'd webmasters.

As far as Google solving it, I doubt it, because if your site has been hijacked, then while you are waiting for a solution, from Google or otherwise, your only way to keep your site alive is to spend on AdWords. Of course that helps Google meet its Wall Street demanded revenue goals, so in short, it's bad for us and it's good for Google. So why would they want to solve a problem that is in essence helping increase their revenue?

After all, Google has been aware of this problem for quite some time, and is probably aware of how devastating it would be. Do you really think the brain trust at the Plex couldn't solve this problem or at least minimize it?

metrostang

4:35 am on Mar 12, 2005 (gmt 0)

10+ Year Member



allinurl:www.mysite.com yields over 19,000 pages indexed by Google. I don't see any pages other than mine until I get to 1,000.

The page listed as 1,001 is in this format: theothersite.com/cgi-bin/fwd.cgi?http://www.mysite.com+42590+h+8
This other site is an auction site that has no working links and appears to not have been updated since 2003.

No other pages are listed after that page. When I click on the link, it takes me to my home page. The source code appears to be all mine and my URL shows. What does this mean? Have I been hijacked?

AndyA

4:46 am on Mar 12, 2005 (gmt 0)

10+ Year Member



Google should be ashamed of its results pages right now. In most instances, I find directories, useless sites that are basically link mazes or traps, where you just keep clicking on links but never seem to actually *GO* anywhere! If we're to believe Google's results are relevant, sites that mention another by name are more relevant than the actual site being searched for!

For instance, I'm searching for the history of blue widgets, and I find a site on waste disposal that mentioned a blue widget was once found in the waste. That waste site ranks higher than the blue widget history site! Not relevant.

Google is a puppet for spammers right now, and their results are worthless. Sites with hidden text are running rampant, as are sites that cloak. I'm finding a lot of caches that don't match the page. How long before the stockholders discover this as well and start dumping their Google stock?

It's really shameful that Google has allowed themselves to be compromised to this level. Where is the integrity? Where is the desire to provide the best search results in the world? Google has surely dropped the ball...

[Spelling corrections - I must be tired]

[edited by: AndyA at 4:49 am (utc) on Mar. 12, 2005]

johnafrid

4:46 am on Mar 12, 2005 (gmt 0)

10+ Year Member



I have something strange happening to my site as well. I have about 35,700 pages indexed by Google. After seeing this thread, I ran an allinurl:yoursitename search and all my pages were indexed as bare links without any details. After checking again 4 hours later, I have about 400 pages back, but the rest are still the same old links..

If anyone can help by telling me what's going on, that would be really nice..please check the site in my profile.

This 713 message thread spans 48 pages: 713