Forum Moderators: Robert Charlton & goodroi


302 Redirects continues to be an issue

         

japanese

6:23 pm on Feb 27, 2005 (gmt 0)

10+ Year Member



recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low to midrange pagerank sites by causing googlebot to snap up a 302 redirect via scripts (php, asp, cgi etc.) supported by an unseen, randomly generated meta refresh page pointing to an unsuspecting site. In many cases the encroaching site actually writes your website's URL with a 302 redirect inside its server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies, which have your website info within their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted php-based redirection script, which causes a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
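For anyone without a header interrogation tool, the meta refresh part of this can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the class and function names are made up for illustration, and it looks only at the HTML you hand it, not at HTTP headers.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects (delay, target) pairs from <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute usually looks like "0; url=http://example.com/"
        delay, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:].strip().strip("'\"")
        self.targets.append((delay.strip(), rest))


def find_meta_refresh(html_text):
    """Return every meta refresh found in the given HTML source."""
    finder = MetaRefreshFinder()
    finder.feed(html_text)
    return finder.targets
```

Feed it the source of a suspect page; an empty list means no meta refresh was found in that particular response, which says nothing about what the page serves on the next request.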

Googlebot and MSNBOT follow these php scripts to either an internal sub-domain containing the 302 redirect or a serverside redirect, and “BANG”, down goes your site if it has a pagerank below that of the offending site. Your index page is crippled because googlebot and msnbot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, without the WWW, making detection harder. This also causes googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving google of deciding which of your site's two URLs to index higher (more often the one with the higher linked pagerank).
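The 301 for the short URL is normally set up in the web server configuration, but the logic behind it can be sketched in Python. The host names below are placeholders, and the function only works out the response; wiring it into a real server is left out.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"   # assumption: the version you want indexed
ALIAS_HOSTS = {"yoursite.com"}        # assumption: the short version to fold in


def canonical_redirect(url):
    """Return (301, canonical_url) for an alias host, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_HOSTS:
        return None
    canonical = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, canonical
```

The point of answering with a 301 rather than a 302 is that it tells the robot the short URL has permanently moved, so only one version of the home page should stay listed.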

Your only hope is that your pagerank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher pagerank sites within its system on the off chance that it strips at least one of the targets. This is further backed up by hundreds of other hidden 301 permanent redirects to pagerank 7 or above sites, again in the hope of stripping a high pagerank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big name affiliates are involved in this scam; they know it is going on, and google adwords is probably the main target of revenue. Though I am sure google does not approve of its adsense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months, only now it is escalating at an alarming rate. All sites of pagerank 5 or below are susceptible; if your site is 3 or 4 then be very alarmed. A skyscraper site need only create child page linking to get pagerank 4 or 5, without the need to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.

Trying to remove a link through google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within this timeline.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence it is a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages containing php, asp or cgi redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via google's URL removal tool. You only need a few seconds of 404 or 403 from the offending site for google’s url console to detect what it needs. Either the site or the damaging link.

I hope I have been informative and can help anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

theBear

3:55 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Now here is one way for Google to nail all of these.

They have the 302 page labeled as being in your domain but the url isn't.

Simply delete the entry from the database and don't allow any such entries to be placed in the database.

That takes care of the dup content part of the problem, within the site.

deanril

4:20 am on Mar 13, 2005 (gmt 0)

10+ Year Member



TheBear - do you possess a PHD? j/k

japanese

4:20 am on Mar 13, 2005 (gmt 0)

10+ Year Member



Brett,

Thanks for joining in.

kenmcd,

I have been asked by many innocent victims of this fiasco to try and help them. I am powerless. Totally powerless and have no idea how google's bots behave within the actual process of a 302 status code, especially with so many variant scripts.

This story is near impossible to turn into a Joe Public story.

Best I can offer is to raise some dust on behalf of confused webmasters and website owners.

I tested the water by suggesting a site dedicated to pump out redirects. I feel guilty just by saying it.

I was always under the impression that the 302 status code is a temporary holding directive to robots, meaning that the bots should continue to visit the redirecting url for any changes. You only used a 302 to point to an authorized site, to your own site or to an alternative site that you owned, not to somebody else's.

But it seems using the 302 as an alternative method of linking is now the name of the game. Wow think of it, 300, 301, 302 and 303 combinations. The internet is going to get complicated, very complicated for the average user and he is going to get swallowed up in a complex procedure to the point that his website is overwhelmed by the dexterous handlers of these status codes.

fischermx

5:39 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

I'm sorry, I gave up around msg #300 and I'm still not getting it about the 302 redirects.

Please, see:
I have a little directory, and I have a script in aspx to count the click-outs.
The script uses the standard Response.Redirect from the asp library; it looks like:
[mysite.com...]

If I examine the link with an HTTP viewer it says
HTTP Status Code: HTTP/1.1 302 Found
and the code generated by my redirect looks like this in the http viewer:

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href='http://www.theothersite.com/'>here</a>.</h2>
</body></html>
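What the HTTP viewer reports above can be reproduced in Python, and the same sketch can flag whether the redirect leaves the domain. The function names and the out.aspx example URL are invented for illustration. http.client does not follow redirects on its own, so it shows the raw 302 and its Location header.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_KINDS = {301: "permanent", 302: "temporary", 303: "see other", 307: "temporary"}


def describe_redirect(source_url, status, location):
    """Summarise one redirect response; returns None if it is not a redirect."""
    if status not in REDIRECT_KINDS or not location:
        return None
    target = urljoin(source_url, location)  # Location may be relative
    src_host = (urlsplit(source_url).hostname or "").lower()
    dst_host = (urlsplit(target).hostname or "").lower()
    return {
        "kind": REDIRECT_KINDS[status],
        "target": target,
        # Exact host comparison: www.example.com and example.com count as different.
        "cross_domain": src_host != dst_host,
    }


def fetch_status_and_location(url):
    """Issue one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Pass the result of fetch_status_and_location into describe_redirect along with the URL you requested. A click counter like the one described would come back as a temporary, cross-domain redirect.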

Now, my site is indexed by google and it has something like PR1. The sites I'm linking to are still indexed and well ranked on google.

So, where is the problem?

japanese

6:17 am on Mar 13, 2005 (gmt 0)

10+ Year Member



A Possible Solution.

If google adjusted its bots to ignore the LOCATION FIELD instruction in the 3** range of redirects and simply cached the code generated page, still at the redirect status, the visit to the location page would be null and void, thus avoiding any damage to an unsuspecting site.

THIS TRULY MUST BE THE ANSWER
=============================

In theory and in practice the above should always have been the case. The best a redirecting site could get would be a code generated page with a hyperlink to the target site. This is what a robot should do, not go to the target site. If a meta refresh is detected then googlebot should totally ignore it. Let the guy have his meta refresh.

The above would render any redirect harmless.
============================================
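To make the proposal concrete, here is a toy model in Python comparing the two behaviours. It is not how any search engine actually works; the "follow" branch only mirrors what posters in this thread report seeing, and every name in it is invented for illustration.

```python
def indexed_record(policy, redirect_url, status, location, stub_body, target_body):
    """Toy model: what ends up stored under redirect_url after one crawl.

    policy is "follow" (behaviour reported in this thread) or "ignore"
    (the proposal: disregard the Location field entirely).
    """
    is_redirect = 300 <= status < 400 and bool(location)
    if is_redirect and policy == "follow":
        # Reported problem: the target's content is listed under the redirecting URL.
        return {"url": redirect_url, "content": target_body}
    # Proposal: keep only what the redirecting URL itself returned.
    return {"url": redirect_url, "content": stub_body}
```

Under the "ignore" policy the redirecting URL can only ever be listed with its own "Object moved" stub, so it has nothing of the target's to compete with.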

The cost to innocent sites is now far beyond a joke.

If webmasters' inexperience and tactical hijacking using these 302 status codes are going to continue to be a problem with the bots, and continue to be ignored by google, then we must work together to find a sensible solution.

What do you think of the above? Simple and effective? Downright silly? Flawed?

steveb

8:02 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"not Yahoo, MSN, Jeeves, etc..."

As mentioned above, both Google and MSN have this problem.

crobb305

8:11 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"not Yahoo, MSN, Jeeves, etc..."

SteveB's response.....
As mentioned above, both Google and MSN have this problem.

Yahoo had this problem early in 2004. I recall them being concerned, with some questions posted by Tim and Yahoo_Mike in the Yahoo Forum. My site disappeared, as it did around the same time in Google. But with Yahoo, the issue was resolved within 12 weeks. Unfortunately, Google doesn't seem to care. They have the manpower and the intelligence to resolve the problem, and would have done so many moons ago if they really cared. Email after email to Google and to a special Google Groups location by many members have resolved nothing. Sad to say that those of us who have suffered from the 302 hijacking may never reappear in the serps.

Thanks to Brett and SteveB for chiming in. :)

Have a good weekend.
Chris

crobb305

9:05 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I suspect that at least some of the issues in this thread that people are blaming on 302s and scraper sites are probably more likely due to recent algo changes

At one point during the past 9 months, there were over 40 tracker2/302 redirects to my site. Searching for pages within my site using the site:mysite.com command was showing many of those redirect urls! Google was actually associating those 302s with my site and was listing them as part of my site evident in the site:mysite.com search. This is proof of the problem. If Google lists 20 UNRELATED urls as being part of my site then there is clearly a bug that needs to be resolved. Again, as I just stated above, they are very very aware of this problem.

BTW, for newbies here, the site: command is supposed to show ONLY pages that are really and truly part of your site (i.e., home page, internal pages, etc.). So, if Google lists other urls that are NOT part of your site, then there is a problem.
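If you copy the URLs from a site: listing into a list, the foreign ones can be picked out automatically. A minimal sketch follows; the function names and example URLs are made up, and it compares host names only.

```python
from urllib.parse import urlsplit


def belongs_to(url, domain):
    """True if the URL's host is the domain itself or one of its sub-domains."""
    if not url.startswith(("http://", "https://")):
        url = "http://" + url  # listings often show URLs without a scheme
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def foreign_urls(results, domain):
    """Return the listed URLs that are not really part of the given domain."""
    return [url for url in results if not belongs_to(url, domain)]
```

Anything this returns for a genuine site: listing is a URL on somebody else's server being shown as part of your site.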

Yes, 302s have existed for years, but only recently did they start to pose a problem to Google.

C

claus

9:43 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> as long as that goto.php SCRIPT still is installed, the link to your site will still work and Google will see it as a live page.

jk3210, that's a very important point. As long as the link is in Google's database it will get spidered. One time there, always there - unless removed by URL-console or returning a 404 or 410 for a sustained period of time. And, as long as the script works it will have the desired effect.

So, it is not even enough to get the link to the script removed from a page, you must make sure that the script no longer works for your URL. Or, that it returns a 404, a 410, or a page with this meta tag:

<meta name="robots" content="noindex">
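For a directory owner who wants to honour a removal request, the change to the redirect script is small. Here is a sketch in Python of a goto-style handler that answers 410 for opted-out sites instead of redirecting; the handler name, the opt-out list and the return format are assumptions for illustration, not anybody's actual script.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical opt-out list: hosts whose owners asked to be removed.
REMOVED_HOSTS = {"yoursite.com", "www.yoursite.com"}


def goto(path_param):
    """Return (status, headers, body) for a request like goto.php?path=<path_param>."""
    target = unquote(path_param)
    if "://" not in target:
        target = "http://" + target
    host = (urlsplit(target).hostname or "").lower()
    if host in REMOVED_HOSTS:
        # 410 Gone tells robots this script URL is dead for good.
        return 410, {}, "Gone"
    return 302, {"Location": target}, ""
```

With this in place the old script URL stops working for the removed site even if links to it are still lying around in a search engine's database.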

>> Email after email to Google and to a special Google Groups location by many members have resolved nothing

It's true that we don't even know for sure if Google (or MSN) is working on solving this problem or not. I do think that at least Google is, as otherwise there would be no reason to ask us to:

send examples to webmaster (at) google.com with "canonicalpage" (all as one word) in the email title

(quoted from memory, might not be verbatim)

As for why Yahoo could solve it quickly while Google does not seem to be able to, i think there's a difference between how these two firms organize their data. It might in fact be very complicated although it seems easy on this side of the table, we just don't know unless we work there.

It would still be nice to get some sort of semi-official indication that they were actually working on "something" (or even "thinking about some possible improvements to ...whatever"). I don't consider it likely that we will get such an indication i must say.

I personally hope that if google does find a solution, the solution will be published just like Yahoo! did, so that webmasters can see how the various kinds of redirects are interpreted. I even remember Yahoo! asking for comments on their set of rules, which was a very nice move. I don't remember if i personally had any comments (if i had i might have disagreed a little), but i think Yahoo has really taken the webmaster community seriously here.

Still, we need to continue to push this issue so that more webmasters (and Search Engines) become aware of it. Even though Yahoo! has got a solution in place (and kudos to Yahoo for that), we still need Google and MSN to do something about this issue.

The ideal situation would of course be that all three major engines (+ Ask of course) came together on this issue and coordinated their efforts, so that we would know that these techniques were interpreted the same way in all these engines. While this would be nice, it would probably also take ages and involve a lot of red tape, so i guess it's not very likely at present.

stargeek

10:51 am on Mar 13, 2005 (gmt 0)

10+ Year Member



I don't consider it likely that we will get such an indication i must say.

Why do you think this is? Is it perhaps that in its current form this "problem" is really a positive for their bottom line? Every time a website loses free traffic there's a good chance they'll get an adwords customer.

activeco

11:04 am on Mar 13, 2005 (gmt 0)

10+ Year Member



Aside from a possible solution from Google side and I am sure there are many of them (a simple one would be introducing "referrer" header from a bot), what I don't understand is the logic behind their current 302 behavior.
According to RFC2616 all 3xx redirects are basically submissions of itself to somewhere else.
It is not that you are entitled to something with 302, you pass your "credentials" to another place.
In other words if someone does 302, all his links, PR etc. should be passed to target place and not otherwise.
If target place already has more value, then keep it that way or even add source value to it.
301 is handled correctly, why not 302 too?

Reid

11:48 am on Mar 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is not that you are entitled to something with 302, you pass your "credentials" to another place.
In other words if someone does 302, all his links, PR etc. should be passed to target place and not otherwise.

Intro pages. Most intro pages use a META refresh tag, which gets handled like a 302 redirect, once the intro page has checked for browser queries etc. Remember there are many reasons for intro pages: checking for WAI issues such as screen resolution, voice browser or braille, and also flash or shockwave.

If a site uses Meta refresh on its home page, Google likes to send the browser to the intro rather than deeper within the site, bypassing browser queries and risking a misinterpreted WAI compliant page.
The real problem is when there is a cross-domain redirect.

It is not Google's misinterpretation of 302, it is webmasters' misuse of it. A link to another domain should not pass a server code at all - the browser is leaving that domain.

So Google really needs to sort out the domains to counteract webmasters' common misuse of 302 redirects.

also thebear you are right - sites are being poisoned by these 302's, which is bad too - I was just trying to distinguish between the common innocent 302's and the purposefully sinister type.

I had a 302 split my site in half by pointing to the index page of my photo gallery. site:mysite showed the 302 link and my site without the photo gallery which was found in Links:mysite associated with that same url.

Bobby

11:56 am on Mar 13, 2005 (gmt 0)

10+ Year Member



Aside from a possible solution from Google side and I am sure there are many of them (a simple one would be introducing "referrer" header from a bot)

activeco, these are my thoughts too (hmm...maybe we've stumbled upon a new form of hijacking - thoughtjacking).

A referrer string from the bot would allow each webmaster to manually block the bot originating from known hijackers, a sort of after the crime solution.

I don't know the complexities involved with applying a referrer header for Google (though I suspect it's unlikely they would ever do so) but it seems to me that the crime must be prevented rather than corrected.

Assuming they did apply a referrer header it would require each webmaster to first deduce which domains are hijackers and then apply changes, something that just ain't gonna happen in most cases.
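Purely as a thought experiment, the webmaster's side of that could look like the Python sketch below. It assumes something that does not exist today, namely a bot sending a Referer header naming the page whose link it followed; the blocklist and function name are invented.

```python
from urllib.parse import urlsplit

# Hypothetical list of known hijacking hosts, maintained by hand.
BLOCKED_REFERRER_HOSTS = {"hijacker.example"}


def response_status(user_agent, referer):
    """Return 403 for a bot arriving via a blocked host, otherwise 200."""
    host = (urlsplit(referer or "").hostname or "").lower()
    is_bot = "googlebot" in (user_agent or "").lower()
    if is_bot and host in BLOCKED_REFERRER_HOSTS:
        return 403
    return 200
```

As said above, this only helps after the hijacker has been identified, so it is a clean-up tool rather than prevention.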

What we really need is to work together with knowledgeable representatives at Google thru this or some other forum to find a solution that is practical to all.

activeco

12:00 pm on Mar 13, 2005 (gmt 0)

10+ Year Member



A link to another domain should not pass a server code at all - the browser is leaving that domain.

True and I didn't say that.
What I said is that Google's internal measures of a site such as PR or links should be passed along to the target page, if the source page is the more valuable of the two.

Bobby

12:33 pm on Mar 13, 2005 (gmt 0)

10+ Year Member



Searching for pages within my site using the site:mysite.com command was showing many of those redirect urls!

That's amazing! I've never seen that before. How long did that situation remain?
