Forum Moderators: Robert Charlton & goodroi


302 Redirects continue to be an issue

         

japanese

6:23 pm on Feb 27, 2005 (gmt 0)

10+ Year Member



recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low to midrange PageRank sites by causing Googlebot to snap up a 302 redirect, served via scripts such as PHP, ASP and CGI and supported by an unseen, randomly generated meta refresh page pointing to an unsuspecting site. In many cases the encroaching site actually writes your website's URL into a 302 redirect on its own server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies, which hold your website info within their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted PHP-based redirection script, which causes a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
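A minimal Python sketch of the kind of check a header interrogation tool performs, assuming you already have the status code, headers and body of a response that was fetched without following redirects. The function name `inspect_response` and the returned dictionary layout are hypothetical, and the meta refresh pattern only covers the common case where `http-equiv` comes before `content`.

```python
import re

# Forgiving pattern for <meta http-equiv="refresh" content="0; url=...">.
# Assumption: http-equiv appears before content inside the tag.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
    r'content\s*=\s*["\']?\s*\d+\s*;\s*url\s*=\s*([^"\'>\s]+)',
    re.IGNORECASE,
)

REDIRECT_CODES = (301, 302, 303, 307, 308)


def inspect_response(status, headers, body=""):
    """Summarise how a single response redirects, if at all."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    if status in REDIRECT_CODES and "location" in lowered:
        kind = "permanent" if status in (301, 308) else "temporary"
        return {"type": "http-" + kind, "status": status,
                "target": lowered["location"]}

    # No HTTP-level redirect: look for a meta refresh hidden in the page.
    match = META_REFRESH.search(body)
    if match:
        return {"type": "meta-refresh", "status": status,
                "target": match.group(1)}

    return {"type": "none", "status": status, "target": None}
```

To feed it live data you would need a client that does not follow redirects on its own; Python's `http.client` behaves that way, returning the 302 response itself rather than the page it points to.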

Googlebot and MSNBot follow these PHP scripts either to an internal sub-domain containing the 302 redirect or to a server-side redirect, and “BANG”, down goes your site if its PageRank is below that of the offending site. Your index page is crippled because Googlebot and MSNBot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped, without the WWW, making detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short-URL problem, relieving Google of deciding which of your site's two URLs to rank higher (more often than not, the one with the higher linked PageRank).

Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of them. This is reinforced by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target for revenue. Though I am sure Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact, a hidden WHOIS record and no telephone number. Even if you do manage to contact them, you will find in most cases that the owner or webmaster cannot remove your links from their site because the feeds come from affiliate databases.

There is no point in contacting GOOGLE or MSN, because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible, and if your site is a 3 or 4 then be very alarmed. A skyscraper site need only create child-page linking to get PageRank 4 or 5, without needing to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.

Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.
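A small Python sketch of how a link of that shape can be recognised, assuming the redirect script carries the destination in a query parameter as in the `goto.php?path=` example above. The parameter names in `CANDIDATE_PARAMS` (other than `path`), the function names and the `example.org` host in the usage note are hypothetical; real scripts may use anything.

```python
from urllib.parse import urlsplit, parse_qs

# "path" comes from the example link; the others are guesses at common names.
CANDIDATE_PARAMS = ("path", "url", "goto", "target")


def _bare_host(host):
    """Lower-case a host name and drop a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def embedded_targets(link, params=CANDIDATE_PARAMS):
    """Return the decoded destinations carried in a redirect-script link."""
    values = parse_qs(urlsplit(link).query)  # parse_qs also decodes %2F etc.
    found = []
    for name in params:
        found.extend(values.get(name, []))
    return found


def points_at(link, my_host):
    """True if any embedded destination is on my_host, with or without www."""
    wanted = _bare_host(my_host)
    for target in embedded_targets(link):
        # Drop an optional scheme, then keep only the host part.
        host = target.split("://", 1)[-1].split("/", 1)[0]
        if _bare_host(host) == wanted:
            return True
    return False
```

For instance, `embedded_targets("example.org/goto.php?path=yoursite.com%2F")` gives back the decoded destination `yoursite.com/`, and `points_at` matches it against either the www or the stripped form of your host, since these scripts tend to use the stripped one.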

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence it is a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages carrying PHP, ASP or CGI redirect scripts. A SKYSCRAPER site that is fed in this way can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google’s URL console to detect what it needs: either the site or the damaging link.

I hope I have been informative and have helped anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

stargeek

9:09 am on Mar 12, 2005 (gmt 0)

10+ Year Member



Google never was broken. Had some technical difficulties and still has some.

To accurately debate this point we must define broken. One possible criterion is that it does not follow its own guidelines. Google has said, and still says, that a website's ranking cannot be harmed by another site. This is simply not the case now and perhaps never was. If a toaster says you cannot be electrocuted by touching it and you clearly can be, it is broken.
Google's mechanism is incorrectly assigning content belonging to one URL to another URL. Functionally this is broken, is it not?
If you have another definition of broken that does not include this situation, I'd be interested to hear it.


You don't like it, Go somewhere else. easy as is.

Google has an almost complete monopoly on web searches. This means that at least morally, and probably legally, they have an obligation to at least follow through with what they say. While one dissatisfied webmaster may take his traffic elsewhere, the vast majority of users on the web are using a broken engine.

Google uses its spin masters and the obviously tightly restricted spokesperson "GoogleGuy" to give the impression that they care about users, webmasters or their results. Given their very noticeable silence on this issue recently, it is painfully obvious that this is simply untrue.

kwngian

9:17 am on Mar 12, 2005 (gmt 0)

10+ Year Member




Perhaps it is also time to block those unknown, uncommon spiders that come by your sites and grab your new pages before Googlebot or the spiders of the mainstream search engines do.

Reid

9:33 am on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Google uses its spin masters and the obviously tightly restricted spokesperson "GoogleGuy" to give the impression that they care about users, webmasters or their results. Given their very noticeable silence on this issue recently, it is painfully obvious that this is simply untrue.

C'mon, leave GoogleGuy alone. He tries to be helpful, but Google does have strict policies about its algythorim. Obviously there's a lot going on at the plex right now. Maybe they have teams of engineers working overtime on this very thing as we speak.

The other thing I noticed is that instead of using snippets from my pages as usual they have switched to using my actual META descriptions. Never saw google do that before. Something is up at the plex.

try searching Yahoo "google denies 302 redirect problem" #1 result - this thread.

activeco

10:04 am on Mar 12, 2005 (gmt 0)

10+ Year Member



C'mon, leave GoogleGuy alone. He tries to be helpful, but Google does have strict policies about its algythorim.

Peter is a good guy, no doubts about it.
However, being helpful is another story.
He is here not to help you, but to help Google in the first place.

Whenever he jumps in, there is immediate shouting and bowing: "Thank you GoogleGuy, Thank You" and a guarantee of at least 10 new reply pages before the wave stops.
You have to understand that his interest here is to protect Google and possibly to get rid of most of the people here.
SEO in its current form is an enemy for Google; webmaster guidelines are only meant to make life easier for Google, not for you.
They rarely adapt to the web, the web adapts to them.
How many sites today use frames?
Since they acknowledged they (you) have trouble dealing with frames, such a wonderful feature now belongs to the endangered species.

If he (they) wanted to give constructive help, they could at least confirm exactly how many consecutive 302 redirections they follow.
It would not endanger their algos in any way.

arras

11:17 am on Mar 12, 2005 (gmt 0)



"Google never was broken. Had some technical difficulties and still has some"
It seems to me that either it is broken and out of control because of greed, the common illness of American industry (bigger and bigger... so let's have 8.000.000.000 pages in our index... but to do that we have to include any kind of cr%^&p that is out there, which is why you find your pages under a crappy Chinese cr$%^p), or they are trying to fix it. As I have said in many messages, I have followed all the DCs for the last 20 days and, as many of you have probably noticed, results are changing every hour. This is not a usual Google dance with some ups and downs in the SERPs like #3-#5 or #6-#4.
We can see pre-Allegra and post-Allegra pages coming up and disappearing every hour. There are PR 3 pages showing the PR 5 of another page, because if you click on the cached snapshot it is another page. I could give you examples of such pages but I can't because of the TOS. To me it is somehow clear that they have problems and are trying to fix them; after all, they are not a bunch of idiots or a bunch of brains.

DaveAtIFG

12:00 pm on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A possible indication of a fix on some datacenters?
deanril: I saw the same phenomenon on Liane's site a few weeks ago and posted about it at Danny's forum. I think it is part of a fix.

Liane uses absolute addressing and a 301 from non-www to www. If memory serves, your site uses absolute addressing?
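A rough Python sketch of those two measures, assuming a site whose preferred host is `www.example.com` (a placeholder) served over plain http. The function names are hypothetical, and in practice the redirect is usually configured in the web server rather than written in application code.

```python
from urllib.parse import urljoin

CANONICAL_HOST = "www.example.com"  # placeholder; substitute your own host


def canonical_redirect(host, path, canonical_host=CANONICAL_HOST):
    """Return (301, location) when the request host is not the canonical
    one, or None when no redirect is needed."""
    # Ignore any port and compare case-insensitively.
    bare = host.lower().split(":", 1)[0]
    if bare == canonical_host:
        return None
    # A permanent redirect tells spiders to keep only the canonical URL.
    return 301, "http://" + canonical_host + path


def absolute_link(href, page_url):
    """Resolve a relative href against the page it appears on, giving the
    fully qualified form used in absolute addressing."""
    return urljoin(page_url, href)
```

With these, a request for `/page.html` on `example.com` yields a 301 to `http://www.example.com/page.html`, while a request already on `www.example.com` is left alone.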

It's strange that people would rather debate "Google is broke" than fix their sites, but there are many things I'll never understand.

arras

12:03 pm on Mar 12, 2005 (gmt 0)



"It's strange that people would rather debate "Google is broke" than fix their sites, but there are many things I'll never understand"
how you can fix an 8000000000 cr&^%$p?

larryhatch

1:50 pm on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What on Earth is a "8000000000 cr&^%$p?"
Come to think of it, what is an "algythorim"?

I can spell better than that half bombed. [burp!] -Larry

wiseapple

2:03 pm on Mar 12, 2005 (gmt 0)

10+ Year Member



Has anyone used the Google automatic removal tool with the robots "no index" tag to remove offending sites? Just curious if this method works...

Example:
- Locate offending URL.
- Put robots "no index" tag into your page.
- Log into automatic url removal tool at Google.
- Input the offending url to be removed. Google should pick up the tag that such and such URL should not be indexed.
- Remove "no index" tag from your site.

Anyone have thoughts if this theory holds true?

If it does work, this would also be a hole in the system, since I could remove anyone's site that points to me with a 302.

Thanks.

tallguy

2:25 pm on Mar 12, 2005 (gmt 0)

10+ Year Member



It would be nice if some Google representative logged in here and shared some advice on how to tackle this hijacking problem, and if they could give us an email address where we can report offending sites.
This would help a lot.

japanese

2:27 pm on Mar 12, 2005 (gmt 0)

10+ Year Member



wiseapple,

I tried an even more intricate set of procedures: blocking the IP of the console’s robot via .htaccess, a PHP script to produce a 404 on the target index page, server-side settings, robots.txt, etc.

Only one got the desired result: blocking the console’s robot IP.

Request was accepted and later denied.

Do not attempt the above, bear in mind you are tampering with the target page, your index page. Disaster awaits a single mistake. Your index page represents your entire URL. You could erase your site in google for 90 days.

jk3210

2:29 pm on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Clicking on the "cache" from this search [google.com] would indicate that Google is handling (at least these) 302s correctly.

walkman

2:47 pm on Mar 12, 2005 (gmt 0)



"Clicking on the "cache" from this search would indicate that Google is handling (at least these) 302s correctly."

it just means that Google hasn't indexed them yet, that's all. It may just happen on the next update...

wayne

2:54 pm on Mar 12, 2005 (gmt 0)

10+ Year Member



[googleguide.com...]

File Format -> filetype:

Occurrences
in the title of the page -> allintitle:
in the text of the page -> allintext:
in the URL of the page -> allinurl:
in the links to the page -> allinanchor:

Domain -> site:
Similar -> related:
Links -> link:

*****************

If I want to see how many pages of just my site are
indexed, I use a combination of:
allinurl:mysite.com site:mysite.com

jk3210

6:58 pm on Mar 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<<it just means that Google hasn't indexed them yet, that's all. It may just happen on the next update... >>

Huh?

The cache from those links you see is what Google found when it followed those links.

Hijacking is indicated when content OTHER THAN that belonging to those links is found and cached.

That's NOT what Google is doing in this case.

This 713 message thread spans 48 pages.