
Forum Moderators: Robert Charlton & goodroi


302 Redirects continue to be an issue

     
6:23 pm on Feb 27, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 27, 2005
posts:93
votes: 0


recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low to mid-range PageRank sites by causing Googlebot to snap up a 302 redirect via scripts (PHP, ASP, CGI, etc.) supported by an unseen, randomly generated meta refresh page pointing to an unsuspecting site. In many cases the encroaching site actually writes your website's URL with a 302 redirect inside its own server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies, which hold your website info in their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
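For readers who have no header interrogation tool to hand, the check can be approximated in a few lines of Python. This is only a minimal sketch, not the tool the poster has in mind: the names `classify` and `interrogate`, the `User-Agent` string, and the meta refresh pattern are all assumptions made for illustration. It requests a URL without following redirects and reports whether the response is an HTTP redirect (with its `Location`), a meta refresh page, or neither.

```python
import re
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlsplit

# Rough pattern for <meta http-equiv="refresh" content="0; url=...">.
# Real pages vary; this is an illustrative assumption, not a full HTML parser.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)


def classify(status, headers, body):
    """Describe how a response forwards the visitor, if it does at all.

    `headers` is a dict with lower-cased header names.
    """
    location = headers.get("location")
    if status in (301, 302, 303, 307, 308) and location:
        return ("http-redirect", status, location)
    match = META_REFRESH.search(body)
    if match:
        return ("meta-refresh", status, match.group(1))
    return ("no-redirect", status, None)


def interrogate(url, timeout=10):
    """Fetch one URL WITHOUT following redirects and classify the response."""
    parts = urlsplit(url)
    conn_cls = HTTPSConnection if parts.scheme == "https" else HTTPConnection
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        headers = {name.lower(): value for name, value in resp.getheaders()}
        body = resp.read(65536).decode("latin-1")  # first 64 KB is enough here
        return classify(resp.status, headers, body)
    finally:
        conn.close()
```

Calling `interrogate()` on the suspect link (for example a `goto.php?path=...` URL copied from a search result) shows what a crawler is actually served. A script that behaves differently for different user agents would need the request repeated with other `User-Agent` values.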

Googlebot and MSNBOT follow these PHP scripts to either an internal sub-domain containing the 302 redirect or a server-side redirect, and “BANG”, down goes your site if its PageRank is below that of the offending site. Your index page is crippled because Googlebot and MSNBOT now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly use your URL stripped of the WWW, making detection harder. This also causes Googlebot to generate another URL listing for your site, which can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google of deciding which of your site's two URLs to index higher (more often the one with the higher linked PageRank).

Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of the targets. This is further applied through hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target of revenue. Though I am sure Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is a 3 or 4, be very alarmed. A skyscraper site need only create child-page linking to reach PageRank 4 or 5, without needing to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.

Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence it is a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that serves dynamically generated pages with PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google’s URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google’s URL console to detect what it needs: either the site or the damaging link.

I hope I have been informative and can help anybody who has a hijacked site whose natural revenue has been unfairly affected. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links... Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

6:44 am on Mar 11, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 26, 2003
posts:705
votes: 0


How about bringing down the IRS web site.;)
6:45 am on Mar 11, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 20, 2003
posts:197
votes: 0



"How about bringing down the IRS web site. ;)"

the IRS doesn't rely on google traffic too heavily.

7:06 am on Mar 11, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 26, 2003
posts:705
votes: 0


"the IRS doesn't rely on google traffic to heavily."

May be true, but it would be noticed .. tax season is coming .. would surely make the IRS angry .. BIG publicity .. IRS has megapower.

Should make major headlines

7:20 am on Mar 11, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 20, 2003
posts:197
votes: 0


"May be true, but it would be noticed .. tax season is coming .. would surely make the IRS angry .. BIG publicity .. IRS has megapower.

Should make major headlines"

ok, point taken.

japanese, what do you think about the possibility of this happening?

7:31 am on Mar 11, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 6, 2002
posts:742
votes: 0


Removed
8:59 am on Mar 11, 2005 (gmt 0)

New User

10+ Year Member

joined:May 15, 2002
posts:37
votes: 0


The one I've seen recently is where affiliate monsters are loading Google SERPs dynamically into their webpages because the "snippets" from the Google results contain lots of keywords, so it boosts the keyword weighting of the page.

As a result, these monster affiliate domains, with 10000s of pages, are getting assigned high PR and are in fact being returned in the results ahead of the main domain.

It's all low-cost, low-overhead twisted jacking at the end of the day to try and take commission. It won't stop until either there's a law against it or it costs a lot of money to do!

10:07 am on Mar 11, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Feb 14, 2003
posts:236
votes: 0


there are laws against it, but it's like in finance: if you steal from a lot of small guys you will probably not be taken to court, as it is too much effort for a small guy.

which laws?: unfair competition, service-/trademark violations, copyright infringement, ...

10:21 am on Mar 11, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 10, 2003
posts:79
votes: 0


One site was hijacking my site and carrying adsense.

I contacted Google Adsense and here is the answer:

"Google AdSense is a program for web publishers who want to display
advertising on web pages they control. By placing AdSense code on their
web pages, the publisher can display text-based Google ads that are
relevant to the content readers see on the pages. Publishers, not Google,
control what pages have ads and the content of those pages.

Google is a provider of information, not a mediator. We serve ads targeted
to certain web pages, but we don't control the content of these pages. For
these kinds of questions or comments, it is best to directly address the
webmaster of the page in question.

To uphold the quality and reputation of Google AdSense, please note that
all AdSense participants are held to our program policies (
[google.com...] ) and Terms and Conditions (
[google.com...] ).

If this website is found to be in violation of any of these policies, we
will take the appropriate action on the account."

11:14 am on Mar 11, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 15, 2003
posts:2412
votes: 5


It seems like every time i look at this thread i have to spend a few hours just reading the latest posts, so again i'm not really able to give feedback on all the posts, i'm sorry about that.



>> (base href): Claus, if you use only absolute referencing in your pages and no relative, should you still install this?

Ownerrim, i agree that this makes no sense.

No, you shouldn't have to. And you shouldn't have to use absolute referencing either. You should just build your pages as you see fit, making sure they conform to the relevant standards for what they are supposed to do, but not really anything else. Both of the above are optional; it's not anything you're required to do as a webmaster by any official body whatsoever. Of course, you shouldn't even have to worry about either using redirects or being redirected to either.

So, use one, or both, or neither, as you choose. I can't promise that any option will work, though - i wouldn't like to create false hope here. Nowadays, it's sometimes good to do a lot of stuff that you really shouldn't have to do.


>> bring some site down

First, that idea has been tried before at another forum with "search engine" as part of the title. I believe they had a site that simply volunteered to be hijacked, which in my view is very important, if not the most important issue at all.

Second,

We must have the backing of Claus and Brett

By posting here, I (as well as everyone else) have agreed to conform to the TOS of WebmasterWorld [webmasterworld.com]. It's not always that i (or anyone else) remember the TOS verbatim, so of course sometimes there are misses, but basically Brett has set some rules for this place that should be followed by the members.

I can't really speak for Brett, but i doubt he would back it, as i just looked up what the TOS of this board has to say about the matter:

#26: Claims of action, flames, and calls to action against any company or person will be removed.

Hint: The word "removed" does not exactly sound like "backed up" to me ;)

Third, what i would very much encourage you all to do is to follow this piece of advice for every single example you know about:

If people want to send specifics (i.e. "site A appears to have duplicate pages from, or is doing a 301/302/whatever to site B, and Google is wrongly picking site A as canonical", with actual values for A and B), I'd be happy to hear them. Drop an email to webmaster [at] google.com with the keyword "canonicalpage" (all as one word)

Include all the specifics you can find (like URL's, server headers, and whatever) but keep it factual. It's no use asking questions as you probably won't get an answer, so don't expect that. Just the facts, nothing else. Send it off into the big G webmaster inbox and expect nothing in return.

Last, you can always do your own write-ups about the situation on your websites and blogs and whatever. If you can find space for it, do include the quote under "third" above, please, as that's the only "confirmed tool" we have to remedy the problem. Spread the message as you see fit. Also, i hereby cancel and reverse what i wrote in post #54 of this thread about it not being intended for republishing:

Limited public license of right to copy (copyright):

Feel free to copy the whole or any parts of post #54 of this thread [webmasterworld.com] as authored by me to any web site, blog, or other medium of choice
- as long as you do all five of these things:
  1. you do not edit it so that it changes meaning or context
  2. you clearly state that you did not write it
  3. you provide a link to this thread and mention the post number so that it can be found by anyone wishing to examine the original
  4. you do not use the post to encourage, endorse or justify any action that it does not encourage, endorse or justify in and by itself, as it is.
  5. Specifically, you must explicitly state that you oppose any kind of hijacking and that you do not encourage it.

What you don't need to do:
If you do the above you don't need to mention my nickname or real name (see profile), but i would appreciate it very much of course. I would also appreciate a link to this license (or post number) to accompany any quote, so that everybody can see that you have in fact been permitted to post it, but i'm not requiring that. You specifically don't have to link to any web site of mine.

That should be easy, right? I think/hope Brett will okay this, as it's one post only, he gets a backlink, and after all it is me that's the author. Let it fly...



Sorry to be such a boring conformist here. Hope the last part will help a bit, though.

Also, i see that there are some members that have seen some improvement on some datacenters. That's nice, but i also feel it's too early to say if this is really being solved. It could be all kinds of coincidences.

11:42 am on Mar 11, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 10, 2003
posts:351
votes: 0


anyone who has been hijacked start sending them e-mails of known examples

Can someone give me a simple method to determine if my page(s) have been hijacked?

I'm still going for the simple explanation angle here...

11:42 am on Mar 11, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 7, 2003
posts:1406
votes: 0


In this thread, the best bids so far seem to be:
  • always redirect non-www to www (or the other way round)
  • use absolute internal linking (i.e. include your domain name in internal links)
  • include a bit of always updated content on your page (e.g. a random quote, a timestamp, or whatever)
  • use the <base href=""> tag on all your pages

Trying to determine just how effective these points are. As mentioned earlier I have not been affected by this 302 problem, and I do have 302 redirects pointing to my site. (They have been contacted and asked for removal.)

Being a bit of a perfectionist, when I designed the make-over for my site I replaced all the links with absolute links. I added a bit of always updated content to many of my pages. My htaccess points all domain requests to www.domain. Most of this was done before the 302 problem began to surface.

In the past 18 months I have not seen my site hijacked. So, the question is, do these methods help prevent hijacking? If you were hijacked, do you use these methods? Or did you just start using them after you were hijacked? I understand that once hijacked it's a bit late to start employing different tactics - when you're gone you're gone until the G-Gods see fit to re-include you. But were you using these methods before you were hijacked? Your answers just might help another webmaster sleep a little easier.
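The first point on that list (always redirect non-www to www) is done here with htaccess, whose rules are not shown in the thread. For sites served from application code instead, the same idea can be sketched as a small Python WSGI wrapper. This is a minimal illustration under assumptions: `canonical_host`, `demo_app` and `www.example.com` are made-up names, and whether a canonical host actually prevents hijacking is exactly the open question above.

```python
def canonical_host(app, preferred_host):
    """Wrap a WSGI app so requests for any other hostname get a 301."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host and host != preferred_host:
            scheme = environ.get("wsgi.url_scheme", "http")
            path = environ.get("PATH_INFO", "/") or "/"
            query = environ.get("QUERY_STRING", "")
            location = f"{scheme}://{preferred_host}{path}"
            if query:
                location += "?" + query
            # Permanent redirect, so only one form of the URL stays indexable.
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"home page\n"]


# Assumed preferred hostname; swap in the real one.
site = canonical_host(demo_app, "www.example.com")
```

A request for `example.com/page?a=1` is answered with a 301 to `http://www.example.com/page?a=1`, while requests already on the preferred host pass straight through to the wrapped application.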

    11:44 am on Mar 11, 2005 (gmt 0)

    Preferred Member

    10+ Year Member

    joined:Dec 8, 2003
    posts:548
    votes: 0


    GoogleGuy, let me just throw in an idea. I understand that it must be hard from a SE's perspective to distinguish between legitimate and illegitimate temporary redirects (302s). One possible solution is not to follow cross-domain 302s at all. Another more elegant approach is not to follow redirects that cross domains with different owners although that might be expensive (computationally or bandwidth-wise). Thus I favour another solution.

    I know that GoogleBot can’t provide the referrer when crawling a page because a) one page might have many links to it and hence more than one possible referrer and b) because the GoogleBot instance that crawls the source page of a link might not be the instance that crawls the target page of a link. BUT: Why not make an exception for 302 redirects? An adequate procedure to accomplish this can be laid out as follows:

    1) GoogleBot crawls the redirecting page (source page) and gets a 302 along with a Location: header containing the URL of the target page.

    2) GoogleBot adds or updates the document record for the target URL. If source and target URL belong to different domains, the record will also contain a reference to the document entry of the redirecting URL. When there is more than one redirect to the target page, the last fetched source URL wins.

    3) When another GoogleBot crawls the target URL, it sets the referrer to the source URL. If the server sends a successful response, that GoogleBot indexes the returned content and attributes it to the source URL. The target URL will never appear in the SERPS or only appear among supplementary results. If, OTOH, the server responds with a 404 or any other well-defined error condition, GoogleBot removes the source page from the index and re-fetches the target page as usual, i.e. without passing the referrer.

    This solution a) informs webmasters that there are redirects to their pages, b) allows them to decide whether the redirect is legitimate or not and c) lets them disallow illegitimate redirects if necessary. Also note that it never duplicates indexed content under more than one URL. The content is either attributed to the source or to the target URL of a 302 redirect. The necessary overhead consists of one additional field in document records (to hold the source URL of redirects) and one additional request for pages that are targets of disallowed redirects.
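The three steps proposed above can be simulated in a few lines of Python to make the bookkeeping concrete. This is a toy model of the proposal only, not a description of how GoogleBot works: `Index`, `record_302`, `crawl_target` and the `fetch` callback are hypothetical names, "different domains" is simplified to "different hostnames", and any non-200 answer is treated as the target refusing the redirect.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Index:
    # target URL -> last cross-domain source URL that 302-redirected to it
    redirect_source: dict = field(default_factory=dict)
    # URL -> content attributed to that URL
    documents: dict = field(default_factory=dict)


def domain(url):
    # Simplification: compare full hostnames rather than registered domains.
    return urlsplit(url).hostname


def record_302(index, source_url, target_url):
    """Steps 1-2: remember the source of a cross-domain 302; last fetched wins."""
    if domain(source_url) != domain(target_url):
        index.redirect_source[target_url] = source_url


def crawl_target(index, target_url, fetch):
    """Step 3: fetch(url, referrer=...) returns (status, content)."""
    source_url = index.redirect_source.get(target_url)
    if source_url is not None:
        status, content = fetch(target_url, referrer=source_url)
        if status == 200:
            # Redirect accepted: content is attributed to the source URL only.
            index.documents[source_url] = content
            index.documents.pop(target_url, None)
            return
        # Redirect refused: drop the source page and forget the redirect.
        index.documents.pop(source_url, None)
        del index.redirect_source[target_url]
    # Normal crawl, without passing a referrer.
    status, content = fetch(target_url, referrer=None)
    if status == 200:
        index.documents[target_url] = content
```

In this model a site owner disallows a redirect simply by answering requests that carry the unwanted referrer with a 404, after which the content is indexed under the owner's URL alone.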

    12:00 pm on Mar 11, 2005 (gmt 0)

    Preferred Member

    10+ Year Member

    joined:Apr 27, 2004
    posts:368
    votes: 0


    For what it is worth my site
    PR 7 site
    6 years old
    used to return on google for the search: site:www.mydomain.com - "some specific info from my site"

    this


    www.someotherdomain.com/ page.asp?title=target+keyword&url=http://www.mydomain.com

    other-domain.com/cgi-bin/tabi/navi/navi.cgi?links=82

    www.anotherdomain.com/Redirect. asp?ID=188&url=http://www.mydomain.com%2F

    today the site's title and description returned when searching for mydomain.com
    and
    site:www.mydomain.com - "some specific info from my site"
    now shows my site again with the redirecting sites in the "Supplemental Results"

    some traffic is back (well some of it anyway) and the cache date is the 10th of March.

    The only bad news is the Google Toolbar pagerank which is grey, but I'm not too worried about this yet.
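Listings like the ones in this post can be screened mechanically when going through a long list of search results. The Python sketch below is an assumption-laden illustration (`embeds_domain` and its helpers are hypothetical names): it flags a URL that sits on someone else's host but carries your domain in one of its query parameters.

```python
from urllib.parse import parse_qsl, urlsplit


def _with_scheme(url):
    # Search results often show URLs without "http://"; only the part
    # before "?" is checked, since a query value may contain "://" itself.
    head = url.split("?", 1)[0]
    return url if "://" in head else "http://" + url


def _host(url):
    try:
        return (urlsplit(_with_scheme(url)).hostname or "").lower()
    except ValueError:
        return ""


def _belongs_to(host, domain):
    return host == domain or host.endswith("." + domain)


def embeds_domain(url, my_domain):
    """True if `url` is on another host but names `my_domain` in its query."""
    my_domain = my_domain.lower()
    try:
        parts = urlsplit(_with_scheme(url))
        host = (parts.hostname or "").lower()
    except ValueError:
        return False
    if _belongs_to(host, my_domain):
        return False  # one of your own pages
    # parse_qsl also decodes escapes such as %2F in the parameter values.
    return any(
        _belongs_to(_host(value), my_domain)
        for _name, value in parse_qsl(parts.query)
    )
```

This only catches redirect scripts that spell out the target in the URL. A link such as the `navi.cgi?links=82` example uses an opaque ID, so it can only be identified by requesting it and looking at the response headers.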

    12:03 pm on Mar 11, 2005 (gmt 0)

    Junior Member

    10+ Year Member

    joined:Nov 27, 2003
    posts:153
    votes: 0


    Why the hijacking, why that cr-p, why all those problems we never had before? It's all because of the bloody AdSense program of Google, which will destroy the internet and Google as well IF it goes on as it is today: out of control. It is like gold fever and the Wild West films: my life or your life.
    It looks to me as though Google has the philosophy "make a quick buck today and forget domani" (domani is tomorrow in Italian), like the Dean Martin song.
    Let's wait and see in a year or so who will control the net
    ...Google, I don't think so.
    Their index is already broken in 2 pieces; the Allegra update was the beginning of the end. They don't want to admit that because they are supposed to be a multidollar company... so what's the big deal? Shell and Volkswagen are multidollar companies, but they are stable. So why should I invest my money in a company that has no platform and no material like steel or oil, and is based on every scam that will do anything for some bucks from Google's only source of income (AdSense)? They know that they are trying to survive; that's why every second all the data centers are changing their SERPs like rats on a wheel. Billy MSN Gates has the money and the brains to make the new MSN search what Google used to be 1-2 years ago. If that happens then it is goodbye Google, and if I am wrong let someone prove it.
    That's all folks.

    PS: there is more to write and analyse here, but it would be a good idea for someone to write a book about how money fever can destroy something (the internet) that a couple of years ago used to be the best source of information we ever had on the planet. I can see people going back to the library very soon.

    12:04 pm on Mar 11, 2005 (gmt 0)

    Preferred Member from GB 

    10+ Year Member

    joined:July 17, 2003
    posts:598
    votes: 4


    Having said my site is back it seems specific to data centres. From my home PC I can find my site yet when I get to my office, no sign?!

    My nerves can't take it!

    This 713 message thread spans 48 pages: 713