Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website info in their databases feed your page snippets, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted variant of a php based redirection script then goes to work: it causes a 302 redirect to your site and includes an affiliate click checker. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
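For anyone without a header interrogation tool to hand, here is a minimal Python sketch of the idea: fetch a suspect URL without following redirects, then look for a 3xx Location header or a meta refresh that points at your own host. The function names (`fetch_raw`, `inspect_response`, `points_at`) and the user-agent string are my own assumptions for illustration, not any existing tool, and a randomly generated refresh page may of course need several fetches to catch.

```python
import http.client
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit


class MetaRefreshFinder(HTMLParser):
    """Collects the target URL of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/"
        match = re.search(r"url\s*=\s*['\"]?([^'\";\s]+)", attr.get("content", ""), re.I)
        if match:
            self.targets.append(match.group(1))


def points_at(url, my_host):
    """True if url's host is my_host, with or without a leading www."""
    if "://" not in url:
        url = "http://" + url  # hijack links often carry a bare host
    host = (urlsplit(url).hostname or "").lower()

    def strip_www(name):
        return name[4:] if name.startswith("www.") else name

    return strip_www(host) == strip_www(my_host.lower())


def inspect_response(status, headers, body, my_host):
    """List the redirects to my_host found in one raw HTTP response."""
    headers = {name.lower(): value for name, value in headers.items()}
    findings = []
    location = headers.get("location")
    if status in (301, 302, 303, 307) and location and points_at(location, my_host):
        findings.append(f"{status} redirect to {location}")
    finder = MetaRefreshFinder()
    finder.feed(body)
    for target in finder.targets:
        if points_at(target, my_host):
            findings.append(f"meta refresh to {target}")
    return findings


def fetch_raw(url, timeout=10):
    """GET url once; http.client does not follow redirects on its own."""
    parts = urlsplit(url)
    secure = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(parts.netloc, timeout=timeout)
    path = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
    resp = conn.getresponse()
    body = resp.read().decode("utf-8", errors="replace")
    result = resp.status, dict(resp.getheaders()), body
    conn.close()
    return result
```

Typical use would be `inspect_response(*fetch_raw(suspect_url), "yoursite.com")`, where `suspect_url` is the link you found on the offending site. An empty list only means nothing was seen on that one request.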
Googlebot and MSNBOT follow these php scripts to either an internal sub-domain containing the 302 redirect or a serverside one, and “BANG”, down goes your site if it has a pagerank below the offending site. Your index page is crippled because googlebot and msnbot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that google does not reveal all links pointing to your site and takes a couple of months to update, and thus an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly apply your URL stripped, that is without the WWW, making detection harder. This also causes googlebot to generate another URL listing for your site that can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving google from deciding which of the two URLs of your site to index higher, which is more often the one with the higher linked pagerank.
Your only hope is that your pagerank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher pagerank sites within its system on the off chance that it strips at least one of them. This is reinforced by hundreds of other hidden 301 permanent redirects to sites of pagerank 7 or above, again in the hope of stripping a high pagerank site. That would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big name affiliates are involved in this scam; they know it is going on, and google adwords is probably the main target for revenue. Though I am sure google does not approve of its adsense program being used in such a manner.
Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.
There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months, only now it is escalating at an alarming rate. All pagerank sites of 5 or below are susceptible, if your site is 3 or 4 then be very alarmed. A skyscraper site only need create child page linking to get pagerank 4 or 5 without the need to strip other sites.
Caution: trying to exclude them via robots.txt will not help, because these scripts are able to convert nearly daily.
Trying to remove a link through google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within this timeline.
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages including sub-domains within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLS. So in essence a programme in perpetual motion, creating millions of 302 redirects so long as it stays on. As every page is a unique URL, the script will hopefully continue to create and bombard a site that generates dynamically generated pages that possess php, asp, cgi redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that continually requests pages in split seconds throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via google's URL removal tool. You only need a few seconds of 404 or 403 from the offending site for google’s url console to detect what it needs, either the site or the damaging link.
I hope I have been informative and helpful to anybody that has a hijacked site whose natural revenue has been unfairly treated. Also note that your site may never regain its rank even after the removal of the offending links. Talking to offending site owners often results in their denial that they are causing problems; they say that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
May be true, but it would be noticed .. tax season is coming .. would surely make the IRS angry .. BIG publicity .. IRS has megapower.
Should make major headlines
ok, point taken.
japanese, what do you think about the possibility of this happening?
As a result these monster affiliate domains' 10,000s of pages are getting assigned high PR and are in fact being returned in the results ahead of the main domain.
It's all low-cost, low-overhead twisted jacking at the end of the day to try and take commission. It won't stop until either there's a law against it or it costs a lot of money to do!
which laws?: unfair competition, service-/trademark violations, copyright infringement, ...
I contacted Google Adsense and here is the answer:
"Google AdSense is a program for web publishers who want to display
advertising on web pages they control. By placing AdSense code on their
web pages, the publisher can display text-based Google ads that are
relevant to the content readers see on the pages. Publishers, not Google,
control what pages have ads and the content of those pages.
Google is a provider of information, not a mediator. We serve ads targeted
to certain web pages, but we don't control the content of these pages. For
these kinds of questions or comments, it is best to directly address the
webmaster of the page in question.
If this website is found to be in violation of any of these policies, we
will take the appropriate action on the account."
Ownerrim, i agree that this makes no sense.
No, you shouldn't have to. And you shouldn't have to use absolute referencing either. You should just build your pages as you see fit, making sure they conform to the relevant standards for what they are supposed to do, but not really anything else. Both of the above are optionals, it's not anything you're required to do as a webmaster by any official body whatsoever. Of course, you shouldn't even have to worry about either using redirects or being redirected to either.
So, use one, or both, or neither, as you choose. I can't promise that any option will work, though - i wouldn't like to create false hope here. Nowadays, it's sometimes good to do a lot of stuff that you really shouldn't have to do.
First, that idea has been tried before at another forum with "search engine" as part of the title. I believe they had a site that simply volunteered to be hijacked, which in my view is very important, if not the most important issue at all.
We must have the backing of Claus and Brett
By posting here, I (as well as everyone else) have agreed to conform to the TOS of WebmasterWorld [webmasterworld.com]. It's not always that i (or anyone else) remember the TOS verbatim, so of course sometimes there are misses, but basically Brett has set some rules for this place that should be followed by the members.
I can't really speak for Brett, but i doubt he would back it, as i just looked up what the TOS of this board has to say about the matter:
#26: Claims of action, flames, and calls to action against any company or person will be removed.
Hint: The word "removed" does not exactly sound like "backed up" to me ;)
Third, what i would very much encourage you all to do is to follow this piece of advice for every single example you know about:
If people want to send specifics (i.e. "site A appears to have duplicate pages from, or is doing a 301/302/whatever to site B, and Google is wrongly picking site A as canonical", with actual values for A and B), I'd be happy to hear them. Drop an email to webmaster [at] google.com with the keyword "canonicalpage" (all as one word)
Include all the specifics you can find (like URL's, server headers, and whatever) but keep it factual. It's no use asking questions as you probably won't get an answer, so don't expect that. Just the facts, nothing else. Send it off into the big G webmaster inbox and expect nothing in return.
Last, you can always do your own write-ups about the situation on your websites and blogs and whatever. If you can find space for it, do include the quote under "third" above, please, as that's the only "confirmed tool" we have to remedy the problem. Spread the message as you see fit. Also, i hereby cancel and reverse what i wrote in post #54 of this thread about it not being intended for republishing:
Limited public license of right to copy (copyright):
Feel free to copy the whole or any parts of post #54 of this thread [webmasterworld.com] as authored by me to any web site, blog, or other medium of choice
- as long as you do all five of these things:
- you do not edit it so that it changes meaning or context
- you clearly state that you did not write it
- you provide a link to this thread and mention the post number so that it can be found by anyone wishing to examine the original
- you do not use the post to encourage, endorse or justify any action that it does not encourage, endorse or justify in and by itself, as it is.
- Specifically, you must explicitly state that you oppose to any kind of hijacking and that you do not encourage this.
What you don't need to do:
If you do the above you don't need to mention my nickname or real name (see profile), but i would appreciate it very much of course. I would also appreciate a link to this license (or post number) to accompany any quote, so that everybody can see that you have in fact been permitted to post it, but i'm not requiring that. You specifically don't have to link to any web site of mine.
That should be easy, right? I think/hope Brett will okay this, as it's one post only, he gets a backlink, and after all it is me that's the author. Let it fly...
Also, i see that there are some members that have seen some improvement on some datacenters. That's nice, but i also feel it's too early to say if this is really being solved. It could be all kinds of coincidences.
In this thread, the best bids so far seem to be:
- always redirect non-www to www (or the other way round)
- use absolute internal linking (ie. include your domain name in internal links)
- include a bit of always updated content on your page (eg. a random quote, a timestamp, or whatever)
- use the <base href=""> tag on all your pages
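To make three of the points above concrete (the www redirect, absolute internal links, and the base tag), here is a rough Python sketch. It is only an illustration under assumptions: `www.example.com` is a placeholder host, the helper names are made up, and in practice most people would do the redirect in their server configuration rather than in application code. Nothing here is confirmed to prevent a hijack.

```python
from html import escape
from urllib.parse import urljoin, urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # assumption: placeholder for your preferred host


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return the URL a 301 should point to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == canonical_host:
        return None
    # Keep scheme, path, query and fragment; only the host changes.
    return urlunsplit((parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment))


def absolute_link(href, page_url):
    """Resolve an internal link against the page it appears on, so the domain is always included."""
    return urljoin(page_url, href)


def base_tag(page_url):
    """Build the <base> element for the head of the page at page_url."""
    return f'<base href="{escape(page_url, quote=True)}">'
```

For example, `canonical_redirect("http://example.com/page?x=1")` gives `http://www.example.com/page?x=1`, which is what the server would send in the Location header of a 301, while a request that already uses the www host gives `None` and is served normally.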
Trying to determine just how effective these points are. As mentioned earlier I have not been affected by this 302 problem, and I do have 302 redirects pointing to my site. (They have been contacted and asked for removal)
Being a bit of a perfectionist, when I designed the make-over for my site I replaced all the links with absolute links. I added a bit of always updated content to many of my pages. My htaccess points all domain requests to www.domain. Most of this was done before the 302 problem began to surface.
In the past 18 months I have not seen my site hijacked. So, the question is, do these methods help prevent hijacking? If you were hijacked, do you use these methods? Or did you just start using them after you were hijacked? I understand that once hijacked it's a bit late to start employing different tactics - when you're gone you're gone until the G-Gods see fit to re-include you. But were you using these methods before you were hijacked? Your answers just might help another webmaster sleep a little easier.
I know that GoogleBot can’t provide the referrer when crawling a page because a) one page might have many links to it and hence more than one possible referrer and b) because the GoogleBot instance that crawls the source page of a link might not be the instance that crawls the target page of a link. BUT: Why not make an exception for 302 redirects? An adequate procedure to accomplish this can be laid out as follows:
1) GoogleBot crawls the redirecting page (source page) and gets a 302 along with a Location: header containing the URL of the target page.
2) GoogleBot adds or updates the document record for the target URL. If source and target URL belong to different domains, the record will also contain a reference to the document entry of the redirecting URL. When there is more than one redirect to the target page, the last fetched source URL wins.
3) When another GoogleBot crawls the target URL, it sets the referrer to the source URL. If the server sends a successful response, that GoogleBot indexes the returned content and attributes it to the source URL. The target URL will never appear in the SERPS or only appear among supplementary results. If, OTOH, the server responds with a 404 or any other well-defined error condition, GoogleBot removes the source page from the index and re-fetches the target page as usual, i.e. without passing the referrer.
This solution a) informs webmasters that there are redirects to their pages, b) allows them to decide whether the redirect is legitimate or not and c) lets them disallow illegitimate redirects if necessary. Also note that it never duplicates indexed content under more than one URL. The content is either attributed to the source or to the target URL of a 302 redirect. The necessary overhead consists of one additional field in document records (to hold the source URL of redirects) and one additional request for pages that are targets of disallowed redirects.
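To check that the three steps hang together, here is a toy Python model of the proposal. To be clear, this is not how GoogleBot works; it only simulates the procedure suggested above. The `Record` and `Index` classes, the `make_server` helper, the example URLs, comparing hostnames as a stand-in for "different domains", and treating any non-200 status as the error condition are all my own simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlsplit


@dataclass
class Record:
    url: str
    redirect_source: Optional[str] = None  # the one additional field the proposal needs
    content: Optional[str] = None
    attributed_to: Optional[str] = None


class Index:
    def __init__(self):
        self.records: Dict[str, Record] = {}

    def note_302(self, source, target):
        """Steps 1 and 2: the crawler saw source answer 302 with Location: target."""
        self.records.setdefault(source, Record(source))
        rec = self.records.setdefault(target, Record(target))
        if urlsplit(source).hostname != urlsplit(target).hostname:
            rec.redirect_source = source  # last fetched source wins
        return rec

    def crawl_target(self, target, fetch):
        """Step 3: fetch(url, referrer) returns (status, body)."""
        rec = self.records.setdefault(target, Record(target))
        source = rec.redirect_source
        status, body = fetch(target, source)
        if source is not None and status != 200:
            # The target site rejected the redirect: drop the source page
            # and re-fetch the target as usual, without a referrer.
            self.records.pop(source, None)
            rec.redirect_source = source = None
            status, body = fetch(target, None)
        if status == 200:
            rec.content = body
            rec.attributed_to = source or target
        return rec


def make_server(disallowed_referrers):
    """A target site that answers 404 whenever the referrer is a redirect it disallows."""
    def fetch(url, referrer):
        if referrer in disallowed_referrers:
            return 404, ""
        return 200, "<html>home page</html>"
    return fetch
```

In this model a webmaster who lists the redirecting URL as disallowed ends up with the content attributed to their own URL and the redirecting page dropped, at the cost of one extra request; a webmaster who does nothing sees the content attributed to the redirecting URL, as in the proposal.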
today the site's title and description returned when searching for mydomain.com
site:www.mydomain.com - "some specific info from my site"
now shows my site again with the redirecting sites in the "Supplemental Results"
some traffic is back (well some of it anyway) and the cache date is the 10th of March.
The only bad news is the Google Toolbar pagerank, which is grey, but I'm not too worried about this yet.
PS: there is more to write and analyse here, but it would be a good idea for someone to write a book about how the money fever can destroy something that a couple of years ago (the internet) used to be the best infodata we ever had on the planet. I can see people going back to the library very soon.
We must abide by the etiquette Brett and Claus respectively commend.
This in itself is a testament that it is wrong to target a site for bombardment with 302 redirects.
It is a thing that plays on the conscience. We know of the dire consequences my tentatively proposed action against an innocent site could have. So we feel guilty and it is against our principles to deliberately point 302 redirects to another site. But doing so in the thousands a day by the blackhats and big affiliate companies does seem to bring about a feeling of anger.
We are in a catch 22 situation. Damned if you do, Damned if you don’t.
However, like I mentioned in an earlier post, I threatened a big affiliate company that I had prepared them as my target for 302 bombardment. They threatened me with legal action, but it was my method, which they were afraid of, that caused them to remove their 302 redirects to my sites.
I am convinced, without a shadow of a doubt, that many sites are in oblivion in google because of redirects. I respect all comments in this thread, and I also understand the dilemma that many sites are in.
...can destroy something that a couple of years ago (the internet) used to be the best infodata we ever had on the planet. I can see people going back to the library very soon.
Hmmm....I doubt that very much, considering every other story is about the RISING influence of the internet (in politics, research, etc.). Like any other new medium, it has problems to deal with, which I have no doubt will be solved eventually through technical means.
Having said my site is back it seems specific to data centres. From my home PC I can find my site yet when I get to my office, no sign?!
My nerves can't take it!
Then stop looking every 5 minutes. Wait a month, and have another look.
i say again and i want answers maybe from goooogleguy
Don't hold your breath...rest assured, as google is now a publicly traded company, each and every public comment is vetted through their PR machine. I doubt we'll EVER hear another comment from google about having technical problems to do with the quality of their search. How would you like to be the guy that caused your friends' net worth to drop 6 or 7 digits because you responded to complaints on a message board?
"We must abide by the etiquette Brett and Claus respectively commend."
I *very respectfully* disagree, Japanese. Perhaps this is the wrong forum to discuss "call to action", but IMHO, something needs to *demonstrate* the problem or Google can simply keep denying the extent of the problem.
If a site is brought down it can apparently be restored, just as the big site mentioned earlier was "rolled back".
"This in itself is a testament that it is wrong to target a site for bombardment with 302 redirects."
It is the opinion of 2 people, one of whom might be protecting himself legally with TOS. It is best done in more private discussions.
"We are in a catch 22 situation. Damned if you do, Damned if you don’t."
Might I suggest that we are stuck in an ongoing predicament waiting for hackers to destroy our sites if we do nothing. If we do something to demonstrate the problem we will have taken the first step in approaching a solution .. or a mitigation.
The targeted site must be called by telephone to agree that we blast them with 302 redirects controlled via scripts of our choosing, either CGI, GOP-PHP etc, which will be configured to have undesirable effects on the targeted site. The 302 method will be no different from any other 302 redirect. It will provide serverside info for googlebot with a different url to the target site but pointing to their location, in our attempt to create duplicate content of their most important pages in the google index of results. We will attempt to cause the site a duplicate content penalty in google and to bring about the website's demise in results.
The site must be a commercial site that pays a few wages and that a reasonably big expenditure had been invested in that site. They must not protect their site by excluding robots and their pages must allow all robots. They are not allowed to make modifications to protect their network by using .htaccess to deny robots. Their websites must be left vulnerable to our attack. They are not allowed to cheat by denying robots access in any way.
They must agree to be bombarded 6 levels deep across their entire spiderable pages and network. Our aim will be to bring about unpredictable and adverse effects on their website through our ability to manipulate googlebot and msnbot. Once our onslaught of 302 serverside redirects, aimed and configured to disrupt their internet visibility, has achieved at least one duplicate page in the results of google, we will cease the experiment, satisfied of the effectiveness of our ability to bring their website to its knees.
They must also agree that after the attack they may never regain their ranking in google and that we cannot be held responsible for any damages of any sort.
They must agree in a manner that does not make us liable for any legal action against us.
They should accept this offer because I see no reason that they should not. They are already accepting the exact same as above so it would be very unreasonable of them to deny us the opportunity to enact the process of bringing down their website.
If they refuse, put pressure on them to explain ON WHAT GROUNDS do they refuse. We have every right to try and do what we want and it really is none of their business.
Persist by contacting another similar site until you find a suitable volunteer. But like I suggested, do not accept a refusal easily; what right do they have to deny us this opportunity to exercise our ability to bring their website to its knees? They also have the reassurance from google that nobody can affect their ranking.
THEY INFORMED ME 10 MINUTES LATER BY TELEPHONE
Warning me that my voice was recorded and my earlier e-mail containing my proposal had been passed on to their in house lawyer.
Believe you me......They blocked my IP from their network. I moved over to my other computer and started spidering their html pages; after I got them I called them back, demanding that I be allowed to pass as many 302s to them as my server could handle.
I relented because I was threatened with legal action if I processed a single 302 against them.
Boy, this thread has been popping up pretty consistently for the past several years. I was heavily involved with the last very long thread at [webmasterworld.com...] which was about this very thing.
Finally got my sites back. How you ask? I threatened the offenders with everything I could. Threatened to contact every one of the sites in the offending directory. Threatened to post the offenders on all forums that would allow it. Threatened to dedicate a site specifically to bring attention to what they were doing. Threatened to report them to G, Y and A,B,C too... etc. etc. etc.
Finally got all their links removed and my sites are back to their original positions.
Like DaveAtIFG said, this has been going on since '98. It comes up here every time someone gets ticked enough not to leave it alone. That is pretty much what I did.
As for the G engineers knowing of the problem... They know. My sites were submitted to them as actual examples. Not sure why it wasn't fixed. It would seem an easy fix, but then again, I am not a G engineer nor do I know what is involved.
I got a sticky recently from someone who wanted to know of one of the directories that had hijacked one of my sites. Guess what? The site is gone. No longer there. This directory originally had thousands of sites listed with links that all used the hijack hack. Not sure if they were found out and moved or if they just went out of business. Too bad :-)
One thing to note, though (and I found this out the hard way), do not think that if you are hijacked that the offenders have done this deliberately. Here is where you can get in trouble. There seems to be some off-the-shelf type or downloadable code for directory builders that utilize php that have an apparent problem with this. The links seem to be automatically built this way, unbeknownst to whoever is using these codes. So before you flame or defame, please make sure it was deliberate before you end up with egg on your face. That happened to me.
Anyway, keep up the fight people. Just thought I would chime in and give my 2 cents worth.
The site must be a commercial site that pays a few wages and that a reasonably big expenditure had been invested in that site.
great some poor people lose their jobs and/or investment!
They must also agree that after the attack they may never regain their ranking in google and that we cannot be held responsible for any damages of any sort.
lose their jobs permanently!
They should accept this offer because I see no reason that they should not.
You have made some great posts, I agree with most of what you say in your other posts but this is maybe going a bit far.
you cannot seriously expect a company to let you bankrupt them just to prove a point.
I know all this is frustrating (not to mention monetary loss) but IMHO killing off some poor company's website is going over the top.
try and find a way to bring G to its knees instead of some innocent company.
if you are talking about hijacking a site that is doing it to YOU then I totally agree.
just my thoughts
[edited by: diddlydazz at 5:36 pm (utc) on Mar. 11, 2005]
There seems to be some off-the-shelf type or downloadable code for directory builders that utilize php that have an apparent problem with this. The links seem to be automatically built this way, unbeknownst to whoever is using these codes.
webdude, my client is in the market for a dir script… can you or someone else list these offending scripts so unknowing webmasters can assess their situation & those purchasing scripts are aware of this issue? Or are these deliberate script modifications to knowingly exploit the 302 loopholes?
[edited by: soquinn at 6:04 pm (utc) on Mar. 11, 2005]
You are 100% correct. It is against my principle to knowingly do it in cold blood. But the post just explains what could possibly be at stake.
Many sites are going down and the only difference is that they have had no warning. Imagine the lady that approached me a couple of months ago. She used to sell dolls at her website, average 10 a week times £10.00 profit on each. Small income from adwords..... All now gone, her site dropped from number 8 for her top keyword to total oblivion. I warned the offending sites, which ran 2 go=php based scripts that caused duplicate content of her index page, to deactivate their links so that I could use the google URL-CONSOLE... They promptly removed the links. I did not mention that I knew one of them also had a meta refresh to her index page. I then contacted the webmaster and warned him that a dos script was pointing at his site and he had 5 minutes to prove to me that nothing at his site contained the lady's URL. I told him to check his raw logs and he would see his entire html pages had already been spidered. He then removed the meta refresh.
I ask you, the guy removed the redirect but had to be warned again to remove the meta refresh. The lady's site is still in oblivion. What person outside SEO capability would have known what to do? Her site was totally at the mercy of the sites causing her redirects. I admit, she reciprocated links with them. That makes their trick even dirtier.
Perhaps we may think of a way to bring this thread to google's attention.