Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. The companies that hold your website info in their databases feed your page snippets, without your permission, to vast numbers of these skyscraper sites. A carefully adjusted PHP-based redirection script, which issues a 302 redirect to your site and includes an affiliate click checker, then goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
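For anyone without a header interrogation tool to hand, here is a minimal sketch of the idea in Python. It is not any particular tool's method: the function names, the User-Agent string and the regular expression are all my own assumptions. It fetches a URL without following redirects, then reports whether the response is an HTTP redirect or a page carrying a meta refresh.

```python
import http.client
import re
from urllib.parse import urlsplit

# Assumption: http-equiv appears before content inside the tag, which is the
# common ordering; a tag written the other way round is not matched.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*?'
    r'content\s*=\s*["\']?\s*\d+\s*;\s*url\s*=\s*([^"\'>\s]+)',
    re.IGNORECASE,
)

REDIRECT_CODES = (301, 302, 303, 307, 308)


def classify_response(status, headers, body):
    """Return (kind, target) describing how a response forwards the visitor."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if status in REDIRECT_CODES and "location" in lowered:
        return ("http-%d" % status, lowered["location"])
    match = META_REFRESH.search(body)
    if match:
        return ("meta-refresh", match.group(1))
    return ("none", None)


def fetch_without_following(url, timeout=10):
    """Fetch url once; http.client never follows redirects on its own."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Hypothetical User-Agent, only so the request identifies itself.
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", "replace")
        return resp.status, dict(resp.getheaders()), body
    finally:
        conn.close()
```

Usage would be along the lines of `classify_response(*fetch_without_following(suspect_url))`, where `suspect_url` is the link you found pointing at your site. A result of `("http-302", ...)` or `("meta-refresh", ...)` naming your own domain is the pattern described above.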
Googlebot and MSNbot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or a server-side redirect, and “BANG”, down goes your site if it has a PageRank below that of the offending site. Your index page is crippled because Googlebot and MSNbot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly apply your URL stripped, without the WWW, making detection harder. This also causes Googlebot to generate another URL listing for your site that can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google of deciding which of your site's two URLs to index higher, which is more often the one with the higher linked PageRank.
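The 301 for the short (non-WWW) URL would normally be set up in the web server's own configuration. Just to show the logic, here is a small Python sketch that decides where a request should be permanently redirected. The hostnames and the function name are placeholders I made up, and the sketch ignores any port number in the URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions: placeholder hostnames, replace with your own.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com"}


def canonical_redirect(url):
    """Return the URL to 301 to if url uses an alias host, otherwise None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALIAS_HOSTS:
        # Keep the path and query so deep links land on the same page.
        return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/",
                           parts.query, parts.fragment))
    return None
```

Whatever serves the site would then answer any request for which this returns a URL with status 301 and that URL in the Location header, so only one version of each address stays in play.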
Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off chance that it strips at least one of the targets. This is compounded by hundreds of other hidden 301 permanent redirects to sites of PageRank 7 or above, again in the hope of stripping a high-PageRank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam. They know it is going on, and Google AdWords is probably the main target of revenue, though I am sure Google does not approve of its AdSense program being used in such a manner.
Many such offending sites have no e-mail contact, a hidden WHOIS and no telephone number. Even if you were to contact them, you would find in most cases that the owner or webmaster cannot remove your links from their site because the feeds come from affiliate databases.
There is no point in contacting GOOGLE or MSN, because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible, and if your site is 3 or 4 then be very alarmed. A skyscraper site need only create child page linking to get PageRank 4 or 5, without the need to strip other sites.
Caution: trying to exclude them via robots.txt will not help, because these scripts are able to change almost daily.
Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F
will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within that time.
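To spot links of that shape in a list of URLs (from your logs or an inurl: search, say), something like the following Python sketch could help. It only decodes the query string and reports any parameter whose value mentions your domain. The function name is mine, and the host in the usage line further down is a placeholder, not the real offending site.

```python
from urllib.parse import parse_qsl, urlsplit


def embedded_targets(url, my_domain):
    """Return (parameter, decoded value) pairs of url that mention my_domain."""
    if "://" not in url:
        # Links are often quoted without a scheme; add one so they parse.
        url = "http://" + url
    query = urlsplit(url).query
    hits = []
    for name, value in parse_qsl(query, keep_blank_values=True):
        if my_domain.lower() in value.lower():
            hits.append((name, value))
    return hits
```

For example, `embedded_targets("new.example.co.uk/goto.php?path=yoursite.com%2F", "yoursite.com")` gives `[("path", "yoursite.com/")]`, since `%2F` decodes to a slash. Note this only tells you the link carries your URL; it says nothing about how the script behind it responds.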
I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So in essence it is a programme in perpetual motion, creating millions of 302 redirects so long as it stays on. As every page is a unique URL, the script will hopefully continue to create redirects and bombard a site that dynamically generates pages containing PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds continually throughout the day and week.
If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. The offending site only needs to return a 404 or a 403 for a few seconds for Google’s URL console to detect what it needs, whether for the site or for the damaging link.
I hope I have been informative and have helped anybody with a hijacked site whose natural revenue has been unfairly hit. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in them denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.
[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]
IF G is TRULY in the process of fixing the problem, after 18 months, there really is no further need to ponder the 'problem'.
>The direct consequence is that ...and no longer get spidered.
Have you TRIED resubmitting them recently?
However, it is my supposition that I had a domain name server glitch back in January and this failure gave the sites redirecting to mine added weight and thus Google saw these sites as my sites. The direct consequence is that my top four sites are now in Google’s supplemental index in a state of perpetual dormancy and no longer get spidered.
Sounds plausible. The site of mine that lost its home page does go down every now and then.
My other site which has had 1 sub page 302'd in the index hasn't gone down as far as I know, though at heavy traffic times it may have problems.
Compliance = (Adwords.budget.profit – company.size/competitor.Adwords.budget)
The backbone is the webmasters who affect the compliance, AdWords or not. If you do evil, everyone should make you go broke; you should not be able to make your competitor go broke. The 302 should only affect pages inside your own domain structure. Otherwise I should borrow $500,000 to invest in 5 “nuts” to bring down all of my competition, and officially be sued for monopoly on selling widgets on the internet due to the software compliance of the major SEs.
Duplicate content should only count if ……they make the rule….
To people at Google: I want to see a listing of all URLs that point to my site, not just the most important ones as “you” think. I also want to see the exact number of pages indexed on my site, all 4714 of them, with the exact URL indexed, the cached copy and the correct time when you were here (not 1969). I want to see all web pages that have my domain name in them and sites that advertise for my domain name, not GEO content delivery. (I just got off the phone with my ex-advertiser. She is in LA, I am in NJ. I see her AdWords, she does not. She calls her NY office, they don’t see it. I say go to Google software to check, she says "Ohh.") I want to know when someone advertises for my domain name in the keywords at all times. I AM DA OWNER, YOU MAKE MONEY.
Do I depend on Google? NO.
Then why would I comply? You come to my site, get content, then say it's good for nothing. Last year we spent more than 2 cents per URL in G*Index.
My 2Cents Again.
But I would really like to see evidence for the "the URL with higher PR wins" theory.
I want to say that some posts here use really confusing terminology like 'syntax'. A syntax is a set of rules describing how to form sentences in a language. Also, I try to avoid the term page because nowadays with all the dynamically generated content and redirects it can become very fuzzy. I prefer the terms URL and content.
Now, if there was something you could do about this as a webmaster, I would know it, and I would have posted it. Not in this thread, but more than a year ago.
In this thread, the best bids so far seem to be:
I personally very much doubt that any of these will fix the problem for a page that is already hit, but OTOH they will positively not damage you, so if you feel like trying them you will lose nothing by doing so.
This can only be fixed at the source of the problem, which is neither you nor the page doing the hijack - the source of the problem is Google, MSN and whatever other engine that mixes this up.
- On most DCs we're back on page one for our site name. Even saw it at #1 on a live google.com search. (A pleasant surprise after a month of pages 3 to 11.) Surrounding results are much more relevant than the junk that had been showing up post-Allegra.
- 302s previously ranking for our site name seem to have been beaten down in the SERPs, but still in the index. (But we're still being topped for inurl:domain.com by a site that's framing our home page...that may be a whole other problem.)
- A number of old URLs that I had 301'd and successfully deleted with the remove URL tool have found their way back into the index (hmm...rollback?), though mostly as URL-only or supplemental results.
I am in dispute with Alexa about 2 links of theirs in Google that contain my URL.
One is a download and the other goes to the popularity page for my site, with no description, just a long address. Both are displayed in Google for inurl:, but my actual URL is missing from the results. This is simply not on, and I am sure the existence of the Alexa links above is damaging my site.
As yet, no reply from them. I am sure it is because my URL was picked up via a 302 of theirs that is somehow involved in the links. I cannot work out exactly how, but I am working on it. I will for sure give Alexa one last chance to deactivate those links or I will retaliate with my spare server.
My site should appear in the results for inurl:. It does not, and I cannot believe that all is OK. Its name is unique and I do not expect anything other than my website to appear in an inurl: result.
It is completely useless for those links to be in Google's results, but they are there nevertheless. This only happened because Googlebot deemed they should be.
These are the last 2 links I am trying to remove from Google for one of my sites, which disappeared after being number 1 for many keywords for 2 years when the go-php hijackers killed it. 3 meta refreshes were pointing at the same site, which contained my index page cache. One scraper site closed down pending a fraud investigation. Anything you clicked on at that site made the webmaster money. It had no e-mail and the WHOIS was private. But as you know, it's easy to bypass private WHOIS.
What actions did you take that may have affected this?
None. When Allegra hit I promised I'd sit on my hands for a couple months and just ride it out.
Okay, that's not entirely accurate. I did redesign the home page a couple weeks ago, though for non-SEO purposes. But the SERP changes I'm seeing for our site name search are pretty wholesale...much more shuffling going on than I could possibly affect with any on-page changes.
Thanks for the braying colloquium in syntax. Perhaps you can enlighten us as to how one would exemplify a multifarious procedure like I described in great detail in layman’s terms so that everybody understood it as best as possible.
You took a colossal thread and picked out a single flaw, and you thought that the terminology was more requisite than the content, which many others thought was a revelation and some regarded as a startling exposure of the processes that cause a problem to their sites.
I should perhaps contribute less in a thread that contains ridicule at others' expense. This sort of thing really is uncalled for, and you seem to expect that full computer jargon would be understood by the many people that read that post.
In an attempt to make Joe Public understand the procedure, there seems to be a small flaw in terminology in your eyes. Do you honestly think that all who read the post were Unix or Apache server engineers, with a pedantic code of ethics and dogmatic vocabulary principles?
Most who read it were people interested in finding an answer to their problems. They were website owners, webmasters etc., not pontificating erudite gurus of paradigm.
It is possible. There is no doubt about it. Many things can cause havoc.
But I stress that no matter how angry we are we must stay as cool as a cucumber.
I was threatened with legal action by a giant affiliate company if I carried out my threat to cause massive 302's to their entire network of webpages 6 levels deep.
They removed all redirects pointing to one of my sites and a truce was established.
Need I say more? All it took was a threat describing how I intended to do it. They acted very fast to remove the meta refresh and 302 redirect to my site from their client's website; the client was the one that actually made the script changes.
Extra credit for style points as well. I like the explanation you gave for the regular folks just fine. I participated ad nauseam on this topic about a year and a half ago, calling this the biggest issue nobody wanted to talk about, and as I remember Claus was about the first senior here to acknowledge this was an issue. I am glad to see this issue aired out, but I also remember saying the exact same thing maybe a year ago.
[edited by: idoc at 2:36 am (utc) on Mar. 11, 2005]
Forgot to mention.
It is actually against the law to cause damage of any sort to a large company.
I guarantee you this. Call a large company tomorrow and tell them you are going to cause 302 redirects to their websites and that they should inform their IT department to explain the consequences and what 302 means.
You will be warned of legal action if you proceed.
>It is actually against the law to cause damage of any sort to a large company.
The issue here would be mens rea, the intention to cause harm. If the site we made was generated automatically, like Alexa's is, and the issue in question can't even be proven conclusively to exist, I can't imagine this would be illegal. Simply linking to a site with a 302 redirect certainly isn't illegal; Google is currently doing it on their SERPs.
How about a new non-profit company called <insert name> that webhosting services have to join to become a member of? Once those webhosts are members of said company they, without question, receive top rankings in SERPs over any non-member company for any search engine that wants to be labeled as giving <insert name> guaranteed results. Of course any search engine or webhost could opt out of this service with no repercussion. (This would also include companies that host their own sites; they would have to be certified just like the webhosting companies.)
Now anybody who wants to host a website on one of these new certified webhosting companies has to pay a mandatory $50 fee per domain to the non-profit company. The fee goes into an escrow account, and after either one year or when they cancel hosting they have this money returned. On the other hand, if they are caught making malicious websites they lose the $50 and are kicked off any and all certified webhosting companies for one month, and when they come back they will have to pay a $500 deposit. Of course if a webhost is warned that one of its customers is doing malicious things and they don't kick them off, then they simply lose their certification.
I'm not talking about monopolizing the internet. Search engines could simply give people a choice to receive only certified results or a mix of both.
Too many people work too damn hard to create their websites and content to have it simply taken away from them. If any Yahoo, MSN or Google reps come across this post, understand this: whichever one of these three were to set up a system like this that worked, I would forget about the other two real quickly unless they joined in. I would also be much happier to see my advertising dollars being used to help promote good content from the people who actually created it and not some scum who is stealing other people's hard work. These 302 and China problems would all be a thing of the past. We would all talk about the crazy days of the web when people were even afraid to mention their website online for fear of being attacked, hacked or hijacked.
Many thanks for your kind words. I take inspiration from your comments.
Yes, I am trying to help whoever I can to understand what is going on with this problem. I have seen website after website disappear in google for no reason other than this.
There was a pattern that stood out like a sore thumb. Long-established commercial sites suddenly drop, with no real explanation other than all having many strange long URLs with their own URLs trailing in the link. Many of these links have no title or description. They exist with the lost website's index page as the Google cache, and either meta refresh to the lost site or frame it.
I ask you, in the name of fairness: some of the websites I saw this happen to were small commercial websites that went bankrupt. Or at the very least a disheartening feeling besets the owner or confused webmaster.
Google must sort this out.
Indeed, you are correct, it does not appear to be illegal to do a 302 redirect to whatever website or page you want. Provided you do not disclose you are doing it.
Knowing the harm it could cause, I feel guilty to point these redirects to sites that do not belong to me.
Try it. I bet you will have a sense of guilt pointing a 302 deliberately at a site you do not own. And the slim chance that you can destroy that site will wear you down as a conscientious person. The act will be against your principles and you will not be able to sleep knowing you pointed a gun at someone innocent.
How about the damage others are causing to SMALL companies... is that NOT against the law?
If I were going to retaliate, I'd start with a Google subsidiary or something they own a controlling stock interest in. That way, not only will they fully understand the ramifications without need for lengthy explanations, but they'll also have a vested interest in FIXING it quick AND no one to blame but themselves!
As far as another company taking legal action against you, I think you'd have a good argument that it wasn't your fault at all, but rather G's. You didn't write the SE algorithm. I don't recall seeing any law against using 302.