Two things come to mind: proxy servers and/or scraper sites. Check out this thread and see if that lines up with everything you're seeing.
Hijacked Rankings - and Proxy Servers [webmasterworld.com]
Apart from what tedster says, it is very likely that you have been banned. Have you done anything blackhat (not just recently, but ever)? Maybe G just caught you.
Thanks for the replies.
I checked out the threads about hijacking but it doesn't seem to fit what I am seeing. As I mentioned before, site:example.com, doesn't return anything at all. inurl:example.com returns only two listings, both of which are harmless "Alexa type" website rating sites. (The site gets good marks from them, if you were wondering.) Also, searching for unique phrases found on my site doesn't turn up anything at all.
One odd thing is that link:example.com only returns one result. It used to return a lot more than that, though I don't have an exact number. Yahoo has the link count at about 300, many of which are from "good" sites, like DMOZ and Wikipedia.
If I search for my domain name on Google, I get more than 100 listings of sites linking to me. Nearly all of which have nothing but good things to say about the site. I haven't solicited links for this site in more than two years. It's just gotten them naturally once the site saw some exposure.
As to blackhat stuff, I've never been into it and I have no interest in it either. Mostly, I just try to write articles that people will like and leave it at that.
This is so irritating because I know that people like this site. Numerous times I have run across people saying things like "This is the best site on the net to find..." The site has also been mentioned in a fair share of smaller publications and on the radio. This probably doesn't seem relevant but I just want to stress that this site is really well liked and not the least bit blackhat.
Got any other ideas? Anybody else have this happen recently?
This might be a serious situation rather than a short term bug. Do you have a Webmaster Tools account? If so, do you see anything in those reports that helps you make sense of what the problem might be? If not, I urge you to open one so you have a validated channel to communicate with Google.
Robots.txt problem perhaps?
forget about link:example.com
try the www version
I do have a Webmaster Tools account. The strangest thing is that it says my index page was last successfully accessed on Jan 29th, although I can see from my logs that it was crawled a couple of days ago.
The diagnostics tab shows no errors of any kind.
The stats page is quite strange though as it still shows the top phrases I used to rank well for.
Under the "What Googlebot sees" section it says in the phrases section, "In external links to your site Data is not available at this time." The keywords section is spot on though. Content and encoding also seem fine.
Crawl stats offers nothing meaningful.
The Links section has 0 "Pages that link to yours" and 0 "Pages with internal links".
The sitemap section seems fine, showing that it was last accessed 20 hours ago.
Under "crawl rate" in the tools section, it only shows data up until early September. (Is that normal?) One interesting thing is that the last few entries on the "number of pages crawled per day" graph are all below the average, and there is a big spike in time spent downloading a page, all the way up to 1527 for the final entry on the graph. That's 3 times longer than the average of 500.
When I check the robots.txt file it says, "Detected as a directory; specific files may have different restrictions."
And, link:www.example.com also returns the same one link that link:example.com returns.
The same thing happened to my site. It has been de-indexed for over 6 months, though Googlebot visits every other day. How would you know if your site has been "proxy hijacked"? As for Google Webmaster Tools, there have been no messages as to why, and when I went to submit a reinclusion request, the "choose site" part was disabled.
We had this happen to a site about 2 years ago, same stats as you in Webmaster Tools etc. but still smoked in Google. I asked someone I know at Google about it and she could not figure it out, and she is a real-life friend who would not lie. The site is very popular via word of mouth so I could care less and have left the site be to see if Google ever corrects itself, which by the way it hasn't.
Well, certainly the last couple of posts don't sound very promising but I do appreciate you guys sharing your experiences.
I did think about making a reinclusion request. The site in question is available in the drop down box. (Does that mean the site has definitely been banned?) But I have to acknowledge, "I believe this site has violated Google's quality guidelines in the past," which I do not believe to be true.
The only thing I can possibly think of that might be seen by Google as being "bad" is that I used WordTracker to get the ideas for most of the articles. I would use it to dig up topics and then write articles about the topics I found. It wasn't for SEO purposes, it was just to find out what people wanted to know about the general subject. Could that be seen as wrong by Google on some level?
I am having a hard time accepting that this could be the problem though. I feel that the site would easily pass a manual review.
At this point, I'm wondering if I shouldn't just buy a new domain name, put the site on it and do a redirect from the old domain name. Maybe that wouldn't help much though. It would certainly be months before the site got anywhere near its previous traffic levels, if I did.
Dial_d, someone suggested taking a look at robots.txt.
If you have weird stuff in robots.txt you *will* drop out of Google. I know this 100% first-hand.
And...your robots.txt, as you described it, looked very weird. Post the entire contents of your robots.txt file, making it clear where the actual content of the file begins and ends so we don't confuse it with comments from yourself.
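While waiting for the file to be posted, a quick local sanity check is possible. This is only a rough sketch: it feeds the text of a robots.txt to Python's standard `urllib.robotparser` and asks whether a crawler identifying as "Googlebot" may fetch a path. The sample rules and the helper name `googlebot_can_fetch` are made up for illustration, and Python's parser is not guaranteed to interpret every directive the way Googlebot itself does (it does not handle wildcard patterns, for instance), so treat a "blocked" answer as a hint to look closer rather than proof.

```python
from urllib.robotparser import RobotFileParser


def googlebot_can_fetch(robots_txt, path="/", agent="Googlebot"):
    """Return True if the given robots.txt text allows `agent` to fetch `path`.

    Hypothetical helper: parses the text locally, makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Illustrative rule sets only, not anyone's real file.
BLOCKING = """User-agent: *
Disallow: /
"""

OPEN = """User-agent: *
Disallow: /cgi-bin/
"""

if __name__ == "__main__":
    print("blocking file allows /:", googlebot_can_fetch(BLOCKING))
    print("open file allows /:", googlebot_can_fetch(OPEN))
    print("open file allows /cgi-bin/x:", googlebot_can_fetch(OPEN, "/cgi-bin/x"))
```

Paste the real file contents in place of the sample strings and test the index page plus a few article URLs.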
Dial-d PM'd me his robots.txt and it looks fine to me, if a little on the long side :-) - certainly nothing to cause a problem.
My only other idea is to check your site has not been hacked into, and pages changed.
Same thing that happened to me as I've described in another post. Apparently at the same time - around last day(-s) of Sept.
Statistics show similar strange stuff - reports of the last crawl being a long time ago, but that is not true according to my logs, and a report saying basically "we have no info at all on your site, whatsoever" - but that is not true either. If the site is not in the index, how come the "How Google sees the site" section is accurate?
Anyway, I will now start moving my zapped sites over to new site names (similar but not identical) and new hosting accounts (= IPs) and new owners etc etc. If Google means what it says - that it doesn't have any index of my sites - then there shouldn't be a duplicate problem either (closing down the old sites one by one slightly ahead of the change, just in case).
Reinclusion is not an option as it seems unlikely to happen and waiting 2 years to find out seems silly since there was a good Adsense revenue I would like to have back. I had a low PageRank but was indexed well anyway (niche sites) so moving old content into new domains seems to be a way out of the problem.
We'll see if moving over is painless or not...
I have gone over in my mind everything I could possibly think of that could have somehow gotten me banned, but the simple fact is that I have done nothing to deserve being flat out banned. (I checked closely to see if I had been hacked, but no, everything is fine.)
It's frustrating because I don't know what to do about it without hearing from someone at Google. Anything at this point would just be a stab in the dark. Never-the-less, I plan to wait until December maybe and if nothing changes move everything to a new domain name.
Do any of you guys think getting a high PR link to my site might help? It gets crawled daily but maybe a PR 7-8 (or two) might cause G to change their minds? I feel certain that this is an algo thing and not a manual ban.
> I checked closely to see if I had been hacked, but no everything is fine.
Some hacks can be clever enough that you will never figure them out. Have you used the Firefox Live HTTP Headers extension to see if your site redirects when the referrer is a search engine?
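For anyone who would rather script that check than click through a browser extension, here is a minimal sketch using only the Python standard library. It requests the same URL twice without following redirects, once with no referrer and once pretending the visitor came from a Google results page, and compares the status code and `Location` header. The function names, the example referrer string and the user agent value are all assumptions for illustration. A hack that keys on IP address or on cookies would not be caught by this.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so they can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx then surfaces as an HTTPError


def probe(url, referer=None, user_agent="Mozilla/5.0"):
    """Return (status, Location header) for one request. Hits the network."""
    opener = urllib.request.build_opener(NoRedirect)
    headers = {"User-Agent": user_agent}
    if referer:
        headers["Referer"] = referer
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def looks_conditional(direct, from_search):
    """True when the two probes disagree, which deserves a closer look."""
    return direct != from_search


def check(url):
    """Compare a direct visit with one that claims to come from a search."""
    direct = probe(url)
    from_search = probe(url, referer="http://www.google.com/search?q=example")
    return looks_conditional(direct, from_search), direct, from_search
```

Calling `check("http://www.example.com/")` returns whether the two responses differ along with both results. Repeating it with a Googlebot-style user agent string is a reasonable second pass.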
And if you happen to be banned, what would be the effect of a 301 from such site to a new domain? What if the bad karma is also passed along?
Can you be more specific.
I used the Firefox extension on some of our sites and I've found some super weird HTTP headers popping up:
GET /safebrowsing/update?client=navclient-auto-ffox&appver=188.8.131.52&version=goog-white-domain:1:23,goog-white-url:1:371,goog-black-url:1:[a number here],goog-black-enchash:1:[a number here] HTTP/1.1
Which looks like the google query for phishing sites. I am not a security expert so I don't know.
What kind of trick could be used?
I used the Firefox addon you mentioned but it did not turn up anything suspicious. Aside from that, I checked the file size and last modified dates of all the files on the server against the copy I have on my hard drive. Everything matched up perfectly.
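File sizes and modified dates are a good first pass, but both can be preserved by someone who alters a file carefully, so comparing content hashes is a stricter version of the same check. A minimal sketch, assuming the live site has first been downloaded into a local mirror folder (the function names and the two-folder layout are hypothetical):

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_trees(local_root, mirror_root):
    """Compare a trusted local copy against a mirror pulled from the server."""
    local = tree_digests(local_root)
    mirror = tree_digests(mirror_root)
    shared = local.keys() & mirror.keys()
    return {
        "changed": sorted(name for name in shared if local[name] != mirror[name]),
        "only_local": sorted(local.keys() - mirror.keys()),
        "only_mirror": sorted(mirror.keys() - local.keys()),
    }
```

Anything listed under `changed` or `only_mirror` is a file on the server that does not match the trusted copy. Note that this only covers files on disk; it says nothing about server configuration or database content.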
As to redirecting bad karma...
Of course that could happen (hell it might even be the most likely thing) but I don't want to dump all my existing links and abandon all the users who are still using the site.
If I do put up a new domain, I would have to remove the old one because I wouldn't want to risk having a duplicate site in Google's index if the original site ever did come back. A redirect is really the only viable option that I can see, since I do not want to abandon my current users.
One thing I suppose could be a possible explanation for this is that I use a form on my feedback page rather than my email address. Every now and then I catch a spammer trying to exploit it and use it to send out spam emails. As far as I know, they have never been successful but it's hard to know for sure. Maybe if someone did manage to send out a bunch of spam emails using that form it could have caused a ban?
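For reference, the mechanics of that kind of redirect are simple: every request to the old domain is answered with status 301 and a `Location` header pointing at the same path on the new domain, so existing links and bookmarks keep working. The sketch below shows the idea with Python's built-in `http.server`. The new domain `www.example.org` and the port are placeholders, and on a real host this would normally be done in the web server's own configuration rather than a script like this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

NEW_ORIGIN = "http://www.example.org"  # placeholder for the new domain


def new_location(path, origin=NEW_ORIGIN):
    """Build the redirect target, keeping the requested path and query string."""
    if not path.startswith("/"):
        path = "/" + path
    return origin.rstrip("/") + path


class MovedHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a permanent redirect to the new origin."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", new_location(self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirector until interrupted. The port is an arbitrary choice."""
    HTTPServer(("", port), MovedHandler).serve_forever()
```

Redirecting page to page, rather than sending everything to the new home page, is what lets deep links from other sites land on the matching article.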
I reckon that explanation is pretty far fetched though. Mostly because the site is not blacklisted anywhere that I can find and I use a pretty good filter that doesn't send an email if it catches something, instead it dumps it into a db. Although, it appears to the user like everything worked as normal.
> Maybe if someone did manage to send out a bunch of spam emails using that form it could have caused a ban?
What has Google search to do with email spam? These are independent channels. Help me find the relationship by which the webspam team would punish a site for email spam, please. I know of a site banned like yours that wasn't using any forms to communicate, had no open holes and no databases that could be exploited, but was using JavaScript to hide the contact email address. But I can't see this either as a reason for an eternal, most unpleasant ban.
Anyway, let us know whether the 301 solves your problem or not. Good luck!
My thought was that if the site did get blacklisted by some big name anti-spamming group that perhaps Google would discover that and ban the site.
However, I just spent the last couple of hours going through all the major ones I could find and everything was green. It was a long shot to begin with but at this point long shots are about all I got.
I guess I will try sending Google another email about it. Maybe if I keep at them, they will answer my questions. As someone mentioned earlier, I doubt that even someone working for Google could explain it. They might be able to fix it but I doubt they could explain it unless I have developed the biggest blind spot in the world.
Really, there aren't *that* many legitimate reasons for a popular site to suddenly get banned.
Ref redirecting and bad Karma;
There is another way of doing it, I've tried that with fairly good result.
Replace your old site with one single page. On this page, link to many authorities in your field (20-30 or so). Put these links at the bottom of the page, make them same color as background if you wish, you are not bothered by being listed by Google here. In fact you couldn't care less if they ban the site again.
In the upper visible screen area of the page - put one big JPEG picture saying something like:
"Site has moved - read all about exciting new gizmos HERE"
Link the JPEG to your new site.
User sees "Oops, click here to get to the site".
Google sees "link page with lots of stuff, one of them being your new site". Dilutes bad karma. Do it with CSS so the authority links come first in the HTML but last (visibly) to a human user.
Variations: Make it more pictures so many other pictures link to some authority site. Make sure these pictures are very bland, maybe even squares of same color as background. This makes your linked pic stand out less to Google.
Variation no 2 (slight risk): Make this page REALLY black hat. Use everything forbidden, who cares. Add another 50 of these pages, all totally black hat. All linking to a page on your new site but also to a LOT of other stuff. You will get 50-80% click-through from users who arrive via a search engine and probably near 100% from old users accessing the old site directly. Risk is that someone complains to Google who then decides to ban the new site as well.
Hey, thanks for the suggestions.
I'm not sure if I would do any of that but I do appreciate you sharing your knowledge with me. It certainly gave me a lot to think about.
It sounds like you have some experience with situations like these. Do you have any suggestions as to what could have caused this to happen?
I have the same problem with one of my sites. Analytics says the last time it was crawled was June, but that is not accurate.
Is your page rank greyed out? The pr on mine is not greyed out which makes me wonder if it is a glitch?
Analytics is saying that I have an error on the sitemap, but honestly is that enough to get it banned?
I am going to look into the hijacking possibility, but if that were the case, I wouldn't have been seeing *any* traffic from Google, correct? I have seen quite a bit up until the past few days. Now we are completely de-listed.
The PR on a few pages is greyed out (I think it has always been like that) otherwise I have seen no changes in the PR of the site in general. Index page and most of internals have the same PR they have had for the last year or so.
I'm not sure if it would show right away though if you lost your PR. Wouldn't we have to wait until the next PR update to know for sure?
Like you, I suspect that this is some kind of glitch and I don't think your sitemap has anything to do with it. Never-the-less, I would fix it if you can. Have you had a sitemap up for very long? I'm sure my problem isn't related to that because I didn't put one up until this problem surfaced.
Do you have a Webmaster tools account? If so, does it show any errors when crawling your site?
I have had the sitemap running for well over a year. I do have a webmaster account and no, there are no errors showing. I am just as baffled as you are right now. I am going to fix the sitemap, and look into the hijacking possibility, but I honestly don't think either of those things is the issue.
Did you take a look at your GWT internal links?
Ever since we had a proxy hijack, many pages never reappeared in GWT. And the latest Google.com site: query shows that many pages have gone totally AWOL again.
It's like we are penalized now for what others did to us a long time ago.
You may be in the same situation. I too went through the process of checking our servers, hired security experts and everything, found nothing.
Internal links in GWT is 0. I assume this is because they have no pages indexed though.