Forum Moderators: Robert Charlton & goodroi
About 2-3 weeks ago, one of my sites was completely de-indexed from Google. site:example.com returns 0 results. Searching for my site's domain name provides results from plenty of sites that link to me but not my own site.
Looking in Google's webmaster tools, it says that my site was last crawled Jan 29th, which is nowhere near correct. I went ahead and submitted a sitemap, but that hasn't changed anything, although Google's bot has been to my site since I submitted it.
Before becoming de-indexed, this site more or less dominated its niche. It's a micro-niche, you might say, and my site is the only authority site to speak of.
Does anybody have any suggestions as to why this might have happened? I have sent G an email about it but have not received a response.
Thanks in advance to anyone who responds.
Hijacked Rankings - and Proxy Servers [webmasterworld.com]
I checked out the threads about hijacking but it doesn't seem to fit what I am seeing. As I mentioned before, site:example.com, doesn't return anything at all. inurl:example.com returns only two listings, both of which are harmless "Alexa type" website rating sites. (The site gets good marks from them, if you were wondering.) Also, searching for unique phrases found on my site doesn't turn up anything at all.
One odd thing is that link:example.com only returns one result. It used to return a lot more than that, though I don't have an exact number. Yahoo has the link count at about 300, many of which are from "good" sites, like DMOZ and Wikipedia.
If I search for my domain name on Google, I get more than 100 listings of sites linking to me. Nearly all of which have nothing but good things to say about the site. I haven't solicited links for this site in more than two years. It's just gotten them naturally once the site saw some exposure.
As to blackhat stuff, I've never been into that and I have no interest in it either. Mostly, I just try to write articles that people will like and leave it at that.
This is so irritating because I know that people like this site. Numerous times I have run across people saying things like "This is the best site on the net to find..." The site has also been mentioned in a fair share of smaller publications and on the radio. This probably doesn't seem relevant but I just want to stress that this site is really well liked and not the least bit blackhat.
Got any other ideas? Anybody else have this happen recently?
The diagnostics tab shows no errors of any kind.
The stats page is quite strange though as it still shows the top phrases I used to rank well for.
Under the "What Googlebot sees" section it says in the phrases section, "In external links to your site Data is not available at this time." The keywords section is spot on though. Content and encoding also seem fine.
Crawl stats offers nothing meaningful.
The Links section has 0 "Pages that link to yours" and 0 "Pages with internal links".
The sitemap section seems fine, showing that it was last accessed 20 hours ago.
Under "crawl rate" in the tools section, it only shows up until early September. (Is that normal?) One interesting thing is that the last few entries on the number of pages crawled per day graph are all below the average and there is a big spike in time spent downloading a page, all the way up to 1527 for the final entry on the graph. That's 3 times longer than the average of 500.
When I check the robots.txt file it says "
Allowed
Detected as a directory; specific files may have different restrictions "
And, link:www.example.com also returns the same one link that link:example.com returns.
I did think about making a reinclusion request. The site in question is available in the drop down box. (Does that mean the site has definitely been banned?) But, I have to acknowledge, "I believe this site has violated Google's quality guidelines in the past." Which I do not believe to be true.
The only thing I can possibly think of that might be seen by Google as being "bad" is that I used WordTracker to get the ideas for most of the articles. I would use it to dig up topics and then write articles about the topics I found. It wasn't for SEO purposes, it was just to find out what people wanted to know about the general subject. Could that be seen as wrong by Google on some level?
I am having a hard time accepting that this could be the problem though. I feel that the site would easily pass a manual review.
At this point, I'm wondering if I shouldn't just buy a new domain name, put the site on it and do a redirect from the old domain name. Maybe that wouldn't help much though. It would certainly be months before the site got anywhere near its previous traffic levels, if I did.
If you have weird stuff in robots.txt you *will* drop out of Google. I know this 100% first-hand personally.
And...your robots.txt, as you described it, looked very weird. Post the entire contents of your robots.txt file, making it clear where the actual content of the file begins and ends so we don't confuse it with comments from yourself.
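One way to sanity-check a robots.txt file before posting it is to run it through a parser and ask whether Googlebot may fetch a given URL. The following is only a minimal sketch using Python's standard `urllib.robotparser`; the helper name `googlebot_allowed` and the two sample files are made up for illustration, and Google's own parser may treat unusual directives differently from this one.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it relies on the standard-library
    parser, which is not guaranteed to match Googlebot's behaviour exactly.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Illustrative file contents (assumptions, not the poster's real file).
SAMPLE_BLOCKING = """User-agent: *
Disallow: /
"""

SAMPLE_OPEN = """User-agent: *
Disallow:
"""

if __name__ == "__main__":
    page = "http://example.com/page.html"
    print("blocking file allows Googlebot:", googlebot_allowed(SAMPLE_BLOCKING, page))
    print("open file allows Googlebot:", googlebot_allowed(SAMPLE_OPEN, page))
```

A single stray `Disallow: /` under `User-agent: *` is enough to make the first check come back negative for every page on the site.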
Statistics show similar strange stuff: a report that the last crawl was a long time ago, which is not true according to my logs, and a report saying basically "we have no info at all on your site, whatsoever", which is not true either. If the site is not in the index, how come "How Google sees the site" is accurate?
Anyway, I will now start moving my zapped sites over to new site names (similar but not identical), new hosting accounts (= IPs), new owners, etc. If Google means what it says, that it doesn't have any index of my sites, then there shouldn't be a duplicate problem either (I am closing down the old sites one by one slightly ahead of the change, just in case).
Reinclusion is not an option as it seems unlikely to happen and waiting 2 years to find out seems silly since there was a good Adsense revenue I would like to have back. I had a low PageRank but was indexed well anyway (niche sites) so moving old content into new domains seems to be a way out of the problem.
We'll see if moving over is painless or not...
It's frustrating because I don't know what to do about it without hearing from someone at Google. Anything at this point would just be a stab in the dark. Nevertheless, I plan to wait until December maybe and if nothing changes move everything to a new domain name.
Do any of you guys think getting a high PR link to my site might help? It gets crawled daily but maybe a PR 7-8 (or two) might cause G to change their minds? I feel certain that this is an algo thing and not a manual ban.
Some hacks can be clever enough that you will never figure them out. Have you used firefox live http headers to see if your site redirects when the referral is a search engine?
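The same check can be scripted instead of done by hand in the browser. The sketch below requests a page twice, once with no Referer header and once with a search-engine Referer, and reports whether the status code or the Location header differs. The names `fetch` and `referrer_sensitive` and the sample Referer URL are assumptions made for this example. A hack that keys on the User-Agent or the visitor's IP address instead of the referrer would not be caught this way.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple

# (HTTP status code, value of the Location header if present)
Response = Tuple[int, Optional[str]]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx response stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError for the 3xx


def fetch(url: str, referer: Optional[str]) -> Response:
    """Fetch `url` once, optionally sending a Referer header."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    if referer:
        request.add_header("Referer", referer)
    try:
        with opener.open(request, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Unfollowed redirects (and other error statuses) end up here.
        return err.code, err.headers.get("Location")


def referrer_sensitive(
    url: str,
    fetch_fn: Callable[[str, Optional[str]], Response] = fetch,
    search_referer: str = "https://www.google.com/search?q=example",
) -> bool:
    """Return True if the response changes when the visit appears to come from a search engine."""
    direct = fetch_fn(url, None)
    via_search = fetch_fn(url, search_referer)
    return direct != via_search
```

Calling `referrer_sensitive("http://example.com/")` on a clean site should normally return False; a True result only means the two responses differed and is a reason to look closer, not proof of a hack.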
And if you happen to be banned, what would be the effect of a 301 from such site to a new domain? What if the bad karma is also passed along?
Can you be more specific?
I used the Firefox extension on some of our sites and I've found some super weird HTTP headers popping up:
GET /safebrowsing/update?client=navclient-auto-ffox&appver=2.0.0.5&version=goog-white-domain:1:23,goog-white-url:1:371,goog-black-url:1:[a number here],goog-black-enchash:1:[a number here] HTTP/1.1
Which looks like the google query for phishing sites. I am not a security expert so I don't know.
What kind of trick could be used?
As to redirecting bad karma...
Of course that could happen (hell it might even be the most likely thing) but I don't want to dump all my existing links and abandon all the users who are still using the site.
If I do put up a new domain, I would have to remove the old one because I wouldn't want to risk having a duplicate site in Google's index if the original site ever did come back. A redirect is really the only viable option that I can see, since I do not want to abandon my current users.
One thing I suppose could be a possible explanation for this is that I use a form on my feedback page rather than my email address. Every now and then I catch a spammer trying to exploit it and use it to send out spam emails. As far as I know, they have never been successful but it's hard to know for sure. Maybe if someone did manage to send out a bunch of spam emails using that form it could have caused a ban?
I reckon that explanation is pretty far fetched though. Mostly because the site is not blacklisted anywhere that I can find and I use a pretty good filter that doesn't send an email if it catches something, instead it dumps it into a db. Although, it appears to the user like everything worked as normal.
What has Google search to do with email spam? These are independent channels. Help me find the relationship by which webspam would punish a site for email spam please. I know of a site banned like yours that wasn't using any forms to communicate and had no open holes and no databases that could be exploited but was using a java script to hide the contact email address. But can't see this either as a reason for an eternal, most unpleasant ban.
Anyway, let us know whether the 301 solves your problems or not. Good luck!
However, I just spent the last couple of hours going through all the major ones I could find and everything was green. It was a long shot to begin with but at this point long shots are about all I got.
I guess I will try sending Google another email about it. Maybe if I keep at them, they will answer my questions. As someone mentioned earlier, I doubt that even someone working for Google could explain it. They might be able to fix it but I doubt they could explain it unless I have developed the biggest blind spot in the world.
Really, there aren't *that* many legitimate reasons for a popular site to suddenly get banned.
There is another way of doing it, I've tried that with fairly good result.
Replace your old site with one single page. On this page, link to many authorities in your field (20-30 or so). Put these links at the bottom of the page, make them same color as background if you wish, you are not bothered by being listed by Google here. In fact you couldn't care less if they ban the site again.
In the upper visible screen area of the page - put one big JPEG picture saying something like:
"Site has moved - read all about exciting new gizmos HERE"
Link the JPEG to your new site.
User sees "Oops, click here to get to the site".
Google sees "link page with lots of stuff, one of them being your new site". Dilutes bad karma. Do it with CSS so the authority links come first in the HTML but last (visibly) to a human user.
Variations: Make it more pictures so many other pictures link to some authority site. Make sure these pictures are very bland, maybe even squares of same color as background. This makes your linked pic stand out less to Google.
Variation no 2 (slight risk): Make this page REALLY black hat. Use everything forbidden, who cares. Add another 50 of these pages, all totally black hat. All linking to a page on your new site but also to a LOT of other stuff. You will get 50-80% click-through from users who arrive via a search engine and probably near 100% from old users accessing the old site directly. Risk is that someone complains to Google who then decides to ban the new site as well.
I'm not sure if I would do any of that but I do appreciate you sharing your knowledge with me. It certainly gave me a lot to think about.
It sounds like you have some experience with situations like these. Do you have any suggestions as to what could have caused this to happen?
Is your page rank greyed out? The pr on mine is not greyed out which makes me wonder if it is a glitch?
Analytics is saying that I have an error on the sitemap, but honestly is that enough to get it banned?
I am going to look into the hijacking possibility, but if that were the case, I wouldn't have been seeing *any* traffic from Google, correct? I have seen quite a bit up until the past few days. Now we are completely de-listed.
I'm not sure if it would show right away though if you lost your PR. Wouldn't we have to wait until the next PR update to know for sure?
Like you, I suspect that this is some kind of glitch and I don't think your sitemap has anything to do with it. Nevertheless, I would fix it if you can. Have you had a sitemap up for very long? I'm sure my problem isn't related to that because I didn't put one up until this problem surfaced.
Do you have a Webmaster tools account? If so, does it show any errors when crawling your site?
Ever since we had a proxy hijack, many pages have never reappeared in GWT. And the latest Google.com site: query shows that many pages have gone totally AWOL again.
It's like we are penalized now for what others did to us a long time ago.
You may be in the same situation. I too went through the process of checking our servers, hired security experts and everything, found nothing.
In reality, any page on the site is directly accessible from any other page. I wonder if that could actually be a contributing factor? Do any of you use a similar navigation system? (Mine uses Javascript and divs to make it possible.)
Google does now "look at" some javascript, but only as text. It does not execute that script -- so if your site's internal linking depends on javascript for the user to see it, then googlebot doesn't really see it either, at least not as links.
Javascript is client-side, executed on the client's end. Turn off javascript and see what your pages look like. That's a lot like what a spider would see.
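To get a rough idea of which links survive without script execution, you can parse the raw HTML and list only the anchors that are physically present in the markup. This is a minimal sketch under the assumption described above, namely that the spider reads the HTML but does not run scripts. `LinkCollector`, `static_links` and `SAMPLE_PAGE` are illustrative names, not part of any Google tool.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags found in the static markup."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html: str) -> List[str]:
    """Return the links a non-script-executing parser would find in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Made-up page: one plain link, one link that only exists after the script runs.
SAMPLE_PAGE = """<html><body>
<a href="/articles/">Articles</a>
<div id="menu"></div>
<script>
var a = document.createElement("a");
a.href = "/hidden/";
a.textContent = "Hidden";
document.getElementById("menu").appendChild(a);
</script>
</body></html>"""

if __name__ == "__main__":
    print(static_links(SAMPLE_PAGE))
```

Only `/articles/` is found in the sample page; the `/hidden/` link is created by the script at run time, so a parser that does not execute JavaScript never sees it as a link.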
Even if it did, when Javascript is disabled the navigation works like a breadcrumb-style navigation.