On Sunday Google appears to have crawled our entire site and crippled all our results by showing them as bare URLs. We're a PR5 site that has been running for 8 years, so we have been around for a long time. This has never happened before.
Somebody mentioned a "design error" that can cause that. But we haven't made any changes that could be even remotely related.
Our SE traffic went down to about 50% of what it was. We had been getting about 65,000 uniques per day, and yesterday we had 28,900 uniques, which is really a bad thing, isn't it... :)
If anybody has any words of comfort or any suggestions, I would really appreciate it.
What is also funny is that all our results are still ranking near the top for all our keywords. But since they show only as a URL, without a page title or text, people seem to be avoiding them. :(
Thank you,
Have you checked your robots.txt file? And if the site assembles the content dynamically, have you checked the robots meta tag to make sure that it doesn't say "noindex"?
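For the meta tag check, something like the following rough Python sketch can be run against the HTML a page actually serves. The names `RobotsMetaFinder` and `has_noindex` are made up for illustration, and it only looks at `<meta name="robots">` and `<meta name="googlebot">` tags, not at HTTP headers:

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in robots/googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.append(directive)


def has_noindex(html: str) -> bool:
    """Return True if any robots meta tag in the page says noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```

Feed it the page source as the server sends it, since a dynamically assembled template may add the tag on some pages only.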
[edited by: tedster at 4:05 am (utc) on May 12, 2009]
We haven't really changed anything in the structure for at least a year. Our robots.txt file:
User-Agent: *
Disallow: /
When do you think we can expect a deep recrawl, considering the main page is PR5?
[edited by: encyclo at 12:27 am (utc) on May 12, 2009]
[edit reason] replaced link with robots.txt contents [/edit]
Redirects... we might have some in place, but again those are years old. I don't think Google could have caught any old links. And the results that are showing as bare URLs are certainly not from the old system.
And not all results are showing like that, but a lot are. Some 50% at least.
Is there anything wrong with our robots.txt file?
Thanks,
[edited by: Kres7787 at 12:26 am (utc) on May 12, 2009]
Google will work it out soon enough; however, it might take a few days to a few weeks for things to get back to the previous state.
We could have been hacked, as I don't believe our guys would do that.
I can only say three words, DAMN and THANK YOU :)
Could that be the reason why we were just getting URLs as results? As Tedster said, could some backlinks have kept those URL-only results visible?
I found the same robots.txt file on our test domain, where we're developing new stuff. I can understand the need for such a robots setup on that test domain.
BUT perhaps somebody on the development team accidentally published the robots.txt from the test domain to the live site.
Who's going to laugh at this? That move could cost us some 900,000 uniques if it takes 30 days for the recrawl to occur. I'll know tomorrow morning. They're sleeping :)
Go to your webmaster account and check the website status there. You will also find an option there to generate a robots.txt file. Or use this in your robots.txt file for some time:
User-Agent: *
Allow: /
Thanks
If the previous robots.txt file contained only the directives posted above, then all that is necessary is to delete it, or to replace it with a blank file, which just prevents 404 errors on robots.txt fetch attempts.
If the "Allow:" directive is used, it should be used only in a robots policy record addressed to the robots which support it (check the individual robots' "help" pages for support info).
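To see the difference between the three variants discussed in this thread, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allows` and the example.com URL are placeholders, and Python's parser is only an approximation of how any particular search engine's robot interprets the file:

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The three variants mentioned in the thread.
BLOCK_ALL = "User-Agent: *\nDisallow: /\n"   # the file found on the live site
BLANK = ""                                   # an empty robots.txt
ALLOW_ALL = "User-Agent: *\nAllow: /\n"      # the suggested temporary file

URL = "https://www.example.com/some/page.html"  # placeholder URL

for name, text in [("block all", BLOCK_ALL), ("blank", BLANK), ("allow all", ALLOW_ALL)]:
    print(name, allows(text, URL))
```

According to this parser, only the first variant blocks the page; the blank file and the "Allow: /" file both leave it fetchable, which is why a blank file is enough.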
Jim
I do not rely on using robots.txt files.
The development server is locked down with .htpasswd so that nothing gets in and has a look round.