After eight years - all our results changed to only the URL

     

Kres7787

10:56 pm on May 11, 2009 (gmt 0)

5+ Year Member



Hi,

On Sunday Google appears to have crawled our entire site and crippled all our results by showing them as URLs only. We're a PR5 site that has been running for 8 years, so we have been around for a long time. This has never happened before.

Somebody mentioned a "design error" that can cause that. But we haven't made any changes that could be even remotely related.

Our search engine traffic dropped to roughly half of what it was. We were getting about 65,000 uniques per day, and yesterday we had 28,900, which is really a bad thing, isn't it... :)

If anybody has any words of comfort or any suggestions, I would really appreciate it.

What is also funny is that all our results are still ranking near the top for all our keywords. But since they show only as a URL, without a page title or text, people seem to be avoiding them. :(

Thank you,

tedster

11:14 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



A url-only result usually means that the site has prohibited the url from being indexed - but there are backlinks in play that make Google want to list it anyway.

Have you checked your robots.txt file? And if the site assembles the content dynamically, have you checked the robots meta tag to make sure that it doesn't say "noindex"?
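For anyone who wants to script the meta tag check: below is a minimal sketch using only Python's standard library. The class and function names (`RobotsMetaFinder`, `has_noindex`) are made up for illustration, and it assumes you already have the page's HTML as a string. It only looks at the generic `robots` meta tag, not at bot-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        content = attr.get("content") or ""
        # The content value is a comma-separated list, e.g. "noindex, follow".
        self.directives.extend(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def has_noindex(html_text):
    """Return True if the page asks robots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in finder.directives or "none" in finder.directives
```

Run it against the HTML your server actually sends out, since a dynamic site may add the tag at render time even when the templates look clean.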

[edited by: tedster at 4:05 am (utc) on May 12, 2009]

g1smd

11:16 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Check the robots.txt file and meta robots tags for errors.

Does the site do any redirects (like non-www to www, and so on)?

*** But we haven't made any changes that could be even remotely related. ***

What *did* you change? Maybe not intentionally, but perhaps you did change something that mattered.

Kres7787

12:22 am on May 12, 2009 (gmt 0)

5+ Year Member



Thanks for your replies.

We haven't really changed anything in the structure for at least a year. Our robots.txt file:

User-Agent: *
Disallow: /

When do you think we can expect a deep recrawl, considering the main page is PR5?

[edited by: encyclo at 12:27 am (utc) on May 12, 2009]
[edit reason] replaced link with robots.txt contents [/edit]

Kres7787

12:23 am on May 12, 2009 (gmt 0)

5+ Year Member



The site is dynamic, with cached content.

Redirects... we might have some in place, but again those are years old. I don't think Google could have picked up any old links. And the pages that are showing as URLs are certainly not from the old system.

Not all results are showing like that, but a lot are. At least 50%.

Is there anything wrong with our robots.txt file?

Thanks,

[edited by: Kres7787 at 12:26 am (utc) on May 12, 2009]

encyclo

12:28 am on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Your robots.txt file bans all robots - it must have been changed by you or your developers. You should remove it as soon as possible - for the moment, just delete the robots.txt file!

Kres7787

12:32 am on May 12, 2009 (gmt 0)

5+ Year Member



I just checked. It was edited a week ago! WHAT THE HELL!

So can you please explain what that line does? Completely rejects ALL bots?

Thanks a lot man.

encyclo

12:34 am on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yes, it blocks everything (it disallows access to the root-level folder - / - and everything below it). Delete the file completely right now; you can create a new one later at your leisure. :)

Google will work it out soon enough. However, it might take a few days to a few weeks for things to get back to the previous state.
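To see the effect of those two lines for yourself, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the example.com URLs are placeholders, and Python's parser is not Google's, so treat this as an illustration of the rule rather than a guarantee of how any particular crawler behaves.

```python
from urllib import robotparser


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if this robots.txt text permits user_agent to fetch url."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The file as posted in this thread: every robot, everything under "/".
BLOCK_ALL = "User-Agent: *\nDisallow: /\n"

# A robots.txt with no rules at all.
NO_RULES = ""

print(is_allowed(BLOCK_ALL, "http://www.example.com/"))
print(is_allowed(BLOCK_ALL, "http://www.example.com/some/deep/page.html"))
print(is_allowed(NO_RULES, "http://www.example.com/some/deep/page.html"))
```

The first two calls report the URLs as blocked, because every path starts with "/". With no rules, the same URL is allowed.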

Kres7787

12:38 am on May 12, 2009 (gmt 0)

5+ Year Member



I just did, and I have also emailed our developers about this.

We could've been hacked, as I don't believe our guys would do that.

I can only say three words, DAMN and THANK YOU :)

Could that be the reason why we were just getting URLs as results? As tedster said, could some backlinks have kept those URL-only results visible?

Kres7787

12:42 am on May 12, 2009 (gmt 0)

5+ Year Member



I wish there was a paid deep crawl :)

Kres7787

12:49 am on May 12, 2009 (gmt 0)

5+ Year Member



I believe I know what happened...

I also found a robots.txt file on our test domain, where we're developing new stuff. I can understand the need for such a robots setup on that test domain.

BUT perhaps somebody on the development team accidentally published the robots.txt from the test domain to the live site.

Who's going to laugh at this? That move could cost us some 900,000 uniques if it takes 30 days for a recrawl to occur. We'll know tomorrow morning. They're sleeping :)

fishfinger

10:32 am on May 12, 2009 (gmt 0)

10+ Year Member



The exact same thing happened once on a project I worked on (luckily I wasn't the culprit).

New item for the testing-to-live rollout checklist!
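One way to turn that checklist item into an automated gate: a rough sketch, assuming your rollout script can read the robots.txt text it is about to publish. The function name `blocked_urls`, the default agent list, and the example URLs are all hypothetical. It relies on Python's `urllib.robotparser`, which may not interpret every file exactly as Googlebot does.

```python
from urllib import robotparser


def blocked_urls(robots_txt, must_be_crawlable, user_agents=("Googlebot",)):
    """List (agent, url) pairs that this robots.txt text would block.

    must_be_crawlable is a list of URLs you expect search engines to reach,
    for example the home page and a few key landing pages.
    """
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        (agent, url)
        for agent in user_agents
        for url in must_be_crawlable
        if not parser.can_fetch(agent, url)
    ]


# Example: the test-domain file from this thread fails the check.
test_domain_file = "User-Agent: *\nDisallow: /\n"
key_pages = ["http://www.example.com/", "http://www.example.com/articles/1"]

problems = blocked_urls(test_domain_file, key_pages)
if problems:
    print("Do not publish this robots.txt:", problems)
```

A rollout script could abort whenever the returned list is not empty, so a test-domain file can never reach the live site unnoticed.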

pervezalam

10:52 am on May 12, 2009 (gmt 0)

5+ Year Member



The same thing happened to me some time ago. Check your robots.txt; it has banned crawling of your website.

Go to your webmaster account and check the website status there. You will also find an option there to generate a robots.txt file. Or use this in your robots.txt file for some time:

User-Agent: *
Allow: /

Thanks

jdMorgan

12:42 pm on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The "Allow:" directive is *not* supported by all robots.

If the previous robots.txt file contained only the directives posted above, then all that is necessary is to delete it or to replace it with a blank file just to prevent 404 errors on robots.txt fetch attempts.

If the "Allow:" directive is used, it should be used only in a robots policy record addressed to the robots which support it (check the individual robots' "help" pages for support info).

Jim

Kres7787

12:58 pm on May 12, 2009 (gmt 0)

5+ Year Member



Thanks again, people. We're working on this now. It's the first time that we're seriously working on it.

Thanks for all your input.

g1smd

6:19 pm on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



A robots.txt disallow means that Google will show those as URL-only entries. As soon as someone says "URL only" here, that's the big clue as to where to look first. :)

I do not rely on using robots.txt files.

The development server is locked down with .htpasswd so that nothing gets in and has a look round.

the_nerd

6:32 pm on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Even if you haven't changed anything in a year, things might change anyway: AdSense all of a sudden started to show charity ads on my site. Why? There were some syntax errors in my robots.txt file that Google seems to have ignored for years. They obviously changed their parser a couple of weeks ago - and I was cut off for a while before figuring out what went wrong.

Kres7787

8:28 pm on May 12, 2009 (gmt 0)

5+ Year Member



I mentioned this in the other thread. Our Google earnings went down from a steady $130 per day to $15 per day, DUE to this robots.txt crap-up. So not only has our traffic taken a major blow, but so have our earnings. Lovely.

So be careful people. Learn from idiots like us. ;)

bwnbwn

9:07 pm on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Kres7787, with "65,000 uniques per day" and only generating $130.00 per day, I would also look very hard at the way you're presenting the ads; there is something wrong.

leadegroot

12:24 pm on May 13, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"I would also look very hard at the way you're presenting the ads; there is something wrong."

Not necessarily - it could just be a low paying niche.

Kres7787

2:09 pm on May 13, 2009 (gmt 0)

5+ Year Member



Yes, a somewhat lower CPM. We also don't show AdSense on all pages, due to some restrictions with ad agencies.

[edited by: tedster at 4:12 pm (utc) on May 13, 2009]

 
