Google answers - site removed from index

Are google answers always good?

         

halloerstmal

10:12 pm on Jun 8, 2003 (gmt 0)

10+ Year Member



I posted a question at google.answers, category:computers.
My site was removed from the index although I did not change it. It has not appeared again for the last two weeks.
The answer I got was that my robots.txt file was wrong.
It looks like this:
User-agent: *
Disallow
I was told to remove the disallow line.
I don't believe that.
www.robotstxt.org/wc/faq.html clearly mentions that this is the right way to set up the robots.txt.
Does google require a different robots.txt?

[edited by: halloerstmal at 10:22 pm (utc) on June 8, 2003]

Chris_R

10:18 pm on Jun 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am not that big of an expert on robots.txt, but I don't recall seeing:

Disallow.

Usually there is a : /

after it, but that is if you don't want google to index your page.

If you do want google to index your page - I believe there is nothing wrong with just having no robots.txt file [I know this works - and have never heard you were supposed to have a file for some sort of standard].

In other words - get rid of any robots.txt file and google will crawl your site.

Hope that helps

bird

10:24 pm on Jun 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



halloebenfalls [webmasterworld.com] ;)

User-agent: *
Disallow.

Maybe you just mistyped this here, but you should check whether that entry looks the same on your server. The dot at the end of the second line needs to be a colon (= "Doppelpunkt"):

User-agent: *
Disallow:

And yes, "Disallow:" with nothing following after the colon does mean "everything allowed".

<added>
Ok, so you just edited your post, and now only have the "Disallow" with neither dot nor colon. This is invalid syntax, and may scare some robots away. In contrast, my second example above will have no adverse effects on Googlebot.
</added>
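
For anyone who wants to check the difference on their own machine, here is a minimal sketch using Python's standard urllib.robotparser module. That module is only one parser and says nothing certain about how Googlebot or any other robot behaves; the URL is a placeholder and the helper name allowed() is made up for this example.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


URL = "http://www.example.com/index.html"  # placeholder URL

# Empty Disallow value: nothing is excluded.
allow_all = "User-agent: *\nDisallow:\n"
# A lone slash excludes the whole site.
block_all = "User-agent: *\nDisallow: /\n"
# No colon at all: invalid syntax, which this parser silently skips.
malformed = "User-agent: *\nDisallow\n"

print(allowed(allow_all, URL))  # True
print(allowed(block_all, URL))  # False
print(allowed(malformed, URL))  # True (line ignored by this parser)
```

Python's parser happens to ignore the colon-less line, but that is a property of this one implementation. Other robots may react differently to invalid syntax, so the safe fix is still to add the colon.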

DerekH

10:35 pm on Jun 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Two of my sites are hosted on "free" ISPs that have no Robots.txt file and don't allow me to create one.

They're both fully indexed.

Robots.txt is not a compulsory file. I suggest you abandon it until you need selective indexing.

DerekH

halloerstmal

10:39 pm on Jun 8, 2003 (gmt 0)

10+ Year Member



Thanks.
I must have been sitting in front of my computer for too long.
This is what my robots.txt looks like:

User-agent: *
Disallow:

I don't think this is the reason for my site not being indexed. But that's what the guy at google.answers told me.
And I think that's wrong.

rfgdxm1

10:50 pm on Jun 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The guy at google.answers was just a total moron.

coosblues

4:18 am on Jun 9, 2003 (gmt 0)

10+ Year Member



"The above format is the common and acceptable standard for allowing all spiders access to the site. We've recently learned (2002-06-09 that the practice of having just a User-agent: * and Disallow: without a trailing forward slash may not be recommended. Some spiders may incorrectly interpret this as blocking all content. You'll notice that we disallow the _private, css, and javascript folders in the below example and do not recommend an empty file."

This "quote" is from a well-respected web site, and if you'd like the URL, sticky me for it.

anallawalla

4:25 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



What/where is google.answers?

Powdork

5:04 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you decide to get rid of robots.txt, then the only thing you need to worry about is whether your server is returning a proper 404 header.
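
A quick way to see which status your server actually sends is sketched below, using only Python's standard library. The host is a placeholder, and the function names fetch_status() and is_proper_404() are made up for this example.

```python
import urllib.error
import urllib.request


def fetch_status(url: str) -> int:
    """Return the HTTP status code the server sends for `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses;
        # the status code is available on the exception.
        return err.code


def is_proper_404(status: int) -> bool:
    """True when a missing robots.txt is reported as a real 404."""
    return status == 404


# Example call (needs network access; the host is a placeholder):
# status = fetch_status("http://www.example.com/robots.txt")
# print(status, is_proper_404(status))
```

If this reports 200 for a file you have deleted, the server is probably sending a custom error page with a success status, and a robot may then try to read that page as if it were your robots.txt.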

shaadi

5:14 am on Jun 9, 2003 (gmt 0)

10+ Year Member



What/where is google.answers?

anallawalla, try searching for halloerstmal's URL; you will get the Google Answers link [answers.google.com] :)

SlyOldDog

7:38 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>The guy at google.answers was just a total moron.

Try to help someone and that's the thanks you get, eh, rfgdxm1?

Powdork

7:45 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try to help someone and that's the thanks you get, eh, rfgdxm1?

Aren't they paid to give correct answers?

SlyOldDog

7:47 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, yes, but for $5 a pop I guess the disclaimer must be pretty watertight.

shaadi

8:01 am on Jun 9, 2003 (gmt 0)

10+ Year Member



Try to help someone

Say - try to mislead...

Chris_D

8:52 am on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Robots.txt implements the Robots Exclusion Protocol.

If you have no intention of actually excluding any robots - you don't need one.

i.e., if you don't care who indexes your site - don't have a robots.txt. If you do care - ban the bad bots by name and directory.

gstewart

9:13 am on Jun 9, 2003 (gmt 0)

10+ Year Member



If you have no intention of actually excluding any robots - you don't need one.

On the other hand, Chris_D, having one keeps my Error Stats more manageable...

bird

12:07 pm on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try to help someone and that's the thanks you get, eh, rfgdxm1?

People are paid at google.answers for research, and to provide links to relevant resources. In this case, the respondent did neither; his answer was given off the top of his head, and it was wrong. Well, he was partly right about some broken robots getting confused no matter what you serve them. But that fact has nothing at all to do with a question about Googlebot.

In other words, the respondent didn't do what he was supposed to do for his money, and should have gotten a zero star rating for that. halloerstmal was very generous to accept that answer. Luckily, other people followed up with the correct information.

mfishy

1:21 pm on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I used Google answers once for a coding question that I could not find the answer to anywhere. They provided me with completely irrelevant info.

I got a refund - think it was $4 :)

2_much

4:42 pm on Jun 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've used Google Answers for a few random questions - list of banks in Berlin, list of newspapers in Australia, etc. - and each time I have gotten really great answers and information. I've always found myself tipping them.

Ever since, I've always thought that if I had more time I would try to become a researcher. It seems like a great way to make a little bit of cash while learning pretty cool information.

TeofenGL

4:51 pm on Jun 9, 2003 (gmt 0)

10+ Year Member



if a plain
User-agent: *
Disallow:
file has the potential to cause problems, maybe try
User-agent: *
Disallow: /somejunk/
(where somejunk is the name of a dir that doesn't exist)
-- this would satisfy the bots that may have issues with the thing, while allowing the rest of the site to be crawled...
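
A rough check of the dummy-directory idea is sketched below, using Python's standard urllib.robotparser module. This shows how one parser reads the file, not a guarantee about any particular robot; the host name is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Exclude a directory that does not exist on the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /somejunk/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URLs on a hypothetical host.
home_ok = parser.can_fetch("Googlebot", "http://www.example.com/index.html")
junk_ok = parser.can_fetch("Googlebot", "http://www.example.com/somejunk/page.html")

print(home_ok)  # True
print(junk_ok)  # False
```

Only paths under /somejunk/ are excluded, so as long as that directory stays unused the rest of the site remains open to crawling.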

philipp

10:52 pm on Jun 13, 2003 (gmt 0)

10+ Year Member



Well, by now the Google Answers thread in question [ [answers.google.com...] ] has been answered to user Halloerstmal's satisfaction.
That's what the clarification request is for: if you have doubts, you can ask. Then there's the comments section, which pointed to another possible problem (duplicated content). The comment was by Robertskelton-ga, who has answered 409 questions with an average rating of 4.61 out of 5 stars. I won't defend every single answer given by any of us at Google Answers, but in general there's splendid service, great experts, and sometimes they work very hard for those $5 questions.