Forum Moderators: goodroi

basic idea of robots.txt

         

moi britphile

10:47 am on Aug 24, 2005 (gmt 0)

10+ Year Member



I still have some questions about robots.txt after browsing and reading some of the threads here. Well, my question is...

Say I upload robots.txt to my root directory, public_html,
and in my robots.txt
I have:
User-agent: *
Disallow: /

As I understand it, this is meant to keep all robots from looking at the files in my root directory. Does that also cover all the folders inside my root directory?

thanks

Angelis

10:49 am on Aug 24, 2005 (gmt 0)

10+ Year Member



Yes, it means disallow everything.

If you only wanted to block certain folders, you would need to do something like this...

User-Agent: *
Disallow: images/
Disallow: db/

moi britphile

2:20 pm on Aug 24, 2005 (gmt 0)

10+ Year Member



Sorry, but what is this for?

User-Agent: *
Disallow: images/
Disallow: db/

Since you said this

User-agent: *
Disallow: /

disallows everything, why do I need those other lines? Wouldn't this be enough?

thanks.

Angelis

3:17 pm on Aug 24, 2005 (gmt 0)

10+ Year Member



If you wanted to block only the images folder, you would use what I put above. It depends on what you want to block.

Lord Majestic

3:21 pm on Aug 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



User-Agent: *
Disallow: images/
Disallow: db/

Please, please, for your own sake, do not forget the initial /, i.e.:

User-Agent: *
Disallow: /images/
Disallow: /db/

It's important because every URL path starts with /, and a robot checks whether a given URL path starts with whatever the Disallow line contains. If you leave out the first /, bots will legitimately conclude that the URL is not disallowed.
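To make the prefix rule concrete, here is a minimal Python sketch of that check. The function name is_disallowed and the sample paths are made up for illustration, and the sketch deliberately ignores Allow lines, wildcards and per-bot sections that some crawlers also support.

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow value.

    Simplified model of the prefix check: an empty Disallow value
    blocks nothing, so it is skipped.
    """
    return any(rule and url_path.startswith(rule) for rule in disallow_rules)


# Rules without the leading slash never match, because every URL path
# begins with "/".
broken_rules = ["images/", "db/"]
fixed_rules = ["/images/", "/db/"]

path = "/images/logo.png"
print(is_disallowed(path, broken_rules))       # False: "images/" is not a prefix
print(is_disallowed(path, fixed_rules))        # True: "/images/" is a prefix
print(is_disallowed("/db/backup.sql", ["/"]))  # True: "/" is a prefix of every path
```

The last line also shows why "Disallow: /" covers every folder: each path on the site starts with /.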

The robots.txt validator here may not catch this; I think it checks syntax rather than the actual logic. A proper validator should ask not just for robots.txt, but also for a LIST OF URLS TO CHECK AGAINST IT!
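A rough version of that kind of check can be put together with Python's standard urllib.robotparser module. This is only a sketch: the helper name check_urls, the bot name ExampleBot and the example.com URLs are assumptions for illustration, and other crawlers may interpret a file differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, urls, user_agent="ExampleBot"):
    """Map each URL to True (may be fetched) or False (disallowed)."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines, so no
    # network access is needed.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


ROBOTS_MISSING_SLASH = """\
User-agent: *
Disallow: images/
Disallow: db/
"""

ROBOTS_WITH_SLASH = """\
User-agent: *
Disallow: /images/
Disallow: /db/
"""

urls = [
    "http://www.example.com/images/logo.png",
    "http://www.example.com/index.html",
]

for label, text in [("missing slash", ROBOTS_MISSING_SLASH),
                    ("with slash", ROBOTS_WITH_SLASH)]:
    for url, allowed in check_urls(text, urls).items():
        print(label, url, "allowed" if allowed else "blocked")
```

Running both files against the same URL list makes a missing slash visible right away, since the image URL should only come back as blocked for the second file.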

moi britphile

3:23 am on Aug 25, 2005 (gmt 0)

10+ Year Member



Hey, thank you guys.
But I checked my website stats. Why do those robots still come after I've uploaded robots.txt? :S There was only one robot on one of my domains, and now I've got 2!

Lord Majestic

6:25 pm on Aug 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



why do those robots still come after i've uploaded robots.txt?

There could be a few reasons. One is that some search engines don't fetch robots.txt often, so it will take some time before your changes are noted. Another could be that robots.txt is not in the correct place. Check that you can retrieve it at a URL like: http://www.example.com/robots.txt
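The location check can also be scripted. The following Python sketch builds the expected robots.txt address from any page URL on the site and, optionally, tries to download it. The function names robots_url and fetch_robots are made up for this example, and www.example.com stands in for your own domain.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(any_page_url):
    """Return the robots.txt URL at the root of the same site."""
    # Joining with an absolute path replaces whatever path the page had.
    return urljoin(any_page_url, "/robots.txt")


def fetch_robots(any_page_url, timeout=10):
    """Download robots.txt and return (HTTP status, text).

    urlopen raises urllib.error.HTTPError if the file is missing (e.g. 404),
    which would suggest it was not uploaded to the site root.
    """
    with urlopen(robots_url(any_page_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


print(robots_url("http://www.example.com/some/folder/page.html"))
```

Calling fetch_robots with your own site's address needs network access; if it raises an error or returns something other than your rules, the file is probably not where robots expect it.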

moi britphile

10:38 am on Aug 28, 2005 (gmt 0)

10+ Year Member



Hey, thank you guys. I'll give it some time to take effect. Hope it works properly... =)