Say I upload robots.txt to my root directory, public_html, and in my robots.txt I have:
User-agent: *
Disallow: /
As I understand it, this keeps all robots from looking at the files in my root directory. Does that also cover all the folders inside my root directory?
Thanks
User-Agent: *
Disallow: images/
Disallow: db/
Please, please, for your own sake, do not forget the initial /, i.e.:
User-Agent: *
Disallow: /images/
Disallow: /db/
It's important because every URL path starts with /, and robots.txt asks the bot to check whether a given URL path starts with whatever the Disallow line contains. If you leave out the first /, bots will legitimately conclude that the URL is not disallowed.
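To make the prefix check concrete, here is a minimal sketch in Python. The function name is_disallowed is made up for illustration, and the sketch deliberately ignores Allow lines, wildcards and per-bot groups that real crawlers also handle; it only shows the "does the path start with the Disallow value" test described above.

```python
def is_disallowed(path, disallow_rules):
    """Return True if the URL path starts with any non-empty Disallow value.

    Simplified illustration only: no Allow lines, no wildcards,
    no User-agent grouping.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


broken = ["images/", "db/"]    # missing the initial /
fixed = ["/images/", "/db/"]   # correct

print(is_disallowed("/images/logo.png", broken))  # False
print(is_disallowed("/images/logo.png", fixed))   # True
print(is_disallowed("/index.html", ["/"]))        # True
```

The last line also shows why Disallow: / covers everything on the site, subfolders included: every path begins with /.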
The robots.txt validator here may not catch this, since I think it checks syntax rather than actual logic. I mean, a proper validator should ask for NOT just robots.txt, but also a LIST OF URLS TO CHECK AGAINST IT!
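As a rough sketch of that kind of check, Python's standard library ships urllib.robotparser, which can be fed the robots.txt text and then asked about individual URLs. The helper name check_urls and the sample URLs are assumptions for illustration, and this parser's behaviour is not guaranteed to match any particular search engine's.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, urls, user_agent="*"):
    """Map each URL to True (may be fetched) or False (disallowed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


broken = "User-Agent: *\nDisallow: images/\nDisallow: db/\n"
fixed = "User-Agent: *\nDisallow: /images/\nDisallow: /db/\n"
urls = [
    "http://www.example.com/images/logo.png",
    "http://www.example.com/index.html",
]

for name, text in (("broken", broken), ("fixed", fixed)):
    for url, allowed in check_urls(text, urls).items():
        print(name, url, "allowed" if allowed else "disallowed")
```

Running both versions against the same list of URLs makes the missing-slash mistake visible, which a pure syntax check would not.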
Why do those robots still come after I've uploaded robots.txt?
There could be a few reasons. One is that some search engines don't re-fetch robots.txt often, so it can take some time before your changes are noticed. Another could be that robots.txt is not in the correct place; check that you can retrieve it at the URL: http://www.example.com/robots.txt
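One way to run that location check from a script is sketched below, using only the Python standard library. The helper names robots_url and fetch_status are made up for illustration. The first builds the robots.txt address for the host of any page URL; the second returns the HTTP status code the server gives for it.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url):
    """Build the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (200 means the file was served)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code


print(robots_url("http://www.example.com/shop/item.html?id=3"))
# http://www.example.com/robots.txt
```

Calling fetch_status(robots_url(...)) with your own site's address should give 200 if the file is where bots look for it; a 404 suggests it was uploaded to the wrong folder.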