| 5:45 pm on Nov 9, 2012 (gmt 0)|
|automatically all those files are blocked |
All which files?
| 5:04 am on Nov 10, 2012 (gmt 0)|
You are a Preferred Member here, so you definitely have more knowledge than me. With robots.txt I understand only one thing: use it only if you want to block something; otherwise delete it. Try it.
| 7:22 am on Nov 10, 2012 (gmt 0)|
|I also added to my robots.txt files my email address ( is that useful, I am afraid google passes PR to the email address ) |
as well as a .pdf ( is it also useful )
Say what now?
robots.txt is for naming directories on your site that you don't want to have crawled by good robots. Bad robots don't read robots.txt -- or they do read it and head straight for the listed directories -- so you have to 403 them.
I'm guessing there are certain directories used internally by Joomla that it doesn't want to have crawled, so it comes with its own robots.txt to go with its own .htaccess. This presumably overwrites your own pre-existing robots.txt, so I hope you kept a backup.
| 9:07 am on Nov 10, 2012 (gmt 0)|
Joomla's basic robots.txt automatically blocks the images directory (or certainly did until very recently), so if you want images indexed you will need to remove that line from the file.
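If you want to check what a given robots.txt actually blocks before and after removing a line, here is a rough sketch using Python's standard urllib.robotparser. The two robots.txt bodies are shortened, made-up examples (not the exact file Joomla ships), and the helper name `allowed` and the bot name "ExampleBot" are just placeholders.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a rule-following bot may crawl `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Illustrative only: a cut-down file that blocks the images directory.
WITH_IMAGES_BLOCKED = """\
User-agent: *
Disallow: /administrator/
Disallow: /images/
"""

# The same file with the images line removed.
IMAGES_LINE_REMOVED = """\
User-agent: *
Disallow: /administrator/
"""

if __name__ == "__main__":
    for label, body in [("blocked", WITH_IMAGES_BLOCKED),
                        ("line removed", IMAGES_LINE_REMOVED)]:
        print(label, allowed(body, "/images/logo.png"))
```

Disallow rules are matched as path prefixes, so "Disallow: /images/" covers everything under that directory but not a sibling path such as /images-old/.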
| 2:36 pm on Nov 10, 2012 (gmt 0)|
Here are the files
| 12:04 pm on Nov 11, 2012 (gmt 0)|
The Disallow: directive matches URLs, left to right, that are to be excluded from crawling by "well-behaved bots".
If you have any content on URLs matching those paths that you want indexed, then you should change your robots.txt file accordingly.
However, excluding a URL from being crawled does not prevent the URL itself from being indexed; it only prevents the content from being indexed.
If you don't want either to appear in the index, you must allow crawling of the URL and provide a noindex signal, such as the meta robots noindex element for HTML documents or the X-Robots-Tag HTTP response header for other content types.
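For non-HTML files such as PDFs, the header approach could look something like the sketch below. It is a plain WSGI wrapper in Python, only meant to show the idea; on a typical Joomla/Apache setup you would more likely set the header in the server configuration. The names `add_noindex` and `demo_app` are made up for this example. HTML pages would instead carry <meta name="robots" content="noindex"> in their head.

```python
def add_noindex(app, suffixes=(".pdf",)):
    """Wrap a WSGI app so responses for matching paths carry X-Robots-Tag: noindex."""
    suffixes = tuple(suffixes)

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")

        def start_with_header(status, headers, exc_info=None):
            # Only tag the file types we want kept out of the index.
            if path.lower().endswith(suffixes):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_header)

    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that answers every request with a tiny body."""
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"demo"]


application = add_noindex(demo_app)
```

Remember that the URL must stay crawlable (not listed under Disallow) or the bot will never see the header.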
| 11:14 am on Nov 12, 2012 (gmt 0)|
I specialize in Joomla.
You can just leave that file as is. I have dozens of sites that rank very well with the default robots.txt.