robots.txt and joomla

     

member22

3:03 pm on Nov 9, 2012 (gmt 0)

5+ Year Member



I use Joomla for my website, and by default all of those files are blocked in robots.txt. Is that good or bad? Should I remove anything, and if so, why?

I also added my email address to my robots.txt file (is that useful? I am afraid Google passes PR to the email address),
a javascript: void (0) entry because I have tabs on my webpage (is that useful?),
and a .pdf (is that useful too?).

Any comments? Does anything need to be changed, or is it OK?

Thank you,

tedster

5:45 pm on Nov 9, 2012 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



automatically all those files are blocked

All which files?

sunnyujjawal

5:04 am on Nov 10, 2012 (gmt 0)



You are a Preferred Member here, so you definitely have more knowledge than me. With robots.txt I understand only one thing: use it only if you want to block something; otherwise delete it. Try it.

lucy24

7:22 am on Nov 10, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



I also added my email address to my robots.txt file (is that useful? I am afraid Google passes PR to the email address),
a javascript: void (0) entry because I have tabs on my webpage (is that useful?),
and a .pdf (is that useful too?).

Say what now?

robots.txt is for naming directories on your site that you don't want to have crawled by good robots. Bad robots don't read robots.txt -- or they do read it and head straight for the listed directories -- so you have to 403 them.

I'm guessing there are certain directories used internally by joomla that it doesn't want to have crawled, so it comes with its own robots.txt to go with its own htaccess. This presumably overwrites your own pre-existing robots.txt, so I hope you kept a backup.

Rasputin

9:07 am on Nov 10, 2012 (gmt 0)

5+ Year Member



Joomla's basic robots.txt automatically blocks the images directory (or certainly did until very recently), so if you want your images indexed you will need to remove that line from the file.
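
If you want to check the effect of that line before editing the live file, here is a rough sketch using Python's standard-library robots.txt parser. The two rule sets are shortened stand-ins for the Joomla file, and the helper name `can_crawl` and the example.com URL are made up for illustration. Real crawlers may interpret edge cases differently from this parser, so treat it as a sanity check only.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Shortened stand-in for a robots.txt that still blocks the images directory.
WITH_IMAGES_BLOCKED = """\
User-agent: *
Disallow: /administrator/
Disallow: /images/
"""

# The same rules with the /images/ line removed.
WITHOUT_IMAGES_LINE = """\
User-agent: *
Disallow: /administrator/
"""

# Hypothetical image URL, used only for this check.
IMAGE_URL = "http://www.example.com/images/logo.png"

if __name__ == "__main__":
    print(can_crawl(WITH_IMAGES_BLOCKED, IMAGE_URL))
    print(can_crawl(WITHOUT_IMAGES_LINE, IMAGE_URL))
```

The first check should come back blocked and the second allowed, because Disallow works as a prefix match on the URL path.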

member22

2:36 pm on Nov 10, 2012 (gmt 0)

5+ Year Member



Here is the file:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/

phranque

12:04 pm on Nov 11, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The Disallow: directive matches URLs, left to right, that are to be excluded from crawling by "well-behaved bots".
If you have any content on URLs that match those paths and you want it indexed, you should change your robots.txt file accordingly.
However, excluding a URL from being crawled does not prevent the URL itself from being indexed; it only prevents the content from being indexed.
If you don't want either to appear in the index, you must allow crawling of the URL and provide a noindex signal, such as the meta robots noindex element for HTML documents or the X-Robots-Tag HTTP response header for other content types.
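
As a rough illustration of the header option for non-HTML files such as a .pdf, the sketch below picks the extra response header by file extension. The function name `robots_headers` and the sample paths are invented for this example; how you actually attach the header depends on your server or application, which this does not cover. For HTML pages the equivalent is a `<meta name="robots" content="noindex">` element in the page head.

```python
import posixpath


def robots_headers(path: str) -> dict:
    """Return extra HTTP response headers for a request path.

    Assumption for this sketch: every PDF on the site should stay out
    of the index, and everything else is left alone.
    """
    extension = posixpath.splitext(path)[1].lower()
    if extension == ".pdf":
        # Tells compliant crawlers not to index this response.
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    print(robots_headers("/docs/brochure.pdf"))
    print(robots_headers("/index.html"))
```

The header is only seen if the crawler is allowed to fetch the URL, so the same path must not also be disallowed in robots.txt.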

Oimachi2

11:14 am on Nov 12, 2012 (gmt 0)

10+ Year Member



I specialize in Joomla.

You can just leave that file as is. I have dozens of sites that rank very well with the default robots.txt.
 
