Robots.txt vs. JavaScript for Robot Control

     
5:48 pm on Aug 5, 2008 (gmt 0)

New User

5+ Year Member

joined:Dec 11, 2006
posts: 14
votes: 0


I got a message in Google Webmaster Tools the other day about one of my sites having too many URLs, encouraging me to look at the URL structure and to disallow the duplicate URLs in robots.txt. However, they provided a list of problematic URLs, and approximately 60% of those URLs were already disallowed in robots.txt. I used the robots.txt analysis tool in Webmaster Tools to verify that everything is configured correctly, and it is.
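
To give a concrete picture, the relevant part of the robots.txt looks roughly like this (the paths here are simplified, made-up examples, not my real URL structure):

    User-agent: *
    # block the parameter variants that create duplicate URLs
    Disallow: /*?sessionid=
    Disallow: /*?sort=
    Disallow: /catalog/print/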

What should I do here? Do I need to code every URL that I don't want Googlebot to crawl in JavaScript? That's a pretty old-school approach, but I've noticed a few competitors doing it recently.
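
To be clear, by coding URLs in JavaScript I mean replacing a plain anchor with something roughly like this (just an illustration, not code from my site):

    <!-- a normal link that any crawler can pick up -->
    <a href="/catalog?sort=price">Sort by price</a>

    <!-- the "JavaScript link" alternative: no href attribute for a crawler to follow -->
    <span onclick="window.location.href='/catalog?sort=price'">Sort by price</span>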

Has anyone done any tests on using robots.txt vs. JavaScript for robot control? What works better?

6:40 pm on Aug 5, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster, Top Contributor of All Time, 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


If those URLs are disallowed in your robots.txt, then you can safely disregard the WMT messages. As for the comparison between robots.txt and JavaScript, robots.txt wins hands down. It's the official standard, whereas Google has never promised that they won't use links in JavaScript. In fact, these days they sometimes do use javascripted links for discovery.
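
For example, even a javascripted link like the one below (made-up path) still exposes the URL as a plain string in your page source, so there's nothing stopping a crawler from extracting and fetching it:

    <span onclick="window.location.href='/duplicate-page'">link text</span>

A Disallow line in robots.txt, on the other hand, is a directive Google has publicly committed to honor for crawling.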