
Spiders/Bots and Javascript

Spiders without javascript indexing our "javascript required" page


Advantex

4:00 pm on Sep 25, 2002 (gmt 0)

10+ Year Member



We have a website that requires javascript, and script/noscript tags are in place to forward visitors who don't have javascript enabled to a "javascript required" page.
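The forwarding is done the usual way, roughly like this (the filename here is just an example, not our real URL):

    <noscript>
      <!-- javascript is off, so forward the visitor to the notice page -->
      <meta http-equiv="refresh" content="0;url=/jsrequired.html">
    </noscript>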

Certain search engines/bots are being sent to and are subsequently indexing our "javascript required" page, instead of the home page, apparently because the spider/bot is not "javascript enabled".

Is there an easy way, perhaps through the robots.txt file or some similar means, to "exclude" the bots from the Javascript requirement we have established for our regular visitors? Is there a way to easily detect that a visitor is actually a bot or spider, and send them all to a specific page to index instead?

Thanks for any help you can provide,

m

Dreamquick

4:16 pm on Sep 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No SE/bot will ever be "javascript enabled" - it's just not practical to parse and execute javascript while crawling and maintaining a massive database.

First question - can you stop it indexing the javascript required page using robots.txt? Yes you can.
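Assuming the page lives at a URL like /jsrequired.html (swap in your real path), the robots.txt entry is just:

    User-agent: *
    Disallow: /jsrequired.html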

Second question - can you make the SE/bot index the real homepage?

Well, that depends. Bots don't do javascript for a number of reasons, so unless your site features real links (as opposed to javascript ones) the answer is no - bots follow links, so no links equals no spidering.

You might be able to add a link in the noscript section to get that "real" homepage spidered, but as mentioned earlier, javascript links will not get followed, so all that extra link might accomplish is getting one extra page spidered!

Have you also considered that lots of people don't like javascript-heavy sites, and that being told to turn javascript back on when they have deliberately turned it off might put them off browsing your site at all?

- Tony

Advantex

4:30 pm on Sep 25, 2002 (gmt 0)

10+ Year Member



The links are not javascript based -- they are simple anchor tags as usual. But the site in question has a fairly specific purpose, and requiring javascript has not posed a problem. Certain features of the site simply require javascript capability on the client side. At the same time, we do want the public parts of the site to be properly indexed.

I do know how to prevent the "javascript required" page from being indexed... I basically answered my own question on that.

To rephrase what I am looking for...

All pages of the site require javascript, and they all redirect the user to the "javascript required" page if they don't have js turned on. I guess I need to modify my <noscript> section so that it only applies to human visitors, not SE/bots.

It sounds like I answered my own question again... I need to dynamically adjust the <noscript> behavior based on CGI.HTTP_USER_AGENT. Any suggestions? I really don't want to maintain a database of bot names just for this purpose...
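Something like this (in CFML, since we're on ColdFusion) is what I'm picturing - though the bot substrings below are just examples, and that list is exactly the kind of thing I'd rather not maintain:

    <!--- crude spider check: suppress the noscript redirect for known bots --->
    <cfset isBot = FindNoCase("googlebot", CGI.HTTP_USER_AGENT)
                   OR FindNoCase("slurp", CGI.HTTP_USER_AGENT)
                   OR FindNoCase("teoma", CGI.HTTP_USER_AGENT)>

    <cfif NOT isBot>
      <noscript>
        <meta http-equiv="refresh" content="0;url=/jsrequired.html">
      </noscript>
    </cfif>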

Thank you VERY MUCH for your fast reply to my close-to-off-topic post... I do appreciate it.

m

Dreamquick

6:01 pm on Sep 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How are you detecting that javascript is *not* enabled?

- Tony

Advantex

6:37 pm on Sep 25, 2002 (gmt 0)

10+ Year Member



We are using the <noscript> tag directly after a <script> block that detects whether the user has cookies enabled.
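Simplified, the script block looks something like this (the real check is a bit more involved):

    <script type="text/javascript">
      // only runs when javascript is on: drop a test cookie, then read it back
      document.cookie = "cookietest=1";
      var cookiesOn = document.cookie.indexOf("cookietest=1") != -1;
    </script>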

Dreamquick

8:46 pm on Sep 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ahh that explains it...

Why not try adding a link into the noscript, along the lines of:

"you don't appear to have javascript enabled... <rest of text goes here> ...however if we were wrong and you do have javascript enabled, feel free to move on to the **front page**"

(where **front page** is a link to the real homepage)

As you have noticed, spiders will read the NOSCRIPT content, so your link to the real homepage should then get spidered...
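In markup terms that's just (the URL here is an example):

    <noscript>
      You don't appear to have javascript enabled...
      however, if we got that wrong, feel free to move on to
      <a href="/index.html">the front page</a>.
    </noscript>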

- Tony