
Site slows down during crawl

Can robots.txt prevent googlebot from executing global.asa

         

jgar

10:16 am on Mar 3, 2003 (gmt 0)

10+ Year Member



We use global.asa (ASP) to record certain data in an access database when visitors arrive.

This normally works fine, but when Googlebot crawls the site, it is as if the number of active users climbs significantly, slowing down the site.

In the medium term we will be moving to SQL instead of Access, which should solve the problem.

However, in the short term, is there something we can add to the robots.txt file to prevent global.asa being called when spiders crawl the site?

Thanks for any advice

Jgar

jpjones

10:28 am on Mar 3, 2003 (gmt 0)

10+ Year Member



This thread might be better moved over to the Microsoft Related - .NET and ASP forum....

Robots.txt cannot control anything ASP-related - all it does is tell robots which URLs they may request and which they may not.

Instead, I'd try it this way -

In your global.asa file, check the User-Agent value in the request's server variables to see if it contains the string "Googlebot". If it doesn't, assume it's a normal (human) visitor and process the global.asa code as normal.

This check could also be extended to cover the spiders run by the other search engines...

HTH,
JP

Krapulator

10:35 am on Mar 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What he said!

This code should do it (although I'm a little bit pissed so you may need to tweak it). Add this to your global.asa around the code that writes data to Access:
'----------------------------------------------
agent = Request.ServerVariables("HTTP_USER_AGENT")

' Only record visitor data when the agent does NOT look like Googlebot.
' InStr with vbTextCompare makes the match case-insensitive.
If InStr(1, agent, "googlebot", vbTextCompare) = 0 Then

    'code which adds required data to Access

End If
'----------------------------------------------

jgar

10:37 am on Mar 3, 2003 (gmt 0)

10+ Year Member



Thanks for the replies.

How about listing global.asa in the robots.txt file as a file not to access?

Krapulator

10:45 am on Mar 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That won't work, because global.asa isn't directly accessed by the bot. The server simply executes it when a new session begins.
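For context: robots.txt can only disallow URLs that a crawler requests directly. Since global.asa is never requested as a URL (the server runs it internally), a rule like this sketch would have no effect on the problem:

```
User-agent: *
Disallow: /global.asa
```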

jgar

12:08 pm on Mar 3, 2003 (gmt 0)

10+ Year Member



Thanks Krapulator

OK, so I guess that means we will have to detect the spider in the global.asa file.

Of course, there are many other spiders that can slow down the site. Writing separate code to detect each one seems a bit lengthy.
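One way to avoid a separate If for every crawler is to keep a list of user-agent substrings and loop over it. Here is a minimal sketch of that logic in JavaScript (the thread's code is VBScript, and the substring list below is illustrative, not a complete list of spiders):

```javascript
// Illustrative substrings seen in common crawler user-agent strings
// (extend this from a published spider list -- it is NOT exhaustive).
const SPIDER_SUBSTRINGS = ["googlebot", "slurp", "msnbot", "teoma", "crawler", "spider"];

// True when the user agent looks like a known crawler.
function isSpider(userAgent) {
  const ua = String(userAgent || "").toLowerCase();
  return SPIDER_SUBSTRINGS.some((s) => ua.includes(s));
}

// Only record visitor data for agents that do not match the list.
function shouldLogVisit(userAgent) {
  return !isSpider(userAgent);
}
```

The same loop-over-a-list idea translates directly to VBScript in global.asa, with one array of substrings checked via InStr.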

Anyone know of a good list of spiders?

jpjones

12:15 pm on Mar 3, 2003 (gmt 0)

10+ Year Member



Using the site search tool and clicking on "Spiders" on the right brings up a list at:

Search Engine World Spider List [searchengineworld.com].

JP