Any idea how to log robots without log access and without server-side scripting?

...they do not request images, .JS files, etc.

         

martin_k

12:15 pm on Feb 6, 2003 (gmt 0)

10+ Year Member



How to log robots without log access and without server-side scripting?

I have a "www.server.site" that can do all the logging for the hosted site. But the scripts on that server have to be called by a browser viewing the hosted site. Usually this is done by calling the script through an IMG tag or a SCRIPT tag.
Robots, as a rule, do not care about IMG tags and SCRIPT tags, so they never trigger the logging.
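
For illustration, the IMG-tag approach described above can be sketched roughly as follows. This is only a minimal sketch, assuming the logging server can run a small Python WSGI application; the names `beacon_app`, `HITS`, and `PIXEL` are made up for this example and are not part of any real stats package. The hosted page would embed something like an IMG tag pointing at this script, and every client that actually fetches the image leaves a record.

```python
import time

# A tiny transparent 1x1 GIF, returned so the IMG tag on the hosted page renders nothing visible.
PIXEL = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00"
    b"\x02\x02D\x01\x00;"
)

# Hypothetical in-memory log; a real setup would write to a file or database.
HITS = []


def beacon_app(environ, start_response):
    """WSGI app: record who requested the image, then serve the pixel."""
    HITS.append({
        "time": time.time(),
        "ip": environ.get("REMOTE_ADDR", "-"),
        # The hosted page can pass its own URL in the query string.
        "page": environ.get("QUERY_STRING", ""),
        "agent": environ.get("HTTP_USER_AGENT", "-"),
        "referer": environ.get("HTTP_REFERER", "-"),
    })
    start_response("200 OK", [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL))),
        ("Cache-Control", "no-store"),  # ask browsers to re-request on every view
    ])
    return [PIXEL]
```

The app could be served with `wsgiref.simple_server.make_server` from the standard library. The limitation is the one stated above: a client that never requests the image, which is the usual behaviour of robots, never reaches `beacon_app` and so never shows up in `HITS`.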

I had a couple of ideas:
1. ...to put a small link on the homepage of the hosted site to a page containing a frameset, where one frame would be a certain page from the www.server.site. The www.server.site would then keep an Apache custom log for that page.

2. ...to put a small link on the homepage of the hosted site to a page with a meta refresh to a certain page from the www.server.site. The www.server.site would then keep an Apache custom log for that page.

Regarding option 1, I have read that only some robots follow frames.
Regarding option 2, I have read that search engines can penalize you if the meta refresh delay is less than about 5 seconds. I would need 0 seconds.

Please, has anybody solved this problem of logging robots?

[I could not get any ideas from HitBox or any other remote web stats provider. HitBox said they do not log robots at all.]

Dreamquick

12:32 pm on Feb 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To effectively log robots you *need* server logs or at least server-side scripting. That's the long and short of it: off-site logging isn't very effective, and 3rd-party trackers have the exact same problems to overcome...
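
To show why real server logs make this easy, here is a rough sketch of pulling robot visits out of an access log. It assumes the host writes Apache's "combined" log format and that robots can be spotted either by a request for /robots.txt or by a few substrings in the user agent; the `ROBOT_HINTS` list and the function name `robot_hits` are made up for this example, and the heuristic will miss robots that identify themselves as ordinary browsers.

```python
import re

# One line of Apache "combined" format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed heuristic: substrings that commonly appear in crawler user agents.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")


def robot_hits(lines):
    """Return (host, path, agent) for each log line that looks like a robot."""
    hits = []
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        parts = match.group("request").split()
        path = parts[1] if len(parts) > 1 else ""
        agent = match.group("agent")
        looks_like_robot = (
            path == "/robots.txt"
            or any(hint in agent.lower() for hint in ROBOT_HINTS)
        )
        if looks_like_robot:
            hits.append((match.group("host"), path, agent))
    return hits
```

Called as `robot_hits(open("access.log"))`, where `access.log` is a placeholder path, this lists every request that matched the heuristic, including requests for pages that contain no images or scripts at all.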

The problem with both your ideas is that robots don't have to follow links in the order you supply them. Both ideas fail because the robots will find the page on "www.server.site" and crawl it directly.

Once they have found it, they do not have to go through your site to get to it. That means they could crawl your "logging" page without crawling your main site.

Also, a crawler could just decide that it "likes" your site but not the site with logging enabled. This would result in it crawling your site while giving you no idea that it was there.

Out of curiosity, how have you gotten into this situation? Everything bar a handful of free hosts should provide you with real logfiles if they are even attempting to be a professional webhosting service, so I'd try asking them or looking elsewhere. At the moment most hosting packages are so cheap you might well find a better deal.

- Tony

martin_k

1:04 pm on Feb 6, 2003 (gmt 0)

10+ Year Member



Thanks, Tony

It seems that I will have to give up the idea of logging robots remotely on my hosted site.

Dreamquick

2:07 pm on Feb 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Martin,

That's a shame, but to be honest I can't think of a decent solution either given the stated criteria...

Before you give up totally you might want to ask whoever manages the server if you can see the logs - they might surprise you...

-Tony