Forum Moderators: DixonJones
I have around 600 visitors a day: around 250 from PPC, another 250 from organic listings, and the rest from either referrers or direct links.
I've heard that spiders can cause problems with your logs, but I have no idea how to differentiate them.
Anyone have any ideas?
1. Take your logs and separate out the 68% of visits that leave after 1 second.
2. Take a look at the hostnames and user agents: are they similar? Do a distinct select on the agents and see how many different ones there are. If they are all the same, chances are it's a bot.
3. You can search for "robot ip addresses" and get listings of known robot IPs and user agents. Compare these against your own IPs and user agents and see if they match.
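The first two steps can be sketched in a few lines of Python. This is a minimal illustration, not production code: the sample log lines, IPs, and agent names below are invented, and the regex assumes the common Apache "combined" log format (IP first, quoted user agent last).

```python
import re
from collections import defaultdict

# Hypothetical lines in Apache "combined" log format (IPs and agents invented).
SAMPLE_LOG = """\
66.249.0.1 - - [10/Oct/2004:13:55:00 +0000] "GET /pageX HTTP/1.0" 200 512 "-" "ExampleBot/1.0"
66.249.0.2 - - [10/Oct/2004:13:55:01 +0000] "GET /pageY HTTP/1.0" 200 512 "-" "ExampleBot/1.0"
10.0.0.7 - - [10/Oct/2004:13:55:03 +0000] "GET /pageX HTTP/1.0" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"
"""

# Capture the leading IP and the trailing quoted user-agent field.
LINE_RE = re.compile(r'^(\S+) .*"([^"]*)"$')

agents_by_ip = defaultdict(set)
for line in SAMPLE_LOG.splitlines():
    m = LINE_RE.match(line)
    if m:
        ip, agent = m.groups()
        agents_by_ip[ip].add(agent)

# The "distinct select" on user agents: how many different agents hit the site?
distinct_agents = set().union(*agents_by_ip.values())
print(len(distinct_agents))   # 2 distinct agents across 3 hits
```

If one agent string shows up across many different IPs, that's the "all the same" pattern the step above describes, and it's worth checking against a published robot list.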
Chances are that if a visitor is only on for a second, it's a spider or robot. You can add a robots.txt file if you don't have one already; you can search for how to make one, and they're very easy to make.
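For reference, a minimal robots.txt looks like this (the disallowed path is just a placeholder; substitute your own directories). Keep in mind it only works on well-behaved crawlers that choose to honor it:

```
User-agent: *
Disallow: /private/
```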
Also, you can do some regression testing and try to follow these 1-second visits through your logs. Meaning: check whether a visitor is on PageX for 1 second, then moves to PageY for 1 second, then PageZ. If they are traversing from one page to another in record time, chances are it's a bot and you can exclude it from your metrics.
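That traversal check can be automated once you have timestamps per page hit. A small sketch, assuming you've already extracted a visitor's clicks as (timestamp, page) pairs (the times and paths below are invented):

```python
from datetime import datetime

# Hypothetical clickstream for one visitor: (timestamp, page).
visit = [
    (datetime(2004, 10, 10, 13, 55, 0), "/pageX"),
    (datetime(2004, 10, 10, 13, 55, 1), "/pageY"),
    (datetime(2004, 10, 10, 13, 55, 2), "/pageZ"),
]

def looks_like_bot(clicks, max_gap_seconds=1.0):
    """True when every page-to-page hop happens within max_gap_seconds."""
    clicks = sorted(clicks)
    gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(clicks, clicks[1:])]
    return len(gaps) > 0 and all(g <= max_gap_seconds for g in gaps)

print(looks_like_bot(visit))   # True: three pages in two seconds
```

A human pausing to read a page will produce gaps well over a second, so their session comes back False and stays in your metrics.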
Hope this helps a bit.