Forum Moderators: bakedjake

Recognizing Search Engine Bots

         

mandymo

9:07 pm on Dec 18, 2005 (gmt 0)

10+ Year Member



Hi,

I am very, very new at site creation and have only just uploaded my very first brand new site, so please excuse me if my questions seem a bit naive.

How can I know, from my webstats, which bot has read my site?

The obvious ones, like Google or MSN, are named, but what are:

1. BLA
2. ia_archiver-web.archive.org
3. MetaTagRobot

As I said, my site is very, very new, so what other ones should I expect over the coming weeks?

And the million dollar question, should anybody know the answer, is how long after they appear in my stats should I expect to receive visitors from search engines?

Dijkgraaf

3:05 am on Dec 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Webstats probably has a list of user agents which it identifies as search engines, and it probably also checks which user agents ask for robots.txt.
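To illustrate the robots.txt idea, here is a rough Python sketch that lists every user agent that asked for /robots.txt in a raw logfile. It assumes the server writes the Apache/NCSA "combined" log format, which may not match your host, and the sample lines and the function name are made up for the example. It is not how Webstats itself works, just one way to get a similar answer.

```python
import re

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern if your host differs.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def robots_txt_agents(lines):
    """Return the set of user-agent strings that requested /robots.txt."""
    agents = set()
    for line in lines:
        m = LOG_LINE.match(line)
        # Ignore any query string when comparing the requested path.
        if m and m.group("path").split("?")[0] == "/robots.txt":
            agents.add(m.group("agent"))
    return agents

# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [18/Dec/2005:21:07:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "ia_archiver"',
    '192.0.2.10 - - [18/Dec/2005:21:07:02 +0000] "GET /index.htm HTTP/1.0" 200 5120 "-" "ia_archiver"',
    '198.51.100.7 - - [18/Dec/2005:21:09:10 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

print(robots_txt_agents(sample))
```

Ordinary browsers almost never ask for robots.txt, so anything this turns up is a good candidate for a bot, though not proof of one.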

This thread should probably be in Search Engine Spider Identification, but anyway:
1. BLA is probably an e-mail harvester, from what I've observed of its behaviour (someone correct me if they know better).
2. See [archive.org...] They keep archive copies of web pages for historical purposes.
3. Possibly [widexl.com...]

It is really hard to predict what other bots you will see as it varies greatly from site to site.
It is even harder to predict when the first user will find you via a search engine: it could be days, it could be months.

Stefan

3:20 am on Dec 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you can download the actual logfiles from the server, open them with WordPad and look through them. The format is standard, and you can search the internet for information on what it all means. When you find bots, do a search on the referrer info, if there is any (sometimes you find unidentified ones by the .htm pages going out to the same IP#, with no image files asked for).
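The "pages but no images" check can also be roughed out in a few lines of Python once you are comfortable reading the raw file by eye. This sketch again assumes the Apache/NCSA "combined" log format, and the file extensions, the function name, and the sample lines (including the "ExampleBot/1.0" agent) are all invented for the example.

```python
import re
from collections import defaultdict

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern if your host differs.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative extension lists; extend them to match your own site.
PAGE_EXT = (".htm", ".html")
IMAGE_EXT = (".gif", ".jpg", ".jpeg", ".png")

def pages_without_images(lines):
    """Map each IP that fetched pages but no images to the first user agent it sent."""
    pages = defaultdict(int)
    images = defaultdict(int)
    agents = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        ip = m.group("ip")
        path = m.group("path").split("?")[0].lower()
        agents.setdefault(ip, m.group("agent"))
        if path.endswith(PAGE_EXT) or path.endswith("/"):
            pages[ip] += 1
        elif path.endswith(IMAGE_EXT):
            images[ip] += 1
    return {ip: agents[ip] for ip in pages if images.get(ip, 0) == 0}

# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [19/Dec/2005:03:20:00 +0000] "GET /index.htm HTTP/1.0" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [19/Dec/2005:03:20:01 +0000] "GET /about.htm HTTP/1.0" 200 2048 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [19/Dec/2005:03:21:00 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '198.51.100.7 - - [19/Dec/2005:03:21:01 +0000] "GET /logo.gif HTTP/1.1" 200 900 "/index.htm" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

print(pages_without_images(sample))
```

Treat the result as a list of suspects rather than a verdict: text-only browsers and visitors with images switched off or cached will show up here too.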

Your site is new, and probably not getting whole heaps of traffic yet, so this won't take an enormous amount of time. It will be an education, and in the future when you use analyzer programs you'll have a much better idea of what you're seeing. If you have access to the raw files, I'd also suggest getting "Analog" to check logfiles, although it's a little complicated at first.

"And the million dollar question, should anybody know the answer, is how long after they appear in my stats should I expect to receive visitors from search engines?"

It depends ;-)

MSN might list you quickly (a couple of weeks), Yahoo could take longer, and Google might find you right away, and then you could encounter the dreaded "Sandbox".

Dave_A

4:12 am on Jan 3, 2006 (gmt 0)

10+ Year Member



Hi, I am not sure about the length of time it takes between the spider visit and when the database gets changed.
For my search engine in New Zealand, the data is updated as soon as it is returned from the webspider, so the database updates as it indexes, but the other search engines probably don't update that often.
I know that I have seen some data held within some search engines that is over six months old. When I do a search for my engine, it shows really old pages that were modified or overwritten ages ago.

Hope this helps you to understand more.

Heaps of regards
Dave Andrews