I have always monitored my site traffic, which also includes watching how spiders dig around in my site. I use the infamous server-side AXS application to monitor page visits, which I have been doing for a year now. From monitoring traffic I know my usual visitors. Google usually stopped at my site on a regular basis (including some intensive deep crawls over a 4-month period). I have been quite happy with my ranking on Google – but not these recent months.
Current status
For the last 2.5 months Googlebot has been nowhere to be found. My ranking plunged in the areas where I once did well. Last month my entire website underwent a huge change and update. Not even a look from Sir Google since the major update took place.
But! This is where things get interesting
The good news is that my site has seen increased deep crawls from other major engines. Googlebot used to be my ONLY major crawler. But in recent months Google has ignored me - or have I been ignored?
The oddities begin here…
Google has most of my new pages indexed? They also rank pretty high in their categories for the most part… But not in the specific areas I want them to be indexed under.
So when did Google come and why did I not notice Google?
Google incognito???
I have the idea that Google is crawling from AOL. AOL spiders have never looked at my site - e v e r – until AOL teamed up with Google. Since then AOL has been crawling my site on a regular basis and digging deep, acting just like Googlebot did in prior indexes.
How did I get this idea? I decided to compare crawls from different spiders, and noticed AOL “acted” the same as Googlebot does. Example: first a knock, then a crawl of the front page, followed by another knock, then a deep crawl. I noticed this in the printouts I make monthly from my stats: the way AOL crawled my site seems to be the same behavior as Googlebot. It is all really interesting to me and maybe I am way off here – which is why I bring this to all of you :)
I am new at all this search engine stuff, but learning. So I hope to learn how things may have changed, or what I am doing to be ignored by Googlebot while loved by others.
Questions…
1.) Does Google crawl from their partners? (Yahoo, AOL etc…)
2.) What other methods does Google have to index sites into their directories if they have not been submitted yet?
3.) Am I losing it? Need to be patient?
Just trying to get a grip on how things work with Sir Google.
-eboda
... all right... you misinterpreted your stats (easy thing to do... I did it too).... That proxy spider you are seeing in your logs is not really an indexing bot.... It grabs a 'copy' of your page and presents that page to its users.... That is what you are seeing. That's why it is very hard to get a good feel for AOL traffic using log analyzers... the caching screws them up. (You need some kind of 'on-page'/code counter, I think... anyone?)
The 'knocking->drill down' that you are seeing is what we all see... the majority of surfers hit the index first (so that is what the bot gets first), then they drill down from there (requiring more pages from the spider, in a sequential and logical order). I'm not sure how your logs are being displayed, but you might also be shown the requests ranked by number of requests... which would also make sense. Additionally, it sounds like you don't have many pages in Google.... just a page or two? That would amplify this perception...
good luck to you ...and stay curious:)
Let me clear some things up here.
First a Question:
When Googlebot does a site crawl it identifies itself as, for example, crawler12.googlebot.com. Right?
I am sure this is a crawl because of the Googlebot identifier.
If yes to the above, and that is the indexing crawl that everyone talks about, then I used to get these monthly.
My definition of a knock:
(note: this is an actual session from googlebot 2.5 months ago and used to be a monthly thing.)
Day #1 “The Knock”
crawler12.googlebot.com
Day #2 “The Visit”, which lasts about 3 hours on average.
crawler11.googlebot.com
crawler13.googlebot.com
crawler14.googlebot.com
crawler15.googlebot.com
crawler16.googlebot.com
crawler17.googlebot.com
crawler18.googlebot.com
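The knock/visit pattern above can be checked mechanically rather than by eye. The following is only a rough sketch: it assumes the log has already been reduced to (day, hostname) pairs, and the function name, the one-host threshold for a "knock", and the sample dates in the usage lines are all made up for illustration.

```python
from collections import defaultdict
from datetime import date


def summarize_googlebot_days(hits, knock_max_hosts=1):
    """Label each day with Googlebot activity as a 'knock' or a 'visit'.

    hits: iterable of (day, hostname) pairs taken from the site log.
    knock_max_hosts: assumed threshold; a day with at most this many
    distinct crawlerNN.googlebot.com hosts counts as a 'knock'.
    """
    hosts_by_day = defaultdict(set)
    for day, host in hits:
        host = host.lower()
        # Only hosts under googlebot.com are counted as Googlebot here.
        if host.endswith(".googlebot.com"):
            hosts_by_day[day].add(host)
    return {
        day: "knock" if len(hosts) <= knock_max_hosts else "visit"
        for day, hosts in sorted(hosts_by_day.items())
    }


# Illustrative data only: the dates are hypothetical.
sample_hits = [
    (date(2002, 7, 28), "crawler12.googlebot.com"),
    (date(2002, 7, 29), "crawler11.googlebot.com"),
    (date(2002, 7, 29), "crawler13.googlebot.com"),
    (date(2002, 7, 29), "crawler14.googlebot.com"),
]
for day, label in summarize_googlebot_days(sample_hits).items():
    print(day, label)
```

Days that never appear in the result had no Googlebot hosts at all, which is the quickest way to confirm a gap like the 2.5 months described earlier.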
Example of my site log:
(Note: not accurate because I don’t have a recent one to copy and paste in here. Close enough though.)
A visitor from crawler11.googlebot.com (209.185.253.175)
arrived without a refering URL,
and visited mysitedomain.com
at 5:52:32 PM on Monday, July 29, 2002.
This visitor used Googlebot/2.1 (+http://www.googlebot.com/bot.html).
A search result hit on my site looks like this:
A visitor from pbs23.in-tch.com (66.62.84.22)
arrived from www.google.com business identity 1-10,
and visited www.mysitedomain.com/Design/Development/
at 4:33:20 PM on Monday, July 29, 2002.
This visitor used Mozilla/4.0 (compatible; MSIE 6.0; Windows 98).
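To make the difference between the two kinds of entries concrete, here is a small sketch that tells a Googlebot crawl apart from a visitor arriving via a Google search. It assumes entries look like the approximated samples above (the real AXS output may differ), and the regular expression and the classify function are illustrative names, not part of AXS. Note that a hostname in a log is only what reverse DNS reported, so this is a rough filter rather than proof of who the visitor was.

```python
import re

# Pattern modelled on the approximated sample entries; it is an assumption
# about the layout, not an official description of the AXS log format.
ENTRY_RE = re.compile(
    r"A visitor from (?P<host>\S+) \((?P<ip>[\d.]+)\)\s+"
    r"arrived (?:without a refer+ing URL|from (?P<referrer>[^\n]+?)),\s+"
    r"and visited (?P<page>\S+)\s+"
    r"at (?P<when>[^\n]+?)\.\s+"
    r"This visitor used (?P<agent>[^\n]+)\."
)


def classify(entry):
    """Return a rough label for one log entry."""
    match = ENTRY_RE.search(entry)
    if match is None:
        return "unparsed"
    host = match.group("host").lower()
    agent = match.group("agent")
    referrer = match.group("referrer") or ""
    # A crawl: the host is under googlebot.com and the agent says Googlebot.
    if host.endswith(".googlebot.com") and "Googlebot" in agent:
        return "googlebot crawl"
    # A search hit: an ordinary browser that arrived from a Google results page.
    if referrer.startswith("www.google.com"):
        return "google search referral"
    return "other"


CRAWL_ENTRY = """A visitor from crawler11.googlebot.com (209.185.253.175)
arrived without a refering URL,
and visited mysitedomain.com
at 5:52:32 PM on Monday, July 29, 2002.
This visitor used Googlebot/2.1 (+http://www.googlebot.com/bot.html)."""

SEARCH_ENTRY = """A visitor from pbs23.in-tch.com (66.62.84.22)
arrived from www.google.com business identity 1-10,
and visited www.mysitedomain.com/Design/Development/
at 4:33:20 PM on Monday, July 29, 2002.
This visitor used Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)."""

print(classify(CRAWL_ENTRY))
print(classify(SEARCH_ENTRY))
```

Only the first kind of entry says anything about indexing; the second shows that pages are already in the index and being found by searchers.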
Now here is the question:
If I have not seen any crawls from Google like before, how can new content show up in Google’s search results? Usually I would know when my new content is being fed upon and indexed… This time around I have no verification of that.
-eboda