|The bot, the logs, and the update|
Identifying logs and the Google bot
During the last dance I began seeing Google in the logs a few days before the deep crawl: one or two bites a day. Around the 28th
Google started the deep crawl. My logs have shown activity from the bot ever since, about a dozen or so pages per day.
Google starts crawling the higher-PR sites first in the dance. Mine is a PR5, so I am guessing I fall in the middle of the dance
somewhere. Are the small bites I see before our deep crawl the effect of Google crawling higher-PR sites with
inbound links to us, and the ones after our crawl the effect of inbound links from lower-PR sites? Correct me if I am wrong.
Also, for 4 months running now I have noticed an unknown bot in the logs exactly a day before Google shows up. Is this a
coincidence, or has anyone else noticed this? I wrote it off as coincidence until this past dance, when I started to think there may be more to
it. Now when I see the unknown bot show up, I set out the snacks for Google.
Welcome to WebmasterWorld, argusdesigns. I've got one that's a low PR5 and it seems to get later than anyone else.
This strange bot, do you happen to have an IP number for it? If so, is it consistently the same?
Hi marcia, thanks for the welcome.
As for an IP number, no. But I have since installed new log software that tracks IPs, so I will keep you posted on the results come the next dance. I dub the little critter "her shadow".
Thanks for the confirmation on the PR5, sounds logical.
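For anyone who wants to check the "shadow" pattern in their own logs, here is a minimal sketch. It assumes the Apache "combined" log format, and the sample lines, the IP addresses (documentation ranges, not real Google addresses) and the `UnknownBot/0.1` agent string are all invented for illustration; `hits_by_day` and `days_seen` are hypothetical helper names.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed Apache "combined" log format; adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def hits_by_day(lines):
    """Return {day: {(ip, agent): hit_count}} for every line that parses."""
    table = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            table[m.group("day")][(m.group("ip"), m.group("agent"))] += 1
    return table

def days_seen(table, agent_substring):
    """Return the days, in date order, on which a matching agent appeared."""
    days = [day for day, hits in table.items()
            if any(agent_substring in agent for (_ip, agent) in hits)]
    return sorted(days, key=lambda d: datetime.strptime(d, "%d/%b/%Y"))

# Invented sample data: an unknown bot one day, Googlebot the next.
SAMPLE = [
    '192.0.2.10 - - [27/Oct/2002:04:12:01 +0000] "GET / HTTP/1.0" 200 5120 "-" "UnknownBot/0.1"',
    '198.51.100.7 - - [28/Oct/2002:02:00:15 +0000] "GET / HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [28/Oct/2002:02:00:47 +0000] "GET /about.html HTTP/1.0" 200 2048 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]
```

Comparing `days_seen(table, "UnknownBot")` against `days_seen(table, "Googlebot")` over a few months would show whether the one-day lead is consistent, and the `(ip, agent)` keys show whether the visitor keeps the same IP.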
more details about unknown bot? Maybe a rudimentary cloak checker?
>I've got one that's a low PR5 and it seems to get later than anyone else.
Precisely how do you know it is a low PR5 and not a high PR5?
I am guessing that a low PR5 toggles between PR4 and PR5 during the update, while a high PR5 toggles between PR5 and PR6 during the
update. I have heard people refer to different echelons of PR, but have not been able to find any more info on it.
Not sure about that one, we don't use any sort of cloaking or .asp files, just good old-fashioned HTML. It has occurred to me that this may be the old log software, but I doubt it. I'll know more next dance.
|more details about unknown bot? Maybe a rudimentary cloak checker? |
I think Savvy was saying that it might be a check to see if you are using a cloak of some sort. Cloaking relies on the software being able to recognise a particular search engine, and depending on who requests the page, different versions are presented:
SE = optimised content, keyword rich
User = flash version or similar
Engines know what goes on, so they look for sites that serve different pages to different IP addresses. By using new IPs the SE can hopefully (they think) tell the difference.
Right, ukgimp. If Google wants to catch cloaking, send a bot not identifying as Googlebot, and from IPs Google doesn't use, and compare what is sent to the bot to what Googlebot got. This should easily catch cloakers. Googlebot seems to do a LOT of crawling. Google obviously has the resources to have secret bots running looking for cloakers.
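To make the comparison idea concrete, here is a rough sketch of how anyone could run the same kind of check against a page. It is only an illustration, not a description of what Google actually does. It fetches a URL under two different User-Agent strings and measures how similar the two responses are. The function names and the 0.5 threshold are arbitrary assumptions, and it only exercises User-Agent based cloaking; IP-based cloaking would need requests from different addresses, which a single script cannot do.

```python
import difflib
import urllib.request

def fetch(url, user_agent, timeout=10):
    """Fetch a page while presenting the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def similarity(a, b):
    """Return a ratio from 0.0 (nothing in common) to 1.0 (identical)."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def looks_cloaked(body_for_bot, body_for_browser, threshold=0.5):
    """Flag the page when the two versions differ by more than the threshold.

    The threshold is an arbitrary assumption; dynamic pages (ads, dates,
    session IDs) differ a little between any two requests.
    """
    return similarity(body_for_bot, body_for_browser) < threshold
```

A run would call `fetch` twice on the same URL, once with a crawler-style agent string and once with a browser-style one, then pass both bodies to `looks_cloaked`. A flagged page still needs a look by eye, since a low score only means the two versions differ.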
Yes, exactly what ukgimp said :) (Thanks ukgimp)
I did not mean to imply that you were cloaking or anything about your site in particular. :)
I have been listed in Google for a little while now, but since my initial listing I have not seen an update. I was under the impression that Google does a sweep once a month. Am I correct in assuming this? I have a PR of 4 and was wondering if this is the reason my site's not getting updated.