My stats program lists monthly stats and updates them each day. (Google also crawls a few of my pages each day.)
For Google it looks like this:
Googlebot (Google) 107 hits 2.17 MB
Then, say, the next day it says:
Googlebot (Google) 112 2.23 MB
Is there any way of knowing which pages it has hit? Does that mean I now have 0.06 MB more content included, or could it mean that it just re-crawled 0.06 MB of something it had crawled before?
Now that I'm reading this back, I notice that I make no sense. So, in light of that, could you guys just throw out some answers to what it sounds like I'm asking? Most of you never cease to educate me, whether or not the answer is exactly what I was looking for.
Thanks!
P.S. How does Google pick which of my pages to crawl each day? I don't have new content on any particular pages each day, so why some and not others?
Is there any way of knowing which pages it has hit?
If you look at your website's raw log files, you will know exactly. I wrote a small PHP script for myself to extract Googlebot hits from the log files, so it's very easy for me to see when Google requests pages and which ones. A deep crawl is when every, or almost every, page of your web site is requested.
I don't know how Google decides which pages to pick the rest of the time. If you search earlier topics, you may find some information on that.
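A rough sketch of that kind of extraction, written in Python rather than PHP and not the script mentioned above. It assumes the log is in Apache "combined" format and that Googlebot identifies itself in the user-agent field; the function name `googlebot_hits` and the sample lines are made up for illustration. It also totals the bytes served per page, which helps tell new pages apart from re-crawled ones.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Return (hits per path, bytes per path) for lines whose user agent mentions Googlebot."""
    hits = Counter()
    size = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "googlebot" not in m.group("agent").lower():
            continue  # unparseable line or a different visitor
        path = m.group("path")
        hits[path] += 1
        if m.group("bytes") != "-":  # "-" means no body was sent (e.g. a 304)
            size[path] += int(m.group("bytes"))
    return hits, size

# Made-up sample lines, using documentation-range IP addresses.
SAMPLE = [
    '192.0.2.10 - - [10/Mar/2004:06:25:24 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Mar/2004:06:25:30 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.55 - - [10/Mar/2004:06:26:01 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.10 - - [11/Mar/2004:07:00:00 +0000] "GET /index.html HTTP/1.1" 304 - "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

hits, size = googlebot_hits(SAMPLE)
for path, count in hits.most_common():
    print(path, count, "hits,", size[path], "bytes")
```

A page that shows up again with a 304 status and no bytes was re-checked rather than re-downloaded, so a growing byte total does not by itself mean more content was indexed.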
Yesterday Googlebot requested robots.txt (on a lot of sites) over 100 times per site.
On some occasions, 3 times a minute from the same IP.
This is the first time that I’ve seen more requests for robots.txt than for the content pages.
Has anyone else seen this - yesterday or before?
It depends upon your access and application!
If you run on Linux, do you have cPanel or shell access?
I do run on Linux and have cPanel; not sure about shell access.
In cPanel there is an option called Raw Access Logs; when I click it, it downloads an archived file. When I unzip it and run it, I just get a blank DOS screen that closes automatically after about 2 seconds.
Now I've checked the option to archive logs each day; should it start working, and is this the log that will tell me what pages have been hit?
Thanks
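The downloaded archive is most likely a compressed plain-text log rather than a program, so it is meant to be opened and read, not run. A minimal sketch for reading it in Python, assuming the download is gzip-compressed text; the helper name `read_log_lines` and the file name `example.com.gz` are hypothetical, and the demo writes its own tiny file so it runs anywhere:

```python
import gzip
import os
import tempfile

def read_log_lines(path):
    """Yield text lines from a log file, decompressing on the fly if it ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            yield line.rstrip("\n")

# Demo: create a small gzip file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "example.com.gz")  # hypothetical file name
    with gzip.open(demo_path, "wt", encoding="utf-8") as fh:
        fh.write("first log line\nsecond log line\n")
    demo_lines = list(read_log_lines(demo_path))

print(demo_lines)
```

Each line of the real file should describe one request, including the page requested and the visitor's user agent, so this is the log that shows which pages were hit.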