Brett_Tabke

msg:3076171 | 2:47 pm on Sep 8, 2006 (gmt 0) |
It will clear itself out in about 30 days as gbot learns your update frequency. Are you producing *anything* that changes from page view to page view? Auto-updating advertising code? Random headlines? Changing links? A different menu or Flash? Any aspect of the page that changes at all, even so much as one character? Take an accurate byte count of the source code (view source in the browser) and then compare that to a page view 2-3 hrs later. Any changes at all? Anything you are overlooking that would make the byte count or the character positions/code even a little bit different?
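A minimal sketch of that byte-count comparison, assuming you have saved two raw copies of the page a few hours apart. The function name `compare_snapshots` and the returned field names are made up for illustration:

```python
import hashlib


def compare_snapshots(first: bytes, second: bytes) -> dict:
    """Compare two raw copies of the same page, byte for byte."""
    # Index of the first byte that differs within the overlapping part, if any.
    first_diff = next(
        (i for i, (a, b) in enumerate(zip(first, second)) if a != b),
        None,
    )
    if first_diff is None and len(first) != len(second):
        # One copy is a strict prefix of the other: they diverge where the shorter ends.
        first_diff = min(len(first), len(second))
    return {
        "bytes_first": len(first),
        "bytes_second": len(second),
        "sha256_first": hashlib.sha256(first).hexdigest(),
        "sha256_second": hashlib.sha256(second).hexdigest(),
        "identical": first == second,
        "first_difference_at": first_diff,
    }
```

The two byte strings could come from saved "view source" output or from something like `urllib.request.urlopen(url).read()`. If `identical` is false, `first_difference_at` points to where the changing content starts.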
|
Phil_Payne

msg:3076197 | 3:11 pm on Sep 8, 2006 (gmt 0) |
> it will clear itself out in about 30 days as gbot learns your update frequency.
If that were true it would have cleared itself up just after Big Daddy. This is permanent activity. The page is HTML 4.01 written using the Crimson editor. Static HTML barely begins to describe it - fossilized HTML would be closer.
|
motorhaven

msg:3076555 | 6:07 pm on Sep 8, 2006 (gmt 0) |
Is your server properly handling dates in the header and set up to use if-modified-since requests?
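For anyone unsure what handling If-Modified-Since means in practice, here is a rough sketch of the decision a server makes for a static file. It is not how IIS or any particular server implements it, and the function name `status_for_request` is an assumption for illustration:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for_request(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's copy is still current, otherwise 200.

    last_modified must be a timezone-aware datetime (the file's modification time).
    """
    if not if_modified_since:
        return 200  # unconditional request: send the full page
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the full page
    if client_time.tzinfo is None:
        # Treat a date without zone information as GMT.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200
```

A crawler that sends the Last-Modified value it saw on its previous visit should get a 304 for as long as the file stays untouched.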
|
Phil_Payne

msg:3076712 | 8:17 pm on Sep 8, 2006 (gmt 0) |
> Is your server properly handling dates in the header and set up to use if-modified-since requests?
Boring IIS 5. Google accesses the XML sitemap using If-Modified-Since and gets 304s, just like it should.
IIS log extract:
2006-09-06 14:18:59 66.249.65.18 GET 200 /index.html
2006-09-06 14:21:27 66.249.65.18 GET 200 /index.html
2006-09-06 15:18:31 66.249.65.18 GET 200 /index.html
2006-09-07 04:04:07 66.249.66.34 GET 304 /sitemap.xml
2006-09-07 04:27:07 66.249.66.34 GET 200 /index.html
2006-09-07 05:08:47 66.249.66.34 GET 200 /index.html
2006-09-07 05:13:08 66.249.66.34 GET 200 /index.html
2006-09-07 05:51:26 66.249.66.34 GET 200 /index.html
2006-09-07 06:10:28 66.249.66.34 GET 200 /index.html
2006-09-07 06:48:00 66.249.66.34 GET 200 /index.html
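To see the pattern across a longer log, a small tally helps. This is only a sketch: it assumes every line has the six fields in the order of the extract above (date, time, IP, method, status, path), whereas real IIS W3C logs declare their field order in a `#Fields:` line. The name `tally_fetches` is made up for illustration:

```python
from collections import Counter


def tally_fetches(log_lines):
    """Count requests per (date, path, status) from lines shaped like the extract above."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        # Skip comment/header lines and anything that does not have exactly six fields.
        if line.startswith("#") or len(parts) != 6:
            continue
        date, _time, _ip, _method, status, path = parts
        counts[(date, path, status)] += 1
    return counts
```

Run over the extract above, this shows only full 200 responses for /index.html and a single 304 for /sitemap.xml.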
|
jomaxx

msg:3076740 | 8:48 pm on Sep 8, 2006 (gmt 0) |
Keep it in perspective: you're talking about "up to" 7 or so pageviews per day. This amounts to some tiny fraction of a penny.
|
Phil_Payne

msg:3076750 | 9:09 pm on Sep 8, 2006 (gmt 0) |
> .. perspective ..
I'm not complaining, just trying to understand.
Results of the GSiteCrawler Server-Test
Tested at 9/8/2006 9:05:15 PM / from 82.3.81.13:
URL=http://www.mysite.com
Result code: 200 (OK / OK)
Server: Microsoft-IIS/5.0
Content-Location: [mysite.com...]
Date: Fri, 08 Sep 2006 20:58:12 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Wed, 06 Sep 2006 12:20:34 GMT
ETag: "6a5577daaed1c61:c3a"
Content-Length: 4163
So "Last-Modified" is being returned correctly.
|
g1smd

msg:3076762 | 9:18 pm on Sep 8, 2006 (gmt 0) |
It should be requesting http://www.domain.com/ and not http://www.domain.com/index.html, I think. I suspect that fact may turn out to be important.
|
ronburk

msg:3077593 | 10:12 pm on Sep 9, 2006 (gmt 0) |
Interesting. What made you think it was requesting /index.html? Looked like the request was for the (illegal, but universally accepted) URL of [something.com,...] and the server used the Content-Location header to inform the User-Agent where the actual resource resides.
------
Although you describe this as fossilized HTML, the server says it was modified recently -- which is correct?
|
g1smd

msg:3077596 | 10:16 pm on Sep 9, 2006 (gmt 0) |
What made me think that? The fact that I have seen several hundreds of sites with that exact same problem in recent months. Does the call for http://www.domain.com respond with a 302 or a 301 response? That short URL was my other, much more unlikely, guess.
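The cases being discussed (301, 302, or a plain 200 with a Content-Location header) can be told apart from the status code and headers of a single request for the root URL. A rough sketch, with the function name `describe_root_response` and the wording of its results invented for illustration:

```python
def describe_root_response(status: int, headers: dict) -> str:
    """Summarize how a server answered a request for the bare root URL."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    if status in (301, 302):
        kind = "permanent" if status == 301 else "temporary"
        target = h.get("location", "(no Location header)")
        return f"{kind} redirect to {target}"
    if status == 200 and "content-location" in h:
        return f"served directly; Content-Location names {h['content-location']}"
    if status == 200:
        return "served directly"
    return f"other response ({status})"
```

The status and headers could come from any client that does not follow redirects automatically, for example a HEAD request made with Python's `http.client`.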
|
Phil_Payne

msg:3079818 | 8:39 am on Sep 12, 2006 (gmt 0) |
Results of the GSiteCrawler Server-Test
Tested at 9/12/2006 8:36:31 AM / from 82.2.113.108:
URL=http://www.mysite.com
Result code: 200 (OK / OK)
Server: Microsoft-IIS/5.0
Content-Location: [mysite.com...]
Date: Tue, 12 Sep 2006 08:29:22 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Wed, 06 Sep 2006 12:20:34 GMT
ETag: "6a5577daaed1c61:c3a"
Content-Length: 4163
What I don't understand is:
a) Why is this "bad"? I have the same "problem" on many other sites that are performing very well.
b) Why is Google downloading the index.html page repeatedly - when it almost never changes - and NOT downloading the pages that do change?
|
|