Forum Moderators: Robert Charlton & goodroi
HTTP/1.1·200·OK(CR)(LF)
Date:·Sat,·02·Aug·2008·16:33:18·GMT(CR)(LF)
Server:·Apache/2.2.4·(Unix)·mod_ssl/2.2.4·OpenSSL/0.9.7a(CR)(LF)
X-Powered-By:·PHP/5.2.0(CR)(LF)
Expires:·Thu,·19·Nov·1981·08:52:00·GMT(CR)(LF)
Cache-Control:·no-store,·no-cache,·must-revalidate,·post-check=0,·pre-check=0(CR)(LF)
Pragma:·no-cache(CR)(LF)
Set-Cookie:·PHPSESSID=d683648f30758c68e7c79b68ef82732c;·expires=Mon,·10·Nov·2008·16:33:18·GMT;·path=/(CR)(LF)
Connection:·close(CR)(LF)
Transfer-Encoding:·chunked(CR)(LF)
Content-Type:·text/html(CR)(LF)
I'm not sure about the if-modified-since header, but is there anything in the headers which could be causing poor performance/negativity with search engine spiders?
We've got some great links, content, pictures, etc., and have tried pretty much everything else other than moving server or changing headers.
Any comments would be greatly appreciated.
[edited by: Robert_Charlton at 4:45 pm (utc) on Aug. 2, 2008]
[edit reason] Removed specific. [/edit]
With regard to if-modified-since, that's something the user agent sends TO your server, not a header your server sends back. Looking at your highly restrictive cache settings, I'd also guess that your server never returns a "304 not modified" response, and that can chew up the crawl budget for your site.
You could also improve spidering by enabling HTTP compression on Apache to save bandwidth (mod_deflate on Apache 2.x, which is what your headers show; mod_gzip was the Apache 1.3 module).
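To get a rough feel for how much bandwidth compression might save on a given page, something like the following Python sketch can help. It is only an illustration: `compression_saving` and the sample markup are made up for this example, and a very repetitive sample like this one will flatter the result compared with a real page.

```python
import gzip


def compression_saving(body: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing body."""
    if not body:
        return 0.0
    compressed = gzip.compress(body)
    return 1 - len(compressed) / len(body)


# Hypothetical page body; substitute the HTML of a real page to get a
# more realistic figure.
sample = (
    b"<html><body>"
    + b"<div class='product'>Widget</div>\n" * 200
    + b"</body></html>"
)

print(f"gzip would save about {compression_saving(sample):.0%} on this sample")
```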
However, your first sentence most likely holds the big clue: "I've seen a significant drop/strange behaviour when changing titles/descriptions." Google has become ever more sensitive to changes (especially repeated changes) in titles and descriptions.
(According to Google) I was having problems around 7-9 weeks ago with the Googlebot, and I got offered the faster crawl setting (which I checked) and it said it would last for 70 days. The server seems to be running fairly quickly, and we haven't had any issues with speed/loading time, but perhaps Googlebot has?
I just ran the header checker on a standard .html file (which exists physically on the server):-
HTTP/1.1·200·OK(CR)(LF)
Date:·Sun,·03·Aug·2008·09:40:12·GMT(CR)(LF)
Server:·Apache/2.2.4·(Unix)·mod_ssl/2.2.4·OpenSSL/0.9.7a(CR)(LF)
Last-Modified:·Sun,·03·Aug·2008·09:39:43·GMT(CR)(LF)
ETag:·"a7453f-6f-4538b00a1b5c0"(CR)(LF)
Accept-Ranges:·bytes(CR)(LF)
Content-Length:·111(CR)(LF)
Connection:·close(CR)(LF)
Content-Type:·text/html(CR)(LF)
(CR)(LF)
Looking at the server logs, there are lots of 304s being returned, but only for files which exist on the server (images/css/js), not for any of the URLs which are dynamically generated. The URLs currently use a folder-like structure similar to the one used on this site, e.g. /pagename/, if this makes any difference to headers.
I've been meaning to look at the PHP server variables to try to return headers which are up to date and, as you mentioned above, with slightly more relaxed cache settings. My main reason for asking the initial question was to find out whether there was anything off with the headers that we could raise with our server company :)
$_SERVER['HTTP_IF_NONE_MATCH']
$_SERVER['HTTP_IF_MODIFIED_SINCE']
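For anyone following along, here is a minimal sketch of the check those two request headers feed into. It is written in Python purely to show the logic, not as the PHP that would actually go on the site. The function name `is_not_modified` and its parameters are hypothetical: the first two arguments stand for the raw values of the two variables above (None when the header is absent), and `etag` / `last_modified` are whatever the application decides describes the current version of the page.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    """Drop a leading W/ so that weak and strong tags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(if_none_match, if_modified_since, etag, last_modified):
    """Return True when a 304 response would be appropriate.

    last_modified must be a timezone-aware datetime.
    """
    # If-None-Match takes precedence when the client sends both headers.
    if if_none_match is not None:
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        return "*" in candidates or _strip_weak(etag) in candidates

    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: just serve the full page
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False
```

When the function returns True, the server would send a 304 status with no body; otherwise it sends the full page along with current ETag and/or Last-Modified headers so the next request can be conditional.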
Looking at your highly restrictive cache settings, I'd also guess that your server never returns a "304 not modified" response, and that can chew up the crawl budget for your site.
Exactly right. These are the standard headers returned by the current platform/environment I'm using - I haven't changed any of these or reconfigured any options.
However, as it's an e-commerce site with includes/baskets etc., the data is often changing for different users. Would you not need to force the browser to update its cache, or am I barking up the wrong tree?
the data is often changing for different users.
Maybe take a closer look at how your site treats googlebot. Does the data that you serve googlebot also change frequently? Or is the session cookie required before data starts changing? If a cookie is required, then perhaps you can afford to take the restrictive cache-controls off for user agents that aren't taking cookies. Of course, I don't know your whole set-up, and if/how a regular user without cookies can shop there.
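As a rough illustration of that idea, the sketch below picks cache headers based on whether the request carried a session cookie. It is Python rather than PHP and only shows the decision itself. The function name `cache_headers` and the 300-second `max_age` are assumptions made up for this example, and it presumes the site only personalises pages once the PHPSESSID cookie has come back from the client.

```python
from http.cookies import SimpleCookie


def cache_headers(cookie_header, session_name="PHPSESSID", max_age=300):
    """Choose response cache headers from the request's Cookie header.

    cookie_header is the raw Cookie request header, or None if absent.
    """
    cookies = SimpleCookie()
    if cookie_header:
        cookies.load(cookie_header)

    if session_name in cookies:
        # A visitor with a session (basket etc.): keep the strict settings.
        return {
            "Cache-Control": "no-store, no-cache, must-revalidate",
            "Pragma": "no-cache",
        }

    # No session cookie, e.g. a spider: allow short-lived caching.
    return {"Cache-Control": f"public, max-age={max_age}"}
```

Whether a cookieless visitor can safely be given cacheable pages depends entirely on how the shop behaves without cookies, so treat this as a starting point for testing rather than a drop-in rule.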
With regard to "IF_NONE_MATCH", you are probably returning a fresh ETag for every request too, so it's a parallel situation. That is also a very heavy use of the ETag. If you aren't already using it, check out the YSlow tool [webmasterworld.com] and all of its documentation. It may be a guide for you through this particular thicket.
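If a fresh tag really is being generated per request, one common alternative is to derive the tag from the page content, so that an unchanged page always gets the same tag and an If-None-Match comparison can succeed. A minimal Python sketch of that idea follows. `make_etag` is a hypothetical helper, and hashing the whole body is just one possible choice: a version number or a last-updated timestamp from the database would work as well.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted ETag that changes only when the body changes."""
    digest = hashlib.sha256(body).hexdigest()[:32]
    return f'"{digest}"'


page = b"<html><body>Product list</body></html>"
print(make_etag(page))  # same bytes in, same tag out, on every request
```

Note that anything which varies per request inside the body (session IDs in links, timestamps, rotating banners) will change the hash and defeat the purpose, so such parts would need to be left out of whatever is hashed.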
However, all of this aside, I don't see anything in your HTTP headers that should cause a ranking drop.