The header is saying the page expired 22 years ago and it should not be cached or stored and must be revalidated. This is because the site uses PHP sessions to track the progress of each visitor. Because of that, the If-Modified-Since (IMS) header is ignored, and supporting it is not currently doable.
Is there any way of working around this to Google's satisfaction? In one case the user IDs are visible in the URL and in another case they are not.
Tip #1: Use If-Modified-Since (IMS). IMS lets your webserver tell Googlebot whether a page has changed since the last time the page was fetched. If the page hasn't changed, we can re-use the content from the last time we fetched that page. That in turn lets the bot download more pages and save bandwidth. I highly recommend that you check to see if your server is configured to support If-Modified-Since. It's an easy win for static pages, and sometimes even pages with parameters can benefit from IMS.
Trying to make Google happy! Is it possible to do the above with Session User ID tracking?
If-Modified-Since (if included in the request) is available to your script in the superglobal array $_GET (or $HTTP_GET_VARS).
Before your script has sent any output you can check for the "If-Modified-Since" and return 304 if appropriate:
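if ($_GET["IF_MODIFIED_SINCE"] <> "")
{
    // sometest: placeholder for your own "has the page changed?" check
    if (sometest)
    {
        // Send a real HTTP status line so the 304 also works outside CGI
        header("HTTP/1.1 304 Not Modified");
        exit();
    }
}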
Hope this helps.
dmorison:
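If-Modified-Since is found in the request headers; it's not a GET variable. You can get it like this:
// getallheaders() returns the request headers as an associative array
$headers = getallheaders();
// The header is only present on conditional requests, so guard the lookup
$if_mod_date = isset($headers['If-Modified-Since']) ? $headers['If-Modified-Since'] : null;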
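To make the comparison step concrete, here is a minimal sketch of how the 304 decision could be made once you have the header value. The function name is_not_modified and the idea of passing in a $lastModified timestamp are my own assumptions for illustration; where that timestamp comes from (file mtime, database column, etc.) depends on your site.

```php
<?php
// Hypothetical helper: returns true when a 304 response would be appropriate.
// $ifModifiedSince: raw If-Modified-Since value, or null if the header is absent.
// $lastModified:    Unix timestamp of the page's last change (assumed to be known).
function is_not_modified(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false; // not a conditional request
    }
    // Some older clients append "; length=..." after the date; drop it.
    $date = trim(explode(';', $ifModifiedSince, 2)[0]);
    $since = strtotime($date);
    if ($since === false) {
        return false; // unparseable date: serve the full page
    }
    return $lastModified <= $since;
}

// Usage, before any output is sent ($lastModified is an assumption):
// if (is_not_modified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $lastModified)) {
//     http_response_code(304);
//     exit;
// }
```

PHP also exposes the same header as $_SERVER['HTTP_IF_MODIFIED_SINCE'], which avoids depending on getallheaders() being available.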
Haymeadows:
The header is saying the page expired 22 years ago and it should not be cached or stored and must be revalidated
He's doing this so that the pages are not cached. As long as he sets Pragma to no-cache and Cache-Control to no-cache, he shouldn't have any problems. I think the Expires header is probably being used in this case for ancient rogue browsers that don't correctly implement Pragma and Cache-Control. I would happily omit the Expires header, or preferably use it to display a date when the page will be out of date.
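As a rough sketch of what that could look like in PHP: the headers below keep Pragma and Cache-Control at no-cache, leave out Expires, and add a Last-Modified date so that a later If-Modified-Since request has something to compare against. The function name revalidation_headers and the $lastModified parameter are hypothetical; this assumes you can work out when the page last changed.

```php
<?php
// Hypothetical helper: builds the header lines for a page that must be
// revalidated on every request but carries a real modification date.
function revalidation_headers(int $lastModified): array
{
    return [
        // HTTP dates are always expressed in GMT
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT',
        'Cache-Control: no-cache',
        'Pragma: no-cache',
        // No Expires header, as suggested above
    ];
}

// Usage (assumption: called before any output):
// session_cache_limiter('');  // stop PHP sending its own cache headers; call before session_start()
// session_start();
// foreach (revalidation_headers($lastModified) as $line) {
//     header($line);
// }
```

As far as I know, the "expired 22 years ago" date comes from PHP's default session cache limiter (nocache), which is why the sketch switches it off before starting the session.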
However, a great idea would be to disable session IDs for Google if you can; I do this on my sites. Check for Googlebot in the HTTP_USER_AGENT variable, and if it is Google (or another search engine) then don't start a session. There is some debate as to whether this is OK, or whether it may be considered spamming by Google. My justification is found on Google's guidelines page:
Allow search bots to crawl your sites without session IDs or arguments that track their path through the site.
[google.com]
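Something along these lines is what I mean; treat it as a sketch only. The function name is_search_bot and the list of user-agent substrings are my own picks for illustration, so extend or trim the list to suit the crawlers you care about.

```php
<?php
// Hypothetical list of lowercase substrings found in crawler user agents.
const CRAWLER_SIGNATURES = ['googlebot', 'bingbot', 'slurp'];

// Returns true when the user-agent string looks like a known search bot.
function is_search_bot(string $userAgent): bool
{
    $ua = strtolower($userAgent);
    foreach (CRAWLER_SIGNATURES as $signature) {
        if (strpos($ua, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Usage: only start a session for ordinary visitors.
// if (!is_search_bot($_SERVER['HTTP_USER_AGENT'] ?? '')) {
//     session_start();
// }
```

Bear in mind the user-agent string can be faked, so this only decides whether to start a session; it shouldn't be used to show bots different content.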
hope it helps...
Our programmer says it can't be done.
If they are still stubborn, ask: "Really? No solutions AT ALL? Not one? OK, have you posted the question on any forums? Which ones? What were your responses?"
ALWAYS challenge programmers, as they are well lazy (I am one, so I know). The last thing most want is more work.
Trying to make Google happy! Is it possible to do the above with Session User ID tracking?
Session IDs, on the other hand, are Google's Nemesis. This is the problem that needs looking at, and there are two ways to handle it:
1. Pass it as a cookie, NEVER via the URL
2. Don't use them unless some action occurs that a spider can't perform, e.g. filling in a form.
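For option 1, a minimal sketch of the PHP settings involved, assuming a stock PHP session setup. The wrapper function cookie_only_session_settings is hypothetical; the three ini names are standard PHP session directives.

```php
<?php
// Hypothetical helper: the ini settings that keep the session ID in a
// cookie and out of the URL.
function cookie_only_session_settings(): array
{
    return [
        'session.use_cookies'      => '1', // store the session ID in a cookie
        'session.use_only_cookies' => '1', // ignore session IDs passed in the URL
        'session.use_trans_sid'    => '0', // don't rewrite links to carry the ID
    ];
}

// Usage (assumption: run before session_start() and before any output):
// foreach (cookie_only_session_settings() as $name => $value) {
//     ini_set($name, $value);
// }
// session_start();
```

The same three values can go in php.ini or .htaccess instead if you'd rather not set them in code.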
Hope that helps.