Are you using If Modified Since?

You should be!

         

GoogleGuy

5:57 pm on Oct 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wanted to urge people to configure their servers to support the "If-Modified-Since" header.

Why should you care about "IMS"? When a smart spider like Googlebot comes around, IMS lets you tell the spider that a page hasn't changed. Then Googlebot can use the old copy of the page. That frees up the bot to download more pages while saving bandwidth. Because no page body is sent, IMS hits are almost "free" in terms of server load. Plain Apache can serve _lots_ of IMS queries per second before slowing a machine down.

IMS can work for dynamically generated pages too. Someone posted how to do it for PHP-generated pages, for example. The bottom line is that if your server supports IMS correctly, you can tell Googlebot about more pages without as much server load or bandwidth on your part. As Google crawls more often to make the web a fresher place, adding this flag will help you and search engines.

graywolf

4:57 pm on Oct 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok so I run an asp site with static and dynamic content. For the static pages I can keep a table of the pages and the last modified date and publish that into my header (a little more maintenance but no big deal).

What about the dynamic pages? We have thousands of products, and items come off and go up every day (products sell out, different stuff comes in, we rebuild this file every day). Should I put a last modified here?

andreasfriedrich

4:58 pm on Oct 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Answering a request containing an If-Modified-Since: date field is easy to implement in any server side scripting language. Any implementation will involve the following steps.

  1. Check incoming header for If-Modified-Since: date field.
  2. If it is not there, continue with your script as usual.
  3. Read the date the Useragent supplied.
  4. Check whether the content your script would produce has changed since that date. How you do that, depends on a lot of circumstances. Grumpus pointed out one way. If the content you produce depends on more than one database record you need to take that into consideration.
  5. Send a 304 - Not Modified response header if there were no changes and exit your script. As you see, this method saves not only bandwidth but processing time for your own server as well.
  6. If there were changes, just run your script as you did before.
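For anyone who wants to see the steps above in code, here is a minimal sketch in Python rather than ASP or PHP. The function name handle_conditional_get, the request_headers dict and the render callback are made up for illustration and do not come from any framework; last_modified is assumed to be a timezone-aware UTC datetime that your own code supplies.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def handle_conditional_get(request_headers, last_modified, render):
    """Return (status, headers, body) for a GET request.

    request_headers: dict of incoming header names to values (hypothetical shape).
    last_modified:   timezone-aware UTC datetime of the last content change.
    render:          zero-argument callable that builds the page body.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    # Steps 1 and 2: look for the header; without it, serve normally.
    raw = request_headers.get("If-Modified-Since")
    if raw:
        # Step 3: read the date the useragent supplied.
        try:
            since = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            since = None  # unparseable date: treat the request as unconditional
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # Steps 4 and 5: nothing changed since that date, so send 304 and stop.
        if since is not None and last_modified <= since:
            return 304, headers, b""

    # Step 6: the content changed (or no usable date was sent), run the script.
    return 200, headers, render()
```

Note that render is never called on the 304 path, which is where the processing-time saving comes from.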

Andreas

Sasquatch

5:09 pm on Oct 9, 2002 (gmt 0)



You should also consider which changes you consider important enough, and who you are sending them to.

In my case, I don't want the freshbot hitting all my content pages every time I change the navigation of my site, but I want the users to get the new navigation.

1. I want to keep everyone within 1 month of current, even on navigation. Set the first date to the most recent 20th of the month.
2. If it's not googlebot, add the dates of all the source files to the array.
3. add the update dates of the content to the array.
4. do a MAX() on the array, and that is my last-modified date.

In the case of catalog pages you probably don't care if google crawls each time your "in stock" count for an item goes up or down, so you might not want to include that value in your last-modified date for google, but you want to include it for your customers.
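A rough sketch of the four-step recipe above, written in Python purely for illustration. The names most_recent_20th and last_modified_for are invented here, the Googlebot check is a simple substring test on the User-Agent string (an assumption, not a reliable way to identify the bot), and all dates are assumed to be timezone-aware UTC datetimes.

```python
from datetime import datetime, timezone


def most_recent_20th(now):
    """Return midnight UTC on the latest 20th of a month that is not after `now`."""
    if now.day >= 20:
        year, month = now.year, now.month
    elif now.month == 1:
        year, month = now.year - 1, 12
    else:
        year, month = now.year, now.month - 1
    return datetime(year, month, 20, tzinfo=timezone.utc)


def last_modified_for(user_agent, now, source_dates, content_dates):
    """Pick the Last-Modified date to report to this useragent."""
    # 1. Floor: nobody is ever more than about a month out of date.
    dates = [most_recent_20th(now)]
    # 2. Template/navigation changes count only for non-Googlebot visitors.
    if "googlebot" not in user_agent.lower():
        dates.extend(source_dates)
    # 3. Content changes count for everyone.
    dates.extend(content_dates)
    # 4. The newest date in the list is the last-modified date.
    return max(dates)
```

For a catalog page, the "in stock" timestamp would go into source_dates rather than content_dates if you want customers, but not the crawler, to see it as a change.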

chinook

9:23 pm on Oct 9, 2002 (gmt 0)

10+ Year Member



A quick scan of the MS knowledge base seems to imply that the "if modified since" header is automatic on IIS 5, EXCEPT when urlscan has been configured in a particular way.

May I boldly suggest to GG that an information page for server administrators broken down between IIS & Apache & etc (other web servers) would be appropriate, and perhaps even a header testing tool.

andreasfriedrich

9:38 pm on Oct 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



the "if modified since" header is automatic on IIS 5

To be sure let me stress this again. It is the useragent that makes a conditional request with the If-Modified-Since: date field. The server does not send such a header.

The quoted statement would have to read: the "if modified since" header is handled automatically by IIS 5.

Andreas

Sasquatch

9:39 pm on Oct 9, 2002 (gmt 0)



I seriously hope that MS is only saying that about static pages! Of course it would be just like MS to decide that they know better than you about your dynamic pages.

I agree that some sort of best practices document would be welcome. But the truth is that they are not going to officially suggest this for dynamic content as too many people will not implement it right.

Gizmare

10:04 pm on Oct 9, 2002 (gmt 0)

10+ Year Member



Ok here is an example I wrote that you may use to get your ASP to function properly according to RFC2616(If-Modified-Since). Note I have no error trapping, and I have not tested this solution, but anyone familiar with ASP should be able to add that functionality.

NOTE: IIS 5 does support RFC2616(If-Modified-Since) for static websites and images, but when it comes to ASP you will want to add the following functionality so that you are certain that an unchanged page will be bypassed.

I think the Grumpus example above is a little off - He is sending the response a date when actually you need to check the request date sent (from the client) and compare it to your modified date (from the server), and if it has not been modified since then you need to send a 304 to the client.

This should give the ASP coders out there a start.


<%
Dim dModified, sModifiedSince, sModifiedLast, ckDate

'This is the date on which the last update was made (server local time).
dModified = CDate("10/29/1994 7:43:31 PM")

' Add 7 hours to convert Pacific Daylight Time to GMT (use 8 during standard time)
dModified = DateAdd ("h",7,dModified)

'If the HTTP_IF_MODIFIED_SINCE exists then compare it
If Len(Request.ServerVariables("HTTP_IF_MODIFIED_SINCE")) > 0 Then

'Modify our date to make it readable in VBScript
sModifiedSince = Request.ServerVariables("HTTP_IF_MODIFIED_SINCE")
sModifiedSince = Left(sModifiedSince, Len(sModifiedSince) - 4)
sModifiedSince = Right(sModifiedSince, Len(sModifiedSince) - 5)
ckDate=CDate(sModifiedSince)

'Compare our dates and send a 304 if our date is less than or equal to the If-Modified-Since date
If (dModified<=ckDate) Then
Response.Clear
Response.Status = "304 Not Modified"
Response.End
End If
End If

'This may not be necessary but I am passing back the Last-Modified date
'Converting it back to the standard HTTP date format; day, hour, minute and
'second are zero-padded to two digits (assumes English day and month names)
sModifiedLast = WeekDayName(WeekDay(dModified),TRUE) & ", " & Right("0" & Day(dModified),2) & " " & MonthName(Month(dModified),TRUE) &_
" " & Year(dModified) & " " & Right("0" & Hour(dModified),2) & ":" & Right("0" & Minute(dModified),2) & ":" & Right("0" & Second(dModified),2) & " GMT"
' Passing back the Last-Modified
Response.AddHeader "Last-Modified", sModifiedLast
%>

nell

12:35 am on Oct 10, 2002 (gmt 0)

10+ Year Member



Other Last-Modified Usage

On our e-commerce sites (on order submit) we send ourselves an e-mail with full customer details, enter selected customer and order details in a MySQL database and send the customer an order confirmation e-mail. We do all that in a single sendorder.php page and use this to no-cache and expire that page:

<?php
// Expires must be an HTTP date, not a raw Unix timestamp
$delete = time() + 1;
header ("Expires: " . gmdate("D, d M Y H:i:s", $delete) . " GMT");
header ("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header ("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header ("Pragma: no-cache"); // HTTP/1.0

.

andreasfriedrich

12:55 am on Oct 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Other Last-Modified Usage

Again: the Last-Modified entity header field specifies when the URL was last modified. This is just to inform the useragent when the last change was made.

The If-Modified-Since request header field is sent by the useragent to ask the server to send the requested information pointed to by the URL only if it has been modified since the specified time.

nell's example, which can be found in the documentation for the header [php.net] function, is just the server's attempt to prevent caching of the document.

The If-Modified-Since request header field is the useragent's way to always work with the latest version of a given document.

Hope this clears things up.

Andreas

Finder

6:48 am on Oct 10, 2002 (gmt 0)

10+ Year Member



Does Googlebot use the Last-Modified header from a previous visit to form its If-Modified-Since request?

As in, it visits today and gets

Last-Modified: Thu, 10 Oct 2002 00:18:53 GMT

Then visits next week and sends

If-Modified-Since: Thu, 10 Oct 2002 00:18:53 GMT

I can use a string comparison in PHP if this is the case. But if Googlebot uses some other method of creating the if-modified-since then I'll need something more complicated.

Sasquatch

7:22 am on Oct 10, 2002 (gmt 0)



That should work, but a better method would be to use strtotime() on the time that they send you, then do an if ($modified > $since) in case of some sort of screwup.
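To illustrate why parsing beats comparing strings, here is a small Python sketch (the PHP equivalent would use strtotime() as described). The helper name modified_since is made up for this example. A plain string match only works while the client echoes your Last-Modified value byte for byte, and ordering the strings is simply wrong: "Fri, 11 Oct 2002 ..." sorts before "Thu, 10 Oct 2002 ..." even though it is the later date. HTTP also permits older date formats, so parsing is the safer route.

```python
from email.utils import parsedate_to_datetime


def modified_since(last_modified_header, if_modified_since_header):
    """Return True if the page changed after the date the client sent.

    Both arguments are HTTP date strings such as
    "Thu, 10 Oct 2002 00:18:53 GMT".
    """
    modified = parsedate_to_datetime(last_modified_header)
    since = parsedate_to_datetime(if_modified_since_header)
    return modified > since


def should_send_304(last_modified_header, if_modified_since_header):
    """A 304 is appropriate when nothing changed since the client's date."""
    return not modified_since(last_modified_header, if_modified_since_header)
```

With this, a client that sends a date later than your last change still gets a 304, which a string equality test would miss.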

Slade

2:03 pm on Oct 10, 2002 (gmt 0)

10+ Year Member



Does anyone have working PHP code for an if-modified-since check? (looking for non-register-globals code)

I've been toying with it, but don't seem to be able to get my browser to actually make that request.

andreasfriedrich

2:29 pm on Oct 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does anyone have working PHP code for an if-modified-since check? (looking for non-register-globals code)

No. But it should be fairly easy to write. Just look at my explanation on what to do. Gizmare's ASP code should help you as well.

I've been toying with it, but don't seem to be able to get my browser to actually make that request.

This is entirely unrelated to the question of how to implement If-Modified-Since handling in PHP. There is nothing you can do on the server side to force a useragent to use the If-Modified-Since header field.

To test your implementation telnet to your server at port 80 and do:

GET / HTTP/1.1 
Host: your.server.tld
Connection: close
If-Modified-Since: date
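If telnet is not handy, the same request can be sent from a short script. This is only a sketch in Python using the standard http.client module; the function names are made up, and your.server.tld is the same placeholder as in the telnet session.

```python
import http.client
from email.utils import formatdate


def if_modified_since_headers(since_epoch):
    """Build the request headers for a conditional GET.

    since_epoch is a Unix timestamp; it is rendered as an HTTP date in GMT.
    """
    return {
        "If-Modified-Since": formatdate(since_epoch, usegmt=True),
        "Connection": "close",
    }


def conditional_get(host, path="/", since_epoch=0, port=80):
    """Send the conditional GET and return (status, Last-Modified header)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # http.client adds the Host: header automatically.
        conn.request("GET", path, headers=if_modified_since_headers(since_epoch))
        response = conn.getresponse()
        response.read()  # drain the body, if any
        return response.status, response.getheader("Last-Modified")
    finally:
        conn.close()
```

Calling conditional_get("your.server.tld", "/", time.time()) with the current time should return status 304 if your handling works and the page is older than now; a 200 means the server ignored the header or considers the page changed.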

Andreas

martin

2:41 pm on Oct 10, 2002 (gmt 0)

10+ Year Member



>Does anyone have working PHP code for an if-modified-since check?

$last holds the string version of the time you think the page was last modified.

if ($last) {
$last = strtotime($last);
$cond = isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : 0;

if ($cond and $_SERVER['REQUEST_METHOD'] == 'GET' and strtotime($cond) >= $last) {
header('HTTP/1.0 304 Not Modified');
exit;
}

// gmdate() replaces the original $this->rfc_date() helper, which was not shown
header("Last-Modified: " . gmdate('D, d M Y H:i:s', $last) . ' GMT');
}

Strip $_SERVER[' and the matching '] to make it work with register_globals.

Yidaki

4:34 pm on Oct 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



GoogleGuy, do you smile about seo techies that know everything about "hacking the google algo" but don't even know the basics about how their webservers work? ... bad boy! ;)

Google's educational september update - a history:
Step 1: show the webmasters that they waste their time with seo
Step 2: show them that they should learn about webmastering first
[add]Goal: if they work on their server they don't have time to seo-stress google. :)[/add]

.. ts, ts ...

[Yidaki now drives to his office to clean his server's response headers, too ;)]
