meta http-equiv="expires" Question

         

thetrick

4:50 am on Aug 30, 2002 (gmt 0)



If I have a header tag like so:

<meta http-equiv="expires" content="">

What will web crawlers and spiders do with the page, given that the "expires" value is empty and therefore invalid? Will they ignore the tag, or drop the page as they would one that has expired? Or will they accept the page and add it to their searchable index?

Thanks,
Tony

tedster

5:15 am on Aug 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld, thetrick

As far as I know, there's no commercial search engine that cares about the expires tag. It should have no effect, whether the tag's content is valid or not.

Do you have reason to think that valid "expires" tags will be honored by some search engines?

thetrick

12:36 pm on Aug 30, 2002 (gmt 0)



I found this on Vancouver Webpages while looking for information: [vancouver-webpages.com...]

"The date and time after which the document should be considered expired. Controls cacheing in HTTP/1.0. In Netscape Navigator, a request for a document whose expires time has passed will generate a new network request (possibly with If-Modified-Since). An illegal Expires date, e.g. "0", is interpreted as "now". Setting Expires to 0 may thus be used to force a modification check at each visit.

Web robots may delete expired documents from a search engine, or schedule a revisit."

Any thoughts on the validity of the last sentence?

jdMorgan

8:12 pm on Aug 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



thetrick,

Welcome to WebmasterWorld!

The point here is that it is an "http-equiv"-type tag. This means that the information in the tag is meant to be treated as equivalent to the same information sent in the HTTP headers served with the file.

However, in the trip from your server to the end-user, very few if any of the involved agents - caches, proxies, routers, etc. - are going to read and interpret your html document. Only certain proxy caches, such as those used to speed up long-latency high-speed connections like satellite links, will actually interpret your document contents. Therefore, the majority of them never "see" your tag.
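As a rough illustration of that difference, here is a small Python sketch. The class name, function name, and sample page are made up for the example. An agent that looks only at the HTTP response headers finds no Expires value at all, while an agent has to parse the HTML itself before the http-equiv tag becomes visible:

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collects <meta http-equiv="..."> values found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("http-equiv")
        if name is not None:
            # A missing or valueless content attribute is stored as "".
            self.found[name.lower()] = attr_map.get("content") or ""


def http_equiv_values(html_text):
    """Return a dict of http-equiv names to content values in html_text."""
    collector = HttpEquivCollector()
    collector.feed(html_text)
    collector.close()
    return collector.found


# Hypothetical response: the headers carry no Expires value,
# the body carries the meta tag from the original question.
response_headers = {"Content-Type": "text/html"}
page = '<html><head><meta http-equiv="expires" content=""></head><body>Hi</body></html>'

# A header-only agent (most caches and proxies) sees nothing about expiry.
print("Expires" in response_headers)

# Only an agent that parses the document body finds the tag.
print(http_equiv_values(page))
```

Whether a given robot or cache bothers with the second step is up to that agent, which is why the tag cannot be relied on.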

Some robots might honor the tag and some browsers might honor the tag, but it does not work consistently well.

There's another variation on this, the "pragma no-cache" tag, that does seem to work consistently in browsers, but I've never had much luck controlling search engine caches or network caches using any HTML-tag method.

The method that does seem to work consistently is to set up your server to send a proper Expires header in the HTTP headers served with your files.
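How you configure this depends on your server software. Purely as a sketch, here is what sending the header looks like with Python's built-in http.server. The handler name, the one-hour lifetime, and the port are assumptions for the example, not a recommendation:

```python
import time
from email.utils import formatdate
from http.server import BaseHTTPRequestHandler, HTTPServer


def expires_value(max_age_seconds, now=None):
    """Return an HTTP-date string max_age_seconds after `now` (epoch seconds)."""
    if now is None:
        now = time.time()
    # usegmt=True produces the "... GMT" form used in HTTP headers.
    return formatdate(now + max_age_seconds, usegmt=True)


class ExpiresHandler(BaseHTTPRequestHandler):
    """Serves one fixed page with an Expires header one hour in the future."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # The expiry travels in the HTTP headers, where caches and proxies read it.
        self.send_header("Expires", expires_value(3600))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Start the demo server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ExpiresHandler).serve_forever()
```

Calling run() and requesting http://127.0.0.1:8000/ should show the Expires line among the response headers, which is the same thing the cacheability checker mentioned below looks at.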

You might want to read some of the tutorials here [web-caching.com] and try the cacheability checker tool to check your files.

"Web robots may delete expired documents from a search engine, or schedule a revisit."

The observed behaviour of the most important search engine robots is that they re-spider your site and update their indexes only when they feel like it, and there's not much you can do to control it. Some spider the sites they deem most important more often, so that's one angle you can use - get more relevant good-quality incoming links.

Hope this helps,
Jim