What information can the Engines see on your host?

Date and time of updated files? Length in bytes? What else?

         

larryhatch

7:35 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd like to know just what information is generally available to the search engines when they spider.

Do they take note of the date/time of updates to your files? How about the length in bytes?
What OTHER info can they or do they download?

- Larry

brotherhood of LAN

7:44 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Larry,

I can't say what they actually look for, but you can download a program such as curl or wget to see what's available to them.

Both work from the command line. These are WebmasterWorld's response headers:

C:\Documents and Settings\Richard Lees>c:/curl -I www.webmasterworld.com
HTTP/1.1 200 OK
Date: Fri, 07 Jan 2005 07:41:20 GMT
Server: Apache/1.3.26 (Unix) FrontPage/5.0.2.2510
Cache-Control: max-age=0
Pragma: no-cache
X-Powered-By: BestBBS v3.15
Content-Type: text/html

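I think only the first line (the status line) is strictly mandatory in an HTTP response. Date is required from any server that has a working clock, and Server is optional.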

Some people are keen to hide the "X-Powered-By" header to disguise the fact that they are using a server-side language.

Others will deliberately alter the "Last-Modified" header to make the page appear fresher than it is.

Most of it can be faked. I doubt the SEs give any special attention to headers other than the caching ones, though I'm probably missing something obvious :)
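If you'd rather script it than run curl by hand, here is a minimal Python sketch of the same kind of HEAD request. The function name fetch_headers and the timeout default are made up for illustration. It only shows what any client can read from a response, not what the engines actually record.

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return (status, [(name, value), ...]).

    fetch_headers is a hypothetical helper name, not part of any library.
    Note that urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    # HEAD asks the server for the headers only, much like `curl -I`.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, list(response.headers.items())


def format_headers(headers):
    """Render header pairs one per line, in the style curl prints them."""
    return "\n".join(f"{name}: {value}" for name, value in headers)
```

Calling fetch_headers("http://www.webmasterworld.com/") and passing the second item of the result to format_headers should give a listing similar to the curl one, though the exact headers depend on the server.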

larryhatch

8:51 am on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, Bro. I suppose the SEs pull in something similar.

Maybe I missed it, but in the WebmasterWorld example you gave, I did not see the file length in bytes (no Content-Length header).
Of course, if G or Y spiders the whole page, they can just measure the length themselves and keep it in their own logs.

My reason for asking is that I'd like to know which fields or info might be relevant to one's rankings.
Anyone can falsify their meta tags, but a change of file-length might indicate an actual revision better.

Best wishes - Larry

brotherhood of LAN

12:05 pm on Jan 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>change of file length

Possibly, but then there would be occasions where you've updated a page and it just so happens to be the same length in bytes.
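To make the same-length case concrete, here is a small Python sketch. The two page bodies and the helper names are invented for illustration, and hashing the body is just one possible way to spot a change, not something the engines are known to do.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the page body."""
    return hashlib.sha256(body).hexdigest()


def changed_by_length(old: bytes, new: bytes) -> bool:
    """Detect a change by comparing byte counts only."""
    return len(old) != len(new)


def changed_by_hash(old: bytes, new: bytes) -> bool:
    """Detect a change by comparing content digests."""
    return fingerprint(old) != fingerprint(new)


# Hypothetical page bodies: one character differs, the length does not.
old_page = b"<p>Updated Jan 6, 2005</p>"
new_page = b"<p>Updated Jan 7, 2005</p>"

print(changed_by_length(old_page, new_page))  # False
print(changed_by_hash(old_page, new_page))    # True
```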

I guess the HTTP spec created the likes of the "Last-Modified" header for this kind of thing. In an ideal world (where the data was always true), you could just use that to see whether a page had changed or not.
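As a rough sketch of that ideal-world check, the Python below compares a Last-Modified value against the time of a previous visit. The function name modified_since and the sample dates are assumptions for illustration. In practice a client would more likely send the date in an If-Modified-Since request header and let the server answer 304 Not Modified.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def modified_since(last_modified: str, last_crawl: datetime) -> bool:
    """Return True if the Last-Modified header is later than last_crawl.

    last_crawl must be timezone-aware, because HTTP dates are in GMT
    and parse to timezone-aware datetimes.
    """
    stamp = parsedate_to_datetime(last_modified)
    return stamp > last_crawl


# Hypothetical values: a header in HTTP date format and two visit times.
header = "Fri, 07 Jan 2005 07:41:20 GMT"
print(modified_since(header, datetime(2005, 1, 6, tzinfo=timezone.utc)))  # True
print(modified_since(header, datetime(2005, 1, 8, tzinfo=timezone.utc)))  # False
```

This only works as well as the header is honest, which is the catch already mentioned above.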