Google and XML

Can it read it?

sullen

9:55 am on Apr 7, 2003 (gmt 0)

10+ Year Member



OK, another question. Can Google read XML? If so, can it actually combine it with the XSL (and therefore work out what the links are)?

Brett_Tabke

10:08 am on Apr 7, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



List: yes. Index: yes. Cache: yes. Decode: looks to be no.

[google.com...]

sullen

10:16 am on Apr 7, 2003 (gmt 0)

10+ Year Member



Ta very much.

Why ever didn't I think of doing a search like that?

fmonk

9:36 am on Apr 8, 2003 (gmt 0)

10+ Year Member



I just installed Movable Type on my host and ran ht://Dig to update my site search. It returned many errors when it hit the .rdf and .xml files that MT created, many of which were listed as "search-engine spamming."

That has me wondering how Googlebot and other spiders will react. Should I exclude these file types in my robots.txt file?
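For anyone weighing the robots.txt route, here is a rough sketch of how such an exclusion could be checked before deploying it. It uses Python's standard `urllib.robotparser`. The file names `/index.xml` and `/index.rdf` and the helper `is_allowed` are assumptions made for illustration, so substitute whatever your install actually generates. The rules use plain path prefixes because this parser does not understand wildcard patterns such as `*.xml`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The two feed paths are assumptions;
# replace them with the files your own install writes out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.xml
Disallow: /index.rdf
"""


def is_allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.xml", "/index.rdf", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", path))
```

This only shows what the rules say. Whether a given spider honours them, or needs them at all, is a separate question.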

chiyo

10:15 am on Apr 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are hundreds of MT sites with the default xml and rdf files (usually used for RSS delivery) indexed on Google. Because of their nature (just providing links to your last 10 or so entries, with short descriptions), htdig may interpret these multiple index files as "duplicate content". I don't think Google would, but don't take my word for it. It's a guess.

sullen

10:23 am on Apr 8, 2003 (gmt 0)

10+ Year Member



I posted a thread on XML and Google yesterday.

The answer seems to be that Google reads and indexes them, but doesn't cache or decode them (i.e. it won't be able to tell if there are any links in the docs).
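To make "decoding" a bit more concrete, here is a minimal sketch of what pulling the links out of a feed involves, using Python's standard `xml.etree.ElementTree`. The sample feed, its URLs and the helper `item_links` are all made up for illustration, and it assumes a simple un-namespaced RSS layout. It says nothing about what Google actually does internally.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed in a simple, un-namespaced RSS layout (an assumption;
# RDF-style feeds use XML namespaces and would need namespace-aware lookups).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example weblog</title>
    <link>http://example.com/</link>
    <item>
      <title>First entry</title>
      <link>http://example.com/archives/000001.html</link>
      <description>Short excerpt of the first entry.</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/archives/000002.html</link>
      <description>Short excerpt of the second entry.</description>
    </item>
  </channel>
</rss>
"""


def item_links(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> element in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in item_links(SAMPLE_FEED):
        print(title, "->", link)
```

A spider that treats the file as plain text sees the same URLs but has no idea they are links, which is the distinction being drawn above.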

fmonk

10:24 am on Apr 8, 2003 (gmt 0)

10+ Year Member



Yeah, but it's an educated guess... As you said, there are hundreds of MT sites indexed, so maybe Google is smart enough to overlook these file types.