Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

XML vs. HTML for SERPs

is XML web content better in google's view?


vacorama

4:54 pm on Jul 1, 2005 (gmt 0)

10+ Year Member



Hello,
I just got my first XML book. It definitely seems like a technology where you don't fully realize why you need it until you learn it. It looks cool, though, and I'm definitely going to stick with it. I was wondering if redoing my site in XML would have any effect on my rankings? Right off the bat, I have a lot of messy HTML that could use some trimming down, so I'm sure the code-to-text ratio would benefit nicely. Does anyone know if having keywords as XML tags helps?

Clint

11:56 am on Jul 2, 2005 (gmt 0)



I'm interested in knowing this as well. It's something I've wondered about since I first heard of XML. If you're currently ranking well in the SEs, I wouldn't risk changing anything. I have heard that just changing a template can cause one to lose SERPs in G.

decaff

1:04 pm on Jul 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"...does anyone know if having keywords as xml tags help?..."

Not an unreasonable question... with XML you can actually write your own set of tags to describe your products/descriptions/content.

I would venture to say that, as with HTML, the bots will strip out the tags (anything between < and >) and simply index the text... effectively ignoring what the tags say.

No doubt search engineers have already discussed the fact that people would look to "gain some sort of ranking advantage" by "keyword stuffing" their XML tags...

The fact that with XML/CSS combined you can bring the all-important body text closer to the top of a page is a good reason to get this technology under your belt... it can also reduce page size by moving the formatting off the page.
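A minimal sketch of the stripping behaviour decaff describes (the tag names and the regex are illustrative only, not how any real bot works): once markup is thrown away, a page built from custom XML tags and an equivalent HTML page index to exactly the same text.

```python
# Sketch of a tag-stripping indexer: discard anything between < and >,
# keep only the text, so custom XML tag names contribute nothing.
import re

def strip_tags(markup: str) -> str:
    """Remove markup tags and collapse whitespace."""
    text = re.sub(r"<[^>]*>", " ", markup)
    return " ".join(text.split())

xml_page = "<product><name>Blue Widget</name><price>9.99</price></product>"
html_page = "<p><b>Blue Widget</b> 9.99</p>"

# Both documents reduce to the same bare text, whatever the tags say.
print(strip_tags(xml_page))   # Blue Widget 9.99
print(strip_tags(html_page))  # Blue Widget 9.99
```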

encyclo

4:22 pm on Jul 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google does not recognise XML - or rather it sees it as plain text only. It can parse it, but anything shown as a search result will be followed by the line "File Format: Unrecognized - View as HTML". I haven't tested, but I don't think it even strips out the tags.

XML is great server-side, but for the front end, stick to good old HTML (you can use XSLT to transform XML into HTML).
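As a rough illustration of that server-side approach (the <products>/<product> feed structure here is invented for the example), a minimal XSLT stylesheet turning an XML feed into HTML might look like:

```xml
<?xml version="1.0"?>
<!-- Sketch only: transforms a hypothetical product feed into an HTML list.
     Run this server-side so the client only ever sees HTML. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/products">
    <html>
      <body>
        <ul>
          <xsl:for-each select="product">
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```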

Clint

9:27 am on Jul 3, 2005 (gmt 0)



I believe G can see XML now. In their SERPs (for sites that use XML), you now see a "View in XML" link below the hit... if that means anything.

Johan007

2:17 pm on Jul 3, 2005 (gmt 0)

10+ Year Member Top Contributors Of The Month



You can't have a client-side page made up of XML, can you? Do you mean XHTML? If so, then Google will strip out the tags like it does for HTML, so it will not make any difference, because XHTML will never replace HTML.

However, you should aim to design all your new sites in valid XHTML so they work well with future and assistive technologies.

py9jmas

3:30 pm on Jul 3, 2005 (gmt 0)

10+ Year Member



Indeed you can. The browser applies either a CSS stylesheet or an XSL transform to present the data in the XML.

Look at [w3.org...] for example*, then look at the source code. Doing the transform server side and delivering (X)HTML to the client is safer, I agree.

* assuming a decent browser, i.e. Mozilla or Opera.
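The client-side hookup py9jmas describes is a single processing instruction at the top of the XML file (filenames below are hypothetical):

```xml
<?xml version="1.0"?>
<!-- Either point the browser at a CSS stylesheet... -->
<?xml-stylesheet type="text/css" href="style.css"?>
<!-- ...or at an XSL transform the browser applies itself
     (in practice you would use one or the other, not both): -->
<?xml-stylesheet type="text/xsl" href="transform.xsl"?>
<page>...</page>
```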

encyclo

5:32 pm on Jul 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Look at [w3.org...]

Yes, and look at that page in the SERPs: URL only, no snippet, no cache. What matters to Google (and the others) is the MIME type: if it is text/xml, Google will parse it as plain text and display it in the SERPs as an unknown format. If, like the above page, the MIME type is application/xml or application/xhtml+xml, then it won't be parsed at all. So the answer to the original question - would it have any effect on rankings - is yes, it would destroy them.

Try this search: filetype:xml test xml [google.com]

Some results are seen as unknown, some are parsed and cached. Check the MIME types for each result (right-click, View Page Info in Firefox). Only text/html (standard HTML pages just using an .xml file ending) or text/plain (which, when viewed, show the source code) pages have been parsed.

It's not a question of what the browsers can do (IE, Firefox, Opera and Safari can all read styled text/xml pages) but what Googlebot can do - and Googlebot can't parse XML in any useful way.

HTML (or XHTML served as text/html) is the only way.
<added> See this thread [webmasterworld.com], especially message #2 ;) </added>
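If you do serve XHTML, the practical upshot of the MIME-type point above is a one-line server setting. Assuming an Apache server, for example (the file extension choice is illustrative):

```apache
# Serve .xhtml pages as text/html so search engines parse them
# as ordinary HTML rather than treating them as raw XML.
AddType text/html .xhtml
```

The same idea applies on any server: whatever the file ending, make sure the Content-Type header sent to the bot is text/html.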

Clint

8:42 am on Jul 4, 2005 (gmt 0)



Johan, I don't recall the search I did when I saw the "View XML" links under one of the spots. I'll be watching for them now to see if they were XML or XHTML. I'm pretty sure it was XML because I think I clicked the link and saw the strange page with raw XML code instead of a typical webpage.

Clint

8:46 am on Jul 4, 2005 (gmt 0)



Just what is the difference between XML and XHTML? Isn't it XHTML that now uses, for example, <br /> instead of <br>?

uk_webber

8:52 am on Jul 4, 2005 (gmt 0)



"The fact that with XML/CSS combined....you bring the all important body text to closer to the top of a page... is a good reason to get this technology under your belt...it also can reduce the size of a page"

I do this with HTML and CSS using DIVs...

petehall

9:30 am on Jul 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought XML was designed to complement HTML + CSS, not replace it...

XML makes managing data really easy - especially if you want to share data across servers / sites without them connecting to your database.
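As a small sketch of that data-sharing use (the feed structure below is invented for the example), consuming someone else's XML feed without ever touching their database is only a few lines with a standard parser:

```python
# Parse a hypothetical product feed shared between sites as XML.
import xml.etree.ElementTree as ET

feed = """<products>
  <product><name>Blue Widget</name><price>9.99</price></product>
  <product><name>Red Widget</name><price>12.50</price></product>
</products>"""

root = ET.fromstring(feed)
items = [(p.findtext("name"), float(p.findtext("price")))
         for p in root.findall("product")]
print(items)  # [('Blue Widget', 9.99), ('Red Widget', 12.5)]
```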

mrMister

12:11 pm on Jul 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I vaguely remember someone mentioning on these forums that Googlebot had been seen sending XML in its HTTP_ACCEPT header. I can't find the post, though.
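If so, that would show up in the request's Accept header. As an illustrative sketch (the header value below is made up for the example, not a logged Googlebot request), checking whether a client advertises an XML media type:

```python
# Check whether a media type appears in a (hypothetical) Accept header.
def accepts(header: str, media_type: str) -> bool:
    # Split on commas, drop any ;q= quality parameters, compare bare types.
    offered = [part.split(";")[0].strip() for part in header.split(",")]
    return media_type in offered

accept = "text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.1"
print(accepts(accept, "application/xml"))       # True
print(accepts(accept, "application/xhtml+xml")) # True
```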