xml

are those pages spidered

         

GetVisible

3:40 pm on Oct 5, 2002 (gmt 0)

10+ Year Member



I'm having trouble understanding whether XML pages can easily be spidered by the search engines. Does anyone have any ideas?

Mark_A

3:57 pm on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm not knowledgeable about XML, but my understanding is that it is about data / content management and integrating content from multiple sources into various outputs, one of which can be an HTML page. So it mainly allows separation of content from formatting, and merging of content from multiple sources with formatting for display / output.

If my understanding is correct, then browsers and spiders need not know that the page was generated from XML on the server side; to them it should / could be a normal page.
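To make that concrete, here is a minimal sketch of server-side rendering from an XML source, using only Python's standard library. The `products` / `product` / `name` / `price` element names and the `render_html` helper are made up for illustration; a real site would more likely use XSLT or a template engine.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML content source; the element names are assumptions.
PRODUCTS_XML = """<products>
  <product><name>Widget</name><price>9.99</price></product>
  <product><name>Gadget &amp; Co</name><price>19.50</price></product>
</products>"""


def render_html(xml_text: str) -> str:
    """Turn the XML data into an ordinary HTML page on the server."""
    root = ET.fromstring(xml_text)
    items = []
    for product in root.findall("product"):
        # Re-escape the text, because the parser has already decoded entities.
        name = escape(product.findtext("name", default=""))
        price = escape(product.findtext("price", default=""))
        items.append(f"<li>{name}: {price}</li>")
    return "<html><body><ul>" + "".join(items) + "</ul></body></html>"


# This string is all a browser or spider would receive.
page = render_html(PRODUCTS_XML)
```

The response contains only HTML, so nothing in it tells a spider that the content started out as XML.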

I am sure someone will correct me if I am wrong.

bird

9:28 pm on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, XHTML is technically a form of XML, and I have no problems at all getting that spidered.

On the other hand, if you're thinking about plain XML data based on other DTDs, then your guess is as good as mine. Some spiders may still be able to extract text from that, but I wouldn't count on it.
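As a rough illustration of the difference, the sketch below runs the same naive text extractor over an XHTML page and over a plain XML document. It uses Python's standard library; the sample documents and the `visible_text` helper are invented for this example, and nothing here reflects how any particular search engine actually works.

```python
import xml.etree.ElementTree as ET

# XHTML is well-formed XML, so a plain XML parser can read it.
XHTML_DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello</title></head>
  <body><p>Spider <em>friendly</em> text.</p></body>
</html>"""

# Plain XML with a made-up vocabulary (no HTML semantics at all).
PLAIN_XML_DOC = """<catalog><item sku="42"><label>Blue widget</label></item></catalog>"""


def visible_text(xml_text: str) -> str:
    """Collect all text nodes and normalise whitespace."""
    root = ET.fromstring(xml_text)
    return " ".join(" ".join(root.itertext()).split())


xhtml_text = visible_text(XHTML_DOC)      # "Hello Spider friendly text."
plain_text = visible_text(PLAIN_XML_DOC)  # "Blue widget"
```

Both documents yield text, but only the XHTML one tells the extractor which part is a title, a paragraph, or a link. With an arbitrary DTD the spider has raw text and nothing else to go on, and attribute values such as `sku` are skipped entirely.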

GetVisible

9:41 am on Oct 6, 2002 (gmt 0)

10+ Year Member



I've been looking at the webmaster pages on Google, and they say that spidering dynamic pages can put excessive pressure on a site's servers, so they limit the number of dynamic pages they crawl. I may have answered my own question, though I will have to look at the others.

bird

11:04 am on Oct 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First you asked about XML, now you're talking about dynamic pages, which are two completely independent concepts. What's your problem really about?