
Google News Archive Forum

    
Optimizing .xsl pages
twright




msg:52482
 7:23 pm on Jun 16, 2003 (gmt 0)

I have a potential new client whose site is built using .xsl pages. I haven't really dealt with these before. I am assuming that Google can read these pages OK, but I'm not seeing many SERPs showing results with .xsl.

I've worked with lots of dynamically driven sites, but this is a new one for me. Any insight on how Google treats these pages, and on how to get the dynamic content indexed, would be greatly appreciated.

 

Seattle_SEM




msg:52483
 8:27 pm on Jun 16, 2003 (gmt 0)

XSL (eXtensible Stylesheet Language) is applied to XML data to output either text, HTML, or XML.

Your client's page would not typically look like this in the address bar:
[blah.com...]

If they're depending on the client (browser) to apply an XSL to existing XML data, then it would look like this: [blah.com...]

And then the XML file would reference the XSL file, which is used to render the final output HTML.

Or, more commonly, an XML DOM such as MSXML is used to apply an XSL to an XML document, and output HTML. In this case, the URL would look like this, and Google would index it as normal:
[blah.com...]

So is it server-side or client-side XML processing?
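
For the client-side case, the reference from the XML file to the stylesheet is normally an xml-stylesheet processing instruction near the top of the document. A minimal Python sketch of how one might look for it follows. The sample document and the stylesheet_href helper name are made up for illustration and are not taken from the client's site.

```python
import re
from xml.dom import minidom


def stylesheet_href(xml_source):
    """Return the href of the first xml-stylesheet processing instruction, or None."""
    doc = minidom.parseString(xml_source)
    # Processing instructions placed before the root element are children
    # of the document node itself.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            match = re.search(r'href\s*=\s*["\']([^"\']+)["\']', node.data)
            if match:
                return match.group(1)
    return None


# Hypothetical document that asks the browser to apply products.xsl.
SAMPLE = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="products.xsl"?>'
    '<catalog><item>Widget</item></catalog>'
)

print(stylesheet_href(SAMPLE))
```

If the helper returns None for a page that is served as XML, the transformation (if there is one) is presumably happening on the server.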

twright




msg:52484
 8:38 pm on Jun 16, 2003 (gmt 0)

That's the confusing thing - the address bar has a .xsl extension - and this is throughout the entire site (thousands of dynamically generated pages). I was under the impression that they would appear as .xml as well. I'm not an expert with XML, but I've dealt with it enough to know that the XSL is the stylesheet that renders the XML visible.

They have several pages in the Google index, but fewer than 50, definitely not good saturation for a site of this size. I think they may have, in the past, been involved in some cloaking activity, but I haven't received confirmation of that yet.

It looks like Google is reading the pages, but only when .xsl is followed by database query language (? and =).

My guess is that Google won't read any of the main pages because of the .xsl extension, but will read some of the dynamically generated pages because of the query language.
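
To check that guess against the pages Google actually lists, the indexed URLs could be bucketed by whether a query string follows the .xsl extension. A rough Python sketch follows. The example.com URLs are placeholders, not the client's real pages, and the classify function is just an illustrative name.

```python
from urllib.parse import urlsplit


def classify(urls):
    """Split .xsl URLs into those with a query string and those without."""
    with_query, without_query = [], []
    for url in urls:
        parts = urlsplit(url)
        # Ignore anything that is not served under a .xsl path.
        if not parts.path.lower().endswith(".xsl"):
            continue
        if parts.query:
            with_query.append(url)
        else:
            without_query.append(url)
    return with_query, without_query


# Placeholder URLs standing in for a list copied from a site: search.
INDEXED = [
    "http://www.example.com/index.xsl",
    "http://www.example.com/product.xsl?id=42",
    "http://www.example.com/about.html",
]

dynamic, static = classify(INDEXED)
print(len(dynamic), "with query string;", len(static), "without")
```

If nearly everything indexed lands in the first bucket, that would be consistent with the guess above, though it would not prove the extension is the cause.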

Any thoughts on optimizing a site like this?

vincevincevince




msg:52485
 8:45 pm on Jun 16, 2003 (gmt 0)

mod_rewrite it all as .htm?
That's solid and dependable.
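
The idea would be to publish .htm URLs and have the server map them back to the real .xsl files internally. Here is a rough Python sketch of the mapping such a rule would perform. It is an illustration only: the real rule would live in the web server's rewrite configuration, and the function name and example.com URL are made up.

```python
from urllib.parse import urlsplit, urlunsplit


def internal_target(public_url):
    """Map a public .htm URL to the .xsl file that would actually serve it."""
    parts = urlsplit(public_url)
    path = parts.path
    if path.lower().endswith(".htm"):
        # Swap only the extension; the query string is carried over untouched.
        path = path[:-len(".htm")] + ".xsl"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(internal_target("http://www.example.com/products.htm?id=42"))
```

Keeping the query string intact matters here, since the dynamic pages depend on it.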

Seattle_SEM




msg:52486
 8:49 pm on Jun 16, 2003 (gmt 0)

Are they actually XSL files in the source, or are they just being silly and naming them all *.xsl?

View the source of one of the pages, and let us know if it looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet
version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:user="urn:my-scripts"

exclude-result-prefixes="#default">

<xsl:output method="html" indent="yes" />

<!-- blah blah -->
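
If eyeballing the source isn't conclusive, the same check can be roughed out in Python: parse the file as XML and see whether the root element is an XSLT stylesheet. This is only a sketch under the assumption that a real stylesheet is well-formed XML. The looks_like_xslt name and both sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"


def looks_like_xslt(source):
    """Return True if source parses as XML with an XSLT root element."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        # Ordinary tag-soup HTML usually fails here (unclosed META, etc.).
        return False
    # XSLT allows either xsl:stylesheet or its synonym xsl:transform as root.
    return root.tag in ("{%s}stylesheet" % XSLT_NS, "{%s}transform" % XSLT_NS)


REAL_XSL = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:output method="html" indent="yes"/>'
    '</xsl:stylesheet>'
)

PLAIN_HTML = (
    '<html><head>'
    '<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    '</head><body>Hello</body></html>'
)

print(looks_like_xslt(REAL_XSL), looks_like_xslt(PLAIN_HTML))
```

A well-formed XHTML page would also come back False, because its root element is not in the XSLT namespace.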

twright




msg:52487
 9:03 pm on Jun 16, 2003 (gmt 0)

Actually, I think you may have helped me stumble upon the problem. It looks like they may just be HTML pages masquerading as .xsl files. Below is what they have, so I guess if we get the business, my first task is to rename the files and make sure they still work with the dynamic side of things. I think that may actually work!

<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

Seattle_SEM




msg:52488
 9:09 pm on Jun 16, 2003 (gmt 0)

That's what you call a bad programmer trying to impress an even more clueless manager ;-)

twright




msg:52489
 9:11 pm on Jun 16, 2003 (gmt 0)

Yeah, this potential client could be great for us, but I'm scared to death about the project, considering this isn't the only weird thing I've seen on the site. And it's a BIG company too... Crazy stuff!

rogerd




msg:52490
 9:11 pm on Jun 16, 2003 (gmt 0)

Looks like they may just be HTML pages masquerading as .xsl files.

That's one way to give your site a shiny, new, high tech look... rename all your html files to the flavor of the month. (Though I think .xml or .aspx might be more impressive.) ;)

twright




msg:52491
 9:20 pm on Jun 16, 2003 (gmt 0)

We may be getting off on a tangent here, but I don't know WHY they would do this. I will get to talk to the Webmaster, so I'll ask, but I'll be discreet because his boss probably won't know that he did this.

But WHY? The general public could give a crap whether the page is written in HTML, ASP or Klingon for that matter, as long as it works. And these pages do work.

There could be several answers, but it's really strange. Why would someone do this?

Dolemite




msg:52492
 11:36 pm on Jun 16, 2003 (gmt 0)

But WHY? The general public could give a crap whether the page is written in HTML, ASP or Klingon for that matter, as long as it works. And these pages do work.

There could be several answers, but it's really strange. Why would someone do this?

Sounds like he's reaching for some kind of dork-webmaster street cred, even though most would realize .xsl isn't what he thinks it is.

mil2k




msg:52493
 9:12 am on Jun 17, 2003 (gmt 0)

A year ago my boss faced a similar problem. The savvy programmers created a file with an unknown extension. The site wasn't indexed by Google for 6 months. With great difficulty, the site is now finally indexed. :)

We may be getting off on a tangent here, but I don't know WHY they would do this

Because search engines are the last thing they have on their mind. Also the desire to be different from those ubiquitous .htm files ;) Added to that, ignorance is bliss :)

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved