Home / Forums Index / Code, Content, and Presentation / XML Development
Forum Library, Charter, Moderators: httpwebwitch

XML Development Forum

    
Remove duplicates in XSLT 1.0
Claes100




msg:4354456
 9:05 am on Aug 23, 2011 (gmt 0)

Hi,
I am stuck with a problem removing duplicate nodes.
My XML looks like:

...
<stat>
<overview>

<tools>
<item resp="abc"><link id="...">Item 1</link></item>
<item resp="abc"><link id="...">Item 2</link></item>
<item resp="abc"><sub><link id="...">Item 3</link></sub></item>
</tools>

<tools>
<item resp="abc"><link id="...">Item 4</link></item>
<item resp="abc"><sub><link id="...">Item 1</link></sub></item>
</tools>

<tools>
<item resp="abc"><link id="...">Item 1</link></item>
<item resp="abc"><link id="...">Item 5</link></item>
</tools>

</overview>
</stat>


Now I want to output all unique "Items" with @resp=abc.
Printing all of them is no problem using:

<ul>
<xsl:for-each select="//overview/tools//item[contains(@resp,$role)]">
<xsl:apply-templates select="."/>
</xsl:for-each>
</ul>

That gives me:
Item 1
Item 2
Item 3
Item 4
Item 1
Item 1
Item 5

But my desired outcome, removing duplicates, is:
Item 1
Item 2
Item 3
Item 4
Item 5

I tried to compare each item's name with all preceding ones and skip those that already exist, using:

<xsl:for-each select="//overview/tools//item[contains(@resp,$role) and not(text() = preceding::item[contains(@resp,$role)]/text())]">
<xsl:apply-templates select="."/>
</xsl:for-each>

but with the same output (with duplicates).

Any help is much appreciated!
Regards /Claes
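
A note on why the second attempt probably still prints duplicates: text() selects only the text nodes that are direct children of item, and in the sample XML the text sits inside link (or sub/link), so text() is an empty node-set on both sides. Comparing two empty node-sets is false in XPath 1.0, so the not(...) is true for every item and nothing gets filtered. A minimal sketch of the same idea that compares the item's whole string value instead is below. The default value of $role and the li wrapper are assumptions of mine (the original uses apply-templates), and it assumes the item names carry no stray whitespace differences.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- $role comes from the question; the default 'abc' is an assumption -->
  <xsl:param name="role" select="'abc'"/>

  <xsl:template match="/">
    <ul>
      <!-- "." is the item's full string value, including text inside link and sub.
           An item is kept only if no earlier matching item has the same string value. -->
      <xsl:for-each select="//overview/tools//item[contains(@resp, $role)
                            and not(. = preceding::item[contains(@resp, $role)])]">
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

For the sample XML as posted, this should list each of Item 1 to Item 5 once. The preceding:: axis rescans all earlier items for every item, so it may get slow on large documents.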

 

httpwebwitch




msg:4354514
 1:21 pm on Aug 23, 2011 (gmt 0)

Hi Claes100,
I suspect that XSLT is capable of this, but it won't be easy and I don't know the answer. My take on XSLT is "GIGO" - Garbage In, Garbage Out... When I'm transforming XML I do expect that the XML is going to be formatted and sorted and deduped and packaged up appropriately. The tools available in XSLT are not ideal for doing this kind of work.

Approaching the same problem, I'd be inclined to change the XML input rather than knit together complex XSLT templates to do data-manipulation jobs that XSLT isn't suited for. My language of preference is PHP, and I'm pretty sure I could dedupe the XML in about 20 lines of code, with 30 minutes of work, and the result would execute faster than the same thing done with XSLT.

Do you have control over the source of the XML?

If XSLT is your only option, there may be a way.

Here's a snippet I found on the interwebz. It checks each node to see whether any of its preceding siblings has the same value:


<xsl:template match="@*|node()">
<xsl:if test="not(node()) or not(preceding-sibling::node()[.=string(current())])">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:if>
</xsl:template>
source [jguru.com]
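
One caveat on the snippet above: it looks only at preceding-sibling nodes, so it would catch duplicates inside a single tools element but not an "Item 1" that repeats in a different tools element, which is the case in the question. A common XSLT 1.0 alternative is key-based (Muenchian) grouping. The sketch below is not from the jguru source; the key name items-by-name, the default value of $role, and the li output are my own assumptions. Since xsl:key cannot reference variables in XSLT 1.0, the @resp filter is applied after the key lookup rather than inside the key.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- $role comes from the question; the default 'abc' is an assumption -->
  <xsl:param name="role" select="'abc'"/>

  <!-- Index every item under overview/tools by its whitespace-normalized text.
       "items-by-name" is a hypothetical name chosen for this sketch. -->
  <xsl:key name="items-by-name" match="overview/tools//item"
           use="normalize-space(.)"/>

  <xsl:template match="/">
    <ul>
      <!-- Keep an item only if it is the first item (in document order)
           with this name that also matches $role -->
      <xsl:for-each select="//overview/tools//item[contains(@resp, $role)]
          [generate-id() = generate-id(
              key('items-by-name', normalize-space(.))[contains(@resp, $role)][1])]">
        <li><xsl:value-of select="normalize-space(.)"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

The key lookup avoids rescanning all earlier items for each item, which tends to matter on larger inputs. For the sample XML in the question, it should print Item 1 to Item 5 once each.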
Claes100




msg:4354523
 1:43 pm on Aug 23, 2011 (gmt 0)

Thanks!
No, I don't have control over the XML source unfortunately...
I'll give it a try tomorrow, and yes, it would have been easier (and more fun) to solve it by php... :)
/Claes

Claes100




msg:4356841
 8:39 pm on Aug 30, 2011 (gmt 0)

Just wanted to post a short comment that I didn't end up needing to test the snippet above. The structure of the XML has been changed, so duplicates will no longer exist.
Thanks anyways!
/Claes


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved