Forum Moderators: httpwebwitch

Remove duplicates in XSLT 1.0

     
9:05 am on Aug 23, 2011 (gmt 0)

New User

5+ Year Member

joined:Nov 21, 2007
posts: 33
votes: 0


Hi,
I am stuck with a problem removing duplicate nodes.
My XML looks like:

...
<stat>
  <overview>

    <tools>
      <item resp="abc"><link id="...">Item 1</link></item>
      <item resp="abc"><link id="...">Item 2</link></item>
      <item resp="abc"><sub><link id="...">Item 3</link></sub></item>
    </tools>

    <tools>
      <item resp="abc"><link id="...">Item 4</link></item>
      <item resp="abc"><sub><link id="...">Item 1</link></sub></item>
    </tools>

    <tools>
      <item resp="abc"><link id="...">Item 1</link></item>
      <item resp="abc"><link id="...">Item 5</link></item>
    </tools>

  </overview>
</stat>


Now I want to output all unique "Items" with @resp=abc.
Printing all of them is no problem using:

<ul>
  <xsl:for-each select="//overview/tools//item[contains(@resp,$role)]">
    <xsl:apply-templates select="."/>
  </xsl:for-each>
</ul>

That gives me:
Item 1
Item 2
Item 3
Item 4
Item 1
Item 1
Item 5

But my desired outcome, removing duplicates, is:
Item 1
Item 2
Item 3
Item 4
Item 5

I tried to compare each item's name with all preceding ones and skip the ones that already exist, using:

<xsl:for-each select="//overview/tools//item[contains(@resp,$role) and not(text() = preceding::item[contains(@resp,$role)]/text())]">
  <xsl:apply-templates select="."/>
</xsl:for-each>

but with the same output (with duplicates).

Any help is much appreciated!
Regards /Claes
1:21 pm on Aug 23, 2011 (gmt 0)

Moderator This Forum from CA 

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 29, 2003
posts:4059
votes: 0


Hi Claes100,
I suspect that XSLT is capable of this, but it won't be easy and I don't know the answer. My take on XSLT is "GIGO" - Garbage In, Garbage Out... When I'm transforming XML I do expect that the XML is going to be formatted and sorted and deduped and packaged up appropriately. The tools available in XSLT are not ideal for doing this kind of work.

Approaching the same problem, I'd be inclined to change the XML input rather than knit together complex XSLT templates to do data-manipulation jobs that XSLT isn't suited for. My language of preference is PHP, and I'm pretty sure I could dedupe the XML in about 20 lines of code, with 30 minutes of work, and the result would execute faster than the same thing done with XSLT.

Do you have control over the source of the XML?

If XSLT is your only option, there may be a way.

Here's a snippet I found on the interwebz. It checks each node to see whether any of its preceding siblings has the same string value.


<xsl:template match="@*|node()">
  <xsl:if test="not(node()) or not(preceding-sibling::node()[.=string(current())])">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:if>
</xsl:template>
source [jguru.com]
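
A note for later readers, offered as a sketch rather than a tested fix for Claes's real data. In the sample XML the item elements have no text children of their own (the text sits inside link or sub/link), so text() selects nothing in the attempted predicate, the comparison is always false, and not(...) lets every item through. The snippet above only looks at preceding siblings, so it would probably not catch duplicates that live under different tools parents either. One common XSLT 1.0 approach is Muenchian grouping with xsl:key, comparing the string value of the whole item. The key name item-by-text, the default value given to the role parameter, and the li output are assumptions made for illustration; only $role and the element names come from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Assumed parameter: the thread uses $role but never shows its declaration. -->
  <xsl:param name="role" select="'abc'"/>

  <!-- Index every item under overview/tools by its whitespace-normalized
       string value. XSLT 1.0 does not allow variables inside xsl:key,
       so the role filter is applied later, in the select expression. -->
  <xsl:key name="item-by-text"
           match="overview/tools//item"
           use="normalize-space(.)"/>

  <xsl:template match="/">
    <ul>
      <!-- Keep an item only if it is the first item (in document order)
           with this text among those matching the role. -->
      <xsl:for-each select="//overview/tools//item[contains(@resp, $role)]
          [generate-id() = generate-id(
              key('item-by-text', normalize-space(.))[contains(@resp, $role)][1])]">
        <li><xsl:value-of select="normalize-space(.)"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample XML (assuming it sits inside a well-formed document), this should list Item 1 through Item 5 once each. For small inputs, a lighter change to the original attempt would be to compare string values instead of text() nodes, i.e. not(. = preceding::item[contains(@resp,$role)]), at the cost of rescanning the preceding axis for every item.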
1:43 pm on Aug 23, 2011 (gmt 0)

Thanks!
No, I don't have control over the XML source unfortunately...
I'll give it a try tomorrow, and yes, it would have been easier (and more fun) to solve it by php... :)
/Claes
8:39 pm on Aug 30, 2011 (gmt 0)

Just wanted to post a short comment that I didn't need to test the snippet above. The structure of the XML has been changed and duplicates will no longer exist.
Thanks anyways!
/Claes
 
