Preserve HTML entity from JavaScript string?

Browsers keep converting &#160; to &amp;#160;.

         

JAB Creations

5:50 pm on Sep 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm trying to insert an HTML entity (double space entity) like so...

myString = myString.replace(/\. /g,String('. &#160; '));
myString = myString.replace(/\? /g,String('? &#160; '));
myString = myString.replace(/\! /g,String('! &#160; '));


However, browsers keep replacing &#160; with &amp;#160;. None of the global functions [w3schools.com] seems to preserve the HTML entity when it's inside the string. I haven't been able to track down a native JavaScript method that inserts custom/specific entities. I also want to use numeric entities specifically. Thoughts, please?

- John

penders

10:41 pm on Sep 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm not sure how the HTML entity is being 'lost'. It should be preserved the whole time it is in the JavaScript string and only converted to its actual character when output in the HTML. I certainly don't see (at least not at the moment) how "&#160;" would become "&amp;#160;" without explicitly doing this conversion at some point?

&#160; is the numeric entity reference for the non-breaking space (double space?). So your example essentially replaces a single space with 3 x spaces - which appears to work OK for me.

<pre><script type="text/javascript"> 
myString = 'Hello. World';
myString = myString.replace(/\. /g,String('. &#160; '));
for (var n=0; n<myString.length; n++) {
document.write('-' + myString[n]);
}
</script></pre>


This outputs the following:
-H-e-l-l-o-.- -&-#-1-6-0-;- -W-o-r-l-d


The hyphen between each character is just to prevent the HTML entity from being converted (by the browser) when output to the page. This works OK for me in IE8, Chrome 13 and Firefox 3.6.
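As an aside, the three separate replace() calls from the original post could probably be collapsed into one with a character class and a capture group. This is only a sketch: the function name widenSentenceGaps is made up for illustration and is not from the thread. It works purely on the JavaScript string, where "&#160;" is just six ordinary characters.

```javascript
// Hypothetical helper (name not from the thread): insert the literal text
// "&#160;" after sentence-ending punctuation that is followed by a space.
function widenSentenceGaps(text) {
  // [.?!] matches any one of the three terminators;
  // $1 puts back whichever terminator was matched.
  return text.replace(/([.?!]) /g, '$1 &#160; ');
}

var sample = widenSentenceGaps('Hi. Why? Yes! Done.');
console.log(sample); // Hi. &#160; Why? &#160; Yes! &#160; Done.
```

The final "Done." is left alone because no space follows it, which matches the behaviour of the original three patterns.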

Do you have an example where this entity is being replaced?

astupidname

7:39 am on Sep 14, 2011 (gmt 0)

10+ Year Member



Ah, I see why you have the problem. This is one area where setting innerHTML can be more intuitive: it takes the entities no problem. But you're obviously (and historically well noted :) using createTextNode, which does not accept entities and actually converts them. The solution = String.fromCharCode
Instead of:
String('. &#160; ')

Use:
'. '+String.fromCharCode(160)
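To make the difference between the two forms concrete, here is a small sketch that runs without a DOM. The variable names are illustrative only. It shows that String.fromCharCode(160) yields the non-breaking space character itself (U+00A0), whereas '&#160;' in a JavaScript string is six separate characters.

```javascript
// String.fromCharCode(160) is the non-breaking space character, U+00A0.
var nbsp = String.fromCharCode(160);

// Literal entity text: six characters are inserted ("&", "#", "1", "6", "0", ";").
var withEntityText = 'Hello. World'.replace(/\. /g, '. &#160; ');

// Actual character: a single U+00A0 is inserted, as suggested above.
var withCharacter = 'Hello. World'.replace(/\. /g, '. ' + nbsp);

console.log(nbsp === '\u00A0');      // true
console.log(withEntityText.length);  // 19
console.log(withCharacter.length);   // 13
```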

penders

12:33 pm on Sep 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



astupidname: ...using createTextNode which does not accept entities and actually converts them.


I think you've hit the mark. However, just querying your terminology... createTextNode() takes the passed string as a literal string. Any HTML entities it might contain are just treated as literal characters and appear as-is in the output. It's just pure text. In my mind, nothing is actually 'converted' (as you suggest) - the HTML entities themselves are actually preserved (which is really the opposite to how I think the word "preserve" is used in the title of this thread). "&#160;" goes in, "&#160;" comes out. Is there a conversion that I'm not seeing?

Although this wouldn't explain why "&#160;" would become "&amp;#160;" - unless that is not actually what is happening, and is just an assumption from the apparent output?
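One possible explanation, offered only as a guess: if the text node is later serialized back to markup, the serializer has to escape the literal ampersand, and that step would turn "&#160;" into "&amp;#160;". The sketch below is a simplified model of that escaping, not a browser API; the function name escapeTextData is invented here, and real serializers differ in detail (for example in how they treat the U+00A0 character itself).

```javascript
// Simplified, hypothetical model of how text node data is escaped when a
// DOM tree is serialized back to markup. Not an actual browser function.
function escapeTextData(data) {
  return data
    .replace(/&/g, '&amp;')  // ampersands first, so later output is not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Entity text stored in a text node: the "&" is just a character, so it is escaped.
console.log(escapeTextData('Hello. &#160; World')); // Hello. &amp;#160; World

// A real U+00A0 character contains no ampersand, so this model leaves it alone.
console.log(escapeTextData('Hello. \u00A0 World') === 'Hello. \u00A0 World'); // true
```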

JAB Creations

4:10 pm on Sep 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thank you both for your replies. I tried out String.fromCharCode as it looked promising; however, it generates the actual character instead of inserting the literal HTML entity itself. So when I serialized the DOM fragment it displayed two spaces at the end of each sentence instead of outputting the entity.

So then I decided that if it's going to RENDER the entity, I could maybe fool the computer into rendering the ampersand before the entity (minus the leading ampersand).

mystring = mystring.replace(/\. /g,'. '+String.fromCharCode(38)+'#160; ');


The result?

&amp;#160;


...nope.

Yes Aname, I am using the createTextNode method. My approach is to use string replacement to clean up the strings before generating the text node though. Penders mentioned that using the createTextNode method would output the literal string version of the entity so I did a quick test...

var test = document.createTextNode('&#160;');
alert(test.nodeValue);


...it outputs....

&#160;


...as desired!

So I'm left with a few choices at this time. I can keep looking for a way to output a literal string version of an HTML entity (possible, though I think unlikely unless it's buried deep in the DOM 1/2/3 documentation), or I could split the text string, create a node for each piece (with a dedicated node for the entity) and merge them back together into a single string. I also just tried the following...

var a1 = '&';
var a2 = '#160;';
var a3 = a1+a2;
alert(a1+a2+'\n\n'+a3);


...which does output the literal string version of the HTML entity. So I tried the following...

var en1 = '&';
var en2 = '#160;';
var mystring = mystring.replace(/\. /g,'. '+en1+en2+' ');


...which doesn't work, go figure.

So unless someone comes across a method that doesn't seem to exist, I think that, since regular expressions won't output literal string versions of HTML entities, I'm going to have to split the main string wherever I want the entity, insert the entity at the end of each text node, and then finally loop through the generated text nodes and append them to the end...

Okay here we go...

The problem was that, while working with a string that is a nodeValue, browsers will attempt to escape ampersands. So I did the following...

var mystring = String(node.nodeValue);
mystring = mystring.replace(/\. /g,'. &#160; ');


...which works!

The odd thing to keep in mind is that when you alert the typeof of a nodeValue, the browser will still tell you that it's a string while still treating it as direct (X)HTML.

So I have my answer, thank you both! This really helps me clean up the output of the editor I'm working on. :)
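For anyone landing here later, a different approach (not the one used above) might be to keep the real U+00A0 character in the text nodes and only swap it for the numeric entity after serialization, once the markup exists as a plain string. This is a minimal sketch under that assumption; the function name nbspToNumericEntity is hypothetical.

```javascript
// Hypothetical post-processing step: takes already-serialized markup as a
// string and rewrites non-breaking spaces as the numeric entity &#160;.
function nbspToNumericEntity(markup) {
  // Handles both the raw U+00A0 character and the named form &nbsp;,
  // since serializers may emit either.
  return markup.replace(/&nbsp;|\u00A0/g, '&#160;');
}

console.log(nbspToNumericEntity('<p>Hello.\u00A0 World</p>')); // <p>Hello.&#160; World</p>
console.log(nbspToNumericEntity('<p>Hello.&nbsp; World</p>')); // <p>Hello.&#160; World</p>
```

Because the replacement happens on the finished markup rather than on a nodeValue, the ampersand is never passed back through a serializer and so is not escaped a second time.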

- John