innerHTML property vs actual HTML

the property doesn't match what's really there

         

Purple Martin

1:12 am on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm trying to find the position of some text in a page. Having discounted other ways of doing this because they don't work in some browsers, I'm looking for the text in the body's innerHTML property. Unfortunately, using IE6, when I look at what's in the property I can see that it is different from the real HTML in my file. Here is a small page which demonstrates the problem:

<html>
<head>
<script language="JavaScript">
// Report where the search text first appears in the body's innerHTML.
function search(form) {
  // Lowercase both strings so the comparison is case-insensitive.
  var searchText = form.findthis.value.toLowerCase();
  var everything = document.body.innerHTML.toLowerCase();
  var position = everything.indexOf(searchText);
  var message = "You searched for '" + searchText;
  message = message + "'.\n\nThe lowercased innerHTML looks like this:\n\n";
  message = message + everything + "\n\nPosition:\n" + position;
  alert(message);
}
// Put the cursor in the search field when the page loads.
function doFocus() {
  document.form1.findthis.focus();
}
</script>
</head>
<body onLoad="doFocus()">
<form name="form1" onSubmit="search(this); return false">
<input type="text" name="findthis" size="15">
<input type="submit" value="Find">
</form>
<p>This is some text. Try searching for one of these words.</p>
<p>In theory you should get the position of the word you've searched for, but you'll actually get 76 instead.</p>
<p>Have a close look at the innerHTML displayed in the alert, you'll see that it doesn't match the actual HTML code.</p>
<p>Specifically, there are problems with the attribute quotes in the input element.</p>
</body>
</html>

DrDoc

1:41 am on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mozilla leaves the quotes exactly the way they are...
Opera replaces the double quotes with single quotes...
IE removes the double quotes entirely...

<added>
Using single quotes causes Mozilla to replace them with double quotes...
Opera leaves the quotes alone...
IE again removes the quotes entirely...
</added>

Purple Martin

2:41 am on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Interesting. It's as if each browser rebuilds the innerHTML property from its stored DOM of the page instead of just returning the raw HTML code.

But that doesn't explain why I can't get a result for a position past 76! It really shouldn't matter whether the string variable everything has single quotes, double quotes or no quotes around the attributes. I should still be able to use indexOf(), shouldn't I?

Purple Martin

3:03 am on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm getting somewhere. The value of the text field starts at position 76. The browser can't read past that position in the string, presumably because the value changes every time I type in it (thus changing the innerHTML property).

This still seems like bizarre behaviour to me, because all I want to do is use indexOf() on a normal string variable! How strange that I can't get a position in a string... maybe the browser doesn't actually store the variable as a genuine string object, maybe it still has a reference back to the objects in the DOM?
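There may be a simpler reading, offered here as an assumption rather than something confirmed in IE6: if the browser writes the field's current value into the serialized markup, then the search text itself always appears at the start of that value, and indexOf() returns that first match. The sketch below uses a made-up, shortened markup string (not real browser output) to show the effect, and how a start index moves the search past the first hit.

```javascript
// Hypothetical serialized markup, roughly how the page might look after
// "words" has been typed into the text field (an assumption, not real output).
var everything =
  '<form name=form1><input size=15 value=words></form>' +
  '<p>try searching for one of these words.</p>';

var searchText = "words";

// indexOf() returns the FIRST match, which here is the field's own value.
var first = everything.indexOf(searchText);

// Passing a start index continues the search past that first match.
var next = everything.indexOf(searchText, first + searchText.length);
```

Under that assumption the string is an ordinary string and can be read all the way through. The position simply never moves, because the first match is always the text that was just typed.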

My best work-around so far: I put the paragraphs in a div with an id, and I look at the innerHTML of that div. Because the form is outside the div it doesn't cause problems. I still don't like this solution: what if my page has another form somewhere in it - I still want to be able to find text that's after the form. :-(
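For a page with forms in several places, one possible alternative to the div approach is to keep searching the whole string but skip any match that falls inside a tag. This is only a rough sketch: indexOfOutsideTags is a hypothetical helper name, and it assumes that no attribute value contains a literal "<" or ">" character, which would confuse the inside-a-tag check.

```javascript
// Hypothetical helper: return the first index of `needle` in `html`
// that does not fall inside a tag (between "<" and ">"), or -1.
// Assumes attribute values contain no literal "<" or ">" characters.
function indexOfOutsideTags(html, needle) {
  if (needle === "") return -1; // nothing to search for
  var haystack = html.toLowerCase();
  var target = needle.toLowerCase();
  var from = 0;
  while (true) {
    var hit = haystack.indexOf(target, from);
    if (hit === -1) return -1;
    var lastOpen = haystack.lastIndexOf("<", hit);
    var lastClose = haystack.lastIndexOf(">", hit);
    // The match is inside a tag when the nearest "<" to its left
    // has not been closed by a ">" yet; in that case keep looking.
    if (lastOpen <= lastClose) return hit;
    from = hit + 1;
  }
}
```

In the demo page this could be called as indexOfOutsideTags(document.body.innerHTML, form.findthis.value). The position it returns is still an offset into the browser's rebuilt markup, not into the original HTML file.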