location.href URL value when form action is POST on remote page

POST, GET, location.href, ajax, file_get_contents

         

dusky

2:23 am on Jan 8, 2010 (gmt 0)

10+ Year Member



Time to swallow one's pride and ask for help here, makes a change from helping others on different matters and threads.

On one of the projects I am working on, I have it almost done except for one hurdle. If I carry on banging my head I'll eventually work it out somehow, but the project is well overdue and I don't want to waste more time, so here goes my problem:

If you have seen or worked with the autolink module (they call it Multihook now) from Zikula / PHPnuke and other CMS variants, that's what I want to achieve, BUT the difference is that it runs on remote sites that are hosted using our hosting service. That is, they get a JavaScript snippet and place it in their header templates or HTML/PHP pages, and that calls the JS script on our site, which itself calls PHP scripts to scan their page and search and replace certain keywords with links to recommended partner sites. Basically, if it finds widget, it'll turn it into a hyperlink with the partner site as the target URL and the alt title being a message such as visit our partner about widgets.

Now, for simplicity's sake, I have replace.php as the main PHP script, which calls different functions and include files that do the search and replace etc. on the $_GET['url'] or $_REQUEST['url'], which is $url.
I use $data = file_get_contents($url); to fetch the page, analyze it and do the search and replace, then echo the result at the end of the script, which applies the replaced content on the user's site / pages.

There is also replace.js, the JS script. On their sites / pages, users place this:

<script type="text/javascript" src="http://mysite.com/replace/replace.js"></script>
and change the body tag from <body> to
<body id='src' onload="doHttpRequest();">
doHttpRequest is the Ajax function, which has .....replace.open("GET", "http://mysite.com/replace/replace.php?url="+url, true);..... with the url var being location.href.

When I test it on my localhost machine with many sites locally, it does the job in every respect, EXCEPT:

When you are on a site that has, say, a form to fill in, and that form has a preview page and its action is POST instead of GET with hidden values, the URL (url = location.href; in the doHttpRequest function) stays the same as on the previous page. So the request in the above ajax sends the URL without parameters, as it does not see them because the action of the form is POST, just like normal users can't see them in the address bar, and I end up with the preview page in news articles, forum posts etc. being refreshed back to the original URL with empty fields. I.e., when you are on /postnews.php, you fill in the news form and press PREVIEW, you are supposed to be on /postnews.php?action=preview&id=1234&watever=something but you only see /postnews.php, and that's fine because the action was POST. The problem is that when you place the JS script on the page, the doHttpRequest function grabs /postnews.php and sends it to the PHP S&R function instead of /postnews.php?action=preview&id=1234&watever=something

Can't blame the php Search & Replace functions, they get the $url from the ajax as $_GET[url] and that url is /postnews.php

I think I am close when I say the ajax request needs to find a way to get the URL as if the form action were GET. I tried changing the form action on news and forums to "GET" and all works OK on our sites, but of course I can't ask other sites to change their forms to use GET instead of POST.

I'd appreciate step-by-step help to solve the problem. PM me if you wish to help and give me your favorite charity details once all this is solved, and I promise I'll still scan these forums and try to help in other matters in which I am more qualified.

astupidname

11:00 pm on Jan 17, 2010 (gmt 0)

10+ Year Member



Just a thought that occurs to me: why parse the page with PHP? Granted, you are already set up to use an ajax call to PHP, which is less efficient than doing the replacing strictly in JavaScript, but whatever. Aside from that, though, why must the PHP grab the page? If you really prefer to parse the page in its "live" state, perhaps you could just send document.body.innerHTML (instead of the URL) to the PHP, have the PHP parse that and return it, then set document.body.innerHTML = (the response text from ajax).
Just my thoughts off the top of my shallow head, don't know really if I'm being dense here or making sense at all, but good luck.

dusky

12:20 pm on Jan 18, 2010 (gmt 0)

10+ Year Member



astupidname, thanks for your reply. The work has already been done in PHP (functions, filtering, regex etc.), and having to do all that in JavaScript may take a few more hours. I know translating it should not be a major problem, but because a big chunk of the code is about the interaction with the database and other PHP scripts, JavaScript alone would not be the best option. I agree that it would be quicker and more efficient if there were only a few things to be changed, but the database has 10k+ keywords and phrases, as well as being queried for all sorts of things from stats and tracking to the admin UI.

Now, below I have in brief, not all content of the scripts except the ajax calls:

1) the ajax file, lodfunctions.js file:


var xmlhttp
function rePlace(){
xmlhttp=GetXmlHttpObject();
if (xmlhttp==null){
alert ("Your browser does not support AJAX!");
return;
}
var page = escape(location.href);
var dest = "?url=" + page;
var url="http://localhost/ads/replace.php" +dest;
xmlhttp.onreadystatechange=stateChanged;
xmlhttp.open("GET",url,true);
xmlhttp.send(null);
}
function stateChanged(){
if (xmlhttp.readyState==4){
document.getElementById("theID").innerHTML=xmlhttp.responseText;
}
}
function GetXmlHttpObject(){
if (window.XMLHttpRequest){
// code for IE7+, Firefox, Chrome, Opera, Safari
return new XMLHttpRequest();
}
if (window.ActiveXObject){
// code for IE6, IE5
return new ActiveXObject("Microsoft.XMLHTTP");
}
return null;
}

2) the test HTML or PHP page, test.html or test.php


<html>
<head>
.........
.........
<script type="text/javascript" src="http://localhost/ads/lodfunctions.js"></script>
</head>
<body id="theID" onload="rePlace();">
....................
...................
</body>
</html>

3) the replace.php script which has other php include files doing the regex and filtering, database stuff etc


<?php
$file = $_REQUEST['url']; // or $file = $_GET['url'];
.................
$data = file_get_contents($file);
....................
few include php files with function to analyze, parse and return content of $data (they return $content;)
..............................
.............................
echo $content;

?>

You see, the test page passes its URL (location.href) to lodfunctions.js, which sends that URL to replace.php; the PHP script fetches the page and echoes the parsed content with the replaced keywords. All works OK except on submitted preview forms...as explained above.

passing document.body.innerHTML instead of the location.href does make sense, I tried the below change but got errors from php about file_get_contents encountering problems:
I tried: var page = document.body.innerHTML; instead of var page = escape(location.href);
and
document.body.innerHTML=xmlhttp.responseText; instead of document.getElementById("theID").innerHTML=xmlhttp.responseText;
I can see theID is then no longer used, and I am not sure whether I should change
<body id="theID" onload="rePlace();">
to:
<body onload="rePlace();">

I am also aware that I should use POST in the ajax calls sending data url encoded with the relevant headers, but that does not seem to be the issue.

Any pointers will be greatly appreciated. I was never a JavaScript programmer and know little client-side for that matter; Perl is my main thing for solid backend programming and PHP for frontend site design, but I admit client-side is back in fashion due to ajax/jQuery, which can't be ignored.

Fotiman

3:51 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Have you considered how this will affect sites that have their own scripts and event handlers within the <body></body> tags? You are essentially replacing any scripts and/or event handlers when you replace the innerHTML on the body, so that could have a major impact. Replacing the innerHTML of the body is not a good idea.

Also, sending the page as a URL which must then be parsed by your PHP page will not work for pages that process form POST data, and maybe not for some other pages as well. If someone has any sort of server side script that checks the referrer to deliver customized content, then that won't work for your PHP script.

The best alternative is to have your JavaScript send requests back to the server, passing the text to be replaced as a parameter instead of passing a URL to be parsed. For example, have your JavaScript walk the DOM and send each text node to your PHP page. The PHP page would then do the lookup for any keywords in the database, and return the modified value, and your JavaScript would replace the appropriate node. That's really the only way that will work and will be safe from destroying existing event handlers, and will work for pages that display POSTed data.
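
A minimal sketch of the DOM walk described above. The helper name `collectTextNodes` and the skip list are assumptions for illustration, not code from this thread. It only reads `nodeType`, `nodeName` and `childNodes`, so in a browser you would pass `document.body`; the usage part builds a stand-in tree from plain objects so the sketch also runs outside a browser.

```javascript
// Hypothetical skip list: elements whose text should never be rewritten.
var SKIP = { script: true, style: true, textarea: true, iframe: true };

// Hypothetical helper: gather every text node under `root`, in document order.
function collectTextNodes(root, found) {
  found = found || [];
  var kids = root.childNodes || [];
  for (var i = 0; i < kids.length; i++) {
    var node = kids[i];
    if (node.nodeType === 3) {                 // 3 = text node
      found.push(node);
    } else if (node.nodeType === 1 &&          // 1 = element node
               !SKIP.hasOwnProperty(String(node.nodeName).toLowerCase())) {
      collectTextNodes(node, found);           // descend into allowed elements
    }
  }
  return found;
}

// Stand-in for <body><p>widget <b>sale</b></p><script>var widget;</script></body>
var fakeBody = {
  nodeType: 1, nodeName: 'BODY', childNodes: [
    { nodeType: 1, nodeName: 'P', childNodes: [
      { nodeType: 3, data: 'widget ' },
      { nodeType: 1, nodeName: 'B', childNodes: [{ nodeType: 3, data: 'sale' }] }
    ] },
    { nodeType: 1, nodeName: 'SCRIPT', childNodes: [{ nodeType: 3, data: 'var widget;' }] }
  ]
};

var textNodes = collectTextNodes(fakeBody);
console.log(textNodes.length); // 2: the text inside <script> is skipped
```

Because the walk keeps references to the original nodes, the response handler can later update exactly those nodes and leave every element and event handler in place.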

dusky

4:59 pm on Jan 18, 2010 (gmt 0)

10+ Year Member



Fotiman, thanks for your response.
Have you considered how this will affect sites that have their own scripts and event handlers within the <body></body> tags? You are essentially replacing any scripts and/or event handlers when you replace the innerHTML on the body, so that could have a major impact. Replacing the innerHTML of the body is not a good idea.

Yes, I've done a lot of filtering work using a lot of Perl regex, well in the php scripts to only replace the keywords or phrases nothing else and ignores any tags, script calls etc.

Also, sending the page as a URL which must then be parsed by your PHP page will not work for pages that process form POST data

Fotiman, this is the reason I started this thread, all works OK except on a page with POST form data with submit button to preview the page, then it becomes a problem. As I said above, in test.php if there is a form to fill in, say a forum post or reply with action=post, you press preview which should go to test.php?action=preview&id=234 for example, it's normal to only see test.php on the address bar because of the action not being a GET action, the ajax function sends the location.href to php's file_get_contents function for processing and that location.href is test.php, so it becomes file_get_contents(test.php); WHEN it should be file_get_contents(test.php?action=preview&id=234);. If the form action is GET all works OK.
astupidname suggested document.body.innerHTML instead of location.href, which does make sense, as the content to be processed is then the page the user is really on. However, I am not too sure how to implement that; so far I get a few errors from the file_get_contents function in the PHP script.
I thought of having PHP send back the filtered keywords $search and $replace, after doing all the required regex on the content, in a JS snippet that it prints / echoes to the calling page, placing that on the page itself dynamically like this:


<script type="text/javascript">
onload = function()
{
document.body.innerHTML = document.body.innerHTML.replace(/WIDGET/gi, 'REPLACED');
}
</script>

The above snippet works when placed on test.php or test.html (after adding the keyword WIDGET to the page); however, I needed to adjust it to something like this:

....document.body.innerHTML.replace(/<?php print $search; ?>/gi, <?php print $replace; ?>);

OR ....document.body.innerHTML.replace(/<?php echo $search; ?>/gi, <?php echo $replace; ?>);

but this did not do it either. I even used document.write on the above, preceded with JS tags.
I can go for an all-out JS solution, but that means I have to rewrite everything in JavaScript, something which I am shaky about and would get lost in when the database has to be queried and interaction with other scripts is needed, let alone having to adjust my Perl and PHP knowledge to the way JS code is constructed.

astupidname

5:27 pm on Jan 18, 2010 (gmt 0)

10+ Year Member



Fotiman brings up excellent points, which were pretty much after-thoughts I was having also after my previous response (was being somewhat dense, I suppose). Walking the DOM and gathering text (only text) from text nodes and storing references to where the text came from would be the best option. The response handler would then be able to access the references and place the newly altered text (received back from the php) accordingly.

document.body.innerHTML=xmlhttp.responseText; instead of document.getElementById("theID").innerHTML=xmlhttp.responseText;

Were you actually using document.getElementById("theID") before? Where was the id of the proper element coming from? You could stick with working from a particular id'd element, would not actually have to be document.body. This could be a possible configuration option to allow the sites to determine whether the replacement is page-wide or limited to particular element/s?
[added]just saw you had this: <body id="theID", but I would still question whether the replacement need be page-wide or allow users to limit to particular element/s.[/added]
Just a few other side-notes at this moment: rather than requiring users to place an onload event attribute on the body tag, your script which gets linked in to the page could provide its own onload event handling, utilizing addEventListener or attachEvent depending on which is available (browser dependent), in order to prevent conflicts with other scripts (unknown to you) requiring load events. And it would be best if your script were set up as one single global object with multiple methods and properties as needed, rather than a number of global functions/variables. This is intended to keep the global space "clean". For example, something like:
//lodfunctions.js file
var scriptOriginSiteName_lodfunctions = {
xmlhttp:null,
GetXmlHttpObject:function () {
if (window.XMLHttpRequest){ //IE7+, Firefox, Chrome, Opera, Safari
return new XMLHttpRequest();
} else if (window.ActiveXObject){ //code for IE6, IE5
return new ActiveXObject("Microsoft.XMLHTTP");
}
return null;
},
stateChanged:function () { //note the references to 'this.xmlhttp', as xmlhttp is now a property of this lodfunctions object
if (this.xmlhttp.readyState==4 && this.xmlhttp.status == 200){ //note should be checking for status == 200 o.k.
document.getElementById("theID").innerHTML = this.xmlhttp.responseText;
}
},
rePlace:function () {
var O = this; //we'll need a reference to the 'this' object for inside the onreadystatechange function
O.xmlhttp = O.GetXmlHttpObject();
if (O.xmlhttp==null){
alert ("Your browser does not support AJAX!");
return;
}
var page = escape(location.href);
var dest = "?url=" + page;
var url="http://localhost/ads/replace.php" +dest;
O.xmlhttp.onreadystatechange = function () { O.stateChanged(); };
O.xmlhttp.open("GET",url,true);
O.xmlhttp.send(null);
},
init:function () {
var O = this,
W = window,
f = function () { O.replace(); };
if (W.attachEvent) {
W.attachEvent("onload", f);
} else if (W.addEventListener) {
W.addEventListener("load", f, false );
}
}
};

scriptOriginSiteName_lodfunctions.init();
//end of lodfunctions.js


Note also that if you were to allow users to configure which elements have the replace done on them, there are a few ways you could do it. Simplest would be to have them give the particular element/s they want processed a particular class name attribute, such as class="lodFuncReplace" or something, and then you could grab elements by class name and process only their text nodes, their childNodes' text nodes and so on... Or you could have users call the init function and pass elements in to it, which could then be set as the elements to process. Just a few extra thoughts I had, but note that if you did allow such configuration, you could have the program default to starting from the document.body's childNodes unless the user has utilized the configuration option.
Sorry to draw you away from the real problem at the moment, but these were just things which I would possibly take in to consideration.
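
A rough sketch of the class-name option, assuming two hypothetical helpers, `hasClass` and `pickRoots` (neither is part of lodfunctions.js). In a browser the element list could come from `document.getElementsByTagName('*')` and the fallback would be `document.body`; plain objects stand in for elements here so the sketch runs anywhere.

```javascript
// Hypothetical helper: true when the element's class attribute lists `name`.
function hasClass(el, name) {
  var padded = ' ' + String(el.className || '').replace(/\s+/g, ' ') + ' ';
  return padded.indexOf(' ' + name + ' ') !== -1;
}

// Hypothetical helper: return the elements to process. Those marked with
// `name` win; if nothing is marked, fall back to the given root.
function pickRoots(elements, name, fallback) {
  var roots = [];
  for (var i = 0; i < elements.length; i++) {
    if (hasClass(elements[i], name)) {
      roots.push(elements[i]);
    }
  }
  return roots.length ? roots : [fallback];
}

// Stand-in elements (assumption: only id and className matter here).
var pageElements = [
  { id: 'nav', className: 'menu' },
  { id: 'story', className: 'article lodFuncReplace' }
];
var pageBody = { id: 'body', className: '' };

var marked = pickRoots(pageElements, 'lodFuncReplace', pageBody);
var unmarked = pickRoots(pageElements, 'somethingElse', pageBody);
console.log(marked[0].id, unmarked[0].id); // story body
```

One thing this sketch does not handle: if a marked element sits inside another marked element, its text would be visited twice.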

Fotiman

5:35 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




Yes, I've done a lot of filtering work using a lot of Perl regex, well in the php scripts to only replace the keywords or phrases nothing else and ignores any tags, script calls etc.

It doesn't matter. You're replacing all of the body innerHTML, thereby destroying any pre-existing event handlers. My point is that you can not replace the body.innerHTML.

Fotiman

5:43 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




all works OK except on a page with POST form data with submit button to preview the page,

No, it will fail on any form processing page that uses POST (which does proper validation, etc.).
For example, suppose I have a form:
<form action="processForm.php" method="post">

And suppose processForm.php does something like this:


if(!isset($_POST['city'])) {
// display error message and/or redirect back to form
}
else {
// stay at this URL and output some HTML
}

Then when your PHP page tries to access processForm.php, that page's processing might redirect back to the form page, or it might display an error message, neither of which will match what the end user sees who is already on that page. You simply can not do it this way... it will never work.

Fotiman

6:03 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Note, you can still leverage some of your existing PHP code, but rather than sending a whole page of HTML, only send smaller text-node contents. This will result in many small requests vs. one big request (or you could find some way to create a queue of items and send the queue as a single request instead, then get the results back as a collection/array as well and do the replacement that way).

Here's a helper function:


function isTextNode(n) {
return (n.nodeType === 3);
}

So you could walk through all of the nodes in the DOM and call isTextNode on each one to create your array of nodes:


var nodes = [];
// start loop
if (isTextNode(n)) {
nodes.push(n);
}
// end loop
// send nodes array to the PHP for processing
// get the results back in an array, then
// start loop
nodes[i].parentNode.replaceChild(results[i], nodes[i]);
// end loop
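
The queue idea above could be sketched roughly as follows. Everything here is an assumption for illustration: the helper names `buildPayload` and `applyResults`, the JSON array format, and `fakeServer`, which stands in for the PHP side. It also assumes a native `JSON` object, which older browsers lack. In a real page the payload would be POSTed to the PHP script and the response text handed to `applyResults`.

```javascript
// Hypothetical helper: turn the collected text nodes into one request body.
function buildPayload(nodes) {
  var texts = [];
  for (var i = 0; i < nodes.length; i++) {
    texts.push(nodes[i].data);
  }
  return JSON.stringify(texts);
}

// Hypothetical helper: walk the results and call swap(node, html) for every
// node whose text came back changed. In a browser, swap could build a
// fragment from html and call node.parentNode.replaceChild(fragment, node).
function applyResults(nodes, resultsJson, swap) {
  var results = JSON.parse(resultsJson), changed = 0;
  for (var i = 0; i < nodes.length && i < results.length; i++) {
    if (results[i] !== nodes[i].data) {
      swap(nodes[i], results[i]);
      changed++;
    }
  }
  return changed;
}

// Stand-in for the server (assumption): links the keyword "widget".
function fakeServer(payload) {
  var texts = JSON.parse(payload);
  for (var i = 0; i < texts.length; i++) {
    texts[i] = texts[i].replace(/widget/g, '<a href="#partner">widget</a>');
  }
  return JSON.stringify(texts);
}

var batchNodes = [{ data: 'buy a widget' }, { data: 'no keyword here' }];
var swapped = [];
var changedCount = applyResults(
  batchNodes,
  fakeServer(buildPayload(batchNodes)),
  function (node, html) { swapped.push(html); }
);
console.log(changedCount, swapped[0]);
```

Matching results to nodes by array index keeps the round trip to a single request; nodes whose text comes back unchanged are never touched.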

dusky

7:22 pm on Jan 18, 2010 (gmt 0)

10+ Year Member



astupidname, thanks a lot for the example. I tried it, and after a few errors saying "O.replace() is not a function" I realized it was a typo, so I changed

f = function () { O.replace(); };
TO:
f = function () { O.rePlace(); }; // capital P instead of p

Looks much neater, thanks, however, I am back where I started, when a form is posted for preview, the page just empties and refreshes to the original page, I guess it wouldn't work as Fotiman said on forms with POST action and it's not an error.

Fotiman explained it nicely with the php example:

Then when your PHP page tries to access processForm.php, that page's processing might redirect back to the form page, or it might display an error message, neither of which will match what the end user sees who is already on that page. You simply can not do it this way... it will never work.

Fotiman, I'd like to have a go at implementing the isTextNode function in astupidname's example, even if that needs to be modified, and see if that resolves the problem. I am not too sure where to drop those functions in the lodfunctions.js file.

astupidname:

Note also, that if you were to allow users to configure what elements are to be having the replace done on them, there's a few ways you could do it. Simplest would be to have them give the particular element/s they want processed a particular style class name attribute, such as 'class="lodFuncReplace" or something

Yes astupidname, I do that in php filter functions I wrote, example, if they don't want a particular block of text or indeed a whole page, they'll put something like:

<span name="NoReplace">.....</span>

And everything in between the above tags is left alone. I also have other optional configuration for users, such as leaving out anything between H tags or bold etc.

dusky

8:15 am on Mar 17, 2010 (gmt 0)

10+ Year Member



Fotiman, I decided to walk the DOM and used James Padolsey's function:

<code>function findAndReplace(searchText, replacement, searchNode) {
if (!searchText || typeof replacement === 'undefined') {
// Throw error here if you want...
return;
}
var regex = typeof searchText === 'string' ?
new RegExp(searchText, 'g') : searchText,
childNodes = (searchNode || document.body).childNodes,
cnLength = childNodes.length,
excludes = 'html,head,style,title,link,meta,script,object,iframe';
while (cnLength--) {
var currentNode = childNodes[cnLength];
if (currentNode.nodeType === 1 &&
(excludes + ',').indexOf(currentNode.nodeName.toLowerCase() + ',') === -1) {
arguments.callee(searchText, replacement, currentNode);
}
if (currentNode.nodeType !== 3 || !regex.test(currentNode.data) ) {
continue;
}
var parent = currentNode.parentNode,
frag = (function(){
var html = currentNode.data.replace(regex, replacement),
wrap = document.createElement('div'),
frag = document.createDocumentFragment();
wrap.innerHTML = html;
while (wrap.firstChild) {
frag.appendChild(wrap.firstChild);
}
return frag;
})();
parent.insertBefore(frag, currentNode);
parent.removeChild(currentNode);
}
}
</code>

After some tweaking to get it to replace the keywords only once, it does that BUT, it does it only in each block of text instead of the whole document.

For example, if I have the whole page with the following block of text:

The little brown dog jumped over the red fox, then the the grey dog chased after the two and each of the brown dog and the red fox ran away from the grey dog....
And suppose my keywords are: brown dog, red fox, grey dog.

When I do:

<code>
function doreplace(){
// Output from the PHP XML list or database etc., as arrays....
var oldword = ....; // brown dog for example
var newword = ....; // highlight or replace the above with new formatting (hyperlink it or place between tags etc)
findAndReplace(oldword,' <span class="theword">'+newword+'</span> ');
}
</code>

The function does the replacing OK and only replaces the keywords once, BUT when I try and test it further and duplicate the above paragraph twice in the page, ie:

The little brown dog jumped over the red fox, then the the grey dog chased after the two and each of the brown dog and the red fox ran away from the grey dog....<br />The little brown dog jumped over the red fox, then the the grey dog chased after the two and each of the brown dog and the red fox ran away from the grey dog....

Instead of reading the whole page / document and replacing only once in it, the function replaces once in each paragraph, so if the paragraph is duplicated 10 times the keywords are replaced 10 times. As you can see, there is a <br /> tag separating the paragraphs, and I suspect I need to render all the paragraphs into one single text node. I tried the ifs below, but that did not do it:
<code>
//if (currentNode.nodeType !== 3 || !currentNode.nodeValue.match(/[^ \f\n\r\t\v]/) ) {
//if (currentNode.nodeType !== 3 && currentNode.nodeValue !="\n" ) {
//var filteredText = document.body.innerHTML.replace(new RegExp( "[<br \/>|<br>]", "g" ), "");
//var filteredText = document.body.innerHTML.replace(new RegExp( "[\\s]", "g" ), "");
</code>
My JavaScript is very basic, unlike my Perl and PHP which are better. I'd appreciate your help or direction, either from you or from other competent members here; I am scratching my head quite a lot over this.
I emailed James, no reply yet. I did manage to fix the problem his function had when dealing with text such as:
We ate mango, pine<strong>apple</strong> and passion fruit
I have it deal with that properly.

Fotiman

5:13 pm on Mar 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, your example actually worked for me in non-IE browsers. It turned out that the regex.test method was returning false the second time it ran in IE (I'm not sure why)!

In any case, after several iterations, I came up with something that works. It uses the string.match() method instead of regex.test(). Other notable changes... it works from top to bottom (going to nextSibling) instead of bottom to top as you had it before... my thought is that this would be a better user experience (the first items in the document will be replaced first, vs. being replaced last).
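
One possible explanation for the `regex.test` behaviour, offered as an assumption since nothing in this thread confirms it: a RegExp created with the `g` flag remembers where its last match ended in `lastIndex`, and `test()` resumes from there, so reusing one regex object across several text nodes can miss a match. `String.prototype.match` with a global regex starts from index 0 on each call, which would fit the fix described above. A small sketch:

```javascript
var re = new RegExp('brown dog', 'g');
var text = 'The little brown dog';

var first = re.test(text);            // true; lastIndex now points past the match
var indexAfterFirst = re.lastIndex;
var second = re.test(text);           // false; the search resumed from lastIndex
var indexAfterSecond = re.lastIndex;  // a failed test() resets lastIndex to 0

// match() with a global regex begins at index 0 on every call.
var viaMatch1 = text.match(re) !== null;
var viaMatch2 = text.match(re) !== null;

console.log(first, second, viaMatch1, viaMatch2); // true false true true
```

Dropping the `g` flag has a similar effect for `test()`, since a non-global regex ignores `lastIndex`.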

Here's the working example:

<html>
<head>
<title>Find and Replace Test</title>
<style type="text/css">
.theword { font-weight: bold; }
.brownDog { background-color: brown; }
.redFox { background-color: red; }
.greyDog { background-color: #CCC; }
</style>
</head>
<body>
<p>
The little brown dog jumped over the red fox, then
the the grey dog chased after the two and each of
the brown dog and the red fox ran away from the
grey dog....<br />
The little brown dog jumped over the red fox, then
the the grey dog chased after the two and each of
the brown dog and the red fox ran away from the
grey dog....
</p>
<script type="text/javascript">
function findAndReplace(searchText, replacement, s) {
if (!searchText || typeof replacement === 'undefined') {
// Throw error here if you want...
return;
}
var parent,
frag,
oldNode,
searchNode = s || document.body,
regex = (((typeof searchText) === 'string') ? new RegExp(searchText, 'g') : searchText),
currentNode = searchNode.firstChild,
excludes = 'html,head,style,title,link,meta,script,object,iframe';
while (currentNode != null) {
if (currentNode.nodeType === 1 && (excludes + ',').indexOf(currentNode.nodeName.toLowerCase() + ',') === -1) {
// Element node that's not excluded
findAndReplace(searchText, replacement, currentNode);
}
if (currentNode.nodeType !== 3) {
// Not a text node, so move on to the next
currentNode = currentNode.nextSibling;
continue;
}
// Still here so we have a text node
if (!currentNode.data.match(regex)) {
// Text node doesn't contain a match, so move on to the next
currentNode = currentNode.nextSibling;
continue;
}
// Looks like we can do the replacement
parent = currentNode.parentNode;
frag = (function () {
var html = currentNode.data.replace(regex, replacement),
wrap = document.createElement('div'),
frag = document.createDocumentFragment();
wrap.innerHTML = html;
while (wrap.firstChild) {
frag.appendChild(wrap.firstChild);
}
return frag;
})();
oldNode = currentNode;
currentNode = currentNode.nextSibling;
parent.replaceChild(frag, oldNode);
}
}
function doreplace() {
var i, n, newword, oldword = ['brown dog', 'red fox', 'grey dog'];
for (i = 0, n = oldword.length; i < n; i++) {
newword = ' <span class="theword">' + oldword[i] + '<\/span> ';
switch (oldword[i]) {
case 'brown dog':
newword = ' <span class="brownDog"> ' + newword + ' <\/span> ';
break;
case 'red fox':
newword = ' <span class="redFox"> ' + newword + ' <\/span> ';
break;
case 'grey dog':
newword = ' <span class="greyDog"> ' + newword + ' <\/span> ';
break;
}
findAndReplace(oldword[i], newword);
}
}
window.onload = function () {
doreplace();
};
</script>
</body>
</html>

dusky

2:28 am on Mar 18, 2010 (gmt 0)

10+ Year Member



Thanks for your prompt reply. The problem persists, I am afraid. It replaces each keyword as needed (only the first occurrence, that is) when I take away the global flag g, changing new RegExp(searchText, 'g') to new RegExp(searchText), but it does that in every paragraph, and that's the problem I was trying to solve all along.

Below when on the browser (FF, IE, Opera etc), brown dog, red fox and grey dog are highlighted twice in the Whole document (only once in each paragraph) because there are two identical paragraphs, though they don't have to be identical, so if you repeat the paragraph 10 times, you'll have brown dog highlighted 10 times and I only want each keyword highlighted / replaced once in the whole document / page.

The little brown dog jumped over the red fox , then the the grey dog chased after the two and each of the brown dog and the red fox ran away from the grey dog....<br /> <--- this is problem, take it away and problem solved!
The little brown dog jumped over the red fox , then the the grey dog chased after the two and each of the brown dog and the red fox ran away from the grey dog....
See, red fox for example is highlighted twice, in the first paragraph, then in the second, try and see.

I know what's the problem, it is the new line / carriage returns or <br />, if you take them away, problem solved, but of course users pages will have new lines, carriage returns separating paragraphs and blocks of text etc, so when 10 occurrences of the same word are in a document which is likely to be split over 5/6 paragraphs or more, only one should be highlighted on the whole page. If a piece of news or a forum thread, the word widget debated would be repeated by people in almost every reply, I want widget to be highlighted only once, in the first thread or wherever as long as it is only once in the whole document.

However, I find your adaptation more suitable and innovative, thanks

Fotiman

3:04 am on Mar 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, I think I misunderstood what you were trying to do. I thought you wanted them ALL replaced, but you only want 1 replacement for the whole page... got it.

In that case, you can probably just add a global flag that indicates whether a match has been found, and if so then don't do any more replacements. I don't have time at the moment to code it out, but it should be very simple... maybe slightly complex because you need to make sure you flag each word/phrase that you are trying to replace individually, but still not that complex. I'll try to whip something together.

dusky

4:38 am on Mar 18, 2010 (gmt 0)

10+ Year Member



Yes, you got me, thanks. It will be great if you whip something together and solve my problem, and no doubt that of others, and I promise to help more here and respond to any requests for help in my sticky. In addition, I'll renew my WebmasterWorld supporters forum subscription as soon as I get round to it :), it's well overdue. Let's face it, WebmasterWorld is a great place and that's the least we can do!

Fotiman

1:23 pm on Mar 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, give this a shot. It should now only replace the first instance it finds for any given replacement string.


<html>
<head>
<title>Find and Replace Test</title>
<style type="text/css">
.theword { font-weight: bold; }
.brownDog { background-color: brown; }
.redFox { background-color: red; }
.greyDog { background-color: #CCC; }
</style>
</head>
<body>
<p>
The little brown dog jumped over the red fox, then
the the grey dog chased after the two and each of
the brown dog and the red fox ran away from the
grey dog....<br />The little brown dog jumped over
the red fox, then the the grey dog chased after the
two and each of the brown dog and the red fox ran
away from the grey dog....
</p>
<p>
The little brown dog jumped over the red fox, then
the the grey dog chased after the two and each of
the brown dog and the red fox ran away from the
grey dog....<br />The little brown dog jumped over
the red fox, then the the grey dog chased after the
two and each of the brown dog and the red fox ran
away from the grey dog....
</p>
<script type="text/javascript">
var replaced = {}; // Keeps track of what's been replaced
function findAndReplace(searchText, replacement, s) {
if (!searchText || typeof replacement === 'undefined') {
// Throw error here if you want...
return;
}
var parent,
frag,
oldNode,
searchNode = s || document.body,
regex = (((typeof searchText) === 'string') ? new RegExp(searchText) : searchText),
currentNode = searchNode.firstChild,
excludes = 'html,head,style,title,link,meta,script,object,iframe';

while (currentNode != null) {
if (currentNode.nodeType === 1 && (excludes + ',').indexOf(currentNode.nodeName.toLowerCase() + ',') === -1) {
// Element node that's not excluded
findAndReplace(searchText, replacement, currentNode);
}
if (currentNode.nodeType !== 3) {
// Not a text node, so move on to the next
currentNode = currentNode.nextSibling;
continue;
}
// Still here so we have a text node
if (!currentNode.data.match(regex)) {
// Text node doesn't contain a match, so move on to the next
currentNode = currentNode.nextSibling;
continue;
}
if (replaced.hasOwnProperty(searchText)) {
// Already did this replacement
return;
}
// Looks like we can do the replacement
parent = currentNode.parentNode;
frag = (function () {
var html = currentNode.data.replace(regex, replacement),
wrap = document.createElement('div'),
frag = document.createDocumentFragment();
wrap.innerHTML = html;
while (wrap.firstChild) {
frag.appendChild(wrap.firstChild);
}
return frag;
})();
oldNode = currentNode;
currentNode = currentNode.nextSibling;
parent.replaceChild(frag, oldNode);
replaced[searchText] = true;
}
}
function doreplace() {
var i, n, newword, oldword = ['brown dog', 'red fox', 'grey dog'];
for (i = 0, n = oldword.length; i < n; i++) {
newword = ' <span class="theword">' + oldword[i] + '<\/span> ';
switch (oldword[i]) {
case 'brown dog':
newword = ' <span class="brownDog"> ' + newword + ' <\/span> ';
break;
case 'red fox':
newword = ' <span class="redFox"> ' + newword + ' <\/span> ';
break;
case 'grey dog':
newword = ' <span class="greyDog"> ' + newword + ' <\/span> ';
break;
}
findAndReplace(oldword[i], newword);
}
}
window.onload = function () {
doreplace();
};
</script>
</body>
</html>
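
A side note on the listing above: it builds its pattern with `new RegExp(searchText)`, so a keyword containing regex metacharacters (for example `a.b` or `C++`) would be read as a pattern rather than as literal text. A minimal sketch of escaping keywords first; `escapeRegExp` is a hypothetical helper, not part of the code above, and the sample keywords are made up.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter so the
// keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var keyword = 'a.b';
var loose = new RegExp(keyword);                 // "." matches any character
var strict = new RegExp(escapeRegExp(keyword));  // "." matches only a dot

var looseHitsAxb = loose.test('axb');    // true: unintended match
var strictHitsAxb = strict.test('axb');  // false
var strictHitsDot = strict.test('a.b');  // true

// new RegExp('C++') would throw a SyntaxError; the escaped form is fine.
var cpp = new RegExp(escapeRegExp('C++'));
var cppHit = cpp.test('I write C++ daily');

console.log(looseHitsAxb, strictHitsAxb, strictHitsDot, cppHit); // true false true true
```

With this in place, the call would become something like `findAndReplace(escapeRegExp(oldword[i]), newword)`, assuming the keywords are meant as literal phrases.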

dusky

7:17 am on Mar 19, 2010 (gmt 0)

10+ Year Member



Thanks, on initial tests it's a good solution. Thanks for your time, you're a star!