Tricky Global Search and Replace

         

Ride45

7:24 pm on Mar 26, 2004 (gmt 0)

10+ Year Member



I have 200+ html files that need a global search and replace. However, I need some help.
I think it requires regular expressions for the replace, because the only thing common among the html that needs to be replaced is that the html falls inside of <li>the links are here in a list</li>

and the only thing that needs to change in the links is the dashes ("-"), which should become underscores ("_").

Example:

<li><a href="http://www.widgets.com/my-cool-widget.html">My cool widget</a></li>
<li><a href="http://www.widgets.com/my-cool-widget2.html">Another cool widget</a></li>

should become:

<li><a href="http://www.widgets.com/my_cool_widget.html">My cool widget</a></li>
<li><a href="http://www.widgets.com/my_cool_widget2.html">Another cool widget</a></li>

...the only difference across the 200 pages is that the dash is changed to an underscore, and only where the <li></li> is present.

Thanks in advance for your help

OrlandoTodd

9:09 pm on Mar 26, 2004 (gmt 0)

10+ Year Member



If it's the same on each page, do it on one page, then find and replace the entire section. You could also try cutting and pasting that section into Notepad, then find & replace that bit.

As for a program that allows for condition statements like within the <li>, I don't know of any. Maybe others will know.

Ride45

9:18 pm on Mar 26, 2004 (gmt 0)

10+ Year Member



I hear ya. It's different though. The list of links is different on every page, both in name and in the number of links. It's not the name of the link that needs to be replaced, it's the dash in between words which needs to be changed to an underscore. So really it-could-be-anything, contained within <li> and </li> where the change results in

it_could_be_anything

Thanks!

ronin

2:04 am on Mar 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This may be a dumb question, so apologies if it is.
Is there any possibility that there aren't any dashes elsewhere in the document?

Ride45

4:06 am on Mar 27, 2004 (gmt 0)

10+ Year Member



I wish it were that simple..

Actually, in the header there are only 2 and the footer 2 as well.

I am going to try the entire document first, and then go back and clean up the top 2 which are consistent and the 2 in the footer which are also consistent. I will let you know if this works.

thanks

tedster

4:55 am on Mar 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



a program that allows for condition statements

I would love a text editor that allowed conditional statements in a global search and replace over many documents.

I often have html editing issues just like this one, and usually have to resort to circumstantial luck (the pages just "happen" to have this or that characteristic) or some bit of goofy but clever trickiness I dream up in the shower.

DrDoc

5:13 am on Mar 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would love a text editor that allowed conditional statements in a global search and replace over many documents.

HomeSite does! You can do a search and replace using regular expressions... [google.com]

Ride45

7:11 am on Mar 27, 2004 (gmt 0)

10+ Year Member



HomeSite, Dreamweaver: there are tools that will do a regular expression search and replace, but it gets confusing for complex tasks, where the expressions involved are not so regular. I played with this some more tonight and still no luck. Instead I am writing a PHP script that will replace only the items that meet the specific conditions.

tedster

4:58 pm on Mar 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wow, I've been using Homesite all this time and never knew.

Now I just need to understand those Help files I've been ignoring - I'm not an ace with regular expressions, but now I am motivated.

RonPK

4:37 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I doubt that Ride45's problem can be solved with one search and replace, even with the use of regular expressions. Two things need to be done:
1. select the piece of text inside <li> and </li>
2. in the selection, replace - with _

If this were PHP, I would use preg_replace_callback(): a function that selects each piece of text that matches a pattern and feeds it to another function, the callback function, whose return value becomes the replacement.
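The two steps above can be sketched with a callback-based replace. This is only a rough illustration in Python, whose re.sub() accepts a function in much the same way preg_replace_callback() does in PHP; it is not the script Ride45 ended up writing. The names underscore_list_links, LI_PATTERN and HREF_PATTERN are made up for the example, and it assumes the dashes to change live in double-quoted href values inside each <li>. Note that it would also change a dash in the domain part of a link, if there were one.

```python
import re

# Step 1: select each <li>...</li> item (non-greedy, so items are handled one at a time).
LI_PATTERN = re.compile(r"<li\b[^>]*>.*?</li>", re.IGNORECASE | re.DOTALL)
# Within an item, pick out the double-quoted href value (an assumption about the markup).
HREF_PATTERN = re.compile(r'(href=")([^"]*)(")', re.IGNORECASE)


def _fix_href(match):
    # Step 2: replace - with _ in the link target only, leaving the link text alone.
    return match.group(1) + match.group(2).replace("-", "_") + match.group(3)


def _fix_li(match):
    # Callback for step 1: receives one whole <li>...</li> item and returns its replacement.
    return HREF_PATTERN.sub(_fix_href, match.group(0))


def underscore_list_links(html):
    """Return html with dashes turned into underscores in hrefs inside <li> items."""
    return LI_PATTERN.sub(_fix_li, html)


if __name__ == "__main__":
    sample = (
        '<h1>header-with-dash</h1>\n'
        '<li><a href="http://www.widgets.com/my-cool-widget.html">My cool widget</a></li>'
    )
    print(underscore_list_links(sample))
```

Dashes outside the list items, such as the ones in the header and footer, are never matched by the outer pattern, so they are left as they are.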

Ride45

4:49 pm on Mar 28, 2004 (gmt 0)

10+ Year Member



You are correct RonPK. It was a 2 part process that could not be solved with regular expressions after all. I thought initially it could be, but it required a script to be written to do it very similar to how you describe.
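For anyone landing here with the same problem, a script along those lines might look roughly like the sketch below. It is Python rather than the PHP actually used, and it is not the original script: fix_html and fix_directory are hypothetical names, and it assumes the files end in .html and that the links use double-quoted href values. Try it on a copy of the site first, since it rewrites files in place.

```python
import re
from pathlib import Path

# One <li>...</li> item at a time.
LI_PATTERN = re.compile(r"<li\b[^>]*>.*?</li>", re.IGNORECASE | re.DOTALL)
# The double-quoted href value inside an item.
HREF_PATTERN = re.compile(r'(href=")([^"]*)(")', re.IGNORECASE)


def fix_html(html):
    """Turn dashes into underscores in href values, but only inside <li> items."""
    def fix_href(m):
        return m.group(1) + m.group(2).replace("-", "_") + m.group(3)

    def fix_li(m):
        return HREF_PATTERN.sub(fix_href, m.group(0))

    return LI_PATTERN.sub(fix_li, html)


def fix_directory(root):
    """Rewrite every .html file under root in place; return the paths that changed."""
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        # Decoding as latin-1 maps each byte to one character, so the file's real
        # encoding and its line endings survive the round trip untouched.
        original = path.read_bytes().decode("latin-1")
        updated = fix_html(original)
        if updated != original:
            path.write_bytes(updated.encode("latin-1"))
            changed.append(path)
    return changed
```

Calling fix_directory("site_copy") (the folder name is just an example) returns the list of files it modified, which makes it easy to spot-check a few pages before uploading anything.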

Cheers.

g1smd

9:12 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I thought that we all agreed that hyphens-in-filenames was the right thing to do, as Google treats the words as separate words.

I thought that underscores were problematical on search engines, as well as the underscore seeming to disappear on underlined links.

Ride45

10:23 pm on Mar 28, 2004 (gmt 0)

10+ Year Member



That is true. The search engines interpret a dash as a space, while the underscore is treated as just another character, but there is no negative implication in using the underscore. If you heard that somewhere, it's a myth.
I have 200+ pages that were indexed by Google now and every one of them has underscores in the file name.
In fact, I started the discussion because I didn't want to lose the indexing of those pages and therefore have been trying to get my "updating html" to maintain the same file names.

g1smd

10:45 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google indexes any sort of file name, even ones with spaces, or other types of punctuation.

Some people reckon that using hyphens allows the words in the URL to count towards the keywords that a page ranks for, whereas underscores do not count in the same way.

I would avoid spaces or underscores.