writing perl script to edit all html files in a directory

would something like this work?

         

jeremy goodrich

10:48 pm on Apr 14, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All of the files are in a sub directory, so I need the script to read all the files that are there, and go into each one, editing the colors for the site.

Here is my code so far:

#!/usr/bin/perl
# directory.plx
use warnings;
use strict;

print "Contents of the current directory:\n";
opendir DH, "." or die "Couldn't open the current directory: $!";
while ($_ = readdir(DH)) {
next if $_ eq "." or $_ eq "..";
print $_, " " x (30-length($_));
print "d" if -d $_;# this part of the script
print "r" if -r _;# will read and print a list
print "w" if -w _;# of all the files in the dir
print "x" if -x _;
print "o" if -o _;
print "\t";
print -s _ if -r _ and -f _;
print "\n";
}

$_ = \@files

# loop and sub to open and replace files

foreach $value (@files){

OPEN $value, or die "Couldn't open the current file: $!";

while $value (@files) {

$search =~ s/body bgcolor\=\"\#666666/body bgcolor\=\"\#918264/

#body bgcolor="#666666"
#
#change to body bgcolor="#918264"

$search1 =~ s/font color\=\"\#FFFFFF/font color\=\"\#000024/

#font color="#FFFFFF" change to font color="#000024"
#<td height="..." width="..." bordercolor="#000024"> change to
#<td height="..." width="..." bgcolor="#FFFFFF">

#change <font color="#CCCCCC"> to
#on new as well change font color="#CCCCCC" to <font #color="#FFFFFF">

}

I know the last parts are commented out. I'm still new at this, and was wondering two things:

Anybody want to point me in the right direction, tell me if I'm on the wrong track, suggest better ways to write it, etc.

Or two, if somebody knows of a script like this already, feel free to post or sticky mail me the link. I'm off to hotscripts right now to see if I can dig something up.
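For anyone following along, here is a minimal sketch of what the post above seems to be aiming for: list the .html files in one directory with opendir/readdir, read each file, apply the two colour substitutions from the post, and write the file back only if something changed. The helper names recolor and recolor_dir are made up for illustration, and the sketch assumes the files are small enough to read into memory in one go. Try it on a copy of the site first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the colour changes from the post to one string.
sub recolor {
    my ($html) = @_;
    $html =~ s/body bgcolor="#666666/body bgcolor="#918264/g;
    $html =~ s/font color="#FFFFFF/font color="#000024/g;
    return $html;
}

# Hypothetical helper: rewrite every .html file directly inside $dir
# (no recursion). Returns the number of files whose contents changed.
sub recolor_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Couldn't open $dir: $!";
    my @files = grep { /\.html$/ && -f "$dir/$_" } readdir $dh;
    closedir $dh;

    my $changed = 0;
    for my $name (@files) {
        my $path = "$dir/$name";
        open my $in, '<', $path or die "Couldn't read $path: $!";
        my $old = do { local $/; <$in> };   # slurp the whole file
        close $in;

        my $new = recolor($old);
        next if $new eq $old;               # nothing to do for this file

        open my $out, '>', $path or die "Couldn't write $path: $!";
        print {$out} $new;
        close $out or die "Couldn't close $path: $!";
        $changed++;
    }
    return $changed;
}

# Usage: perl recolor.pl some/directory
print recolor_dir($ARGV[0]), " file(s) changed\n" if @ARGV;
```

Keeping the substitutions in their own sub makes it easy to add the remaining colour changes later, as long as they are applied in an order where one replacement does not feed into the next.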

theperlyking

12:00 am on Apr 15, 2001 (gmt 0)

10+ Year Member



Jeremy,
Have you seen my "orphaned image" script a bit further down this forum? It loops through HTML files in a directory, and it might have the odd bit you may want to pinch and improve upon for this script.

jeremy goodrich

2:43 am on Apr 15, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the tip. If I actually manage to mush some things together of my own doing, I'll post the results. Who knows, it may help some other blokes out there. ;)

jeremy goodrich

11:58 pm on Apr 15, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I found something that will work. Searching for "search and replace in perl" in Google gave me some scripts that were exactly what I needed. If anybody else needs a script to scan a directory, edit the files, and then save them, I found one handy script. Since it's not mine, though, I don't think I should post it.

Stickymail me for the link if anybody is interested.

Brett_Tabke

8:05 am on Apr 16, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Post it, it's ok. I'm sure others would like to have a look too (me)...

jeremy goodrich

2:59 pm on Apr 16, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay, this is the script I found. I didn't write it, so don't come to me for technical assistance :)

Oh, and since I'm still learning, I'll say this for the other wannabe programmers like myself: this is designed for a *nix system, and seems to work on Linux. I'm not sure about other systems.

#!/usr/bin/perl
#
# Kevin Kadow (kadokev@msg.net) [msg.net...]
# Can be downloaded from [msg.net...]
# For 1stomni, free for all to use as long as the comments are kept intact!
#
# Search and replace strings in html files
#
require 'find.pl';

#
# Change these variables as needed.
#
$dir="/home/path/etc";

$search="what you are going to match";

$replace="replace with this";

warn <<"EOF";
Starting search/replace script $0
Search all files below $dir
Looking for string $search
Replace the string with $replace

EOF

#
# Do the actual work
#
&find($dir);
warn "Scanned $files making $changes changes.\n";
exit(0);

################ SUBROUTINES FOLLOW #################################

#wanted
#
# This routine is called by 'find', once for each file, directory, etc.
#
sub wanted {

#Skip everything except files named .html or .shtml
return unless( -f $_ && m/\.[s]*html$/);

&update("$dir/$_");
}

#update
#
# Do the search-and-replace on a given file
#
sub update {
local($file)=@_;

local(@contents);

# Open the file for reading and writing (the + does that)
unless(open(FILE,"+<$file")) {
warn "Cannot open $file for updating, Error: $!\n";
return;
}

warn "Updating $file\n";
$files++;

# Read the contents into an array, replacing the string as we go
while ($_=<FILE>) {
$changes += s/$search/$replace/ig;
push(@contents,$_);
}

# Go to the beginning of the file
seek(FILE,0,0);

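# Update the contents with the data from our array
print FILE (@contents);

# Cut off any leftover bytes in case the new contents are shorter than the old
truncate(FILE, tell(FILE));
close(FILE);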
}

###EOF###
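The script above depends on find.pl, a Perl 4-era library that, as far as I know, newer Perl releases no longer ship. Below is a rough sketch of the same idea using the File::Find module instead. It is not Kevin Kadow's code: the helper names replace_literal and replace_in_tree are invented for illustration, and it makes one deliberate change in behaviour by wrapping the search string in \Q...\E so that it is matched as plain text rather than as a regex.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: literal, case-insensitive search/replace on one string.
# \Q...\E quotes regex metacharacters, so $search is treated as plain text.
sub replace_literal {
    my ($text, $search, $replace) = @_;
    my $count = ($text =~ s/\Q$search\E/$replace/ig);
    return ($text, $count || 0);
}

# Hypothetical helper: walk $dir and update every .html / .shtml file.
# Returns (files scanned, substitutions made).
sub replace_in_tree {
    my ($dir, $search, $replace) = @_;
    my ($files, $changes) = (0, 0);
    find(sub {
        return unless -f $_ && /\.s?html$/;
        my $path = $File::Find::name;       # full path, used for messages
        # find() has chdir'ed into the file's directory, so open by $_.
        open my $fh, '+<', $_
            or do { warn "Cannot open $path: $!\n"; return };
        my $old = do { local $/; <$fh> };   # slurp the whole file
        my ($new, $n) = replace_literal($old, $search, $replace);
        if ($n) {
            seek($fh, 0, 0);
            print {$fh} $new;
            truncate($fh, tell($fh));       # drop leftovers if the file shrank
        }
        close $fh;
        $files++;
        $changes += $n;
    }, $dir);
    return ($files, $changes);
}

# Usage: perl replace.pl /some/dir "old text" "new text"
if (@ARGV == 3) {
    my ($files, $changes) = replace_in_tree(@ARGV);
    warn "Scanned $files file(s) making $changes change(s).\n";
}
```

Like the original, this edits files in place with no backup, so it is worth running against a copy of the directory first.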

Edited by: sugarkane

msgraph

3:05 pm on Apr 16, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I posted this in another forum, but I'm not sure if you came across it. It's a pretty long script, but maybe it can help you set up yours.

rfindrep [unimelb.edu.au]

jeremy goodrich

3:28 pm on Apr 16, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just noticed, forgot to turn off the graphic smilies. Oops :)

Thanks msgraph for the link. I like how that has all sorts of documentation, including links to some stuff on pattern matching and regular expressions. Exactly the kind of info I need, since I'm still trying to figure it all out.

If anyone needs me to repost the previous script I can, since the version that got posted is a little choppy from the happy faces!

sugarkane

4:01 pm on Apr 16, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> forgot to turn off the graphic smilies

fixed :)

You can fix this on your own posts by using 'owner edit' BTW ;)

littleman

2:30 am on Jul 27, 2001 (gmt 0)



A related post.. [webmasterworld.com]

Bolotomus

7:28 pm on Jul 28, 2001 (gmt 0)

10+ Year Member



Jeremy

Your Perl code (the top code) looks reasonable, but there is a tell-tale sign of a Perl-newbie in there...

$search1 =~ s/font color\=\"\#FFFFFF/font color\=\"\#000024/

Perl Neophytes are always paranoid about invoking some meta-character inside a regex that they don't know about, so they escape everything with a backslash.

E.g., you know that # invokes a comment, so you figure that you need to escape it. You know that double-quotes " are used for a zillion things, so it would seem reasonable to escape that too. You escaped = probably because you weren't sure, and better safe than sorry.

In fact, not a single backslash is necessary in that expression. And look how much easier it is to read:

$search1 =~ s/font color="#FFFFFF/font color="#000024/

How does Perl know that those #'s don't start a comment? That's where the Magic™ comes in :)
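A quick way to convince yourself is to run both versions against the same line and compare. The sketch below does that with a made-up sample line. It also shows the one exception I know of: under the /x modifier, an unescaped # inside the pattern really does start a comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input, invented for this demonstration.
my $line = '<font color="#FFFFFF">hello</font>';

# The original, heavily escaped substitution...
(my $escaped   = $line) =~ s/font color\=\"\#FFFFFF/font color\=\"\#000024/;
# ...and the same thing with no backslashes at all.
(my $unescaped = $line) =~ s/font color="#FFFFFF/font color="#000024/;

print "same result\n" if $escaped eq $unescaped;
print "$unescaped\n";

# Caveat: with /x, "#" begins a comment inside the pattern.
my $plain    = ('a' =~ /a#b/)  ? 1 : 0;   # 0: pattern is the literal text a#b
my $extended = ('a' =~ /a#b/x) ? 1 : 0;   # 1: pattern is just "a"
print "without /x: $plain, with /x: $extended\n";
```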

I'm telling you this because, like yourself, I used to "play it safe" and escape every other character. Later, you look at your old code, and say to yourself "Ewwww!"

So what DO you need to escape? Here's a short list:
$ @ ( ) [ ] \ ? . * + { } ^ |
...plus your separator character (usually a /)--but it's always better to use a separator which you won't need to escape.
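Two small illustrations of that advice, using made-up sample text. The first pair shows how picking braces as the separator removes the need to escape slashes. The last substitution uses \Q...\E, which escapes every metacharacter in an interpolated string for you, so you don't have to remember the list when the search text is meant literally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input, invented for this demonstration.
my $html = '<a href="/images/old/logo.gif">price: $5.00 (approx.)</a>';

# With / as the separator, every slash in the pattern needs a backslash...
(my $slashes = $html) =~ s/\/images\/old\//\/images\/new\//;
# ...with braces as the separator, none of them do.
(my $braces  = $html) =~ s{/images/old/}{/images/new/};

# \Q...\E escapes the $, the dots and the parentheses in the literal text.
my $literal = '$5.00 (approx.)';
(my $quoted = $html) =~ s/\Q$literal\E/\$4.50/;

print "$slashes\n$braces\n$quoted\n";
```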

Take care
The Bolot

PS --- I've got a home-spun piece of perl that does what you want. I call it "subst."

neil laurance

8:51 am on Aug 1, 2001 (gmt 0)



If you're using UNIX, you can do all of this code in one line from the shell:

find . -type f | xargs perl -i.old -p -e 's/foo/bar/g'

Cheers, NEIL
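If you would rather keep that in a script than on the command line, the -i and -p switches can be imitated from inside Perl by setting $^I and @ARGV and reading with the <> operator. This is only a sketch: edit_in_place is a made-up name, and the foo/bar substitution is the same placeholder as in the one-liner.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: edit the given files in place, keeping FILE.old
# backups, much as "perl -i.old -p -e 's/foo/bar/g'" does.
sub edit_in_place {
    my @files = @_;
    return unless @files;        # an empty @ARGV would make <> read STDIN
    local ($^I, @ARGV) = ('.old', @files);
    while (<>) {
        s/foo/bar/g;
        print;                   # goes to the file currently being rewritten
    }
    return;
}

# Usage: perl subst.pl page1.html page2.html
edit_in_place(@ARGV);
```

Note that the one-liner's "find . -type f" picks up every file, including the .old backups from an earlier run, so passing only the files you mean to change is the safer habit.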