The pages contain special character codes like &spades;. The <textarea> renders them as the actual character (the spade symbol), not the code for the character itself. That doesn't really bother me, and if I do a View Source then I see the data in the <textarea> is indeed listed as &spades;. But when I save the <textarea> back to the server, the &spades; has changed into a question mark.
I'd rather not test for &spades; specifically because I'm using a bunch of different character codes and it would be silly to test for dozens of them. I'm sure there's a more elegant solution. Any ideas?
Damn Safari.
To avoid such issues, you want to run an html-escaping function on the data before sending it to the browser. e.g.,
sub etags {
    my $text = shift;
    # Escape & first so the entities produced below are not escaped again.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}
# ...
print "<textarea>", etags($file_contents), "</textarea>";
Then things like "&spades;" should show up in the page source as "&amp;spades;", in the textarea as "&spades;" and get sent back to your script as "&spades;".
The problem with converting all &'s into &amp;'s is that my partner uses a different browser and if I do that then when he edits a file it will contain code like: <a href="http://example.com?&amp;value=1">link</a>.
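To see why the round trip is safe, here is a minimal sketch that escapes a string and then undoes the escaping the way a browser does when it parses the textarea contents. The escape_html function mirrors etags; browser_decode is a hypothetical stand-in for the browser and only handles the three entities involved here.

```perl
use strict;
use warnings;

# Escape the three characters that are special in HTML text.
# The ampersand is handled first so the other entities are not double-escaped.
sub escape_html {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical stand-in for the browser: decode the entities again.
# &amp; is decoded last so that "&amp;lt;" ends up as "&lt;", not "<".
sub browser_decode {
    my $text = shift;
    $text =~ s/&lt;/</g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&amp;/&/g;
    return $text;
}

my $file_contents = 'A spade: &spades; <b>bold</b>';
my $in_source     = escape_html($file_contents);   # what View Source shows
my $sent_back     = browser_decode($in_source);    # what the script receives

print "page source: $in_source\n";
print "sent back:   $sent_back\n";
```

The string that comes back is identical to the file contents, so nothing has to be converted on the way in.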
I guess I can do browser-sniffing, and only replace the ampersands if the browser is Safari. I've always just tried to avoid that, but I guess there's no way around it in this case.
Thanks, Pinterface. My solution to the <textarea> problem had been to replace <textarea> with <textareatag>, then before I saved the file I converted <textareatag> back to <textarea>. That part at least worked well.
Where did you get that from? I've never seen <textareatag> before. I don't think it's a valid tag...
The problem with converting all &'s into &amp;'s is that my partner uses a different browser and if I do that then when he edits a file it will contain code like: <a href="http://example.com?&amp;value=1">link</a>.
That IS a proper way to write links. Many people neglect it and just use & without escaping it, but it's not right.
I guess I can do browser-sniffing, and only replace the ampersands if the browser is Safari. I've always just tried to avoid that, but I guess there's no way around it in this case.
No need. Just escape everything and it will work fine. The reason for this is that <textarea> has a weird side effect that confuses everyone. If you put raw HTML into it, it displays fine, when really it shouldn't.
Think about it. What if the page you want to edit had a textarea in the code? You read the file and place it into the textarea on your editing screen. Then the closing textarea tag from the page you want to edit will close the editing textarea earlier than needed... That's why you need to escape all the tags, so that the browser does not treat them literally. All browsers un-escape the tags when they send the data to the server, so you shouldn't worry about that.
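A rough sketch of that early-close problem, for illustration only. The textarea_contents helper is hypothetical: it approximates the browser by taking everything up to the first closing textarea tag. The sample page is made up.

```perl
use strict;
use warnings;

# Same escaping as in the etags example above in this thread.
sub escape_html {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: what the browser treats as the contents of the
# editing textarea, i.e. everything up to the first closing tag.
sub textarea_contents {
    my $html = shift;
    my ($inside) = $html =~ m{<textarea>(.*?)</textarea>}s;
    return $inside;
}

# A made-up page being edited that itself contains a textarea.
my $page = '<form><textarea name="msg">hi</textarea></form><p>footer</p>';

my $raw     = textarea_contents("<textarea>$page</textarea>");
my $escaped = textarea_contents('<textarea>' . escape_html($page) . '</textarea>');

print "raw:     $raw\n";      # cut off at the inner closing tag
print "escaped: $escaped\n";  # the whole page, in escaped form
```

With the raw page, the editing box ends at the inner closing tag and the rest of the file spills into the editing screen. With the escaped page there is no literal closing tag inside the box, so the whole file stays in it.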
1. Of course <textareatag> isn't a valid tag. It's what I use internally in order to get the code for the textareas to show up in my editing box. Like I said, I change it back to <textarea> when saving it.
2. A URL with "&amp;" instead of "&" will definitely, definitely break. There are many other things that break, for example, a JavaScript statement like: "if (a && b)...".
3. Regarding your "no need" comment, I don't think I can explain it better than I already have. You might want to have a look at my description of the problem again.
1. Of course <textareatag> isn't a valid tag. It's what I use internally in order to get the code for the textareas to show up in my editing box. Like I said, I change it back to <textarea> when saving it.
Sure, but you are just making your life harder.
2. A URL with "&amp;" instead of "&" will definitely, definitely break. There are many other things that break, for example, a JavaScript statement like: "if (a && b)...".
URLs with &amp; do work and it's encouraged to use them to avoid confusion with the start of another entity.
JS will not break.
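On the URL point, a small sketch of what happens to an escaped ampersand inside an href, using the link from the post above. The substitution is a hypothetical stand-in for the browser's attribute decoding and only handles the ampersand entity.

```perl
use strict;
use warnings;

# The link as it appears in the HTML source, with the ampersand escaped.
my $link = '<a href="http://example.com?&amp;value=1">link</a>';

# Pull out the attribute value exactly as written in the source.
my ($href) = $link =~ /href="([^"]*)"/;

# Stand-in for the browser: decode the entity before using the URL.
(my $requested = $href) =~ s/&amp;/&/g;

print "in source: $href\n";
print "requested: $requested\n";
```

The browser decodes the attribute before following the link, so the server still sees a plain ampersand in the query string.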
3. Regarding your "no need" comment, I don't think I can explain it better than I already have. You might want to have a look at my description of the problem again.
I think I understand it fully. It's a trivial problem. Here is my solution in Perl. If you want better formatted code, sticky me with your email address and I'll send them to you.
Once you run the code and it loads the file into the textarea, view the source and notice that all the <>& characters were escaped automatically by the CGI module.
Perl Script
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;
use CGI;
use CGI::Carp qw(fatalsToBrowser);

my $file = 'index.html';
my $q    = CGI->new;

# Save the submitted text, if any, before redisplaying the form.
write_file($file, scalar $q->param('file')) if $q->param('file');

# CGI escapes <, > and & in the textarea contents automatically.
print $q->header,
      $q->start_html($file),
      $q->h1($file),
      $q->start_form,
      $q->textarea('file', scalar read_file($file), 10, 80),
      $q->submit,
      $q->end_form,
      $q->end_html;
Example HTML file I used to test the script
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Form Test</title>
<script type="text/javascript">
if (a && b);
</script>
</head>
<body>
<div id="container">
<p>We've got tags and special characters such as &amp;, &spades;, and ...</p>
<p>Also we <a href="http://www.google.com/search?oe=UTF-8&amp;q=learning+perl">have links</a> that contain &amp; signs...</p>
<p><a href="http://www.w3.org/TR/REC-html40/charset.html#h-5.3.2">Authors should use "&amp;amp;"</a> (ASCII decimal 38) instead of "&amp;" to avoid confusion with the beginning of a character reference (entity reference open delimiter).</p>
</div>
</body>
</html>
After some testing, I found that the magic difference between your code and mine was that you printed the form input with CGI (i.e., print $q->textarea rather than $q->textarea('file', scalar read_file($file))). I guess that CGI takes care of all the conversions. I made that change to my file editor and now it appears to work properly cross-platform, and I don't have to screw around with converting back and forth between <textarea> and <textareatag>.
Thanks very much for your help, and your patience.