What I want is to let users write ordinary text, in English, French, German or Spanish, with all the normal text attributes like punctuation, spaces, line feeds or carriage returns.
And then validate that it is text and nothing but text (no commands, HTML, etc.).
So I tried something like:
$valid_comment = eregi("^([a-zA-Zà-üÀ-Ü0-9 \.\!\?\'\-]+)$", $comment);
but it returns false.
Even [[:alnum:][:space:][:punct:]] did not do the trick.
Is it because of the accents, the line feeds or the punctuation? I don't know, but whatever French text I enter, it says that it is not valid.
Most of the tutorials I've read focus on <input> field validation, but not on what I'm looking for.
Could someone point me to some useful information?
Regards. (And happy Christmas, by the way.)
You are right about eregi being case-insensitive.
But apparently that doesn't matter.
As for the bizarre range of characters, I saw these in some posts.
And they do their job in a simple text field (for instance the company name, which in France can have accents).
It was only once I applied the regex to a textarea that the expression failed (even the one with generic classes such as [:alnum:]).
So I'm still stuck with that bizarre problem.
Perhaps my explanation is not clear.
What I want to do is check that there is text and nothing but text in the textarea input.
I thought I could perform this check with a regular expression.
If the input in the <textarea> validates, I insert it in the database.
Otherwise, I send the user back to his keyboard to correct his submission himself.
I'm not going to correct things for him, or load all kinds of useless stuff into my DB.
Especially since it's a collaborative project and other users will have to work with what has been submitted.
So it really has to MATCH the requirements. Letters with accents or not, digits, spaces and punctuation and line feeds, nothing else.
Does that make a bit more sense?
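For what it's worth, the requirement above (letters with or without accents, digits, spaces, punctuation and line feeds, nothing else) could be sketched with the PCRE functions instead of eregi(), which was deprecated in PHP 5.3 and removed in PHP 7. Note that the character class in the original eregi() pattern contains no line feed or carriage return, so any multi-line textarea input would fail it. The function name is_plain_text() is made up for this example, the punctuation list is only a guess at what should be allowed, and the code assumes the input arrives as UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): returns true when $comment
// contains only letters (accented or not), digits, whitespace and a
// small set of punctuation marks. Assumes $comment is UTF-8 encoded.
function is_plain_text(string $comment): bool
{
    // \p{L} = any Unicode letter, \p{N} = any Unicode number,
    // \s    = whitespace, including \r and \n from a textarea.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    // The punctuation list is an assumption; adjust it to taste.
    return preg_match('/^[\p{L}\p{N}\s.,;:!?\'"()\-]+$/u', $comment) === 1;
}

var_dump(is_plain_text("Ça marche très bien.\r\nDeuxième ligne !")); // bool(true)
var_dump(is_plain_text("<b>bold</b>"));                              // bool(false)
```

If the input is not valid UTF-8, preg_match() returns false with the u modifier, so the helper rejects it as well.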
As far as I can see, the regexp you typed will catch the stuff you want it to. Have you checked the other parts of your script for possible errors?
Here's a regexp tester btw: [regexlib.com...]
<?php
// Reject the input if it contains any of the disallowed ASCII punctuation characters.
if (preg_match('!([#-&(-+/<->@\^_{-}])!', $content, $match)) {
    $error_message = "You used a '$match[1]' in the text field, which is not allowed.";
    // return to sender.
} else {
    // $content validates.
}
?>
I used ASCII ranges here (for example, #-& covers # $ % &). If that doesn't work for you, maybe just list the characters you want to disallow in the [] rather than using ranges. The only individual characters you'd need to escape are \ ^ - ] and your delimiter (! in this case).
I hope this helps.
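In case the ASCII ranges are confusing, here is a rough sketch of the same check with every disallowed character listed one by one, as suggested above. The function name first_disallowed_char() is invented for this example; the character list is just the expansion of the ranges in the pattern above.

```php
<?php
// Hypothetical helper (not from the thread): returns the first disallowed
// character found in $content, or null when there is none.
function first_disallowed_char(string $content): ?string
{
    // Same characters as the ranged pattern, written out individually:
    // # $ % & ( ) * + / < = > @ ^ _ { | }
    // Only ^ is escaped here; ! is the delimiter and does not occur
    // inside the class, so it needs no escaping.
    if (preg_match('![#$%&()*+/<=>@\^_{|}]!', $content, $match) === 1) {
        return $match[0];
    }
    return null;
}

var_dump(first_disallowed_char("Price: 5 < 6"));     // string(1) "<"
var_dump(first_disallowed_char("Bonjour, ça va ?")); // NULL
```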
That's really odd - I could make it match perfectly fine (even some special Danish characters that you don't have in French :)). Perhaps it's something with the eregi() function (I'm not very well versed in PHP) - does that one work on multi-line input, or does it require a single line?
---
Added: Perhaps your form has URL-encoded the characters before they reach your script; if so, your string should be decoded before you test it. This would decode it in Perl, and I'm sure there's something similar for PHP:
$string =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
Just a thought, perhaps it's totally off.
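A rough PHP counterpart to that Perl one-liner would be rawurldecode(), which replaces each %XX sequence with the byte it encodes. The sample string below is made up for illustration. Keep in mind that PHP already decodes the values it puts into $_GET and $_POST for a normal form submission, so decoding them a second time is usually not needed.

```php
<?php
// Made-up sample: "très bien" as UTF-8, percent-encoded.
$encoded = 'tr%C3%A8s%20bien';

// rawurldecode() turns every %XX sequence back into the byte it stands for.
// Like the Perl substitution, it leaves "+" signs untouched.
$decoded = rawurldecode($encoded);

echo $decoded, "\n"; // très bien
```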
Note that PHP has two sets of regular expression functions: POSIX (ereg*) and Perl-Compatible [php.net] Regular Expression Functions (preg_*). Not that it will make a difference here (don't have time to check right now), but just so you are aware.
I implemented the suggestion of Salsa, and it worked out of the box. Thx, man!
But since I was a bit confused about that ASCII-range thing, I went searching and found out that the complete set of ASCII punctuation characters is this:
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Just in case other users of the fabulistic Webmasterworld might find this piece of info useful.
Merry Christmas to all.
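For anyone who wants to double-check the punctuation set listed above, here is a small sketch that rebuilds it by walking the printable ASCII codes and keeping those that ctype_punct() accepts. It assumes the ctype extension is available (it is enabled by default) and the default "C" locale.

```php
<?php
// Collect every printable ASCII character that counts as punctuation.
$punctuation = '';
for ($code = 33; $code <= 126; $code++) {
    $char = chr($code);
    // ctype_punct() is true for printable characters that are neither
    // alphanumeric nor whitespace.
    if (ctype_punct($char)) {
        $punctuation .= $char;
    }
}

echo $punctuation, "\n";
echo strlen($punctuation), " characters\n";
```

The ranges used earlier in the thread (such as #-& or {-}) are just slices of this list, taken in ASCII order.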