
Forum Moderators: coopster & jatar k & phranque


Textarea Word Counter Needed

How do you do it?

     

typomaniac

2:35 pm on Feb 11, 2010 (gmt 0)

5+ Year Member



Hi, I need to count the number of words in a textarea input. Counting the characters is easy, but I need to count words instead of just characters, without using JavaScript.

janharders

3:19 pm on Feb 11, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



that's easy enough, too.
my $string = 'hello, my name is Xasghjda.';
my $count = scalar split(/\W/, $string);
print $count;


it's not pretty, not efficient, but very easy. The above code prints 5 btw ...

phranque

4:11 pm on Feb 11, 2010 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], typomaniac!

just to clarify, you were looking for something "server side" rather than in the browser, correct?
if so, janharders' solution would certainly suffice, or you could probably do something with a "regular expression" if you needed something more efficient.

hmmm, on second thought that split might create empty array elements between consecutive non-word characters...
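
A quick way to check that suspicion, as a sketch: collect the fields in an array and print them so any empty ones become visible. The helper name split_fields is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the fields split() produces for a pattern,
# so that empty fields can be inspected.
sub split_fields {
    my ($pattern, $text) = @_;
    my @fields = split $pattern, $text;
    return @fields;
}

my @fields = split_fields(qr/\W/, 'hello, my name is Xasghjda.');
print scalar(@fields), "\n";                    # 6, not 5
print join('|', map { "[$_]" } @fields), "\n";  # [hello]|[]|[my]|[name]|[is]|[Xasghjda]
```

The empty field sits between the comma and the space after "hello"; the trailing empty field after the final "." is dropped by split automatically.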

janharders

5:40 pm on Feb 11, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



you got me, phranque. It didn't when I first wrote it as /[, .]+/ but then I thought "nooo, the other guys will see it and point out that \W would've been much nicer" (also, once you want to do it right, you'll end up adding a lot of word boundaries...), so I changed it and left the + out. I usually don't like to work with \w, because of locales, but \W should work just fine.


my $count = scalar split(/\W+/, $string);

would work better.
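
One caveat with the split version: a string that starts with a separator still yields an empty first field, so the count comes out one too high. A sketch that sidesteps this by matching runs of \w instead of splitting (count_words is a hypothetical name, and the locale caveat about \w still applies):

```perl
use strict;
use warnings;

# Count words by matching runs of word characters instead of splitting,
# so leading punctuation or whitespace cannot add an empty first field.
sub count_words {
    my ($text) = @_;
    my @words = $text =~ /\w+/g;   # list context: all matches
    return scalar @words;
}

# For comparison: split keeps an empty leading field.
my @fields = split /\W+/, '  hello, my name is Xasghjda.';
print scalar(@fields), "\n";                                # 6

print count_words('  hello, my name is Xasghjda.'), "\n";   # 5
print count_words('hello, my name is Xasghjda.'), "\n";     # 5
```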

typomaniac

11:21 am on Feb 12, 2010 (gmt 0)

5+ Year Member



Works like a charm. Can't say thanks enough. One thing I noticed though: it doesn't count punctuation or other special characters, even if they are typed in separately. Would something have to be added to the regex for that?

janharders

4:03 pm on Feb 12, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



What exactly do you mean?
Currently, "hello. how are you?" would count as four words because ". " is counted as a single word boundary. If you want it to count punctuation as words in special situations, you'll have to define the circumstances and I'll be happy to help in putting that into the regexp. Think of something challenging ;)

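One possible definition, purely as a sketch: treat every whitespace-separated chunk as a word, so a stand-alone run of punctuation counts too. The name count_tokens is made up here; split with the special ' ' pattern splits on any whitespace and skips leading whitespace.

```perl
use strict;
use warnings;

# Count whitespace-separated tokens. Punctuation attached to a word stays
# part of that word; punctuation typed on its own counts as a token.
sub count_tokens {
    my ($text) = @_;
    my @tokens = split ' ', $text;   # special case: awk-style whitespace split
    return scalar @tokens;
}

print count_tokens('hello. how are you?'), "\n";    # 4
print count_tokens('hello . how are you ?'), "\n";  # 6
```
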
typomaniac

9:22 am on Feb 13, 2010 (gmt 0)

5+ Year Member



I understand what you are saying but what I meant was, if someone typed in something like Hello, how are you? >>>>>)(*)(* )(* ** it still only shows up as four words, meaning someone could type in all kinds of things like special characters and it would not be counted against the limit allowed. I think what I was wanting to do is put handcuffs on malicious users but like I said the char count will still get them. I apologize for going overboard in what I was asking because I can use a character count and still limit user input. You were still more help to me than you'll ever know.

janharders

12:06 pm on Feb 13, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hey, it's a pleasure to help out.
If you defined the word boundaries more strictly, you could have it count differently.

my $string = "hello, my name is Xasghjda. What\nyes sadddd/( /(\$dd";
my $count = scalar split(/[,. ?!\r\n]+/, $string);
print $count;


would count 9 words. To make sure your users don't mess with your design, you might also look at spaces: even if I stay below the character count (let's say 80) and the word count, it might break your design if I just submit 79 x "a" without any spaces or dashes where the browser could break the line.
A test like

if($string =~ m/[^\- \r\n]{20,}/s)

would match any string containing a run of 20 or more characters with no spaces, dashes or line breaks. What you do with those is up to you: either yell at the user or silently insert a space every X characters:
$string = "dasdddddddddddddddddddddddddddddddddddddddddddddddddddd";

while ($string =~ m/[^\- \r\n]{20,}/s)
{
    $string =~ s/([^\- \r\n]{19})/$1 /gs;
}

print '"' . $string . '"' . "\n";

=>
"dasdddddddddddddddd ddddddddddddddddddd ddddddddddddddddd"

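A variation on the loop above, as a sketch rather than a drop-in replacement: a look-ahead makes the substitution insert a space only when another unbreakable character follows, so words of exactly 19 characters are left alone and no trailing space is added. The name break_long_runs is hypothetical; the limit of 19 is taken from the example above.

```perl
use strict;
use warnings;

# Insert a space after every 19 unbreakable characters, but only if at
# least one more unbreakable character follows (checked by the look-ahead).
sub break_long_runs {
    my ($text) = @_;
    $text =~ s/([^\- \r\n]{19})(?=[^\- \r\n])/$1 /g;
    return $text;
}

print '"' . break_long_runs('d' x 55) . '"' . "\n";   # chunks of 19, 19 and 17
print '"' . break_long_runs('d' x 19) . '"' . "\n";   # unchanged
```
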
typomaniac

10:50 am on Feb 15, 2010 (gmt 0)

5+ Year Member



Bingo! That hit the nail on the head! You were so quick with answers... what would you recommend as a good reference for learning regex, for coming up with solutions like this? I can write enough Perl (sometimes, though usually with help from great people like you), but regex is really new to me. Thank you once again.

phranque

11:40 pm on Feb 15, 2010 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



there are quite a few regexp references linked in the "Perl Server Side CGI Scripting forum Charter" [webmasterworld.com].

janharders

10:48 am on Feb 16, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I started with Mastering Regular Expressions (EAN: 9781565922570, ISBN 10: 1565922573), but that just got me the first few miles. The rest was just getting routine. It took me quite some time, but I'm a slow learner...

glad I could help

typomaniac

12:43 pm on Feb 16, 2010 (gmt 0)

5+ Year Member



I was looking at that book myself and hope to get it when I visit the homeland (U.S.) next time I'm there. The shipping would cost too much here (at least double). I have Regex Buddy ([regexbuddy.com]); I just got it and haven't had time to attempt making sense of it yet. JG software also has a product called Regex Magic and maybe.......slow learner? I'm not sure which is my first or middle name---typomaniac (master of mistakes) or slow learner. Hopefully I can get ahead enough with things to be able to help others as you've helped me. Thanks so much.

janharders

4:45 pm on Feb 16, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I've looked at regexbuddy some time ago, mostly for debugging regexps, but haven't used it. From my personal experience, in most cases you'll just need the basic functionality, that is character groups, backreferences and quantification, mostly. Also, it's very important to know the modifiers (m//#*$! < those!). I've seldom used look-ahead and look-behind, so I still have to check the manual whenever I need them ... if I ever do.
As for typos, I'm sure you're already including "use strict;" in all your scripts? That pretty much makes sure that typos won't go unnoticed. Also: check out Regexp::Common [search.cpan.org], it contains many solutions for everyday problems (such as matching valid emails, which can be quite a painful thing otherwise).

typomaniac

1:24 am on Feb 17, 2010 (gmt 0)

5+ Year Member



use strict....one of the meanest things ever discovered..lol. Amazing how simply (#)commenting that line out makes life so much easier. It even got "mad" at me as I tried to replace strings with variables in pursuit of building a language file, i.e.,
$lng{'1'}="All Fields Required"; so that to use the script with a different language it would be a simple matter of replacing the value. It didn't like the part with {'1'}. Once the script was running okay I just commented out the use strict line and moved on. As far as Regexp::Common, I looked at that and once again stand in amazement at the lengths people will go to in the pursuit of making life easier for others.

janharders

12:55 am on Feb 18, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



whatever you can do without use strict, you can do with strict. There are exceptions, of course, but that's just the real evil dark voodoo-stuff. I can only recommend keeping it in there ... otherwise, next time you want to change stuff, fix a bug or extend anything, you'll be lost hunting that typo ;)
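
For the language-file case, a minimal sketch, assuming the complaint was about %lng never being declared (the error message isn't shown in the thread). Under strict an undeclared hash is rejected, while the {'1'} key itself is fine. The msg helper is hypothetical.

```perl
use strict;
use warnings;

# Declaring the hash with "my" is what strict asks for.
my %lng;
$lng{'1'} = 'All Fields Required';

# Hypothetical helper: look up a message by key, falling back to the key
# itself when no translation is defined.
sub msg {
    my ($key) = @_;
    return exists $lng{$key} ? $lng{$key} : $key;
}

print msg('1'), "\n";   # All Fields Required
```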

btw ... for localization take a look at the Maketext-Family of modules. There's a great article from the perl journal available on cpan [search.cpan.org]. I haven't seen anything that is more flexible and easy to maintain than the maketext-idea.
 
