Forum Moderators: coopster


What's in a [Wiki]Name?

I want to make sure my CamelCase is up to spec.

         

cmarshall

4:25 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Folks,

Maybe someone here can help me out. I have been using a standard Wiki REGEX to validate WikiNames:

preg_match("/^[A-Z,ƒ÷‹][a-z,fl‰ˆ¸]+[A-Z,ƒ÷‹][A-Z,a-z,0-9,ƒ÷‹,fl‰ˆ¸]*$/", $text)

(That fl is a special character that doesn't translate well to browser display).

I had issues with adding numbers to the name, such as "P123projectPage4", so I changed the REGEX like so:

preg_match("/^[A-Z,ƒ÷‹][a-z,0-9,fl‰ˆ¸]+[A-Z,ƒ÷‹][A-Z,a-z,0-9,ƒ÷‹,fl‰ˆ¸]*$/", $text)

I guess I could also make the numbers part of the capital requirements, like so:

preg_match("/^[A-Z,0-9,ƒ÷‹][a-z,fl‰ˆ¸]+[A-Z,ƒ÷‹][A-Z,a-z,0-9,ƒ÷‹,fl‰ˆ¸]*$/", $text)
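To see how the three variants treat a name like "P123projectPage4", here is a rough PHP sketch. It rests on a simplifying assumption: the commas and the non-ASCII characters are dropped from every character class, so only the handling of digits differs. The labels and the two extra test names are made up for illustration.

```php
<?php
// ASCII-only stand-ins for the three patterns above (assumption: commas
// and the special characters are left out to isolate the digit handling).
$patterns = [
    'original'        => '/^[A-Z][a-z]+[A-Z][A-Za-z0-9]*$/',
    'digits as lower' => '/^[A-Z][a-z0-9]+[A-Z][A-Za-z0-9]*$/',
    'digit may lead'  => '/^[A-Z0-9][a-z]+[A-Z][A-Za-z0-9]*$/',
];

// Hypothetical test names; only the first one comes from the post.
$names = ['P123projectPage4', 'ProjectPage4', '4projectPage'];

$results = [];
foreach ($patterns as $label => $pattern) {
    foreach ($names as $name) {
        // preg_match() returns 1 on a match and 0 on no match.
        $results[$label][$name] = preg_match($pattern, $name) === 1;
        printf("%-16s %-18s %s\n", $label, $name,
            $results[$label][$name] ? 'match' : 'no match');
    }
}
```

On these inputs only the second variant accepts "P123projectPage4". The third variant does not help with digits after the first capital; it changes which characters may start the name.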

Any feedback?

coopster

7:31 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Can you post some examples of the [Wiki]Names you are referring to?

cmarshall

7:49 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Can you post some examples of the [Wiki]Names you are referring to?"

Sure. Here's what I have posted as the current allowables in our wiki:


* Is Not a WikiName: notawikiname (All lowercase)
* Is Not a WikiName: NOTAWIKINAME (All uppercase)
* Is Not a WikiName: NOTAWikiname (Does not have lowercase letters between capital letters)
* Is Not a WikiName: NotAWiki_Name (Contains non-alphanumeric character)
* Is Not a WikiName: 4AnExample (Starts with a number)

* Is a WikiName: IsAWikiName
* Is a WikiName: IsawikinamE
* Is a WikiName: Isawikiname2
* Is a WikiName: Is2AWikiName

Here [en.wikipedia.org] is the official Wikipedia description of CamelCase.

Here [c2.com] is an even older definition.

As you can see, there are no hard and fast rules.

Our wiki insists that the first character always be a capital letter (A-Z). It is then followed by lowercase letters, then by a mix of digits and letters, with subsequent capital letters.

What I did was add the ability to create the following type of name:

* N123umbersRightAfterACapital

Before, you had to have a lowercase letter right after the first letter:

* Nu123mbersRightAfterACapital

coopster

9:06 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Thanks, now it's becoming clearer.
[en.wikipedia.org...]

If you wanted to hold true to the Wiki format you could always pull their source code down and locate the regular expression they are using to build names. As a matter of fact, you could start with what they have and modify yours from there if you so desire. I believe the function might be found in their importUseModWiki.php file.

BTW, you don't need to separate ranges in a character class with commas. As a matter of fact, having the comma there means you are allowing the comma character to be part of the expression. Are you certain that is what you want?
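A small sketch of the comma point, again with ASCII-only character classes as a simplifying assumption. The test string "W,iki,Name" is invented for the demonstration.

```php
<?php
// Inside [...] a comma is just another allowed character,
// not a separator between ranges.
$withCommas    = '/^[A-Z,][a-z,]+[A-Z,][A-Z,a-z,0-9,]*$/';
$withoutCommas = '/^[A-Z][a-z]+[A-Z][A-Za-z0-9]*$/';

// Hypothetical inputs: one ordinary name, one with stray commas.
$inputs = ['WikiName', 'W,iki,Name'];

$commaResults = [];
foreach ($inputs as $input) {
    $commaResults[$input] = [
        'with'    => preg_match($withCommas, $input) === 1,
        'without' => preg_match($withoutCommas, $input) === 1,
    ];
    printf("%-12s with commas: %s, without commas: %s\n", $input,
        $commaResults[$input]['with'] ? 'match' : 'no match',
        $commaResults[$input]['without'] ? 'match' : 'no match');
}
```

Both patterns accept "WikiName", but only the comma-laden one lets "W,iki,Name" through.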

cmarshall

9:21 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks!

That was the original code in the Wiki software I adapted (WakkaWikkaWiki). I'll go in there and clean it up. I have no idea what those whacky characters mean either. The authors are Russian, so they may be for Cyrillic character sets.

coopster

9:45 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Sure, not really certain how much I helped you there though. I pulled that source code and did a quick search to find the CamelCase functions and found a function in the file mentioned. Yes, you can read it, but it is a bit more advanced, with a callback, etc. But checking for a lowercase after an uppercase is going to call for a bit more advanced regular expression, so perhaps that is the best routine the programmers came up with. Good luck with it all ;)

cmarshall

9:50 pm on Jan 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I happen to have a REGEX whiz working for me. I'll put it to him to give me the most efficient code.

What I wanted was the proper "rules" for what makes a WikiName, so I could give him the best parameters.

cmarshall

3:52 am on Jan 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just FYI. Here's what I tasked my REGEX guru to do:

In the meantime, I have a Regex homework assignment. I suspect that it will be easy.

Here is the current WikiName regex in the Wiki:

function IsWikiName($text) { return preg_match("/^[A-Z,ƒ÷‹][a-z,0-9,fl‰ˆ¸]+[A-Z,ƒ÷‹][A-Z,a-z,0-9,ƒ÷‹,fl‰ˆ¸]*$/", $text); }

I actually added the ,0-9 in the second set of brackets, but I'm not so sure I like the results. I think the whacky characters may be because of Cyrillic character sets (the authors are Russian).

In any case, I want to get rid of the Cyrillic characters, and I'm pretty sure the commas are not supposed to be there.

Here are the rules:

The first letter of a WikiName must be an uppercase English/Roman character (A-Z). It cannot be numeric or punctuation.

No non-alphanumeric characters anywhere ([A-Za-z0-9]).

Numbers 0-9 should be classified as capital letters.

There must be a switch between capital letters and lowercase letters, followed by at least one capital letter.
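Read literally, the four rules above can be sketched without a regex at all. This is one possible reading, not the wiki's code: the function name followsWikiNameRules is hypothetical, and "a switch between capital letters and lowercase letters" is assumed to mean a leading run of capitals/digits, then a run of lowercase letters, then at least one more capital or digit.

```php
<?php
// Hypothetical helper: a rule-by-rule check, one reading of the rules above.
function followsWikiNameRules(string $text): bool
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $digit = '0123456789';
    $length = strlen($text);

    // Rule 2: only A-Z, a-z and 0-9 anywhere in the name.
    if ($length === 0 || strspn($text, $upper . $lower . $digit) !== $length) {
        return false;
    }
    // Rule 1: the first character must be an uppercase letter.
    if (strpos($upper, $text[0]) === false) {
        return false;
    }
    // Rule 3: digits count as capitals, so skip the leading capital/digit run.
    $capsRun = strspn($text, $upper . $digit);
    if ($capsRun === $length) {
        return false; // no lowercase letters at all
    }
    // Rule 4: a lowercase run must follow, and something must come after it.
    // Whatever follows a maximal lowercase run is necessarily a capital or digit.
    $lowerRun = strspn($text, $lower, $capsRun);
    return $lowerRun > 0 && $capsRun + $lowerRun < $length;
}

var_dump(followsWikiNameRules('ThisIsAWikiName'));
var_dump(followsWikiNameRules('Thisisnotawikiname'));
```

Spelling the rules out this way mainly helps to check that a regex candidate and the written rules agree on the example names.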

Examples:

ThisIsAWikiName
thisIsNotAWikiName
THISISNOTAWIKINAME
thisisnotawikiname
Thisisnotawikiname
ThisisawikinamE
Thisisawikiname2
1ThisIsNotAWikiName
ThisIs_Not_AWikiName
ThisIs.Not.AWikiName
THISisnotawikiname
THISis2awikiname
T123HISisAwikiname
T123HISisnotawikiname
T123HISisa4wikiname
T123HISISNOTAWIKINAME
YeS
No

Think you can give me a regex that will do this?

cmarshall

2:38 am on Jan 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just FYI.

This is the regex that does it:

^[A-Z][A-Z0-9]*(?=[a-z]+[A-Z0-9]+)[a-zA-Z0-9]*$

cmarshall

3:50 am on Jan 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, there is one issue that I expect to be fixed on the morrow.

cmarshall

8:17 pm on Jan 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I simplified it even more:
/^[A-Z][A-Z0-9]*[a-z]+[A-Z0-9]+[a-zA-Z0-9]*$/
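For anyone who wants to try it, here is a short PHP check of that pattern against the example names from the earlier post. The expected values are an assumption read off the names themselves (for instance "ThisisawikinamE" is taken as valid and "T123HISisnotawikiname" as invalid), since the list does not label them explicitly.

```php
<?php
$wikiNamePattern = '/^[A-Z][A-Z0-9]*[a-z]+[A-Z0-9]+[a-zA-Z0-9]*$/';

// Example names from the thread, with the verdict each name suggests.
$examples = [
    'ThisIsAWikiName'       => true,
    'thisIsNotAWikiName'    => false,
    'THISISNOTAWIKINAME'    => false,
    'thisisnotawikiname'    => false,
    'Thisisnotawikiname'    => false,
    'ThisisawikinamE'       => true,
    'Thisisawikiname2'      => true,
    '1ThisIsNotAWikiName'   => false,
    'ThisIs_Not_AWikiName'  => false,
    'ThisIs.Not.AWikiName'  => false,
    'THISisnotawikiname'    => false,
    'THISis2awikiname'      => true,
    'T123HISisAwikiname'    => true,
    'T123HISisnotawikiname' => false,
    'T123HISisa4wikiname'   => true,
    'T123HISISNOTAWIKINAME' => false,
    'YeS'                   => true,
    'No'                    => false,
];

$mismatches = [];
foreach ($examples as $name => $expected) {
    $actual = preg_match($wikiNamePattern, $name) === 1;
    if ($actual !== $expected) {
        $mismatches[] = $name; // collect any name the pattern judges differently
    }
    printf("%-24s %s\n", $name, $actual ? 'WikiName' : 'not a WikiName');
}
printf("%d mismatch(es)\n", count($mismatches));
```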

cmarshall

9:20 pm on Jan 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just FYI. I "broke" one "rule" of WikiNames. They don't want a numerical character between the initial capital letter and the next lowercase letter.

I deliberately coded around that, as all our project codes are single-letter classifications, followed immediately by numbers.
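For comparison, a stricter variant that keeps that rule might look like the sketch below. The $strictPattern is a guess at what such a rule would require (no digits before the first lowercase run); it does not come from any wiki's source.

```php
<?php
// Pattern from this thread: digits are allowed right after the first capital.
$lenientPattern = '/^[A-Z][A-Z0-9]*[a-z]+[A-Z0-9]+[a-zA-Z0-9]*$/';
// Hypothetical stricter variant: only capitals may precede the first lowercase run.
$strictPattern  = '/^[A-Z]+[a-z]+[A-Z0-9]+[a-zA-Z0-9]*$/';

$projectNames = [
    'P123projectPage4',
    'N123umbersRightAfterACapital',
    'Nu123mbersRightAfterACapital',
];

$verdicts = [];
foreach ($projectNames as $name) {
    $verdicts[$name] = [
        'lenient' => preg_match($lenientPattern, $name) === 1,
        'strict'  => preg_match($strictPattern, $name) === 1,
    ];
    printf("%-30s lenient: %s, strict: %s\n", $name,
        $verdicts[$name]['lenient'] ? 'match' : 'no match',
        $verdicts[$name]['strict'] ? 'match' : 'no match');
}
```

Project codes such as "P123projectPage4" pass only the lenient pattern, which is the reason for bending the rule.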