RewriteCond %{HTTP_REFERER} - is this correct?

         

Laibcoms

6:46 am on Mar 7, 2006 (gmt 0)

10+ Year Member



Currently I have a long list of RewriteCond %{HTTP_REFERER} lines, and many of them are for the same domain name with multiple subdomains and/or TLDs.

Looking at what's in my current .htaccess, I came up with a summarized version like this:


RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?example\.(info|ws)*(/)?.*$ [NC]

Now I'm wondering if it is correct. The way I understood it is:

I.
([a-z0-9-]+\.)? => any subdomain but optional
vs
([a-z0-9-]+\.)* => any subdomain but required = thus http://example.info is not allowed?

II.
\.(info|ws)* => either .info or .ws and is required because of the * instead of ?

III.
(/)?.* => / as optional because of the ? while the .* means all subdomains.

IV.
(whatever)* vs .* means required vs all

Did I get things right?
And is it advisable to combine different lines into one pattern like the one above?

Thanks everyone.

Additionally, can I summarize this

RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?exam-ple2.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?example2.org(/)?.*$ [NC]

as:

RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?exam(-)?ple2.org(/)?.*$ [NC]

with it like this: exam(-)?ple2
or should it be exam(\-)?ple2

Another question is:
should it be written as
example\.(info|ws)*(/)?.*$

or can it be written as:
example.(info|ws)*(/)?.*$

ie, without the backslash before the dot?

Thanks again.

[edited by: jdMorgan at 2:47 pm (utc) on Mar. 7, 2006]
[edit reason] No URLS, please. See Terms of Service. [/edit]

jdMorgan

3:05 pm on Mar 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Laibcoms,

Welcome to WebmasterWorld!

I.
([a-z0-9-]+\.)? => any subdomain but optional
vs
([a-z0-9-]+\.)* => any subdomain but required = thus http://example.info is not allowed?

No, ([a-z0-9-]+\.)* is "any number of subdomains and sub-subdomains, including zero."

"*" means "zero or more of the preceding character or parenthesized group of characters."
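To see the difference between "?" and "*" on the subdomain group, here is a small sketch using Python's re module. That is an assumption on my part: mod_rewrite uses its own regex engine, but these basic quantifiers behave the same way in both. The matches() helper and the sample referrers are made up for illustration, and the patterns are tested without the leading "!" negation.

```python
import re

# Hypothetical helper: case-insensitive search, roughly what [NC] does.
def matches(pattern, referer):
    return re.search(pattern, referer, re.IGNORECASE) is not None

# "?" allows zero or one subdomain label; "*" allows zero or more.
OPTIONAL = r"^https?://([a-z0-9-]+\.)?example\.info"
ANY_DEPTH = r"^https?://([a-z0-9-]+\.)*example\.info"

for referer in ("http://example.info/",
                "http://www.example.info/",
                "http://a.b.example.info/"):
    print(referer, matches(OPTIONAL, referer), matches(ANY_DEPTH, referer))
```

Both patterns accept the bare domain, so neither quantifier makes the subdomain "required". Only the "*" version accepts the two-level a.b.example.info.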

II.
\.(info|ws)* => either .info or .ws and is required because of the * instead of ?

No, same problem. (info|ws)* would match "info" or "ws" or "infoinfo" or "wswsws" or "infowsinfo", etc. or blank. If you want to match exactly one occurrence of "ws" or "info", then no quantifier is needed at all.
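A quick way to check this is to test the TLD part on its own. The sketch below again assumes Python's re module as a stand-in for Apache's regex engine; the variable names and sample strings are only for illustration.

```python
import re

# The alternation with "*" repeats freely; without a quantifier it matches once.
repeated = re.compile(r"(info|ws)*")
exactly_once = re.compile(r"(info|ws)")

for tld in ("info", "ws", "infoinfo", "infowsinfo", ""):
    # fullmatch requires the whole string to be consumed by the pattern.
    print(repr(tld),
          bool(repeated.fullmatch(tld)),
          bool(exactly_once.fullmatch(tld)))
```

The version without a quantifier accepts only "info" and "ws", so a single alternation can cover both TLDs.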

III.
(/)?.* => / as optional because of the ? while the .* means all subdomains.

Or no subdomain at all.

IV.
(whatever)* vs .* means required vs all

(whatever)* means "zero or more occurrences of 'whatever'" while .* means "any characters or none."

RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?exam-ple2.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?example2.org(/)?.*$ [NC]

as:

RewriteCond %{HTTP_REFERER} !^http(s)?://([a-z0-9-]+\.)?exam(-)?ple2.org(/)?.*$ [NC]


Just use:

RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)?exam-?ple2\.org [NC]

"?" means "one or zero of the preceding character or parenthesized group of characters."

"." in regular expressions means "any single character." If you want to match a literal dot/period/full stop, then precede it with a "\" as in "\."
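Here is a minimal sketch of both points, the optional hyphen and the escaped dot, assuming Python's re module behaves like Apache's engine for these constructs. The sample referrers are invented, and the patterns are shown without the leading "!" negation.

```python
import re

# Escaped dot: only a literal "." is accepted before "org".
escaped = re.compile(r"^https?://([a-z0-9-]+\.)?exam-?ple2\.org", re.IGNORECASE)
# Unescaped dot: any single character is accepted before "org".
unescaped = re.compile(r"^https?://([a-z0-9-]+\.)?exam-?ple2.org", re.IGNORECASE)

referers = [
    "http://example2.org/",            # no hyphen, no subdomain
    "https://www.exam-ple2.org/page",  # hyphen plus subdomain
    "http://example2xorg.example/",    # different host that only looks similar
]

for referer in referers:
    print(referer,
          escaped.search(referer) is not None,
          unescaped.search(referer) is not None)
```

The first two referrers match both patterns. The third matches only the unescaped one, which is why the backslash before the dot matters.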

Do not end-anchor hostname patterns or include the trailing slash. It won't always work.

Trying to "guess" regular expressions is a good way to get into trouble. I recommend reviewing the regular expressions tutorial cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim

Laibcoms

7:10 am on Mar 9, 2006 (gmt 0)

10+ Year Member



Ah, okay thanks!
So I really have to create a single entry for each TLD then, right?

Hmm, .htaccess seems to be following a different rule in reading the syntaxes. ^_^