Spam behind url

ref=spamurl

         

dolcevita

2:53 pm on Nov 28, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have just noticed that some spammers add ?ref=spamsite after mydomain.com, so that the URL looks like mydomain.com/?ref=spamsite.com, and then simply submit it to search engines, where it gets indexed.

Can someone help me write a rewrite rule to prevent this kind of spamming, so that when someone requests mydomain.com/?ref=spamsite.com it redirects to mydomain.com?

Thanks

ps

It seems that this rule works:
RewriteCond %{THE_REQUEST} \?(ref=.*)?\ HTTP [NC]
RewriteRule .? http://www.example.com%{REQUEST_URI}? [R=301,L]

[edited by: jdMorgan at 3:50 pm (utc) on Nov. 28, 2009]
[edit reason] example.com [/edit]

jdMorgan

3:50 pm on Nov 28, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's fine, but you could shorten and simplify it a little:

RewriteCond %{QUERY_STRING} ref= [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

or make it more specific if the "ref=" is a full URL:

RewriteCond %{QUERY_STRING} ref=https?:// [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
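
A note on the unanchored "ref=" pattern: it also matches parameter names that merely end in "ref", such as "pref=" or "href=". A slightly stricter sketch, assuming an .htaccess file in the document root and that "ref" is never a legitimate parameter on the site, anchors the name to the start of the query string or to a preceding "&":

```apache
# Match "ref=" only as a complete parameter name:
# either first in the query string or right after an "&".
RewriteCond %{QUERY_STRING} (^|&)ref= [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

Either way, the trailing "?" removes the entire query string, including any legitimate parameters that arrived with it. On Apache 2.4.0 and later, the [QSD] flag is an alternative to the trailing "?".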

Jim

g1smd

10:04 pm on Nov 28, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does the real site use any query strings at all?

For sites that have no query strings in any of the valid URLs, I sometimes use a rule that redirects for any and all query strings and values.
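
A minimal sketch of such a catch-all rule, assuming an .htaccess file in the document root, mod_rewrite already enabled with RewriteEngine On, and www.example.com standing in for the canonical hostname. It is only safe when no valid URL on the site carries a query string:

```apache
# Redirect any request that has a non-empty query string
# to the same path with the query string removed.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

The single "." matches any one character, so the condition is true whenever the query string is non-empty; the trailing "?" in the substitution clears it.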

icedowl

7:34 pm on Nov 30, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just saw something similar today that I'd like to stop:
/?cfg[prePath]=http://www.example.net/01.gif?

The result is that my site is displayed with that string appended to the URL. I tried the code mentioned above, without success:

RewriteCond %{QUERY_STRING} ref= [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

There's probably something I don't understand.

As an aside, in WMT I saw a duplicate description for just one page: one URL with a trailing slash and one without. I want no trailing slash when the URL ends with an extension ("/page.html"), and I use a trailing slash only for my home page, which eliminates "/index.html". Something I'm doing wrong, or something I'm missing, allows any page "/page.html" to be returned as "/page.html/", and that just might be what allows the crap I mentioned at the beginning of this post.

jdMorgan

9:47 pm on Nov 30, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The query string pattern in your RewriteCond matches "ref=", not "cfg[prePath]=", so you will need to modify it. To use the square brackets as literals in a pattern, you must escape them.

The post-slash problem is not related, but if you allow it to happen, then it very well might happen.
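
One way to stop it from happening, as an untested sketch: it assumes an .htaccess file in the document root, that page URLs end in ".html", and that www.example.com stands in for the canonical hostname.

```apache
# Redirect "/page.html/" (one or more trailing slashes after the
# file name) to "/page.html". Any query string is kept as-is.
RewriteRule ^(.+\.html)/+$ http://www.example.com/$1 [R=301,L]
```

Alternatively, the core directive "AcceptPathInfo Off" makes the server answer such requests with a 404 instead of serving the page.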

Jim

icedowl

11:16 pm on Nov 30, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So, I would replace "ref=" with "\/?cfg\[prePath\]=" ? Is this correct?

icedowl

12:43 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Guess not as that didn't work either.

jdMorgan

12:52 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Replace "ref=" with "cfg\[prePath\]="

Jim

icedowl

1:00 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tried that too with no luck. It still gives me:
[mysite.com...]

Where mysite is my own site and badsite is the site trying to do this crap.

jdMorgan

2:11 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, that's the first time there was any indication that the brackets were encoded...

That takes a different method:


RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?]*\?cfg\%5bprePath\%5d= [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

Of course, if you don't use a query name of "cfg=" on your own site, then you don't need to check the whole query string, and could just use:

RewriteCond %{QUERY_STRING} ^cfg [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
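
If the brackets may arrive either literally or percent-encoded, a single condition can cover both forms. This is a sketch under the same assumptions as the rules above (an .htaccess file in the document root, www.example.com as the canonical hostname), not something tested against the requests in this thread:

```apache
# Match "cfg[prePath]=" at the start of the query string, with the
# brackets either literal ("[" and "]") or encoded ("%5b" and "%5d").
# [NC] makes "%5B" and "%5D" match as well.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?\ ]*\?cfg(\[|%5b)prePath(\]|%5d)= [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

THE_REQUEST holds the request line exactly as the client sent it, so the pattern sees the encoded characters unchanged.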

[added] Correction as noted below. [/added]

Jim

[edited by: jdMorgan at 12:12 pm (utc) on Dec. 1, 2009]

icedowl

3:31 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Jim. The first version works, the second did not. The only thing now that isn't the way I want it to be is that I now get "www.example.com/index.php" instead of "www.example.com/".

g1smd

10:00 am on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> Well, that's the first time there was any indication that the brackets were encoded...

Or that there is an "/index.html" within the URL path part.

Your initial question was for the URL path part...

/?cfg[prePath]=http://www.example.net/01.gif

After the solution was given you changed the question to...

/index.html/?cfg%5bprePath%5d=http://www.badsite.net/01.gif

All detail for the pattern of the URLs to match is important.

jdMorgan

12:20 pm on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> ... the second did not

My mistake. For some reason, I was compelled to add the "=" directly after "cfg", which of course didn't match an encoded "[" character...

> I now get "www.example.com/index.php" instead of "www.example.com/".

That indicates that another rule or directive is firing before this one and adding "index.php" to the path, which the "remove cfg query" redirect then 'exposes' to the client as a URL.

My guess would be an internal rewrite that may also be missing the generally-recommended [L] flag.

All external redirect rules must precede any internal rewrite rules, and both of these redirect and rewrite rule groups should be ordered from most-specific patterns and conditions to least-specific.
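
As a rough illustration of that ordering, here is a skeleton .htaccess. The index redirect and the final internal rewrite are hypothetical stand-ins for whatever rules the site really uses; only the "cfg" redirect comes from this thread:

```apache
RewriteEngine On

# --- External redirects first, most specific first ---

# Remove the unwanted "cfg..." query string.
RewriteCond %{QUERY_STRING} ^cfg [NC]
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

# Hypothetical: redirect client requests for an index file to the
# bare directory URL. Checking THE_REQUEST limits this to what the
# client actually asked for, so internal rewrites do not trigger it.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/?\ ]+/)*index\.(html?|php)[?\ ]
RewriteRule ^(([^/]+/)*)index\.(html?|php)$ http://www.example.com/$1 [R=301,L]

# --- Internal rewrites last ---

# Hypothetical: serve the home page from index.php without
# changing the URL the client sees.
RewriteRule ^$ index.php [L]
```

With the internal rewrite placed last, the redirects above it test the path as requested, so "index.php" never leaks into a redirect target.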

Jim

icedowl

1:26 pm on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks. I'm at work now (graveyard), but when I get home and catch a bit of sleep I'll take another look. I have a hunch that I plugged this in after my code that rewrites "/index.html" into just "/" and it's probably just a matter of relocating it.

icedowl

5:37 pm on Dec 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm home now and it is fixed. My above hunch was correct. Relocation did the trick. Thanks again.