
Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
++liker.Profile_URL appended to request URL
jamesMP




msg:4602119
 2:09 pm on Aug 15, 2013 (gmt 0)

I noticed in my server logs that I get a lot of 404s from requests with ++liker.Profile_URL appended to the URL.

Firstly, can anyone identify the source? (It seems like a spambot.) Secondly, what's the best way of handling it: continue to serve a 404, or block completely?

[subharanjan.in ] recommends redirecting to the homepage, but it doesn't feel like a good solution.

Checking the IP addresses, the requests mostly come from a hosting company in Canada, and an ISP in the US.

Mods: Not sure if posting IP address/log entries is against forum rules?

 

lucy24




msg:4602213
 8:39 pm on Aug 15, 2013 (gmt 0)

"recommends redirecting to the homepage, but it doesn't feel like a good solution."

Good instincts. Redirecting to the home page is never a good solution in the first place-- and why on earth would anyone do it with an unwanted robot? Answer: because a 301 response uses fewer server resources than anything else, assuming the visitor doesn't bother to follow the redirect.

If you like, you can block the requests by issuing a flat [F] in mod_rewrite. But that's mainly for your own emotional satisfaction. A 404 gets rid of them just as effectively.

Incidentally, the quoted RewriteCond is silly.
RewriteCond %{REQUEST_URI} ^(.*)\+\+liker\.profile\_URL\+\+* [NC]
Since nothing is being captured for reuse, you can simply omit the
^(.*)
part.
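
For illustration, here is a minimal sketch of the trimmed condition paired with the flat [F] mentioned above. The RewriteRule line is an assumption (the quoted source rule is not shown in this thread), and the rest of the pattern is kept exactly as quoted, including the unnecessary backslash before the underscore.

```apache
RewriteEngine On
# Same condition as quoted, minus the leading ^(.*): an unanchored
# pattern already matches the suffix anywhere in the request path.
RewriteCond %{REQUEST_URI} \+\+liker\.profile\_URL\+\+* [NC]
# Assumed rule: "-" means no substitution, [F] returns 403 Forbidden.
RewriteRule ^ - [F]
```

Dropping the capture group changes nothing about which requests match; it only removes a group that is never reused.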

jamesMP




msg:4603774
 7:01 am on Aug 21, 2013 (gmt 0)

Thanks, Lucy - I'll leave things as-is.

mantri




msg:4604067
 9:31 am on Aug 22, 2013 (gmt 0)

Hi Lucy and James,
Thanks for reviewing the Rewrite rules. I was not sure about this.
Can you please tell me why it isn't a good idea to redirect to the homepage? I really don't know.

jamesMP




msg:4604072
 10:05 am on Aug 22, 2013 (gmt 0)

Because it's a non-existent URL. 301-ing to the home page tells the bot (or Google, or anyone else) that the requested page does exist and can be permanently found at /index.php (or whatever).

A 404 indicates that the file doesn't exist and is the most appropriate response from the server.
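
A non-existent URL normally gets a 404 with no configuration at all. If something on the site would otherwise answer these requests (a front controller, say), one possible way to force the 404 is mod_alias, sketched below. The pattern is an assumption based on the log entries described in this thread.

```apache
# Sketch, assuming mod_alias is loaded. With a non-3xx status the
# target URL argument must be omitted. (?i) makes the match
# case-insensitive.
RedirectMatch 404 (?i)\+\+liker\.profile_URL
```

Unlike a 301 to the home page, this makes no claim that the requested page exists somewhere else.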

lucy24




msg:4604073
 10:07 am on Aug 22, 2013 (gmt 0)

It isn't an Apache question, it's an SEO question. So you will get a more in-depth answer if you post in, say, one of the google-related subforums.

Quickies:

#1 The World's Leading Search Engine doesn't care for mass redirects to a single page; it tends to suspect they are "Soft 404s".

#2 Human visitors probably don't care for them. Has anyone ever done a poll with published results? This site's forums don't "do" polls, so I can only speak for myself and say it drives me stark staring bonkers when I end up on the home page if that wasn't what I asked for. (Q.: How do you benefit from annoying your visitors?)

mantri




msg:4604079
 10:25 am on Aug 22, 2013 (gmt 0)

Aren't these 404s (there are a great many of them) created by spambots harmful to the website?

Thanks, Lucy. Sure, I should post in the SEO-related subforums.

mantri




msg:4604081
 10:30 am on Aug 22, 2013 (gmt 0)

"it drives me stark staring bonkers when I end up on the home page if that wasn't what I asked for."

That's what I wanted to do with these spambots, not with real users. Real users don't end up on this type of URL with nonsense text appended.

lucy24




msg:4604091
 11:12 am on Aug 22, 2013 (gmt 0)

Well, your spambots don't end up anywhere. Unlike humans with browsers, they don't automatically follow the redirect; they just make a note of it for (maybe) later.

When you give a robot a 403 or 404, your server sends out the 403/404 page-- but the robot may choose not to read beyond the response header.

Some robots do seem to go away faster if you do something evil like redirecting to 127.0.0.1 or back to the robot's own originating IP. And a redirect does use slightly fewer server resources. But if you're spending your own time working out the best way to get rid of each individual robot, you've either got too much time on your hands or you've got a serious infestation and need to call an exterminator :)
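
For anyone curious what the "evil" variant might look like, here is a hedged sketch using mod_rewrite. The condition pattern is an assumption based on the requests described earlier in the thread, and whether a given robot follows the redirect at all is not guaranteed.

```apache
RewriteEngine On
# Assumed pattern for the spam suffix, matched case-insensitively.
RewriteCond %{REQUEST_URI} \+\+liker\.profile_URL [NC]
# Send the client a permanent redirect to its own loopback address.
RewriteRule ^ http://127.0.0.1/ [R=301,L]
```

A plain 404 or [F] does the same job for practical purposes, so this is more a curiosity than a recommendation.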

"Aren't these 404s (too many in number) created by spambots harmful for the website?"

Any request uses server resources. But it would be worse if they landed on actual pages. Even the html by itself is almost certainly bigger than the 404 page-- and once they're on a page, they've got all of its links to suggest more ways to bother you.

mantri




msg:4604100
 11:41 am on Aug 22, 2013 (gmt 0)

Thank you very much, Lucy. You are a genius!


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved