Sitemaps, Meta Data, and robots.txt Forum

Unusual log entries
cyberdyne
5:04 pm on Mar 8, 2008 (gmt 0)

Hi,
I've just come across these entries in my raw logs. I intend to block them all via my .htaccess, as they all seem, strangely, to be looking for a file that doesn't exist and never has. I'm assuming their intentions aren't honourable.

Can anyone shed any light on what these visitors' intentions might have been in requesting an /admin.php file in the site root?

Thank you.

===BEGIN===
- - [08/Mar/2008:15:38:23 +0000] "GET //admin.php?include_path=p4n93r4nk0d0k/yhe.txt? HTTP/1.1" 302 419 "-" "libwww-perl/5.808"
- - [08/Mar/2008:15:48:46 +0000] "GET /html/some-dir/index.php?entry=60//admin.php?include_path=/id.txt? HTTP/1.1" 301 505 "-" "libwww-perl/5.808"
- - [08/Mar/2008:15:48:47 +0000] "GET /index.php?entry=60//admin.php?include_path=/id.txt? HTTP/1.1" 200 12693 "-" "libwww-perl/5.808"
===END===
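
For reference, a minimal .htaccess sketch (assuming Apache with mod_rewrite enabled) that denies any request whose query string carries an include_path= parameter, as in the entries above:

# Deny requests that try to inject an include_path= query parameter
RewriteEngine On
RewriteCond %{QUERY_STRING} include_path= [NC]
RewriteRule .* - [F]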

[edited by: goodroi at 2:17 am (utc) on May 10, 2008]
[edit reason] Please no specific URLs [/edit]

 

Staffa
5:37 pm on Mar 8, 2008 (gmt 0)

I banned libwww-perl a long time ago. It's an automated thing, and automated 'things' are not welcome at my sites.

cyberdyne
5:42 pm on Mar 8, 2008 (gmt 0)

How do you go about banning libwww-perl, please?
Using .htaccess, I presume?

Thank you

jdMorgan
6:13 pm on Mar 8, 2008 (gmt 0)

There are two ways to do it in .htaccess: using mod_setenvif and mod_access, or using mod_rewrite:

# Flag unwanted user-agents, but mark robots.txt and the custom 403 page as always permitted
SetEnvIfNoCase User-Agent libwww-perl getout
SetEnvIfNoCase User-Agent "Indy Library" getout
SetEnvIf Request_URI (robots\.txt|custom-403-error-document\.html)$ permit
#
# (Note: only one unconditional Order directive per .htaccess file)
Order Deny,Allow
#
Allow from env=permit
Deny from env=getout

or

# Enable the rewrite engine (if not already enabled elsewhere in this scope)
RewriteEngine On
#
# Rule to allow serving robots.txt, the custom 403 error page, and the bad-bot script to bad-bots
RewriteRule (robots\.txt|custom-403-error-document\.html|bad-bot\.pl)$ - [L]
#
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC]
RewriteRule .* - [F]

In both cases, certain resources are allowed to be fetched even by unwelcome visitors. The custom 403 error document must stay reachable in order to avoid a server loop when the error page itself is requested. Robots.txt should be accessible so that bad-bots are warned that they are not welcome, and in the mod_rewrite version both robots.txt and the bad-bot script (if you use it) must remain accessible for the bad-bot script to function.
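
If it isn't already set elsewhere in your configuration, the custom 403 page named in the rules above can be wired up with a matching ErrorDocument directive, for example:

# Point the 403 handler at the always-permitted error page, so the error
# response itself is never blocked and no loop can occur
ErrorDocument 403 /custom-403-error-document.html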

Robots.txt must be accessible even if you don't use a bad-bot script: some robots (even good but undesired ones) will interpret an inaccessible robots.txt file as carte blanche to spider the entire site, and you want to prevent a flood of denied requests from them by asking them to go away nicely using robots.txt.
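
For instance, a robots.txt entry asking a well-behaved but undesired robot to stay away entirely could look like this (the user-agent name is only a placeholder):

# Placeholder name; substitute the robot you want to turn away
User-agent: SomeUndesiredBot
Disallow: /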


Jim

cyberdyne
6:17 pm on Mar 8, 2008 (gmt 0)

Excellent, some great info there.
Thank you very much, Jim.
