Question on log entries


Leweezsky

7:17 pm on Feb 22, 2012 (gmt 0)

10+ Year Member



Not sure if this is the right place to post this, but here goes: I need help understanding some log entries on my website.

At least once a day, several entries appear coming from this referrer site: [checkprivacy.or.kr:6600...]

The entries that I see in my logs are like this, for example:

GET /myfile.htm531214/122_shaw_drive/
GET /myfile.htmhp t=531215p;goto=newpost
GET /myfile.htm-parker-531225.html
GET /myfile.htmtml531206

"myfile.htm" is a real file on my server, and it's always the same file that is being "taken".

For the moment they are receiving a 404, but do I need to worry about these entries? I've checked "myfile.htm" and nothing seems to be wrong with it.

What could be the meaning of these entries?
Thank you so much for your help!

lucy24

10:01 pm on Feb 22, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, what are the chances that a Korean-language site has suddenly taken an interest in your front page? Zero, would you say?

In a way there's no difference between a 404 and a 403. Either they don't get in because the page doesn't exist, or they don't get in because you have locked the door. But a 403 is definitely more satisfying :)

You could lock out everyone whose referer includes the element
\.kr/

Or you could lock them out by IP if they are from the same range(s).
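
As a minimal sketch of both options, assuming an Apache 2.4 server with mod_rewrite enabled and .htaccess overrides allowed (none of which the thread confirms): the first rule returns a 403 to any request whose referer host ends in .kr, and the second block refuses a placeholder address range. The pattern allows an optional port because the referer quoted above includes :6600, which a bare \.kr/ would not match. 192.0.2.0/24 is a documentation range standing in for whatever range shows up in your own logs.

```apache
# Assumption: Apache 2.4, mod_rewrite available, rules placed in .htaccess.

# Option 1: refuse requests whose Referer host ends in .kr
# (optional :port, then a slash or the end of the string).
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://[^/]+\.kr(:[0-9]+)?(/|$) [NC]
RewriteRule ^ - [F]

# Option 2: refuse a specific address range.
# 192.0.2.0/24 is a placeholder, not an address taken from this thread.
<RequireAll>
    Require all granted
    Require not ip 192.0.2.0/24
</RequireAll>
```

On Apache 2.2 the second option was written with the Order, Allow and Deny directives rather than Require.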

Forged referers seem to be the hot thing with robots these days. They can't fake their IP, but coming in with a referer makes them less noticeable. Most of the time the referer is a real site-- but it has nothing to do with you. They just pick names out of a hat.

A variation is to give an auto-referer, where your page is listed as its own referer. If your site doesn't use that type of linking, it should be possible to block all auto-referers. (I say "should" because so far I can only get it to work with individual named pages, not for auto-referers in general.)
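
One way the general case is sometimes handled is to let the regular expression compare the two values itself. This is a sketch only, assuming Apache with mod_rewrite, and it catches just the simplest case: a referer that is exactly the requested URL with no query string. The :: separator is arbitrary.

```apache
# Assumption: mod_rewrite is enabled and the site never links a page to itself.
RewriteEngine On

# Build the string "host/path::referer", then require the referer part to be
# http(s):// followed by the same host/path. \1 is a regex back-reference to
# the first group, i.e. the requested host and path.
RewriteCond %{HTTP_HOST}%{REQUEST_URI}::%{HTTP_REFERER} ^(.+)::https?://\1$
RewriteRule ^ - [F]
```

Requests with no referer at all are untouched, since the part after :: is then empty and the pattern cannot match.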

Leweezsky

1:27 pm on Feb 24, 2012 (gmt 0)

10+ Year Member



Thanks for your reply!
They have a 403 code now anyway...

Regards!

Leweezsky

1:16 pm on Mar 8, 2012 (gmt 0)

10+ Year Member



OK, I really need to know what is going on with one of my website's pages. For several weeks now I have been getting entries in my logs, mostly from Ukrainian IPs, constantly fetching one of my pages with "bogus" referers, mostly from ".ru" sites.

They are getting a 403 for the moment, but I am very worried about my page being used in some way and I really want to protect my site. They mainly get a 403 because I have already blocked most IPs from Ukraine and Russia.

Yesterday, I even got an entry like this, showing Google bot:

GET /myfile.htm - 80 - 193.106.136.37 HTTP/1.0 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - [uristportal.ru...] 403

The referer is always different, and the Ukrainian IPs are often different. I can get 5 to 10 entries per day, and it's always the same file (shown here as "myfile.htm")!

Does anybody have an idea of what is going on with this file? Many thanks in advance for your help!

tangor

1:34 pm on Mar 8, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Most times these kinds of things are log referer (sic) spam. Block the referer, the IP, or the UA (if the requests share one; usually they don't). If your logs are not visible to the public side of the web, just 403 the entries and move on to other business.

Add \.ru to the blocked referers in your .htaccess; you might include ro, cn, etc. as well.
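
A sketch of what that could look like, assuming Apache 2.4 with mod_setenvif and a .htaccess file that is allowed to use these directives. The variable name bad_referer is made up for illustration, and the TLD list is just the three mentioned here.

```apache
# Flag requests whose Referer host ends in one of the listed TLDs.
# bad_referer is a hypothetical variable name.
SetEnvIfNoCase Referer "^https?://[^/]+\.(ru|ro|cn)(:[0-9]+)?(/|$)" bad_referer

# Answer flagged requests with a 403; everyone else is let in.
<RequireAll>
    Require all granted
    Require not env bad_referer
</RequireAll>
```

Anchoring the pattern to the host part keeps it from firing on a referer that merely contains ".ru" somewhere in its path.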

Leweezsky

1:20 pm on Mar 17, 2012 (gmt 0)

10+ Year Member



Thank you for your reply. It seems that's exactly what it is. I've blocked most of the referers and they are all getting a 403.

Do they ever "give up" on the link at some point?
Thanks!

enigma1

11:25 am on Mar 19, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



These spambots don't care what HTTP status you return. Entries will still be logged unless you specify rules not to log them. Instead of complicating matters just ensure the logs aren't accessible from the outside.

As a side note, the fact that you posted log entries above with hard links to some dubious sites, and the mods didn't notice, is exactly what spammers would want.

Leweezsky

12:03 pm on Mar 19, 2012 (gmt 0)

10+ Year Member



It would be great to learn what to do instead of what not to do... I've searched for ways to block them but couldn't find anything on the web on that subject. What is the meaning of "instead of complicating matters just ensure the logs aren't accessible from the outside"? Not sure I follow.

What would be an example of rules that keep them from being logged? Any suggestions for a good place on the web to learn about this?

Sorry for my stupidity on the matter but everyone has to start somewhere...

Thank you for any suggestions that help me understand.

enigma1

12:21 pm on Mar 19, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It means your access_log or error_log files should not be accessible to anyone from the outside. If they are, spammers can put links to your log files on other pages, and search engines will crawl and index them. Or search engines will find them by crawling your domain. These log files are sometimes processed by various programs, which in turn store refined copies inside your webspace that are accessible via HTTP requests.
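
If the logs do sit inside the webspace, one way to close them off is to deny HTTP access to them. This is a minimal sketch, assuming Apache 2.4 and files whose names contain access_log or error_log as mentioned above; a report directory written by a log analyzer would need its own rule.

```apache
# Assumption: Apache 2.4 syntax. On 2.2 the equivalent was
# "Order allow,deny" followed by "Deny from all".
# Refuse HTTP requests for any file whose name contains access_log or error_log.
<FilesMatch "(access|error)_log">
    Require all denied
</FilesMatch>
```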

Depending on how much control you have over the webserver (shared, dedicated, etc.), you can add rules to log requests conditionally. So if a request matches a set of referrers, IPs, etc., you set an environment variable and skip logging.
[httpd.apache.org...]
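
In rough form, that looks like the sketch below. CustomLog is only valid in the main server or virtual host configuration, not in .htaccess, so this assumes that level of control. The variable name dontlog, the log path and the referer pattern are illustrative, and the combined format nickname is assumed to be defined by a LogFormat line elsewhere in the configuration, as it is in the stock config.

```apache
# Mark requests whose Referer host ends in .ru or .kr.
# dontlog is an arbitrary variable name.
SetEnvIfNoCase Referer "^https?://[^/]+\.(ru|kr)(:[0-9]+)?(/|$)" dontlog

# Write the access log as usual, but skip any request carrying the mark.
CustomLog "logs/access_log" combined env=!dontlog
```

The requests are still answered (or refused) as before; only the log line is suppressed.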