Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Referrer same as request URL
Are these invalid log entries?
dalyea




msg:4389916
 5:04 pm on Nov 22, 2011 (gmt 0)

I was analyzing my log file today when I found 750 requests for a single page. Not actually a page, but a handler script that logs and redirects the user off my site. Of those 750 requests, 748 had a referrer string identical to the requested URL. Example:

46.105.144.64 - - [21/Nov/2011:22:53:50 -0500] "GET /cgi-bin/item.cgi?s=41&t=3&id=1617234 HTTP/1.0" 302 483 "http://www.mysite.com/cgi-bin/item.cgi?s=41&t=3&id=1617234" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) Opera 6.01 [en]"

When I look at all the item.cgi requests, they generally have a referrer that is a page on my site, or sometimes a Yahoo email account or the like.

Is it valid to have a page request like this where the URL is the same as the referrer?
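For anyone who wants to pull these entries out of a log, here is a minimal Python sketch. It assumes the log is in Apache's "combined" format, as the example line above appears to be. The names LOG_RE, parse_line and is_self_referer are made up for illustration, and the comparison deliberately ignores the scheme and host of the referrer, which is a simplifying assumption.

```python
import re
from urllib.parse import urlsplit

# Assumed layout (Apache "combined" format):
# host ident user [time] "request line" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)(?: (?P<proto>[^"]*))?" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def is_self_referer(entry):
    """True when the referrer's path and query equal the requested path and query.

    Scheme and host are ignored (an assumption made for this sketch),
    and a blank or "-" referrer never counts as a match.
    """
    ref = entry["referer"]
    if ref in ("", "-"):
        return False
    parts = urlsplit(ref)
    ref_target = parts.path or "/"
    if parts.query:
        ref_target += "?" + parts.query
    return ref_target == entry["target"]
```

Feeding each line of the access log through parse_line and keeping the entries for which is_self_referer returns True should reproduce the kind of count described above.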

 

wilderness




msg:4389954
 6:10 pm on Nov 22, 2011 (gmt 0)

There's a relatively recent thread on this in the SSID forum.

lucy24




msg:4390065
 11:38 pm on Nov 22, 2011 (gmt 0)

For a given definition of "valid". I kinda think robots do it to get around rules that begin by looking at requests with blank referers. If you're going to forge a referer, may as well just grab the name of whatever you're asking for. My Ukrainians have started doing this; it only took them a few months to figure out that any referer in their geographic region gets locked out at the gate. Darn. I really thought they were gone.

Should be a short-lived robot fad, though. Unlike blank referers, it would never happen with bona fide humans. Unless, ahem, you goofed in your own page code.

dalyea




msg:4390111
 2:45 am on Nov 23, 2011 (gmt 0)

@wilderness, you wouldn't happen to have a link to that thread, would you? I didn't find it right away.

@lucy24, for sure it's not my code: the redirect script is plain and simple. Thanks for the info. I didn't know that, but what you're saying sure makes sense.

The interesting thing is this: (1) there are 120,000 such item.cgi links that could be formulated, (2) these false hits were arriving one at a time, every few minutes or even tens of minutes or more apart, and (3) the 750 such hits I saw came from about 400 unique IP addresses! The biggest single-IP count of the 750 was 24. That was what was confounding, because I would have expected a huge bunch from one IP or a bunch of hits every few seconds. And it was the same URL over and over, not one after another in numerical sequence. Strange.
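The per-IP and timing figures above can be tallied with something like the following Python sketch. It assumes the IP addresses and the bracketed timestamps have already been extracted from the suspicious log entries. The function names are hypothetical, and the timestamp format string assumes the standard Apache layout with English month abbreviations.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def summarize_ips(ips):
    """Total hits, number of distinct IPs, and the busiest single IP."""
    counts = Counter(ips)
    top_ip, top_hits = counts.most_common(1)[0] if counts else (None, 0)
    return {
        "total": sum(counts.values()),
        "unique": len(counts),
        "top_ip": top_ip,
        "top_hits": top_hits,
    }


def median_gap_seconds(timestamps):
    """Median number of seconds between consecutive hits, or None if fewer than two.

    Timestamps are expected in the Apache form, e.g. "21/Nov/2011:22:53:50 -0500".
    """
    times = sorted(
        datetime.strptime(t, "%d/%b/%Y:%H:%M:%S %z") for t in timestamps
    )
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return median(gaps) if gaps else None
```

A large "unique" value next to a small "top_hits" value, together with a median gap measured in minutes, would match the slow, distributed pattern described here.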

lucy24




msg:4390128
 4:42 am on Nov 23, 2011 (gmt 0)

Look at the UA. My latest batch had to be a botnet, because they were all different IPs, but they all used, er, forged the same UA. I looked up all the IPs anyway. Found another half-dozen ranges from a region I block routinely, so it was worth it.
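One way to check for the "many IPs, one UA" pattern is sketched below in Python. It assumes (ip, user_agent) pairs have already been pulled from the suspicious entries. The function name is hypothetical. A single UA string shared by a large number of distinct IPs is only a hint of a botnet, not proof, since popular browsers share UA strings too.

```python
from collections import defaultdict


def ips_per_user_agent(entries):
    """Count distinct IPs per user-agent string, most widely shared first.

    `entries` is an iterable of (ip, user_agent) pairs.
    """
    seen = defaultdict(set)
    for ip, ua in entries:
        seen[ua].add(ip)
    return sorted(
        ((ua, len(ips)) for ua, ips in seen.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

The IPs behind the top entry are then the ones worth looking up, as described above.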

Is there a distinguishing feature to the URL they wanted? With me it's always my fattest file. Actually second-fattest, but the one that's slightly bigger is in a directory that robots seem to dislike. I don't mean statistically plumper, I mean that the HTML is bigger by an order of magnitude than most of my pages. (It's a MiSTing, if you must know.) So there's something they expect to find based on filesize.

topr8




msg:4390176
 8:38 am on Nov 23, 2011 (gmt 0)

>>Unlike blank referers, it would never happen with bona fide humans. Unless, ahem, you goofed in your own page code.

how about say if you are on the home page and you click the homepage logo/link, or if you have a navigation system where all the categories are shown down the left side, including the category page the user is actually on and they click that link.

... wouldn't that reload the page and send a self-referencing referrer header?

g1smd




msg:4390177
 8:39 am on Nov 23, 2011 (gmt 0)

It would.

Pfui




msg:4391322
 6:16 pm on Nov 26, 2011 (gmt 0)

Thoughts:

"Referrer equals request url" bot
Identify a bot where the referrer repeats the requested url
[webmasterworld.com...]

URI=REF is definitely not a short-lived fad.

I've observed literally tens of thousands of URI=REF hits, and only very rarely does a real user click a same-page, directory-esque link. (And that one link appears to be an image-map quirk, because it only happens in certain browsers.)

lucy24




msg:4391381
 12:45 am on Nov 27, 2011 (gmt 0)

>>how about say if you are on the home page and you click the homepage logo/link, or if you have a navigation system where all the categories are shown down the left side, including the category page the user is actually on and they click that link.

I count that as a goof, because you don't need to link to the page you're already on. My current version shows the name of the page but instead of being an active link it's in boldface. (I do it manually, but I can't imagine it would be very hard to have php do the same thing.)

As a user it tends to make me feel stupid when I click on a link and all I get is the current page refreshing. Oh. Oops. Guess that means I was there already.
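The boldface-instead-of-link idea could be generated rather than done by hand. The post mentions php; the sketch below uses Python purely for illustration, and the function name render_nav and the (path, title) structure are assumptions, not anything from an actual site.

```python
from html import escape


def render_nav(pages, current_path):
    """Build a navigation list in which the current page is bold text, not a link.

    `pages` is a list of (path, title) pairs.
    """
    items = []
    for path, title in pages:
        if path == current_path:
            # Current page: show the name only, so clicking cannot reload it.
            items.append(f"<li><b>{escape(title)}</b></li>")
        else:
            items.append(
                f'<li><a href="{escape(path, quote=True)}">{escape(title)}</a></li>'
            )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

With navigation built this way, a human visitor has no same-page link to click, so any request whose referrer equals its own URL is more likely to be a robot.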


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved