Apache 2.2 logs to include # in web page access

         

delar

3:07 pm on Mar 22, 2012 (gmt 0)

10+ Year Member



I cannot seem to find an answer to this through Google, so I'm hoping I can get some help here. I have a requirement to track web page accesses that include a # (pound mark) fragment through the Apache 2.2 logs.

An example is:

www.test.com/test.cfm#SomeID

It has been requested that I compile a report using the Apache logs to identify web page access statistics. The logs currently record the hit on test.cfm but truncate the #SomeID information, and I've been unable to find an Apache log directive to include it.

Anyone have an idea on how to include the #SomeID information in the Apache Logs?

wilderness

3:16 pm on Mar 22, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Don't believe it's possible.

I've 200 pages within two directories which include multiple specific section bookmarks on each page, and the # references do not show in visitor logs or SE referrals.

There may be a PHP solution; I suggest you inquire in that forum.

lucy24

8:18 pm on Mar 22, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The hash mark never reaches the logs as part of the request, so there's nothing you can do to track it there. This question has previously come up when talking about search engines sending fragment links. Apparently some browsers include the fragment in the referer, so it comes through in that field of the logs. But if the browser doesn't send it, it's gone.

:: detour for spot-check of raw logs ::

Ugh. Massive contamination, because the # is included in the UA string of the SogouSpider. The one place I often find a hash mark in the referer is in message-board links. But if there's any unifying feature here, it's more than I can tell:
p=815160#815160" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB7.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; MDDR; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"

p=815427#815427" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618)"

p=815484#815484" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

p=823234#823234" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET CLR 2.0.50727; .NET4.0E)"


Someone said it was Safari that includes fragments in search-engine referers. It may in fact be WebKit:
search?site=images&{snip, snip}maction=&q=farmer%27s+wife#p=6" "Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/7534.48.3"

"http://www.example.com/ebooks/title/FullTitle.html#musictext" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
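For anyone wanting to repeat that spot-check without the UA-string contamination, here is a minimal sketch in Python. It assumes the logs use Apache's standard "combined" format and looks for a # only in the referer field. The function name, the sample IP addresses, timestamps, and the "ExampleSpider" agent are made up for illustration; only the referer and Chrome UA come from the excerpt above.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def referer_fragment(line: str) -> Optional[str]:
    """Return the fragment found in the referer field, or None."""
    match = COMBINED.match(line.strip())
    if match is None:
        return None  # not a combined-format line
    # Only the referer is inspected, so a # in the UA string is ignored.
    fragment = urlsplit(match.group("referer")).fragment
    return fragment or None


# Hypothetical log lines (IPs, times, and sizes are invented).
sample = (
    '192.0.2.1 - - [22/Mar/2012:20:18:00 +0000] "GET /test.cfm HTTP/1.1" 200 5120 '
    '"http://www.example.com/ebooks/title/FullTitle.html#musictext" '
    '"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.7 (KHTML, like Gecko) '
    'Chrome/16.0.912.77 Safari/535.7"'
)
spider = (
    '192.0.2.2 - - [22/Mar/2012:20:18:01 +0000] "GET / HTTP/1.1" 200 100 '
    '"-" "ExampleSpider/1.0 (+http://spider.example/help#07)"'
)

print(referer_fragment(sample))  # musictext
print(referer_fragment(spider))  # None
```

This only reports what a browser chose to put in the referer; it does not recover the fragment of the URL that was actually requested.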

g1smd

8:35 pm on Mar 22, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The data after the # mark is an in-page named anchor and is processed entirely by the browser.
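A rough illustration in Python of how the URL from the original question splits up. The `http://` scheme is added here as an assumption so the standard-library parser can recognise the host, and `request_target` is a hypothetical helper name. It mimics what a browser puts on the HTTP request line, which is the part Apache gets to log.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path plus query, i.e. what goes on the request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately left out: the browser keeps it.
    return target


url = "http://www.test.com/test.cfm#SomeID"
print(request_target(url))     # /test.cfm
print(urlsplit(url).fragment)  # SomeID
```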

That data isn't even sent to the server, so there's no way to track it unless you add some JavaScript to the page to detect the in-page click and some AJAX to send that information to the server in a separate request so it can be recorded.
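The reporting side of that idea could look something like the Python sketch below. It assumes the page script (not shown) requests a made-up beacon URL such as `/frag-beacon?page=/test.cfm&frag=SomeID`. Because that is an ordinary request with a query string, Apache records it in the access log like any other hit. The path `/frag-beacon` and the parameter names `page` and `frag` are hypothetical, not anything built into Apache.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

BEACON_PATH = "/frag-beacon"  # hypothetical endpoint, chosen for this sketch


def count_fragments(request_lines):
    """Count (page, fragment) pairs from logged beacon request lines."""
    counts = Counter()
    for request in request_lines:
        pieces = request.split()
        if len(pieces) != 3:
            continue  # malformed request line
        target = urlsplit(pieces[1])
        if target.path != BEACON_PATH:
            continue  # ordinary page hit, not a beacon
        query = parse_qs(target.query)  # also decodes %2F and similar
        page = query.get("page", [""])[0]
        frag = query.get("frag", [""])[0]
        if page and frag:
            counts[(page, frag)] += 1
    return counts


# Request lines as they would appear in the "%r" field of the log.
requests = [
    "GET /test.cfm HTTP/1.1",
    "GET /frag-beacon?page=%2Ftest.cfm&frag=SomeID HTTP/1.1",
    "GET /frag-beacon?page=%2Ftest.cfm&frag=SomeID HTTP/1.1",
]
counts = count_fragments(requests)
print(counts[("/test.cfm", "SomeID")])  # 2
```

Visitors with scripting disabled would never send the beacon, so counts gathered this way are a lower bound.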

delar

8:44 pm on Mar 22, 2012 (gmt 0)

10+ Year Member



Thank you for the information.

phranque

1:14 am on Mar 23, 2012 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Strictly speaking, "named anchors" are deprecated as of XHTML 1 and considered obsolete in HTML5, and the part of the URL after the # should be referred to as a "fragment identifier".

4.10. The elements with 'id' and 'name' attributes:
http://www.w3.org/TR/xhtml1/#h-4.10

http://dev.w3.org/html5/markup/a.html [dev.w3.org]:
The name attribute on the a element is obsolete. Consider putting an id attribute on the nearest container instead.


Fragment Identifiers -- Axioms of Web architecture:
http://www.w3.org/DesignIssues/Fragment.html [w3.org]

URIs: Fragment identifiers:
http://www.w3.org/Addressing/URL/4_2_Fragments.html [w3.org]

wilderness

1:21 am on Mar 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Many thanks, phranque.

Despite being deprecated, they still function fine and as originally intended (except for this particular thread's inquiry), as does much other deprecated HTML. <pre> is another such example.