
Perl Server Side CGI Scripting Forum

    
HTTP Status Headers
Would you do this differently?
DrDoc




msg:4591272
 9:17 pm on Jul 8, 2013 (gmt 0)

I have a number of advertising campaigns which I run through proprietary impression/click tracking software.

  1. Someone clicks the link
  2. I record the click
  3. Redirect to offer page
  4. Record a visit


Well, lately I have started experiencing a large number of dropped clicks. In other words -- I see a recorded click (#2 above), but not a corresponding visit (#4 above).

In my code, I perform a simple redirect:

print "Status: 302 Found\n"; 
print "Location: $url\n\n";


Would you do this differently? I can't see how this would account for all the lost clicks, but I am totally open to suggestions.

 

phranque




msg:4591292
 10:26 pm on Jul 8, 2013 (gmt 0)

have you analyzed the traffic that doesn't follow the redirect?
are they typically requesting all resources on the click page such as favicon and other images like a human visitor?
my first guess would be a clickbot.

DrDoc




msg:4591293
 10:32 pm on Jul 8, 2013 (gmt 0)

Well, they simply do not appear to follow the redirect at all. I have set up a separate tracking table where I record the click, pass an ID along in the URL redirect, and immediately -- before any other processing -- record an entry in the table again.

In other words, if they follow the redirect and request the page _at all_, then a second entry would be recorded. That simply doesn't happen.

I doubt it's a clickbot, however, since I can vouch for the source of all initial traffic.
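
The two-entry check described above can be sketched roughly as follows. This is only an illustration: the ID lists and the `dropped_clicks` helper are hypothetical stand-ins for whatever the tracking table actually holds.

```perl
use strict;
use warnings;

# Hypothetical data: IDs logged at click time, and IDs logged again
# when the redirected request reaches the landing page.
my @click_ids = (1001, 1002, 1003, 1004, 1005);
my @visit_ids = (1001, 1003, 1004);

# Return the click IDs that never produced a matching visit entry.
sub dropped_clicks {
    my ($clicks, $visits) = @_;
    my %seen = map { $_ => 1 } @$visits;
    return grep { !$seen{$_} } @$clicks;
}

my @dropped = dropped_clicks(\@click_ids, \@visit_ids);
printf "%d of %d clicks dropped: %s\n",
    scalar @dropped, scalar @click_ids, join(', ', @dropped);
```

Grouping the dropped IDs by publisher or user agent afterwards may show whether the losses cluster around one source.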

phranque




msg:4591325
 11:52 pm on Jul 8, 2013 (gmt 0)

a Referer header can be spoofed.
i would be looking at "the visit" in the server access log file for clues.

Key_Master




msg:4591332
 12:01 am on Jul 9, 2013 (gmt 0)

Check error logs for clues, if any. If not, turn on warnings in script and keep an eye on logs.

$url is probably invalid, or the script or .htaccess is adding an extra header that conflicts with the redirect.
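
One way to rule out a bad $url is to validate it before printing the Location header. The sketch below is an assumption-laden example, not the script from this thread: `is_safe_redirect_url` is a hypothetical helper, and limiting the check to absolute http/https URLs is a choice made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return true only if $url looks safe to place
# in a Location header. Accepting only http/https is an assumption.
sub is_safe_redirect_url {
    my ($url) = @_;
    return 0 unless defined $url && length $url;
    # Any whitespace is rejected; CR or LF would end the header early.
    return 0 if $url =~ /\s/;
    # Require an absolute URL with a non-empty host.
    return 0 unless $url =~ m{\Ahttps?://[^/]+}i;
    return 1;
}

for my $url ('http://www.example.com/offer?id=42',
             "http://www.example.com/\nSet-Cookie: x=1",
             '/relative/path',
             '') {
    (my $shown = $url) =~ s/\n/\\n/g;   # make the newline visible when printing
    printf "%-45s %s\n", "'$shown'",
        is_safe_redirect_url($url) ? 'ok' : 'rejected';
}
```

Logging every rejected value together with the click ID would show whether invalid URLs line up with the lost clicks.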

Brett_Tabke




msg:4591363
 1:00 am on Jul 9, 2013 (gmt 0)

I'd agree with everything phranque and Key_Master had to say.

I would get a copy of Proxomitron and turn on header logging (there are some extensions for other browsers too). Look at what your server is actually kicking back in response to the GET for the original URL.
[proxomitron.info...]

DrDoc




msg:4591554
 5:34 pm on Jul 9, 2013 (gmt 0)

phranque:
i would be looking at "the visit" in the server access log file for clues

For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything that would be amiss:
HTTP/1.1 302 Found
Date: Tue, 09 Jul 2013 16:51:31 GMT
Server: Apache/2.2.3 (CentOS)
Set-Cookie: tpid=3751430_53174; path=/
Location: http://www.example.com
Cache-Control: max-age=0
Expires: Tue, 09 Jul 2013 16:51:31 GMT
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=2
Connection: Keep-Alive
Content-Type: text/html
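
A captured response head like the one above can also be checked mechanically. The following is a minimal sketch that assumes the headers were saved as plain text; `parse_response_head` is a hypothetical helper, and the sample is a shortened copy of the dump.

```perl
use strict;
use warnings;

# Parse a raw HTTP response head (status line plus header lines) into
# a status code and a hash keyed by lower-cased header name.
sub parse_response_head {
    my ($raw) = @_;
    my ($status_line, @lines) = split /\r?\n/, $raw;
    my ($code) = $status_line =~ m{\AHTTP/\d\.\d\s+(\d{3})};
    my %headers;
    for my $line (@lines) {
        last if $line eq '';    # a blank line ends the header block
        my ($name, $value) = $line =~ /\A([^:]+):\s*(.*)\z/ or next;
        $headers{lc $name} = $value;
    }
    return ($code, \%headers);
}

# Shortened copy of the response head posted above.
my $raw = <<'END';
HTTP/1.1 302 Found
Set-Cookie: tpid=3751430_53174; path=/
Location: http://www.example.com
Content-Length: 20
Content-Type: text/html
END

my ($code, $h) = parse_response_head($raw);
my $is_redirect = defined $code
    && $code =~ /\A30[12378]\z/
    && exists $h->{location};
print "status=$code location=$h->{location}\n";
print $is_redirect ? "looks like a usable redirect\n"
                   : "not a usable redirect\n";
```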

Key_Master




msg:4591601
 7:53 pm on Jul 9, 2013 (gmt 0)

There shouldn't be a Content-Type header. In fact, most of those headers wouldn't be appropriate for a redirect. Keep your redirect headers simple, like you have in your original post.

** Added **

When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page and no redirect is performed. The server adds extra headers to HTML pages, e.g. Content-Length and Expires, which is why they're present. I'm not sure about the Vary header on a redirect, but I think this header would be better served on the page that the user-agent is redirected to. The same goes for Set-Cookie. I've run into issues with cookies served on a redirect.

lucy24




msg:4591609
 8:15 pm on Jul 9, 2013 (gmt 0)

DrDoc:
For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything that would be amiss

I think phranque meant: look at the rest of the visit. Did they get all the non-html files (css, images, favicon)?

A robot can say almost anything it likes in a header, but very very few of them pick up all associated files-- and fewer still get them with the same timing as a human.
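
Where the access log for the page is available, the "did they fetch the associated files" check can be sketched like this. The log lines are hypothetical and assume Apache combined format; the list of asset extensions is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical access-log lines in Apache combined format.
my @log = (
    '203.0.113.5 - - [09/Jul/2013:16:51:30 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [09/Jul/2013:16:51:31 +0000] "GET /style.css HTTP/1.1" 200 900 "http://www.example.com/page.html" "Mozilla/5.0"',
    '203.0.113.5 - - [09/Jul/2013:16:51:31 +0000] "GET /favicon.ico HTTP/1.1" 200 318 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Jul/2013:16:52:02 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
);

# Count page and asset requests per client IP.
sub tally_requests {
    my (@lines) = @_;
    my %by_ip;
    for my $line (@lines) {
        my ($ip, $path) =
            $line =~ m{\A(\S+) .*? "(?:GET|HEAD|POST) (\S+) } or next;
        my $kind = $path =~ m{\.(?:css|js|png|gif|jpe?g|ico)(?:\?|\z)}i
                 ? 'assets' : 'pages';
        $by_ip{$ip}{$kind}++;
    }
    return \%by_ip;
}

my $tally = tally_requests(@log);
for my $ip (sort keys %$tally) {
    my $pages  = $tally->{$ip}{pages}  // 0;
    my $assets = $tally->{$ip}{assets} // 0;
    printf "%-15s pages=%d assets=%d%s\n", $ip, $pages, $assets,
        $assets ? '' : '  <- fetched no assets, possibly not a browser';
}
```

A client that loads the page but none of its assets is not proof of a bot (caches and ad blockers exist), so this is a hint to investigate rather than a verdict.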

DrDoc




msg:4591611
 8:29 pm on Jul 9, 2013 (gmt 0)

There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.

I have also tried the redirect with/without HTML (the standard recommends HTML with an anchor to the new location), with/without setting the cookie ...

The only two headers I set directly are the Status and Location headers. The rest are set by the server. Vary comes from the gzip functionality to ensure that proxies serve the right content.

Key_Master




msg:4591622
 8:55 pm on Jul 9, 2013 (gmt 0)

Content-Type is standards-compliant, but it must be properly constructed. Make sure it doesn't end with two newlines. It should end with one newline.

302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it. If the page that the 302 redirects to is visited directly by a user agent (without following a redirect), it will not know about a Vary or Set-Cookie header.

phranque




msg:4591631
 9:25 pm on Jul 9, 2013 (gmt 0)

DrDoc:
Someone clicks the link

There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.


Link A is on a page, correct?
that page (or earlier) is where the visit starts.
study that request first and it will possibly explain more about the subsequent non-redirect.
if the page load doesn't look human then it's not a normal browser and the redirect could be ignored.

phranque




msg:4591632
 9:26 pm on Jul 9, 2013 (gmt 0)

Key_Master:
When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page and no redirect is performed.


according to the HTTP specification an HTML response is suggested with a 302 status code:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
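
A 302 that follows this recommendation might look like the sketch below. It is a minimal example rather than the script discussed in this thread: `redirect_response` is a hypothetical helper, and the markup and charset are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: build the CGI output for a 302 that also carries
# a short HTML note, as RFC 2616 section 10.3.3 suggests.
sub redirect_response {
    my ($url) = @_;

    # Escape the URL for use inside the HTML attribute.
    (my $html_url = $url) =~ s/&/&amp;/g;
    $html_url =~ s/</&lt;/g;
    $html_url =~ s/>/&gt;/g;
    $html_url =~ s/"/&quot;/g;

    my $body = qq{<html><head><title>302 Found</title></head><body>}
             . qq{<p>The document has moved <a href="$html_url">here</a>.</p>}
             . qq{</body></html>\n};

    return "Status: 302 Found\n"
         . "Location: $url\n"
         . "Content-Type: text/html; charset=utf-8\n"
         . "\n"                  # a single blank line ends the headers
         . $body;
}

print redirect_response('http://www.example.com/offer?id=42&src=ad');
```

The body only matters to clients that do not follow the Location header automatically; browsers normally never display it.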

phranque




msg:4591633
 9:28 pm on Jul 9, 2013 (gmt 0)

Key_Master:
302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it.


this 302 response is cacheable since a Cache-Control: header is provided:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
This response is only cacheable if indicated by a Cache-Control or Expires header field.


if a proxy accepts compressed content, the proxy would want to cache both compressed and non-compressed versions of the HTML document if provided.
hence the need for a Vary: header.

DrDoc




msg:4591642
 9:38 pm on Jul 9, 2013 (gmt 0)

phranque:
Link A is on a page, correct?
that page (or earlier) is where the visit starts.

Oh, I see what you mean. Well, unfortunately I have no access to study the details of that initial request, as that page is owned by one of my publishers.

Key_Master




msg:4591643
 9:39 pm on Jul 9, 2013 (gmt 0)

phranque, like I said, Content-Type is standards-compliant with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page, not a redirect. Only the Location header should end with two newlines.

I don't disagree with your second response. I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.

DrDoc




msg:4591645
 9:47 pm on Jul 9, 2013 (gmt 0)

... 'cept I don't always host or control the target page. But I know what you're saying.

phranque




msg:4591646
 9:51 pm on Jul 9, 2013 (gmt 0)

Key_Master:
Content-Type is standards-compliant with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page

the Location: header is generated by the cgi script and requires the two newlines, with or without a subsequent HTML document.
the Content-Type: header in this case appears to be generated by the server, in which case the server would properly insert this and additional headers without extraneous newlines.
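
The newline question can be made concrete with a small sketch. It assumes the usual CGI rule that the header block ends at the first blank line; `split_cgi_output` is a hypothetical helper that mimics that split, not code from any server.

```perl
use strict;
use warnings;

# Split raw CGI output at the first blank line, the way a server
# separates the header block from the body.
sub split_cgi_output {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my %headers = map { /\A([^:]+):\s*(.*)\z/ ? (lc($1), $2) : () }
                  split /\r?\n/, $head;
    return (\%headers, $body // '');
}

# Correct: one newline after each header, blank line after the last one.
my $good = "Status: 302 Found\nLocation: http://www.example.com\n\n";

# Broken: Content-Type ends with two newlines, so Location lands in the body.
my $bad  = "Content-Type: text/html\n\nLocation: http://www.example.com\n\n";

for my $case ([good => $good], [bad => $bad]) {
    my ($label, $raw) = @$case;
    my ($headers, $body) = split_cgi_output($raw);
    printf "%-4s Location header %s\n", $label,
        exists $headers->{location} ? 'present'
                                    : 'missing (ended up in body)';
}
```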

Key_Master:
I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.

the headers for the 302 response and the headers on the page the user-agent is redirected to would control the caching of those two responses independently.
both responses require their own appropriate headers.
