HTTP Status Headers

Would you do this differently?

     

DrDoc

9:17 pm on Jul 8, 2013 (gmt 0)

WebmasterWorld Senior Member drdoc is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I have a number of advertising campaigns which I run through proprietary impression/click tracking software.

  1. Someone clicks the link
  2. I record the click
  3. Redirect to offer page
  4. Record a visit


Well, lately I have started experiencing a large number of dropped clicks. In other words -- I see a recorded click (#2 above), but not a corresponding visit (#4 above).

In my code, I perform a simple redirect:

print "Status: 302 Found\n"; 
print "Location: $url\n\n";


Would you do this differently? I can't see how this would account for all the lost clicks, but I am totally open to suggestions.

phranque

10:26 pm on Jul 8, 2013 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



have you analyzed the traffic that doesn't follow the redirect?
are they typically requesting all resources on the click page such as favicon and other images like a human visitor?
my first guess would be a clickbot.

DrDoc

10:32 pm on Jul 8, 2013 (gmt 0)

Well, they simply do not appear to follow the redirect at all. I have set up a separate tracking table where I record the click, pass an ID along in the URL redirect, and immediately -- before any other processing -- record an entry in the table again.

In other words, if they follow the redirect and request the page _at all_, then a second entry would be recorded. That simply doesn't happen.

I doubt it's a clickbot, however, since I can vouch for the source of all initial traffic.
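
The two-entry check described in this post can be sketched roughly as follows. This is only an illustration: the in-memory lists stand in for the tracking table, and the sub name `dropped_clicks` and the sample IDs are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tracking table: IDs logged when the click
# is recorded, and IDs logged again when the landing page is requested.
sub dropped_clicks {
    my ($clicks, $visits) = @_;
    my %seen = map { $_ => 1 } @$visits;
    # A click counts as "dropped" when its ID never shows up on the landing page.
    return grep { !$seen{$_} } @$clicks;
}

my @clicks = (101, 102, 103, 104);   # made-up sample IDs
my @visits = (101, 103);
my @lost   = dropped_clicks(\@clicks, \@visits);
printf "%d of %d clicks had no matching visit: %s\n",
    scalar @lost, scalar @clicks, join(', ', @lost);
```

In a real setup the two lists would presumably come from a query against the tracking table rather than from literals.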

phranque

11:52 pm on Jul 8, 2013 (gmt 0)

a Referer header can be spoofed.
i would be looking at "the visit" in the server access log file for clues.

Key_Master

12:01 am on Jul 9, 2013 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Check the error logs for clues, if any. If there are none, turn on warnings in the script and keep an eye on the logs.

$url is probably invalid, or the script or htaccess is adding an extra header that conflicts with the redirect.

Brett_Tabke

1:00 am on Jul 9, 2013 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'd agree with everything phranque and Key_Master had to say.

I would get a copy of Proxomitron and turn on header logging (there are some extensions for other browsers too). Look at what your server is actually kicking back in response to the GET for the original URL.
[proxomitron.info...]

DrDoc

5:34 pm on Jul 9, 2013 (gmt 0)

phranque:
i would be looking at "the visit" in the server access log file for clues

For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything amiss:
HTTP/1.1 302 Found
Date: Tue, 09 Jul 2013 16:51:31 GMT
Server: Apache/2.2.3 (CentOS)
Set-Cookie: tpid=3751430_53174; path=/
Location: http://www.example.com
Cache-Control: max-age=0
Expires: Tue, 09 Jul 2013 16:51:31 GMT
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=2
Connection: Keep-Alive
Content-Type: text/html
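
As an illustration of checking a captured response like the one above, here is a rough sketch that parses a raw response head and tests whether it carries a redirect status together with a Location header. The sub name `parse_response_head` is hypothetical, and the sample input is a shortened version of the dump in this post.

```perl
use strict;
use warnings;

# Parse a raw HTTP response head into a status code and a hash of headers
# (names lower-cased). Repeated headers are not handled in this sketch.
sub parse_response_head {
    my ($raw) = @_;
    my @lines = split /\r?\n/, $raw;
    my $status_line = shift(@lines) // '';
    my ($code) = $status_line =~ m!^HTTP/\d\.\d\s+(\d{3})!;
    my %headers;
    for my $line (@lines) {
        last if $line eq '';    # a blank line ends the head
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $headers{lc $name} = $value;
    }
    return ($code, \%headers);
}

my $raw = join "\n",
    'HTTP/1.1 302 Found',
    'Location: http://www.example.com',
    'Content-Type: text/html',
    '', '';

my ($code, $h) = parse_response_head($raw);
my $is_redirect = defined $code && $code =~ /^30[12378]$/ && exists $h->{location};
print $is_redirect ? "redirect to $h->{location}\n" : "not a usable redirect\n";
```

By this check the dump above looks like a well-formed redirect, which fits the observation that nothing in the headers stands out.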

Key_Master

7:53 pm on Jul 9, 2013 (gmt 0)

There shouldn't be a Content-Type header. In fact, most of those headers wouldn't be appropriate for a redirect. Keep your redirect headers simple, like you have in your original post.

** Added **

When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page; no redirect is performed. The server adds extra headers to HTML pages, e.g. Content-Length and Expires, which is why they're present. I'm not sure about the Vary header on a redirect, but I think this header would be better served on the page that the user-agent is redirected to. The same goes for Set-Cookie. I've run into issues with cookies served on a redirect.

lucy24

8:15 pm on Jul 9, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



DrDoc:
For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything that would be amiss

I think phranque meant: look at the rest of the visit. Did they get all the non-html files (css, images, favicon)?

A robot can say almost anything it likes in a header, but very very few of them pick up all associated files-- and fewer still get them with the same timing as a human.
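
Where an access log for the page carrying the link is available, the "did they fetch the associated files" check could be sketched like this. It assumes Apache common/combined log lines, the sub name `clients_without_assets` and the sample lines are invented, and it ignores the timing aspect entirely.

```perl
use strict;
use warnings;

# Return the client IPs that requested at least one page but never requested
# a static asset (css, js, images, favicon). Assumes Apache-style log lines.
sub clients_without_assets {
    my (@log_lines) = @_;
    my (%pages, %assets);
    for my $line (@log_lines) {
        my ($ip, $path) = $line =~ /^(\S+) .*?"(?:GET|HEAD) (\S+)/ or next;
        if ($path =~ /\.(?:css|js|png|gif|jpe?g|ico)(?:\?|$)/i) {
            $assets{$ip}++;
        } else {
            $pages{$ip}++;
        }
    }
    my @suspects = grep { !$assets{$_} } sort keys %pages;
    return @suspects;
}

# Invented sample lines using documentation-range IP addresses.
my @log = (
    '203.0.113.5 - - [09/Jul/2013:16:51:30 +0000] "GET /page.html HTTP/1.1" 200 5120',
    '203.0.113.5 - - [09/Jul/2013:16:51:31 +0000] "GET /style.css HTTP/1.1" 200 900',
    '198.51.100.7 - - [09/Jul/2013:16:52:02 +0000] "GET /page.html HTTP/1.1" 200 5120',
);
print "no asset requests from: $_\n" for clients_without_assets(@log);
```

A client with a warm cache may also skip asset requests, so a hit here is a hint to look closer rather than proof of a robot.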

DrDoc

8:29 pm on Jul 9, 2013 (gmt 0)

There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.

I have also tried the redirect with/without HTML (the standard recommends HTML with an anchor to the new location), with/without setting the cookie ...

The only two headers I set directly are the Status and Location headers. The rest are set by the server. Vary comes from the gzip functionality to ensure that proxies serve the right content.

Key_Master

8:55 pm on Jul 9, 2013 (gmt 0)

Content-Type is standard, but it must be properly constructed. Make sure it doesn't end with two newlines. It should end with one newline.

302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it. If a user agent visits the page that the 302 redirects to directly (without following a redirect), it will not know about a Vary or Set-Cookie header.

phranque

9:25 pm on Jul 9, 2013 (gmt 0)

DrDoc:
Someone clicks the link

There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.


Link A is on a page, correct?
that page (or earlier) is where the visit starts.
study that request first and it will possibly explain more about the subsequent non-redirect.
if the page load doesn't look human then it's not a normal browser and the redirect could be ignored.

phranque

9:26 pm on Jul 9, 2013 (gmt 0)

Key_Master:
When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page; no redirect is performed.


according to the HTTP specification an HTML response is suggested with a 302 status code:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
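
For illustration, a CGI redirect that follows this suggestion might look roughly like the sketch below. The sub names `build_redirect` and `escape_html` are hypothetical, the CR/LF check is an added precaution rather than something from this thread, and the server may still add headers of its own.

```perl
use strict;
use warnings;

# Minimal HTML escaping for the URL when it is used inside the note.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Build a CGI response for a 302 that also carries a short hypertext note.
sub build_redirect {
    my ($url) = @_;
    # Refuse values containing CR/LF so a bad $url cannot inject extra headers.
    die "invalid redirect target\n" if $url =~ /[\r\n]/;
    my $href = escape_html($url);
    my $body = qq{<html><body><p>The document has moved <a href="$href">here</a>.</p></body></html>\n};
    return "Status: 302 Found\n"
         . "Location: $url\n"
         . "Content-Type: text/html\n"
         . "\n"    # a single blank line ends the header block
         . $body;
}

print build_redirect('http://www.example.com/');
```

Every header line ends with one newline, and only the blank line after the last header separates the headers from the body.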

phranque

9:28 pm on Jul 9, 2013 (gmt 0)

Key_Master:
302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it.


this 302 response is cacheable since a Cache-Control: header is provided:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
This response is only cacheable if indicated by a Cache-Control or Expires header field.


if a proxy accepts compressed content, the proxy would want to cache both compressed and non-compressed versions of the HTML document if provided.
hence the need for a Vary: header.

DrDoc

9:38 pm on Jul 9, 2013 (gmt 0)

phranque:
Link A is on a page, correct?
that page (or earlier) is where the visit starts.

Oh, I see what you mean. Well, unfortunately I have no access to study the details of that initial request, as that page is owned by one of my publishers.

Key_Master

9:39 pm on Jul 9, 2013 (gmt 0)

phranque, like I said, Content-Type is standard with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page, not a redirect. Only the Location header should end with two newlines.

I don't disagree with your second response. I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.

DrDoc

9:47 pm on Jul 9, 2013 (gmt 0)

... 'cept I don't always host or control the target page. But I know what you're saying.

phranque

9:51 pm on Jul 9, 2013 (gmt 0)

Key_Master:
Content-Type is standard with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page

the Location: header is generated by the cgi script and requires the two newlines, with or without a subsequent HTML document.
the Content-Type: header in this case appears to be generated by the server, in which case the server would properly insert this and additional headers without extraneous newlines.

Key_Master:
I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.

the headers for the 302 response and the headers on the page the user-agent is redirected to would control the caching of those two responses independently.
both responses require their own appropriate headers.
 
