HTTP Status Headers

Would you do this differently?

     
9:17 pm on Jul 8, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


I have a number of advertising campaigns which I run through proprietary impression/click tracking software.

  1. Someone clicks the link
  2. I record the click
  3. Redirect to offer page
  4. Record a visit


Well, lately I have started experiencing a large number of dropped clicks. In other words -- I see a recorded click (#2 above), but not a corresponding visit (#4 above).

In my code, I perform a simple redirect:

print "Status: 302 Found\n"; 
print "Location: $url\n\n";


Would you do this differently? I can't see how this would account for all the lost clicks, but I am totally open to suggestions.
10:26 pm on July 8, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


have you analyzed the traffic that doesn't follow the redirect?
are they typically requesting all resources on the click page such as favicon and other images like a human visitor?
my first guess would be a clickbot.
10:32 pm on July 8, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


Well, they simply do not appear to follow the redirect at all. I have set up a separate tracking table where I record the click, pass an ID along in the URL redirect, and immediately -- before any other processing -- record an entry in the table again.

In other words, if they follow the redirect and request the page _at all_, then a second entry would be recorded. That simply doesn't happen.

I doubt it's a clickbot, however, since I can vouch for the source of all initial traffic.
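
For anyone who wants to try the same kind of check, here is a minimal sketch of the click/visit reconciliation described above. It is written in Python with SQLite rather than the tracking software used in this thread, and the table name `tracking`, its columns, and the sample rows are all assumptions made up for illustration.

```python
import sqlite3


def find_dropped_clicks(conn):
    """Return (click_id, user_agent) for clicks that never produced a visit row."""
    rows = conn.execute(
        """
        SELECT c.click_id, c.user_agent
        FROM tracking AS c
        LEFT JOIN tracking AS v
               ON v.click_id = c.click_id AND v.stage = 'visit'
        WHERE c.stage = 'click' AND v.click_id IS NULL
        ORDER BY c.click_id
        """
    ).fetchall()
    return [(row[0], row[1]) for row in rows]


# Hypothetical schema: one row when the click is recorded, and a second row
# with the same ID when the redirected request arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracking (click_id INTEGER, stage TEXT, user_agent TEXT)")
conn.executemany(
    "INSERT INTO tracking VALUES (?, ?, ?)",
    [
        (1, "click", "Mozilla/5.0"),
        (1, "visit", "Mozilla/5.0"),
        (2, "click", "ExampleBot/1.0"),  # made-up agent that never follows the redirect
        (3, "click", "Mozilla/5.0"),
        (3, "visit", "Mozilla/5.0"),
    ],
)

dropped = find_dropped_clicks(conn)
print(dropped)
```

Grouping the dropped rows by user agent or by publisher may show whether the losses cluster around one source, which is roughly what the log analysis suggested earlier would be looking for.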
11:52 pm on July 8, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


a Referer header can be spoofed.
i would be looking at "the visit" in the server access log file for clues.
12:01 am on July 9, 2013 (gmt 0)

Senior Member

Key_Master

joined:July 27, 2001
posts:1472
votes: 0


Check the error logs for clues, if any. If there are none, turn on warnings in the script and keep an eye on the logs.

$url is probably invalid, or the script or .htaccess is adding an extra header that conflicts with the redirect.
1:00 am on July 9, 2013 (gmt 0)

Administrator from US 

brett_tabke

joined:Sept 21, 1999
posts:38069
votes: 15


I'd agree with everything phranque and Key_master had to say.

I would get a copy of proxomitron and turn on header logging (there are some extensions for other browsers too). Look at what your server is actually kicking back in response to the GET for the original url.
[proxomitron.info...]
5:34 pm on July 9, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


i would be looking at "the visit" in the server access log file for clues
For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything that would be amiss:
HTTP/1.1 302 Found
Date: Tue, 09 Jul 2013 16:51:31 GMT
Server: Apache/2.2.3 (CentOS)
Set-Cookie: tpid=3751430_53174; path=/
Location: http://www.example.com
Cache-Control: max-age=0
Expires: Tue, 09 Jul 2013 16:51:31 GMT
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=2
Connection: Keep-Alive
Content-Type: text/html
7:53 pm on July 9, 2013 (gmt 0)

Senior Member

Key_Master

joined:July 27, 2001
posts:1472
votes: 0


There shouldn't be a Content-Type header. In fact, most of those headers wouldn't be appropriate for a redirect. Keep your redirect headers simple, like you have in your original post.

** Added **

When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page; no redirect is performed. The server adds extra headers to HTML pages, e.g. Content-Length and Expires, which is why they're present. I'm not sure about the Vary header on a redirect, but I think this header would be better served on the page that the user-agent is redirected to. The same goes for Set-Cookie. I've run into issues with cookies served on a redirect.
8:15 pm on July 9, 2013 (gmt 0)

Senior Member from US 

lucy24

joined:Apr 9, 2011
posts:13442
votes: 390


For the lost clicks, there is never a visit anywhere following the redirect.

I've pored over the headers, but can't see anything that would be amiss

I think phranque meant: look at the rest of the visit. Did they get all the non-html files (css, images, favicon)?

A robot can say almost anything it likes in a header, but very very few of them pick up all associated files-- and fewer still get them with the same timing as a human.
8:29 pm on July 9, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.

I have also tried the redirect with/without HTML (standard recommends HTML with an anchor to the new location), with/without setting the cookie ...

The only two headers I set directly are the Status and Location headers. The rest are set by the server. Vary comes from the gzip functionality to ensure that proxies serve the right content.
8:55 pm on July 9, 2013 (gmt 0)

Senior Member

Key_Master

joined:July 27, 2001
posts:1472
votes: 0


Content-Type is standard but it must be properly constructed. Make sure it doesn't end with two newlines. It should end with one newline.

302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it. If the page that the 302 redirects to is visited directly by a user agent (without following a redirect), it will not know about a Vary or Set-Cookie header.
9:25 pm on July 9, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


DrDoc:
Someone clicks the link

There's really no "rest" of the visit.
Link A redirects to Link B using a 302 status. Link A has no CSS or images.


Link A is on a page, correct?
that page (or earlier) is where the visit starts.
study that request first and it will possibly explain more about the subsequent non-redirect.
if the page load doesn't look human then it's not a normal browser and the redirect could be ignored.
9:26 pm on July 9, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


Key_Master:
When you add a Content-Type header to a redirect, it considers the 302 to be an HTML page; no redirect is performed.


according to the HTTP specification an HTML response is suggested with a 302 status code:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
9:28 pm on July 9, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


Key_Master:
302's are temporary so I don't see the value in using Vary or Set-Cookie on them if you can help it.


this 302 response is cacheable since a Cache-Control: header is provided:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
This response is only cacheable if indicated by a Cache-Control or Expires header field.


if a proxy accepts compressed content, the proxy would want to cache both compressed and non-compressed versions of the HTML document if provided.
hence the need for a Vary: header.
9:38 pm on July 9, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


Link A is on a page, correct?
that page (or earlier) is where the visit starts.

Oh, I see what you mean. Well, unfortunately I have no access to study the details of that initial request, as that page is owned by one of my publishers.
9:39 pm on July 9, 2013 (gmt 0)

Senior Member

Key_Master

joined:July 27, 2001
posts:1472
votes: 0


phranque, like I said, Content-Type is standard with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page, not a redirect. Only the Location header should end with two newlines.

I don't disagree with your second response. I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.
9:47 pm on July 9, 2013 (gmt 0)

Senior Member

drdoc

joined:Mar 15, 2002
posts:6807
votes: 0


... 'cept I don't always host or control the target page. But I know what you're saying.
9:51 pm on July 9, 2013 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10684
votes: 33


Content-Type is standard with a redirect, but if it ends in two newlines and it is served before the Location header, it will be treated as an HTML page

the Location: header is generated by the cgi script and requires the two newlines, with or without a subsequent HTML document.
the Content-Type: header in this case appears to be generated by the server, in which case the server would properly insert this and additional headers without extraneous newlines.

I'm just pointing out that it would be more efficient to serve those headers on the page the user-agent is redirected to. That way if a user-agent directly accesses the page without a redirect, it will be served the proper headers.

the headers for the 302 response and the headers on the page the user-agent is redirected to control the caching of those two responses independently.
both responses require their own appropriate headers.
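
To make the newline point concrete, here is a minimal sketch of how a CGI script might assemble a 302 response with a short HTML note, as RFC 2616 suggests. It is written in Python rather than the language of the original script, and the function name `build_redirect` is hypothetical. Every header line ends with a single newline, and the blank line comes only after the last header, so nothing that is meant to be a header ends up in the body. Rejecting CR and LF in the URL is an added assumption that covers the "invalid $url" case raised earlier.

```python
from html import escape


def build_redirect(url):
    """Return the text a CGI script could print for a 302 redirect to url."""
    # A stray line break inside the URL would end the header block early
    # (or inject extra headers), so refuse it outright.
    if "\r" in url or "\n" in url:
        raise ValueError("URL must not contain CR or LF")

    headers = [
        "Status: 302 Found",
        "Location: " + url,
        "Content-Type: text/html",
    ]
    # Short hypertext note for clients that do not follow the redirect.
    body = '<a href="{0}">{0}</a>\n'.format(escape(url, quote=True))

    # One newline after each header, then one blank line, then the body.
    return "\n".join(headers) + "\n\n" + body


if __name__ == "__main__":
    print(build_redirect("http://www.example.com/"), end="")
```

If the Content-Type line were followed by a blank line before Location, the Location text would become part of the body, which matches the failure mode described above.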
 
