
Google AdSense Forum

    
Analyzing AdSense Source Code
otem




msg:3246569
 5:07 pm on Feb 8, 2007 (gmt 0)

When looking at the src of the AdSense iframe, I see several variables being passed to Google.

Some are pretty obvious, but others not so. I was hoping I could get some feedback on what some of these variables are, and if there are others Google sometimes passes too:

client = Publisher Id, Identifies which account to credit the ad click
dt =? It's a number that changes. Maybe the time?
lmt =? Also a number that changes.
format = Type of ad format used. Links, Ads, maybe other types too. Includes dimensions.
output = "html". I presume it's different for RSS ads? Is this always html for html embedded ads?
channel = Unique id to represent which Ad channel to credit the click.
url = The page the ad was displayed on.
color_bg = The background color (hexadecimal) of the ad.
color_text = The text color of the ad.
color_link = The link color of the ad. (Ad Title, ie: Sprockets)
color_url = The url color of the ad. (Ad Url, ie: www.sprockets.tld)
color_border = The border color of the ad.
ref = The page the visitor came from before viewing the current page.
cc =? Number that seems to stay constant.
u_h =? Number that seems to change.
u_w =? Number that seems to change.
u_ah =? Number that seems to change.
u_aw =? Number that seems to change.
u_cd =? Number that seems to change.
u_tz =? Number that seems to change. Is this a negative integer?
u_his = Number that changes. I only sometimes see this variable passed.
u_java = "true". Is this always the case? My guess is this means the user has JavaScript enabled? But wouldn't they have to in order to get this far?
prev_fmts = A variation of the format variable maybe? Doesn't always get passed.
ad_type =? A variable set to "text" if an AdSense link, maybe?

These are my guesses. Anybody else wish to share their insight on this?

Thanks.

 

MThiessen




msg:3246680
 6:48 pm on Feb 8, 2007 (gmt 0)

Just curious, why do you care?

otem




msg:3246710
 7:34 pm on Feb 8, 2007 (gmt 0)

Valid question. I'm curious for several reasons.

Since this is a piece of information I can know about when a user clicks an ad, I want to be able to get this information so I can analyze it to help me optimize my ads.

We only have so many channels we can associate with the ads, and it has been SUGGESTED that merely using channels can alter an ad's performance (Smart Pricing maybe?). This information doesn't rely on channels and won't require an ad to be changed from its original state, leading to better and more accurate observations. It allows us to monitor our ads without the limitations of using channels.

I'm also curious to know as much as I can about what Google knows about my site. Obviously I can't see from here all the information they gather, but I was surprised to find out that they know the page the user was on before the current page.

I've been able to analyze some of the pieces of information, but I need help to be able to identify others.

jomaxx




msg:3246725
 7:50 pm on Feb 8, 2007 (gmt 0)

this is a piece of information I can know about when a user clicks an ad

I don't see any legit way of intercepting the value of those variables.

otem




msg:3246745
 8:10 pm on Feb 8, 2007 (gmt 0)

I don't see any legit way of intercepting the value of those variables.

I'm not reinventing the wheel. I'm just using a couple lines of code from the AdLogger script (which is within the AdSense TOS). There's a line in the script that grabs the iframe's src. The piece of information contains a long url to googlesyndication.com that contains all these variables concatenated.
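To illustrate the idea, here is a minimal sketch of splitting such an iframe src into its variables. This is not the AdLogger code; the helper name parseAdParams and the sample values are made up for the example. In a browser the src string would come from the iframe element itself.

```javascript
// Hypothetical helper: split an ad iframe src into its query parameters.
function parseAdParams(src) {
  // URLs quoted without a scheme get a placeholder scheme so URL() accepts them.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(src);
  const url = new URL(hasScheme ? src : "http://" + src);
  const params = {};
  for (const [key, value] of url.searchParams) {
    params[key] = value;
  }
  return params;
}

// Made-up sample values, not taken from a real ad request.
const sample = "http://pagead2.googlesyndication.com/pagead/ads" +
  "?client=ca-pub-0000000000000000&output=html" +
  "&color_bg=FFFFFF&u_tz=-300";
const parsed = parseAdParams(sample);
console.log(parsed.client, parsed.output, parsed.u_tz);
```

All values come back as strings, so numeric variables such as u_tz still need converting before any arithmetic.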

otem




msg:3246754
 8:18 pm on Feb 8, 2007 (gmt 0)

Here's my guess to a couple variables from above:

u_tz = User's time zone. Number of seconds shifted. Can be either negative or positive. Don't know if this is UTC or California based.

lmt = Unix timestamp. Number of seconds since the Unix epoch. If it is Unix time then it's UTC based, which might suggest u_tz is UTC based as well. However, this could be a California-based derivative of Unix time?

justageek




msg:3246768
 8:29 pm on Feb 8, 2007 (gmt 0)

I don't see any legit way of intercepting the value of those variables.

Yep...the specific vars about the ad will not be had by the site owner.

For example...these you can get:

u_h =? Number that seems to change. (user's browser window height)
u_w =? Number that seems to change. (user's browser window width)

But these you cannot get:

dt =? Its a number that changes. Maybe the time? (yes)
lmt =? Also a number that changes. (last modified time)
u_ah =? Number that seems to change. (ad height)
u_aw =? Number that seems to change. (ad width)

and so on.

JAG.

otem




msg:3249363
 6:29 pm on Feb 11, 2007 (gmt 0)

OK, I did more data analysis and this is what I have so far:

Thanks JAG for the dt confirmation and for the u_h and u_w values.

This is not a complete list of variables, but these are the ones I have seen myself in my research. If you have more, please add them, but be careful not to share exact values of the variables.

If you have comments or feedback on my interpretation of the variables, please add them.

client - Unique ID representing the account that is showing this ad. This is the account that will get credit for the ad click.

dt - This is the time the ad was served in Unix timestamp format with three digits concatenated at the end. The time zone is UTC. I have no clue what the last three digits stand for. I thought of microseconds, but three digits of microseconds seems far too fine-grained for Google to want to keep track of.

lmt - This also seems to be the time the ad was served, but without the last three digits that dt has at the end. Otherwise dt and lmt are equivalent.

alternate_ad_url - The publisher's specified url for the alternate ad if there are no ads for the visitor to be shown.

alt_color - The publisher's specified color to be shown if there are no ads for the visitor to be shown.

format - The publisher's specified format for the ad.

prev_fmts - Other ad formats coded in the HTML document that appear closer to the top of the HTML code than the current format.

output - The output type for the ad. For me this always seems to contain the same value for my HTML ads, but this might be different for other ad formats like RSS.

channel - The publisher assigned tracking channels for the current ad.

pv_ch - Other tracking channels assigned to ads that appear closer to the top of the HTML code than the current ad.

url - The location of the current page that is showing the ad.

color_bg - The publisher's specified background color for the ad.

color_text - The publisher's specified text color for the ad.

color_link - The publisher's specified link color for the ad.

color_url - The publisher's specified url color for the ad.

color_border - The publisher's specified border color for the ad.

ad_type - The publisher's specified type of ad to run, such as text ads, image ads, or both. This is not specified for link ads.

loc - Rarely shows, but so far seems to be equal to the url value.

cc - No clue! This value seems to be a constant integer, but after reviewing my logs I did see this value change to a different integer. No idea why.

u_h - Visitor's screen height.

u_w - Visitor's screen width.

u_ah - Visitor's available window height (when you take the screen height and factor the size of the browser window, and space taken by browser toolbars).

u_aw - Visitor's available window width.

u_cd - Visitor's color depth of their screen.

u_tz - Visitor's timezone shift in seconds from UTC.

u_his - Not sure. This might correspond to how many times this visitor has seen ads on this page, without accounting for the number of ads on the page. However, that's just a guess.

u_java - Not sure. My guess is if the user has javascript available or not, which seems strange as they would have to have javascript available to get this far.

ref - The referring URL the visitor was on immediately prior to viewing the publisher's page with the ads.

qsrc - Not sure. Seems to be an integer that shows if the ref value is from specific domains.

o - Not sure. Seems to be an integer that shows along with qsrc.

l - Not sure. Seems to be a text string that shows along with qsrc and o.

ptnrS - Not sure. Seems to be a text string that shows if the ref value is from specific domains. (Does not seem to be related to domains corresponding to qsrc, o and l values).

searchfor - Words searched by visitors. Shows with ptnrS. These terms are not visible in ref value.

u_nplug - No clue. Rarely shows. Seems to be an integer.

u_nmime - No clue. Rarely shows. Seems to be an integer.
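As a rough sketch of where the u_h, u_w, u_ah, u_aw and u_cd values could come from, the snippet below reads them from a screen-like object. The mapping to the browser's screen properties is my assumption based on the descriptions above, and the function name screenParams and the test object are made up.

```javascript
// Hypothetical sketch: derive the u_* screen values from a screen-like object.
// The property mapping is an assumption drawn from the list above.
function screenParams(screen) {
  return {
    u_h: screen.height,        // screen height
    u_w: screen.width,         // screen width
    u_ah: screen.availHeight,  // available height
    u_aw: screen.availWidth,   // available width
    u_cd: screen.colorDepth,   // color depth in bits
  };
}

// Stand-in for window.screen so the sketch also runs outside a browser.
const fakeScreen = {
  height: 768,
  width: 1024,
  availHeight: 738,
  availWidth: 1024,
  colorDepth: 32,
};
const screenValues = screenParams(fakeScreen);
console.log(screenValues);
```

In a browser, passing window.screen instead of the stand-in object should give the visitor's real values.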

justageek




msg:3249423
 8:00 pm on Feb 11, 2007 (gmt 0)

cc - No clue! This value seems to be a constant integer, but after reviewing my logs I did see this value change to a different integer. No idea why.

This is directly related to screen size so change your screen size slightly and you'll see it change :-)

u_his - Not sure. This might correspond to how many times this visitor has seen ads on this page, without accounting for the number of ads on the page. However, that's just a guess.

This is the total number of unique pages the browser has visited in the current browser session. I use this (and a few other tricks I'll not mention) to identify invalid clicks. I'm guessing Google does also.

JAG

linear




msg:3249740
 5:23 am on Feb 12, 2007 (gmt 0)

otem, I did a run-through of the JavaScript back in '04 or so. Looks like we mostly came to the same conclusions.

[webmasterworld.com...]

[edited by: Brett_Tabke at 2:17 pm (utc) on Feb. 12, 2007]
[edit reason] fixed url [/edit]

System
redhat



msg:3250083
 7:57 am on Feb 12, 2007 (gmt 0)

The following 3 messages were cut out to new thread by brett_tabke. New thread at: google_adsense/3250081.htm [webmasterworld.com]
8:40 am on Feb. 12, 2007 (cst -6)

otem




msg:3250250
 5:21 pm on Feb 12, 2007 (gmt 0)

Linear, seems like I'm behind the boat. :) Though it is very reassuring that we came up with the same values.

otem




msg:3250704
 12:51 am on Feb 13, 2007 (gmt 0)

Linear, your thread opened my eyes to the source where I can go to help me better understand these, as well as to see potential others that I have yet to come across.

I was able to confirm that dt is a Unix timestamp expressed in milliseconds.
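A quick sketch of what that means in practice: since JavaScript dates also count milliseconds since the Unix epoch, a dt value can be passed straight to Date. The helper name describeDt and the sample value are made up for illustration.

```javascript
// Hypothetical helper: interpret a dt value as milliseconds since the Unix epoch.
function describeDt(dt) {
  const ms = Number(dt);
  return {
    iso: new Date(ms).toISOString(), // UTC date and time
    seconds: Math.floor(ms / 1000),  // plain Unix timestamp in seconds
    millis: ms % 1000,               // the trailing three digits
  };
}

// Made-up sample value, not taken from a real ad request.
const info = describeDt("1170954420123");
console.log(info.iso, info.seconds, info.millis);
// prints: 2007-02-08T17:07:00.123Z 1170954420 123
```

Dropping the last three digits gives a timestamp in whole seconds, the same unit I assumed for lmt.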

I still have no clue what cc is. I thought it concerned the relationship between browser size and screen size, or the position of the scroll bar, however I'm now pretty sure it's neither of these.

I am very surprised at the information Google collects and uses. I've always looked up to them for their data-mining, but I am impressed. I'm just hoping some of the variables I see might be passed on more to me.

I know there are scripts and databases I could buy and install that would tell me some of this information, but I'd always have to deal with upgrade hassles. Instead, I can grab this information and use it to help build a better profile of my visitors. I am also more trusting of this information, knowing these are the data-mining opinions of Google.

Things I'm seeing include age, gender, job...

I'll have to spend more time tonight on this, but I'll definitely be using the new source to flesh out my list further.

linear




msg:3250721
 1:17 am on Feb 13, 2007 (gmt 0)


var u=d.body.scrollHeight,v=d.body.clientHeight;
if(v&&u){f("cc",Math.round(v*100/u))}

It would seem that cc is the ratio of clientHeight to scrollHeight for the body object, expressed as a percentage. I'm not entirely sure how that's useful. It may be a sly way of checking for frames. It would seem to be less than 100 in some frame situations.
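Unrolled into a readable form, the calculation looks something like the sketch below. The function name computeCc is made up, and a plain object stands in for document.body so it can be run outside a browser.

```javascript
// Readable restatement of the quoted snippet: cc is clientHeight as a
// percentage of scrollHeight, rounded to the nearest integer.
function computeCc(body) {
  const scrollHeight = body.scrollHeight;
  const clientHeight = body.clientHeight;
  // Like the original, skip the value when either height is zero or missing.
  if (clientHeight && scrollHeight) {
    return Math.round(clientHeight * 100 / scrollHeight);
  }
  return undefined;
}

// Plain objects stand in for document.body here.
const longPage = computeCc({ scrollHeight: 2400, clientHeight: 600 });
const shortPage = computeCc({ scrollHeight: 600, clientHeight: 600 });
console.log(longPage, shortPage); // prints: 25 100
```

So a page four screens tall reports 25, and a page that fits without scrolling reports 100.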


if(navigator.mimeTypes){c("u_nmime",navigator.mimeTypes.length)}

So u_nmime is the length of the navigator.mimeTypes array: how many MIME types this browser recognizes. Possibly useful in fingerprinting a client, since it should remain the same for a particular user-agent.


if(navigator.plugins){c("u_nplug",navigator.plugins.length)}

And likewise, how many plugins do you have installed.

I can't help but point out that the set of these client data points that stay relatively constant throughout a browser session, plus the history object that would show the order pages were loaded, would be enough to build a click trail in most cases, even absent any IP address info.
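A rough sketch of that idea, under the assumption that u_his reflects the browser's history length (as suggested earlier in the thread). The function names and sample requests are made up, and I'm not claiming this is how anyone actually does it.

```javascript
// Hypothetical sketch: group logged ad requests by the client values that
// tend to stay constant during a session, then order each group by u_his.
const STABLE = ["u_h", "u_w", "u_ah", "u_aw", "u_cd", "u_tz", "u_nplug", "u_nmime"];

function sessionKey(params) {
  return STABLE
    .map((name) => name + "=" + (name in params ? params[name] : ""))
    .join("&");
}

function buildTrails(requests) {
  const trails = new Map();
  for (const req of requests) {
    const key = sessionKey(req);
    if (!trails.has(key)) trails.set(key, []);
    trails.get(key).push(req);
  }
  for (const trail of trails.values()) {
    // Assumption: a larger u_his means the page was loaded later.
    trail.sort((a, b) => Number(a.u_his) - Number(b.u_his));
  }
  return trails;
}

// Made-up sample requests.
const trails = buildTrails([
  { u_h: "768", u_w: "1024", u_cd: "32", u_his: "3", url: "/b" },
  { u_h: "1200", u_w: "1600", u_cd: "24", u_his: "1", url: "/x" },
  { u_h: "768", u_w: "1024", u_cd: "32", u_his: "2", url: "/a" },
]);
for (const trail of trails.values()) {
  console.log(trail.map((r) => r.url).join(" -> "));
}
```

Two visitors with identical setups would collide on the same key, so this only approximates a click trail.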

dollarshort




msg:3251036
 9:04 am on Feb 13, 2007 (gmt 0)

Looking at other sites I have found a "hint" tag followed by some keywords that relate to the topic. Can we add this?

example

hint="sports, car, engine, wheel, exhaust"

otem




msg:3251297
 2:18 pm on Feb 13, 2007 (gmt 0)

Definitely!

I'm interested though, was this in the iframe src? And if so, did you see this also in the AdSense code on the page, where certain variables like ad colors and alternate ads are defined?

Leonard0




msg:3251347
 2:53 pm on Feb 13, 2007 (gmt 0)

Looking on at other sites I have found a "hint" tag then some keywords that relate to the topic, can we add this?

Hint tags are only for premium publishers, unless there's been a recent change.
[webmasterworld.com ]

Leonard0




msg:3251398
 3:26 pm on Feb 13, 2007 (gmt 0)

The Adsense link appears to have changed, but just for the ad units. Ad links are unchanged. The link used to be:
pagead2.googlesyndication.com/pagead/ads?client=ca-pub-1234567890123456&dt=...

Now it appears as:
pagead2.googlesyndication.com/pagead/iclk?sa=l&ai=...
where ... = approximately 350 alphanumeric characters
Looks like they are encrypting that info now.

The links can be found under the Links tab in the Tools menu of Firefox.
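For anyone logging these links, a small sketch of telling the two forms apart by their path. The function name classifyAdLink is made up, and the sample values below are placeholders rather than real parameters.

```javascript
// Hypothetical helper: tell the older readable link form from the newer
// opaque one, based only on the two paths quoted above.
function classifyAdLink(link) {
  // Links quoted without a scheme get a placeholder scheme so URL() accepts them.
  const url = new URL(link.includes("://") ? link : "http://" + link);
  if (url.pathname === "/pagead/ads" && url.searchParams.has("client")) {
    return "readable";
  }
  if (url.pathname === "/pagead/iclk") {
    return "opaque";
  }
  return "unknown";
}

// Placeholder values, not real parameters.
const oldForm = classifyAdLink(
  "pagead2.googlesyndication.com/pagead/ads?client=ca-pub-1234567890123456&dt=0");
const newForm = classifyAdLink(
  "pagead2.googlesyndication.com/pagead/iclk?sa=l&ai=PLACEHOLDER");
console.log(oldForm, newForm); // prints: readable opaque
```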

justageek




msg:3251410
 3:30 pm on Feb 13, 2007 (gmt 0)

I can't help but point out that the set of these client data points that stay relatively constant throughout a browser session, plus the history object that would show the order pages were loaded, would be enough to build a click trail in most cases, even absent any IP address info.

This is true since it is very easy to detect the majority of fraud out there by doing the simple things. To my amazement, Google does not do some of the other basic checks for fraud. Makes me wonder how much they really want to stop it.

JAG

otem




msg:3251505
 5:12 pm on Feb 13, 2007 (gmt 0)

Looking on at other sites I have found a "hint" tag then some keywords that relate to the topic, can we add this?


Hint tags are only for premium publishers, unless there's been a recent change. [webmasterworld.com...]

Sorry, to clarify, I meant you could definitely add hint to the list on this page, as it is a parameter that might be passed in the url.

It would definitely NOT be alright to modify your adsense code to include this, as modifying your code is strictly prohibited!
