Forum Moderators: incrediBILL & martinibuster

Secret AdSense Publisher ID Data Harvesting by Domain Company

Privacy compromised revealing entire domain/site portfolios

     
2:08 am on Feb 27, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 26, 2006
posts:1397
votes: 0


I just happened to Google my AdSense publisher ID a short time ago--not sure why, just curiosity, I guess--and, lo and behold, I find a domain company has been crawling the web with its bots, paying special attention to the AdSense ID of each website it crawls. Then it automatically compiles a list of every single website that uses that ID.

I can't reveal the URL of the company doing it, but you may want to see if your ID has been lifted from your site. (I checked somebody else's ID found via View Source, but Google doesn't have it indexed in SERPs.)

Hmmm... not sure it's a bad thing for me personally, but I do have a mixture of websites some of which are personal, some only business. That's because Google only lets publishers have one account, and that has been its strict policy for a long time/always.

In a related issue, it seems the same website that harvests your Google AdSense information also looks at Google Analytics IDs--the UA number.
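For anyone wondering how little it takes, here is a minimal sketch of that kind of harvesting. It is only a guess at the general approach, not the company's actual code. The ID patterns ("pub-" plus 16 digits, and "UA-" plus account and profile numbers), the function name, and the sample pages and IDs are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Assumed ID shapes: "pub-" plus 16 digits for AdSense,
# "UA-<account>-<profile>" for Analytics.
ADSENSE_ID = re.compile(r"\bpub-\d{16}\b")
ANALYTICS_ID = re.compile(r"\bUA-\d+-\d+\b")


def index_sites_by_id(pages):
    """Map each ID found in page source to the set of domains using it.

    `pages` is an iterable of (domain, html) pairs already fetched by a crawler.
    """
    sites_by_id = defaultdict(set)
    for domain, html in pages:
        for pattern in (ADSENSE_ID, ANALYTICS_ID):
            for found in pattern.findall(html):
                sites_by_id[found].add(domain)
    return dict(sites_by_id)


# Made-up pages and IDs, for illustration only.
pages = [
    ("personal.example",
     '<script>google_ad_client = "pub-0000000000000001";</script>'),
    ("business.example",
     '<script>google_ad_client = "pub-0000000000000001";</script>'
     '<script>var pageTracker = _gat._getTracker("UA-1234567-1");</script>'),
]
index = index_sites_by_id(pages)
```

In this toy run both domains end up listed under the same publisher ID, which is exactly the link between personal and business sites that worries me.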

Is it impossible for Google to give publishers privacy? Couldn't the snippet of Google Adsense code be set up by domain instead of the ID tag?

I don't know how many publishers would be interested in this but I suspect potentially a few. The bad news is I don't know of anything that can be done immediately on the publishers' side to enable privacy.

Speaking of privacy: even if you have Domain Privacy on every domain in your portfolio but one, as long as each domain gives away your AdSense ID, people can find out who you are via this site. This is good to know for anyone with a controversial site amongst a large portfolio of domains!

p/g

5:31 am on Feb 27, 2009 (gmt 0)

Full Member

10+ Year Member

joined:Nov 27, 2005
posts:255
votes: 0


Google only lets publishers have one account, and that has been its strict policy for a long time/always.

I have several accounts (with Google's prior approval) - One corporate, one personal, another for a separate joint venture. It seems that all Google really wants is 1) a separate tax id, and 2) to know that you aren't scamming them.

And yes, someone harvesting AdSense Publisher IDs is disturbing, though I'm sure it's been going on longer than we all thought. After all, those Publisher IDs are sitting there just ready to be plucked...

[edited by: inactivist at 5:35 am (utc) on Feb. 27, 2009]

5:31 am on Feb 27, 2009 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2005
posts:1735
votes: 19


I agree with you this would be troubling.

Is it impossible for Google to give publishers privacy? Couldn't the snippet of Google Adsense code be set up by domain instead of the ID tag?

One problem with that is that some web sites have multiple publishers working on them. Other web sites use some kind of revenue sharing system.

7:00 am on Feb 27, 2009 (gmt 0)

Preferred Member

5+ Year Member

joined:Nov 1, 2007
posts:436
votes: 0


My ID# has been out there since 2004 and is not indexed by Google.

Your search - "my pub id#" - did not match any documents.

7:07 am on Feb 27, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts:3224
votes: 9


Mine does, but only on MySpace, where kids have copied things from my site incorrectly.

7:57 am on Feb 27, 2009 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14221
votes: 226


I don't know of anything that can be done immediately on the publishers' side to enable privacy.

Identify their spider and the IP ranges they're coming from then block. More information here [webmasterworld.com].

8:59 am on Feb 27, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts: 1642
votes: 0


Interesting - when I google my pub id, I find the site - a German site.
It only has one of my sites.

Very odd thing. Could any legitimate use be made of this?

Mods - should we out the site?

9:21 am on Feb 27, 2009 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14221
votes: 226


No site outing, thanks. However, you may want to check with incredibill, the spider mod, and see what can be done about identifying the spider and what can be published in the spider forum.

9:36 am on Feb 27, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts: 1642
votes: 0


Good idea, MB - I'll shoot him a sticky

10:00 am on Feb 27, 2009 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 10, 2008
posts:60
votes: 0


@leadegroot "Could any legitimate use be made of this?"

Well, I know two sites for spying on Google AdSense IDs, and I do use such a database of trackable site IDs because it's a nice way to check what my competitors are doing. You can also uncover site networks by spying on AdSense IDs and learn quite a few interesting things.

Now, about how to keep them off your butt: it's not really possible as of now if they are determined to get your ID. But you can make things more difficult for their bots by using a script that calls your AdSense script (encapsulating your AdSense script in another one, I mean). If the bot collects IDs once the page is in the DOM (once the HTML and JavaScript have been interpreted), though, there is no protection unless Google AdSense decides to add a new feature that creates random IDs on a per-site or per-ad basis.
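To make the difference concrete, here is a rough sketch of why wrapping only stops the laziest bots. It assumes the bot does a plain regex scan of the raw HTML. The wrapper file name, the ID, and the function name are hypothetical, and the sketch says nothing about whether Google allows the ad code to be moved around like this.

```python
import re

# Assumed ID shape: "pub-" followed by 16 digits.
PUB_ID = re.compile(r"\bpub-\d{16}\b")


def static_scan(source):
    """Return the publisher IDs visible in raw source, without running any script."""
    return PUB_ID.findall(source)


# Made-up page sources, for illustration only.
inline_page = '<script>google_ad_client = "pub-0000000000000001";</script>'
wrapped_page = '<script src="/js/ads-wrapper.js"></script>'  # hypothetical wrapper file
wrapper_js = 'google_ad_client = "pub-0000000000000001";'

found_inline = static_scan(inline_page)    # the ID sits in the HTML itself
found_wrapped = static_scan(wrapped_page)  # nothing to find in the HTML alone
found_in_js = static_scan(wrapper_js)      # a bot that also fetches the script still gets it
```

So the wrapper hides the ID from a bot that reads only the HTML, but not from one that follows script URLs or reads the rendered DOM.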

12:43 pm on Feb 27, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1144
votes: 114


Clever trick. I'm not sure what's worse, this particular site that's fully indexed by Google, or another similar service (in English) that actually sells this information and has a much bigger database but isn't publicly available (i.e. the results aren't indexed, you have to pay first). The latter, which is quite easy to find, seems to be relatively popular and has over 700k sites in its index, whereas the German site claims 200k. I tried a few Adsense IDs, without paying, and they seemed to provide pretty complete results (they show you how many domains and subdomains use the ID and then make you pay to see them). This allows anyone to see if competitors are doing anything shady like arbitrage. A new can of "outing" worms. Additionally, the owner claims that what they are doing is perfectly legal and says they do not listen to, or even open, robots.txt files. Any blocking of the robot would have to happen on the basis of IP(s), but for some reason I doubt they are crawling with a user-agent that will easily identify them.

Correction: it looks like they have a 'sitemap' of search result pages, so it appears they are at least trying to get all the IDs indexed by search engines.

6:44 am on Feb 28, 2009 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2005
posts:1735
votes: 19


they do not listen to, or even open, robots.txt files

Which is actually great if you have bot traps that block them automatically. Spam bots that follow robots.txt rules and slowly index your site from many different IPs, with various user agents, and at a reasonable speed are a lot more difficult. Did I say too much?
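For those who have not built one, a bot trap can be sketched roughly like this. It is a simplified in-memory version, assuming a trap path that robots.txt disallows and that no human visitor would ever request. The path, the class name, and the IPs (taken from documentation ranges) are hypothetical.

```python
TRAP_PATH = "/trap/"  # hypothetical path, disallowed in robots.txt and linked invisibly

# What well-behaved bots would read and obey.
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"


class BotTrap:
    """Block any client that requests the trap path."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()

    def allow(self, ip, path):
        """Return True if the request should be served, False if refused."""
        if ip in self.blocked:
            return False
        if path.startswith(self.trap_path):
            # Only a client that ignored robots.txt gets here: block it from now on.
            self.blocked.add(ip)
            return False
        return True


trap = BotTrap()
first_visit = trap.allow("192.0.2.10", "/index.html")      # served
trap_hit = trap.allow("192.0.2.10", "/trap/page1")         # refused and blocked
after_trap = trap.allow("192.0.2.10", "/index.html")       # now refused
other_visitor = trap.allow("198.51.100.7", "/index.html")  # unaffected
```

A real setup would persist the block list and write it into the server's deny rules, but the idea is the same: ignoring robots.txt is what gives the bot away.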

8:06 am on Feb 28, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts: 1642
votes: 0


and spookily enough, someone visited my site via the shadow site some 3 hours after I posted.
Coincidence? Probably reconfirming the data once it had been viewed.
I think we aren't supposed to post IPs? But the whois says it's a static ADSL in .nl

8:14 am on Feb 28, 2009 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14221
votes: 226


I think we aren't supposed to post IPs...

Discuss with incredibill. He's the Spider Mod. ;)

1:10 pm on Feb 28, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member hobbs is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 19, 2004
posts:3055
votes: 4


Someone I know reported the German site to be removed from the Google index last week.

Finding and blocking those sites is a simple matter of Googling a pub id, finding out the IP of the site, doing an arin.net or ripe.net IP lookup, then blocking the hosting company's IP range (if you have cPanel, its IP Deny Manager does it for you). But they could very well be crawling and populating their database from an ADSL connection, which renders the above useless.
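If you would rather do the range check yourself than rely on cPanel, it could look something like this minimal sketch, which uses Python's standard ipaddress module. The ranges below are documentation placeholders standing in for whatever the arin.net or ripe.net lookup returns, and the function name is made up.

```python
import ipaddress

# Hypothetical ranges, standing in for the results of an arin.net or ripe.net lookup.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(ip):
    """Return True if the visitor's IP falls inside any blocked range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_RANGES)


inside_first = is_blocked("192.0.2.77")        # inside 192.0.2.0/24
outside_second = is_blocked("198.51.100.200")  # the /25 covers only .0 to .127
unrelated = is_blocked("203.0.113.5")          # not in any listed range
```

The same caveat applies: this only helps if the crawler actually comes from the ranges you looked up.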

The interesting question is whether it is illegal to list someone else's pub id, and whether Google has a legal standing to go after those sites other than traffic starving them by removing them from the index.

Perhaps the owners of such sites know that their odds for monetizing are grim, and they are doing it just to gain notoriety.

2:17 pm on Feb 28, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1144
votes: 114


and spookily enough, someone visited my site via the shadow site some 3 hours after I posted.

Oops! Sorry, Lea, that was me, actually, conducting what I thought was a harmless little experiment to find out how difficult it would be to find anyone's site network with these tools. Not too difficult, as it turned out. I apologize for the confusion and, particularly, the intrusion.

I don't think the owners of the other, more popular, English tool will care much about complaints. Everything looks rather shady. Some research on the info found in their (Hong Kong) whois record reveals a stunning number of criminal allegations, although I can't be sure whether the registrant info also reflects the ownership of their site. The site is apparently hosted in the UK and payments are processed in the Netherlands. Fishy. Will Google care enough to find a solution for publishers? I doubt it.

2:58 pm on Feb 28, 2009 (gmt 0)

Full Member

10+ Year Member

joined:Sept 14, 2005
posts:272
votes: 0


I wonder if we are allowed to put our pub ID in an external javascript file and not put it in the Adsense script in the HTML file.

<script type="text/javascript"><!--
// google_ad_client is global
google_ad_width = 300;
google_ad_height = 250;
...

But the Adsense bot may take exception to not seeing a proper ID and trigger an account review.

6:37 pm on Feb 28, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 7, 2003
posts:4783
votes: 0


I'd love to have multiple publisher IDs on my one account ... Can't be that hard to allow us to have alternate IDs due to us not wanting to link some of our sites.

7:46 pm on Feb 28, 2009 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 13, 2005
posts:361
votes: 0


My ID has been out there since '05. It's not listed anywhere except on one site, and not the one mentioned here.

6:01 am on Mar 1, 2009 (gmt 0)

Full Member

5+ Year Member

joined:Jan 29, 2008
posts:243
votes: 0


Why not use DoubleClick-style tags? Google now owns DoubleClick. Each website's JavaScript comes with a unique URL like:
doubleclick.net/yoursite.com/blah

Or they could simply use the domain or an encrypted pub ID for each ad slot.

6:13 am on Mar 1, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts:3224
votes: 9


Better still...
Now that Google owns DoubleClick, why not just shut it down?
Things have become worse since including them.
My 2c

3:22 pm on Mar 2, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2005
posts:2259
votes: 0


Create a few pages on your site that are pretty worthless. Interlink them. Have just one minor link to one of these pages i.e. give Google and other bots a way in. Then put other people's publisher IDs on those pages. Lots of different IDs on different pages. Get reputable IDs from the big boys - NYT, Amazon, Ask, AOL. You don't lose any income, you don't lose any PR, it's within the Adsense TOS and you frustrate those data collectors ;)

2:08 pm on Mar 20, 2009 (gmt 0)

Full Member

10+ Year Member

joined:Feb 6, 2003
posts:262
votes: 0


How is this different from any other "spy" tool out there that grabs advertisers' keyword lists, ad copy, SEO rankings, backlinks, etc.?

It's been going on for years in other areas of online marketing; it was only a matter of time before it hit the AdSense side of things.