Forum Library, Charter, Moderators: buckworks & eWhisper & skibum

Google AdWords Forum

    
AdwordsBot, please stop lowercasing destination urls!
They won't work on some servers... especially mine :(
Sujan

5+ Year Member



 
Msg#: 3614910 posted 8:48 pm on Mar 30, 2008 (gmt 0)

Dear AdwordsBot,

today you started to lowercase some of the ad destination (click) urls of my ads when checking for availability and quality score.

This may be fine with some servers, and perhaps you even have a valid reason for doing so, but with my server this will cause lots of bad 404 errors you don't want to see.

So, please, go back to your old behaviour and just use the URLs I entered in your big brother's GUI.

Thanks,
Sujan

 

Sujan

5+ Year Member



 
Msg#: 3614910 posted 8:51 pm on Mar 30, 2008 (gmt 0)

PS: Some info for your makers:

IP: 66.249.85.68
Hostname: ff-in-f68.google.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Details: "kdlii" instead of "KDLII" in url, not replacing DKI

Sujan

5+ Year Member



 
Msg#: 3614910 posted 10:37 am on Mar 31, 2008 (gmt 0)

Last night 74.125.16.37 started to do the same...

nakita_dog

5+ Year Member



 
Msg#: 3614910 posted 1:32 pm on Mar 31, 2008 (gmt 0)

Why not just account for the lowercase version and do a 301 redirect to the mixed-case version?

Sujan

5+ Year Member



 
Msg#: 3614910 posted 7:10 pm on Mar 31, 2008 (gmt 0)

Because that would be ~ 1.000.000 possible redirects :)

Receptional Andy



 
Msg#: 3614910 posted 7:13 pm on Mar 31, 2008 (gmt 0)

that would be ~ 1.000.000 possible redirects

Is there a pattern between URLs that would need to redirect? If so, it's likely an accomplishable task.

Sure, if bots make mistakes the makers should fix them, but there's no harm in a helping hand if possible :)

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3614910 posted 7:49 pm on Mar 31, 2008 (gmt 0)

Are you sure that's AdwordsBot, and not a scraper crawling through a Google proxy?

I've seen abuse from those "ff-in-fNN.google.com" hosts, and my impression is that they're not addresses used internally by Google. No, I'm not sure, but googlebots normally identify themselves as such, and not as browsers.

Jim

Receptional Andy



 
Msg#: 3614910 posted 8:01 pm on Mar 31, 2008 (gmt 0)

Good catch Jim. If it's Google then they should sort out their (r)DNS/proxies. They're registered as allocated 66.249.64.0/19 so I suspect it's their responsibility in any case.
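For readers who want to test the (r)DNS question themselves, here is a minimal sketch of a forward-confirmed reverse DNS check: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The function name, the injectable lookup parameters and the accepted domain suffixes are assumptions made for illustration. Note that a pass only shows the address belongs to Google's network, not which Google service (or which user behind a proxy) made the request.

```python
import socket


def is_google_host(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check (illustrative sketch).

    reverse_lookup and forward_lookup are hypothetical hooks so the logic
    can be exercised without network access; by default they use the
    system resolver through the standard socket module.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        hostname = reverse_lookup(ip)
    except OSError:  # no PTR record, or the lookup failed
        return False

    # Assumed suffix list; adjust to whatever domains you decide to trust.
    if not hostname.lower().rstrip(".").endswith((".google.com", ".googlebot.com")):
        return False

    try:
        # The hostname must resolve back to the address we started from.
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

Calling `is_google_host("66.249.85.68")` with no extra arguments performs live lookups, so the answer depends on the DNS records at the time you run it.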

[edited by: Receptional_Andy at 8:03 pm (utc) on Mar. 31, 2008]

Sujan

5+ Year Member



 
Msg#: 3614910 posted 10:07 pm on Mar 31, 2008 (gmt 0)

Receptional Andy, I of course already did. I just added a lowercased column to the database that is now used as a fallback. Works for now, but what if the bot decides to reverse strings one day?
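A minimal sketch of that kind of fallback, assuming the canonical mixed-case paths are available as a set and the lowercased copies live in a dictionary (standing in for the extra database column). The names `build_lower_index` and `resolve`, and the example paths, are hypothetical. The sketch answers a case-mangled request with a 301 to the canonical form, as suggested earlier in the thread; serving the page directly from the fallback would work just as well.

```python
def build_lower_index(canonical_paths):
    """Map each lowercased path to its canonical mixed-case form."""
    index = {}
    for path in canonical_paths:
        # If two paths differ only by case, the first one seen wins.
        index.setdefault(path.lower(), path)
    return index


def resolve(requested, canonical_paths, lower_index):
    """Return (status, path): 200 exact hit, 301 case-only mismatch, else 404."""
    if requested in canonical_paths:
        return 200, requested
    match = lower_index.get(requested.lower())
    if match is not None:
        return 301, match
    return 404, None
```

One index lookup covers every URL, so the number of possible redirects does not matter; the only data that grows with the URL count is the lowercased column itself.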

jdMorgan, I am. Google Adwords bots use at least +/- 65 IPs and 3 to 15 useragents (depends on how you count) these days. The days they always identified themselves as "Googlebot" or "Adsbot" are long gone. Second place is "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)", followed by "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)".
Because both of them didn't replace dynamic keyword insertion like they should have done, I'm sure it can't be anybody from outside, because nobody knows these URLs (and nobody can, because from the outside you can't distinguish our params from e.g. the creative ID).
You see, I did my homework this time ;)

And of course, I just hope somebody from Google Adsbot team reads this and reacts. Maybe...

Jan

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3614910 posted 2:28 am on Apr 1, 2008 (gmt 0)

66.249.84.nn is a crawler range
66.249.86.nn is also a crawler range

66.249.85.nn however, is not a crawler range. Of the addresses within that range that do resolve, all resolve to the ff-in-fNN.google.com hosts.

I'd like to find out exactly what the ff-in-fNN.google.com hosts are intended to be used for.

Jim

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 11:49 am on Apr 1, 2008 (gmt 0)

i've been getting small activity only from .88 - going back 5 weeks i see:
- once a week, in some but not all weeks, an HTTP GET of /google0123456789abcdef.html and /noexist_0123456789abcdef.html by user agent "Google-Sitemaps/1.0" (returning 200s and 404s).

- 26 Feb HTTP GET one page and associated urls by user agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; InfoPath.1)", which was referred by a Gsearch in which this url was top 5. doesn't appear at first glance to have anything to do with adwords.

- 27 Feb one HTTP GET of a .pdf by user agent "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12", which was referred by a Gsearch in which this url was top 5. doesn't appear at first glance to have anything to do with adwords.

- 30 Mar HTTP GET one adwords destination url by user agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", which was referred by a Gsearch referrer that appears to be manufactured from the destination url.
as in if the dest url was "www.example.com/scriptname.cgi?yadda=yadda", then the search is made to look like "site:www.example.com scriptname", a search for which no results appear.
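To make that referrer pattern concrete, here is a small sketch that rebuilds the search phrase from a destination URL, which could help when grepping logs for similar hits. It is based only on the single example given in this post; the function name is hypothetical and the real rule used by whatever generated the referrer is unknown.

```python
import posixpath
from urllib.parse import urlsplit


def manufactured_query(dest_url):
    """Guess the 'site:host scriptname' query built from a destination URL."""
    # urlsplit only recognises the host when a scheme or leading "//" is present.
    if "//" not in dest_url:
        dest_url = "//" + dest_url
    parts = urlsplit(dest_url)
    # Keep the last path segment and drop its extension (e.g. ".cgi").
    script = posixpath.splitext(posixpath.basename(parts.path))[0]
    return f"site:{parts.netloc} {script}".strip()


print(manufactured_query("www.example.com/scriptname.cgi?yadda=yadda"))
# prints: site:www.example.com scriptname
```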

if i had to guess, it's human testing.

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3614910 posted 12:48 pm on Apr 1, 2008 (gmt 0)

Phranque, please confirm:
...from .88

It's 66.249.85.nn we're discussing here; and 66.249.88.nn *is* within their known "internal use" range. For some reason, .85.nn constitutes a "hole" in their otherwise-contiguous range from 66.249.64.00 through 66.249.95.254
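The range arithmetic can be checked with Python's standard `ipaddress` module. This is only a sketch of the membership test for the 66.249.64.0/19 allocation mentioned above, using addresses quoted in this thread; the helper name `in_block` is made up for the example. It says nothing about what any sub-range is used for internally.

```python
import ipaddress

GOOGLE_BLOCK = ipaddress.ip_network("66.249.64.0/19")


def in_block(ip, block=GOOGLE_BLOCK):
    """True if the address falls inside the given CIDR block."""
    return ipaddress.ip_address(ip) in block


# A /19 holds 8192 addresses: 66.249.64.0 through 66.249.95.255.
print(GOOGLE_BLOCK.network_address, GOOGLE_BLOCK.broadcast_address)

for ip in ("66.249.85.68", "66.249.85.88", "74.125.16.37"):
    print(ip, in_block(ip))
# 66.249.85.68 True
# 66.249.85.88 True
# 74.125.16.37 False
```

So the 66.249.85.nn addresses sit inside the same registered allocation as the known crawler ranges, while 74.125.16.37 belongs to a different block.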

Jim

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 8:56 pm on Apr 1, 2008 (gmt 0)

66.249.85.88

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 10:12 pm on Apr 1, 2008 (gmt 0)

forgot to add this:
that analysis was for any traffic from 66.249.85. and ...88 was the only visitor on that adwords site.
(it's a very small campaign)
i'll do more analysis on a much larger campaign/site later...

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 1:16 am on Apr 2, 2008 (gmt 0)

ok here's the Sitemaps bot access pattern from the .85. range for 5 domains on one server over a 5 week period (with some fields edited or removed for clarity AND obscurity):
66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [27/Feb/2008:20:34:18] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [27/Feb/2008:20:34:18] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

66.249.85.133 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

66.249.85.133 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

so all 5 hit simultaneously from various ip's, 3 of the 5 weeks, and the ip "sticks" to the domain from week to week.
i'm guessing i will find similar patterns for all sites we are tracking in GWT.
i haven't correlated these times with those from the server mentioned in a previous post but the dates look familiar.
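For anyone who wants to repeat this kind of check on their own logs, here is a rough sketch that groups lines in the trimmed format shown above by date and lists the IPs that come back on every date. The regular expression matches only this edited format (IP, bracketed timestamp, request, user agent), not a full Apache log line, and the function names are invented for the example. Because the domain field was removed from the excerpt, the sketch can show recurring IPs but not which domain each one was hitting.

```python
import re
from collections import defaultdict

# Matches the trimmed format used in this post, e.g.
# 66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
LINE = re.compile(
    r'^(?P<ip>\S+) \[(?P<ts>[^\]]+)\] "GET (?P<path>\S+) HTTP/[\d.]+" "(?P<ua>[^"]*)"$'
)


def ips_by_day(lines):
    """Map each date (dd/Mon/yyyy) to the set of IPs seen on that date."""
    by_day = defaultdict(set)
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # skip blank or differently formatted lines
        day = m.group("ts").split(":", 1)[0]
        by_day[day].add(m.group("ip"))
    return dict(by_day)


def recurring_ips(by_day):
    """IPs that appear on every date in the mapping."""
    days = list(by_day.values())
    return set.intersection(*days) if days else set()


sample = [
    '66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"',
    '66.249.85.87 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"',
    '66.249.85.133 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"',
]
print(recurring_ips(ips_by_day(sample)))
# prints: {'66.249.85.133'}
```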

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 2:09 am on Apr 2, 2008 (gmt 0)

non-Sitemaps-bot access on that server from the following ip's:
66.249.85.68
66.249.85.69
66.249.85.85
66.249.85.88
66.249.85.133

and using the following user agents:
- (as in non-specified)
Mozilla/4.0 (Windows XP 5.1) Java/1.6.0_04
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; WOW64; SV1)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; IEMB3; IEMB3)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; MEGAUPLOAD 2.0)
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.8;MEGAUPLOAD 1.0
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (compatible; Google Desktop)

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3614910 posted 4:05 am on Apr 2, 2008 (gmt 0)

this could be interesting.
an unreferred GET of a keyword destination but without the typical adwords parameters from 74.125.16.37 using "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)".
subsequent GETs of objects on that page are requested from 74.125.16.37 as well as 66.249.85.68 using the same agent, including one case of a 301 returned to one ip and the subsequent request by the other.
maybe some kind of proxy thing happening?

again, i'm guessing a manual check for flagged situations since this particular keyword phrase happens to be one that gets high impressions but low CTR for us and the landing page would certainly pass the relevance test.
from a largish campaign with thousands spent per month.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved