AdwordsBot, please stop lowercasing destination urls!

They won't work on some servers... especially mine :(

     

Sujan

8:48 pm on Mar 30, 2008 (gmt 0)

10+ Year Member



Dear AdwordsBot,

today you started to lowercase some of the ad destination (click) urls of my ads when checking for availability and quality score.

This may be fine with some servers, and perhaps you even have a valid reason for doing so, but with my server this will cause lots of bad 404 errors you don't want to see.

So, please, go back to your old behaviour and just use the urls I entered in your big brother's GUI interface.

Thanks,
Sujan

Sujan

8:51 pm on Mar 30, 2008 (gmt 0)

10+ Year Member



PS: Some info for your makers:

IP: 66.249.85.68
Hostname: ff-in-f68.google.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Details: "kdlii" instead of "KDLII" in url, not replacing DKI

Sujan

10:37 am on Mar 31, 2008 (gmt 0)

10+ Year Member



Last night 74.125.16.37 started to do the same...

nakita_dog

1:32 pm on Mar 31, 2008 (gmt 0)

5+ Year Member



Why not just account for the lowercase version and do a 301 redirect to the mixed case version?

Sujan

7:10 pm on Mar 31, 2008 (gmt 0)

10+ Year Member



Because that would be ~ 1.000.000 possible redirects :)

Receptional Andy

7:13 pm on Mar 31, 2008 (gmt 0)



that would be ~ 1.000.000 possible redirects

Is there a pattern between URLs that would need to redirect? If so, it's likely an accomplishable task.

Sure, if bots make mistakes the makers should fix them, but there's no harm in a helping hand if possible :)

jdMorgan

7:49 pm on Mar 31, 2008 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Are you sure that's AdwordsBot, and not a scraper crawling through a Google proxy?

I've seen abuse from those "ff-in-fNN.google.com" hosts, and my impression is that they're not addresses used internally by Google. No, I'm not sure, but googlebots normally identify themselves as such, and not as browsers.

Jim
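
One common way to test a claim like this is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and see whether the original IP is among the results. A minimal Python sketch follows; the function name, the suffix list, and the injectable `reverse`/`forward` resolvers are assumptions made for illustration, not anything Google documents in this thread.

```python
import socket

def is_forward_confirmed(ip, suffixes=(".googlebot.com", ".google.com"),
                         reverse=None, forward=None):
    """Return (confirmed, hostname) for a forward-confirmed reverse DNS check.

    `suffixes` is an assumed list of acceptable domains. `reverse` and
    `forward` default to real DNS lookups but can be replaced for testing.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)            # PTR lookup: IP -> hostname
    except OSError:                   # covers socket.herror / socket.gaierror
        return False, None
    if not host.lower().endswith(suffixes):
        return False, host            # hostname is not in an accepted domain
    try:
        addresses = forward(host)     # A lookup: hostname -> list of IPs
    except OSError:
        return False, host
    return ip in addresses, host
```

Calling `is_forward_confirmed("66.249.85.68")` performs live lookups. A positive result would only show that the address sits behind Google-controlled DNS; it says nothing about which service or proxy is making the request.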

Receptional Andy

8:01 pm on Mar 31, 2008 (gmt 0)



Good catch Jim. If it's Google then they should sort out their (r)DNS/proxies. They're registered as allocated 66.249.64.0/19 so I suspect it's their responsibility in any case.

[edited by: Receptional_Andy at 8:03 pm (utc) on Mar. 31, 2008]

Sujan

10:07 pm on Mar 31, 2008 (gmt 0)

10+ Year Member



Receptional Andy, of course I already did. I just added a lowercased column to the database that is now used as a fallback. It works for now, but what if the bot decides to reverse strings one day?
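
For anyone wanting to do the same, here is a rough sketch of the lowercase fallback combined with the 301 idea from earlier in the thread. It is written in Python with an in-memory dict standing in for the database column; `build_lookup`, `resolve_key` and the sample keys (other than "KDLII") are made up for illustration.

```python
def build_lookup(canonical_keys):
    """Map each lowercased key to its canonical mixed-case form."""
    lookup = {}
    for key in canonical_keys:
        # If two keys differ only by case, whichever is seen first wins.
        lookup.setdefault(key.lower(), key)
    return lookup

def resolve_key(requested, canonical_keys, lookup):
    """Return (status, key): 200 exact match, 301 case-only mismatch, else 404."""
    if requested in canonical_keys:
        return 200, requested
    canonical = lookup.get(requested.lower())
    if canonical is not None:
        return 301, canonical      # redirect to the mixed-case original
    return 404, None

keys = {"KDLII", "AbCdE"}          # hypothetical url parameters
lookup = build_lookup(keys)
print(resolve_key("kdlii", keys, lookup))
```

This needs one extra lookup per miss rather than a million redirect rules, but it only helps if no two valid keys differ by case alone.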

jdMorgan, I am. Google Adwords bots use at least +/- 65 IPs and 3 to 15 useragents (depends on how you count) these days. The days they always identified themselves as "Googlebot" or "Adsbot" are long gone. Second place is "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)", followed by "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)".
Because neither of them replaced the dynamic keyword insertion as they should have, I'm sure it can't be anybody from outside: nobody else knows these urls (and nobody can, because from the outside you can't distinguish our params from e.g. the creative id).
You see, I did my homework this time ;)

And of course, I just hope somebody from the Google Adsbot team reads this and reacts. Maybe...

Jan

jdMorgan

2:28 am on Apr 1, 2008 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



66.249.84.nn is a crawler range
66.249.86.nn is also a crawler range

66.249.85.nn however, is not a crawler range. Of the addresses within that range that do resolve, all resolve to the ff-in-fNN.google.com hosts.

I'd like to find out exactly what the ff-in-fNN.google.com hosts are intended to be used for.

Jim

phranque

11:49 am on Apr 1, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



i've been getting small activity only from .88 - going back 5 weeks i see:
- once a week, in some but not all weeks, i see an HTTP GET of /google0123456789abcdef.html and /noexist_0123456789abcdef.html by user agent "Google-Sitemaps/1.0". (returning 200's and 404's.)

- 26 Feb HTTP GET one page and associated urls by user agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; InfoPath.1)", which was referred by a Gsearch in which this url was top 5. doesn't appear at first glance to have anything to do with adwords.

- 27 Feb one HTTP GET of a .pdf by user agent "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12", which was referred by a Gsearch in which this url was top 5. doesn't appear at first glance to have anything to do with adwords.

- 30 Mar HTTP GET one adwords destination url by user agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", which was referred by a Gsearch referrer that appears to be manufactured from the destination url.
as in if the dest url was "www.example.com/scriptname.cgi?yadda=yadda", then the search is made to look like "site:www.example.com scriptname", a search for which no results appear.

if i had to guess, it's human testing.
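
to spot that kind of manufactured referrer in logs, something like the sketch below could rebuild the expected search string from a destination url. it's Python, the function name is made up, and the "site:host scriptname" pattern is only inferred from the single example above, so treat it as a guess.

```python
import posixpath
from urllib.parse import urlsplit

def manufactured_query(dest_url):
    """Build the 'site:host scriptname' query seen in the referrer (assumed pattern)."""
    # urlsplit needs a scheme to find the host, so add one if it is missing.
    if "://" not in dest_url:
        dest_url = "http://" + dest_url
    parts = urlsplit(dest_url)
    # Last path segment without its extension, e.g. "scriptname.cgi" -> "scriptname".
    script = posixpath.splitext(posixpath.basename(parts.path))[0]
    return f"site:{parts.netloc} {script}"

print(manufactured_query("www.example.com/scriptname.cgi?yadda=yadda"))
# prints: site:www.example.com scriptname
```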

jdMorgan

12:48 pm on Apr 1, 2008 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Phranque, please confirm:
...from .88

It's 66.249.85.nn we're discussing here; and 66.249.88.nn *is* within their known "internal use" range. For some reason, .85.nn constitutes a "hole" in their otherwise-contiguous range from 66.249.64.00 through 66.249.95.254

Jim
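
For checking which addresses fall inside the 66.249.64.0/19 allocation quoted earlier, Python's standard `ipaddress` module is enough. A small sketch; `GOOGLE_BLOCK` and `in_block` are names chosen here for illustration.

```python
import ipaddress

# The allocation quoted in this thread; it spans 66.249.64.0 to 66.249.95.255.
GOOGLE_BLOCK = ipaddress.ip_network("66.249.64.0/19")

def in_block(ip, network=GOOGLE_BLOCK):
    """Return True if the dotted-quad string `ip` lies inside `network`."""
    return ipaddress.ip_address(ip) in network

for ip in ("66.249.85.68", "66.249.88.1", "74.125.16.37"):
    print(ip, in_block(ip))
```

By this test 66.249.85.68 is inside the allocation and 74.125.16.37 is not. Membership only shows who the block is registered to, not whether a given /24 within it is used for crawling.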

phranque

8:56 pm on Apr 1, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



66.249.85.88

phranque

10:12 pm on Apr 1, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



forgot to add this:
that analysis was for any traffic from 66.249.85. and ...88 was the only visitor on that adwords site.
(it's a very small campaign)
i'll do more analysis on a much larger campaign/site later...

phranque

1:16 am on Apr 2, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



ok here's the Sitemaps bot access pattern from the .85. range for 5 domains on one server over a 5 week period (with some fields edited or removed for clarity AND obscurity):
66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [27/Feb/2008:20:34:18] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [27/Feb/2008:20:34:18] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [27/Feb/2008:20:34:17] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

66.249.85.133 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [21/Mar/2008:13:02:26] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

66.249.85.133 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.87 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.133 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.85 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [28/Mar/2008:14:09:11] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"
66.249.85.130 [28/Mar/2008:14:09:11] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"

so all 5 hit simultaneously from various ip's, 3 of the 5 weeks, and the ip "sticks" to the domain from week to week.
i'm guessing i will find similar patterns for all sites we are tracking in GWT.
i haven't correlated these times with those from the server mentioned in a previous post but the dates look familiar.
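
a grouping like the one above can be pulled out of the edited log format with a short script. this Python sketch assumes exactly the trimmed layout shown here (ip, bracketed timestamp, request, agent) and the names `LINE` and `ips_by_date` are invented. since the domain field was removed from these lines, it can only show which ip's hit on which date, not which ip "sticks" to which domain.

```python
import re
from collections import defaultdict

# Matches the trimmed log layout used in this post, not a full Apache log line.
LINE = re.compile(
    r'^(\S+) \[(\d{2}/\w{3}/\d{4}):(\d{2}:\d{2}:\d{2})\] '
    r'"GET (\S+) HTTP/1\.1" "([^"]*)"$'
)

def ips_by_date(lines, agent="Google-Sitemaps/1.0"):
    """Group the client ip's seen for one user agent by request date."""
    by_date = defaultdict(set)
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group(5) == agent:
            by_date[match.group(2)].add(match.group(1))
    return dict(by_date)

SAMPLE = [
    '66.249.85.133 [27/Feb/2008:20:34:17] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"',
    '66.249.85.87 [27/Feb/2008:20:34:18] "GET /noexist_*.html HTTP/1.1" "Google-Sitemaps/1.0"',
    '66.249.85.130 [21/Mar/2008:13:02:26] "GET /google*.html HTTP/1.1" "Google-Sitemaps/1.0"',
]
print(ips_by_date(SAMPLE))
```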

phranque

2:09 am on Apr 2, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



non-Sitemaps-bot access on that server from the following ip's:
66.249.85.68
66.249.85.69
66.249.85.85
66.249.85.88
66.249.85.133

and using the following user agents:
- (as in non-specified)
Mozilla/4.0 (Windows XP 5.1) Java/1.6.0_04
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; WOW64; SV1)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; IEMB3; IEMB3)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; MEGAUPLOAD 2.0)
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.8;MEGAUPLOAD 1.0
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Mozilla/5.0 (compatible; Google Desktop)

phranque

4:05 am on Apr 2, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



this could be interesting.
an unreferred GET of a keyword destination but without the typical adwords parameters from 74.125.16.37 using "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)".
subsequent GETs of objects on that page are requested from 74.125.16.37 as well as 66.249.85.68 using the same agent, including one case of a 301 returned to one ip and the subsequent request made by the other.
maybe some kind of proxy thing happening?

again, i'm guessing a manual check for flagged situations since this particular keyword phrase happens to be one that gets high impressions but low CTR for us and the landing page would certainly pass the relevance test.
from a largish campaign with thousands spent per month.

 
