Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Googlebot Has Ears
and they seem to like good music
Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 6:37 am on Apr 9, 2012 (gmt 0)

For many years I have done web radio broadcasts for small personal projects.

My current station streams an archive loop 24/7 and I do regular live shows. Most of the content is me jamming with other musicians over the internet.

For the past three years I have used a particular Icecast streaming service, linking to it from my website using the server IP, port number and mountpoint as the URL.

In the past month Googlebot has started tuning in.

Over ten hours of listening (500+ MB) according to the stats.

This has never happened before. I am not aware of any changes in the streaming service and I can't do any bot control as I have very limited server access.

What interests me is Googlebot's apparent change of behaviour.

It seems to be copying my music.

...

 

incrediBILL

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4438640 posted 6:57 am on Apr 9, 2012 (gmt 0)

Is it the normal Googlebot UA doing this?

Got an exact IP and UA you can post?

Considering they just recently launched Google Music, and are also trying to identify original content authors in other media, it wouldn't surprise me if they are profiling music files as well and possibly coming up with some type of music search like Shazam does.

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 4:01 pm on Apr 9, 2012 (gmt 0)

Is it the normal Googlebot UA doing this?

Not sure yet - it's a third-party service with limited stats.

Got an exact IP and UA you can post?

66.249.66.36 - "listened" for over two hours
66.249.66.114 - "listened" for over two hours
66.249.66.202 - "listened" for almost two hours
66.249.72.151 - "listened" for ninety minutes

A few more in the same range didn't like my music quite so much.

All this is in the last month; I've never seen it before.

music files

It is actually a continuous stream running 24/7 that is being accessed.

Fortunately I am not charged for bandwidth.
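
For anyone wanting to check whether addresses like the ones listed above really belong to Google, a common approach is a reverse DNS lookup followed by a forward lookup on the returned hostname. The following is a minimal sketch, not anything used in this thread: the function names are made up for illustration, and the hostname suffixes are an assumption that should be checked against Google's current guidance.

```python
import socket
from typing import Callable, List

# Assumption: hostname suffixes used by Google's crawlers. Verify against
# Google's current documentation before relying on this list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Return the PTR hostname for ip (raises OSError if there is none)."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(host: str) -> List[str]:
    """Return every IPv4 address that host resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """True only if ip -> hostname -> ip round-trips through a Google hostname."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The forward lookup guards against a spoofed PTR record.
        return ip in forward(host)
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False
```

Calling `is_verified_googlebot("66.249.66.36")` performs live DNS lookups, so the result depends on the network at the time. The `reverse` and `forward` parameters exist only so the logic can be exercised without DNS access.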

...

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 6:44 pm on Apr 9, 2012 (gmt 0)

After a little more digging I have a culprit:

SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)

As I recall Googlebot-Mobile has several UAs but only this one seems to listen in.
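
To separate this variant from the other Googlebot UAs in a log, the crawler token and the claimed handset can be pulled out of the string. This is a rough sketch; the helper names are invented here, and it only reports what the UA claims, which proves nothing about who actually sent it.

```python
import re

# Matches "Googlebot/2.1" as well as variants such as "Googlebot-Mobile/2.1".
GOOGLEBOT_TOKEN = re.compile(r"\b(Googlebot(?:-[A-Za-z]+)?)/(\d+(?:\.\d+)*)")


def googlebot_token(user_agent: str):
    """Return (token, version), e.g. ('Googlebot-Mobile', '2.1'), or None."""
    match = GOOGLEBOT_TOKEN.search(user_agent)
    return (match.group(1), match.group(2)) if match else None


def claimed_device(user_agent: str) -> str:
    """Return the leading product name, i.e. the text before the first '/'."""
    return user_agent.split("/", 1)[0].strip()


ua = (
    "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 "
    "UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 "
    "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
)
print(googlebot_token(ua), claimed_device(ua))
```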

Interestingly, it appears to "listen" for about ten minutes then reconnect immediately.

One conclusion might be that it is misconfigured, as no other bots seem to do it.

Another (less likely) is that I have a robotic fan.
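
The "ten minutes, then reconnect immediately" pattern explains how short connections add up to hours per IP. Below is a small sketch of merging such reconnects into continuous sessions. The record layout `(ip, start, duration)` in seconds and the 30-second gap are assumptions for illustration; the streaming service's real stats format is not shown in this thread.

```python
from collections import defaultdict


def merge_sessions(connections, max_gap=30):
    """Merge back-to-back connections from the same IP into sessions.

    connections: iterable of (ip, start, duration), all times in seconds.
    Returns {ip: [(session_start, session_end), ...]}.
    """
    by_ip = defaultdict(list)
    # Sort by IP, then by start time, so reconnects are adjacent.
    for ip, start, duration in sorted(connections, key=lambda c: (c[0], c[1])):
        end = start + duration
        sessions = by_ip[ip]
        if sessions and start - sessions[-1][1] <= max_gap:
            # Reconnected soon enough: extend the previous session.
            sessions[-1] = (sessions[-1][0], max(sessions[-1][1], end))
        else:
            sessions.append((start, end))
    return dict(by_ip)


def total_seconds(sessions):
    """Total listening time across a list of (start, end) sessions."""
    return sum(end - start for start, end in sessions)
```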

...

dstiles

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4438640 posted 7:44 pm on Apr 9, 2012 (gmt 0)

I would ask how G got the URL in the first place.

My guess would be someone with a) G toolbar; b) gmail; c) chrome; d) android; e) logged in to G.

That is assuming, of course, that the URL is not in the SERPS.

keyplyr

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4438640 posted 7:54 pm on Apr 9, 2012 (gmt 0)


I suggest calling an off-page script to serve the "listen" link so bots don't trip it and set a timer that the (human) user needs to refresh to continue to listen. I serve an audio stream this way and have never had an issue.

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 8:39 pm on Apr 9, 2012 (gmt 0)

I would ask how G got the URL in the first place

As stated above, it is linked from my website, and has been for years.

As you suggest, there are probably a few other links out there too.

The point is that this activity is very recent.

And it only comes from the one UA - standard Googlebot, other Googlebot-Mobile variations, Bingbot and the rest must all know about the URL, but none of them ever listen in (they presumably detect it as a continuous audio stream in the same way iTunes does).

Only the Samsung variant appears to have headphones.

...

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 9:10 pm on Apr 9, 2012 (gmt 0)

I suggest calling an off-page script to serve the "listen" link so bots don't trip it and set a timer that the (human) user needs to refresh to continue to listen. I serve an audio stream this way and have never had an issue.

Thanks keyplyr, one problem is that the stream is not served from my own site.

Another is that the stream URL is available from other websites.

No other bot has ever actually loaded the stream before, though.

...

lucy24

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4438640 posted 9:14 pm on Apr 9, 2012 (gmt 0)

Inevitable follow-up question: A while back, Image Search added the "looks like" option. The one where you drag in a picture of a pet rat dozing in a bakery bag, and it brings up pictures of (a) assorted close-ups of critters with eyes, and (b) pictures that are dark in the middle and white all around. (There was a (c), but I forget.)

Are we about to get a Google Music Search where you drag in something and it spits out audio files that "sound like" your specimen?

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 10:36 pm on Apr 9, 2012 (gmt 0)

Inevitable follow-up question

I don't know the answer Lucy, but I do sound like a rat in a bakery bag.

Congratulations on completing a year on WebmistressWorld.

Over 3,000 posts too - impressive.

...

keyplyr

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4438640 posted 10:39 pm on Apr 9, 2012 (gmt 0)



Thanks keyplyr, one problem is that the stream is not served from my own site.

Doesn't need to be. Serve the *link* through a script. If you also wish to set a timer that needs to be refreshed to continue listening, then put it all in a child window and control the life of the window with the timer script.

Examples of all this stuff you can find on the web.
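
One way to read "serve the link through a script" is a small server-side redirect: the page links to a script on your own site, and the script hands out the real stream address only to clients that do not identify as crawlers. This is a sketch of that idea, not keyplyr's actual setup. It swaps in a server-side WSGI handler rather than client-side script, and `STREAM_URL` and the marker list are placeholders. It does nothing about bots that already know the stream URL from other sites.

```python
# Hypothetical stream address; substitute the real IP, port and mountpoint.
STREAM_URL = "http://stream.example:8000/live"

# Assumption: substrings that self-identified crawlers tend to carry.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp")


def listen_app(environ, start_response):
    """WSGI app: redirect ordinary clients to the stream, refuse crawlers."""
    user_agent = environ.get("HTTP_USER_AGENT", "").lower()
    if any(marker in user_agent for marker in BOT_MARKERS):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Stream link is not available to crawlers.\n"]
    start_response(
        "302 Found",
        [("Location", STREAM_URL), ("Cache-Control", "no-store")],
    )
    return [b""]


if __name__ == "__main__":
    # Local try-out only; a real site would run this behind its web server.
    from wsgiref.simple_server import make_server

    make_server("127.0.0.1", 8080, listen_app).serve_forever()
```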

incrediBILL

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4438640 posted 2:09 am on Apr 10, 2012 (gmt 0)

Another possibility is that someone working in the search division is actually listening to your music (there's one in every crowd), using their own VPN that uses the Googlebot UAs and IPs.

Probably not, but anything is possible. ;)

Worst case, you're on the Googlebot playlist.

blend27

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4438640 posted 8:17 pm on Apr 10, 2012 (gmt 0)

Let them listen to the Chrome adverts for 2 hours ;)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved