Yahoo Python

Yahoo using a common site scraper

dstiles

4:18 pm on Oct 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Noticed this a couple of days ago as a once-or-twice thing. Today got lots of them...

Python-urllib/1.17 (exact UA)

Coming from Yahoo/Inktomi blocks 72.30.*.* and 74.6.*.* and (a couple only, yesterday) from research20.corp.sp1.yahoo.com on 68.180.144.*.

Mostly (but not only) hitting the home page at the moment. Those that aren't are relatively popular pages.

Just what are these clowns doing? This isn't the first time they've used a scraper bot (yes, I know it has non-scraper uses as well!).

Anyway, it's blocked.
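
For anyone wanting to do the same, here is a minimal sketch of that kind of check in Python. It is not necessarily how the block above was set up: the CIDR ranges are just one reading of the wildcard blocks quoted earlier (72.30.*.* taken as 72.30.0.0/16, and so on), and `should_block`, `BLOCKED_NETWORKS`, and `BLOCKED_UA_PREFIX` are hypothetical names.

```python
import ipaddress

# Assumed CIDR equivalents of the wildcard blocks mentioned in the post.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("72.30.0.0/16", "74.6.0.0/16", "68.180.144.0/24")
]

# Match any version of the default urllib signature, not only 1.17.
BLOCKED_UA_PREFIX = "Python-urllib/"


def should_block(remote_addr, user_agent):
    """Return True when the request comes from a listed range AND
    carries the default Python-urllib user agent."""
    addr = ipaddress.ip_address(remote_addr)
    in_listed_range = any(addr in net for net in BLOCKED_NETWORKS)
    return in_listed_range and user_agent.startswith(BLOCKED_UA_PREFIX)


print(should_block("72.30.1.2", "Python-urllib/1.17"))
print(should_block("72.30.1.2", "Mozilla/5.0"))
```

Requiring both conditions means a visitor from the same ranges that sends a different UA is left alone. Dropping the range test would block the default urllib signature from anywhere.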

incrediBILL

6:04 pm on Oct 1, 2008 (gmt 0)


Python isn't a bot; it's a programming language, and "Python-urllib" [webmasterworld.com] is the default signature of its urllib library, the component that communicates over web protocols.

Obviously someone at Yahoo has written a tool in Python they're using for some purpose.

keyplyr

6:17 pm on Oct 1, 2008 (gmt 0)


I've also seen it coming from Yahoo's 69.147. and 66.94. blocks.

dstiles

8:37 pm on Oct 1, 2008 (gmt 0)


Ok, Bill, it's a bot signature. Amounts to the same - nothing good ever came with a name like urllib. :)

incrediBILL

10:00 pm on Oct 1, 2008 (gmt 0)


"it's a bot signature"

Um, no, it's actually a lack of an original signature, which is usually worse: it means the programmer was either too lazy to change it (which is trivial) or too stupid to know it needed changing, or how to change it in the first place.

And you're right, nothing good comes from that!
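
To show how little work changing it involves, here is a rough sketch using Python 3's `urllib.request`. The 1.17 in the logs presumably comes from the older Python 2 `urllib`, so this is an illustration and not Yahoo's code. The bot name and info URL are made up, and nothing here touches the network; the request is only built, never sent.

```python
import urllib.request

# Default signature: what the server sees when the programmer changes nothing.
default_ua = dict(urllib.request.OpenerDirector().addheaders)["User-agent"]

# Hypothetical bot name and contact URL, for illustration only.
CUSTOM_UA = "ExampleResearchBot/0.1 (+http://example.com/bot.html)"


def make_request(url, user_agent=CUSTOM_UA):
    """Build (but do not send) a request carrying an identifying User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = make_request("http://example.com/")
print(default_ua)                    # begins with "Python-urllib/"
print(req.get_header("User-agent"))  # the custom string set above
```

One extra argument is all it takes, which is why an untouched default tells you something about whoever wrote the tool.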

[edited by: incrediBILL at 10:02 pm (utc) on Oct. 1, 2008]

Demaestro

10:06 pm on Oct 1, 2008 (gmt 0)


Let's not call my beloved Python such things. It is neither a bot nor a bot signature.

Can you write a bot in Python? Yes, but you can in PHP too... that doesn't make it a bot.

I am not overly surprised Yahoo is tinkering with Python seeing as Google uses it everywhere and there is a merger taking place.