


Quora link checkers

Something useful from AWS

     
12:10 am on Sep 6, 2012 (gmt 0)

incrediBILL (WebmasterWorld Administrator)




I was tinkering with Quora and posted a link to see what they used for validation, and sure enough, they're using AWS. Finally, something useful that gives me an actual reason to punch a hole in the AWS firewall.

50.16.83.146 - - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
23.20.62.58 - - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"
23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"

Not sure why they needed to check it 4 times. The first two requests came from different IPs and properly identified themselves in the user agent; the next two, both from the same third IP, used the default Python UA. Pretty sloppy programming all around.
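For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that parses the four lines above and counts requests per IP and per user agent. It assumes the standard combined log format shown in the excerpt; the names `LOG_PATTERN`, `parse_line`, and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# The four log lines quoted in the post (combined log format).
LOG_LINES = [
    '50.16.83.146 - - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"',
    '23.20.62.58 - - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"',
    '23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"',
    '23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"',
]

# Hypothetical pattern for combined log format: ip, ident, user, [time],
# "request", status, size, "referer", "user agent".
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count total requests and group them by client IP and by user agent."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    return {
        "requests": len(entries),
        "by_ip": Counter(e["ip"] for e in entries),
        "by_agent": Counter(e["agent"] for e in entries),
    }


if __name__ == "__main__":
    summary = summarize(LOG_LINES)
    print(summary["requests"], "requests")
    for ip, count in summary["by_ip"].items():
        print(ip, count)
    for agent, count in summary["by_agent"].items():
        print(agent, count)
```

Grouping by both IP and UA makes it easy to spot a service that identifies itself on some requests and falls back to a library default on others.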

Sadly, punching a hole in the firewall for them leaves a gaping hole for scrapers using AWS. I need to put my noodle to work and figure out some ID scheme that allows vendors to use a shared modem pool and ID themselves without using rDNS, because this situation is only going to get bigger as more sites transition to cloud computing.
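To make the problem concrete, here is a rough Python sketch of the naive "hole": block a cloud range by default, but let a request through if its user agent starts with an allowlisted prefix. The CIDR blocks are illustrative assumptions picked only to cover the addresses in the log above, not an authoritative AWS range list, and `BLOCKED_NETWORKS`, `ALLOWED_AGENT_PREFIXES`, and `decide` are hypothetical names. Since the UA string is trivially spoofable, this is exactly the gaping hole described above, not a fix for it.

```python
import ipaddress

# Assumption: example blocks that happen to cover the logged IPs.
# A real deployment would need a maintained list of cloud ranges.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("50.16.0.0/14"),
    ipaddress.ip_network("23.20.0.0/14"),
]

# Assumption: UA prefixes we choose to let through the block.
ALLOWED_AGENT_PREFIXES = ("Quora Link Preview/",)


def decide(ip, agent):
    """Return "allow" or "deny" for a request from `ip` with UA `agent`."""
    addr = ipaddress.ip_address(ip)
    in_blocked_range = any(addr in net for net in BLOCKED_NETWORKS)
    if not in_blocked_range:
        return "allow"  # not from a blocked cloud range at all
    if agent.startswith(ALLOWED_AGENT_PREFIXES):
        return "allow"  # the hole: any scraper can send this UA too
    return "deny"


if __name__ == "__main__":
    print(decide("50.16.83.146", "Quora Link Preview/1.0 (http://www.quora.com)"))
    print(decide("23.20.14.25", "Python-urllib/2.7"))
```

Note that with this rule the two Python-urllib requests from Quora's own checker would be denied, and anything in the range that copies the Quora UA would be allowed, which is why some stronger way for vendors to ID themselves is needed.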
 
