Quora link checkers: Something useful from AWS
incrediBILL

msg:4491717 | 12:10 am on Sep 6, 2012 (gmt 0)

I was tinkering with Quora and posted a link to see what they used for validation, and sure enough they're using AWS. Finally, something useful that gives me an actual reason to punch a hole in the AWS firewall.

50.16.83.146 - - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
23.20.62.58 - - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"
23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"

Not sure why they needed to check it 4 times. The first two requests, from two different IPs, properly identified themselves with a Quora user agent; the next two, both from a third IP, used the default Python UA. Pretty sloppy programming all around.

Sadly, punching a hole in the firewall for them leaves a gaping hole for scrapers using AWS.

Need to put my noodle to work and figure out some ID scheme that allows vendors to use a shared modem pool and ID themselves without using rDNS, because this situation is only going to get bigger as more sites transition to cloud computing.
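For anyone who wants to run the same tally over their own logs, here is a minimal Python sketch. It assumes the Apache combined log format shown above; the regex, the function names, and the "UA starts with Quora Link Preview/" rule are illustrative assumptions, not anything Quora documents. Keep in mind that a user agent is trivially spoofable, so a check like this only sorts requests by what they claim to be.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The four log lines quoted in the post.
LOG_LINES = [
    '50.16.83.146 - - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" '
    '200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"',
    '23.20.62.58 - - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" '
    '200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"',
    '23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" '
    '200 8810 "-" "Python-urllib/2.7"',
    '23.20.14.25 - - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" '
    '200 8810 "-" "Python-urllib/2.7"',
]


def parse(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def identifies_as_quora(entry):
    """Hypothetical rule: trust only the UA prefix seen in the log (spoofable)."""
    return entry["agent"].startswith("Quora Link Preview/")


entries = [parse(line) for line in LOG_LINES]

# How many requests each user agent made.
by_agent = Counter(entry["agent"] for entry in entries)

# IPs that fetched the page without the Quora user agent.
anonymous_ips = sorted(
    {entry["ip"] for entry in entries if not identifies_as_quora(entry)}
)

if __name__ == "__main__":
    for agent, count in by_agent.items():
        print(count, agent)
    print("unidentified:", anonymous_ips)
```

Matching on the user agent alone is exactly the weak spot described above: any scraper on AWS can send the same string, so this is useful for reading logs, not for deciding who gets through the firewall.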