Quora link checkers

Something useful from AWS

12:10 am on Sep 6, 2012 (gmt 0)

Posted by incredibill, WebmasterWorld Administrator (US)

I was tinkering with Quora and posted a link to see what they used for validation and sure enough they're using AWS. Finally, something useful that gives me an actual reason to punch a hole in the AWS firewall.

- - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
- - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Quora Link Preview/1.0 (http://www.quora.com)"
- - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"
- - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"

Not sure why they needed to check it four times. Two IPs properly identified themselves in the user agent; the next two used the default Python UA. Pretty sloppy programming all around.
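For anyone who wants to spot this pattern in their own logs, here is a rough Python sketch that pulls the user agent out of combined-format log lines and separates self-identifying agents from default library UAs. The function names and the list of default-UA prefixes are my own assumptions for illustration, not a complete catalog, and the client address is not parsed because it is omitted from the lines above.

```python
import re
from collections import Counter

# Tail of a combined-format log line: request, status, size, referer, UA.
# The client address is deliberately ignored (it is omitted in the post).
LOG_TAIL = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"\s*$'
)

# Assumed, non-exhaustive prefixes of default HTTP-library user agents.
DEFAULT_LIBRARY_UAS = ("Python-urllib/", "python-requests/", "curl/", "Java/")


def user_agent(line):
    """Return the user agent from a log line, or None if it does not parse."""
    match = LOG_TAIL.search(line)
    return match.group("ua") if match else None


def is_default_library_ua(ua):
    """True when the UA is just an HTTP library's default string."""
    return ua.startswith(DEFAULT_LIBRARY_UAS)


def summarize(lines):
    """Count requests per UA, split into identified and anonymous agents."""
    identified, anonymous = Counter(), Counter()
    for line in lines:
        ua = user_agent(line)
        if ua is None:
            continue  # skip lines that are not in the expected format
        bucket = anonymous if is_default_library_ua(ua) else identified
        bucket[ua] += 1
    return identified, anonymous


QUORA_UA = "Quora Link Preview/1.0 (http://www.quora.com)"
SAMPLE = [
    '- - [05/Sep/2012:23:59:18 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "%s"' % QUORA_UA,
    '- - [05/Sep/2012:23:59:19 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "%s"' % QUORA_UA,
    '- - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"',
    '- - [05/Sep/2012:23:59:35 +0000] "GET /sample.php HTTP/1.1" 200 8810 "-" "Python-urllib/2.7"',
]

identified, anonymous = summarize(SAMPLE)
print("identified:", dict(identified))
print("anonymous:", dict(anonymous))
```

Keep in mind a user agent is trivially spoofed, so this only tells you who bothered to identify themselves, not who they really are.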

Sadly, punching a hole in the firewall for them leaves a gaping hole for scrapers using AWS. I need to put my noodle to work and figure out some ID scheme that lets vendors use a shared modem pool and still ID themselves without relying on rDNS, because this situation is only going to get bigger as more sites transition to cloud computing.
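One possible shape for such a scheme, offered purely as a sketch and not something worked out in this post: the site issues each vendor a shared secret out of band, and the vendor signs every request with an HMAC over its vendor ID, a timestamp, and the requested path. Everything named here (the vendor ID, the secret, the function names, the five-minute window) is a hypothetical placeholder.

```python
import hashlib
import hmac
import time

# Hypothetical registry: vendor ID -> shared secret issued out of band.
VENDOR_SECRETS = {"example-vendor": b"replace-with-a-random-secret"}

# Assumed tolerance; signatures older than this are rejected to limit replay.
MAX_SKEW_SECONDS = 300


def sign(vendor_id, secret, path, timestamp):
    """Vendor side: compute the signature sent along with the request."""
    message = "{}\n{}\n{}".format(vendor_id, timestamp, path).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(vendor_id, path, timestamp, signature, now=None):
    """Site side: True only for a known vendor with a valid, recent signature."""
    secret = VENDOR_SECRETS.get(vendor_id)
    if secret is None:
        return False  # unknown vendor
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old (or too far in the future)
    expected = sign(vendor_id, secret, path, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Usage: the vendor signs, the site verifies.
ts = int(time.time())
sig = sign("example-vendor", VENDOR_SECRETS["example-vendor"], "/sample.php", ts)
print(verify("example-vendor", "/sample.php", ts, sig))
```

The point of something like this is that the check no longer depends on which IP in the shared range the request came from, so the firewall hole can be limited to requests that carry a valid signature. How the ID, timestamp, and signature travel (headers, query string) is left open.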
