Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
AJAX Can Turn Browsers into Scrapers and Spammers
incrediBILL




msg:3830839
 11:36 pm on Jan 20, 2009 (gmt 0)

The topic should be unsettling because anyone surfing the web, especially on seedy and untrusted sites, can easily be turned into an unwanted scraper or even a web spammer through combinations of commonly used web technologies.

Most of us who block scrapers and form spammers work on the premise that scrapers primarily operate out of hosting centers, so we block massive ranges of IPs, creating a firewall between our web sites and malicious hosting center activity.

Unfortunately, the game has become much more complicated, as some of the scrapers are combining technologies in such a way that a naive surfer facilitates scraping and spamming from his own IP address without having an infected machine; it all happens in the browser.

Let's use JavaScript spamming as a simple example.

To cause a JavaScript-enabled browser to automatically send spam to a site, you simply trigger a form submission from script, perhaps even on page load, that sends form data to a 3rd party location using the surfer's IP address. It's possible the browser might even show some sort of warning, but people tend to click OK on just about anything, so the success rate is unfortunately higher than expected thanks to less than savvy surfers.

The form content that is sent to a 3rd party is hidden from view, so you would never be the wiser. It could even be combined with another form on the same page to mask a possible 3rd party site alert under the guise of clicking "search", "feedback", etc., so as not to raise suspicions.
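On the receiving end, one way to spot this kind of submission is to look at where the POST claims to come from, rather than only at the IP address. The following is a minimal sketch in Python, not a complete defense. The function name `is_cross_site_post` and the host `forum.example` are hypothetical, and it assumes the browser sends an Origin or Referer header, which can be missing or stripped.

```python
from urllib.parse import urlsplit


def is_cross_site_post(method, headers, own_host):
    """Return True if a form POST appears to originate from another site.

    Assumption: `headers` is a plain dict of request headers and
    `own_host` is the hostname this site is served from.
    """
    if method.upper() != "POST":
        return False
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Prefer Origin; fall back to Referer when Origin is absent.
    source = lowered.get("origin") or lowered.get("referer")
    if not source:
        # No provenance header at all: flag it rather than trust it.
        return True
    # urlsplit lowercases the hostname; it is None for values like "null".
    source_host = urlsplit(source).hostname
    return source_host != own_host.lower()
```

A flagged request could then be challenged or dropped on its own, without banning the residential IP address it happened to arrive from.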

Now, let's expand on that simple concept and apply it to AJAX, which can do all sorts of interesting things.

Many higher-end technologies such as AJAX and Flash can interact, and you can easily create a combination that will collect data from a 3rd party site and send it to the scraper using the surfer's IP address.

Fortunately, not a lot of people know how this works, which is why I won't delve into specifics, to avoid more of this happening.

Suffice it to say that it's easy to misread this traffic and block innocent residential customers who have become unwitting pawns in the online game of cat and mouse.

[edited by: incrediBILL at 9:57 am (utc) on Jan. 21, 2009]

 

Megaclinium




msg:3831063
 7:00 am on Jan 21, 2009 (gmt 0)

I've had some bizarre experiences browsing the web: major sites that simply "don't respond" and show a blank screen. One was some ISP I had looked up from the ARIN entries for an IP range. This makes me think that some script might be hitting me while the site is not responding. I usually exit quickly, and make sure I'm using FF with NoScript and scripts disabled for new sites. Maybe that's the problem: they have 100% scripted sites that won't display anything if scripts are disabled.

I suppose even having a large IP range doesn't mean your site isn't loaded with malware, intentionally or not. OK, maybe I'm too paranoid, but I don't think I have spyware and haven't for quite a while. After seeing an article about virtually undetectable rootkit spyware that has infected a huge number of PCs, I think this is justified.

[edited by: incrediBILL at 10:09 am (utc) on Jan. 21, 2009]
[edit reason] removed blog URL [/edit]

enigma1




msg:3831298
 2:45 pm on Jan 21, 2009 (gmt 0)

"Fortunately, not a lot of people know how this works, which is why I won't delve into specifics, to avoid more of this happening."

Bill, this is not very comforting. As you know, people don't need to understand a technology in order to use it. There are ready-made "packages" that do just that. Clickjacking is one example. Once one is deployed on a popular server, any visitor who accesses it with active content enabled in his browser is a potential carrier.

As an example, just take a look at the plethora of forums around the web that allow avatars to be uploaded by anyone and be externally referenced. Try it, then check your server logs to see whether they're accessed or not. And those gif or jpg files can of course be changed to some active content (not always, but sometimes, let's say, or perhaps to a login dialog?). See what information you can collect.

And then you've got to love wifi. How many routers are not protected? Anyone can have internet without an ISP. Who's behind those IPs, I wonder?

Anyway, perhaps we should thank the browser vendors for all this. They compete to process Flash and AJAX scripts faster and more efficiently instead of turning them off by default at browser installation. But if they did the latter they might go out of business, wouldn't they?

simonuk




msg:3831304
 2:56 pm on Jan 21, 2009 (gmt 0)

For family and friends I always remove IE from view and install Firefox with NoScript and Flashblock.

I used to think they were safe, but my brother, who has had his new laptop for a week, was targeted in this way. He had managed to tick the "allow scripts globally" option, so he was no longer safe.

If only I could get them all on Linux ;)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved