

Website Analytics - Tracking and Logging Forum

    
Can JavaScript Be Activated by Web Bots?
mrtws




msg:4279734
 5:22 pm on Mar 10, 2011 (gmt 0)

Hello, I'm a 10-year veteran of web design, and I'm having an argument with someone over whether modern robots that trawl the internet can trigger JavaScript and so look like human visitors. He says they can't. They have a vested interest because they are a directory service that I pay for. They are reporting 750 hits, while my PHP-based counter, which is on every page, shows 250. Essentially, it's not possible for one of my pages to be visited without the PHP script logging it to a database. There are other possible reasons for the gap, such as their use of referring links and redirects, but my argument currently stands: robots can trigger JavaScript and appear human, setting off their counter without following through to my site.

As webmasters on a webmaster forum, can anyone shed any light? Any good articles, papers or sites detailing robots triggering JavaScript?

Any help appreciated.

I'll be nervously refreshing this topic for the next few hours, but I probably won't be here next month.

Many thanks in advance.

 

Demaestro




msg:4279742
 5:31 pm on Mar 10, 2011 (gmt 0)

I am positive that they can, although I would say most don't.

It is a lot of work to get a crawler to run JS, and how much depends on how the crawler was written and how it does the crawling: using wget-type methods, or a hacked-up browser running automatically.

In the case of the former, I have heard of people going into a website and downloading all the JavaScript, then "recreating" that JS in a server-side language. Then they send the bot in to crawl, and when it encounters JS it uses the converted server-side method to take appropriate actions and continue on.

Can it be done... yes.

mrtws




msg:4279754
 5:48 pm on Mar 10, 2011 (gmt 0)

Thanks for the reply.

It's simple code; it's the link here:

[businessmagnet.co.uk ]

I'm convinced it's happening, as there are just so many complications with modern code. I have loads of sites and robots are a constant hassle; no matter what I do, there are always robots doing stuff.

In this case they have a pretty straightforward link, almost:

[businessmagnet.co.uk...]

And when that link is clicked, it fires a hit. As far as I'm concerned that's susceptible to robot traffic, and as long as the robot conceals its headers and makes itself look like a browser, or does something I can't quite put my finger on, it's going to generate some degree of false hits.

It's my belief that advanced data-mining robots will trigger that link and, like any indexing system, might not follow through, as all they want is the result of that link, i.e. what's at the other end, without actually going there, or something like that.

The person I deal with is a telemarketing-type guy who is trained to say that 100% of hits are accurate because of JavaScript, which is frustrating. Thing is, I can't quite say exactly why that is bull, other than my counter reading far less AND common sense. As I said, robots get everywhere.

mrtws




msg:4279757
 5:50 pm on Mar 10, 2011 (gmt 0)

Hmm, maybe I should put the link in code.

http://www.businessmagnet.co.uk/aspx/redirect.aspx?website=www.cableandcrimpingservices.co.uk&PageID=companycrimpingmachine-72050.htm

I have nothing to do with that company, by the way, so don't think I'm spamming by linking etc.

As you'll see, hopefully, it's a redirect, which presumably goes to their magical JAVASCRIPT, although I'm not actually sure it does, as it's clearly ASPX code.

mrtws




msg:4279761
 5:53 pm on Mar 10, 2011 (gmt 0)

In fact, the more I look at this, the more I think that this is truly open to robot traffic.

I can't see how it's not.

What can the ASPX do? Check the headers? Check a list of robot IPs?

I'm beginning to feel very confident that they don't have any magical bot-proof formula in place like they say.
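
To make the limits of those two checks concrete, here is a minimal sketch in Python of what a server-side filter on headers and IPs might look like. The token list, the network range and the function name `looks_like_bot` are all hypothetical, made up for illustration; nothing in this thread says what the directory actually does.

```python
import ipaddress

# Hypothetical values for illustration only; a real list would be far longer.
BOT_UA_TOKENS = ("bot", "crawler", "spider", "curl", "wget")
BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # reserved documentation range


def looks_like_bot(user_agent: str, ip: str) -> bool:
    """Return True if the request matches a listed bot token or network."""
    ua = user_agent.lower()
    # An empty User-Agent or one containing a known token is flagged.
    if not ua or any(token in ua for token in BOT_UA_TOKENS):
        return True
    # Otherwise fall back to the IP list.
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BOT_NETWORKS)


# A self-declared crawler is caught by its User-Agent.
print(looks_like_bot("Googlebot/2.1", "203.0.113.5"))
# A client sending a browser User-Agent from an unlisted address is not caught.
print(looks_like_bot("Mozilla/5.0 (Windows NT 6.1) Firefox/4.0", "203.0.113.5"))
```

A filter like this only catches bots that identify themselves or come from addresses already on the list, which is why it cannot support a 100% claim on its own.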

topr8




msg:4279764
 6:00 pm on Mar 10, 2011 (gmt 0)

The more sophisticated bots absolutely read JavaScript; I know this for certain.

We know the likes of Google can, but I'm assuming you mean more the nefarious types of bot that are up to no good (or at least no good that benefits you).

But bots don't 'click' links. They store page data, parse the links and store them as pages to visit in the future; at some point they then visit the newly discovered pages (this can happen very quickly).

... the directory should be taking steps to stop bots scraping their content.
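
The parse-and-queue behaviour described above can be sketched roughly as follows, using only the Python standard library. The class and function names and the `directory.example` URLs are placeholders invented for this sketch, not anything taken from a real crawler.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def enqueue_links(html, base_url, frontier, seen):
    """Parse a stored page and queue any links not seen before."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    for url in parser.links:
        if url not in seen:
            seen.add(url)
            frontier.append(url)


frontier, seen = deque(), set()
page = '<a href="/aspx/redirect.aspx?website=example.com">Visit</a> <a href="/about">About</a>'
enqueue_links(page, "http://directory.example/", frontier, seen)
print(list(frontier))
```

Nothing is clicked at any point: the redirect URL simply lands in the queue and is requested later like any other page.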

mrtws




msg:4279765
 6:01 pm on Mar 10, 2011 (gmt 0)

I guess the JavaScript, which is client-side, is in the redirect page, the theory being that a robot cannot fire up that JavaScript on the redirect page.

However, that seems like a bit of an Achilles heel, because all you need is some Perl, or a web service, to fire that actual redirect page again and again with various domains, and you can cause no end of havoc.

I think it's obviously a reasonable method, but I just can't see it as being foolproof.
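
Here is a rough, self-contained Python sketch of that worry. It assumes the hit is counted on the server when the redirect URL is requested, which this thread has not confirmed for the directory in question. The local test server, the `hits` list and the `advertiser.example` address are all stand-ins invented for the demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

hits = []  # stands in for a server-side click counter


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        hits.append(self.path)  # counted on the server; no JavaScript involved
        self.send_response(302)
        self.send_header("Location", "http://advertiser.example/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass


def fire_redirect(times):
    """Request the redirect URL repeatedly without ever following it."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []
    try:
        for _ in range(times):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request("GET", "/redirect?website=advertiser.example")
            resp = conn.getresponse()
            resp.read()
            statuses.append(resp.status)  # http.client does not follow redirects
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return statuses


print(fire_redirect(3), len(hits))
```

Under that assumption every request registers a hit, yet the client never visits the advertiser's site. If the count is instead recorded by JavaScript running on the redirect page, a plain HTTP client like this would not trigger it, but a scripted browser still could.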

mrtws




msg:4279779
 6:11 pm on Mar 10, 2011 (gmt 0)

Thanks for the reply, topr8.

I think it's obvious that the link I posted above is going to be subject to robot activity, the question being to what extent. It's also clear that the directory is already doing quite a bit to stop it. However, I'm just saying to them that it's impossible to put a 100% guarantee on that, especially today.

I think what I'm trying to satisfy myself with is the FACT that most webmasters will agree that it's impossible to rule out robot activity 100%. I said that; I said, what you're saying is unbelievable. But he is saying that their system, because it uses JavaScript, means that ALL PEOPLE CLICKING IT ARE HUMAN. I can't accept that, especially when they say that 750 people clicked it when only 250 people arrived.

I'm not expecting any hard-coded solutions to this, but thoughts and options are very eagerly received, as it's really frustrating, especially when they want nearly 700 for the pleasure of this nebulous activity.

mrtws




msg:4280151
 12:23 pm on Mar 11, 2011 (gmt 0)

Thanks for all your replies.

I have to close this browser down, so I won't see any further replies.

Hisoka




msg:4293284
 5:17 pm on Apr 6, 2011 (gmt 0)

If you're a developer, it's super easy to create a crawler that can read JavaScript. Take a look at the open source tool called "Selenium". You can automate opening a Firefox browser, clicking on links, going to websites, downloading images, clicking on buttons, etc. Super easy; it's not rocket science.

That said, 99% of robots don't use it.
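
As a rough illustration, a minimal sketch with the current Selenium Python bindings might look like this. It assumes Selenium, Firefox and geckodriver are installed; `click_first_link` is just an illustrative name, and the URL you pass in is up to you.

```python
def click_first_link(url):
    """Open url in Firefox, click the first link, and return where the browser ends up."""
    # Imported here so the sketch still loads where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()  # needs Firefox and geckodriver available
    try:
        driver.get(url)  # page scripts run as in a normal browser session
        driver.find_element(By.TAG_NAME, "a").click()
        return driver.current_url
    finally:
        driver.quit()
```

Because a real browser is doing the work, any JavaScript attached to the page or the link runs just as it would for a human visitor.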

mrtws




msg:4293313
 5:50 pm on Apr 6, 2011 (gmt 0)

Cheers for the reply.

I still can't figure it out, but I'm sure there are bots that do that. Personally, I believe there is a massive cyber war going on, just like in the Matrix, and I think countries like China, with a lot of technical competence, have developed very sophisticated intelligence-gathering systems such as robots that leave no stone unturned. I think they love business directories, which are perfect for profiling a country's economic picture, as they have all the companies in them. As such, a business directory would possibly be a target for a clever robot.

I do agree it's a bit far-fetched, but truth is often stranger than fiction.

Status_203




msg:4295261
 8:01 am on Apr 11, 2011 (gmt 0)

How are you tracking where hits come from? Are you checking the Referer header or do they pass in a parameter in the request?

chatfuns




msg:4300294
 8:17 am on Apr 19, 2011 (gmt 0)

The Referer header field can be filled in with anything you want.
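
For example, a minimal sketch with Python's standard `urllib`; the URLs, the User-Agent string and the helper name `build_spoofed_request` are made up for illustration:

```python
from urllib.request import Request


def build_spoofed_request(url, fake_referer):
    """Build a request whose Referer and User-Agent are whatever the caller chooses."""
    req = Request(url)
    req.add_header("Referer", fake_referer)
    req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0")
    return req


req = build_spoofed_request("http://advertiser.example/", "http://directory.example/listing")
print(req.get_header("Referer"))
```

The request is only built here, not sent. The point is that both headers are ordinary client-supplied strings, so a counter that trusts them can be misled in either direction.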


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved