Fetch API Request - Help!

I know more about it but I'm even more puzzled now!

         

Dreamquick

6:35 pm on Jul 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Over the past few days I've seen an increase in requests from the mysterious "Fetch API Request" user-agent. I finally got fed up with not knowing what it was, so I inserted some code to log the requests it was making.
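For anyone wanting to run a similar capture, here is a minimal Python sketch. It assumes the site runs as a WSGI application, which is not what the original capture used (the post does not say). The names `request_headers` and `log_headers` are hypothetical.

```python
def request_headers(environ):
    """Rebuild HTTP header names from a WSGI environ dict.

    WSGI stores the header 'User-Agent' under the key 'HTTP_USER_AGENT',
    so strip the prefix and restore the usual capitalisation.
    """
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            name = key[5:].replace("_", "-").title()
            headers[name] = value
    return headers


def log_headers(app, sink):
    """Wrap a WSGI app and append each request's headers to `sink` (a list)."""
    def wrapper(environ, start_response):
        sink.append(request_headers(environ))
        return app(environ, start_response)
    return wrapper
```

In practice `sink` would more likely be a log file than a list; the list just keeps the sketch easy to inspect.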

Judging by the request headers it is pretty much a standard robot - it only ever seems to request the root document and then leave.

One thing all the requests I've seen have in common is that they carry a "Via:" header, which (according to the HTTP specification) is added when a request passes through a proxy.

At this point in time I'm wondering what the application actually is and who makes it...

It might be a proxy (if so, they are not open proxies - I checked the common proxy ports 80, 3128 and 8080) but if that were the case why would it only ask for one page?
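A port check along those lines could look roughly like the Python sketch below. It only tests whether a TCP connection is accepted on each port, so a hit does not prove the host is an open proxy. The function name `open_ports` and the two-second timeout are assumptions, not what was actually run.

```python
import socket


def open_ports(host, ports=(80, 3128, 8080), timeout=2.0):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass
    return found
```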

It might be a script-controlled version of the browser - but if that were the case why would so many of the standard request headers that IE5.5 sends by default be missing?

It might be some sort of heart-beat check, but if that were the case why would it be so geographically distributed? For example, here is a random handful of the requests so far:

209.139.184.90 = Verio, Inc. (US)
209.7.199.68 = Illinois State Board of Education (US)
208.13.156.12 = Poe and Brown Benefits (US)
216.102.208.237 = City of Redwood (US)
193.133.109.17 = Frontline Distribution Limited (UK)
151.200.174.166 = Bell Atlantic (US)
67.98.187.23 = Novoste (US)
67.89.244.125 = Internet Allegiance, Inc. (US)
66.113.23.2 = Vanion, Inc. (US)
65.201.211.175 = InterOne Marketing (US)
63.236.133.235 = Qwest Communications (US)
63.144.41.70 = Nucor Building Sys (US)

Puzzled... Help - I'm out of ideas and would appreciate any other feedback any of the members are willing to offer!

Tony

P.S. Here are some of the results I was seeing from the capture exercise:

Test example from IE6 :
---
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, */*
Accept-Language: en-gb,en-us;q=0.7,en;q=0.3
Connection: Keep-Alive
Host: mysite.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461)
Accept-Encoding: gzip, deflate
---

Example request headers from the Fetch API:

Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Fetch API Request

---
Connection: Keep-Alive
Host: mysite.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Fetch API Request
Via: 1.0 A-U4355WHP55XO9
---
Connection: Keep-Alive
Host: mysite.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Fetch API Request
Via: 1.0 TAURUS
---
Connection: Keep-Alive
Host: mysite.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Fetch API Request
Via: 1.0 PROXY
---
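The "Via:" values above can be pulled apart with a few lines of Python. This is a simplified sketch: it splits on commas and whitespace and ignores the optional comment field the HTTP specification allows, so it is not a full parser. The name `parse_via` is hypothetical.

```python
def parse_via(value):
    """Split a Via header into (protocol, received_by) pairs, one per hop."""
    hops = []
    for part in value.split(","):
        fields = part.split()
        # Each hop needs at least a protocol version and a proxy name.
        if len(fields) >= 2:
            hops.append((fields[0], fields[1]))
    return hops
```

For the captures above, each request shows a single hop speaking HTTP 1.0, and the proxy names look like Windows machine names rather than public hostnames.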

jdMorgan

2:40 am on Jul 13, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



DreamQuick,

There's a reference to it as a possible harvester or site-scanning tool here: www.psychedelix.com/agents.html

Jim

Dreamquick

8:52 am on Jul 13, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for that.

I've already had a conversation with Andreas (that link is his page) after spotting the same page with Google yesterday, prior to asking for help.

My conversation with him amounted to "we both know very little about this bot". There was some talk of it belonging to jaring.my, but that was discounted after mooching through some logs and finding it was geographically distributed well beyond Malaysia.

The links, although interesting, don't really tell you enough to form a solid opinion (one shows someone listing it in robots.txt, the other is an article on what an API tool is).

So tempting just to 403 the bot...
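If anyone does want to go that route, a 403 keyed on the user-agent string could be sketched like this in Python. It assumes a WSGI application; `block_agent` is a hypothetical name, and the match is a plain substring test, which anything can evade by changing its user-agent.

```python
def block_agent(app, marker="Fetch API Request"):
    """Wrap a WSGI app and answer 403 when the User-Agent contains `marker`."""
    def wrapper(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if marker in agent:
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through to the real application.
        return app(environ, start_response)
    return wrapper
```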

wilderness

6:46 pm on Jul 13, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've looked through my files and URLs without success. I recall saving something about Fetch API but just cannot recall the details.

Last night I went through a search at Google without much success.

This search at Google Groups suggests Fetch API is "the Cache refreshing itself":
[groups.google.com...]

If the above is true, then the Fetch API is not malicious at all, unless you don't want your pages cached by servers.

HandwovenRug

8:03 pm on Jul 13, 2002 (gmt 0)

10+ Year Member



Thank you wilderness. Another mystery is revealed.

bird

12:15 am on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not really convinced by this explanation.

What we know now is that the term "Fetch API" (without the "Request" part) is *also* used in the context of Microsoft ISA proxy servers. But this doesn't explain the access patterns I'm seeing at all.

In my logs, there are two distinct patterns:
One is a single IP in China trying to download ALL my files.
The other is seemingly arbitrary IPs all around the world, fetching ONE file at intervals of a few days (always the same file, with one curious exception).

Neither behaviour is consistent with the typical tasks of a caching proxy server.

wilderness

12:49 am on Jul 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>One is a single IP in china trying to download ALL my files.>

Hey bird,
Perhaps this explains why I'm not seeing it?
I have all of 203. denied, as well as portions of 202. and 210. Before doing so, because I have both Aussie and NZ visitors who are part of the APNIC block, I went through, eastern country by eastern country, eliminating the blocks I didn't want traffic from. An occasional visitor still slips through (last week somebody here enlightened me to a 50's IP). I also recently had a far east spider from a 61. block that did not identify itself, grabbing quite a few pages before I cut it off.

These steps are NOT for everybody. I have no market in the far east, nor should the content of my site be of any benefit there.
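As a rough illustration of denying by address block, here is a Python sketch using the standard `ipaddress` module. Only the full 203. block is listed, since the exact portions of 202. and 210. are not given above. `DENIED` and `is_denied` are hypothetical names.

```python
import ipaddress

# Only 203.0.0.0/8 is spelled out in the post; other ranges are left
# out here rather than guessed.
DENIED = [ipaddress.ip_network("203.0.0.0/8")]


def is_denied(ip, networks=DENIED):
    """Return True if `ip` falls inside any of the denied networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```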

Back to the drawing board. :-(

Josk

9:15 am on Jul 15, 2002 (gmt 0)

10+ Year Member



I used to get Fetch API a couple of months ago, but this stopped. Regarding the different IP addresses, this looks like some type of proxy server being used from a particular location, i.e. the person is trying to hide their IP address... This would fit in with the email address harvesting idea...

jdMorgan

3:43 pm on Jul 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From what I read, Fetch API is used by one computer to connect to the API of
the server on another computer - simplistically, to "remote control" it. No
thanks, not on my site! Another security issue from our friends in Redmond...

Jim