Microsoft Scheduled Cache Content Download Service/Fetch API Request

         

DrGUID

2:13 pm on Nov 15, 2005 (gmt 0)

10+ Year Member



Hi,

I manage a large non-commercial Internet site (about half a million requests a day) and we're getting a lot of traffic from a single IP address (which I have found out is a popular commercial web proxy content filtering service). The thing is hammering our website - in fact it is responsible for almost 40% of all our requests! This has been going on 24/7 for months.

The thing's user agent identifies itself as either:

Microsoft Scheduled Cache Content Download Service
or
Fetch API Request

The thing visits most of the pages on our site, but the overwhelming majority of hits are to the pages that serve up content from a large database of publicly accessible content we maintain. The thing seems to know how to post keywords that closely match our industry sector into the search form so that search results are returned. Sometimes the keywords are even misspelt.

It seems to request each page twice.

Does anyone know what it might be doing?

wilderness

5:24 pm on Nov 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does anyone know what it might be doing?

I don't know; however, it's a safe assumption that they are either caching your pages or filtering the content for a user who has their software installed.

1) Fetch API is in most every ban list ever created.

2) "Microsoft Scheduled Cache Content Download Service"
If a UA like this appeared in my visitor logs, it wouldn't matter to me who the visitor was or where they were coming from. Both the UA and the IP range would be denied.

There are key words that nearly demand denial of access:

reap, fetch, download, cache, spider, link, agent, crawl, email, find, gather, loader, java, larbin, library, LWP, probe, capture, ANYTHING that begins with the word web, and there may even be others that I have missed or not seen.
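For anyone who wants to try the list against their own logs before denying anything, here is a rough Python sketch of the idea. It is only an illustration: the names BLOCKED_KEYWORDS and is_blocked are made up for this example, and the matching rule (case-insensitive substring, plus "starts with web") is my assumption about how such a list would typically be applied, not a description of any particular server's configuration.

```python
# Keywords taken from the list above, lowercased for case-insensitive matching.
BLOCKED_KEYWORDS = (
    "reap", "fetch", "download", "cache", "spider", "link", "agent",
    "crawl", "email", "find", "gather", "loader", "java", "larbin",
    "library", "lwp", "probe", "capture",
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the user agent matches the keyword list.

    Assumed rule: block when the UA begins with "web", or when any
    keyword appears anywhere in the UA (case-insensitive).
    """
    ua = user_agent.strip().lower()
    if ua.startswith("web"):
        return True
    return any(keyword in ua for keyword in BLOCKED_KEYWORDS)


if __name__ == "__main__":
    samples = [
        "Microsoft Scheduled Cache Content Download Service",
        "Fetch API Request",
        "Mozilla/5.0 (Windows NT 5.1)",
    ]
    for ua in samples:
        print(ua, "->", "deny" if is_blocked(ua) else "allow")
```

Both user agents from the first post match (on "cache"/"download" and on "fetch"). Plain substring matching is blunt, so a short keyword such as "link" will also catch legitimate visitors whose UA happens to contain it; it is worth running the check over existing logs first to see what it would deny.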

Don

volatilegx

8:24 pm on Nov 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I love your list, Don :)

You kind of remind me of the "Soup Nazi" from Seinfeld... "No soup for you!"

wilderness

8:45 pm on Nov 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



love your list, Don

I'd hardly refer to it as my list, Dan.

If you check most every ban list in this forum's archives, you will see these names repeated over and over again.

I'm just passing the soup down the line ;)

Don

Leosghost

9:27 pm on Nov 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



include "access"

wilderness

6:56 am on Nov 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



include "access"

Leo,
I'm not sure if the above was tongue-in-cheek or if access is a legitimate scraper/reaper?

To say the use of the previously mentioned terms does not allow access is not correct.

Don