
Forum Moderators: Ocean10000


Microsoft Scheduled Cache Content Download Service/Fetch API Request

     
2:13 pm on Nov 15, 2005 (gmt 0)

New User

10+ Year Member

joined:Feb 28, 2005
posts:17
votes: 0


Hi,

I manage a large non-commercial Internet site (about half a million requests a day) and we're getting a lot of traffic from a single IP address (which I have found out is a popular commercial web proxy content filtering service). The thing is hammering our website - in fact it is responsible for almost 40% of all our requests! This has been going on 24/7 for months.
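For anyone wanting to measure a share like this from raw logs, here is a minimal Python sketch. It assumes the Apache combined log format, which the post does not confirm; the `request_share` function name, the regex, and the sample addresses are illustrative only.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format (IP, identity, user,
# [timestamp], "request", status, bytes, "referer", "user agent").
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def request_share(lines):
    """Return each (IP, user agent) pair's fraction of all parsed requests."""
    counts = Counter()
    total = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        counts[(match.group("ip"), match.group("ua"))] += 1
        total += 1
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical log lines using documentation-only IP addresses.
    sample = [
        '192.0.2.10 - - [15/Nov/2005:14:13:00 +0000] "GET /search?q=a HTTP/1.1" 200 512 "-" "Fetch API Request"',
        '192.0.2.10 - - [15/Nov/2005:14:13:01 +0000] "GET /search?q=a HTTP/1.1" 200 512 "-" "Fetch API Request"',
        '198.51.100.7 - - [15/Nov/2005:14:13:02 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/4.0"',
    ]
    for (ip, ua), share in sorted(request_share(sample).items()):
        print(f"{ip}  {ua!r}  {share:.0%}")
```

Grouping by both IP and user agent keeps the two agent strings from the same address visible as separate rows.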

The thing's user agent identifies itself as either:

Microsoft Scheduled Cache Content Download Service
or
Fetch API Request

The thing visits most of the pages on our site, but the overwhelming majority of hits are to the pages that serve up content from a large database of publicly accessible content we maintain. The thing seems to know how to post keywords that closely match our industry sector into the search form so that search results are returned. Sometimes the keywords are even misspelt.

It seems to request each page twice.

Does anyone know what it might be doing?

5:24 pm on Nov 15, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


Does anyone know what it might be doing?

I don't know, however it's a safe assumption that they are either caching your pages or filtering the content for a user that has software installed.

1) Fetch API is in most every ban list ever created.

2) "Microsoft Scheduled Cache Content Download Service"
If a UA like this appeared in my visitor logs, it wouldn't matter to me who the visitor was or where they were coming from! Both the UA and the IP range would be denied.

There are key words that nearly demand denial of access:

reap, fetch, download, cache, spider, link, agent, crawl, email, find, gather, loader, java, larbin, library, LWP, probe, capture, ANYTHING that begins with the word web, and there may even be others that I have missed or not seen.
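A minimal Python sketch of that kind of keyword screen follows. The keywords are the ones listed above; the `is_blocked` function name, the case-insensitive substring matching, and the sample agents are assumptions for illustration, not a rule set taken from any actual ban list.

```python
# Keywords from the list above, lowercased for case-insensitive matching.
BLOCKED_KEYWORDS = (
    "reap", "fetch", "download", "cache", "spider", "link", "agent",
    "crawl", "email", "find", "gather", "loader", "java", "larbin",
    "library", "lwp", "probe", "capture",
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the user agent trips the keyword screen."""
    ua = user_agent.strip().lower()
    # "ANYTHING that begins with the word web"
    if ua.startswith("web"):
        return True
    # Plain substring match (an assumption); this is deliberately broad.
    return any(keyword in ua for keyword in BLOCKED_KEYWORDS)


if __name__ == "__main__":
    for agent in (
        "Microsoft Scheduled Cache Content Download Service",
        "Fetch API Request",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    ):
        print(f"{agent!r}: {'deny' if is_blocked(agent) else 'allow'}")
```

Substring matching this broad will also catch legitimate clients whose names happen to contain a keyword (for example, anything with "link" in it), so it is worth checking a candidate list against real logs before denying access.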

Don

8:24 pm on Nov 15, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 22, 2001
posts:2450
votes: 0


I love your list, Don :)

You kind of remind me of the "Soup Nazi" from Seinfeld... "No soup for you!"

8:45 pm on Nov 15, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


love your list, Don

I'd hardly refer to it as my list, Dan.

If you check most every ban list in this forum's archives, you will see these names repeated over and over again.

I'm just passing the soup down the line ;)

Don

9:27 pm on Nov 15, 2005 (gmt 0)

Senior Member from FR 

WebmasterWorld Senior Member leosghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Feb 15, 2004
posts:7139
votes: 413


include "access"

6:56 am on Nov 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


include "access"

Leo,
I'm not sure if the above was tongue-in-cheek or if access is a legitimate scraper/reaper?

To say the use of the previously mentioned terms does not allow access is not correct.

Don

 
