205.234.253.123 - - [03/Mar/2006:04:23:15 -0600] "GET /some_page.html HTTP/1.1" 403 1492 "-" "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
The IP address resolves only as far as HostForWeb Inc. in Chicago, IL. Digging deeper, you get only:
Asking ns1.scservers.com. for 123.253.234.205.in-addr.arpa PTR record: Reports unknown.ord.scnet.net. [from 66.225.250.250]
and scnet.net itself no longer resolves.
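For anyone who wants to script this kind of check, here is a minimal Python sketch of a forward-confirmed reverse DNS lookup: get the PTR name for the IP, then see whether that name resolves back to the same IP. The function name and the injectable `reverse`/`forward` parameters are my own invention for illustration (they let you test without touching the network); the defaults use the standard `socket` calls.

```python
import socket


def forward_confirmed_rdns(ip, reverse=None, forward=None):
    """Return (ptr_name, confirmed) for an IP address.

    confirmed is True only if the PTR name resolves back to the same IP.
    reverse/forward are hypothetical hooks so the lookups can be faked in tests.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        host = reverse(ip)
    except OSError:
        # No PTR record at all (socket.herror/gaierror are OSError subclasses).
        return None, False

    try:
        addresses = forward(host)
    except OSError:
        # PTR name exists but does not resolve forward, as with scnet.net here.
        return host, False

    return host, ip in addresses
```

A crawler whose PTR name does not resolve back to its own IP is not proven fake, but it is a good reason to treat the UA string with suspicion.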
As I understand/remember it, Fast divested AllTheWeb a few years ago to concentrate on enterprise search solutions. And I'm not aware of any relationship at all between Lycos in MA and any company in IL.
Since it didn't fetch robots.txt, it didn't get anywhere, but keep an eye out for this one.
If it's legitimate, then it needs two emergency repairs: Fetch and obey robots.txt, and provide valid contact info in the UA string.
Jim
Then it started using this:
FAST-WebCrawler/2.2.5+-+Lycos/Alltheweb/Fast - -
So far it has come from:
72.2.24.xxx
85.13.206.xxx
83.142.29.xxx
205.234.253.xxx
66.148.68.xxx
72.232.67.xxx
It mostly uses the Fast UA now.
I can't confirm any of these IPs as Fast.
I've blocked them all by IP since the random UA can't be used.
Anyone else seen these IPs or UAs?
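If you'd rather check addresses in a script than by eye, something like the following Python sketch would do. Big assumption: I'm treating each masked last octet in the list above as a whole /24, which may block more than the original poster did. The names `BLOCKED_NETWORKS` and `is_blocked` are made up for the example.

```python
import ipaddress

# Assumption: each masked final octet in the list above stands for the
# whole /24. Narrow these if you know the exact addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "72.2.24.0/24",
        "85.13.206.0/24",
        "83.142.29.0/24",
        "205.234.253.0/24",
        "66.148.68.0/24",
        "72.232.67.0/24",
    )
]


def is_blocked(client_ip):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

The one full address in this thread, 205.234.253.123, falls inside the fourth range.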
FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast
209.190.21.*
Partial WHOIS:
OrgName: Columbus Network Access Point, Inc.
OrgID: CNAP
Address: 50 W, Broad St, Suite 627
City: Columbus
StateProv: OH
PostalCode: 43215
Country: US
NetRange: 209.190.0.0 - 209.190.127.255
CIDR: 209.190.0.0/17
NetName: COLUMBUS-NAP
NetHandle: NET-209-190-0-0-1
Parent: NET-209-0-0-0-0
NetType: Direct Allocation
NameServer: NS1.NETSERVICE.THENAP.NET
NameServer: NS2.NETSERVICE.THENAP.NET
Comment: ADDRESSES WITHIN THIS BLOCK ARE NON-PORTABLE
RegDate: 1997-12-19
Updated: 2005-03-29
Here's a mini assortment of UAs from my robots.txt, not that FAST reliably heeds them:
User-agent: FAST
User-agent: FAST Enterprise Crawler
User-agent: FAST-WebCrawler
User-agent: FAST MetaWeb Crawler
Disallow: /
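To see how a well-behaved parser reads those rules, here is a small sketch using Python's standard `urllib.robotparser`. It only shows what a compliant crawler should conclude; it says nothing about what FAST actually does. The helper name `allowed` and the `ExampleBot` UA are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt group quoted above: four User-agent lines sharing one rule.
ROBOTS_TXT = """\
User-agent: FAST
User-agent: FAST Enterprise Crawler
User-agent: FAST-WebCrawler
User-agent: FAST MetaWeb Crawler
Disallow: /
"""


def allowed(user_agent, path="/some_page.html"):
    """Return True if this parser would let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


# The UA from the first post is disallowed; an unrelated bot is not.
print(allowed("FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"))
print(allowed("ExampleBot/1.0"))
```

Note that this parser matches on the part of the UA before the first slash, so a UA that buries "FAST" later in the string may not be caught by these lines.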
Here are some older hit/hosts:
cr022r01-2.sac2.fastsearch.net - - [13/Oct/2002:11:20:21 -0700]
"FAST-WebCrawler/3.6 (atw-crawler at fast dot no; [fast.no...]
cr022r01-3.sac2.fastsearch.net - - [21/Jan/2004:10:40:32 -0800]
"FAST-WebCrawler/3.8 (crawler at trd dot overture dot com; [alltheweb.com...]
cr022r01-3.sac.overture.com - - [04/Apr/2004:15:06:38 -0700]
"FAST-WebCrawler/3.8 (crawler at trd dot overture dot com; [alltheweb.com...]
And here are a couple of the newest:
sch-fast-se-crawl01.dev.osl.basefarm.net - - [01/Mar/2006:00:58:31 -0800]
"GET /robots.txt HTTP/1.1"
"FAST Enterprise Crawler 6 used by Schibsted Sok (webcrawl@schibstedsok.no)"
216.255.229.241 - - [01/Mar/2006:08:31:28 -0800]
"GET /robots.txt HTTP/1.1"
"FAST Enterprise Crawler 6 used by FAST (iverjor (at) fast.no)"
(Eight minutes later, this one hit my homepage. Grrr...)
Over the years, FAST IPs have tracked back to Norway, and Massachusetts, as I recall, and goodness knows where else. This time around, "216.255.229.241" hails from Tokyo, Japan.
Nowadays, the minute I see a FAST IP, if it's getting 403'd for not reading/heeding robots.txt, I block it in the firewall. An overreaction, perhaps, but too many FAST-running individuals/companies have scraped the paint off the walls too many times.
But I suspect the user-agent in the title of this thread is a spoof.
Jim
sch-fast-se-isearch02.dev.osl.basefarm.net
schibstedsokbot (compatible; Mozilla/5.0; MSIE 5.0; FAST FreshCrawler 6; +http://www.schibstedsok.no/bot/)
03/18 20:25:59 /robots.txt 200 -
It's not consistent re robots.txt -- sometimes that's all it hits, sometimes not at all -- which is why I block all FAST spawn.
P.S. to adb64
It's been my experience that nonsensical UAs typically aren't FAST-related, and perhaps that explains why the IPs didn't match for you. I'm not sure who/what is behind the nonsense -- could be individuals playing with their browsers or a browser extension, or some program covering its tracks. (I suspect an extension or program.) Here are some similar fake UAs I've seen recently:
m bm9nswptqddtxqtrfjfqwur
kknfrskhn cxydbj9fymyhklr
rpy edmsjvblflwdx0tsromet0n0v
mqjngxaksvvBhtdshvgwBdgf8tBvh
Those get blocked automatically, but if I see repeated hits from the same ISP, or the IP/host name tracks back to a server farm, I rewrite the host, too. FWIW
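The post doesn't say how the automatic block works, so the following is only one possible heuristic, sketched in Python: flag UAs that are nothing but letters, digits, and at most one space, with very few vowels. The function name, the length floor of 15, and the 0.25 vowel threshold are all my own guesses, tuned only against the four samples above.

```python
import re

# Real UAs nearly always contain "/", ".", "(" or similar punctuation;
# the fake ones above are just letters, digits and at most one space.
PLAIN_TOKEN = re.compile(r"[A-Za-z0-9 ]{15,}")
VOWELS = set("aeiouAEIOU")


def looks_like_gibberish(user_agent, max_vowel_ratio=0.25):
    """Heuristic guess: True if the UA resembles random keyboard noise."""
    if not PLAIN_TOKEN.fullmatch(user_agent):
        return False
    if user_agent.count(" ") > 1:
        return False
    letters = [ch for ch in user_agent if ch.isalpha()]
    if not letters:
        return False
    vowel_count = sum(ch in VOWELS for ch in letters)
    return vowel_count / len(letters) <= max_vowel_ratio
```

Expect false positives and negatives with anything this crude; it's better used to flag hits for review than as the sole basis for a block.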