Inktomi's DB still corrupted...

More interesting 404's



3:24 am on Dec 22, 2003 (gmt 0)

10+ Year Member

Looks like Inktomi came through again in force, and more corrupted links were the result. Interestingly, not as many adult-oriented subdirectories were requested this time as last time. Perhaps they've pruned back some of the most egregious errors?


UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

As mentioned before, the errors seem to stem from a mélange of directory names and filenames from various sites. Thus, while I have a directory called "Ventilation-John" on my site, there is not, and never has been, an HTML file called reportwz.htm anywhere on it.

One would think that Inktomi would have noticed the plethora of 404's by now if this is a widespread problem. So, is this a widespread issue?

[edited by: tedster at 5:38 am (utc) on Dec. 22, 2003]
[edit reason] cloak adult words [/edit]


5:47 am on Dec 22, 2003 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

I've been seeing this kind of thing on some, but not all, of the sites I watch over. I sure would feel better about PFI and trusted feed spending if this kind of thing weren't happening.


5:57 pm on Dec 22, 2003 (gmt 0)

10+ Year Member

We were sending these requests to detect server configurations. We were probably sending too many of them, and some included explicit language; we will be correcting this. You may see some requests like this in the future, but hopefully not with the erotic words in the directory structures. This is not an issue with any databases being corrupt.
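For readers wondering what "detect server configurations" could mean in practice: the post does not spell it out, so what follows is only one plausible reading, namely that a crawler checks how a server answers a request for a page that cannot exist. Below is a minimal Python sketch under that assumption. The names `make_probe_path` and `classify_missing_page_response` and the `probe-` naming scheme are hypothetical and are not Inktomi's method.

```python
import secrets


def make_probe_path(directory: str) -> str:
    """Build a path under `directory` that almost certainly does not exist.

    Hypothetical scheme: a random hex token, so the probe never borrows
    words or filenames from other sites.
    """
    return f"{directory.rstrip('/')}/probe-{secrets.token_hex(8)}.htm"


def classify_missing_page_response(status: int) -> str:
    """Label how a server answered a request for a nonexistent page."""
    if status in (404, 410):
        return "hard-404"   # server reports missing pages honestly
    if 200 <= status < 300:
        return "soft-404"   # server answers "OK" even for missing pages
    if 300 <= status < 400:
        return "redirect"   # missing pages are redirected elsewhere
    return "other"          # 403, 5xx, and anything else
```

A probe built this way would still show up as a 404 in the site's logs, but without any explicit words in the path.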


1:43 pm on Dec 23, 2003 (gmt 0)

10+ Year Member


As you may have noticed, this is not the first time the topic has come up, and I hope you can illuminate for me why search engines like Baidu and Inktomi deliberately misspell URLs, make bad requests, etc.

As far as you are concerned, what do you gain by knowing the "server configuration"?

Inktomi and others fail to detect my server configs because I intercept all 404's with a Perl script that does not spit out the configuration of the site. Perhaps the adult content Inktomi requested was supposed to embarrass my site into disclosing it?
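The Perl script itself is not shown in this thread. As a rough illustration of the idea only, here is a minimal Python sketch of a handler that answers unknown paths with a bare 404 and a generic Server header. `QuietHandler`, `KNOWN_PAGES`, `make_server`, and the "webserver" string are all made up for the example; a real site would sit behind its own web server and routing.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the site's real content.
KNOWN_PAGES = {"/": b"home\n"}


class QuietHandler(BaseHTTPRequestHandler):
    def version_string(self):
        # Replaces the default "BaseHTTP/x.y Python/x.y" Server header.
        return "webserver"

    def do_GET(self):
        if self.path in KNOWN_PAGES:
            status, body = 200, KNOWN_PAGES[self.path]
        else:
            # Same short body for every missing page; no paths, versions,
            # or stack traces are echoed back to the requester.
            status, body = 404, b"Not found\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def make_server(port=0):
    """Create a server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), QuietHandler)
```

Calling `make_server(8000).serve_forever()` would run it locally. Note that this only hides details in the response; the requester still learns that the page is missing.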


1:14 pm on Dec 24, 2003 (gmt 0)

WebmasterWorld Senior Member essex_boy is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Another thing with INK: have you noticed the time it takes to update?

From the 12th to the 20th!



9:04 pm on Feb 6, 2004 (gmt 0)

10+ Year Member

Here comes Inktomi... again.

Tim's statements last time notwithstanding, the same queries made their rounds... again. In light of the recent Super Bowl controversy, it is surprising that Inktomi continues to search for adult-oriented files it *knows do not exist*, just to get a "server configuration".

I guess Tim isn't as effective or as forthcoming as I wanted to believe regarding the need for "server configuration" data. Considering how much I need Inktomi, perhaps it's time to give them the boot. For those who wonder, here is a sanitized version of what Inktomi was looking for.

2004-02-04 15:55:50 .../nyc-es****/Ventilation-John/reportwz.htm
2004-02-04 15:56:05 .../flowerhat.htm
2004-02-05 10:20:02 .../latina-*****-rch/Configuration.htm
2004-02-06 03:16:28 .../clock-samourai-text.htm

As always, the UA is Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...] and the IP#'s still align with the servers at Inktomi.

Pretty Pathetic.


3:00 am on Feb 7, 2004 (gmt 0)

10+ Year Member

It has been fixed recently, and I don't think you will be seeing that again.


3:05 pm on Feb 9, 2004 (gmt 0)

10+ Year Member

...except for this charming entry that just came through.

Time: 2004-02-07 08:48:39 (CST)
Originating IP:
URL Request: ../free-latina-***-it/index.html/function.nc****-ungetch.htm
UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

Either Inktomi is a lot more complex to administer than I, a mere mortal, can comprehend, or Tim simply keeps playing the same record over and over, promising fixes instead of delivering them. The fact that the URL requests do not change makes responding to them all the easier.

Interestingly, Inktomi, Baidu, et al. have yet to come clean about why they intentionally create 404's for the sake of discovering "server configurations". Perhaps we should all consider creating custom 403 pages for Inktomi, Baidu, etc. that poison their "server configuration" databases with bogus information. It would be pretty funny to see what happens when this sort of probing activity backfires.
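A rough sketch of the selection step behind that idea, in Python: decide, from the User-Agent header, whether a request for a missing page gets the usual 404 or a custom 403. The token list and the function name `status_for_missing_page` are assumptions for illustration, and the sketch covers only the status choice, not whatever bogus content the 403 page would carry.

```python
# Hypothetical substrings to look for in the User-Agent header (assumptions;
# adjust to whatever actually appears in your own logs).
PROBING_CRAWLER_TOKENS = ("slurp", "baiduspider")


def status_for_missing_page(user_agent: str) -> int:
    """Pick the HTTP status to send when the requested page does not exist."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in PROBING_CRAWLER_TOKENS):
        return 403  # named crawlers get the custom "forbidden" page
    return 404      # everyone else gets the normal "not found" page
```

Keep in mind that User-Agent strings are trivially spoofed, so a check like this would need to be paired with the IP check mentioned earlier in the thread before anyone relied on it.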

