Home / Forums Index / Advertising / Paid Inclusion Engines and Topics

Paid Inclusion Engines and Topics Forum

Inktomi's DB still corrupted...
More interesting 404's

10+ Year Member

Msg#: 2403 posted 3:24 am on Dec 22, 2003 (gmt 0)

Looks like Inktomi came through again in force, and more corrupted links were the result. Interestingly, not as many adult-oriented subdirectories were requested this time as last time. Perhaps they've pruned back some of the most egregious errors?


UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

As mentioned before, the errors seem to stem from a mélange of directory names and filenames drawn from various sites. Thus, while I do have a directory called "Ventilation-John" on my site, there is not, and never has been, an HTML file called reportwz.htm anywhere on it.

One would think that Inktomi would have noticed the plethora of 404's by now if this is a widespread problem. So, is this a widespread issue?

[edited by: tedster at 5:38 am (utc) on Dec. 22, 2003]
[edit reason] cloak adult words [/edit]



WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member

Msg#: 2403 posted 5:47 am on Dec 22, 2003 (gmt 0)

I've been seeing this kind of thing on some, but not all, of the sites I watch over. I sure would feel better about PFI and trusted feed spending if this kind of thing weren't happening.


10+ Year Member

Msg#: 2403 posted 5:57 pm on Dec 22, 2003 (gmt 0)

We were sending these requests to detect server configurations. We were probably sending too many of them, and some included explicit language; we will be correcting this. You may see requests like this in the future, but hopefully without the erotic words in the directory structures. This is not an issue of any database being corrupt.


10+ Year Member

Msg#: 2403 posted 1:43 pm on Dec 23, 2003 (gmt 0)


As you may have noted, this is not the first time the topic has come up, and I hope you can illuminate for me why search engines like Baidu and Inktomi deliberately misspell URLs, make bad requests, etc.

As far as you are concerned, what do you gain by knowing the "server configuration"?

Inktomi and others fail to detect my server configs because I intercept all 404's with a Perl script that does not spit out the configuration of the site. Perhaps the adult content Inktomi requested was supposed to embarrass my site into disclosing it?
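
For readers wondering what "intercepting all 404's" might look like, here is a minimal sketch. It is not the poster's Perl script: it is a Python WSGI handler written for illustration, and `KNOWN_PATHS`, `GENERIC_404_BODY`, and `app` are hypothetical names. The idea is simply that every miss gets the same bare response, with no server name, version string, or echo of the requested path.

```python
# Hypothetical set of paths the site really serves (an assumption for this sketch).
KNOWN_PATHS = {"/", "/Ventilation-John/"}

# One fixed body for every miss, so the response reveals nothing about the server.
GENERIC_404_BODY = b"Not Found\n"


def app(environ, start_response):
    """WSGI app that answers every unknown path with the same bare 404."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PATHS:
        body = b"OK\n"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    # Identical status, headers, and body regardless of what was requested.
    start_response("404 Not Found", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(GENERIC_404_BODY))),
    ])
    return [GENERIC_404_BODY]
```

Note that whatever web server sits in front of such a handler may still add its own headers, so this only controls what the handler itself emits.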


WebmasterWorld Senior Member essex_boy us a WebmasterWorld Top Contributor of All Time 10+ Year Member

Msg#: 2403 posted 1:14 pm on Dec 24, 2003 (gmt 0)

Another thing with INK, have you noticed the time it takes to update?

From the 12th to the 20th!



10+ Year Member

Msg#: 2403 posted 9:04 pm on Feb 6, 2004 (gmt 0)

Here comes Inktomi... again.

Tim's statements last time notwithstanding, the same queries made their rounds... again. In the light of the recent Superbowl controversy, it is surprising that Inktomi continues to search for Adult-oriented files it *knows do not exist*, just to get a "server configuration".

I guess Tim isn't as effective or as forthcoming as I wanted to believe regarding the need for "server configuration" data. Considering how much I need Inktomi, perhaps it's time to give them the boot. For those who wonder, here is a sanitized version of what Inktomi was looking for.

2004-02-04 15:55:50 .../nyc-es****/Ventilation-John/reportwz.htm
2004-02-04 15:56:05 .../flowerhat.htm
2004-02-05 10:20:02 .../latina-*****-rch/Configuration.htm
2004-02-06 03:16:28 .../clock-samourai-text.htm

As always, the UA is Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...] and the IP#'s still align with the servers at Inktomi.

Pretty Pathetic.
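
One common way to check that an IP "aligns with" a crawler's owner is a forward-confirmed reverse DNS lookup. The thread does not say how the poster checked, so the following is only a sketch of that technique under stated assumptions: `ip_belongs_to_domain` is a hypothetical helper, and the lookups are passed in as parameters so the logic can be exercised without touching the network.

```python
import socket


def ip_belongs_to_domain(ip, domain,
                         reverse_lookup=socket.gethostbyaddr,
                         forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    # Step 1: reverse lookup, IP -> hostname.
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False
    hostname = hostname.rstrip(".").lower()
    domain = domain.lower()
    # The hostname must be the domain itself or a subdomain of it.
    if hostname != domain and not hostname.endswith("." + domain):
        return False
    # Step 2: forward lookup, hostname -> IPs, which must include the original IP.
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

A call such as `ip_belongs_to_domain(request_ip, "inktomi.com")` would use live DNS; the domain here is taken from the address in the quoted UA string and may not be the right one to test against.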


10+ Year Member

Msg#: 2403 posted 3:00 am on Feb 7, 2004 (gmt 0)

It has been fixed recently and I don't think you will be seeing that again.


10+ Year Member

Msg#: 2403 posted 3:05 pm on Feb 9, 2004 (gmt 0)

...except for this charming entry that just came through.

Time: 2004-02-07 08:48:39 (CST)
Originating IP:
URL Request: ../free-latina-***-it/index.html/function.nc****-ungetch.htm
UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

Either Inktomi is a lot more complex to administrate than I, a mere mortal, can comprehend, or Tim simply keeps playing the same record over and over promising fixes instead of delivering them. The fact that the URL requests do not change makes responding to them all the easier.

Interestingly, Inktomi, Baidu, et al. have yet to come clean about why they intentionally create 404's for the sake of discovering "server configurations". Perhaps we should all consider creating custom 403 pages for Inktomi, Baidu, etc. that poison their "server configuration" databases with bogus information. It would be pretty funny to see what happens when this sort of probing activity backfires.
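
The custom-403 idea could be sketched roughly as follows. Everything here is illustrative: `choose_error_response` is a hypothetical helper, the "slurp" token comes from the UA quoted in this thread, "baiduspider" is an assumed token for Baidu's crawler, and the `Server` value is deliberately made up. Whether a front-end web server would let an application override that header depends on the setup.

```python
# Case-insensitive substrings treated as probing crawlers. "slurp" comes from
# the UA quoted in the thread; "baiduspider" is an assumption for this sketch.
PROBE_UA_TOKENS = ("slurp", "baiduspider")

# Decoy headers served only to matching crawlers. The Server value is made up.
DECOY_HEADERS = [
    ("Content-Type", "text/plain"),
    ("Server", "ExampleServer/0.0"),
]


def choose_error_response(user_agent):
    """Return (status, headers, body) for a request to a nonexistent URL."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in PROBE_UA_TOKENS):
        # Matching crawlers get a 403 carrying the decoy headers.
        return "403 Forbidden", list(DECOY_HEADERS), b"Forbidden\n"
    # Everyone else gets an ordinary, uninformative 404.
    return "404 Not Found", [("Content-Type", "text/plain")], b"Not Found\n"
```

User-agent strings are trivially spoofed, so a check like this is best treated as a rough filter rather than proof of who is asking.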


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved