
Paid Inclusion Engines and Topics Forum

Inktomi's DB still corrupted...
More interesting 404's

 3:24 am on Dec 22, 2003 (gmt 0)

Looks like Inktomi came through again in force. More corrupted links were the result. Interestingly, not as many adult-oriented subdirectories were requested this time as there were last time. Perhaps they've pruned back some of the most egregious errors?


UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

As mentioned before, the errors seem to stem from a mélange of directory names and filenames across various sites. Thus, while I have a directory called "Ventilation-John" on my site, there is not, and never has been, an HTML file called reportwz.htm anywhere.

One would think that Inktomi would have noticed the plethora of 404's by now if this is a widespread problem. So, is this a widespread issue?

[edited by: tedster at 5:38 am (utc) on Dec. 22, 2003]
[edit reason] cloak adult words [/edit]



 5:47 am on Dec 22, 2003 (gmt 0)

I've been seeing this kind of thing on some, but not all, of the sites I watch over. I sure would feel better about PFI and trusted feed spending if this kind of thing weren't happening.


 5:57 pm on Dec 22, 2003 (gmt 0)

We were sending these requests to detect server configurations. We were probably sending too many of them, and some included explicit language; we will be correcting this. You may see some requests like this in the future, but hopefully not with the erotic words in the directory structures. This is not an issue with any databases being corrupt.


 1:43 pm on Dec 23, 2003 (gmt 0)


As you may note, this is not the first time the topic has come up, and I hope you can illuminate for me why search engines like Baidu and Inktomi deliberately misspell, make bad requests, etc.

As far as you are concerned, what do you gain by knowing the "server configuration"?

Inktomi and others fail to detect my server configs because I intercept all 404's with a Perl script that does not spit out the configuration of the site. Perhaps the adult content Inktomi requested was supposed to embarrass my site into disclosing it?
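For anyone who wants to try the same idea, here is a rough sketch of a catch-all 404 responder that gives nothing away. It is written in Python rather than Perl and is not the script described above; the names `generic_404` and `QuietHandler` and the "webserver" banner are made up for illustration, and this toy answers every GET with the same 404.

```python
from http.server import BaseHTTPRequestHandler

# Same body for every missing URL, so the reply says nothing about the site.
GENERIC_BODY = b"Not Found\n"


def generic_404():
    """Return (status, headers, body) for a 404 that reveals no server details."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(GENERIC_BODY))),
    ]
    return 404, headers, GENERIC_BODY


class QuietHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with the generic 404."""

    def version_string(self):
        # Replaces the default Server banner, which would otherwise
        # name the Python and http.server versions.
        return "webserver"

    def do_GET(self):
        status, headers, body = generic_404()
        self.send_response(status)
        for name, value in headers:
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To poke at it locally, hand `QuietHandler` to `http.server.HTTPServer` on a spare port. A real site would only route unmatched URLs to something like this.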


 1:14 pm on Dec 24, 2003 (gmt 0)

Another thing with INK: have you noticed the time it takes to update?

From the 12th to the 20th!



 9:04 pm on Feb 6, 2004 (gmt 0)

Here comes Inktomi... again.

Tim's statements last time notwithstanding, the same queries made their rounds... again. In light of the recent Super Bowl controversy, it is surprising that Inktomi continues to search for adult-oriented files it *knows do not exist*, just to get a "server configuration".

I guess Tim isn't as effective or as forthcoming as I wanted to believe regarding the need for "server configuration" data. Considering how much I need Inktomi, perhaps it's time to give them the boot. For those who wonder, here is a sanitized version of what Inktomi was looking for.

2004-02-04 15:55:50 .../nyc-es****/Ventilation-John/reportwz.htm
2004-02-04 15:56:05 .../flowerhat.htm
2004-02-05 10:20:02 .../latina-*****-rch/Configuration.htm
2004-02-06 03:16:28 .../clock-samourai-text.htm

As always, the UA is Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...] and the IP#'s still align with the servers at Inktomi.
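If you want to check your own logs for the same pattern, something along these lines might do as a starting point. It assumes a hypothetical tab-separated log (timestamp, status, path, user agent), which is not necessarily what your server writes, and the function names are invented for this sketch.

```python
from collections import Counter
from datetime import datetime


def slurp_404s(lines):
    """Yield (timestamp, path) for each 404 served to a Slurp user agent.

    Assumed (hypothetical) log format, tab-separated:
    timestamp, status, path, user agent.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed format
        stamp, status, path, agent = parts
        if status == "404" and "slurp" in agent.lower():
            yield datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), path


def hits_per_day(lines):
    """Count Slurp-triggered 404s per calendar day (ISO date strings)."""
    return Counter(ts.date().isoformat() for ts, _ in slurp_404s(lines))
```

Feeding it an open log file, e.g. `hits_per_day(open("access.log"))` with whatever filename you use, gives a per-day tally that makes a burst of probes easy to spot.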

Pretty Pathetic.


 3:00 am on Feb 7, 2004 (gmt 0)

It has been fixed recently and I don't think you will be seeing that again.


 3:05 pm on Feb 9, 2004 (gmt 0)

...except for this charming entry that just came through.

Time: 2004-02-07 08:48:39 (CST)
Originating IP:
URL Request: ../free-latina-***-it/index.html/function.nc****-ungetch.htm
UA: Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

Either Inktomi is a lot more complex to administer than I, a mere mortal, can comprehend, or Tim simply keeps playing the same record over and over, promising fixes instead of delivering them. The fact that the URL requests do not change makes responding to them all the easier.

Interestingly, Inktomi, Baidu, et al. have yet to come clean about why they intentionally create 404's for the sake of discovering "server configurations". Perhaps we should all consider creating custom 403 pages for Inktomi, Baidu, etc. that poison their "server configuration" databases with bogus information. It would be pretty funny to see what happens when this sort of probing activity backfires.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved