***They're Baaaaaack***

INK's large DB back on line



1:22 pm on Aug 5, 2000 (gmt 0)

10+ Year Member

I was surprised and elated this morning when my checks for SE position showed that all of my 5-28 and 5-30 submissions were back on line.

Kind of puzzled that submissions made on 7-11 and 7-28 of the exact same pages (with only a date identifier) did not make it in. If INK spiders a page that is the same as a previous submission, do they just let the earlier submission stay in and ignore the new one?

Good luck all,

7:52 pm on Aug 8, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member


INK has tons of XXX in their database - why would they pick on someone with a site for mature viewers and keep the real hardcore stuff? I am also hearing about pure "G" sites with this problem. I think there's something we are overlooking, or something that's perhaps redirecting the spider (an .htaccess file), or some config of the server. Don't know the answer, but I think it's worth chasing...


8:10 pm on Aug 8, 2000 (gmt 0)

WebmasterWorld Senior Member nffc is a WebmasterWorld Top Contributor of All Time 10+ Year Member

>blocked just by having PG. terms in the URL

Got a fresh, clean domain for my client's site; will point it at the existing site and submit. Will let you know if this makes a difference.

10:05 pm on Aug 8, 2000 (gmt 0)

10+ Year Member


I do use .htaccess files to direct to my custom 404 page, but I would not think that would do it, would it? You've seen them yourself. I'm not sure that could be the problem, since the sites that are listed use them as well and always have.

The code in the htaccess is:
ErrorDocument 404 [domainname.com ]
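
One thing that may be worth checking on the .htaccess question: per the Apache documentation, an ErrorDocument target written as a full URL makes the server answer with a redirect to that URL instead of the original 404 status, while a local path starting with / is served in place with the 404 intact. The bracketed target above is as posted, so which form was actually used is not known. A rough Python sketch of that distinction follows; `classify_error_document` is a made-up helper name, and the example lines are placeholders, not anyone's real config.

```python
def classify_error_document(line):
    """Classify the target of an Apache ErrorDocument directive.

    Returns (status_code, kind), where kind is one of:
      "external-redirect": full URL, the server redirects the client
      "local-path":        path starting with "/", original status is kept
      "message":           anything else, treated as literal text
    """
    parts = line.strip().split(None, 2)
    if len(parts) != 3 or parts[0].lower() != "errordocument":
        raise ValueError("not an ErrorDocument directive: %r" % line)
    status = int(parts[1])
    target = parts[2].strip()
    if target.lower().startswith(("http://", "https://")):
        kind = "external-redirect"
    elif target.startswith("/"):
        kind = "local-path"
    else:
        kind = "message"
    return status, kind


if __name__ == "__main__":
    # example.com and /404.html are placeholders for illustration only
    for example in (
        "ErrorDocument 404 http://example.com/404.html",
        "ErrorDocument 404 /404.html",
    ):
        print(example, "->", classify_error_document(example))
```

If the directive turns out to be the full-URL form, a spider asking for a missing page would see a redirect rather than a 404, which might be one more possibility to rule out.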


1:42 pm on Aug 9, 2000 (gmt 0)

10+ Year Member

>blocked just by having PG. terms in the URL

I disagree with the above. I have many domains with these terms in them and I have no problem getting them listed. I have found that INK will BAN a domain (not an IP), but if you farm that domain on an inactive list and then bring it back in about a month, it will be cleared and you will get it listed again. I use a month's break on a domain that will not get listed, then hit it again, and it gets listed. This is just my 2 cents.

2:05 pm on Aug 9, 2000 (gmt 0)

10+ Year Member


Thanks for your input. I'm confused as to why any of my domains, be they adult or not, would be banned. I have never done anything to cause such an action. I don't even submit the allotted maximum pages.

As I said before, many of these sites do have pages already listed in ink. Many of these sites are brand new and have never been submitted before this time. So I am still completely baffled.


5:59 pm on Aug 9, 2000 (gmt 0)

10+ Year Member

Jilly, I have had the same problems. I bought 10 domains and can only get 8 of them listed. It is strange; I can't explain that one. BTW, if you get a chance, Jilly, ICQ me. I would like to compare a few notes with you on the PG sites :) ICQ 34492767
10:31 am on Aug 10, 2000 (gmt 0)

10+ Year Member

I just checked to see what (if anything) was indexed today by Inktomi, and here's what I found. I submitted 3 pages from the same domain. Two I submitted through HotBot, Anzwers and Canada, and they got in. One I sent through Anzwers alone, and that didn't get in. Then I lost one page which I think was submitted about 2 weeks ago. I still am not getting how things are picked up, dropped, etc.

On a different note, I know this is pretty basic, but what is the code I can insert on my page if I want only Inktomi to index it and no one else? I'm not sure if this is a good idea, but at this point I'm thinking of uploading some pages from domains that aren't getting indexed in Inktomi to a new domain. However, they are indexed in other engines, and I don't want to have mirror pages or have those engines pick up these new pages. And could this code also be put on a hallway page that links to the new pages?
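
For the question about letting only Inktomi index a site: the usual tool is a robots.txt file at the site root rather than code on the page itself. A minimal sketch, assuming Inktomi's crawler identifies itself as Slurp (the name used elsewhere in this thread) and that the other engines honor robots.txt. It applies to the whole host, so it would cover a hallway page and the new pages alike. The Python below just holds the file contents and uses the standard library's urllib.robotparser to sanity-check them; `allowed` is a hypothetical helper name and `/newpage.html` a placeholder path.

```python
from urllib.robotparser import RobotFileParser

# Contents of a robots.txt that admits Slurp and turns everyone else away.
# An empty Disallow line means "nothing is off limits" for that agent.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def allowed(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `path` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /newpage.html is a placeholder path for illustration only
    for agent in ("Slurp", "Googlebot"):
        print(agent, allowed(agent, "/newpage.html"))
```

Whether keeping the other engines out this way is a good idea is a separate question; the sketch only shows that the rules say what they are meant to say.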



11:43 am on Aug 10, 2000 (gmt 0)

10+ Year Member

After getting nothing into Ink for a month, I took them out of my submissions for a week. Then I submitted 6 domains that all had the index in Ink, with other pages (25 per day), starting the 7th. The submissions for the 7th are in this morning; however, only 10 to 20 pages made it into Ink, with an average of 15. I have never seen the spider in my logs. It will be interesting to see if the 25 I submitted for each domain on the 8th get in tomorrow. I am mystified!
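
Since the spider never seems to show up in the logs, one way to check before the log is wiped is to scan it for the crawler's user-agent string. A minimal sketch, assuming an Apache combined-format access log (user agent as the last quoted field) and that the crawler's user agent contains "Slurp"; `count_crawler_hits` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Apache combined log format (an assumption; adjust for other formats):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_crawler_hits(lines, token="slurp"):
    """Count requests per path whose user agent contains `token`."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if token in match.group("agent").lower():
            parts = match.group("request").split()
            path = parts[1] if len(parts) >= 2 else match.group("request")
            hits[path] += 1
    return hits


if __name__ == "__main__":
    # Invented sample lines; the IPs and agent strings are placeholders.
    sample = [
        '192.0.2.1 - - [10/Aug/2000:12:31:00 +0000] "GET /index.html HTTP/1.0" 200 1234 "-" "Slurp/2.0"',
        '192.0.2.7 - - [10/Aug/2000:12:32:00 +0000] "GET /index.html HTTP/1.0" 200 1234 "-" "Mozilla/4.0"',
    ]
    for path, count in count_crawler_hits(sample).items():
        print(path, count)
```

Running something like this from cron just before the log is cleared would at least show whether a late-night visit is being missed.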
12:31 pm on Aug 10, 2000 (gmt 0)

10+ Year Member

I'm being paid a visit by ip3000 this morning, and it's spidering everything, including the kitchen sink. I use the meta tag <meta name="robots" content="index,follow">. Doubt it matters.

I've had the same problems as everyone else with the in and out. This is the first time I've noticed slurp hitting everything on my site in one crawl, usually just my index page two or three times a day. I generally resubmit only when pages are dropped, every other week or so, no more than 50 at a time.

Here we all sit and scratch our heads looking for the holy grail, and it's probably just random. Their database is maxed out. The new stuff comes in, good door opens, bad door closes, it's a shrubbery, wizard says what is your favorite color, if it's blue you go in the chasm!

Anyway I'll let you know if this full crawl of my site results in greater stability. Won't do any good, but...

Oops, cron just deleted my logs from last night, I may have mistaken IP3000 for Slurp... Wishful thinking?

I need to stop watching this stuff. Is there a Logwatcher's Anonymous chapter?

1:12 pm on Aug 10, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member


<< it's probably just random.

I think you hit the nail on the head with the above. As sent to you yesterday (but not the rest of the forum): [support.inktomi.com...] There is a lot of insight in this not-so-current (at least the info) doc from INK.

Jilly - yes I do use .htaccess for "404". Somehow slurp gets into that page and google lives there - but no problems so far, knock on wood. In fact, I am using the exact same code you sent me. That same file has code to convert .shtml to .html. The question was just to unearth all the possibilities. We'll get to the bottom of this about the time the rules change - but getting to the bottom of it is half the fun...


12:06 pm on Aug 11, 2000 (gmt 0)

10+ Year Member

I had an average of 15 show up out of the 25 I submitted on the 8th. I have put a 1 minute delay between submissions starting today. There is definitely a filter of some sort... tomorrow I will decrease to 15 per domain per day. I still have never seen the spider, but it may come late at night and the log is wiped by the time I get up.
8:07 am on Aug 12, 2000 (gmt 0)

WebmasterWorld Senior Member nffc is a WebmasterWorld Top Contributor of All Time 10+ Year Member

>blocked just by having PG. terms in the URL

The new domain is in the dB today.

8:20 am on Aug 12, 2000 (gmt 0)

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

NFFC - would you mind emailing me an example of the url?
1:10 pm on Aug 12, 2000 (gmt 0)

10+ Year Member


I don't know whether to be jealous of you or glad that you got it into the database. I'm still working on this but I have a lot more to do than try to win over Inktomi, which in my case seems to be a losing battle.

Care to share the details on how you did this?


3:51 pm on Aug 12, 2000 (gmt 0)

WebmasterWorld Senior Member nffc is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Hi Jill,

As much detail as I have:

The domain name that I have been unable to get listed consists of a common word with an x tagged on the end, nothing controversial or adult about the name although the site is a little risqué.

The domain has existed for about a year, belongs to a friend of mine, and I just help him out occasionally. I did some previous submissions at the back end of last year which were listed by Ink. He then lost interest in the site, and AFAIK no submissions were made between mid-December 99 and mid-May this year.

About late May I tweaked the default page and created a "site map" with links back to the site and to some doorways, the doorways share very similar content and are created by the site owner. Submissions to Ink began towards the end of May.

The site has not been listed in Ink or even visited by slurp or BSD since that time [not once, never].

After reading Littleman's comments regarding terms in the URL, I took a domain name the client had parked and pointed it at an old doorway page. This new URL was submitted on the 8th and listed today.

Both the old domain name [the one with the x] and the new "clean" one are multi-homed on the same IP.

The old domain, with the x, has an unrelated .com version; nothing is listed for that domain after late May. Do your problems stem from around that time?

8:16 pm on Aug 12, 2000 (gmt 0)

10+ Year Member

Thanks for the detailed reply! My last submissions to make it into the directory, or to be spidered at all for that matter, were approximately June 28th. Since then, zippo.