Kind of puzzled that submissions made on 7-11 and 7-28 of the exact same pages (differing only by a date identifier) did not make it in. If INK spiders a page that is the same as a previous submission, do they just let the earlier submission stay in and ignore the new one?
Good luck all,
INK has tons of XXX in their database - why would they pick on someone with a site for mature viewers and keep the real hardcore stuff? I am also hearing about pure "G" sites with this problem. I think there's something we are overlooking, perhaps something that's redirecting the spider (an .htaccess file) or some config of the server. Don't know the answer but think it's worth chasing...
I do use .htaccess files to direct to my custom 404 page, but I would not think that would do it, would it? You've seen them yourself. I'm not sure that could be the problem, since the sites that are listed use them as well and always have.
The code in the .htaccess is:
ErrorDocument 404 [domainname.com ]
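One thing that may be worth checking (a guess, since the exact URL in the directive above is not shown): when ErrorDocument is given a full URL rather than a local path, Apache answers with a redirect to that URL instead of the original 404 status, so a spider asking for a missing page never actually sees a 404. Below is a rough Python sketch for seeing what a spider would get back. The function names, the User-Agent string, and the www.example.com host are made up for illustration.

```python
import http.client


def fetch_status(host, path, port=80, timeout=10):
    """Make one GET request and return (status, Location header).

    http.client does not follow redirects, so this shows the first
    response, which is what a spider sees before deciding what to do.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Hypothetical User-Agent string, only for illustration.
        conn.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def classify_missing_page(status, location=None):
    """Describe the response to a request for a page that should not exist."""
    if status == 404:
        return "real 404: the spider sees the page as missing"
    if status in (301, 302, 303, 307, 308):
        return "redirect to %s: the spider is sent elsewhere, no 404" % location
    if status == 200:
        return "soft 404: error page served with 200, looks like a normal page"
    return "other status %d" % status


if __name__ == "__main__":
    # Placeholder host and path; substitute a URL that does not exist on your site.
    status, location = fetch_status("www.example.com", "/no-such-page.html")
    print(classify_missing_page(status, location))
```

If this reports a redirect for a page that does not exist, pointing ErrorDocument at a local path (one starting with a slash) instead of a full URL should keep the 404 status intact.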
I disagree about the above. I have many domains with these terms in them and I have no problem getting listed at all with these. I have found that INK will BAN a domain (not IP), but if you farm that domain on an inactive list and then bring it back in about a month, it will be cleared and you will get it listed again. I use a month break on a domain that will not be listed and then hit it again and it gets listed. This is just my 2 cents.
Thanks for your input. I'm confused as to why any of my domains, be they adult or not, would be banned. I have never done anything to cause such an action. I don't even submit the allotted maximum pages.
As I said before, many of these sites do have pages already listed in INK. Many of these sites are brand new and have never been submitted before this time. So I am still completely baffled.
On a different note, I know this is pretty basic, but what is the code I can insert on my page if I only want Inktomi to index it and no one else? I'm not sure if this is a good idea, but at this point I'm thinking of uploading some pages from domains that aren't getting indexed in Inktomi to a new domain. However, they are indexed in other engines and I don't want to have mirror pages or have those engines pick up these new pages. And could this code also be put on a hallway page that links to the new pages?
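For what it's worth, the usual tool for this is a robots.txt file at the root of the new domain rather than code on the page itself. Here is a sketch, assuming Inktomi's spider matches the robots.txt name Slurp (as it is called elsewhere in this thread) and that the other engines honor robots.txt. The example.com URLs and the hallway.html file name are placeholders. Python's standard urllib.robotparser is used here only to check how the rules read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the new domain:
# Slurp may fetch everything (empty Disallow), every other robot nothing.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; the hallway page lives on the same domain as the rules.
    for agent in ("Slurp", "Googlebot"):
        print(agent, allowed(agent, "http://www.example.com/hallway.html"))
```

Because the rules apply to the whole domain, a hallway page hosted on that same domain is covered too. This only keeps out spiders that actually obey robots.txt.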
I've had the same problems as everyone else with the in and out. This is the first time I've noticed Slurp hitting everything on my site in one crawl; usually it just hits my index page two or three times a day. I generally resubmit only when pages are dropped, every other week or so, no more than 50 at a time.
Here we all sit and scratch our heads looking for the holy grail, and it's probably just random. Their database is maxed out. The new stuff comes in, good door opens, bad door closes, it's a shrubbery, wizard says what is your favorite color, if it's blue you go in the chasm!
Anyway I'll let you know if this full crawl of my site results in greater stability. Won't do any good, but...
Oops, cron just deleted my logs from last night, I may have mistaken IP3000 for Slurp... Wishful thinking?
I need to stop watching this stuff. Is there a Logwatcher's Anonymous chapter?
<< it's probably just random.
I think you hit the nail on the head with the above. As sent to you yesterday (but not the rest of the forum): [support.inktomi.com...] There is a lot of insight in this doc from INK, even if the info is not so current.
Jilly - yes I do use .htaccess for "404". Somehow Slurp gets into that page and Google lives there - but no problems so far, knock on wood. In fact, I am using the exact same code you sent me. That same file has code to convert .shtml to .html. The question was just to unearth all the possibilities. We'll get to the bottom of this about the time the rules change - but getting to the bottom of it is half the fun...
I don't know whether to be jealous of you or glad that you got it into the database. I'm still working on this but I have a lot more to do than try to win over Inktomi, which in my case seems to be a losing battle.
Care to share the details on how you did this?
As much detail as I have:
The domain name that I have been unable to get listed consists of a common word with an x tagged on the end, nothing controversial or adult about the name although the site is a little risqué.
The domain has existed for about a year, belongs to a friend of mine and I just help him out occasionally. I did some previous submissions at the back end of last year which were listed by Ink. He then lost interest in the site and AFAIK no submissions were made from mid-December 99 until mid-May this year.
About late May I tweaked the default page and created a "site map" with links back to the site and to some doorways. The doorways share very similar content and are created by the site owner. Submissions to Ink began towards the end of May.
The site has not been listed in Ink or even visited by Slurp or BSD since that time [not once, never].
After reading Littleman's comments regarding terms in the URL, I took a domain name the client had parked and pointed it at an old doorway page. This new URL was submitted on the 8th and listed today.
Both the old domain name [the one with the x] and the new "clean" one are multi-homed on the same IP.
The old domain, with the x, has an unrelated .com version; nothing is listed for that domain after late May. Do your problems stem from around that time?