Just a thought. What does everyone else think?
Is it possible that the Portals have some financial incentive to use the small DB when traffic is heavy or when their quota is used up?
It just doesn't make sense to me that the DB is completely stable in the smaller portals and only erratic (for me anyway) in MSN and AOL. In my case I only have about six pages, but they are highly indexed for many key search terms.
Any ideas?
Gil
Yes, we are getting the same results. As for why, your guess is as good as mine.
MSN IS wacky and AOL is way off base for us as well. It seems (for us) that AOL is off doing its own thing, which leads us to believe they are at least aware of the Ink problem.
Does anyone have any solid info about the structure of Ink? Is it one massive DB, or is Kamikaze right about there being two DBs, one old and one new? Can anyone confirm this? If it is right, are there any sources we can use to get some kind of timescale for when the full DB will come back online?
The problem has got to be with Ink, because I'm still getting no joy across all the Ink-based SEs.
So many questions... So many headaches to cure... So many clients to placate...
Cheers,
zero6
However, this does mean that there may be some light at the end of the tunnel, as the current situation is not considered 'normal'. Now the question has got to be: when is it going to be back, and does this new submission process have anything to do with the NSI deal?
**** provides a wide range of services to meet the needs of our customers. This list provides the most common issues that must be decided.
When you receive services from ****, you may receive them by means of a Web Mastercluster or a Private Minicluster.
Web Mastercluster
A Mastercluster is a large cluster of interconnected computers, designed to support searching the entire web. When you use a Mastercluster, you share it with other **** customers. By providing shared access to **** resources, **** can deliver search services more efficiently and economically. And, our use of cluster technology allows us to easily expand capacity to grow with our customers.
Private Minicluster
A Minicluster is a smaller cluster of computers. Whereas Masterclusters typically number more than 100 interconnected computers, a Minicluster typically consists of a smaller number of computers. Miniclusters are usually private and dedicated to a single **** customer. Miniclusters are well-suited to custom search functionality and allow a higher data refresh frequency.
Mastercluster Choices
When you receive service from **** we will work to come to agreement with you on the following areas:
Database Size
**** provides database access in the following sizes:
110 million documents
54 million documents
Tiering Level
Many **** customers use multiple database sizes. **** can send a percentage of queries to one database tier and the remainder to a different database tier. This can be useful to economically provide complicated queries with access to a larger list of data than might otherwise be possible. For example, a customer might choose to send 80% of their queries to a 54-million document database and the remaining 20% of their queries to the 110-million document database.
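The tiering example above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the tier names, the `pick_tier` and `simulate` helpers, and the use of a random draw per query are all hypothetical, and the document does not say how the split is actually made. It also assumes the split is random per query, which would mean the same search can land on either database from one attempt to the next.

```python
import random

# Tier sizes are the figures quoted above; the tier names are hypothetical.
TIERS = {"small": 54_000_000, "large": 110_000_000}

def pick_tier(rng, small_share=0.8):
    """Return 'small' for roughly small_share of calls, otherwise 'large'."""
    return "small" if rng.random() < small_share else "large"

def simulate(n_queries, small_share=0.8, seed=0):
    """Count how many of n_queries would be routed to each tier."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = {"small": 0, "large": 0}
    for _ in range(n_queries):
        counts[pick_tier(rng, small_share)] += 1
    return counts

counts = simulate(10_000)
print(counts)  # roughly 80% 'small', 20% 'large'
```
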
Refresh Frequency
Currently, all **** Masterclusters are refreshed approximately monthly.
Physical Access Point
**** provides service from multiple databases in several geographical locations. **** customers are expected to connect to **** datacenters in one of the following locations:
Santa Clara, California (Exodus facility)
Herndon, Virginia (Exodus facility)
Europe (Location TBD, availability 1Q99)
Japan (contact ****)
Editorial Tagging
**** can tag a subset of a MasterCluster index. This enables a customer to create a custom search by specially identifying a useful set of MasterCluster data. For example, if a customer had a list of California web pages, we could tag them all with a single tag that enabled the customer to provide a California focused search. Only documents that already exist in the **** database can be tagged on the Web MasterCluster.
Tagging is implemented in one of two ways. In the first method, customers may attach a metaword to selected documents in the index. This method can only be applied when less than 10% of the total web index will be tagged. These tags can only be modified or changed when a page is refreshed on our normal crawl refresh schedule (approximately once per month).
The second method allows a customer to tag with a series of bits. These flags may be turned on or off for each URL in the database. Customers may then filter search results based on the values of these bit fields. Bitfield tagging may be done either weekly or semi-weekly.
Customers can choose whether or not to allow other **** customers to access the customers’ tags.
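The bitfield tagging described above can be sketched in Python as follows. This is a guess at the general idea, not the vendor's implementation: the flag names, the in-memory `index` dictionary, the example URLs, and the `set_flag` and `filter_results` helpers are all hypothetical.

```python
# Hypothetical bit flags; the names and values are illustrative only.
CALIFORNIA = 0b001
MEDICAL = 0b010
NEWS = 0b100

def set_flag(index, url, flag, on=True):
    """Turn a flag on or off for a URL that already exists in the index."""
    if url not in index:
        # Mirrors the rule that only documents already in the database can be tagged.
        raise KeyError(f"{url} is not in the index")
    index[url] = (index[url] | flag) if on else (index[url] & ~flag)

def filter_results(results, index, required):
    """Keep only the results whose bitfield has every required flag set."""
    return [u for u in results if (index.get(u, 0) & required) == required]

# Hypothetical index mapping each URL to its bitfield.
index = {
    "a.example/ca-clinic": CALIFORNIA | MEDICAL,
    "b.example/ny-news": NEWS,
    "c.example/ca-news": CALIFORNIA | NEWS,
}

california_only = filter_results(list(index), index, CALIFORNIA)
print(california_only)
```

A California-focused search like the one in the example would then be an ordinary search followed by `filter_results(..., CALIFORNIA)`.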
Private Minicluster
****’s Private Minicluster service delivers the maximum amount of flexibility and control to ****’s customers.
We will come to agreement with you regarding the following areas:
Database Size
Private Minicluster databases can be any size, and actual implementations range from 20,000 documents to 20 million documents. Generally, we will need to estimate the number of pages in advance to determine your hardware needs.
Data Source
**** can readily accommodate the following data sources:
Customer provides **** with a list of URL’s which **** crawls to generate a search index
Customer provides **** with web content that has been preprocessed for indexing. This file is indexed directly, without crawling.
Refresh Frequency
Standard **** refresh options include the following:
Monthly refresh
Weekly refresh
Daily refresh
Hourly refresh
Editorial Tagging
**** can apply tags to Minicluster data. This enables a customer to create multiple custom searches out of one piece of crawling infrastructure. Tagging is generally done in conjunction with crawling and indexing.
Physical Access Point
Currently, all Private Miniclusters are deployed in ****’s California facilities.
As I previously mentioned, my pages are ALL stable in Hotbot, Canada, Anzwers, Freeserve, Iwon, Looksmart, and GOO.jp. By this I mean the positions are all within two or three ranking spots each time I do a test search.
Only MSN has been erratic lately. AOL is still on the small/bad DB for us. In MSN I get the large (good) DB sometimes, and probably fifty percent of the time the small DB that only contains my home page.
This of course only relates to my small number of pages, but they are highly optimized for several hundred key search terms (obscure medical terminology) and I am currently in the top ten for most of them. There is definitely a small database, as that is where I find my single home page when everything else in INK tanks.
Regards,
Gil
The Ink-based portals are obviously using this system: one big mastercluster crawls and indexes the entire web, and the portals buy into a certain number of documents to filter and index, with the aim of providing unique and relevant results.
However, if one part of this system fails, such as the DB provider losing any part of the mastercluster network, then the portal will only receive partial results.
I would imagine the reason that Gil and Kamikaze are getting different results is that there must be some kind of breakdown in providing larger quantities of documents to all the portals. Gil's pages must be coming from smaller, more stable DBs being provided to and filtered by Hotbot, Canada, Anzwers, Freeserve, Iwon, Looksmart, and GOO.jp. This might be due to his having more obscure content. MSN and AOL could be buying larger chunks of the Ink mastercluster and therefore relying more heavily on the larger quantities of documents that Ink could be having the problem with. This would explain why Gil is getting nothing out of MSN and AOL but gets results from the others.
Kamikaze is in the same situation as myself: pages that once ranked well have now been totally dropped or buried by all Ink-based SEs. It could be because our keyword content is more popular, forcing the portals to get results from the larger document DB they have with Ink.
So it could be that, for one reason or another, Ink is having problems providing larger DB info to its clients. I can only assume that this is a temporary situation, and either a technical breakdown or a clear-out.
This is of course all theory, so don't hold me to it.
Please shoot me down if need be. Would like to think I've got my head around it.
Cheers,
zero6