Forum Moderators: IanTurner & engine

Market Research On Establishing A UK SE/Directory

         

jmccormac

4:52 am on May 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I am doing some market research to establish whether it is better to start a new UK search engine/directory or supply feeds of new UK websites to existing UK search engines and directories on a monthly (or weekly) basis.

The project has already detected 1.2 million UK com/net/org websites, and the Dmoz.org UK dataset would give it another 144K or so sites. This float would probably put it close to the Google class in terms of depth, and the UK lists are updated every month.

What I have to work out is whether there is a market for the feed to existing UK directories/SEs or there is a market in selling 'searches' on a new UK search engine to existing directories and SEs. Have any of the UK directory/SE operators here any ideas on this?

Regards...jmcc

Bobby_Davro

10:48 am on May 26, 2003 (gmt 0)

10+ Year Member



The problem is that the only UK crawler that I know of is Mirago. There are plenty of directories out there, some are good, others are poor; but how will they use your raw URL lists?

I think that if you can set up a UK specific search engine, then other SEs may be interested in using the results. If the results are good enough, you may be able to offer the results to Espotting and Overture as their backfill.

How would your list of URLs be helpful to you starting a new UK directory?

If you are trying to get started as a new engine, then you will be competing with 100+ UK specific directories and "engines" in the UK. Having your own search database will be a big advantage, but it will still have to provide better (or at least comparable) results than the others. Can you do that?

jmccormac

6:12 am on May 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The problem is that the only UK crawler that I know of is Mirago. There are plenty of directories out there, some are good, others are poor; but how will they use your raw URL lists?

The URL lists could be tweaked to provide title/keywords/description where available. Categorisation may be more time-consuming, but if the directory follows the Dmoz architecture, it would probably be possible to add it. This would be the most people-intensive part of the project, as the rest is highly automated.
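To make the title/keywords/description idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration and not the project's actual tooling: the names `PageMetadataParser` and `extract_metadata` are made up for this example, and it assumes the page HTML has already been fetched.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collects the <title> text and the keywords/description meta tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "keywords": "", "description": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr_map = {name: (value or "") for name, value in attrs}
            name = attr_map.get("name", "").lower()
            if name in ("keywords", "description"):
                self.metadata[name] = attr_map.get("content", "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.metadata["title"] += data


def extract_metadata(html_text):
    """Return a dict with title, keywords and description ('' when absent)."""
    parser = PageMetadataParser()
    parser.feed(html_text)
    parser.close()
    parser.metadata["title"] = parser.metadata["title"].strip()
    return parser.metadata
```

Fields that a page does not supply come back as empty strings, which matches the "where available" caveat.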

I think that if you can set up a UK specific search engine, then other SEs may be interested in using the results. If the results are good enough, you may be able to offer the results to Espotting and Overture as their backfill.

That is something I had not considered. It is an interesting option though.

How would your list of URLs be helpful to you starting a new UK directory?

The main problem with sites relying on Dmoz/ODP data is that the listed sites often do not exist any more. Since the Dmoz float would be actively spidered, the quality of that float would be better. It would provide an immediate footprint with the smallest possible outlay of resources. Adding the CNO sites would then begin to grow the directory.

If you are trying to get started as a new engine, then you will be competing with 100+ UK specific directories and "engines" in the UK. Having your own search database will be a big advantage, but it will still have to provide better (or at least comparable) results than the others. Can you do that?

Competing with that many directories/engines will be difficult. However, I think that the results will be better because of more active spidering and a better selection process. Having a 'global' view of websites means that the project also has a lot more data on servers and clusters of websites. This makes it easier to remove linkswamps. The unique thing about this SE would be that it would actively acquire new UK websites that are wrongly classified as not being UK by Google and the others. The main rule that Google seems to use is that a UK website has to either have a .uk extension or be hosted on UK IP space. By targeting UK CNO sites that are not hosted in the UK (in addition to the existing UK-based CNOs), it would make the results somewhat better. I think that it would be filling a niche that the big engines cannot fill just yet.
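As a rough illustration of the classification described above, the sketch below applies the two rules the big engines appear to use (.uk extension, UK IP space) and then a third check for UK com/net/org sites hosted abroad. Everything named here is hypothetical: the address block is a documentation range and not a real UK allocation, and the hand-maintained `KNOWN_UK_CNO` set is only an assumed stand-in, since how such sites would actually be identified is not described.

```python
import ipaddress

# Hypothetical "UK" address block (an RFC 5737 documentation range, not a real allocation).
UK_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

# Hypothetical list of UK-owned com/net/org sites that are hosted outside the UK.
KNOWN_UK_CNO = {"example-shop.com"}


def classify_site(hostname, ip_address, uk_networks=UK_NETWORKS, known_uk_cno=KNOWN_UK_CNO):
    """Return a label saying why (or whether) a site counts as a UK site."""
    host = hostname.lower().rstrip(".")
    # Rule 1: a .uk extension.
    if host.endswith(".uk"):
        return "uk-tld"
    # Rule 2: hosted on UK IP space.
    addr = ipaddress.ip_address(ip_address)
    if any(addr in net for net in uk_networks):
        return "uk-hosted"
    # Extra step: UK CNO sites that neither rule above would catch.
    if host in known_uk_cno:
        return "uk-cno-offshore"
    return "not-uk"
```

With these sample values, `classify_site("example-shop.com", "198.51.100.7")` returns `"uk-cno-offshore"`, which is the case the first two rules alone would miss.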

Regards...jmcc

Bobby_Davro

10:48 am on May 27, 2003 (gmt 0)

10+ Year Member



I am not familiar with "CNO" - can you translate please?

I think that you may be better off separating the services that you are looking to offer. Spidering the ODP data for UK SEs is very different to offering a million uncategorized URLs.

The URL lists could be tweaked to provide title/keywords/description where available

The main problem is that the title and description that spidering provides isn't up to the standards of ODP descriptions. In fact it can be misleading, poor and even offensive. Mixing such results with ODP listings would dramatically reduce the quality. Hence, each site would have to be human reviewed anyway.

In terms of a directory, it doesn't seem that you are offering a great deal. The main problem for the UK directories is not finding UK sites to add, but funding the editors to add the sites. Anything people-intensive, like editing, is expensive, and not many of the UK search/directory sites can afford it properly.

I think that your database of URLs is only useful from a spidering perspective. Either it can be used to form a new search engine from scratch, or you could use it to provide a UK filter for a larger engine, such as Fast or Inktomi, with UK sites being flagged as such.

You could do what Fast, Google and Inktomi do and charge a fee per thousand searches performed. I believe that Fast, for example, charge a $50K setup fee and $2 per thousand searches. You could easily undercut that, perhaps even waiving the setup fee. That would make it much more attractive to the likes of Espotting.
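To put rough numbers on that pricing model, here is a small sketch. The defaults are the Fast figures as I recall them above and are not confirmed pricing; the function name and the undercutting rate in the usage note are made up for illustration.

```python
def licensing_cost(searches, setup_fee=50_000, rate_per_thousand=2.0):
    """Total cost in dollars: a one-off setup fee plus a per-thousand-searches charge."""
    return setup_fee + (searches / 1000) * rate_per_thousand
```

For a partner sending one million searches, `licensing_cost(1_000_000)` gives 52000.0, while a hypothetical undercut with no setup fee and $1.50 per thousand, `licensing_cost(1_000_000, setup_fee=0, rate_per_thousand=1.5)`, gives 1500.0. At low volumes the setup fee dominates, which is why waiving it matters so much.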