[edited by: Brett_Tabke at 8:12 pm (utc) on June 16, 2004]
[edit reason] [webmasterworld.com...] [/edit]
I'm not sure where it got its initial float of UK websites, but I would not be surprised if it was from the Dmoz RDF dump. It is very much a work in progress, though a fair bit of crawling seems to have been done to get it to this stage.
Regards...jmcc
[edited by: Brett_Tabke at 8:11 pm (utc) on June 16, 2004]
[edit reason] no linkless urls please anywhere on webmasterworld [/edit]
To be honest, in the UK market there is a need for a new engine. I think Google does very well at answering most queries, but there is still a need for a stand-alone UK engine. I believe the ones that exist at present are not doing that well.
I'm almost tempted to set up a UK search engine. :) But it looks very much like Google et al have reached the limits with their simplistic localisation based on IP and ccTLD. The last time I ran a test on UK domains/sites in the com/net/org, I had a list of approximately one million websites.
Maybe with a bit of modification and good marketing it can definitely do pretty well in the UK market.
It probably needs a significant budget to do any decent marketing in the UK. I am not sure about the modifications though, since I don't know what kind of spider cycle it is on. The other key factor is the SE's acquisition strategy: how it detects new sites. It has got to have some kind of edge over the competition, and though the initial results look promising, there is little to sustain it in the face of well funded and well organised competition. It is a tough business and most SE start-ups seem to have operational lifetimes of about fifteen months, especially if they cannot monetize the results. There is no indication that the SE is commercial.
Regards...jmcc
The UK is a complex country to model. It is a major hoster country and a lot of non-UK websites are hosted in UK IP space. This means that you can often find websites from several countries hosted on a large UK hoster. The simple nameserver location approach may give you a coarse set of domains, but it is necessary to go deeper, to the domain level, for precision. This is why level 2 and level 3 analysis is necessary to produce a good search index for any country.
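A rough sketch of that layered idea, in Python. The post does not say what the levels actually involve, so everything here is an assumption made for illustration: the `Site` record, the postcode and +44 phone patterns, and the rule that a .uk name is accepted without a content check are all hypothetical. The hosting country is assumed to come from an IP-to-country lookup done elsewhere.

```python
import re
from dataclasses import dataclass

# Hypothetical content signals; the thread does not define the real checks.
UK_POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b")
UK_PHONE = re.compile(r"\+44\s?\d")


@dataclass
class Site:
    domain: str
    hosting_country: str  # assumed result of an external IP-to-country lookup
    page_text: str


def coarse_candidate(site: Site) -> bool:
    """Coarse pass: keep anything under .uk or hosted in UK IP space."""
    return site.domain.endswith(".uk") or site.hosting_country == "GB"


def content_signals(site: Site) -> int:
    """Domain-level pass: count UK-looking signals in the page text."""
    signals = 0
    if UK_POSTCODE.search(site.page_text):
        signals += 1
    if UK_PHONE.search(site.page_text):
        signals += 1
    return signals


def is_uk_relevant(site: Site) -> bool:
    """Accept .uk names; require a content signal for UK-hosted non-.uk sites."""
    if not coarse_candidate(site):
        return False
    if site.domain.endswith(".uk"):
        return True
    return content_signals(site) >= 1
```

The point of the second pass is that a com/net/org site sitting on a UK hoster is only kept when the page itself looks UK related. UK relevant sites hosted outside the UK are not caught by this sketch at all, which is part of why the real problem is harder than it looks.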
Regards...jmcc
[edited by: jmccormac at 4:58 am (utc) on June 16, 2004]
Regards...jmcc
UKWizz - 0 results, Google UK (on UK only) 963 results.
It needs a lot more in the index if it has not got one page with that term in it, as it is quite popular.
I don't know who runs the site, but they could get legal action from Google. The whole site is 'passing off' (which is a legal term that can get you sued) Google's design. Take a look:
1) First result (Google has I'm feeling lucky)
2) Cache results (take a look at the text at the top of the page - it is clear that UKWizz have copied and changed Google's text - copyright infringement)
That was just from a very quick couple of searches. Needs a bit more thought on the design front. (Esp. first result - does anybody ever use it anyway? - accidents don't count!)
1) First result (Google has I'm feeling lucky)
2) Cache results (take a look at the text at the top of the page - it is clear that UKWizz have copied and changed Google's text - copyright infringement)
It's ASPseek which does all that originally (not UKWizz, well, it uses ASPseek so it does it too): [aspseek.com...]
Sid
Now, back on topic. A vortal is a tough thing for a one man show to run. The results have a tendency to get contaminated with side topics pretty easily. Hats off to [ukwizz.com...] for giving it a try.
What is the purpose of not giving its contact info?
Also, from looking at the format of the search results, they look like they can't be their own, because who on earth would write descriptions in that manner?
No company would stagger its descriptions in that way.
Company descriptions look too messy, which would reflect badly on a company, plus if you input companies like that, then you may as well make the descriptions tidy and readable in the first place.
It's a feed.
"Company descriptions look too messy, which would reflect badly on a company, plus if you input companies like that, then you may as well make the descriptions tidy and readable in the first place. It's a feed."
This is how searches really look from a search engine/search engine results point of view. There is a serious difference between running a simple directory and running a search engine. When you are running a simple directory, you can control the quality of the data included and render it in a more readable fashion. With search results, you can only include or block sites. You can limit the spidering level but the reality is that the vast majority of websites are NOT search engine optimised. This is not a mere anecdotal estimate, this is based on hard data from running country level search engines for the last four years or so. And even more surprising, the majority of websites in any given country are brochureware sites that may have one or two updates per year.
Running a SE covering the UK is a tough operation and it is not the same as running a little directory where you have a lot of control. You are dealing with at least 5 million potential *.uk websites (though this may drop to approximately 2 million active websites) and at least another few million com/net/org websites. At a guess, a UK SE could be looking at approximately 4 million top level sites before it even gets around to personal subdomain/directory websites. Sorting these sites, and figuring out what is and is not a UK relevant website is very difficult. And then you have to factor in the processes of acquisition of new sites and spidering.
Regards...jmcc [1]
[edited by: IanTurner at 8:48 am (utc) on June 21, 2004]
[1] Luckily it is a small country.
[edit reason] language edit [/edit]
I can't imagine what it would be like to own a search engine and get accused of using a feed after all the hard work you've done for it -- absolutely terrible.
Also, for those people who think UKWizz was set up within 5 minutes using ASPseek - you're absolutely wrong.
Because for all I know, ASPseek does not include an IP/domain filter - and the people who own UKWizz have done a fine job with that.
Sid
[edited by: IanTurner at 8:50 am (utc) on June 21, 2004]
Exactly. Taking a challenge is good - but doing it is excellent.
What some don't understand is that not everybody is Google, or Yahoo, or MSN, or Ask.
Hats off to the people who take challenges as such, even knowing they have bigger and better competition.
I actually wish there was a funding organisation for this kind of project -- imagine what the world could be! ;)
Sid
I imagine that the other guys wanted to establish that it wasn't a brochure site.
And I have taken on a project as large as this.
[edited by: IanTurner at 8:51 am (utc) on June 21, 2004]
"And I have taken on a project as large as this"
With all due respect Christopher, I don't think that you even have any idea of the magnitude and complexity of a country level search engine project.
Just in the initial dataset, excluding the personal and subdomain websites, you are looking at approximately 30 million potential sites. It is a case of reducing this set, through various techniques and then making decisions on what to spider and what not to spider. The big problem is that by the time you start spidering these sites, some sites have been added to the initial dataset and some have been removed. And of course since the UK is a key hoster country, a lot of the sites hosted in the UK are not necessarily UK relevant. So just on the basic acquisition phase, it is very like cryptanalysis. (Not James Bond but rather the Purple/Enigma/Bletchley Park/Breaking Sky stuff.)
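The moving-target problem described above (sites being added to and removed from the dataset while spidering is under way) can be pictured as a comparison of two snapshots of the candidate list. This is only an illustrative Python sketch; the function name `plan_crawl_updates` and the three-way split are assumptions, not anything described in the thread.

```python
def plan_crawl_updates(previous: set, current: set) -> dict:
    """Compare two snapshots of candidate domains and plan the next crawl."""
    added = current - previous      # new since the last snapshot: spider these
    removed = previous - current    # gone since the last snapshot: drop from index
    unchanged = previous & current  # still present: revisit on the normal cycle
    return {
        "to_spider": sorted(added),
        "to_drop": sorted(removed),
        "to_refresh": sorted(unchanged),
    }


if __name__ == "__main__":
    before = {"a.co.uk", "b.co.uk", "c.com"}
    after = {"b.co.uk", "c.com", "d.org.uk"}
    print(plan_crawl_updates(before, after))
```

With tens of millions of candidates the same comparison would have to run on sorted files or in a database rather than on in-memory sets, but the logic is the same.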
Running a country level search engine is a very difficult thing to do. It takes a lot of guts, a lot of work and a lot of on-going research. This is why there are so few country level search engines.
Regards...jmcc
They spent years searching for revenue streams, and are now growing very quickly indeed. If their plan is anything to go by, some reasonably deep pockets are required.
Correct me if I am wrong, but aren't they the only genuine UK index of any serious size?
I think gigablast (maybe someone else?) has a fascinating diary of how they built their engine, mentioning that at a bare minimum they needed 7 beefy servers with 600 GB of storage on each to do the job... and that's before they looked at documents such as PDFs etc.
I take my hat off to anyone that is doing this. Not an easy job at all. In Christopher's defence though, is this a genuine index? I haven't had the chance to have a look.
But I'm curious to know why they don't outsource to hosting services. It would cut their (Mirago) personal server costs wouldn't it?
--------------------------
"With all due respect Christopher, I don't think that you even have any idea of the magnitude and complexity of a country level search engine project"
--------------------------
hmmmm, although I'm not a techy person - I think it's unfair to suggest the above, mainly because you are talking in vague terms, and without any knowledge of my operation or the people involved in the running of it.
I'm not going to explain my technology, plans or naming of clients etc on here, as members have previously accused me of lying etc.
But I guess we shall see, what we shall see.
Outsource of PR and marketing to external company
And Specialist in Finance Funding and business partnership
The above would only skim the surface and would also require sufficient funding to allow 3 - 6 years before a break-even point
and, most important, lots of skill and dedication and a large sprinkling of luck.
Having looked at the current offering, there is much work to do with data collection and manipulation, but as anyone on here could verify, if we looked at our original web offering 3 years later, we learn to adapt and grow, to change and change again.
I hope that those involved with UKWIZZ have the finance, tenacity and skills to do so,
as there is the room and traffic in the UK market to support a new search engine.
best of luck
steve