Forum Library, Charter, Moderators: IanTurner & engine

UK Search and Internet Marketing News Forum

This 36 message thread spans 2 pages.
Worth Building a UK Search Engine?
Build a UK only search engine?
jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 1:55 pm on Apr 29, 2012 (gmt 0)

Given Google's faffing about with its Panda and Penguin updates, is it worth building a UK search engine for just UK websites? This, as yet hypothetical, search engine would just have UK websites and UK targeted sites rather than sites from the rest of the world.

Regards...jmcc

 

Rosalind

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4447195 posted 2:31 pm on Apr 29, 2012 (gmt 0)

I don't see the utility, given that major search engines already offer the option to filter results by country.

Also, given the way UK domains are sold, it doesn't suggest the opportunity to cut out spam by examining the link graph from a more restricted set of domains, such as you might get with Irish ones.

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 3:10 pm on Apr 29, 2012 (gmt 0)

Yes but the user has to select the option to filter by country and Google's filter by country performance is not that good. Many US and Canadian cities and towns also have British equivalents and these results can often show up in the SERPs.

What I had in mind was a search engine with just UK and UK targeted websites so that the user could actually find what they wanted without the frustration of excluding out of area results. The initial set would be about 4 million .UK sites and a similar number of gTLD sites. A process that I've developed would cull spam/PPC parked and holding pages (this would significantly reduce the size of the initial set) from this set automatically so that even the early version of the SE would have relatively good content. The main problem at the moment is scalability.
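jmcc's culling process isn't described in detail, but a minimal sketch of the kind of content heuristics commonly used to flag parked/PPC holding pages might look like this (the phrase list and size threshold are illustrative assumptions, not his algorithm):

```python
import re

# Phrases that commonly appear on parked or holding pages; this list
# and the size threshold below are assumptions for illustration only.
PARKED_PHRASES = (
    "this domain is for sale",
    "buy this domain",
    "related searches",
    "domain parking",
    "under construction",
)

def looks_parked(html, min_content_chars=500):
    """Crude flag for parked/PPC holding pages.

    A real culling pipeline would also look at nameservers, hosting
    IPs and link patterns; this only inspects the page text.
    """
    text = re.sub(r"<[^>]+>", " ", html).lower()
    if any(phrase in text for phrase in PARKED_PHRASES):
        return True
    # Parked pages usually carry very little real content.
    return len(text.strip()) < min_content_chars
```

Running a filter like this over the initial domain set before spidering is one way such a cull could shrink the index cheaply.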

And I am somewhat familiar with the way UK domains are sold. I certainly track a lot of them. :)

Regards...jmcc

Staffa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4447195 posted 3:28 pm on Apr 29, 2012 (gmt 0)

If you could manage to filter out the '150' review sites and the '150' price comparison sites before getting to an actual vendor's site then I'm your first user ;o)

I always shop online and know exactly what I want when I do but having to trawl through umpteen (for me) useless sites before getting to the first actual vendor is aggravating.

Informational sites (for me) can be located anywhere in the world as long as they present relevant information, though even those are hard to find these days.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4447195 posted 3:42 pm on Apr 29, 2012 (gmt 0)

And you've got how many million quid to get started?

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 4:00 pm on Apr 29, 2012 (gmt 0)

If you could manage to filter out the '150' review sites and the '150' price comparison sites before getting to an actual vendor's site then I'm your first user ;o)
The algo that I've developed identifies those sites and deals with them in a way that accentuates good sites.

And you've got how many million quid to get started?
Just a worried creditcard, a few servers and a pile of data.

Regards...jmcc

scooterdude



 
Msg#: 4447195 posted 4:10 pm on Apr 29, 2012 (gmt 0)

Why not comparison sites?

Why not review sites ?

And if the user knows what they want already, and knows how to type amazon.c… or ebay.co…

what do they need you for exactly :)

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 4:23 pm on Apr 29, 2012 (gmt 0)

They may know what they want but not where to find it. :)

Regards...jmcc

Andy Langton

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4447195 posted 4:36 pm on Apr 29, 2012 (gmt 0)

In terms of whether a niche search engine could be useful (and even better than Google in its niche), I believe that it could.

But I think there are problems in terms of acquiring an audience.

Search is an activity everyone is accustomed to, and my belief is that searchers are used to picking a search engine, and sticking with it. I don't know many people that go to more than one website to search for things, other than perhaps on-site searches at Amazon and similar.

So, to be successful, it would seem you would need to both take people away from Google, and then send them back to Google when their search was not location-specific - without them just sticking with Google! I think that's a pretty big challenge!

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 4:47 pm on Apr 29, 2012 (gmt 0)

There's the Social Media element that could help gain an audience but what would really do it would be better SERPs than Google's current SERPs. It is a challenge but that's part of the fun.

Regards...jmcc

londrum

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4447195 posted 5:00 pm on Apr 29, 2012 (gmt 0)

do you think there's still room for an edited search engine, like the yahoo directory of old?

let's be honest, 99% of the web is a load of rubbish. and a guarantee that every site has been human checked for quality might be a good selling point.

i know that the yahoo directory ultimately failed. but the web is a million times bigger now, with a billion times more lousy pages in it. maybe it's time for a human edited search engine to make a comeback?

you could also make it socially-driven -- with people rating the sites themselves, rather than your staff. a bit like stumbleupon and (sorry to say it) google plus

Staffa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4447195 posted 6:36 pm on Apr 29, 2012 (gmt 0)

Why not comparison sites?
Why not review sites ?
And if the user knows what they want already, and knows how to type amazon.c… or ebay.co…
what do they need you for exactly :)

For me personally, comparison sites are a waste of time. If I can find a few direct vendors of what I'm looking for, I can do my own comparison on all points offered, not just price.
As for review sites, how many are actually unbiased and/or unpaid-for?

Amazon and Ebay, I haven't bought there for years, too big, too many fly-by-nights, etc

Over the years I have collected some good small/medium business sites for various products that I use regularly. They offer a good price, good service and no product complaints, and I return whenever I need something that fits their range.

I don't know many people that go to more than one website to search for things

I routinely use 4 SEs for my searches and that does not include the big 3 individually ;o)

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 6:55 pm on Apr 29, 2012 (gmt 0)

do you think there's still room for an edited search engine, like the yahoo directory of old?
It is a good idea, but I think that a high quality index is a good starting point. The problem is that the set of sites is potentially in the millions. That means having to rely on some algorithms to pare that set down to something that could be dealt with manually, but it would be a Google-killer idea if it could be marketed as a manually approved index.


i know that the yahoo directory ultimately failed. but the web is a million times bigger now, with a billion times more lousy pages in it. maybe it's time for a human edited search engine to make a comeback?
Perhaps.

you could also make it socially-driven -- with people rating the sites themselves, rather than your staff. a bit like stumbleupon and (sorry to say it) google plus
Jimbo Wales tried something similar with his Searchwikia venture but that bombed because it did not pay any attention to search quality and adopted the Google approach of spidering everything and hoping that the public would help establish the quality of the sites. Naturally it failed. What would make this SE different is that it would only spider UK sites thus minimising the GIGO effect.

Regards...jmcc

Andy Langton

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4447195 posted 7:58 pm on Apr 29, 2012 (gmt 0)

I routinely use 4 SEs for my searches and that does not include the big 3 individually ;o)


Yes, but you're one of the "not many" ;)

I've found that even the "tech savvy" have rather basic search skills, from the point of view of someone in the search industry.

"Better than Google" is a selling point - but then it's also the one that Bing or even Cuil (remember them?) used. But I think it;s the right ambition for any search startup. But "different from Google" should be part of the mix - not sure if UK-only is enough on that score.

scooterdude



 
Msg#: 4447195 posted 9:52 pm on Apr 29, 2012 (gmt 0)

Cuil is a sobering reminder.

Back in the day, when I took my first steps online, there were multiple 'portals'.

They're all gone.

Maybe there will be a new wave of entrants.

dstiles

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4447195 posted 10:52 pm on Apr 29, 2012 (gmt 0)

I've been running a town-based directory since 1996 and a UK shopping-sites directory since 2002. I am not claiming these two are important, merely using them as a jumping-off place for the comments below.

We get a few submissions per week on each. Both are fairly small but all submissions are manually checked. Scaling this up would be reasonable only if the directories made money, which they don't. So, a way of monetising would be necessary.

In my experience it's easier to get submissions than visitors - SEOs take the trouble to find the sites and submit; users often don't know about the sites and never find them.

After accepting a site there is the problem of periodically checking to see if it is still "live". Because of parked domains which return a 200, re-used domains (one I recall used to be a pet site and is now an "adult" site) and a variety of other points of failure it is not easy to check for such things automatically. A file of (eg) parking companies (with IPs) would need to be included in the link-checker and interpretation of return codes and inherent delays would need to be taken into account. I haven't yet found a link checker that will do this so one would have to be built that checked not only the above but altered site content as well. My own solution is to periodically go through the links taking out as much as klink can find as dead and then check the rest manually, but since the directories are now almost "hobby" sites the checks do not get made that often. As I say, a good automatic site checker would need to be built and (I think) then backed up by a human.
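A minimal sketch of the kind of checker described above, combining a parking-company IP list with redirect inspection, might look like this (the IP addresses are documentation placeholders and the categories are illustrative; detecting altered site content would need a further content-diff step not shown here):

```python
import socket
from urllib.parse import urlparse
from urllib.request import urlopen

# IPs of known parking companies would be maintained as a separate file;
# these addresses are documentation placeholders, not real parking hosts.
PARKING_IPS = {"203.0.113.10", "198.51.100.7"}

def check_site(url, timeout=10.0):
    """Classify a listed site as 'ok', 'parked', 'moved' or 'dead'.

    A bare 200 is not enough: parked domains answer 200 too, so the
    resolved IP is checked against the parking list first. Re-used
    domains often show up as an off-site redirect.
    """
    host = urlparse(url).hostname
    try:
        ip = socket.gethostbyname(host)
    except OSError:
        return "dead"  # does not resolve at all
    if ip in PARKING_IPS:
        return "parked"
    try:
        with urlopen(url, timeout=timeout) as resp:
            final_host = urlparse(resp.geturl()).hostname
    except Exception:
        return "dead"  # connection refused, HTTP error, timeout...
    return "moved" if final_host != host else "ok"
```

Anything other than 'ok' would then be queued for the human re-check dstiles suggests.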

However, the main problem is lack of people who actually use the directories. Traffic is low since few people actually type (eg) uk directory into a major search engine in the first place, so SE listings are useful only if the directory is listed for important (and probably most) keywords. Keeping people returning to the site is a serious task.

Publicity is a problem. Without money behind such a scheme publicity (eg press, TV, radio, web) is unlikely.

Someone mentioned above that only e-commerce sites need be included. I disagree: people need to find local health, community, leisure and informational sites as well. Without those any UK directory/SE would not be very useful. I can't remember the last time I ever used a comparison site other than to verify it, but I think they should also be included. And what about auction sites and similar? Ebay isn't the only auction site used in the UK.

And what about "no records found"? I think at least data should be fed from one or more engines direct through the site for items not found, world sites (which WILL be requested!) and similar: if you let them off the site they may not come back. DuckDuckGo recently became a semi-meta engine because of the problem in maintaining an index themselves (from memory - there is info on the site).

As a response to the above, I would also require, if using such a directory/SE, that "localised" searches could be easily made (eg using drop-downs), at least by town and possibly down to postcode for local-only sites (eg local shops, takeaways, restaurants etc).

Finally, there are already some UK-based engines around. These should certainly be examined.

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 12:11 am on Apr 30, 2012 (gmt 0)

Cuil is a sobering reminder
The epitome of the expression "if you can't be a good example then it is best to be a terrible warning"? :)

In my experience it's easier to get submissions than visitors - SEOs take the trouble to find the sites and submit; users often don't know about the sites and never find them.
I've solved the submissions problem to some extent. On the monitoring aspect, the algorithms will detect any change in a site and its hosting.

The theoretical side is perhaps the easiest part. The hard parts are going to be scaling it and publicising it, and, inevitably, getting users.

Regards...jmcc

Staffa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4447195 posted 12:58 am on Apr 30, 2012 (gmt 0)

inevitably, getting users

for that we can all help by advertising for your SE on our sites :o)

mojeek



 
Msg#: 4447195 posted 4:57 pm on Apr 30, 2012 (gmt 0)

Hi, interesting discussion.

I agree the main problem will be getting the audience. I know there are some people out there looking for a truly UK-specific, UK-founded engine, but unfortunately most seem to prefer to stick with the obvious. So that's probably where the millions would be needed - not the implementation, if done efficiently.

But best of luck if you decide to go ahead.

Marc

jmccormac

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



 
Msg#: 4447195 posted 2:14 am on May 1, 2012 (gmt 0)

Thanks all. I think the real USP for a good SE is a combination of new, relevant sites and clean content. In the last ten years, Google has been the 800 pound gorilla, but it is the quality of content that has really hit the small and large search engines that sought to compete with Google/Yahoo/Microsoft. The Searchwikia venture with its Social Media involvement was a good test, but it paid very little attention to search index quality: all sorts of junk (in addition to compromised sites) went into the deployed index. The other aspect is that many small SEs adopt the same blind crawler approach used by Google and the larger SEs. This is akin to the brute force attack in cryptography - trying every key combination to break a code. It requires a lot of computing power and a lot of servers. Since I don't have Google's technological resources and find the brute force method of site detection inelegant, I reckon that this project will just have to innovate, adapt and overcome (or at least be well enough designed to survive). But the marketing angle is where Google has the advantage.

Regards...jmcc

mojeek



 
Msg#: 4447195 posted 7:47 am on May 1, 2012 (gmt 0)

Hi jmcc, I think you're spot on. Blindly crawling everything in sight is definitely not the way to go for smaller engines, it's then just a matter of deciding what to crawl or not, either algorithmically or manually.

piatkow

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4447195 posted 9:44 pm on May 1, 2012 (gmt 0)



Why not comparison sites?

Why not review sites ?

And if the user knows what they want already, and knows how to type amazon.c… or ebay.co…

what do they need you for exactly :)

Example: I need an independent hotel or guest house in a particular small town. A search on the town name will usually be swamped by comparison sites listing the nearest big chain hotels maybe 30 or 40 miles away.

scooterdude



 
Msg#: 4447195 posted 10:54 pm on May 1, 2012 (gmt 0)

Perhaps some practice on query technique , using more specific keywords would assist you in narrowing down your required result, assuming said independent has a website that search engines are happy to return

piatkow

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4447195 posted 10:15 pm on May 2, 2012 (gmt 0)


Perhaps some practice on query technique

Yes, stupid of me, naturally Google knows that when I ask for "hotels in Smallville" what I really want is to compare Holiday Inn and Days Inn in the next county!

It's the search engine's job to answer the ****ing question.

Marketing Guy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4447195 posted 10:02 am on Oct 9, 2012 (gmt 0)

I worked on the SEO for a directory startup a few years back. They had decent funds to get going and put serious resources into getting the business off the ground, and the result was a multi million pound business that was getting 1 million+ visitors per month at its height.

During the 4 years I spent on the project I was involved at a high level of the business and saw a lot of copycat competitors pop up and subsequently fail (including one funded by an investor that used to appear on Dragon's Den).

It's a tough market. My former client relied on SEO and PPC to drive traffic, which isn't a strategy I think would scale well these days (post-Panda). But similar budgets could drive traffic from other sources, so I do think it's doable.

The directory model has failed so many times it's a little bit tragic though. DMOZ - spammed to hell (they never reacted to market changing around them and the model didn't scale). Yahoo - died a slow death (again, they never reacted with the market changes). Any link directory - was always spam. Yellow Pages, etc - slow to embrace technology, slow to adapt their model. But I do think a directory model of some sorts could work.

The hard part is developing a business model around the fact that you are constantly serving vastly different customers. But if you can present the information in such a way that it works for the user, regardless of whether or not that's in directory, search engine or another format, then you're on to a winner.

Monetisation is tough. It's hard enough for agencies to convince clients to assign PPC budgets outside of Adwords, let alone spend on a new player. Maybe a model which rewards people for moderating content or submissions might work, but remember the industry we're in - that system would get abused to death.

And another concern is the regionalisation - there are loads of queries where local is good and anything else is superfluous. But all the queries where non-UK results are useful too? I need a hotel, then UK results = win. I want to find SEO blogs = maybe I want the best information, not necessarily the closest.

Or you could simply focus on one aspect of what a SE delivers - i.e. matching the informational needs of a particular group. Nail that niche and expand through different verticals.

All that aside, my gut is that people are just plain lazy. Retaining visitors for something like this is tough - you need a lot of added value features to get people on board and advocating your service. That's one of the areas that my former client did well for businesses (website owners), but not so well for end users. And ultimately you are serving those two distinct groups and it's almost like running two completely different businesses.

There's definitely room in the market for another player though, and it's long overdue IMO. Innovating is the way forward and not just replicating like Bing have been doing. ;)

TypicalSurfer

5+ Year Member



 
Msg#: 4447195 posted 11:54 am on Oct 9, 2012 (gmt 0)

But the marketing angle is where Google has the advantage.


You compete on price. Lower ad costs to businesses = consumer savings. Search here, save money.

That's the basis of a marketing program.

Google got legs because of an initial value proposition, webmasters and advertisers had a low cost vehicle to reach an audience therefore they went forth like an army of town criers with the "use google" refrain which was actually self promotion (use google, find ME).

[edited by: TypicalSurfer at 12:04 pm (utc) on Oct 9, 2012]

brotherhood of LAN

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors of the Month



 
Msg#: 4447195 posted 12:03 pm on Oct 9, 2012 (gmt 0)

jmcc, a formidable task but one that you're familiar with. Why not try a city, like London, to start off with? The UK is a big country, which makes the task bigger; London would be a good 'proof of concept'.

The first issue I see is knowing where to spider and the redundancy of discarding 'non-UK' sites. I guess you would want to start off with a seed-list of sites and follow the links a couple of levels deep and you could have a good proportion of related sites.

TypicalSurfer

5+ Year Member



 
Msg#: 4447195 posted 12:09 pm on Oct 9, 2012 (gmt 0)

discarding 'non-UK' sites


A web crawler can be designed to store all found links but only crawl those that meet certain criteria (TLD in this case), so it wouldn't be a matter of discarding documents, you just don't crawl off target pages.

crawl seed list > collected links stored in a crawl db > crawl db cleaned of unwanted TLDs > crawl selected urls > repeat
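That loop could be sketched in Python like this (`fetch_links` stands in for a real fetch-and-parse step, and the allowed-TLD tuple is an assumption for the UK case):

```python
from urllib.parse import urlparse

ALLOWED_TLDS = (".uk",)  # crawl only these; everything else is stored uncrawled

def crawl_cycle(seed_urls, fetch_links, crawl_db):
    """One pass of the loop above: fetch each seed, record every
    discovered link, but queue only on-target TLDs for the next pass.

    crawl_db maps url -> bool (True = selected for crawling);
    fetch_links(url) stands in for a real fetch-and-parse step.
    """
    next_round = []
    for url in seed_urls:
        for link in fetch_links(url):
            if link in crawl_db:
                continue  # already stored in the crawl db
            host = urlparse(link).hostname or ""
            selected = host.endswith(ALLOWED_TLDS)
            crawl_db[link] = selected
            if selected:
                next_round.append(link)
    return next_round
```

Feeding each returned `next_round` back in as the next seed list gives the crawl-store-clean-repeat cycle, with no documents discarded - off-target links just stay unfetched in the db.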

brotherhood of LAN

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors of the Month



 
Msg#: 4447195 posted 12:20 pm on Oct 9, 2012 (gmt 0)

Indeed, .uk domains, UK IP addresses and even looking at keywords in domains seem to be the more trivial aspects of it, but what kind of precision would it get? A pure guess is that it would give about 50% of the relevant sites that need to be spidered. It totally depends on how many UK sites are out there that are neither hosted in the UK nor on a .uk domain.

If the % isn't high enough then more crawling has to happen and the discarding won't be an option until it's spidered.

No doubt some techniques could be used at this point too though, maybe after X number of pages have been spidered, decide whether it's UK-related or not. I'd want to be careful here though, as some portals may have UK specific content buried a few levels deep.

A lot depends on what you would define as a 'UK site', e.g. a Malaysian website promoting package holidays to the UK: in or out?

btw an idea for a seed list, use something like Majestic and enter a bunch of UK specific search terms.
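The domain/IP/keyword signals mentioned above could be combined into a simple score; here is a sketch with purely illustrative point weights and keyword list (nothing here is tuned against real data):

```python
from urllib.parse import urlparse

UK_KEYWORDS = ("british", "london", "united kingdom")  # assumed examples

def uk_score(url, server_country, page_text):
    """Combine weak signals into a 0-100 'UK-ness' score.

    Point weights are illustrative guesses, not tuned values: a .uk
    domain is near-conclusive, a UK-hosted gTLD site needs content
    evidence as well before it clears any sensible threshold.
    """
    host = (urlparse(url).hostname or "").lower()
    score = 0
    if host.endswith(".uk"):
        score += 70
    if server_country == "GB":
        score += 20
    hits = sum(1 for kw in UK_KEYWORDS if kw in page_text.lower())
    score += min(hits, 2) * 5  # cap keyword evidence
    return min(score, 100)
```

A threshold on the score would then decide the borderline cases like the Malaysian package-holiday example - wherever the line is drawn, it has to be drawn explicitly.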

TypicalSurfer

5+ Year Member



 
Msg#: 4447195 posted 12:36 pm on Oct 9, 2012 (gmt 0)

Your initial crawl could just be bone-headed in terms of TLD, that would give a good base, a decent document collection. Since large scale search indexes are distributed across multiple machines you would have to split that accordingly, roughly a million pages per gig of RAM. So a machine with 24G of physical memory would be able to store 24MM documents and serve queries without disk seeks (newer SSD disks could be of assistance as well), that's how you maintain speed.
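The sizing rule of thumb works out like this (the 8M-site, 10-pages-per-site figures are only an example, extrapolating from the roughly 4M .uk plus 4M gTLD sites mentioned earlier in the thread):

```python
PAGES_PER_GB = 1_000_000  # rule of thumb from the post, not a measured figure

def shards_needed(total_pages, ram_gb_per_machine):
    """How many in-memory index machines the rule of thumb implies."""
    capacity = ram_gb_per_machine * PAGES_PER_GB
    return -(-total_pages // capacity)  # ceiling division

# e.g. 8M sites at, say, 10 pages each = 80M pages, on 24 GB machines:
print(shards_needed(80_000_000, 24))  # 4 machines
```

So an initial UK index of that size would fit in RAM on a handful of commodity servers, which is what keeps query latency down without disk seeks.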

Since you are really just storing separate indexes you can run separate crawls, just updating individual collections. This is where you could do more "nook and cranny" type crawling, bring in new or relevant urls into an individual index.

There are still some human edited directories that would be a good starting point.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved