U.K. Search Engine: Mojeek


aristotle

8:45 pm on Nov 7, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Host: 5.102.173.71 
/
Http Code: 200 Date: Nov 07 15:25:37 Http Version: HTTP/1.1 Size in Bytes: 23327
Referer: -
Agent: Mozilla/5.0 (compatible; MojeekBot/0.6; +https://www.mojeek.com/bot.html)

I don't recall seeing this before, although it's apparently been around for years and claims to be the largest crawler-based independent search engine in the United Kingdom.

brotherhood of LAN

5:51 pm on Nov 8, 2015 (gmt 0)

WebmasterWorld member @glacai owns Mojeek.

lucy24

7:53 pm on Nov 8, 2015 (gmt 0)

I don't recall seeing this before, although it's apparently been around for years

Yup, been around forever. I had to check my files: I've never got around to ignoring them, because they come by so infrequently it's not worth it.

claims to be the largest crawler-based independent search engine

It's a very limited crawl, if so. On one site I've seen them 12 times in two years-- that is, 12 occurrences of robots.txt followed by one page. Either the front page, or the page linked from my profile here, or-- just once-- a different internal page. Timing suggests they got the last-named from a specific other site that they follow. Always the exact IP 5.102.173.71. (Digging deeper I found a lone 195.74.55.164, but that was from 2011.) So yes, they've got their own crawler, but it doesn't spider a whole site.
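
For anyone who does get around to ignoring them, here is a minimal sketch of testing a robots.txt rule before putting it live, using Python's standard urllib.robotparser. The MojeekBot token is taken from the user-agent string quoted earlier in the thread; that the crawler obeys a group with this name is an assumption, and example.com and SomeOtherBot are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out MojeekBot, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: MojeekBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("MojeekBot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/"))
```

This only shows what a well-behaved parser would conclude from the file; it says nothing about whether a given crawler actually respects it.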

aristotle

8:11 pm on Nov 8, 2015 (gmt 0)

I just went to their website and searched for my own sites, and found that all five of them are fully (or almost fully) indexed. So their crawler must have visited all of my sites once upon a time or other, but I just don't remember seeing it.

Maybe they're starting to become more active. Here is a quote from their website:

Mojeek (https://www.mojeek.co.uk), the UK's leading crawler-based search engine, has announced that its index now contains over a billion web pages, an important milestone for crawler-based engines, and currently the only British search company to accomplish this.

Since receiving its first major investment, Mojeek has been concentrating on increasing its index size, improving relevancy, and preparing to scale its technology further. Mojeek was also the first search engine to have a no tracking privacy policy, so is pleased this has become a mainstream issue and that users are now more likely to seek out alternatives that protect their privacy. But there are still only a handful (English/Western Language) that maintain their own search index of over a billion pages, this results in a select few companies dominating a large and important service, and ultimately reducing consumer choice

lucy24

8:54 pm on Nov 8, 2015 (gmt 0)

Are you a dot com or a dot co dot uk? Their site makes it look as if they're intentionally focusing on UK-based sites.

aristotle

9:54 pm on Nov 8, 2015 (gmt 0)

All of my sites are .net, but are hosted in the U.S. and geared toward a U.S. audience. Maybe Mojeek has recently become more focused on U.K. sites than in the past.

There may be opportunities for country-specific sites if Google gets a bad name in some parts of the world. Most people like to do business with companies located in their own country.

lucy24

1:51 am on Nov 9, 2015 (gmt 0)

There may be opportunities for country-specific sites .... Most people like to do business with companies located in their own country.

When we think of country-specific search engines we usually think of non-English-speaking countries-- Yandex in Russia/Ukraine and Turkey, Exalead in France, Seznam in the Czech Republic, coupla others I can't remember at the moment. But it isn't because Google doesn't speak those other languages; it's user preference.

:: idly thinking that among the Eire/NorthernIreland/Wales/Scotland/England contingent, you'd expect to see from one to four breakaway search engines created purely so we don't have to use the same one as That Other Country ::

topr8

9:35 am on Nov 9, 2015 (gmt 0)

lucy, you normally make a lot of sense ... has Wisconsin got its own search engine? It has a greater population than all of those countries except England.

wilderness

12:01 pm on Nov 9, 2015 (gmt 0)

FWIW
195.74.55.164 - - [24/Apr/2012:02:21:24 +0100] "GET /robots.txt HTTP/1.1" 200 2627 "-" "Mozilla/5.0 (compatible; MojeekBot/0.2; [www...] .mojeek.com/bot.html)"
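
In case it helps anyone digging through old logs for these visits, here is a rough sketch of splitting a line like the one above into fields. It assumes the standard Apache "combined" layout; the regex and the parse_line name are mine, not anything from the poster's setup. The sample keeps the forum's truncated URL exactly as posted.

```python
import re
from datetime import datetime

# Apache "combined" log format: host, identity, user, [time], "request",
# status, size, "referer", "user-agent".
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str) -> dict:
    """Split one combined-format log line into named fields."""
    match = COMBINED.match(line)
    if match is None:
        raise ValueError("not a combined-format log line")
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Timestamps look like 24/Apr/2012:02:21:24 +0100
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


# The line quoted above, truncated URL left as posted.
SAMPLE = (
    '195.74.55.164 - - [24/Apr/2012:02:21:24 +0100] "GET /robots.txt HTTP/1.1" '
    '200 2627 "-" "Mozilla/5.0 (compatible; MojeekBot/0.2; [www...] .mojeek.com/bot.html)"'
)

if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["host"], entry["time"].isoformat(), entry["request"], entry["status"])
```

The month abbreviation is parsed with %b, which depends on the locale, so this assumes an English or C locale.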

lucy24

5:23 pm on Nov 9, 2015 (gmt 0)

has Wisconsin got its own search engine?

Has Wisconsin ever made a serious push to secede from the Union? National identity has very very little to do with population.

:: idly wondering if there's some objective measure of the degree of outrage or non-outrage that would ensue in each separate case if you referred to a {Welsh/NorthernIrish/Scottish} person as English ::

jmccormac

3:04 am on Nov 10, 2015 (gmt 0)

@lucy24 Wouldn't be a good thing to mistake someone from Ireland, Scotland or Wales as English. :)

:: idly thinking that among the Eire/NorthernIreland/Wales/Scotland/England contingent, you'd expect to see from one to four breakaway search engines created purely so we don't have to use the same one as That Other Country ::

The big problem for running any search engine in any of those countries is the cost and monetisation. Ireland is a relatively small market with approximately 200K active top-level websites (excluding subsites). The Republic of Ireland and Northern Ireland are effectively a single market, with a lot of cross-border hosting and marketing, but there are local differences.

The .UK ccTLD is the main UK ccTLD and it is hard to associate a .UK website with Scotland, England or Wales without textual analysis and keywords. Web usage of the new .SCOT and .WALES/.CYMRU gTLDs is low and there is a lot of overlap with the main .UK ccTLD.

The main problem for a search engine in Ireland and the UK and most of Europe is the high level of overlap between the local ccTLD and the legacy TLDs like .COM/NET/ORG. There is also a regional TLD (.EU) but the web usage in that TLD is extremely low. But there is a very visible Adjacent Market effect where some EU countries will be selling into an adjacent market and will use websites branded for that country's ccTLD where possible. The strength of the local ccTLDs is something that can be somewhat surprising. The dominant TLD in most European countries is typically the ccTLD. The .COM TLD is becoming a legacy TLD as the growth of the local ccTLDs has, in most of the Western European countries, overtaken the .COM as the main TLD in those markets.

For the UK, that's a ccTLD with approximately 10 million registrations. For Germany, it's approximately 16 million. Of those, approximately 30% will have active websites. And this is the real challenge for search engines: new websites and domains are created and domain names drop. So you have a target set that is continually changing. Google's FUDbuddies scared a lot of webdevs into not linking to other sites and most new websites do not have any outbound or inbound links other than the usual analytics and spyware links from Google Analytics, Statcounter, Facebook, Twitter etc.

The ccTLD registries do not provide access to their zone files as the gTLD registries do to theirs. This means that it is far more difficult for a search engine to detect new ccTLD websites by crawling. For the .UK, approximately 50K to 100K domains drop each month and another 100K to 150K new domains are registered. The UK webscape is mostly hosted in the UK but there are parts of it hosted in Canada, the US, France and Germany. The .UK is also an open TLD which means that a CO.UK is not necessarily an indication of a UK website though the .UK registration may be more reliable.
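
To put rough numbers on that moving target, here is a back-of-envelope sketch using only the approximate figures quoted in this post. The function names are made up for illustration, and treating the 30% as applying equally to .UK and .DE is an assumption.

```python
# Figures quoted in the post above (all approximate).
UK_REGISTRATIONS = 10_000_000
DE_REGISTRATIONS = 16_000_000
ACTIVE_PERCENT = 30                     # share of registrations with an active website
UK_DROPS_PER_MONTH = (50_000, 100_000)  # (low, high)
UK_NEW_PER_MONTH = (100_000, 150_000)   # (low, high)


def active_sites(registrations: int, percent: int = ACTIVE_PERCENT) -> int:
    """Estimated number of registrations with an active website."""
    return registrations * percent // 100


def monthly_turnover(drops, new):
    """(low, high) count of domains that either appear or disappear in a month."""
    return (drops[0] + new[0], drops[1] + new[1])


def monthly_net_growth(drops, new):
    """(low, high) net change: fewest new minus most drops, and the reverse."""
    return (new[0] - drops[1], new[1] - drops[0])


if __name__ == "__main__":
    print("UK active sites:", active_sites(UK_REGISTRATIONS))
    print("DE active sites:", active_sites(DE_REGISTRATIONS))
    print("UK monthly turnover:", monthly_turnover(UK_DROPS_PER_MONTH, UK_NEW_PER_MONTH))
    print("UK monthly net growth:", monthly_net_growth(UK_DROPS_PER_MONTH, UK_NEW_PER_MONTH))
```

On those figures, roughly 3 million .UK sites are active, and between 150K and 250K domains (1.5% to 2.5% of the zone) change state every month, even though the net growth is only somewhere between zero and 100K.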

Regards...jmcc

lucy24

5:48 am on Nov 10, 2015 (gmt 0)

Huh. When I think of a country-specific search engine, I don't think of one that considers only sites based in its own country. After all, Seznam and Yandex and so on crawl all over the place. I think more of a search engine that looks at .com and .uk and .eu and .au and blahblahblah and then uses its unique local perspective to figure out which sites are most useful and attractive to their own residents ... which may be different from a US-based search engine's notion of what the (for example) Scottish searcher is looking for. The ccTLD doesn't even enter into it.

Wouldn't be a good thing to mistake someone from Ireland, Scotland or Wales as English

My gut feeling is that the level of outrage, from highest to lowest, goes something like
-- Eire (off the chart)
and then
-- Scotland
-- Northern Ireland
-- Wales
in that order. But this may only be because I've never happened to encounter a truly irate Welshman.

jmccormac

7:41 am on Nov 10, 2015 (gmt 0)

A country specific search engine should target its own country and provide better results than Google. Most of .COM is irrelevant at a country level. The ccTLD is very important with a country level search engine because the ccTLDs are typically the dominant TLD in these markets and .COM is a legacy TLD. The increasing irrelevance of .COM at a country level is a hard thing to understand when it is not seen in action. A country level search engine is not a generic search engine like Google. It does local search well.

The original theory behind providing country-level search was to limit the results to the local ccTLD and websites hosted on that country's IP ranges. That was the rather ignorant process used by many of the major search engines and it did not work well. This was because not all websites hosted on a country's IP ranges are associated with that country or targeted at that market. The opening up of some ccTLDs, such as the .UK, to non-local registrations made sorting on the basis of ccTLD less accurate.
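
A toy sketch of that naive rule, just to show where it goes wrong. Everything here is hypothetical: the "UK" network is a documentation range standing in for a real allocation, the hostnames are placeholders, and reading the rule as "ccTLD or local IP" is my interpretation of the description above, not how any particular engine did it.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical "UK" address block. 192.0.2.0/24 is a documentation range,
# used here only as a stand-in for a real country allocation.
UK_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def naive_is_uk(url: str, server_ip: str) -> bool:
    """The naive rule: .uk hostname, or server inside a 'UK' IP range."""
    host = (urlsplit(url).hostname or "").lower()
    in_cctld = host == "uk" or host.endswith(".uk")
    address = ipaddress.ip_address(server_ip)
    in_range = any(address in network for network in UK_NETWORKS)
    return in_cctld or in_range


if __name__ == "__main__":
    cases = [
        # .co.uk name hosted elsewhere: accepted, whoever registered it.
        ("https://shop.example.co.uk/", "198.51.100.7"),
        # .com hosted in the "UK" range: accepted, whatever market it targets.
        ("https://example.com/", "192.0.2.10"),
        # .net hosted abroad: rejected, even if it is aimed at UK users.
        ("https://example.net/", "198.51.100.7"),
    ]
    for url, ip in cases:
        print(url, ip, naive_is_uk(url, ip))
```

The first two cases are the false positives described above (open ccTLD, foreign sites on local IPs), and the third is the kind of locally relevant site the rule never sees.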

Once a country level internet market matures, the trade switches from primarily websites selling to other countries to websites selling locally. Thus the queries change from being geographically general to being geographically specific. This is one thing that the country specific search engines can do well because they generally have local knowledge.

Regards...jmcc

keyplyr

11:31 am on Nov 10, 2015 (gmt 0)

Mojeek has one of those old build parsers that doesn't support JavaScript well. For the page snippet on many sites it says "Please enable Javascript to use this web site."

keyplyr

1:11 pm on Nov 10, 2015 (gmt 0)

So yes, they've got their own crawler, but it doesn't spider a whole site.

Well they've indexed over 100 of my site's pages... maybe one at a time, but still :)

aristotle

1:47 pm on Nov 25, 2015 (gmt 0)

Looks like there's another U.K. search engine called Wotbox:
Host: 81.144.138.34 
/robots.txt
Http Code: 200 Date: Nov 25 07:22:33 Http Version: HTTP/1.1 Size in Bytes: 246
Referer: -
Agent: Wotbox/2.01 (+http://www.wotbox.com/bot/)

/
Http Code: 200 Date: Nov 25 07:22:45 Http Version: HTTP/1.1 Size in Bytes: 23359
Referer: -
Agent: Wotbox/2.01 (+http://www.wotbox.com/bot/)
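
For spotting either of these two in a log, a small sketch that pulls the crawler name and version out of an Agent string. The pattern covers only the two product tokens that appear in this thread, and identify_bot is a name invented for the example. User-agent strings are trivially spoofed, so this identifies what a visitor claims to be, nothing more.

```python
import re

# Product tokens seen in the user-agent strings in this thread.
BOT_TOKEN = re.compile(r"\b(?P<name>MojeekBot|Wotbox)/(?P<version>\d+(?:\.\d+)*)")


def identify_bot(user_agent: str):
    """Return (name, version) for a known crawler token, or None."""
    match = BOT_TOKEN.search(user_agent)
    if match is None:
        return None
    return match.group("name"), match.group("version")


if __name__ == "__main__":
    agents = [
        "Mozilla/5.0 (compatible; MojeekBot/0.6; +https://www.mojeek.com/bot.html)",
        "Wotbox/2.01 (+http://www.wotbox.com/bot/)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for agent in agents:
        print(identify_bot(agent))
```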

keyplyr

8:23 pm on Nov 25, 2015 (gmt 0)

Wotbox has been around for at least 6 years, but possibly repurposed.

aristotle

10:16 pm on Nov 25, 2015 (gmt 0)

Actually they appear to have been around in some form since 2003. Here's a thread about them that incrediBILL started in 2012.
[webmasterworld.com] -- Wotbox Using Questionable Tactics?