Szukacz is a search engine we have been developing over the last 2 years. Its main goal is to search for documents prepared in the Polish language. It is supposed to be a commercial service.
The main duty of the Szukacz robot is to search for Polish documents, wherever they are. However, it also gathers English-language documents for our "The Best of the World" collection.
At present we have two main collections of documents: the Polish collection of 8 million Polish documents from 148 thousand websites and the collection of 8 million documents, mostly in English, from 450 thousand other websites.
We try to eliminate multiple copies of documents as well as multiple copies of whole websites from our archives and from our collections.
The Szukacz robot identifies itself as Szukacz/1.5. It operates from two hosts: bramka.proszynski.pl and brama.proszynski.pl, where brama and bramka are the Polish words for gateway and small gateway, respectively.
We believe the robot is a mature beast now. It gathers both static and dynamic pages. It has a built-in safeguard against crawling any single website too often: it selects links to crawl from its link database at random and, moreover, waits at least a few seconds before entering the same website for the next page. It follows the robots.txt and the robots meta tag protocols. In fact, we rarely get complaints these days.
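A minimal sketch of the crawl policy described above, not the robot's actual code. It picks a link at random, skips URLs that robots.txt disallows, and skips hosts visited less than a few seconds ago. The 5-second delay, the function names, and the in-memory dictionaries are assumptions made for illustration; only the Szukacz/1.5 user agent comes from the post.

```python
import random
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "Szukacz/1.5"  # the identification string given in the post
MIN_DELAY = 5.0             # assumption: stands in for "at least a few seconds"


def parse_robots(robots_txt):
    """Build a robots.txt rule set from the text of the file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def pick_next(links, last_visit, robots, now, min_delay=MIN_DELAY):
    """Return a randomly chosen link that may be fetched now, or None.

    links      -- iterable of candidate URLs (the "link database")
    last_visit -- dict: host -> time of the last fetch from that host
    robots     -- dict: host -> RobotFileParser for that host
    now        -- current time in seconds
    """
    candidates = list(links)
    random.shuffle(candidates)  # random selection spreads load across hosts
    for url in candidates:
        host = urlsplit(url).netloc
        rules = robots.get(host)
        if rules is not None and not rules.can_fetch(USER_AGENT, url):
            continue  # disallowed by robots.txt
        if now - last_visit.get(host, float("-inf")) < min_delay:
            continue  # this host was visited too recently
        return url
    return None
```

A caller would record `last_visit[host] = now` after each fetch, so the next call leaves that host alone until the delay has passed.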
Our present task is to make the Szukacz search engine fully operational (it is now in the beta stage). Right now we are working on asterisk masking of word endings. Our goal is to support asterisks inside a phrase as well.
Asterisk masking is one of the features where we hope to do better than Google, which is already quite a strong brand in the Polish-language world.
To promote our engine, we offer Polish webmasters the option of using Szukacz, free of charge, to let the public search their websites.
Our search engine operates at [szukacz.pl....] However, the interface is in Polish, so it is of rather little use to all non-Polish users. The description of our robot has a summary in English at [szukacz.pl...]
<link fixed ~Marcia>
(edited by: Marcia at 9:49 am (utc) on April 5, 2002)
Here is a German-to-English translation.
It appears to be a functioning page.
There may be more in the Google search. There was no Polish to English translator at Alta Vista.
[translate.google.com...]
The German website [searchcodes.de...] contains a Szukacz query box. It does not employ Szukacz to search this particular site (it seems to have its own search engine for local searches). It looks to me like a news service reporting that everybody is free and welcome to include a Szukacz query box in his/her website.
However, you can do more than that when including a Szukacz query box. You can also add radio buttons, one of which could carry a "search this website" label and do local searches. This is done by setting the "ct" (collection) parameter to an appropriate value. Of course, such a website has to be crawled and indexed by Szukacz independently before this service can be used.
If a particular website (say, tripod.com) shows up in our "The Best of the World" collection (ct=swiat), one can use "ct=tripod.com@swiat" as the name of the collection to limit searches to tripod.com. In fact, the effect is the same as adding "host:tripod.com" to the user query (a space is equivalent to the AND operator). The difference is that it is the webmaster and not the user who takes care of this.
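To make the two equivalent forms concrete, here is a small sketch that builds both. The "ct" parameter, the "host@collection" naming, and the "host:" query term come from the description above; the query parameter name "q" and the helper names are hypothetical, since the real form field names are not given here.

```python
from urllib.parse import urlencode


def site_collection(host, collection="swiat"):
    """Collection name that limits a search to one host, e.g. tripod.com@swiat."""
    return f"{host}@{collection}"


def webmaster_params(query, host, collection="swiat"):
    """Query string a webmaster's form would send: the limit lives in 'ct'.

    'q' is a hypothetical name for the user's query field.
    """
    return urlencode({"q": query, "ct": site_collection(host, collection)})


def user_query(query, host):
    """The same limit written by the user: a space acts as AND."""
    return f"{query} host:{host}"
```

With these helpers, `site_collection("tripod.com")` gives `tripod.com@swiat`, and `user_query("free hosting", "tripod.com")` gives `free hosting host:tripod.com`. Note that `urlencode` percent-encodes the "@" when the value is placed in a URL.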
Would you mind heading over to the European Forum [webmasterworld.com] and shedding a little light on the Polish Thread [webmasterworld.com]?
Thanks.
Welcome to WebmasterWorld BTW :)