It is a server/bot/bookmark manager. It is GPL'ed, free, and a work in progress.
It works with Netscape, Mozilla, Opera, and Galeon, and should work with Konqueror.
I am currently working on the code to get these scripts to hook up with each other in a p2p network, but that isn't in this version. Just think of it as a self-contained search engine which spiders your bookmarks and sits on your desktop.
IE does bookmarks in an odd kind of way -- each bookmark is saved as an individual file. What you could do is grab one of those free bookmark-conversion utilities and save the IE bookmarks in either Mozilla or Opera format.
I have this code in one of my bookmarked pages:
We&shy;ge nach K&ouml;then
When I search for köthen I get nothing. If I change the &ouml; to ö and let the server index the bookmarks again it works.
It would be great if you could ignore the soft hyphen.
Andreas
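A minimal sketch of the normalisation asked for above, assuming the indexer sees the raw page text before it builds the index. The name normalize_for_index and the small entity table are made up for illustration and are not part of the bookmark server; a real version would need the full HTML entity list.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the bookmark server.
# Only a handful of named entities are listed here.
my %named = ( ouml => 0xF6, auml => 0xE4, uuml => 0xFC, szlig => 0xDF, shy => 0xAD );

sub normalize_for_index {
    my ($text) = @_;
    # Decimal numeric entities such as &#246;
    $text =~ s/&#(\d+);/chr($1)/ge;
    # Known named entities; unknown names are left untouched
    $text =~ s/&(\w+);/exists $named{$1} ? chr($named{$1}) : "&$1;"/ge;
    # Drop soft hyphens (U+00AD) so "We&shy;ge" is indexed as "Wege"
    $text =~ s/\x{AD}//g;
    return $text;
}

my $indexed = normalize_for_index('We&shy;ge nach K&ouml;then');
```

With this, the entity form and the literal ö form of the page end up as the same string in the index, and the soft hyphen no longer splits the word.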
>>Did you download the new version?
No. You said I just had to rebuild the index.
>>But how did you expect to get the new version without downloading?
I didn't expect anything except for it to work. I don't know how you changed it. It's your code so you should know.
>>Oh I do know and I did the changes. All you need to do is download the new version.
Which new version?
>>The one with UTF-8 support.
What support?
Thanks littleman. I'll try it later today.
Andreas
After making sure that the search terms were encoded the same way it did work. However, when displaying the results I got the usual UTF-8 garbage: kÃ¶then instead of köthen.
*** bookmark-server-linux.pl Wed Dec 11 16:46:48 2002
--- bmsaf.pl Wed Dec 11 16:41:56 2002
***************
*** 117,123 ****
$| = 1;
print $client "HTTP/1.0 200 OK\r\n";
print $client "Connection: close\r\n";
! print $client "Content-type: text/html\r\n\r\n";
#
##print the front page
#
--- 117,123 ----
$| = 1;
print $client "HTTP/1.0 200 OK\r\n";
print $client "Connection: close\r\n";
! print $client "Content-type: text/html; charset=UTF-8\r\n\r\n";
#
##print the front page
#
***************
*** 159,164 ****
--- 159,167 ----
#
my $search = $formdata{'search'};
#
+ $search =~s/([\x80-\xFF])/widechar(ord($1))/ge;
+ warn $search, "\n";
+
my @word = split ( / /, $search );
print $client ' ';
#
I was under the impression that Unicode support was better in Perl 5.8 but haven't tried it so far.
Andreas
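The patch calls widechar(), which is not shown in the thread. A minimal sketch of what such a helper could look like, assuming the search term arrives as Latin-1 bytes and the index holds UTF-8 bytes; this is a guess at its behaviour, not the actual function:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the widechar() helper used in the patch.
# Turns one code point below 0x800 into its UTF-8 byte sequence.
sub widechar {
    my ($cp) = @_;
    return chr($cp) if $cp < 0x80;               # ASCII stays a single byte
    die "code point out of range\n" if $cp > 0x7FF;
    return chr( 0xC0 | ( $cp >> 6 ) )            # lead byte 110xxxxx
         . chr( 0x80 | ( $cp & 0x3F ) );         # continuation byte 10xxxxxx
}

my $search = "k\xF6then";                         # Latin-1 bytes from the form
$search =~ s/([\x80-\xFF])/widechar(ord($1))/ge;  # same substitution as the patch
```

After the substitution the ö is the two bytes C3 B6. A browser that treats the page as Latin-1 shows those two bytes as separate characters, which is the garbage mentioned above. On Perl 5.8 the core Encode module should make the hand-rolled helper unnecessary.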
Though possibly redundant, I am also adding a charset=UTF-8 meta tag.
Internally my CMS (written in Perl) uses a mean and stupid hack. I make sure that every character outside the ASCII range is converted to its numeric entity. Then those entities are converted to look like &amp;#246; i.e. I escape the ampersand again. This prevents any of the modules, the XML parser, etc. from recognizing the entities and messing around with them. Some of those wanted to be really smart and use UTF-8 since that is supposedly the computer-science-savvy way to go. Now all they do is mess around with the ampersand. Right before my CMS outputs the PHP files it converts the escaped ampersand back to a real one and I get back all the numeric entities.
I do not really like this approach, but it has a great advantage over all the others I tried. It actually works. And I figured that internally I may legally use every representation I want as long as I adhere to standards when something leaves my CMS.
Andreas
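The round trip described above can be sketched roughly like this. The names protect and unprotect are made up for illustration, and the sketch assumes the input is already a Perl character string; it is not the CMS code itself.

```perl
use strict;
use warnings;

# Hypothetical names; a sketch of the double-escaping hack only.
sub protect {
    my ($text) = @_;
    # Every non-ASCII character becomes a decimal numeric entity ...
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    # ... and the ampersand of each numeric entity is escaped once more
    $text =~ s/&#(\d+);/&amp;#$1;/g;
    return $text;
}

sub unprotect {
    my ($text) = @_;
    # Right before output: turn the escaped ampersand back into a real one
    $text =~ s/&amp;#(\d+);/&#$1;/g;
    return $text;
}

my $stored = protect("K\xF6then");      # what the internal modules see
my $output = unprotect($stored);        # what leaves the CMS
```

One caveat of this sketch: text that legitimately contains an escaped ampersand followed by a numeric entity pattern would also be converted on the way out, so it only works if such text never occurs in the input.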