Forum Moderators: coopster
I would estimate only a million to a few million pages indexed. Drive space and bandwidth aren't a problem, but server load would be, so I would really like ideas on the most efficient way to build a Google-type search engine. I have no idea where to begin as far as algorithms go. I was thinking of using PHP/MySQL, but I figure relying on MySQL's built-in search functions is probably suicide?
I've looked through freely available source code, but I can't tell which approach would give the fastest searches with the least load on the server.
How should I store all the indexed data, and how should I search through it?
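The usual answer to the storage question is an inverted index: map each word to the list of pages containing it, so a query only touches the lists for its terms instead of scanning every row. A minimal sketch in Python (the function names and sample data here are illustrative, not taken from any real engine):

```python
# Minimal inverted-index sketch: word -> set of document IDs.
# Illustrative only; a real engine would persist this to disk
# and compress the posting lists.
from collections import defaultdict

def build_index(docs):
    """docs: dict of doc_id -> text. Returns word -> set of doc_ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """AND-search: doc_ids that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "a quick brown dog",
}
index = build_index(docs)
print(search(index, "quick brown"))  # -> {1, 3}
```

The key property is that query cost scales with the length of the posting lists for the query terms, not with the total number of pages indexed.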
There is a pretty decent book called 'Managing Gigabytes' that covers the relevant indexing and ranking algorithms in some detail.
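Among the topics that book treats in depth is ranked retrieval, e.g. TF-IDF-style weighting, where a term scores higher when it appears often in a document but rarely across the collection. A rough sketch of that idea (a simplification for illustration, not the book's exact formulation):

```python
# Rough TF-IDF scoring sketch: score(doc) = sum over query terms of
# term_frequency * log(N / document_frequency). Illustrative only.
import math
from collections import Counter

def tf_idf_scores(docs, query):
    """docs: dict of doc_id -> text. Returns doc_id -> score for query."""
    n = len(docs)
    term_counts = {d: Counter(t.lower().split()) for d, t in docs.items()}
    scores = {}
    for word in query.lower().split():
        df = sum(1 for counts in term_counts.values() if word in counts)
        if df == 0:
            continue  # term appears nowhere; contributes nothing
        idf = math.log(n / df)
        for doc_id, counts in term_counts.items():
            if word in counts:
                scores[doc_id] = scores.get(doc_id, 0.0) + counts[word] * idf
    return scores
```

Rare terms get a large `idf` and dominate the score, which is why this family of schemes degrades gracefully on common words without any hand-tuned stopword list.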
I have put up some details on how SearchHippo works in the 'About' section.
Also, would that be a good idea in my situation? I don't run my own server; I'm hosted by someone else, and the server runs FreeBSD with Apache. I have little to no experience with the UNIX/Linux command line.
Is it easy to set up and install, and can I use it with PHP? I try to avoid Perl/CGI because I hear it puts more load on the server than PHP.