searching/indexing with super large MySQL databases

         

SubZeroGTS

12:52 am on Feb 1, 2003 (gmt 0)

10+ Year Member



i'm working on making my own little search engine.

i would estimate only a million or a few million pages indexed. drive space and bandwidth aren't a problem. server load would be, so i would really like ideas on the most efficient way to build a google-type indexing search engine. i have no idea where to begin as far as algorithms go. i was thinking of using php/MySQL, but i figure using MySQL's built-in search functions is probably suicide?

i've looked through freely available source code, but i can't tell which approach would be the fastest to search and put the least load on the server.

how should i store all the database info and search through it?

kmarcus

5:11 am on Feb 1, 2003 (gmt 0)

10+ Year Member



my experience with mysql fulltext has been that it fails around a few million, especially when the queries are 'complex' (like matching three or more terms). i'm sure that under a high q/s load it would be much more pathetic. now, i'm also sure that you could come up with a decent solution in mysql if you use it for nothing more than the file access mechanism (much like udmsearch), but you might as well just go for the real thing, like berkeley db files.
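to make the multi-term case concrete, here is a minimal sketch of an inverted index, the structure a key/value store would typically hold for this kind of search. it is an illustration in python, not how searchhippo or udmsearch actually work; the names `tokenize`, `build_index` and `search` and the sample pages are made up for the example, and everything lives in memory.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real tokenizer does much more."""
    return text.lower().split()


def build_index(pages):
    """Map each term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    return index


def search(index, query):
    """Return the sorted ids of pages that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # Start from the shortest posting sets so intermediate results stay small.
    postings = sorted((index.get(t, set()) for t in terms), key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
        if not result:
            break
    return sorted(result)


# Hypothetical sample data, only for illustration.
pages = {
    1: "mysql fulltext search",
    2: "berkeley db files for search",
    3: "mysql and berkeley db",
}
index = build_index(pages)
print(search(index, "berkeley db search"))
```

a three-term query here is just two set intersections, so the cost depends on the size of the posting sets rather than on a scan of every page.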

there is a pretty decent book called 'managing gigabytes' which contains some details on algorithms and such.

i have put up some details on how searchhippo works in the 'about' section.

SubZeroGTS

8:28 pm on Feb 1, 2003 (gmt 0)

10+ Year Member



where can i find out more about berkeley db?

and would it be a good idea? i don't run my own server; i'm hosted by someone else, and the server is running FreeBSD w/Apache. i also have little to no experience with UNIX/Linux command line stuff.

is it easy to set up and install, and can i use it with PHP? i try to avoid perl/CGI because i hear that it puts more load on the server than PHP.

andreasfriedrich

8:36 pm on Feb 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can use BerkeleyDB databases with PHP.

[php.net...]
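As a rough sketch of what a file-backed key/value lookup looks like, the example below uses Python's standard `dbm.dumb` module as a stand-in. It is not Berkeley DB and not the PHP interface mentioned above; the helper names `store_postings` and `load_postings`, the comma-separated value format, and the sample terms are all assumptions made for illustration.

```python
import dbm.dumb
import os
import tempfile


def store_postings(db, term, page_ids):
    """Store a term's page ids as a comma-separated byte string."""
    value = ",".join(str(i) for i in sorted(page_ids))
    db[term.encode("utf-8")] = value.encode("ascii")


def load_postings(db, term):
    """Return the page ids stored for a term, or an empty list."""
    key = term.encode("utf-8")
    if key not in db:
        return []
    raw = db[key].decode("ascii")
    return [int(i) for i in raw.split(",") if i]


# Keep the database files in a temporary directory (assumed location).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "index")

# Write phase: one key per term, value is its list of page ids.
db = dbm.dumb.open(path, "c")
store_postings(db, "mysql", {1, 3})
store_postings(db, "berkeley", {2, 3})
db.close()

# Read phase: reopen read-only and look terms up by key.
db = dbm.dumb.open(path, "r")
print(load_postings(db, "berkeley"))
print(load_postings(db, "missing"))
db.close()
```

The point of the design is that each search term becomes a single key lookup on disk, with no SQL parsing or query planning in between.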

Andreas