PHP Search Application

Creating a simple yet effective search engine.

         

wfernley

4:29 pm on Nov 17, 2004 (gmt 0)




Hi everyone.

I am trying to build a search engine in PHP. I have found a few tutorials, but I am curious what the best way to design a simple search engine would be.

Right now it is going to be basic. I have a FAQ database with 3 columns: faqs_id, faqs_question, and faqs_answer. I want to create a search engine that searches the FAQ database and finds the most relevant information for the user.

The tutorials I mentioned deal with creating a search engine application, and I was wondering how effective they would be. They talk about creating another table called keywords, where each keyword is stored along with the URL of the FAQ it is relevant to. I was curious how this would perform in terms of speed and database size. When I think about it, that will be a lot of keywords, especially if a keyword is inserted into the database every time it is found in a FAQ.

I am confused about how I should go about it, and I apologize for not being able to explain it well enough.

What would be the best way to design this search engine?

Thanks.

Wes

mincklerstraat

5:26 pm on Nov 17, 2004 (gmt 0)




This sounds to me like a pretty sound approach. I haven't seen the tutorials, and this will be one of the biggest limitations of the answers you'll get to this question: without seeing the tutorial, it's difficult to comment on how fast or effective the app will be. My guess is that your PHP script first runs a query against the 'keywords' table based on the user input. If it finds relevant info there, it returns the URLs stored for those keywords. If not, it queries the content tables themselves, and if that turns up anything, it returns those URLs and also stores the keyword(s) and URL(s) in the keywords table, so it doesn't have to go through the second, more time-intensive procedure the next time somebody searches for this keyword. The first searches will be slow, but after a while they'll get faster. Hopefully it also has a way of removing the keywords when you remove or modify entries.
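To make that guess concrete, here is a minimal sketch of the lookup-then-fallback flow. It is not taken from any tutorial: the table names `faqs` and `keywords`, the function names `findFaqIds` and `forgetFaq`, and the use of PDO with an in-memory SQLite database are all assumptions for the demo, and it stores FAQ ids rather than URLs to keep things short.

```php
<?php
// Sketch only: table names, function names and the SQLite demo database
// are assumptions, not something taken from the tutorials.

function findFaqIds(PDO $db, string $keyword): array
{
    $keyword = strtolower(trim($keyword));

    // 1. Fast path: this keyword has been searched for before.
    $stmt = $db->prepare('SELECT faqs_id FROM keywords WHERE keyword = ? ORDER BY faqs_id');
    $stmt->execute([$keyword]);
    $ids = $stmt->fetchAll(PDO::FETCH_COLUMN);
    if ($ids) {
        return array_map('intval', $ids);
    }

    // 2. Slow path: scan the FAQ text itself ('!' escapes LIKE wildcards).
    $pattern = '%' . strtr($keyword, ['!' => '!!', '%' => '!%', '_' => '!_']) . '%';
    $stmt = $db->prepare(
        "SELECT faqs_id FROM faqs
          WHERE faqs_question LIKE ? ESCAPE '!' OR faqs_answer LIKE ? ESCAPE '!'
          ORDER BY faqs_id"
    );
    $stmt->execute([$pattern, $pattern]);
    $ids = array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));

    // 3. Remember the hits so the next search takes the fast path.
    $insert = $db->prepare('INSERT INTO keywords (keyword, faqs_id) VALUES (?, ?)');
    foreach ($ids as $id) {
        $insert->execute([$keyword, $id]);
    }
    return $ids;
}

// Call this when a FAQ entry is modified or removed so stale keywords disappear.
function forgetFaq(PDO $db, int $faqId): void
{
    $db->prepare('DELETE FROM keywords WHERE faqs_id = ?')->execute([$faqId]);
}

// Demo data in a throwaway in-memory database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE faqs (faqs_id INTEGER PRIMARY KEY, faqs_question TEXT, faqs_answer TEXT)');
$db->exec('CREATE TABLE keywords (keyword TEXT, faqs_id INTEGER, PRIMARY KEY (keyword, faqs_id))');
$db->exec("INSERT INTO faqs VALUES
    (1, 'How do I reset my password?', 'Use the reset link.'),
    (2, 'Where is my invoice?', 'See your account page.')");

$firstSearch  = findFaqIds($db, 'password'); // slow path, fills the keywords table
$secondSearch = findFaqIds($db, 'password'); // fast path, answered from keywords
```

One thing this sketch does not handle: a newly added FAQ that matches an already stored keyword will not show up until the rows for that keyword are cleared, so adding entries needs some form of invalidation too.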

You'll also want some kind of filter that removes words which are very common in searches: 'the', 'and', 'how', etc. You might also want to store the date and the number of searches for each new keyword, and keep updating them until the keyword has been searched for x times by users, at which point it's 'confirmed' as a valuable keyword. This way you could weed out of your keywords table the things that were only searched for once or twice a while ago.
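A filter like that can be quite small. The sketch below is only an illustration: the stop-word list and the function name `extractKeywords` are made up for the example, and you would want a longer list tuned to your own FAQ.

```php
<?php
// Hypothetical stop-word list; extend it for your own content.
const STOP_WORDS = ['the', 'and', 'how', 'a', 'an', 'of', 'to', 'in', 'is', 'do', 'i', 'my'];

function extractKeywords(string $query): array
{
    // Lowercase, then split on anything that is not an ASCII letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);

    // Drop stop words and duplicates, then reindex from 0.
    return array_values(array_unique(array_diff($words, STOP_WORDS)));
}

$keywords = extractKeywords('How do I reset the password?');
// $keywords is ['reset', 'password']
```

Only the words that survive the filter would then be looked up in, or added to, the keywords table.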

For a FAQ, if it's decent-sized, this could be a nice addition. If you have a really big site with long articles, this kind of functionality could slow things down. SPIP had some kind of keyword-search table functionality which was turned off by default, since there had been a lot of complaints about it using huge amounts of disk space iirc; probably mainly from people who were running SPIP on shared hosting.

If you have fewer than a few hundred pages in your FAQ, it might make more sense just to use an ordinary SELECT ... LIKE query and forget about the whole keyword business. You could do this with or without a fulltext index (in MySQL, a fulltext index is queried with MATCH ... AGAINST rather than LIKE).
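For reference, the plain LIKE version could look roughly like this. The column names come from your post; the table name `faqs`, the function name `searchFaqs`, and the in-memory SQLite database used for the demo are assumptions, so swap in your own connection.

```php
<?php
// Sketch only: the table name `faqs` and the function name are assumptions.
function searchFaqs(PDO $db, string $term): array
{
    // Escape LIKE wildcards so '%' and '_' typed by the user match literally.
    $pattern = '%' . strtr($term, ['!' => '!!', '%' => '!%', '_' => '!_']) . '%';

    // A prepared statement keeps user input out of the SQL string.
    $stmt = $db->prepare(
        "SELECT faqs_id, faqs_question, faqs_answer
           FROM faqs
          WHERE faqs_question LIKE :q ESCAPE '!'
             OR faqs_answer LIKE :a ESCAPE '!'
          ORDER BY faqs_id"
    );
    $stmt->execute([':q' => $pattern, ':a' => $pattern]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data in a throwaway in-memory database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE faqs (faqs_id INTEGER PRIMARY KEY, faqs_question TEXT, faqs_answer TEXT)');
$db->exec("INSERT INTO faqs VALUES
    (1, 'How do I reset my password?', 'Use the reset link.'),
    (2, 'Where is my invoice?', 'See your account page.')");

$rows = searchFaqs($db, 'invoice');
```

A LIKE pattern with a leading % cannot use an ordinary index, so every search scans the whole table. For a few hundred rows that is usually not a problem.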