Forum Moderators: phranque


Search and Artificial Intelligence

         

SlowMove

10:57 pm on May 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm trying to figure out the best way to put a news portal online. I'm in the process of building up a list of RSS feeds to use on the site. I want to be able to search through all the feeds and find the most qualified items for any given page. If I have a page with the title "New York City News", I might have 200 items that could qualify to be used on that page. The problem would be finding the 10 "most qualified" news stories to include on the page.

I learned a little Perl, but I can't seem to find any code that could be used out of the box to do this kind of thing. I picked up a few books on AI, but they seem to be addressing an audience of mathematicians. Does anyone know the best way to get up to speed on Artificial Intelligence for Search without spending the next 10 years studying?

StupidScript

9:10 pm on May 24, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most probably what you are looking for is not a full-blown AI system, but something that can (a) parse text/RSS feeds, (b) search for a series of patterns, (c) collect the pattern-matching results into another file or set of files, and (d) order those results against a filter or set of filters.

(a) Is normal file I/O stuff, using Perl or whatever you feel most comfortable with (open file, parse, close file).
(b) Traditionally uses "regular expressions" to define the patterns to match (e.g. ^New York.*$).
(c) Dumping the results is also file I/O stuff (write a new file or append to an existing one).
(d) Once dumped, "score" each entry in the result set based on number of matches, occurrences, etc. to evaluate the "best" entries and their order of appearance.
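
For what it's worth, steps (a) through (d) can be sketched in a few lines. This is only a rough illustration in Python (the same idea carries over to Perl or PHP). The sample feed, the function names parse_items and score_items, and the scoring rule (a plain count of pattern matches in title plus description) are all assumptions made for the example, not a recommended ranking method. Step (c) is kept in memory here instead of being written to a file.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real script would read this from a file or URL.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>New York City council votes on transit budget</title>
      <description>The New York council met on Monday.</description>
    </item>
    <item>
      <title>Local bakery wins award</title>
      <description>A bakery in Ohio took first prize.</description>
    </item>
    <item>
      <title>Subway delays expected</title>
      <description>Riders in New York should expect delays.</description>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """(a) Parse the feed and return a list of (title, description) pairs."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        items.append((title, description))
    return items


def score_items(items, pattern):
    """(b)-(d) Count pattern matches per item, drop non-matches, sort by count."""
    regex = re.compile(pattern, re.IGNORECASE)
    scored = []
    for title, description in items:
        hits = len(regex.findall(title + " " + description))
        if hits > 0:
            scored.append((hits, title))
    # Highest score first; the sort is stable, so ties keep feed order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


ranked = score_items(parse_items(SAMPLE_RSS), r"new york")
for hits, title in ranked:
    print(hits, title)
```

The pattern here is unanchored on purpose: something like ^New York.*$ would only match entries that start with "New York", which may or may not be what you want.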

When I do this kind of thing, I gather the files and resources I wish to get my data from, then write little scripts (I use PHP or Perl, depending on convenience) that open, read, copy lines from, and close the resource files. During the "copying lines from" portion of the process, I instruct the program to dump each line into a database (usually MySQL). Once all of those lines are in the database, I use PHP (or Perl) to select data from the database, score it, and display the new results.
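
A minimal sketch of that dump-then-select-and-score workflow might look like the following. It is written in Python, with an in-memory SQLite database standing in for MySQL so it runs without a server. The table name feed_lines, its columns, the sample rows, and the select_and_score helper are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feed_lines (id INTEGER PRIMARY KEY, source TEXT, line TEXT)"
)

# Hypothetical lines already copied out of the resource files.
lines = [
    ("feed_a", "New York City opens new park in Brooklyn"),
    ("feed_a", "Stock markets close higher"),
    ("feed_b", "New York mayor visits New York harbor"),
]
conn.executemany("INSERT INTO feed_lines (source, line) VALUES (?, ?)", lines)
conn.commit()


def select_and_score(connection, phrase):
    """Select candidate rows with LIKE, then score them in application code."""
    rows = connection.execute(
        "SELECT source, line FROM feed_lines WHERE line LIKE ? ORDER BY id",
        ("%" + phrase + "%",),
    ).fetchall()
    # Score = number of times the phrase occurs in the line (case-insensitive).
    scored = [
        (line.lower().count(phrase.lower()), source, line)
        for source, line in rows
    ]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored


results = select_and_score(conn, "New York")
for score, source, line in results:
    print(score, source, line)
```

Letting the database do the coarse selection and doing the scoring in the script keeps the SQL simple. Whether that split is the right one depends on how many rows you end up with.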

After that, "real" intelligence kicks in as I view the results and save the entries I wish to use. You could go on writing scripts and apps that do the looking and feeling for you, but at this point it's more efficient to use your human judgement.

SlowMove

10:32 pm on May 24, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



order those results against a filter or set of filters

Good thinking. I could filter out results that have the fewest keywords related to the topic of a page.
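
Something along these lines, as a rough Python sketch: count how many of a page's topic keywords show up in each item, drop the items below a minimum, and keep the top few. The keyword set, the sample headlines, and the helper names keyword_hits and top_items are hypothetical, and the keywords are assumed to be single words.

```python
import re


def keyword_hits(text, keywords):
    """Count how many distinct topic keywords appear in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(1 for keyword in keywords if keyword in words)


def top_items(items, keywords, limit=10, min_hits=1):
    """Drop items with fewer than min_hits keywords, return the best `limit`."""
    scored = [(keyword_hits(item, keywords), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= min_hits]
    # Highest keyword count first; ties keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in kept[:limit]]


# Hypothetical keywords for a "New York City News" page.
page_keywords = {"manhattan", "brooklyn", "subway", "mayor", "nyc"}
headlines = [
    "Mayor announces subway repairs in Brooklyn",
    "Tech stocks rally",
    "NYC marathon route changes",
    "Brooklyn subway station reopens after Manhattan outage",
]

best = top_items(headlines, page_keywords, limit=2)
for headline in best:
    print(headline)
```

With 200 candidate items and limit=10, this would give the 10 items that mention the most page keywords, which is about the simplest possible notion of "most qualified".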