Forum Moderators: coopster

Generating Text Snippets (Descriptions)

         

thing3b

11:35 pm on May 24, 2006 (gmt 0)



I have a database of many text fields that have no descriptions. The site design I have been handed requires each field to be displayed with a description, so the descriptions will have to be generated automatically.

The descriptions need to be no more than 20 words long.

I have two questions:
- Which type of description is better: a teaser taken from the first 20 words of the text (hoping they are good ones), or random snippets from the text?
- Using PHP, how would I go about finding the most identifying snippet in the text?

Steerpike

2:11 am on May 25, 2006 (gmt 0)



Taking the second part of your question first: there's no way to find the most identifying snippet of text from a string of words. PHP cannot, by itself, do something so subjective. What you're asking for requires a certain amount of human decision-making to weight each section of text.

As to the first part of the question, taking the first 20 words of a string would be the easiest option and probably the least likely to generate issues.

To highlight what I mean, let's take an arbitrary string and see what we can do with it:
$text_string = "Blue widgets. Here at widget.com we pride ourselves on producing nothing but the finest of blue widgets. Encassed in chocolate and hand rolled on the thighs of Cuban virgins each blue widget will bring you years of joy; perhaps even becoming a family heirloom to be passed down through the generations.";

So, we have $text_string which is a variable containing all the above text.

We can grab the first 200 characters of the string and use that:
$newDesc = substr($text_string, 0, 200);
which produces the output:
"Blue widgets. Here at widget.com we pride ourselves on producing nothing but the finest of blue widgets. Encassed in chocolate and hand rolled on the thighs of Cuban virgins each blue widget will brin"
but as you can see, that cuts off mid-word, in the middle of a sentence. Not much use.
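
One way around the chopped ending is to back up to the last space before the limit. This is only a minimal sketch, assuming plain single-byte text; the helper name truncate_at_word is made up for illustration and is not a PHP built-in:

```php
<?php
// Hypothetical helper (not a PHP built-in): cut $text to at most $maxChars
// characters, backing up to the last space so that no word is split.
// Assumes single-byte text, since strlen/substr count bytes.
function truncate_at_word(string $text, int $maxChars): string
{
    if (strlen($text) <= $maxChars) {
        return $text; // already short enough
    }
    // Take one extra character so a word ending exactly at the limit survives.
    $cut = substr($text, 0, $maxChars + 1);
    $lastSpace = strrpos($cut, ' ');
    if ($lastSpace === false) {
        // No space anywhere in range: fall back to a hard cut.
        return substr($text, 0, $maxChars);
    }
    return rtrim(substr($cut, 0, $lastSpace));
}
```

Called as truncate_at_word($text_string, 200), this should stop after "...each blue widget will" rather than partway through "bring". For multibyte text, the mbstring functions (mb_strlen, mb_substr, mb_strrpos) would be the safer choice.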

We can get the first 20 words:
// Split on spaces, keep the first 20 words, and join them back together
$words = explode(" ", $text_string);
$desc = array_slice($words, 0, 20);
echo implode(" ", $desc);
Which produces the output:
"Blue widgets. Here at widget.com we pride ourselves on producing nothing but the finest of blue widgets. Encassed in chocolate"
Which, again, breaks off in the middle of a sentence.
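
If a cleaner ending matters, one option is to take the 20 words and then trim back to the last word that ends a sentence. A rough sketch under that assumption; snippet_by_words is a hypothetical name, and the "ends with . ! or ?" test is a crude stand-in for real sentence detection:

```php
<?php
// Hypothetical helper (not a PHP built-in): keep at most $maxWords words,
// then trim back to the last sentence end that falls inside that window.
function snippet_by_words(string $text, int $maxWords = 20): string
{
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $maxWords) {
        return implode(' ', $words); // nothing to cut
    }
    $kept = array_slice($words, 0, $maxWords);
    // Walk backwards looking for a word that ends with sentence punctuation.
    for ($i = count($kept) - 1; $i >= 0; $i--) {
        if (preg_match('/[.!?]$/', $kept[$i])) {
            return implode(' ', array_slice($kept, 0, $i + 1));
        }
    }
    // No sentence end found: fall back to the plain word cut plus an ellipsis.
    return implode(' ', $kept) . '...';
}
```

On the sample string this should return the first two sentences, ending at "the finest of blue widgets." Abbreviations such as "e.g." will be mistaken for sentence ends, so treat it as a heuristic only.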

It really depends on what you want, but if you need something as subjective as what you're asking for, then you may need to redesign the database to weight different fields by importance.