I'm sure someone will come along with a more educated analysis of how it all works.
Anne
When we spider a page, all the text content is bundled together (i.e., tags, link content and special characters are removed), so you end up with one long piece of text which is kept for this purpose. Then, for each search result to be displayed, we look in this text for each search term. Personally, I look for:
'a space' -> 'up to 40 characters' -> 'the keyword' -> 'up to 30 characters' -> 'a space'
If the keyword isn't in the text (i.e., it appears only in the page title), you just get the first 100 characters or so.
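For anyone who wants to try this, here is a minimal Python sketch of the approach described above. It is not the actual code behind our search: the function name `make_snippet`, its parameters and the exact regex are assumptions for illustration. It also assumes the page text has already been flattened into one string, and it lets the start or end of the text stand in for the bounding space.

```python
import re

def make_snippet(text, keyword, before=40, after=30, fallback_len=100):
    """Return a short snippet of `text` around the first match of `keyword`.

    Hypothetical helper: mirrors the pattern
    'a space' -> 'up to 40 chars' -> keyword -> 'up to 30 chars' -> 'a space'.
    """
    pattern = re.compile(
        # Start at a space (or the start of the text), so the snippet
        # does not begin in the middle of a word.
        r"(?:^|\s)"
        # Up to `before` characters, the keyword, up to `after` characters.
        r"(.{0,%d}%s.{0,%d})" % (before, re.escape(keyword), after)
        # Stop just before a space (or the end of the text).
        + r"(?=\s|$)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(text)
    if match:
        return match.group(1)
    # Keyword not in the body text (e.g. only in the title):
    # fall back to the first `fallback_len` characters.
    return text[:fallback_len]


if __name__ == "__main__":
    page_text = "word " * 50 + "needle" + " tail" * 50
    print(make_snippet(page_text, "needle"))
    print(make_snippet(page_text, "missing"))
```

Like the original, this only handles single keywords; phrases would need the keyword part of the pattern to allow for flexible whitespace between words.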
I haven't perfected this by any means; it was just a quick regex solution, and I haven't got round to dealing with 'phrases' yet. But it's very quick and simple, and it produces decent enough results that look pretty similar to the same search on Google.
IMHO there won't be anything particularly special in what Google (or any of the search engines) does, apart from regexing its cache of the text content. (Though I guess it might help us glean some insight into what text content they actually count in the algo; in our case it bears no relation at all ;) )
HTH's :) J.
I'm not sure how many characters of the meta description are used in total, except where I've actually counted. I'll be tacking a bit onto one to see whether it gets picked up in the snippet, to get a slightly more accurate picture of what's a good length for the meta description.