How do engines see a continuous text string split by font tags?

As one word or two words?

         

Robert Charlton

4:25 am on Jun 30, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Would the engines see the following string as one word or two words? There's no delimiter, but the text string is broken by code.

<font color="red">big</font><font color="blue">widgets</font>

bigwidgets

Brett_Tabke

6:35 am on Jun 30, 2002 (gmt 0)




It's different with each engine. Test it ;-) Try an obscure keyword and break it with a font command. Generally speaking, most engines view it as two words.

Many engines replace "trivial" style tags with spaces. When they finish parsing the page, they strip out the extra spaces.
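To make the difference concrete, here is a rough Python sketch of the two behaviors. It is not how any particular engine actually parses pages; the function names and the simple tag regex are assumptions made up for illustration.

```python
import re

# Naive pattern for anything that looks like an HTML tag (illustrative only;
# a real parser handles comments, scripts, and malformed markup).
TAG_RE = re.compile(r"<[^>]+>")

def tokens_tags_as_spaces(html):
    """Replace each tag with a space, then split on whitespace.

    str.split() with no arguments also collapses the extra spaces.
    """
    return TAG_RE.sub(" ", html).split()

def tokens_tags_dropped(html):
    """Delete tags outright, so adjacent text runs fuse together."""
    return TAG_RE.sub("", html).split()

sample = '<font color="red">big</font><font color="blue">widgets</font>'

print(tokens_tags_as_spaces(sample))  # ['big', 'widgets']
print(tokens_tags_dropped(sample))    # ['bigwidgets']
```

An engine that works like the first function would index two words; one that works like the second would index a single word.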

You may already have something like this indexed? Can you search on the keyword/phrase and get a result?

Robert Charlton

4:55 pm on Jun 30, 2002 (gmt 0)




>>You may already have something like this indexed? Can you search on the keyword/phrase and get a result?<<

I didn't think I had any examples of this online, since this one is in a password-protected development area... but yes, in checking other sites belonging to the same client, I found one other example. I picked a string long enough to include the split and to exclude all other examples on the page.

Searching the exact string in Google, I get a match without the space (with the cache being useful here to highlight the right text), and no match on the same exact string with a space added.

PositionTech/Inktomi gives the same results as Google. Teoma doesn't return anything for the exact string, but returns the page when I add the space (the only engine that sees it this way). But Teoma may be "correcting" my search, seeing part in the title and a partial match in the long string... it's hard to say.

All-The-Web and AltaVista find no matches either way.