
Problem indexing our news - "article fragmented"

     
9:46 am on Feb 7, 2011 (gmt 0)

5+ Year Member



Hi,

We have a problem with Google News and its indexing.

Many of our news articles are not indexed, and in WMT I see the error "article fragmented".

I've been investigating, and this is what Google says about the error. We think our articles are fine.

Article fragmented:

Explanation

The article body that we extracted from the HTML page appears to consist of isolated sentences not grouped together into paragraphs. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

* Try formatting your articles into text paragraphs of a few sentences each.
* Make sure your sentences are well punctuated.
* Make sure you don't use frequent <br> and <p> tags within your paragraphs, and try to avoid breaking up the article body in general.
* Consider removing some of the non-article text from the article page.

Our articles have more than 250 words, well-formed paragraphs, and not many <br /> tags.

Thank you
4:34 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



This sounds like a good place to use the "Fetch as Googlebot" tool that is now offered in the Diagnostics section of Webmaster Tools - have you tried that?
4:52 pm on Feb 7, 2011 (gmt 0)



The explanation provided by Google is pretty self-explanatory, so make sure you have covered all of the points in it.

If you have, then I agree with @tedster: review one of your news pages with the 'Fetch as Googlebot' tool, review your source code, etc. Maybe Google is seeing something different from what you see.
4:59 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I've been investigating, and this is what Google says about the error. We think our articles are fine.


Google doesn't, and it is giving you advice on how to fix them.

Make sure you don't use frequent <br> and <p> tags within your paragraphs.


Interesting that Google is suggesting the above. It implies they are using semantic analysis at some level and can't determine what the article is about. Using the proper semantic HTML elements appears to be the fix for this.

I actually find this to be rather exciting news - written confirmation that semantic analysis is being used at this level. :)
6:24 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member planet13 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Make sure you don't use frequent <br> and <p> tags within your paragraphs


I don't understand this...

I thought using the P tag was how you designate the start of a paragraph?!?!?!

Does Google want us to make one long run-on paragraph of text?
6:52 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Sometimes software inserts all manner of tags - like <p></p> with no content, or <p> for every sentence, but with CSS to hide the line break. I'm assuming this must be the kind of thing that Google sees on some sites, but they don't want to see it in News. Of course we need to use <p> tags - but appropriately.
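
A rough way to spot the patterns described above is to parse a page and count empty <p> elements and <p> elements holding at most one sentence. The sketch below uses Python's standard html.parser; the names ParagraphCollector and paragraph_report are made up for illustration, the sentence counter is a crude punctuation heuristic, and none of this is known to match what Google News actually checks.

```python
import re
from html.parser import HTMLParser

# Crude sentence boundary: terminal punctuation followed by whitespace or end of text.
SENTENCE_END = re.compile(r"[.!?]+(?:\s|$)")


class ParagraphCollector(HTMLParser):
    """Collects the text content of each <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while we are outside any <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends when the next one starts
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def _flush(self):
        if self._buf is not None:
            self.paragraphs.append("".join(self._buf).strip())
            self._buf = None

    def close(self):
        super().close()
        self._flush()


def paragraph_report(html):
    """Return counts of all, empty, and at-most-one-sentence paragraphs."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    paragraphs = parser.paragraphs
    empty = sum(1 for p in paragraphs if not p)
    single = sum(
        1 for p in paragraphs if p and len(SENTENCE_END.findall(p)) <= 1
    )
    return {
        "paragraphs": len(paragraphs),
        "empty": empty,
        "single_sentence": single,
    }


if __name__ == "__main__":
    sample = (
        "<p></p>"
        "<p>One sentence only.</p>"
        "<p>First point. Second point. Third point.</p>"
    )
    print(paragraph_report(sample))
```

A high share of empty or single-sentence paragraphs would suggest the markup is worth a closer look, though any threshold would be a guess.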
6:57 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Sometimes software inserts all manner of tags.


Ain't that the truth! One of the more common ones is to start with a <p>, then <br><br> each wannabe paragraph in the section, and close with a </p>. So you end up with one BIG paragraph, and the <br><br> has no semantic meaning other than a line break.
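
As an illustration of cleaning up that pattern, here is a minimal Python sketch that splits a single <p> block at each run of two or more <br> tags and wraps every chunk in its own <p>. The function name split_double_br is hypothetical, and the regex only handles plain <p> tags without attributes or nesting; a real cleanup would be safer with a proper HTML parser.

```python
import re

# Two or more consecutive <br>, <br/> or <br /> tags, with optional whitespace between.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)

# Simplifying assumption: paragraphs are written as bare <p>...</p> with no attributes.
P_BLOCK = re.compile(r"<p>(.*?)</p>", re.IGNORECASE | re.DOTALL)


def split_double_br(html):
    """Turn <p>a<br><br>b</p> into separate <p>a</p> and <p>b</p> blocks."""

    def rewrap(match):
        chunks = [chunk.strip() for chunk in DOUBLE_BR.split(match.group(1))]
        # Chunks that end up empty are dropped, which also removes empty <p></p>.
        return "\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)

    return P_BLOCK.sub(rewrap, html)


if __name__ == "__main__":
    before = "<p>First para.<br><br>Second para.<br /><br />Third.</p>"
    print(split_double_br(before))
```

A single <br> inside a paragraph is left alone, since it may be an intentional line break.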

Developers need to stop with that <br><br> crap! ;)
7:42 pm on Feb 7, 2011 (gmt 0)

5+ Year Member



Hi,

It's curious, but Google recommends writing substantial paragraphs with several sentences and not overusing <br> and <p>. It also says a news article must have at least 80 words. I have articles with 150 words in two paragraphs that aren't indexed, and I don't know what to do. On other websites I see news articles with 100 words, yet we need to write more than 200.

I've read that the reason is that when an article consists of short sentences, Google treats it like comments, because people tend to write short sentences when commenting to save time.

I've checked the page in WMT and I don't see anything abnormal.

Sorry for my English

Thank you
7:56 pm on Feb 7, 2011 (gmt 0)

10+ Year Member



CHMS - this is interesting. I put up a post showing an example of a website where the blogger writes a headline such as "What is the Best Way To Vacation In WidgetWebmasterPlanet"

Then he places two big blocks of adverts immediately after the headline.

Then writes a ONE SENTENCE article - about 28 words.

That's it.

The ads run fine and he is well indexed.

The mods removed the post I guess because I posted the exact URL so folks could take a look at a classic useless site that is doing fine.

On some of our sites we deliberately break up articles to make it a little more difficult for content thieves - but I think this may also confuse the search engines unless you are already well ranked and an authority site.
 
