Forum Moderators: goodroi

Google Search Console news sitemap error

Error parsing news sitemap


Francesc Invernon

4:16 pm on Mar 9, 2023 (gmt 0)

Top Contributors Of The Month



Hi there,

I've recently been having problems with the Search Console news sitemap status.

Today the news sitemap status shows "1 error". When I open it, the error is "Parsing Error", but the line it points to is no longer in the sitemap. I assume some kind of caching is involved, as all external validators pass that sitemap.

To force a recrawl, I re-submitted the same news sitemap URL with a cache-busting query string, sitemap_news.xml?testnocache, and now it shows no errors.
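(For reference: besides re-submitting through the GSC interface, Google at the time of this thread also accepted a sitemap "ping" request; that endpoint has since been deprecated. A minimal sketch of building the ping URL, with example.com as a placeholder domain:)

```python
from urllib.parse import quote

def sitemap_ping_url(sitemap_url: str) -> str:
    # Build the (now deprecated) Google sitemap ping URL.
    # The sitemap URL goes URL-encoded into the query string.
    return "https://www.google.com/ping?sitemap=" + quote(sitemap_url, safe="")

print(sitemap_ping_url("https://example.com/sitemap_news.xml"))
```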

Is there any way to "clean" Google's cache of my sitemap? Is there a fix for this issue?

Thanks to everyone on the forum in advance; I hope this helps other users with the same kind of issue.

not2easy

6:20 pm on Mar 9, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hi Francesc and welcome to WebmasterWorld [webmasterworld.com]

Since you have resubmitted your sitemap, it should replace what they had. The error may show in GSC for a time, as it is not a set of real-time data. The error you have corrected should not affect your site's performance even if it continues to show in GSC for a few days.

Francesc Invernon

8:33 am on Mar 10, 2023 (gmt 0)


Hi again, and thanks for the response and the welcome,

I left the two sitemaps in place for a few hours and today, to my surprise, the original one (the one with the "error"), sitemap_news.xml, is now correct, with all URLs discovered. If this issue happens again, what do you recommend?

1. Let it be recrawled and fixed without forcing it?
2. Resubmit it to force a recrawl?
3. Submit another copy with a parameter, e.g. sitemap_news.xml?test123, so that at least one temporary Google News sitemap works correctly?

Thank you again!

not2easy

2:23 pm on Mar 10, 2023 (gmt 0)


If you see an error message in GSC, there is no reason not to let them know you have corrected it. If there is no error message, then I would not send a new sitemap - so long as you have listed its location in your robots.txt file and see their requests for the sitemap in your logs, there is no reason to push it.

Sometimes Google can interpret proactive efforts as manipulation. An example is the old 'submit to index' feature in GSC, which they asked people to use only once per URL. Repeatedly submitting the same content can be a bad idea. Just be aware that GSC data is not updated instantly.

Francesc Invernon

8:35 am on Mar 13, 2023 (gmt 0)


Thanks for the information; it seems solved now. I hope it doesn't break again!

Thanks!

Sgt_Kickaxe

2:34 pm on Mar 14, 2023 (gmt 0)



Make sure you don't have a FilesMatch directive in your .htaccess that assigns a long cache time to XML files. I see it occasionally when a site was built to give long cache times to CSS and JPG files and XML gets tossed into the directive.

It's usually copy-pasta from an online guru article, but it happens.

Example of what that might look like - remove the xml if you see it...
<FilesMatch "\.(jpg|ico|xml|css)$">
Header set Cache-Control "max-age=28512000"
</FilesMatch>
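The cleaned-up version, with xml removed from the pattern (assuming you still want the long cache time for the static assets), might look like:

```apache
<FilesMatch "\.(jpg|ico|css)$">
Header set Cache-Control "max-age=28512000"
</FilesMatch>
```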

Francesc Invernon

3:45 pm on Mar 16, 2023 (gmt 0)


Hi,

We've checked all the cache directives and everything looks right, as we have max-age set to "0".
Now we will try adding "no-store" to the Cache-Control header to force the bot not to store anything. I'll update with more info.
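A minimal sketch of what that might look like in .htaccess (assuming Apache with mod_headers; restricting it to XML files is an assumption):

```apache
<FilesMatch "\.xml$">
Header set Cache-Control "no-store, max-age=0"
</FilesMatch>
```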
Thanks again.

not2easy

4:40 pm on Mar 16, 2023 (gmt 0)


It might help to add a noarchive tag to your meta robots line:
<meta name="robots" content="index, noarchive">
or, if there is a date when the content expires or you no longer want it crawled, add an expiration tag:
<meta name="robots" content="unavailable_after: 2023-04-16">

You can learn more about using Google's crawling and indexing robots instructions here: [developers.google.com...]

Francesc Invernon

8:33 am on Mar 17, 2023 (gmt 0)


Hi again,

My sitemaps are XML and, AFAIK, there are no meta directives in those.

But thanks for the advice.

not2easy

11:33 am on Mar 17, 2023 (gmt 0)


No, the meta tags belong in the <head> section of your site's pages. You can add headers to an entire folder or directory using the X-Robots-Tag directive. That is explained at the link above.
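For example, a minimal .htaccess sketch (assuming Apache with mod_headers) that sends the directive as an HTTP header for every XML file in a directory:

```apache
<FilesMatch "\.xml$">
Header set X-Robots-Tag "noarchive"
</FilesMatch>
```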

Francesc Invernon

12:23 pm on Mar 17, 2023 (gmt 0)


Yes, but as you said:
It might help to add a noarchive tag to your meta robots line
This is not the same as the X-Robots-Tag ;)

not2easy

12:34 pm on Mar 17, 2023 (gmt 0)


I also mentioned that
the meta tags belong in the <head> section of your site's pages.

None of these belong in a sitemap; the X-Robots-Tag is not added in the sitemap either. These are just additional ways to give directives to Google's bots.

phranque

9:50 pm on Mar 17, 2023 (gmt 0)


This is not the same as the x-robots tag

A meta robots element in the <head> of your HTML document is technically equivalent to sending an X-Robots-Tag HTTP response header with that document:
Robots meta tag, ..., and X-Robots-Tag specifications [developers.google.com]
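To illustrate the equivalence with hypothetical values (Apache mod_headers sketch):

```apache
# Sending this response header with an HTML page:
Header set X-Robots-Tag "noindex, noarchive"
# is treated the same as having this in the page's <head>:
#   <meta name="robots" content="noindex, noarchive">
```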