Google thinks my RSS feeds are standard HTML pages?!

         

isorg

10:23 pm on Dec 3, 2005 (gmt 0)

10+ Year Member



I have added a number of RSS feeds to my site. They are linked to from a master RSS news feed page, where people can pick and choose the specific type of widget they want to track. There are about a hundred widget-specific RSS feeds linked to from that page.

I have added them all to my Google Reader account, and so they all get downloaded several times a day by this agent:

FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)

Now, my site appeared in the Google SERPs for the first time today (it is a new site). Looking in my logs, quite a few people have arrived at my site because the RSS file came up as a standard result on the SERP for "yellow widgets". My pages describing yellow widgets were not in the results(!), but the RSS feed page yellow-widgets.xml IS there instead.

Obviously, people clicked on that, saw the XML, got scared and walked away.

Is Google supposed to show an XML file as a search result?

Here are the headers that are being sent by the RSS files:


HTTP/1.1 200 OK
Date: Sat, 03 Dec 2005 22:11:55 GMT
Server: Apache/1.3.34 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.11 FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7a
X-Powered-By: PHP/4.3.11
Content-Encoding: gzip
Vary: Accept-Encoding
Connection: close
Content-Type: application/xml

... and here are the first few lines of the page:


<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
<channel>
<title>Latest yellow widget news
etc...

The RSS feeds validate fine and show up great in the news reader, including Google Reader.

The listing in the SERPs is as follows:

Latest yellow widget news ...
File Format: Unrecognized - View as HTML
Description from the XML file

What I could possibly do is make a style sheet for the XML files that tells visitors they have reached an XML feed and provides a link to the real yellow widgets page. But probably very few visitors would bother to click through to the real page.
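
A rough sketch of the style sheet idea, under stated assumptions: browsers apply a style sheet to raw XML when the file carries an xml-stylesheet processing instruction right after the XML declaration. The helper name add_stylesheet and the path /feed.xsl are hypothetical, and the feeds in this thread are generated by PHP, so this Python version only illustrates the string handling.

```python
def add_stylesheet(feed_xml: str, href: str) -> str:
    """Insert an xml-stylesheet processing instruction after the XML declaration.

    href is the URL of the XSL style sheet (a hypothetical path in this sketch).
    """
    pi = f'<?xml-stylesheet type="text/xsl" href="{href}"?>'
    # "<?xml " with a trailing space matches the declaration only,
    # not an xml-stylesheet instruction that may already be there.
    if feed_xml.lstrip().startswith("<?xml "):
        end = feed_xml.index("?>") + 2  # position just past the declaration
        return feed_xml[:end] + "\n" + pi + feed_xml[end:]
    # No declaration present: the instruction simply goes first.
    return pi + "\n" + feed_xml


# Shortened stand-in for one of the feeds described above.
feed = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    '<rss version="2.0">\n'
    '<channel>\n'
    '<title>Latest yellow widget news</title>\n'
    '</channel>\n'
    '</rss>\n'
)

styled = add_stylesheet(feed, "/feed.xsl")
print(styled)
```

The style sheet itself would still have to be written; it is where the explanation and the link back to the real yellow widgets page would go.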

The other possibility is that Google has not finished adding pages to the index (it went from 200 in the afternoon to 300 right now), and that my real yellow widgets page will eventually outrank the XML file. In that case the current phenomenon is transient.

Thanks in advance for any thoughts on this issue!

sitedev

3:46 am on Dec 5, 2005 (gmt 0)

10+ Year Member



Hi isorg,

This may or may not help, but have you set the .XML files to "Disallow" in your robots.txt file? It won't fix what has already been indexed, but it should stop further indexing of files you don't want listed.
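
A minimal sketch of such a rule and a way to check it, assuming the feeds sit under a hypothetical /rss/ directory on www.example.com (the thread does not say where they live). It uses Python's standard urllib.robotparser, which does plain prefix matching, so the rule is written as a directory prefix rather than a file-extension pattern.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /rss/ directory is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rss/
"""


def is_crawlable(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt lets the given user agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Example URLs are made up; only the path prefix matters for the rule.
feed_allowed = is_crawlable(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/rss/yellow-widgets.xml"
)
page_allowed = is_crawlable(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/yellow-widgets.html"
)
print(feed_allowed, page_allowed)
```

A rule like this stops compliant crawlers from fetching the feeds at all, so it only fits if the feeds are not meant to be read by the search engine.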

Hope that helps :-D

isorg

7:09 am on Dec 5, 2005 (gmt 0)

10+ Year Member



Sitedev,

Welcome to WebmasterWorld!

I actually do want Google to crawl the RSS files, but I don't want them to appear in the SERPs!

Few if any of my users even know what a news reader is. I always put a load of RSS files on my sites mainly for SEO (to let the SEs know about all my pages), but I have never seen Google link to any of my RSS files directly from the SERPs!

Iguana

1:32 pm on Dec 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, Google has been showing RSS files in search results for a long time now. Generally they rank a lot lower than the page they relate to, so you might not have noticed them.