Forum Moderators: Robert Charlton & goodroi


WMT again - crawl errors

         

tigger

11:36 am on Mar 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm getting very odd errors within my WMT account

It's on a blog, and it's reporting crawl errors on pages I can't even find.

It's reporting 404 errors (page not found) on URLs like url.co.uk/blog/page/10/ and so on up to page/25. None of these pages exist within the blog, so I don't understand how G can report them as crawl errors, and I've not linked to any pages with these names.

Anyone got any ideas?

tedster

7:57 pm on Mar 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



WMT reports should show you the pages where the 404 links are found - over to the right hand side of each line.

realmaverick

8:10 pm on Mar 29, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



url.co.uk/blog/page/10/ does look like a usual WP URL.

What URL structure do you have set up for paging?

tigger

8:12 am on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The permalink structure was set to custom, using /index.php/%postname%/

This is not how we would do it now, but this blog is our oldest one.

It seems that the blog/database is somehow creating its own pages & categories, based on the actual ones we have created and using the default WP structure, although these do not show in the WP admin interface.

Because Google has now crawled these mysterious pages/categories and they are producing errors by the bucketful, traffic to the correct pages of the blog has almost vanished. As I don't know where these erroneous pages are being stored or how Google initially found them, I really don't know how to tell Google to ignore them.

The blog is very big so I really want to get it resolved, so any suggestions are warmly welcomed!

tigger

9:00 am on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Further to this:

We have a domain with a main static HTML site and, separate but linked in, a WP blog. We have an XML sitemap for the main site, created with a Dreamweaver command plugin, but this obviously doesn't crawl and include the blog.

The blog doesn't have an XML sitemap generated but is crawled by Google, although Google is returning hundreds of errors because the blog is somehow creating phantom pages and G has crawled them.

Can anyone suggest a program that will create one XML sitemap covering both parts of the site? Or would it be better to create a separate one on the blog using a plugin and submit that one separately to Google? My reservation about this method is that Google is obviously already aware of the blog from just the main sitemap being submitted.

Additionally, what would you suggest for telling Google to disregard these phantom pages on the blog?

We do not know how these pages have been created (they don’t show in the blog admin) nor how Google has ever come to find them.

tedster

4:56 pm on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm still not clear about this problem, tigger. I thought requests for these URLs return a 404 status, right? Then that means the pages are NOT created, they are "not found".
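One way to settle the "created vs. not found" question is to request both URL forms directly and compare the status codes. This is only a rough sketch: the function names are made up for illustration, and the base URL in the usage comment is a placeholder, not the real site.

```python
import urllib.error
import urllib.request


def paged_urls(base, first, last, prefix=""):
    """Build paging URLs such as <base>/page/10/ ... <base>/page/25/."""
    base = base.rstrip("/")
    return [f"{base}{prefix}/page/{n}/" for n in range(first, last + 1)]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (for example 200 or 404).

    urlopen follows redirects, so a redirected URL reports the status
    of its final destination rather than the 301/302 itself.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want.
        return err.code


def report(urls, fetch=fetch_status):
    """Map each URL to its status code using the given fetch function."""
    return {url: fetch(url) for url in urls}


# Example usage (placeholder base URL, needs network access):
# base = "http://www.example.co.uk/blog"
# urls = paged_urls(base, 10, 25) + paged_urls(base, 10, 25, prefix="/index.php")
# for url, status in report(urls).items():
#     print(status, url)
```

If the /blog/page/N/ form returns 404 while the /blog/index.php/page/N/ form returns 200, that would match what WMT is reporting.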

WRT a sitemap plug-in for WordPress, there are many available. I'd say keep things simple and create two separate XML sitemaps, one for the blog and one for the other pages.
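If a single submission is still preferred, the sitemaps.org protocol allows a sitemap index file that simply lists the separate sitemaps. The sketch below is optional and assumes hypothetical file locations (sitemap.xml at the site root and in /blog/); the function name is also just for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical locations: one sitemap for the static site, one for the blog.
index_xml = build_sitemap_index([
    "http://www.example.co.uk/sitemap.xml",
    "http://www.example.co.uk/blog/sitemap.xml",
])
print(index_xml)
```

The two sitemaps themselves stay separate, as suggested above; the index only points at them.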

tigger

5:12 pm on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Ted

Thanks for replying

The error I'm seeing is in WMT under Crawl errors.

It's showing a page co.uk/blog/page/20/, and when you click that it brings up a 404. On the right it states 1 page linking to it, but clicking that shows the page co.uk/blog/index.php/page/20/, which does display a page, although I've never created anything called page/20/. I didn't know if this was an archives section it was showing, but even that doesn't seem to relate.
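To see which links on the referring page point at URLs without the index.php segment, the page's HTML can be scanned for them. This is a rough diagnostic sketch, not a fix: the class and function names are made up, and the /blog/ path and index.php/ prefix are taken from the URLs described above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_missing_prefix(html, page_url, blog_path="/blog/", prefix="index.php/"):
    """Return same-host links under blog_path that lack the index.php segment.

    Note: this also flags the blog home page and static files under
    blog_path, so the output needs a manual look.
    """
    parser = LinkCollector()
    parser.feed(html)
    page_host = urlparse(page_url).netloc
    missing = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.netloc != page_host:
            continue  # ignore links to other sites
        in_blog = parts.path.startswith(blog_path)
        has_prefix = parts.path.startswith(blog_path + prefix)
        if in_blog and not has_prefix:
            missing.append(absolute)
    return missing
```

Feeding it the saved source of /blog/index.php/page/20/ would list any links on that page that drop index.php, which is the kind of link WMT says it found there.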

It's also got pages showing /blog/index.php/page/15/?s !

One idea I do have, and this really is me grasping at straws, is that it's something to do with the way the server is set up, as I've checked another WMT account and that's also showing very odd 404s.

I've set up another WMT account for a blog that's hosted elsewhere just to see if I get similar errors, but I could be way off here.

Thanks for the sitemap help.

tigger

6:36 pm on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One idea I do have, and this really is me grasping at straws, is that it's something to do with the way the server is set up, as I've checked another WMT account and that's also showing very odd 404s.


Just to clarify, they are both on the same server.

Until the other blog (hosted elsewhere) gets crawled, I won't know if this is the case or not.