Google SEO News and Discussion Forum

This 93 message thread spans 4 pages.
Increase in not found errors message in webmaster tools
helenp




msg:4534215
 10:44 am on Jan 8, 2013 (gmt 0)

Hi, my site has dropped in the rankings (English section), and in the last 14 days I have received this message twice:
http://www.example.com/: Increase in not found errors

My site is 10 years old, and I often delete pages. I don't return any special error for them, as this happens often,
so now I have thousands of not found errors. Many of the pages are years old, and Google has still not dropped them. What should I do with them?
Can this affect my ranking?
Thanks

[edited by: Robert_Charlton at 11:20 am (utc) on Jan 8, 2013]
[edit reason] examplified domain [/edit]

 

DodgeThis




msg:4534226
 12:04 pm on Jan 8, 2013 (gmt 0)

Deleted pages should send a 410; just be sure they are definitely gone and not moved. More info here [webmasterworld.com...]

helenp




msg:4534235
 12:15 pm on Jan 8, 2013 (gmt 0)

The thing is that I have a custom 404 page for when visitors misspell a URL or a page has been deleted. Should I make a personalized 410 page instead of the 404?
There are too many URLs to add to htaccess. I could also tell Webmaster Tools to delete them, but there are thousands of them.
And yes, they are all gone: there are no internal links, and they show a 404 error in Webmaster Tools.

Could I do this, or is that bad practice?
In htaccess, serving a 410 instead of a 404:
# Custom 404 errors
ErrorDocument 404 <local-path>/error-410.html
and then a personalized 410 page with a redirection to my homepage.
Thanks

helenp




msg:4534237
 12:20 pm on Jan 8, 2013 (gmt 0)

Tedster said this about whether using a 410 could be negative:

any negative impacts on using a 410 gone on a website

Only if you want to re-use the URL. Make sure it stays "gone".

So what happens if I do re-use one? Will it ever be indexed again?
It is rather complicated to keep track of this on a site with a thousand pages, where pages are deleted and added often and some may return.

Str82u




msg:4534254
 12:58 pm on Jan 8, 2013 (gmt 0)

Don't change your 404 to 410. Non-existent pages still need to return the proper 404 page with 404 headers to tell search engines a page really didn't/doesn't exist, like misspellings. 410 is generally for search engines only, but I do use custom pages (with the proper headers) in case a live person shows up.

.htaccess would work if there were something common about the pages, like a directory/folder name or the page extension (.html, .aspx).
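
As a rough sketch of that idea, and assuming purely for illustration that the deleted pages all lived under a folder called /old-rentals/ (a hypothetical name, not one from this thread), mod_alias can return a 410 for the whole pattern with a single line in .htaccess:

```apache
# Hypothetical folder name; replace it with whatever the deleted URLs share.
# Every URL under /old-rentals/ returns 410 Gone.
RedirectMatch gone ^/old-rentals/

# A single deleted page can be handled the same way.
# /old-page.htm is a made-up example path.
Redirect gone /old-page.htm
```

URLs that never existed are not touched by these lines and keep returning the normal 404.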

Is it possible for you to restore those pages? They don't have to have real content but if you could somehow regenerate the pages you can make them all identical 410 pages (remember those headers).

aakk9999




msg:4534267
 1:15 pm on Jan 8, 2013 (gmt 0)

If you deleted a page but it still has external (or internal) links pointing to it, it will periodically reappear in the WMT "Not found" errors section. If external links to such a page were contributing to the ranking of your website, then deleting the page without redirecting it may result in a ranking drop.

The best approach would be to redirect a deleted page to either a similar page or its category page, if that is at all possible, and to link internally to the redirect target. If there are no external links to the page you deleted, then return 410 Gone with a nice custom page so that you do not lose your visitor.

On the other hand, if the page never existed, just leave it returning 404 (as said above).
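
A minimal .htaccess sketch of that decision, using made-up paths (none of these URLs come from this thread): a deleted page with external links gets a 301 to its closest replacement, a deleted page without links gets a 410 with a custom page, and anything that never existed falls through to the normal 404.

```apache
# Deleted page that still has good external links:
# send visitors and crawlers to the closest category page.
Redirect 301 /villa-123.htm http://www.example.com/villas/

# Deleted page with no external links: tell crawlers it is gone for good.
Redirect gone /villa-456.htm

# Custom pages so human visitors still see something friendly.
# Local paths keep the 410 / 404 status intact.
ErrorDocument 410 /gone.html
ErrorDocument 404 /not-found.html
```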

levo




msg:4534277
 1:29 pm on Jan 8, 2013 (gmt 0)

Since you're deleting the pages and there is no mistake, you can safely ignore the message.

Redirecting visitors to the home page would be a soft-404. [support.google.com...]

helenp




msg:4534282
 1:37 pm on Jan 8, 2013 (gmt 0)

All the pages existed, and there isn't any common pattern in them. They may have some external links, but not many, and not all of them.
At the moment I have 6,872 pages with errors in Webmaster Tools. I have not checked them all, but they all seem to be 404 errors, some even years old.
Restore the pages or redirect them... there are so many, and I would lose track, as the site is very big. I always delete them.
I suppose I would have to set up a 301 in htaccess for those with backlinks and restore those without... hard work.
However, could this have something to do with my ranking?
I don't really like the error message, the warning.
Thanks

g1smd




msg:4534292
 1:46 pm on Jan 8, 2013 (gmt 0)

personalized 410 page with a redirection to my homepage


Once you have sent 404 or 410 status there can be no "redirection".

helenp




msg:4534295
 1:55 pm on Jan 8, 2013 (gmt 0)

Thanks levo and g1smd,
that's what I did wrong: redirecting the 404. That way the pages never disappear. I should have just a link instead.

helenp




msg:4534369
 5:30 pm on Jan 8, 2013 (gmt 0)

It looks like this has something to do with my homepage dropping. It happened at the beginning of December, I think.
I just checked the number of indexed pages in Webmaster Tools; under the advanced view there is a graph.
All 3 lines had always been more or less equal, with 866 not selected on 2/12, and the number of not selected kept growing until the last day on the graph (6/12), with 50,025 not selected... and then the warning message appeared.
These are the figures from that graph for 6/12:
Indexed total: 626
Not selected: 50,025
Pages blocked by robots: 95
I suppose the 50,025 pages are due to the 404 page.

Also, I had about 3,000 links from one site to an inner page of mine. I wrote to the site but did not get any answer, so I renamed the page and told Google to delete that URL, and it's now gone.

I have taken away the redirection in the 404 page. What more can I do now? I don't understand the number of 50,000; there are only 6,872 errors...

aakk9999




msg:4534406
 7:13 pm on Jan 8, 2013 (gmt 0)

I suppose the 50,025 pages are due to the 404 page.

I think not. This would indicate that there are pages that are not returning a proper 404/410 response, or that you have too many pages that are noindexed or redirecting, or that you have a duplicate content / thin content issue.

Right now I am in the process of tidying up a site that should have roughly 2,000 URLs indexed, where WMT reported over 80,000 "Not selected" URLs. These were due to:
- the server previously returning 200 OK for pages Not Found
- having many URLs with dates in the URL that should not have been allowed to be indexed in the first place.

At the beginning of December we asked for a change to be implemented to return 410 for all pages that should not have been indexed owing to dates in the URL, and to return a proper 404 response when the page is genuinely not found.

This resulted in "Not Selected" initially dropping at a rate of approximately 500 URLs/day, and then last week WMT recorded a big drop of almost 40,000 URLs from the "Not Selected" chart. After 5 weeks the site is now down to 20,000 "Not Selected" URLs.

From what we can see, it seems that URLs returning 410 are dropped from "Not selected" quicker than URLs returning 404.

I would therefore carefully inspect your URLs, perhaps using the "site" command narrowed down with an "inurl" string and some filters, to see where these "Not selected" are coming from. I don't think they are because of 404 errors.

aakk9999




msg:4534409
 7:24 pm on Jan 8, 2013 (gmt 0)

Reading your post above again, now that you have fixed your redirection to the home page for pages not found, and hopefully you are returning 404 Not Found status, you should see the number of "Not Selected" starting to drop over the next few weeks.

helenp




msg:4534412
 7:29 pm on Jan 8, 2013 (gmt 0)


About a week ago I also went into URL Parameters (mine is in Spanish) and specified that, for the 969 URLs with the same id, only one representative URL should be indexed. (Strangely, before there were only about 23, and Google said everything was OK, no need to touch anything...)
I don't quite understand what you mean by this:
"I would therefore carefully inspect your URLs, perhaps using "site" command narrowed down by using "inurl" string using some filters, to see where these "Not selected" are coming from. I don't think they are because of 404 errors."
Do you mean searching in Google?
Thanks

[edited by: helenp at 8:18 pm (utc) on Jan 8, 2013]

helenp




msg:4534416
 7:31 pm on Jan 8, 2013 (gmt 0)


Reading your post above again, now that you have fixed your redirection to the home page for pages not found, and hopefully you are returning 404 Not Found status, you should see the number of "Not Selected" starting to drop over the next few weeks

Hopefully yes, but I am in a hurry as the season has started. I thought the drop was maybe only Christmas-related, but it looks to me as if the not selected URLs are the cause of the drop, as it started at more or less the same time.
It looks like Google is counting rubbish.

aakk9999




msg:4534441
 8:57 pm on Jan 8, 2013 (gmt 0)

I don't quite understand what you mean by this:
"I would therefore carefully inspect your URLs, perhaps using "site" command narrowed down by using "inurl" string using some filters, to see where these "Not selected" are coming from. I don't think they are because of 404 errors."
Do you mean searching in Google?


What I meant is that you could do something like:

site:example.com inurl:*php*someparam=

which will (for example) return all URLs that are php pages, that Google has in its index, and that use this particular parameter in the URL. Often you will get the message "...we have omitted some entries very similar to the 3 already displayed...", in which case you should click on "repeat the search with the omitted results included" to get the number of URLs indexed but "Not selected".

Trying this with different URL patterns and different parameters can show you which URLs may have a problem with (perhaps) duplicate content, thin content and similar, and you can check whether you have addressed the problem by blocking these via robots, noindexing them, or in some other way.

helenp




msg:4534458
 9:57 pm on Jan 8, 2013 (gmt 0)

Thanks, however the pages that return 404 (most of them) aren't dynamic and don't have any parameters.
I have had parameters for some years, but only for the booking form of each static page. There were only about 20 of those parameter URLs in Google Webmaster Tools before, and Google told me not to touch anything.
For about a week I have had more than 900 with that parameter, and I told Google to index only one of them (representative).
I think the number of parameter URLs may have increased at the same time as the number of 404 pages increased and my homepage fell.
I am seriously considering serving a 410 instead of a 404 to see if Google removes those 404 errors from Webmaster Tools more quickly.
(The pages I checked were not indexed, but they appeared as errors in Google Webmaster Tools.) That is, Google has them as 404 even though they redirect to the homepage, but I think it sees the homepage as duplicate content.
It looks to me as if all this is due to the update in December, or maybe November.
Maybe I should serve the 404 as 410 for a while, just to speed up their removal from Webmaster Tools.

[edited by: helenp at 10:04 pm (utc) on Jan 8, 2013]

helenp




msg:4534463
 10:01 pm on Jan 8, 2013 (gmt 0)

Anyway, this gave me only 2 pages after clicking on "see all", and it did not return the pages with the parameter $propiedad, only pages containing the word propiedad:
site:mysite.com inurl:*php*propiedad=
Maybe I did it wrong.

helenp




msg:4534476
 10:31 pm on Jan 8, 2013 (gmt 0)

I'm thinking this is really odd. How can Google Webmaster Tools tell me there are 969 URLs with the parameter $propiedad? I never had that many at the same time; at the moment I may have only 75.
That means the 404 pages must also be included in those 969, or Google has gone nuts.

lucy24




msg:4534477
 10:32 pm on Jan 8, 2013 (gmt 0)

Could I do this, or is that bad practice?
In htaccess, serving a 410 instead of a 404:
# Custom 404 errors
ErrorDocument 404 <local-path>/error-410.html
and then a personalized 410 page with a redirection to my homepage.

If you mean 301 redirect, noooo. But if you mean that your custom 404 or 410 page includes a link to your home page: yes that's fine. Link to anything you like.

For some sites you can use the same physical page for both 404 and 410:

ErrorDocument 404 /my404.html
ErrorDocument 410 /my404.html

The user won't know, because their address bar shows the URL they originally asked for. And the search engine will get the right response, 404 or 410. It just depends on what the page looks like. But do make sure you specify something for a 410 page. Otherwise your human users will get the Apache default, which is scary and intimidating.
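
One detail worth checking, sketched here with the same placeholder file name: Apache only preserves the original status when ErrorDocument points at a local path. If the directive is given a full URL, Apache sends the browser a redirect to that URL instead, and the client never sees the 404 or 410.

```apache
# Local path: the visitor sees the custom page and the crawler
# still receives the real 404 / 410 status.
ErrorDocument 404 /my404.html
ErrorDocument 410 /my404.html

# Full URL (commented out, shown as the form to avoid): Apache answers
# with a redirect to this address, so the original error status is lost.
# ErrorDocument 404 http://www.example.com/my404.html
```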

How can Google Webmaster Tools tell me there are 969 URLs with the parameter $propiedad?

Oh, that's just google. They like asking for pages with made-up parameters or made-up URLs just to see what they get. But 969 seems a bit over the top. If they're all returning 404, surely 4 or 5 should be enough to reassure google?

Moral: Do Not Try To Understand Google.

aakk9999




msg:4534488
 11:55 pm on Jan 8, 2013 (gmt 0)

@helenp
You posted in December about a ranking drop for your english language pages in this thread: [webmasterworld.com...]

Have your rankings recovered? If they did, then perhaps the ranking problem was caused by these 50,000 URLs Google has "discovered" and now that they are returning 404, Google reports these 404 errors.

If you did fix your site so that 404 response is correctly returned (when it wasn't previously), then the message on increased number of 404 in WMT is a normal situation. You should briefly review these URLs, then declare them as "fixed" and if they are not linked from anywhere any more (e.g. they were result of some kind of error on the site/hosting) then they will not re-appear again.

It is always worth checking a few of these URLs by clicking on a URL in the 404 list and then looking at the "Linked from" tab to see if Google gives you any indication of where it found that URL. Sometimes the "Linked from" tab will be empty, but sometimes it will show you where it found the URL that now returns 404. This should give you some clues.

If it shows a page on your own site as "Linked from" source and this page either does not exist or when checked, the URL that returns 404 is not there any more, it was probably a temporary glitch that has been fixed since.

In this case the only thing you can do is keep declaring these URLs as Fixed and ignore the 404 message in WMT - these will eventually go away as Google drops 404 pages.

helenp




msg:4534543
 7:10 am on Jan 9, 2013 (gmt 0)


@helenp
You posted in December about a ranking drop for your english language pages in this thread: [webmasterworld.com...]

Have your rankings recovered? If they did, then perhaps the ranking problem was caused by these 50,000 URLs Google has "discovered" and now that they are returning 404, Google reports these 404 errors.

If you did fix your site so that 404 response is correctly returned (when it wasn't previously), then the message on increased number of 404 in WMT is a normal situation. You should briefly review these URLs, then declare them as "fixed" and if they are not linked from anywhere any more (e.g. they were result of some kind of error on the site/hosting) then they will not re-appear again.

This is exactly what I am thinking; I also mentioned in a previous post that this could be the reason.
No, the page has not recovered, but it's really odd. How can Google, at the beginning of December, discover pages that do not exist and also invent parameters itself? It looks like they are using the Google cache instead of my sitemap. It's really odd.
I hope you are right. However, the pages already returned a soft 404 and will now give a real 404, but as far as I have heard, 404s take a long time to disappear.
And I thought "marked as fixed" was for your own record of what's done and what's not, and that it does not tell Google anything.

I have been checking some, and this is interesting:
some have good links, so I guess I had better do a 301 on those.
However, I have 404s for URLs that never existed, with odd folders like this:
hacienda_nagueles_apartments_marbella.htm/sales/svenska/maps/espanol/_availability_september.php
and it is linked from a page of mine like this, which never existed either, not even the folders:
mysite/hacienda_nagueles_apartments_marbella.htm/sales/svenska/maps/espanol/index.htm
There are hundreds similar to the above, and they are folders and pages mixed up.
As I said, Google has gone nuts.

aakk9999




msg:4534568
 9:59 am on Jan 9, 2013 (gmt 0)

From your sample URL I would imagine you have had a relative path problem or a redirect problem somewhere. A relative path problem can occur if you are linking internally to a URL where the href does not start with the root slash (/) and the page containing the link is inside a folder.

It could also be caused by a badly implemented site move.

It is interesting that you said some of these URLs that return 404 have good incoming links, which would indicate that such URLs have existed for some time, long enough to acquire good links.

helenp




msg:4534573
 10:29 am on Jan 9, 2013 (gmt 0)

The incoming links are not for those "fantasy pages" but for real pages I have deleted.

I always write links like this: page.php. I use Dreamweaver and always check for broken links before I upload, and after uploading I use Xenu to check for broken links, so I doubt there was a mistake on my side. What worries me is that it can take a long time for Google to drop those.

Str82u




msg:4534574
 10:33 am on Jan 9, 2013 (gmt 0)

From your sample URL I would imagine you have had somewhere relative path problem OR redirect problem.
Exactly right. Creating links like that without the path or / can allow users and Google to resolve those pages in any public directory you have. Can turn a 100 page website into a virtual 10,000 page monster.

helenp




msg:4534575
 10:48 am on Jan 9, 2013 (gmt 0)

Exactly right. Creating links like that without the path or / can allow users and Google to resolve those pages in any public directory you have. Can turn a 100 page website into a virtual 10,000 page monster.


So is this bad: href="svenska/index.htm"? (svenska is the Swedish folder.) That's the way Dreamweaver writes the internal links.
I guess maybe I can configure Dreamweaver to write href="/svenska/index.htm" instead for future links.
Should I use absolute links instead? And how can one convert the links of a site with nearly 700 pages...
Manually, a nightmare.
Thanks

Str82u




msg:4534588
 11:04 am on Jan 9, 2013 (gmt 0)

how can one convert a nearly 700 page sites links....
manually a nightmare.
No experience with Dreamweaver, but FrontPage and Notepad++ have mass find and replace functions. If Google is showing they are 404, it sounds like the links they DID have are no longer valid and may never have been. Can you visit any of those links and see a real content page? It almost sounds like you need to discern between the actual 404 for never-existed pages (404) and those you deleted (410).

@helenp - I hope you aren't having to repeat yourself - Did you mention where Google is finding the links in the "Linked from" section of GWT? Are you able to judge that your site is the cause of most or all of them?

Have you read this? [googlewebmastercentral.blogspot.com...]

helenp




msg:4534597
 11:17 am on Jan 9, 2013 (gmt 0)

Yes, Dreamweaver has find and replace too, but there are so many links...

I'm not sure what you mean; I think you mean where the links to those odd 404s come from. Well, they come from my site, with the same odd patterns, like:
/hacienda_nagueles_apartments_marbella.htm/sales/svenska/maps/espanol/_availability_july.php
/hacienda_nagueles_apartments_marbella.htm/sales/svenska/maps/espanol/_availability_february.php
etc.
But if I remember correctly, I renamed 2 files because 2 sites had too many links to 2 of my pages. One site had more than 3,000 links to a Swedish page. The other, and this is the strange part, was a parked domain with ads on it, where I could not see my site; it had more than 300 links to, I think, the problematic page above, hacienda_nagueles_apartments_marbella.htm.
I wrote to the first site and got no answer, and you can't write to a parked domain, so I renamed those 2 files and told Google to delete the files that had so many strange links, as the drop of my homepage could maybe be due to those strange backlinks.
So maybe those pages previously had both my own pages and that parked domain as backlinks.

I know 404s do not normally hurt you. However, the drop of my homepage in December, the 50,000 excluded pages since the beginning of December, and now this many 404s are too much to be a coincidence.
I guess the 50,000 excluded pages caused the drop.

Str82u




msg:4534598
 11:29 am on Jan 9, 2013 (gmt 0)

Not sure what you mean, think you mean where the links comes from to those odd 404
In GWT, on the "crawl error" page you can select each error, and filter by type of error. When you check the links by clicking them, what additional information does it give you about where the links are "Linked From" (that's the name of the tab)?

It's likely the errors are links created by crawl mistakes but it would be interesting to know if they are all from your site or if there are any external links creating them.

g1smd




msg:4534600
 11:30 am on Jan 9, 2013 (gmt 0)

So this is bad href="svenska/index.htm"?

Yes. This is bad.

I guess maybe I can configure Dreamweaver to write href="/svenska/index.htm" instead for future links

This is also bad.

You should do:
href="/svenska/"

The leading slash is required.

The link should not mention the index file filename.

Your DirectoryIndex directive should take care of delivering the correct content.
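
A small sketch of the server side of this, assuming the index file in each folder is named index.htm as in the links above. The rewrite rule is an optional extra that this thread does not discuss: it 301-redirects any direct request for an index.htm URL to the bare folder URL, so only one form of the address stays in circulation.

```apache
# Serve index.htm when a folder URL such as /svenska/ is requested.
DirectoryIndex index.htm

# Optional: redirect /svenska/index.htm (or /index.htm) to the folder URL.
# THE_REQUEST is checked so that only requests sent by a client are
# redirected, not the internal lookup performed for DirectoryIndex.
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm[?\ ]
RewriteRule ^(.*/)?index\.htm$ /$1 [R=301,L]
```

With this in place, internal links can point at href="/svenska/" and old links to the index file still end up on the same page.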