
Google SEO News and Discussion Forum

121,000 404 Page not Found - How to remove them?

 10:38 am on Oct 24, 2012 (gmt 0)


I have noticed 1,21,000 "404 Page not found" errors in my GWT. Can you please tell me how I can fix them? Does this lead to a Google penalty? I am using Drupal. Can somebody here help me out?

Thanks & Regards,


Andy Langton

 10:40 am on Oct 24, 2012 (gmt 0)

The 404 isn't really the problem - the problem is the links to these pages.

Find out which pages link to the 404s and then you can figure out how to fix the problem. If they're your own links, then you should correct them. If they're external links, and particularly if they link to content that has moved, you should redirect. But the links are the key - not the 404.


 10:46 am on Oct 24, 2012 (gmt 0)

What Andy said, and if they're not your links, then I wouldn't stress on it ... Counting inbound links from external sites against you would make it too easy for a competitor to tank you or any site they felt like, and my personal experience is they don't count against you at all, but might well count against the site containing the links.

One thing I would add to what Andy said: whether the originating link(s) are on your site or not, only redirect if they point to a page whose content was Moved to a different location, meaning you have a page with 'essentially the same content' as the page the link originally pointed to. If you don't have such a page, Do Not redirect the links.

If there's no 'essentially the same' content as a link points to on your site:
If the links are on your site, simply remove them.

If the links are not on your site ... And you're feeling nice, let the site containing the links know they're broken. Or if you're not feeling nice (or maybe they're on a competitor's site) then don't do anything.


 11:13 am on Oct 24, 2012 (gmt 0)

It might not even be real links. There is a thread here about users with the disqus plugin on their site. It puts some sort of session id (a number anyway) in a place in the javascript that makes Googlebot *think* it might be a link, so Googlebot goes ahead and crawls it.



 11:33 am on Oct 24, 2012 (gmt 0)

Thanks for your replies. Those 404 pages are basically tags and comments. But those URLs are actually wrong, because they show up in GWT as newstag/websmaster/download. This is not the way my URLs are written, and I am not sure why this is happening. There is no article on those pages because the URLs are wrong. How can I check why this is happening on my site, and why GWT is reporting wrong URLs?

Andy Langton

 11:42 am on Oct 24, 2012 (gmt 0)

Sounds like a malformed link. Try crawling your site with something like Xenu to see if they are links from your own site. In the event these are internal links, you should both correct the link and redirect the wrong pattern to the correct URLs.
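If you'd rather script the check than run a desktop crawler, here is a minimal sketch of the same idea in Python. It is not how Xenu works, just an illustration: it assumes you already have the HTML of your known-good pages in a dict, and the host `example.com`, the paths, and the function name `find_broken_internal_links` are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_broken_internal_links(pages, site_host):
    """pages maps each known-good URL to its HTML source.

    Returns (source_url, target_url) pairs for internal links whose
    target is not among the known-good URLs.
    """
    known = set(pages)
    broken = []
    for source_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links against the page they appear on,
            # then drop any #fragment.
            target = urljoin(source_url, href).split("#", 1)[0]
            if urlsplit(target).netloc == site_host and target not in known:
                broken.append((source_url, target))
    return broken


# Hypothetical two-page site; example.com and the paths are invented.
pages = {
    "http://example.com/": '<a href="/news">News</a> <a href="newstag/x">bad</a>',
    "http://example.com/news": '<a href="/">Home</a> <a href="http://other.org/y">ext</a>',
}
for source, target in find_broken_internal_links(pages, "example.com"):
    print(source, "->", target)
```

The "source" side of each pair is the useful part: it tells you which page (and so which template or script) is emitting the malformed link.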


 12:10 pm on Oct 24, 2012 (gmt 0)

Hello Andy,

Thanks for your suggestion. Those are the internal links I am talking about, but does a 404 page not found lead to a Google penalty?

Correct me if I am wrong somewhere. I am not sure why Google is indexing these unknown URLs, or where on my site it is getting them from.


 8:51 pm on Oct 24, 2012 (gmt 0)

They're not indexing it. They're "only" trying to crawl it.

How big is the site overall? If you've got millions of real pages, 1.2 lakhs of "not found" (quick detour to Profile to confirm that comma positioning was intentional) isn't that big a deal. But if you've got ten 404s for every real URL, the googlebot might start getting annoyed. Only yesterday in an unrelated thread, someone mentioned the dreaded "poor technical quality" label. There's probably a descriptive article somewhere.

If the 404s include queries that never occur in real life, it should not be that hard to get rid of them and redirect to the intended URL.
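As a rough sketch of what "get rid of them and redirect" could look like, assuming the bogus URLs differ from the real ones only by query parameters your site never uses. The allowlist `ALLOWED_PARAMS` and the function name `redirect_target` are hypothetical; your real rule would depend on what the bad URLs actually look like.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: the only query parameters the site really uses.
ALLOWED_PARAMS = {"page"}


def redirect_target(url):
    """Return the cleaned URL to 301 to, or None if url is already clean."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in ALLOWED_PARAMS
    ]
    cleaned = urlunsplit(parts._replace(query=urlencode(kept)))
    return cleaned if cleaned != url else None


print(redirect_target("http://example.com/news?page=2&sid=123"))
```

Note that `urlencode` may re-encode some characters differently from the original request, so in practice you would want to compare normalized forms before deciding to redirect.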


 9:11 pm on Oct 24, 2012 (gmt 0)

I wouldn't say it leads to a 'penalty', per se, but in a game where the separation between 1st and 10th is 0.0001 (assuming only 1,000,000 possible results for a query), I think you should be all over every detail you have control of. Personally, if I were at a search engine with as many choices as they have for a query, and my goal was to make my visitors happy so they keep coming back and using my search engine, I'd send them to the 'safest choice', which isn't the site with the broken links on it...

In my opinion, a site whose owner didn't 'lazy out', 'shrug' and think 'well, there's not too many broken links on here in relation to my page count so I'll just leave them...' is a much safer choice than one whose owner was too busy (lazy) to spend the time it would take to find and remove the links, so that wherever a visitor clicked they were taken to the right page with the right information.

I would also think broken links could easily influence the number of people who click the 'block results from' link, which is another 'indirect' factor influencing rankings and could easily be related to a site containing broken links.

It's your site, but I would think you want to make it the best choice for search engines to send their visitors to, so in my opinion, every detail you have control over that can influence the 'safety' of search engines sending their visitors to you is important, which includes every single link you place on your site...


 9:13 pm on Oct 24, 2012 (gmt 0)

If you have 121000 internal links on your site that return 404 Not Found then this is a sign of bad quality. You need to find out where these links are on your site and then figure out who (which script) is creating them. Then you need to correct this so that either links point to the correct URL or that links are removed.

If however there are 121000 URLs not found, but when you crawl your site, these URLs are not found anywhere within your own source code, then this is not a problem.

It could be that at some point you were generating such malformed links owing to some problem, but the problem has since been fixed. Google might have picked them up in the meantime and will be re-trying them periodically. You could declare them as "fixed" in WMT and see whether they re-appear in reporting and if so, at what rate. You can then view the "linked from" information for some of these links to see where Google managed to find them again (your site or external).
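If you end up checking a lot of "linked from" entries, a small script can tally them for you. This is only a sketch: it assumes you have collected (broken URL, linked-from URL) pairs yourself, by hand or from whatever export you have, and the function name `classify_sources` and the `example.com` data are made up.

```python
from collections import Counter
from urllib.parse import urlsplit


def classify_sources(rows, site_host):
    """rows: iterable of (broken_url, linked_from_url) pairs.

    Counts how many referring pages are internal, external, or unknown.
    """
    counts = Counter()
    for _broken_url, linked_from in rows:
        host = urlsplit(linked_from).netloc.lower()
        if not host:
            counts["unknown"] += 1
        elif host == site_host or host == "www." + site_host:
            counts["internal"] += 1
        else:
            counts["external"] += 1
    return counts


# Hypothetical sample data.
rows = [
    ("http://example.com/a", "http://example.com/p"),
    ("http://example.com/b", "http://www.example.com/q"),
    ("http://example.com/c", "http://other.org/z"),
    ("http://example.com/d", ""),
]
print(classify_sources(rows, "example.com"))
```

A high "internal" count means something on your own site is still generating the bad links; mostly "external" or "unknown" suggests Google is just re-trying old URLs.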


 8:23 am on Oct 27, 2012 (gmt 0)


Thanks for your help. I am not sure what is happening. We recently migrated our portal from Drupal 6 to 7, and the URLs were migrated to Drupal 7. I tried to mark those as fixed, but no luck: I did it for almost 2,000 URLs but hardly 500 got removed. Can this kind of glitch lead to a Google penalty?


 8:50 am on Oct 27, 2012 (gmt 0)

I did it for almost 2,000 URLs but hardly 500 got removed

They should be immediately removed from the list when you mark as fixed. Are you saying that 1500 came back again shortly after?


 10:14 am on Oct 27, 2012 (gmt 0)

No. When I marked 1,000 URLs as fixed, nothing happened at first, but as soon as Google refreshed GWT it was reflected and 500 URLs were removed. Today, when I again marked 1,000 URLs as fixed, nothing happened. Do not-found pages lead to a penalty?


 11:11 am on Oct 27, 2012 (gmt 0)

Links within a site that when clicked return a 404 error or redirect to some other place lead to a site being noted for "low technical quality".

Merely having reports of many 404 errors in WMT is not a problem unless it is caused by ongoing poor internal linking within the site.


 11:24 am on Oct 27, 2012 (gmt 0)

@ Rockzer
Have you heard of Xenu, the free broken link checker?
It is freeware that you can use to crawl your entire site to find broken links.
You may want to try it.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved