I have noticed 1,21,000 "404 Page not found" errors in my GWT. Can you please tell me how I can fix them? Does this lead to a Google penalty? I am using Drupal. Can somebody here help me out?
The 404 isn't really the problem - the problem is the links to these pages.
Find out which pages link to the 404s and then you can figure out how to fix the problem. If they're your own links, then you should correct them. If they're external links, and particularly if they link to content that has moved, you should redirect. But the links are the key - not the 404.
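If you already have crawl output, finding the linking pages is just a matter of inverting it. A minimal sketch, assuming you have a mapping of each page to the links found on it (the `crawl` data and the function name `referrers_of_broken` are made up for illustration, not taken from any tool):

```python
from collections import defaultdict

def referrers_of_broken(links_by_page, broken_urls):
    """Map each broken URL to the sorted list of pages that link to it."""
    broken = set(broken_urls)
    found = defaultdict(set)
    for page, links in links_by_page.items():
        for target in links:
            if target in broken:
                found[target].add(page)
    return {url: sorted(pages) for url, pages in found.items()}

# Hypothetical crawl output: page URL -> links found on that page.
crawl = {
    "/news/a": ["/news/b", "/tag/old"],
    "/news/b": ["/tag/old", "/missing"],
}
result = referrers_of_broken(crawl, ["/tag/old", "/missing", "/never-linked"])
print(result)
```

A reported 404 that has no entry in the result (like `/never-linked` here) was not linked from any crawled page, so the link is probably external or no longer exists.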
What Andy said, and if they're not your links, then I wouldn't stress on it ... Counting inbound links from external sites against you would make it too easy for a competitor to tank you or any site they felt like, and my personal experience is they don't count against you at all, but might well count against the site containing the links.
One thing I would add to what Andy said: whether the originating link(s) are on your site or not, if they don't go to a page whose content was moved to a different location (meaning you have a page with 'essentially the same content' as the page the link pointed to), Do Not redirect the links.
If there's no 'essentially the same' content as a link points to on your site: If the links are on your site, simply remove them.
If the links are not on your site ... And you're feeling nice, let the site containing the links know they're broken. Or if you're not feeling nice (or maybe they're on a competitor's site) then don't do anything.
Thanks for your replies. Those 404 pages are basically tags and comments. But those URLs are actually wrong, because GWT is showing them as newstag/websmaster/download. This is not the way my URLs are written. I am not sure why this is happening. The article is not present on those pages because the URLs are wrong. How can I check why this is happening on my site, and why GWT is reporting wrong URLs?
Sounds like a malformed link. Try crawling your site with something like Xenu to see if they are links from your own site. If these are internal links, you should both correct the link and redirect the wrong pattern to the correct URLs.
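One common way malformed internal links appear (an assumption, it may not be what is happening on this site) is a relative href without a leading slash, which a crawler resolves against the current page's path. A small sketch using only the Python standard library; the page URL and HTML are invented for illustration:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect every <a href> on a page, resolved to an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve the href the same way a crawler would.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(base_url, html):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links

# Hypothetical page: the second link is missing its leading slash.
page_url = "http://example.com/news/story"
html = '<a href="/tag/drupal">ok</a> <a href="tag/drupal">relative</a>'
links = extract_links(page_url, html)
print(links)
```

The second link resolves to `http://example.com/news/tag/drupal` instead of `http://example.com/tag/drupal`, which is the kind of URL that never existed on the site but still gets crawled.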
They're not indexing it. They're "only" trying to crawl it.
How big is the site overall? If you've got millions of real pages, 1.2 lakhs of "not found" (quick detour to Profile to confirm that comma positioning was intentional) isn't that big a deal. But if you've got ten 404s for every real URL, the googlebot might start getting annoyed. Only yesterday in an unrelated thread, someone mentioned the dreaded "poor technical quality" label. There's probably a descriptive article somewhere.
If the 404s include queries that never occur in real life, it should not be that hard to get rid of them and redirect to the intended URL.
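As a rough sketch of that idea, assuming "queries" here means query strings and that the intended URL is simply the same path without one (both are assumptions, and `strip_query` is a made-up helper name):

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Hypothetical 404 URL carrying a query that the site never generates.
target = strip_query("http://example.com/news/story?foo=1")
print(target)
```

The actual redirect would still be issued by the web server or the CMS; this only computes where such a URL should point.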
I wouldn't say it leads to a 'penalty', per se. But in a game where the separation between 1st and 10th is 0.0001 (assuming only 1,000,000 possible results for a query), I think you should be all over every detail you have control of. Personally, if I was at a search engine with as many choices as they have for a query, and my goal was to make my visitors happy so they keep coming back and using my search engine, I'd send them to the 'safest choice', which isn't the site with the broken links on it...
In my opinion, the site whose owner didn't 'lazy out', 'shrug' and think 'well, there's not too many broken links on here in relation to my page count so I'll just leave them...' is a much safer choice than the one whose owner was too busy (lazy) to spend the time it would take to find and remove the links, so that wherever a visitor clicked they were taken to the right page with the right information.
I would also think broken links could easily influence the number of people who click the 'block results from' link, which is another 'indirect' factor influencing rankings and could easily be related to a site containing broken links.
It's your site, but I would think you want to make it the best choice for search engines to send their visitors to, so in my opinion, every detail you have control over that can influence the 'safety' of search engines sending their visitors to you is important, which includes every single link you place on your site...
If you have 121000 internal links on your site that return 404 Not Found then this is a sign of bad quality. You need to find out where these links are on your site and then figure out who (which script) is creating them. Then you need to correct this so that either links point to the correct URL or that links are removed.
If however there are 121000 URLs not found, but when you crawl your site, these URLs are not found anywhere within your own source code, then this is not a problem.
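To apply that test, you can compare the list of 404s exported from WMT against the set of links your own crawl found. A minimal sketch with invented sample data (`split_reported` is a hypothetical helper, not part of any tool):

```python
def split_reported(reported_404s, crawled_links):
    """Split reported 404 URLs into (still linked internally, not found in crawl)."""
    crawled = set(crawled_links)
    still_linked = sorted(u for u in reported_404s if u in crawled)
    not_in_crawl = sorted(u for u in reported_404s if u not in crawled)
    return still_linked, not_in_crawl

# Hypothetical data: 404s reported by WMT and links seen in your own crawl.
reported = ["/tag/old", "/missing", "/gone"]
crawled = ["/news/a", "/tag/old", "/news/b"]
still_linked, not_in_crawl = split_reported(reported, crawled)
print(still_linked)
print(not_in_crawl)
```

Anything in the first list is still being generated by your own site and needs fixing; the second list is the group that, per the above, is not a problem.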
It could be that at some point you were generating such malformed links owing to some problem, but the problem has since been fixed. Google might have picked them up in the meantime and will be retrying them periodically. You could declare them as "fixed" in WMT and see whether they reappear in reporting and, if so, at what rate. You can then view the "linked from" information for some of these links to see where Google managed to find them again (your site or external).
Thanks for your help. I am not sure what is happening. We recently migrated our portal from Drupal 6 to 7, and the URLs were migrated to Drupal 7. I tried to mark those as fixed, but no luck: I did it for almost 2,000 URLs but hardly 500 got removed. Can this kind of glitch lead to a Google penalty?
No. When I marked 1000 URLs as fixed, nothing happened at first, but as soon as Google refreshed GWT it was reflected: 500 URLs were removed. Today, when I again removed 1000 URLs, nothing happened. Do "not found" pages lead to a penalty?