Forum Moderators: Robert Charlton & goodroi
1. If I want to clean this database of old URLs, what is the best approach, and what should I consider before deleting them?
2. Does Google still crawl and visit permanently redirected URLs? What is the risk if I delete all of them? I should mention that these URLs were created a few years ago, maybe 5 years or more.
Google has a long memory and will continue to attempt to crawl every page it knows about, even if the page was deleted a long time ago. Crawl frequency drops over time. Ultimately you shouldn't really care what Google crawls: if a page needs to be deleted or redirected, you delete or redirect it regardless of Google. Otherwise you will be managing URLs that provide no value to you, your users, or Google. Keeping them around also consumes your crawl budget, so it comes at an indirect cost.
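If you want to see how often Google still requests the old URLs before you clean them up, one option is to count the hits in your server logs. The following is only a rough sketch: it assumes the logs are in the common "combined" access log format, and the names `googlebot_hits` and `old_paths` are made up for illustration. It only checks the user-agent string, which can be spoofed, so treat the counts as approximate.

```python
import re
from collections import Counter

# Pulls the request path, status code, and user agent out of a
# combined-format access log line (assumed format, adjust as needed).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines, old_paths):
    """Count requests per old path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m["path"] in old_paths and "Googlebot" in m["agent"]:
            hits[m["path"]] += 1
    return hits
```

Paths that show no hits over a long period are the ones where crawl frequency has already dropped off.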
3. Is there any risk if I delete the old 404 URLs? Should I check anything else here?
I have a Lite account on Ahrefs,
I was wondering whether this is risky from an SEO point of view.
If they have backlinks, the redirect will pass link juice, but if I 404 them, I will lose the link juice, right?
Is there any risk if I delete the old 404 URLs?
The question makes no sense. If the URL returns a 404, then it is impossible for anyone--whether human or robot--to know if the page associated with the URL physically exists on the server. (The same applies to 301s and, for that matter, to almost any response other than 200 or 304.) In general, 404 means the server looked for the file and couldn't find it, but here it sounds as if you are returning the 404 manually. If so, you should return a 410 (Gone) instead. It’s more accurate, and will also make Google stop requesting the URL faster.
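To make the 404 versus 410 distinction concrete, here is a minimal sketch of how a site might pick the response for a requested path. It is not tied to any particular server or framework; `status_for`, `live_paths`, `redirects`, and `retired_paths` are hypothetical names, and a real setup would more likely express the same rules in the web server configuration.

```python
from http import HTTPStatus

def status_for(path, live_paths, redirects, retired_paths):
    """Return (status, redirect_target) for a requested path."""
    if path in live_paths:
        return HTTPStatus.OK, None
    if path in redirects:
        # Old URL with a replacement: keep the permanent redirect.
        return HTTPStatus.MOVED_PERMANENTLY, redirects[path]
    if path in retired_paths:
        # Deliberately removed with no replacement: say so with 410.
        return HTTPStatus.GONE, None
    # Anything the site has never known about: plain 404.
    return HTTPStatus.NOT_FOUND, None
```

For example, `status_for("/old-promo", set(), {}, {"/old-promo"})` returns the 410 status with no redirect target, while a path that appears in none of the collections falls through to 404.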