Soft 404- Can't see broken backlinks

         

coolseo

11:35 am on Dec 11, 2015 (gmt 0)

10+ Year Member



Our site is built on a custom-made CMS that has many technical drawbacks. We are in the process of moving to a new CMS for SEO and editorial purposes.

The 2 major concerns I have are,

1) This CMS generates soft 404 errors for deleted/non-existent pages, so none of the link research tools gives me broken backlinks, because they can't find proper 404 pages on our domain.

As far as I know, many pages that no longer exist still have good backlinks from the past.

Question: Should I change the current CMS so that it returns a proper 404 header for deleted/dropped pages, in order to get the broken backlink data now? Or can I still get this data after moving to the new CMS, which has proper 404s in place?

2) For some reason the CMS has generated a huge number of weird URL variations (no content): all possible combinations of pages and subdirectories, URLs with " in them, etc.

The Ahrefs crawl report shows 6400 pages with a '200-ok' HTTP status on our domain, while there are hardly 300 active pages on the site.
Google doesn't seem to index them, but I am not sure whether it can crawl these pages, as they are not linked from the website's code.

Question: After moving to the new CMS, should I just forget about these non-existent pages, or is there anything that I should do? I think the 404 correction will solve this issue as well.

Sorry if my questions are not so bright. :)

aakk9999

7:21 pm on Dec 11, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



These are good questions :)

1) I would get this data after the move. This way you are making a big change to the site once (move to a new CMS), not twice (changing soft 404 on old CMS then moving to a new CMS).

2) I agree that 404 correction should solve this issue as well. It may take some time though for all of these to be recrawled, especially as they are seen as low quality pages (as Google does not seem to index them).

coolseo

5:29 am on Dec 14, 2015 (gmt 0)

10+ Year Member



@aakk9999 Excellent advice! I second the logic behind one big change vs 2 big changes.

I hope I will not lose any data after the move. As I understand it, backlink crawlers can find the backlinks to all deleted pages on a given domain, irrespective of when the page was dropped, right?

ergophobe

7:46 pm on Dec 14, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just as a note to address the fact that you should be serving 404s, but aren't...

First, as a technical point, it sounds like you should be serving 410s, not 404s.

404 means: can't find the page, but I have no idea why. Might be a temporary server problem.
410 means: page is gone and not coming back; please update this link and/or remove from index.

So you can return a 410 at the server level without ever hitting your CMS.

The problem right now is that you don't know which pages did exist, but no longer do, right?

So does your CMS return *anything* that is a marker that a page was not found? For example, I am working now on a site that returns this:

example.com/404.aspx?aspxerrorpath=/requested-page.aspx

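If you have something like that, it should be a simple matter of checking for the query param and sending a 410. So you would have something like

RewriteEngine on
# Match any request whose query string carries the error marker (case-insensitive)
RewriteCond %{QUERY_STRING} (^|&)aspxerrorpath= [NC]
# Answer 410 Gone for everything except the 410 error page itself
RewriteRule !^410\.html - [G]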


See
[webmasterworld.com...]
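
To answer the "which pages did exist" question, the same marker can also be read off a crawl export. What follows is only a rough sketch, assuming your crawler gives you the final URL for each request and that the marker is the aspxerrorpath parameter from my example; the function names are made up for illustration, and your CMS may well use a different marker.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: the not-found marker is the query parameter from the example above.
MARKER_PARAM = "aspxerrorpath"


def missing_path(final_url, marker=MARKER_PARAM):
    """Return the originally requested path if final_url carries the
    not-found marker, otherwise None."""
    query = parse_qs(urlsplit(final_url).query, keep_blank_values=True)
    for key, values in query.items():
        # Compare case-insensitively, like the [NC] flag in the rewrite condition.
        if key.lower() == marker.lower():
            return values[0]
    return None


def collect_missing(final_urls):
    """Return the distinct requested paths that ended on the error page,
    in the order they first appear."""
    found = []
    for url in final_urls:
        path = missing_path(url)
        if path is not None and path not in found:
            found.append(path)
    return found


if __name__ == "__main__":
    # Hypothetical final URLs, as a crawler might report them.
    crawl = [
        "https://example.com/404.aspx?aspxerrorpath=/requested-page.aspx",
        "https://example.com/about.aspx",
    ]
    print(collect_missing(crawl))
```

The paths this collects are the ones to check for backlinks and then to serve as 410 (or redirect, if there is a sensible replacement).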