I need some advice regarding the indexing of a site's staging environment.
Both the new site and the new CMS were indexed around 4 months ago. There are about 90 'bad' URLs that have been indexed and are returning 200 response codes. In addition to fixing the indexing problem, the company also wants to do an http > https migration. The real URLs, i.e. the content pages on the site, number about 200, of which maybe 20 have been indexed.
Breakdown of these URLs
1. 35 URLs
- URLs like /sites/test.com /widget/tom (test page)
- Have no corresponding / relevant URLs
- Some cached with the 404 error page, some returning Google's 404 page, some not cached
-> I will return a 410 for these URLs
2. 44 URLs
- URLs like /widget/blue, which should have been indexed as /gizmo/blue, or /widget/blue, which should have been indexed as /widget/blue-new
- Some cached with correct content but wrong URL, some cached, some not cached at all
-> I will return a 301 for these URLs
3. 28 URLs
- URLs of proprietary CMS assets including CSS, JPEG, Sprite files
- Half are returning a 403 response code, some are returning a 200 response code
-> I will return a 403 for those URLs
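To make the three-way breakdown above concrete, here is a minimal sketch of the decision as a lookup. The path sets, the `/cms-assets/` prefix and the `planned_response` function are hypothetical placeholders, not taken from the real site; the actual lists would come from the audit of indexed URLs.

```python
from typing import Optional, Tuple

# Hypothetical examples only; the real lists come from the indexed-URL audit.
GONE_PATHS = {"/widget/tom"}                 # group 1: no relevant replacement
REDIRECTS = {"/widget/blue": "/gizmo/blue"}  # group 2: old path -> correct path
ASSET_PREFIXES = ("/cms-assets/",)           # group 3: assumed CMS asset prefix


def planned_response(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a request path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    if path in GONE_PATHS:
        return 410, None
    if path.startswith(ASSET_PREFIXES):
        return 403, None
    # Anything not listed is assumed to be a normal content page.
    return 200, None
```

Keeping the mapping in one place like this makes it easier to check that no URL falls into two groups before the rules are moved into the server or CMS configuration.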
My thinking
1. 301s - although some of these URLs were indexed without content, each has a relevant, logical corresponding URL. 301s are a strong signal and should get the correct pages indexed and the bad URLs removed more quickly
2. 403s - eventually Google will stop trying with these URLs
Process
1. Implement the 410, 403 and 301 responses as above
2. Request temporary removal of all the 410 and 403 URLs via Google Search Console
3. Request re-crawling and indexing of the good URLs on the site via Google Search Console
4. Acquire good links to assist in deep crawling and indexing
5. Wait for indexation issues to be rectified
6. Site migration and server-side 301 redirect from http to https
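Before requesting removals or re-crawls, it may help to confirm that each URL really returns the status code planned for it. The following is a rough sketch using only the Python standard library; `fetch_status` and `find_mismatches` are hypothetical helper names, and the example.com URLs are placeholders. Some servers answer HEAD differently from GET, so a mismatch is worth re-checking by hand.

```python
import http.client
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlsplit


def fetch_status(url: str) -> int:
    """Send a HEAD request and return the raw status code (redirects are not followed)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", target)
        return conn.getresponse().status
    finally:
        conn.close()


def find_mismatches(
    expected: Dict[str, int],
    fetch: Callable[[str], int] = fetch_status,
) -> List[Tuple[str, int, int]]:
    """Return (url, expected status, actual status) for every URL that differs."""
    mismatches = []
    for url, want in expected.items():
        got = fetch(url)
        if got != want:
            mismatches.append((url, want, got))
    return mismatches
```

A possible usage, with placeholder URLs: `find_mismatches({"http://example.com/widget/tom": 410, "http://example.com/widget/blue": 301})` returns an empty list once every URL responds as planned. The same check can be reused after the https migration by expecting 301 on each http URL.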
Questions
1. How does the plan sound?
2. What else can I do to expedite the indexing and un-indexing of the URLs I want?
3. What is the extent of the damage?