Forum Moderators: goodroi

Sitemap Creation

new_seo

6:19 am on Sep 12, 2019 (gmt 0)

10+ Year Member



Hello Forum,
I need a suggestion regarding sitemap creation for our company website.
Our company has a large number of pages and, honestly speaking, we do not have a reliable count of how many pages we have.
Recently I have been given charge of supervising the SEO activities. The first thing I discovered in GSC is that a huge number of pages are de-indexed due to redirection issues. I am working with the developer to solve that.
Another big problem is that our website has no sitemap submitted in GSC. I have suggested creating a proper sitemap so that Google can crawl our site through it. My idea is that, until we solve the redirection issues (the pages de-indexed by Google), I don't want Google to crawl those pages. Those pages are linked from within the site and from outside the site as well.
My concern is: if we submit an XML sitemap in GSC, will Google crawl only the listed pages and ignore the pages not listed there?
Any suggestion would help me move forward with the sitemap issue.
Regards
Utsav

not2easy

12:25 pm on Sep 12, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google will crawl those listed pages only and ignore the pages not listed there?
A sitemap should list all your pages but there are cases when you do not want to list all URLs. Not all websites are built the same way and there are platforms such as Joomla and WordPress that have their own URL environments.

You do not mention whether the site you are asking about is static html, dynamically generated, or using a CMS platform so I have no specific suggestions. In general you would want to have a clear idea of all the existing files and the indexing goals for your pages and directories (or folders).

Google generally will follow all links found on a site unless they are disallowed in your robots.txt file. You don't want to list pages in a sitemap that are blocked in robots.txt, but Google will still crawl pages that are not listed in a sitemap. A sitemap lets you show the pages you want to have indexed. It is like a guide or map.
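
As a rough illustration of keeping robots.txt-blocked pages out of a sitemap, here is a minimal Python sketch using only the standard-library urllib.robotparser. The robots.txt rules, the example.com URLs and the function name sitemap_candidates are made up for the example and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real site would load its own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def sitemap_candidates(urls, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return only the URLs that robots.txt allows the given agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(agent, u)]

# Made-up candidate list: one open page, one page under a disallowed path.
urls = [
    "https://www.example.com/products/",
    "https://www.example.com/private/report.html",
]
print(sitemap_candidates(urls))
```

Only the first URL survives the filter here, so it is the only one that would go into the sitemap.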

new_seo

3:10 pm on Sep 12, 2019 (gmt 0)

10+ Year Member



Thanks for your suggestion. Our company URLs are generated through a CMS. But my question is about the de-indexed pages with redirection issues. Until we solve those, I want Google not to crawl those URLs. The list is huge and the URLs follow different patterns, so you can understand that it will be difficult to block them through robots.txt. What can be done in this scenario? I am not even sure whether it is a good idea to stop Google from crawling those pages, or whether we can let it be as it is and keep rectifying those URLs.

not2easy

3:32 pm on Sep 12, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Have some pages been changed or replaced with newer pages? If this is the case, you may be able to rewrite the old URL to the new page URL and just remove the old page. You would only want to use a rewrite/redirect when the new page is a replacement for the old page. If the new page is not a replacement, you can remove the old pages and return a 410 (Gone) server response.
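
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the paths, the REPLACEMENTS table and the function response_for are hypothetical, and the use of a 301 (Moved Permanently) for replaced pages is my assumption about which redirect status would be used.

```python
# Hypothetical mapping of retired URLs to their replacements.
# None means the page was dropped with no replacement.
REPLACEMENTS = {
    "/old-pricing": "/pricing",
    "/2015-campaign": None,
}

def response_for(path, replacements=REPLACEMENTS):
    """Return (status, location) for a requested path."""
    if path not in replacements:
        return (200, None)      # not a retired URL; serve it normally
    target = replacements[path]
    if target is None:
        return (410, None)      # dropped with no replacement: Gone
    return (301, target)        # replaced: permanent redirect to the new page

print(response_for("/old-pricing"))    # (301, '/pricing')
print(response_for("/2015-campaign"))  # (410, None)
```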

Old pages which are being replaced should include a noindex header so that the new URL that replaces the old URL will be seen as its replacement. If the old page is just being dropped, it should not be redirecting to any other page. As you can see, there is not one rule for all changes; it depends on the purpose of the change.

You mention that the URLs are CMS generated but without knowing "which CMS?" I can't offer any better suggestions.

new_seo

4:47 am on Sep 13, 2019 (gmt 0)

10+ Year Member



We are using Adobe Experience Manager as our CMS. I am not knowledgeable enough to give you any insight about it; it is totally managed by our dev team.
Regarding new pages, the dev team creates new pages every now and then as requirements come in. Sometimes they add 302 redirects from old to new, and sometimes they don't even bother to do that. This has been going on for a long time. As a result the list of such URLs is now huge, and our SEO team is struggling to get those pages off Google's de-indexed list.

not2easy

5:05 am on Sep 13, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



That sounds like quite a large number of pages to carry with an uncertain status. A 302 redirect is a temporary redirect (it tells crawlers the move is not permanent), so the "new" URL is not likely to be indexed in place of the old one. It sounds like you will need some direction from management before a useful sitemap can be created.
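
To get a handle on how many URLs fall into each bucket, something like the following Python sketch could be run over a crawl export. The sample rows and the function name review_redirects are invented for illustration; the grouping of 301/308 as permanent and 302/307 as temporary follows the standard HTTP status definitions.

```python
# Status codes treated as permanent vs. temporary redirects.
PERMANENT = {301, 308}
TEMPORARY = {302, 307}

def review_redirects(rows):
    """Group (url, status) pairs by the kind of response they returned."""
    report = {"permanent": [], "temporary": [], "not_redirected": []}
    for url, status in rows:
        if status in PERMANENT:
            report["permanent"].append(url)
        elif status in TEMPORARY:
            report["temporary"].append(url)
        else:
            report["not_redirected"].append(url)
    return report

# Made-up sample rows standing in for a crawl export.
rows = [("/old-a", 302), ("/old-b", 301), ("/old-c", 404)]
report = review_redirects(rows)
print(report["temporary"])  # ['/old-a']
```

The "temporary" list is the one to hand to the dev team for review, since those are the old URLs whose replacements are least likely to be picked up.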

new_seo

6:12 am on Sep 13, 2019 (gmt 0)

10+ Year Member



So what do you suggest: after rectifying all the de-indexing issues, should we go for a sitemap? If yes, will the time required to solve the de-indexing issues do any harm?

tangor

12:00 pm on Sep 13, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do the housecleaning (i.e., if a page is truly gone, let it die as a 404 or, preferably, as a 410... which has to be configured explicitly on the server).

Quit making mistakes. That should be rule #1.

A sitemap will not cure the prior ills. But it will at least let g know what you WANT to be indexed.

G never forgets a url it has met... they continue asking for those DECADES after the page disappeared. YOU CAN'T FIX THAT and trying to is a waste of time.

GSC is not that reliable, just keep that in mind. Think of it as an indicator of how befuddled g is at times. :)

As for how many pages you have ... check your folder/files and run a site review. Then you will know exactly how many YOU have. As for what g thinks you have don't lose too much sleep over that... that's their problem, not yours.

Your site map, if you use one (I never have), should be EXACTLY what you want indexed/crawled. You cannot do better than that for that purpose. If g still gets it wrong that's on them, not you.
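
For anyone who does want one, a bare-bones sitemap is simple to produce. Below is a minimal Python sketch using the standard-library xml.etree.ElementTree; the function name build_sitemap and the example.com URLs are placeholders, and the list passed in is assumed to already be exactly the set of pages you want indexed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs; list only the pages you actually want crawled/indexed.
xml_text = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/products/",
])
print(xml_text)
```

The string returned here has no XML declaration; when saving it as sitemap.xml you would normally add one and write the file as UTF-8.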

Sometimes you can't fix stupid.

new_seo

7:08 am on Sep 16, 2019 (gmt 0)

10+ Year Member



Excellent feedback - it actually helps me to explain the situation to management.