
Disallowed some pages, now they're "Good" in GSC Core Web Vitals?


seodawg802

9:55 pm on Mar 3, 2022 (gmt 0)



As the title states, I disallowed some pages via my site's robots.txt file. These are pages that aren't meant for search results and were potentially contributing to crawl budget issues.
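For reference, the rules I added look roughly like this (the paths here are placeholders rather than my real ones):

    User-agent: *
    Disallow: /internal/
    Disallow: /search-results/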

2 days after publishing the change to the robots.txt file, all of those pages showed up as "Good" under the report you find in Google Search Console, here: Experience > Core Web Vitals > Desktop or Mobile > Good URLs

I'm confused by this. Has anyone else seen this?

It's been about a week like this. Will this stick? Why is this happening? When I inspect each of the URLs in GSC, the tool shows that they are blocked by robots.txt, so I don't understand why it simultaneously reports those pages as "Good" for CWV when it isn't crawling them.

not2easy

3:42 am on Mar 4, 2022 (gmt 0)




Hello seodawg802 and welcome to WebmasterWorld [webmasterworld.com]

Unfortunately, Google does not remove pages from its index just because of a disallow in robots.txt, and Google may be receiving mixed signals about the page (or pages) in question. If you previously submitted a sitemap that included that content, and Google crawled it and found nothing to indicate you did not want it indexed, it will stay indexed even after you tell Google not to crawl it.
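For example, if one of your disallowed URLs is still listed in your sitemap along these lines (the URL here is hypothetical), that is exactly the kind of mixed signal I mean, and the entry is worth removing:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/internal/some-page</loc>
      </url>
    </urlset>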

If you do not want a page indexed, it should have a "noindex" meta tag, and a "noarchive" tag might help keep it out of Google's cached records. Keep in mind that Googlebot has to be able to crawl a page to see a "noindex" tag, so the tag won't take effect while the page is disallowed in robots.txt. Google does not easily forget a URL it has seen; it takes more effort than a disallow. If anyone anywhere ever linked to that page (or pages), Google will keep finding it from that old link.
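As a minimal sketch, the tags would go in the <head> of each page you want kept out:

    <head>
      <!-- keep this page out of search results and out of cached copies -->
      <meta name="robots" content="noindex, noarchive">
    </head>

If you can't edit the HTML (for PDFs and other non-HTML files, say), the same directives can be sent as an HTTP response header instead:

    X-Robots-Tag: noindex, noarchive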

Keep in mind that disallowing Google's search engine bots does not mean that other Google tools can't or won't be visiting the page. It also won't necessarily prevent other search engines from crawling and indexing the disallowed content.

Is there a reason you are concerned about Google's crawl budget?