Forum Moderators: Robert Charlton & goodroi


Pages deindexed & GSC showing 'noindex' detected in 'robots' meta tag


stymie

12:30 am on Dec 9, 2020 (gmt 0)

5+ Year Member



Hello all. I have been scratching my head over this for several days now, and thought I would see if one of the gurus here has a solution. Over the last few weeks, 3 of our more popular pages have been deindexed (2 returned on Sunday, but 1 is still MIA), and when the URL is inspected, GSC shows an incorrect "'noindex' detected in 'robots' meta tag" error. There is NO noindex directive on the page or in the robots.txt file. Needless to say, this is costing us money and causing headaches.

Our marketing team implemented some Google Optimize A/B tests right around the first time that it happened, so it may or may not have something to do with it. Has anyone else run into this issue? Any ideas how to resolve it?

goodroi

5:43 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'd start by talking with your tech dept.

It doesn't matter what is being served to your browser. What matters is what the server gives Googlebot when it comes to visit.
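As a sketch of that check (the URL is a placeholder, and the fetch is commented out in favor of a simulated response so the example is self-contained): fetch the page while sending Googlebot's user-agent string, then look for a robots meta tag, since some stacks vary the response by user agent.

```shell
# In practice you would fetch the live page as Googlebot, e.g.:
# html=$(curl -s -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "https://example.com/page")

# Simulated response body for illustration only:
html='<html><head><meta name="robots" content="noindex,nofollow"></head><body></body></html>'

# Case-insensitive search for a robots meta tag; if one turns up here but
# not in what your browser receives, the server is varying by user agent.
printf '%s\n' "$html" | grep -oi '<meta name="robots"[^>]*>'
```

Note that a clean result from a spoofed user agent is not conclusive, since some setups identify Googlebot by IP range rather than by user-agent string.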

stymie

6:13 pm on Dec 10, 2020 (gmt 0)

5+ Year Member



The developers have looked at it, and have no idea why it is being seen as noindex when it is not coded that way on the page, in the header, or in the robots.txt file.

goodroi

7:33 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Are you using Google Tag Manager? That can also inject code into the page.

stymie

7:55 pm on Dec 10, 2020 (gmt 0)

5+ Year Member



We are using Google Tag Manager. We have also (possibly) narrowed it down to an issue with page resources not loading for the mobile version. When I run the mobile-friendly test, it passes, but it states that not all resources could be loaded, as seen in the screenshot here:

[ibb.co...]

When you mouse over the robots.txt link in the status column, it links to the robots.txt for that resource (hubapi, fontawesome, linkedin, etc.). So the robots.txt files it is citing appear to be theirs - not ours. Now how do we resolve this? Do we need to add code to our site? Is this being done by our CDN? Feeling my way around in the dark here...lol.
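For what it's worth, you can confirm what a third party's robots.txt tells Googlebot without GSC. A minimal sketch (the robots.txt content below is invented for illustration; in practice you would fetch the real file, e.g. `curl -s https://use.fontawesome.com/robots.txt`, where that URL is just an example host):

```shell
# Hypothetical third-party robots.txt content:
robots='User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /'

# Show the Googlebot-specific rules. A blanket "Disallow: /" means Googlebot
# may not fetch that host's resources, which produces exactly the
# "not all resources could be loaded" warning in the mobile-friendly test.
printf '%s\n' "$robots" | grep -A1 -i 'googlebot'
```

If the blocked files are only fonts, social widgets, and analytics, Google generally renders the page fine without them; it is the page's own content and critical CSS/JS that must stay fetchable.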

goodroi

9:39 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Talk to the person on your team that touched it last. They should know how to undo it. If you are still lost, then it might be time for you to bring in a professional. WebmasterWorld is intended to be a discussion forum where we help each other understand things; it isn't a place to seek others to work for free, so let's be careful this thread doesn't turn into that :)

stymie

10:00 pm on Dec 10, 2020 (gmt 0)

5+ Year Member



My apologies. Wasn't asking anyone to work for free. Just thought I'd post in case someone had a similar experience and had a solution.

NickMNS

10:57 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Are you sending noindex in the page headers? More specifically, the X-Robots-Tag header. See here for more info:
[developers.google.com...]
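To illustrate that check (simulated headers below, since the real URL isn't public; in practice you would run `curl -sI https://example.com/page`, with the URL being a placeholder): a noindex sent as an HTTP header never appears in the page source, which is why a view-source check can come up empty.

```shell
# Simulated header block for illustration only:
headers='HTTP/2 200
content-type: text/html
x-robots-tag: noindex'

# Grep the headers for X-Robots-Tag; a hit here explains a "noindex
# detected" report even when the HTML contains no robots meta tag.
printf '%s\n' "$headers" | grep -i 'x-robots-tag' || echo 'no X-Robots-Tag header'
```

CDNs and reverse proxies can add this header on their own, so it is worth checking the response from the edge, not just from the origin server.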

stymie

11:14 pm on Dec 10, 2020 (gmt 0)

5+ Year Member



We had someone look at it last week, and they did not see anything concerning. Here is what is showing:

HTTP/2 403
date: Thu, 10 Dec 2020 23:13:21 GMT
content-type: text/html; charset=utf-8
content-length: 192
set-cookie: AWSALBTG=72jae0KpL+xpE1U4YgeZg7aRIaJffCk5/s0Ij/pmQ4Suk1UWtE/Q3IwieSqtTzJayYrexNVt3PyG6n596kl13t7MQn7uR15K6capE1gAp+r3I+BZ7VhZNpVNMVhaQCn0h4OaGgTs4alQfzIvPE3jhM5NFZky6bYzvkJBA2OZgnp1IMueghE=; Expires=Thu, 17 Dec 2020 23:13:21 GMT; Path=/
set-cookie: AWSALBTGCORS=72jae0KpL+xpE1U4YgeZg7aRIaJffCk5/s0Ij/pmQ4Suk1UWtE/Q3IwieSqtTzJayYrexNVt3PyG6n596kl13t7MQn7uR15K6capE1gAp+r3I+BZ7VhZNpVNMVhaQCn0h4OaGgTs4alQfzIvPE3jhM5NFZky6bYzvkJBA2OZgnp1IMueghE=; Expires=Thu, 17 Dec 2020 23:13:21 GMT; Path=/; SameSite=None; Secure
server: nginx
vary: Accept-Encoding

not2easy

3:41 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You do not see a 403 server response as anything concerning? That is the server error response for "Forbidden" access.
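One quick way to test whether a firewall or CDN rule is behind that 403 is to request the page twice, once with a default user agent and once claiming to be Googlebot, and compare the status codes. A sketch (the URL is a placeholder; note that some WAFs verify Googlebot by IP address, so matching codes here are suggestive, not conclusive):

```shell
URL="https://example.com/page"
UA_BOT="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# -o /dev/null discards the body; -w prints only the HTTP status code.
browser_code=$(curl -s -o /dev/null -w '%{http_code}' "$URL")
bot_code=$(curl -s -o /dev/null -w '%{http_code}' -A "$UA_BOT" "$URL")

# 200 for the browser UA but 403 for the Googlebot UA would point at a
# WAF/CDN rule rather than the application itself.
echo "browser UA: $browser_code, Googlebot UA: $bot_code"
```

A sustained 403 to Googlebot is enough on its own to get pages dropped from the index, independent of any robots meta tag question.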

Edited to add - the problem you describe about third-party resources not being available to Google is common; the same thing happens with Google's own AdSense content on a page. If you look at the rendered page in GSC, you should be able to see your site as Googlebot sees it. If there are external resources that are blocked by third parties, that will eventually work itself out as Google analyzes its crawls.

stymie

4:18 pm on Dec 11, 2020 (gmt 0)

5+ Year Member



SEO is my area of expertise, so things like this are foreign to me. I am sure that some are of the opinion that I should know server response codes, but I do not.

I have seen other forums post the same thing about Google working this out on its own. The problem is that something is causing Googlebot to see the site incorrectly because the resources are not fully loading (example: [ibb.co...]), and because of that, it is getting deindexed. Just need to figure out what we can do to address this. Thank you for your response.

tangor

5:22 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One way to check is to shut down all third-party add-ins, wait for Google to crawl again, and then check GSC for the responses. If everything checks out, then one of those third-party items is sending the 403 that happens to nick you in the process.

(One reason my third-party count is ZERO, or only ONE, such as AdSense.)

It might make the site look funny if you are relying on fontawesome (for example), but you will find out whether anything YOU have coded is at fault.