Forum Moderators: goodroi


Google No Longer Supports Noindex, Nofollow and Crawl-Delay in Robots.txt


engine

4:58 pm on Jul 4, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google has clarified which rules are unsupported in robots.txt, following its recent announcement that its robots.txt parser has been submitted to open source. [webmasterworld.com]

These include noindex, nofollow, and crawl-delay in robots.txt.

[webmasters.googleblog.com...]
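For reference, a hypothetical robots.txt mixing a supported rule with the now-retired lines (rules and paths invented for illustration):

```
User-agent: *
# Still supported:
Disallow: /private/
# Never part of the standard, no longer honored by Googlebot:
Noindex: /old-page.html
Nofollow: /some-dir/
Crawl-delay: 10
```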

NickMNS

5:21 pm on Jul 4, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think that it is worth making this clear
These include noindex, nofollow and crawl-delay in robots.txt

This only applies to the use of these directives in the robots.txt file.
noindex and nofollow are both still supported in the robots meta tag and in HTTP headers.

From the Google post:
Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.

Dimitri

5:39 pm on Jul 4, 2019 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



I think that it is worth making this clear

Yes, some certainly had a heart-attack ...

tangor

9:11 pm on Jul 4, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is one more reminder that the utility of robots.txt is pretty limited and that is not likely to change anytime soon.

not2easy

10:15 pm on Jul 4, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



From the article it looks like some folks were trying to noindex files or directories by putting directives in the robots.txt file. The clarification would be better if they went out of their way to explain that a Disallow: line keeps Googlebot from ever seeing the noindex meta tag on the blocked page. It is mentioned, but it does not seem clear to a good number of people; this misunderstanding seems to happen a lot. As the article goes on to say, those "retired" rules were never supported as part of robots.txt, and they do continue to support them as metadata, such as in a robots meta tag.
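The conflict described above can be sketched with a hypothetical pair of files (paths invented for illustration):

```
# robots.txt -- this rule blocks crawling of /private/
User-agent: *
Disallow: /private/
```

```html
<!-- /private/page.html -- because crawling is blocked above,
     Googlebot never fetches this page and never sees the noindex -->
<meta name="robots" content="noindex">
```

To make the noindex effective, the Disallow has to be removed so the page can be crawled.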

... and preparing for potential future open source releases...
This is something that I hope we will see some new developments in; the new GSC does not seem to have the robots.txt checking tools that the older GSC offered. It was a help when they complained about "blocked resources".

lucy24

12:45 am on Jul 5, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



afaik, Google never did recognize Crawl-Delay. You have to do it through GSC instead. Been that way since it was called WMT.
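Side note: generic robots.txt parsers do read Crawl-delay even though Googlebot never honored it. A minimal sketch using Python's standard-library parser (sample rules invented for illustration):

```python
from urllib.robotparser import RobotFileParser

# Parse an in-memory robots.txt instead of fetching one
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Crawl-delay: 10",
    "Disallow: /private/",
])

# The parser honors Crawl-delay, but Googlebot never has;
# for Google you set the crawl rate in Search Console instead.
print(rp.crawl_delay("*"))                                  # 10
print(rp.can_fetch("*", "https://example.com/private/x"))   # False
```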

tangor

2:27 am on Jul 5, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I get chuckles over noindex ... you have to have a page for the directive to exist ... yet most webmasters try to use it to REMOVE pages from g's index. Horse/Cart, Cake/Too kind of problem. Sad reality is that g will index the noindex anyway ... so the info is out there for their AI to chomp on in the background.

engine

8:13 am on Jul 5, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



>I think that it is worth making this clear
LOL, I agree, but I thought I had, LOL

jor70

7:16 pm on Jul 6, 2019 (gmt 0)

5+ Year Member



I didn't know those could go in robots.txt. I use Disallow.