
Forum Moderators: goodroi

Google No Longer Supports Noindex, Nofollow and Crawl-Delay in Robots.txt

     
4:58 pm on Jul 4, 2019 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:26370
votes: 1034


Google has clarified which rules are unsupported in robots.txt, following its recent announcement that its robots.txt parser was being released as open source. [webmasterworld.com]

These include noindex, nofollow and crawl-delay in robots.txt.

[webmasters.googleblog.com...]
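
For anyone who wants to check whether their own file contains any of these rules, here is a minimal sketch. It assumes the three rule names listed above are the only ones of interest; the function name `find_unsupported_rules` and the sample file are made up for illustration, not taken from Google's post.

```python
# Rule names the announcement lists as unsupported inside robots.txt.
UNSUPPORTED = ("noindex", "nofollow", "crawl-delay")


def find_unsupported_rules(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each unsupported rule found."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field in UNSUPPORTED:
            hits.append((number, field))
    return hits


# Hypothetical robots.txt content, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
Noindex: /drafts/
Crawl-delay: 10
"""

print(find_unsupported_rules(SAMPLE))
```

Any line it reports would need to be replaced with a supported mechanism, such as a robots meta tag or a Disallow rule, depending on the goal.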
5:21 pm on July 4, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Apr 1, 2016
posts:2711
votes: 822


I think that it is worth making this clear
These include noindex, nofollow and crawl-delay in robots.txt

This only applies to the use of these rules in the robots.txt file.
noindex and nofollow are both still supported in the robots meta tag and in HTTP headers.

From the Google post:
Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.
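
As a rough illustration of the two supported places, the sketch below checks a response for noindex in either the X-Robots-Tag header or a robots meta tag. It is a simplified check, not how Google evaluates pages: it assumes comma-separated directives and ignores bot-specific forms such as a user-agent prefix in the header or a meta tag named after a particular crawler. The names `has_noindex` and `_RobotsMetaParser` are hypothetical.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(headers: dict, html: str) -> bool:
    """True if noindex appears in the X-Robots-Tag header or a robots meta tag."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [p.strip().lower() for p in value.split(",")]:
                return True
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex({}, page))                             # meta tag case
print(has_noindex({"X-Robots-Tag": "noindex"}, ""))      # header case
```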
5:39 pm on July 4, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


I think that it is worth making this clear

Yes, some certainly had a heart attack ...
9:11 pm on July 4, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10457
votes: 1091


This is one more reminder that the utility of robots.txt is pretty limited and that is not likely to change anytime soon.
10:15 pm on July 4, 2019 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4509
votes: 348


From the article, it looks like some folks were trying to noindex files or directories by putting something in the robots.txt file. Google would make things clearer if they went out of their way to explain that a Disallow: line keeps Googlebot from ever seeing the noindex meta tag on the blocked page. It is mentioned, but it does not seem clear to a good number of people, and the mistake seems to happen a lot. As the article goes on to say, those "retired" rules had never been supported as part of robots.txt, and Google does continue to support them as metadata, such as in a robots meta tag.
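
The Disallow point can be sketched with Python's standard-library `urllib.robotparser`. This is not Google's parser and its matching rules differ in places, so treat it only as an illustration of the logic: if the crawler is not allowed to fetch a URL, it never reads that page's noindex meta tag. The function name `crawler_can_see_meta` and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def crawler_can_see_meta(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """True if the agent may fetch the URL, and so could read its meta tags."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: /private/ is blocked for every crawler.
RULES = """\
User-agent: *
Disallow: /private/
"""

# Blocked: a noindex meta tag on this page would never be fetched or read.
print(crawler_can_see_meta(RULES, "https://example.com/private/page.html"))
# Crawlable: a noindex meta tag on this page can be read and honored.
print(crawler_can_see_meta(RULES, "https://example.com/public/page.html"))
```

In other words, to get a page dropped via noindex, the page has to stay crawlable.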

... and preparing for potential future open source releases...
This is something that I hope we will see some new developments in. The new GSC does not seem to have the robots.txt checking tools that the older GSC offered. Those were a help when Google complained about "blocked resources".
12:45 am on July 5, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15872
votes: 869


afaik, Google never did recognize Crawl-Delay. You have to do it through GSC instead. Been that way since it was called WMT.
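
For contrast, Python's standard-library parser (again, not Google's) does read the field, which shows that Crawl-delay is a rule some parsers recognize even though Google ignores it. A small sketch, with a made-up crawler name `ExampleBot`:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a made-up crawler name.
RULES_WITH_DELAY = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /tmp/
"""

delay_parser = RobotFileParser()
delay_parser.parse(RULES_WITH_DELAY.splitlines())

# Delay in seconds for the named agent; None for agents with no matching entry.
print(delay_parser.crawl_delay("ExampleBot"))
print(delay_parser.crawl_delay("SomeOtherBot"))
```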
2:27 am on July 5, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10457
votes: 1091


I get chuckles over noindex ... you have to have a page for the directive to exist ... yet most webmasters try to use it to REMOVE pages from g's index. Horse/Cart, Cake/Too kind of problem. Sad reality is that g will index the noindex anyway ... so the info is out there for their AI to chomp on in the background.
8:13 am on July 5, 2019 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:26370
votes: 1034


>I think that it is worth making this clear
LOL, I agree, but I thought I had, LOL
7:16 pm on July 6, 2019 (gmt 0)

New User

joined:Jan 10, 2018
posts: 8
votes: 0


I didn't know those could go in robots.txt. I use disallow.