Forum Moderators: Robert Charlton & goodroi

How to de-index search pages?

     
9:23 pm on Oct 15, 2017 (gmt 0)

Junior Member

joined:May 3, 2017
posts:53
votes: 6


Hi guys, ohhh, I am all confused! All my search pages have been accidentally indexed, and obviously I don't want them indexed. I am getting confused with my tags. Is putting this in the header enough?

<meta name="robots" content="noindex,follow"/>
<link rel="canonical" href="https://mysite.com/search/keyword/" />

Or do I also need to put in these?

<link rel="next" href="https://www.mysite.com/?q=keyword/keyword &page=1" />
<link rel="prev" href="https://www.mysite.com/?q=keyword/keyword &page=1" />

Thank you for your help
10:02 pm on Oct 15, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14259
votes: 552


At the risk of sounding like That Other Forum, isn't this still the same question as this one [webmasterworld.com]?
10:12 pm on Oct 15, 2017 (gmt 0)

Junior Member

joined:May 3, 2017
posts:53
votes: 6


No, because I read that I need to use this in the search page rather than robots.txt.
10:13 pm on Oct 15, 2017 (gmt 0)

Junior Member

joined:May 3, 2017
posts:53
votes: 6


Thanks though, Lucy, for your response on the other post. So you think I should just disallow in robots.txt? Or use the meta tag?
12:03 am on Oct 16, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11080
votes: 106


if you are noindexing the page i don't see a need for those link elements.
the link rel next/prev would be useful if you were indexing a collection of documents.
the link rel canonical would be useful if you wanted "this" document indexed under another url.
if anything, the link rel canonical may send an unwanted signal.
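
A minimal sketch of the point above, assuming Python's standard html.parser module: it scans a page's head, reports whether a robots noindex is present, and lists any canonical/next/prev link elements that serve no purpose on a noindexed page. The names HeadAudit and audit are hypothetical, and the sample markup is copied from the question rather than from a live site.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect robots meta directives and the rel values of <link> elements."""

    def __init__(self):
        super().__init__()
        self.robots = set()   # directives found in <meta name="robots">
        self.link_rels = []   # rel values of <link> elements, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "link" and attrs.get("rel"):
            self.link_rels.append(attrs["rel"].lower())


def audit(html):
    """Return (is_noindexed, link rels that are redundant on a noindexed page)."""
    parser = HeadAudit()
    parser.feed(html)
    noindex = "noindex" in parser.robots
    redundant = [
        rel for rel in parser.link_rels
        if noindex and rel in ("canonical", "next", "prev")
    ]
    return noindex, redundant


# Sample head taken from the original question (illustrative only).
page = """
<head>
<meta name="robots" content="noindex,follow"/>
<link rel="canonical" href="https://mysite.com/search/keyword/" />
</head>
"""
print(audit(page))
```

If the second value comes back non-empty, those link elements are candidates for removal from the search page template.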
12:19 pm on Oct 16, 2017 (gmt 0)

Junior Member

joined:May 3, 2017
posts:53
votes: 6


Yes, that is what I thought. Thanks so much :-)
6:13 pm on Oct 22, 2017 (gmt 0)

Preferred Member from IN 

Top Contributors Of The Month

joined:Apr 30, 2017
posts:485
votes: 67


Add this to your robots.txt:

Sitemap: https://www.example.com/sitemap_index.xml
User-agent: *
Disallow: /go/
Disallow: /?s=*
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Allow: /wp-admin/images/
8:27 pm on Oct 22, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11080
votes: 106


MayankParmar please explain how your post is relevant to Samsam1978's problem.
4:02 pm on Oct 23, 2017 (gmt 0)

Preferred Member from IN 

Top Contributors Of The Month

joined:Apr 30, 2017
posts:485
votes: 67


phranque, Sam said that his website's search pages are getting indexed. If Disallow: /?s=* is added to the robots.txt, Google won't be able to index the search pages, as they are disallowed :D And once the access is blocked, Google will deindex them (hopefully).
11:20 am on Oct 25, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Oct 17, 2015
posts:107
votes: 31


Yes, but you don't disallow if you want pages to drop out of G automatically; you noindex. Only disallow if they're not already in.

If disallowing in robots.txt, then you need to manually / bulk remove the known pages via webmaster tools yourself, because if you disallow, G can't crawl the pages and so won't see the noindex. An answer previously gleaned from here.
12:26 pm on Oct 25, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11080
votes: 106


MayankParmar:

Disallow: /?s=* won't exclude robots from crawling the search urls used by Samsam1978, and as timemachined stated, exclusion from crawling will not deindex the url.

instead the url will show this description in search results:
A description for this result is not available because of this site's robots.txt


the other Disallow/Allow directives are irrelevant to the problem statement.
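
To illustrate why that rule misses the search urls, here is a rough sketch of wildcard path matching. It assumes simplified semantics ('*' matches any run of characters, a trailing '$' anchors the end, matching starts at the beginning of the path and includes the query string) and is not a full robots.txt parser. The function name rule_matches is hypothetical; the test paths are modeled on the urls in the original question.

```python
import re


def rule_matches(pattern, url_path):
    """Return True if a robots.txt path pattern matches url_path.

    Simplified assumption: '*' is a wildcard, a trailing '$' anchors the
    end of the path, and everything else is compared literally from the
    start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with a regex wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None


rule = "/?s=*"
for path in ("/?s=keyword", "/search/keyword/", "/?q=keyword/keyword&page=1"):
    print(path, rule_matches(rule, path))
```

Under these assumptions only the /?s= path matches, so the /search/ and /?q= urls would still be crawlable with that rule in place.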