Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

How to de-index search pages?

         

Samsam1978

9:23 pm on Oct 15, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



Hi guys, ohhh I am all confused! All my search pages have been accidentally indexed, and obviously I don't want them indexed. I am getting all confused with my tags. Is putting this in the header enough?

<meta name="robots" content="noindex,follow"/>
<link rel="canonical" href="https://mysite.com/search/keyword/" />

Or do I also need to put in these?

<link rel="next" href="https://www.mysite.com/?q=keyword/keyword&page=1" />
<link rel="prev" href="https://www.mysite.com/?q=keyword/keyword&page=1" />

Thank you for your help

lucy24

10:02 pm on Oct 15, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



At the risk of sounding like That Other Forum, isn't this still the same question as this one [webmasterworld.com]?

Samsam1978

10:12 pm on Oct 15, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



No, because I read that I need to use this in the search page rather than in robots.txt.

Samsam1978

10:13 pm on Oct 15, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



Thanks though, Lucy, for your response on the other post. So you think I should just disallow in robots.txt, or use the meta tag?

phranque

12:03 am on Oct 16, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



if you are noindexing the page i don't see a need for those link elements.
the link rel next/prev would be useful if you were indexing a collection of documents.
the link rel canonical would be useful if you wanted "this" document indexed under another url.
if anything, the link rel canonical may send an unwanted signal.
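e.g. the head of a search results page would then carry just the robots meta element (a sketch, keeping the noindex,follow from the first post and dropping everything else):

```html
<head>
  <!-- allow crawlers to fetch the page, but keep it out of the index -->
  <meta name="robots" content="noindex,follow"/>
  <!-- no rel="canonical" and no rel="next"/rel="prev" link elements here -->
</head>
```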

Samsam1978

12:19 pm on Oct 16, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



Yes, that is what I thought. Thanks so much :-)

MayankParmar

6:13 pm on Oct 22, 2017 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Add this to your robots.txt:

Sitemap: https://www.example.com/sitemap_index.xml
User-agent: *
Disallow: /go/
Disallow: /?s=*
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Allow: /wp-admin/images/

phranque

8:27 pm on Oct 22, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



MayankParmar please explain how your post is relevant to Samsam1978's problem.

MayankParmar

4:02 pm on Oct 23, 2017 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



phranque, Sam said that his website's search pages are getting indexed. If Disallow: /?s=* is added to robots.txt, Google won't be able to index the search pages, as they are disallowed :D And once access is blocked, Google will deindex them (hopefully).

timemachined

11:20 am on Oct 25, 2017 (gmt 0)

10+ Year Member Top Contributors Of The Month



Yes, but you don't disallow if you want pages to drop out of G automatically; you noindex. Only disallow if they're not already in the index.

If disallowing in robots.txt, then you need to manually / bulk remove the known pages via webmaster tools yourself, because once you disallow, G won't recrawl them and notice any change. An answer previously gleaned from here.

phranque

12:26 pm on Oct 25, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



MayankParmar:

Disallow: /?s=* won't exclude robots from crawling the search urls used by Samsam1978, and as timemachined stated, exclusion from crawling will not deindex the url.

instead the url will show this description in search results:
A description for this result is not available because of this site's robots.txt


the other Disallow/Allow directives are irrelevant to the problem statement.
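for illustration, patterns that would actually match the search urls shown in the first post (assumed from Samsam1978's examples) would look more like this; note that the * wildcard is a google extension, not part of the original robots.txt standard, and the caveat above still applies: disallowing only blocks crawling, it does not deindex urls already in the index.

```
User-agent: *
Disallow: /search/
Disallow: /*?q=
```

so for pages that are already indexed, the reliable route is to leave them crawlable and serve the noindex robots meta tag until they drop out.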