Search being indexed

         

Samsam1978

3:41 am on Oct 14, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



I just noticed that my robots.txt had been missing for some time, and my search pages are now indexed, which makes the site look spammy. I have put robots.txt back with the following rules:

User-agent: *
Disallow: /search
Disallow: /?q=search/

But the page urls are like this:

http ://mysite.com/search?page= (then the number)
http ://mysite.com/search?page= (then the number)
http ://www.mysite.com/Search/page.html?= (then the page number)

Now that I have put that in my robots.txt, does it mean that anything under /search should no longer get spidered? The first line makes this applicable to all search engines.

[edited by: goodroi at 3:56 pm (utc) on Oct 14, 2017]
[edit reason] delinked example domains [/edit]

keyplyr

9:46 pm on Oct 14, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can also set that parameter to be ignored in GSC > Crawl > URL Parameters.

lucy24

10:44 pm on Oct 14, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is example.com/search an actual page that physically exists? If so, an alternative approach is to add a "noindex" meta tag to it.

If a given URL has already been indexed, disallowing it in robots.txt will not immediately remove it from the index, though it will disappear eventually. In the meantime, you can use the Remove feature in GSC.
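
One way to check which of the URL patterns from the original post the two Disallow rules cover is to run them through a robots.txt parser. The following is only a rough sketch using Python's standard urllib.robotparser; the page number 2 and the helper name blocked_urls are made up for illustration, and real crawlers may differ in details such as wildcard handling.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in the original post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /?q=search/
"""

# URL patterns from the original post; the page number 2 is a
# hypothetical value filled in for illustration.
URLS = [
    "http://mysite.com/search?page=2",
    "http://www.mysite.com/Search/page.html?=2",
]


def blocked_urls(robots_txt, urls, agent="*"):
    """Map each URL to True if the rules disallow it for the given agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: not parser.can_fetch(agent, url) for url in urls}


if __name__ == "__main__":
    for url, blocked in blocked_urls(ROBOTS_TXT, URLS).items():
        print("blocked" if blocked else "allowed", url)
```

With this parser the lowercase /search?page= URL comes out as blocked, while the /Search/page.html URL does not, because robots.txt path matching is case-sensitive. If that third pattern is a live URL on the site, it would likely need its own line such as Disallow: /Search.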