
Forum Moderators: goodroi

Blocked by robots.txt

example.com/search?who=

     
8:58 am on Aug 7, 2019 (gmt 0)

New User

joined:July 23, 2019
posts: 24
votes: 0


I am seeing a "Blocked by robots.txt" exclusion issue.
My robots.txt contains the rule "Disallow: /*?who=". What do I need to do so that pages like example.com/search?who= are no longer reported as blocked by robots.txt?
What is the best practice?
9:04 am on Aug 7, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


I am not sure I understand what you are trying to say. Do you want to prevent robots from visiting your search page?
9:18 am on Aug 7, 2019 (gmt 0)

New User

joined:July 23, 2019
posts: 24
votes: 0


Why are these pages excluded as "Blocked by robots.txt"? Is that good for my site? Or should I delete the "Disallow: /*?who=" rule and put a noindex on these pages instead?
9:26 am on Aug 7, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


Why are these pages excluded as "Blocked by robots.txt"?

This is your site, you are supposed to know :) ... Teasing aside, this is most likely because you are using a CMS that added this entry to your robots.txt.

Is that good for my site?

Again, it's up to you to decide, but usually there is no point in letting search engines crawl your search pages. These pages will rarely rank, and if your site has a good navigation structure, search engines will already find all your pages naturally.

Also, since you can have an unlimited number of search pages, crawling them can unnecessarily exhaust your crawl budget.

So in other words, I would leave things as they are.

If you remove this line and add a noindex instead, crawlers will still consume resources fetching the /search pages for nothing, since they will only then be told not to index them.
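To see why noindex still costs a fetch: the directive lives inside the page itself (in a robots meta tag), so a crawler has to download the page before it can read it. A minimal Python sketch of that check, using only the standard library. The names RobotsMetaParser and has_noindex and the sample HTML are made up for illustration, and the X-Robots-Tag HTTP header, which can carry the same directive, is not covered here:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the already-downloaded page asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Hypothetical search-results page; it must be fetched before this runs.
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```

The point is that has_noindex needs the page body as input, whereas a Disallow rule in robots.txt is checked before any request for the page is made.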

edit: I don't know if my explanations are clear, because I am drunk right now. You know how alcohol can help you forget ...
11:50 pm on Aug 7, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10457
votes: 1091


Back up a bit ...

1. Where are you getting this notice?
2. Did you put this rule in robots.txt?
3. Is the url a CONTENT page?
4. Are there any other errors for "robots.txt"?

robots.txt can be overridden by .htaccess ... but you need to know how and why you would do that.

The reality is that robots.txt is a voluntary, MOST FREQUENTLY IGNORED request from a website to visitors and bots: "don't do this" or "please do this".
2:42 am on Aug 8, 2019 (gmt 0)

Preferred Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts:575
votes: 59


Only legitimate bots obey robots.txt, and these are the bots you want. Almost all others ignore it. Even the legit Chinese search engine bots ignore robots.txt.

You should learn more about robots.txt instead of having your CMS/host provider set a default one for you. This is your site, so you should decide what goes in robots.txt as well as .htaccess.
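For reference, this is roughly what "obeying robots.txt" looks like from the bot's side. A minimal sketch using Python's standard urllib.robotparser; the rules and the bot name ExampleBot are placeholders, not taken from the site in question. This parser does plain prefix matching and does not understand the * wildcard, so the sketch uses Disallow: /search:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules above allow user_agent to crawl url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/search?who=smith",
        "https://example.com/about",
    ):
        print(url, "->", "allowed" if may_fetch(url) else "blocked")
```

A well-behaved bot runs a check like this before every request and skips the URL when the answer is no. Nothing forces it to, which is why bots that ignore robots.txt have to be handled on the server side instead.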
3:42 am on Aug 8, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15871
votes: 869


What do I need to do so that pages like example.com/search?who= are no longer reported as blocked by robots.txt?
Nothing. This kind of URL should be blocked. The only puzzler is why your robots.txt disallows the specific query string
?who=
when it should be disallowing the whole URL
/search
regardless of what comes after it.
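To make the difference concrete, here is a rough Python sketch of the wildcard matching that the major search engines document for robots.txt paths: * matches any run of characters, a trailing $ anchors the end, and otherwise a rule is a prefix match. The function rule_matches and the sample URLs are illustrative assumptions, not any crawler's actual code:

```python
import re


def rule_matches(pattern: str, path_and_query: str) -> bool:
    """Check a robots.txt path pattern against a URL path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, path_and_query) is not None


if __name__ == "__main__":
    urls = [
        "/search?who=smith",
        "/search?page=2&who=smith",
        "/search?q=shoes",
    ]
    for rule in ("/*?who=", "/search"):
        for url in urls:
            verdict = "blocked" if rule_matches(rule, url) else "not blocked"
            print(f"Disallow: {rule:8} {url:26} {verdict}")
```

Under these assumptions, /*?who= only catches URLs where who is the first query parameter, while /search catches every search URL. Being a prefix rule, it would also catch a path like /searching.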