Forum Moderators: goodroi

Blocked by robots.txt

example.com/search?who=

         

designergweb

8:58 am on Aug 7, 2019 (gmt 0)



Excluded issue: "Blocked by robots.txt".
My robots.txt contains "Disallow: /*?who=". What must I do to stop pages like example.com/search?who= from being blocked by robots.txt?
What are the best practices?

Dimitri

9:04 am on Aug 7, 2019 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



I am not sure I understood what you were trying to say. Do you want to "prevent" robots from visiting your search page?

designergweb

9:18 am on Aug 7, 2019 (gmt 0)



Why are these pages excluded as "Blocked by robots.txt"? Is that good for my site? Or should I delete the "Disallow: /*?who=" line and put a noindex on these pages instead?

Dimitri

9:26 am on Aug 7, 2019 (gmt 0)



Why are these pages excluded as "Blocked by robots.txt"?

This is your site, you are supposed to know :) ... joking aside, this is most likely because you are using a CMS that added this entry to your robots.txt.

Is that good for my site?

Again, it's up to you to decide, but "usually" there is no point in letting search engines crawl your search pages. These pages will rarely rank, and if your site has a good navigation structure, search engines will already find all your pages naturally.

Also, since you can have an unlimited number of search pages, it can unnecessarily exhaust your crawl budget.

So in other words, I would leave things as they are.

If you remove this line and add a noindex instead, crawlers will still consume resources fetching the /search pages for nothing, since they'll only be told not to index them.
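For anyone who wants to see what the noindex alternative looks like in practice, here is a minimal sketch. It assumes a Python WSGI site, which the original poster never mentioned, and the name `app` is purely illustrative. It sends the `X-Robots-Tag: noindex` response header that Google documents as equivalent to a robots meta tag. Note that a crawler can only see this header if robots.txt allows it to fetch the page, which is exactly the crawl cost described above.

```python
# Hypothetical WSGI app: a sketch only, not the poster's actual setup.
def app(environ, start_response):
    headers = [("Content-Type", "text/html; charset=utf-8")]

    # PATH_INFO excludes the query string, so this covers /search?who=...
    if environ.get("PATH_INFO", "") == "/search":
        # Ask compliant crawlers not to index this response.
        headers.append(("X-Robots-Tag", "noindex"))

    start_response("200 OK", headers)
    return [b"<p>search results</p>"]
```

With this in place the "Disallow" line would have to go, otherwise the header is never read.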

edit: I don't know if my explanations are clear, because I am drunk right now. You know how alcohol can help you forget ...

tangor

11:50 pm on Aug 7, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Back up a bit ...

1. Where are you getting this notice?
2. Did you put this rule in robots.txt?
3. Is the url a CONTENT page?
4. Are there any other errors for "robots.txt"?

robots.txt can be overridden by .htaccess ... but you need to know how and why you would do that.

Reality is robots.txt is a voluntary, MOST FREQUENTLY IGNORED, request by a website to visitors and bots "not to do this" or "please do this".

TorontoBoy

2:42 am on Aug 8, 2019 (gmt 0)

5+ Year Member Top Contributors Of The Month



Only legitimate bots obey robots.txt, and these are the bots you want. Almost all others ignore it. Even the legit Chinese search engine bots ignore robots.txt.

You should learn more about robots.txt instead of having your CMS or host provider set a default one for you. This is your site, so you should decide what goes in robots.txt as well as in .htaccess.

lucy24

3:42 am on Aug 8, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What must I do to stop pages like example.com/search?who= from being blocked by robots.txt?
Nothing. This kind of URL should be blocked. The only puzzler is why your robots.txt disallows the specific query string "?who=" when it should be disallowing the whole URL "/search" regardless of what comes after it.
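To make the difference between the two rules concrete, here is a rough Python sketch of the wildcard matching that the major search engines apply to robots.txt paths ("*" matches any run of characters, a trailing "$" anchors the end, anything else is a prefix match). The helper name `rule_matches` is made up for illustration, and real crawlers also handle percent-encoding and Allow/Disallow precedence, which this ignores.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches path + query string.

    Simplified sketch: '*' is a wildcard, a trailing '$' anchors the end,
    and everything else must match literally from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    # Escape the literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"

    # re.match anchors at the start, which gives prefix-match behaviour.
    return re.match(regex, path) is not None


# The existing rule only catches URLs whose query string starts with "who=".
print(rule_matches("/*?who=", "/search?who=bob"))   # True
print(rule_matches("/*?who=", "/search?q=shoes"))   # False

# A plain prefix rule catches every search URL, whatever follows.
print(rule_matches("/search", "/search?who=bob"))   # True
print(rule_matches("/search", "/search?q=shoes"))   # True
```

One thing to watch: as a prefix rule, "Disallow: /search" would also block a path such as /searching, so check that nothing you want crawled starts with the same characters.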