So in the following example 'page' is a file, not a directory. If I'm making sense... :/
http://www.example.com/directory/page
However, there are 'stubs' of information (details from that page) that I want to exclude from search engines with robots.txt.
e.g. in the following URL, I want to exclude everything after 'page' (which I believe is now treated as a directory by search engines). However, I DO want the above URL included, where 'page' is treated as a file.
http://www.example.com/directory/page/i-want-to-exclude-this.html
Am I right in saying that the following will do what I want? Or do I need to put some 'allow:' instructions in there?
User-agent: *
Disallow: /directory/page/

User-agent: Googlebot
Disallow: /directory/page/*
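In case it helps to see it spelled out: robots.txt rules are matched as prefixes against the URL path only, so the domain never appears in a Disallow line. Under that assumption (example.com and the directory names are just the placeholders from this thread), the whole thing collapses to a single group:

```
User-agent: *
Disallow: /directory/page/
```

Because matching is by prefix, this blocks /directory/page/anything (at any depth) while leaving /directory/page itself crawlable, so no Allow line should be needed. The Googlebot-specific group with a trailing * adds nothing, since a Disallow rule already matches everything that starts with the given path.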
Once again, I want the following to be indexed by search engines;
http://www.example.com/directory/page
But I want the following to be excluded by search engines;
http://www.example.com/directory/page/anything-after-this...
Thanks.
In cases where I had a sub-directory, I was banning anything below the first directory.
For example;
http://www.example.com/directory/page/page2/anything-after-this...
I essentially need to block every URL in this section of my site where 'page' is followed by a /, and allow all the others.
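If you want to sanity-check that behaviour before deploying, Python's standard-library robots.txt parser applies the same prefix matching; a quick sketch, using only the rule and the example URLs from this thread:

```python
# Sanity-check the robots.txt rule with Python's stdlib parser, which uses
# the same prefix matching as the Robots Exclusion Protocol.
# example.com and the paths are placeholders taken from the question.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /directory/page/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# 'page' as a file: no trailing slash, so the Disallow prefix never matches.
print(parser.can_fetch("*", "http://www.example.com/directory/page"))
# 'page' as a directory: these paths start with /directory/page/, so they are
# blocked, including URLs nested any number of levels deeper.
print(parser.can_fetch("*", "http://www.example.com/directory/page/i-want-to-exclude-this.html"))
print(parser.can_fetch("*", "http://www.example.com/directory/page/page2/anything-after-this"))
```

The first call returns True and the other two False, which is exactly the split you're after: the bare 'page' URL stays crawlable, and the single trailing-slash rule already covers sub-directories like page2/ without any extra lines.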