
Forum Moderators: Ocean10000 & incrediBILL & phranque

quick robots.txt question

     
7:26 pm on Feb 8, 2014 (gmt 0)

Preferred Member

10+ Year Member

joined:July 10, 2005
posts:495
votes: 0


This tells Google not to index anything where my domain URL is immediately followed by a ?, right?

Disallow: /?


like if it was mysillydomain.com/?php=somecrappyphpstuff

this stuff wouldn't get indexed anymore ?

I'd also like to block the same with .htaccess, or tell Google not to access those anymore (since they're garbage URLs anyway). But since they're dynamic PHP URLs, they don't listen to "301 REDIRECT" in .htaccess.
10:57 pm on Feb 8, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


You can exclude googlebot from crawling that content using robots.txt but this will not control indexing.
Also note that when you exclude a bot from requesting a URL then whatever you do in .htaccess is irrelevant.
11:48 pm on Feb 8, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:6132
votes: 277


To carry phranque's thought further, if you really need to 301 something in .htaccess, do NOT disallow it in robots.txt.
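
To make that interaction concrete, here is a toy model in Python, not how Googlebot is actually implemented. The function name `polite_fetch`, the redirect table, and the returned strings are all made up for illustration. It only shows the ordering: a well-behaved crawler checks robots.txt first, so a disallowed URL is never requested and its 301 is never seen.

```python
def polite_fetch(path, disallow_prefixes, redirects):
    """Simulate one request from a crawler that honors robots.txt.

    path              -- path plus query string, e.g. "/?php=stuff"
    disallow_prefixes -- Disallow values from robots.txt (hypothetical input)
    redirects         -- dict of old path -> new path, standing in for
                         whatever 301 rules live in .htaccess (assumption)
    """
    # Step 1: robots.txt is consulted before any request is made.
    if any(prefix and path.startswith(prefix) for prefix in disallow_prefixes):
        return "not requested"
    # Step 2: only now does the server get a chance to answer with a 301.
    if path in redirects:
        return "301 -> " + redirects[path]
    return "200"


redirects = {"/?php=somecrappyphpstuff": "/"}

# With the Disallow rule in place, the redirect is never seen.
print(polite_fetch("/?php=somecrappyphpstuff", ["/?"], redirects))  # not requested

# Without the Disallow rule, the crawler requests the URL and gets the 301.
print(polite_fetch("/?php=somecrappyphpstuff", [], redirects))      # 301 -> /
```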
2:18 am on Feb 9, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12696
votes: 244


they don't listen to "301 REDIRECT" in .htaccess

A robot (any robot, whether it's the world's leading search engine or a passing Ukrainian) isn't obliged to follow a redirect. It can come back later, or not at all. But it can't ignore the redirect and barge on in to the originally requested URL. Redirects, unlike robots.txt, don't work on the honor system.
1:54 pm on Feb 9, 2014 (gmt 0)

Preferred Member

10+ Year Member

joined:July 10, 2005
posts:495
votes: 0




You can exclude googlebot from crawling that content using robots.txt but this will not control indexing.
Also note that when you exclude a bot from requesting a URL then whatever you do in .htaccess is irrelevant.


I need like 5 minutes to analyze that...

Why must they behave so awkwardly..

So I guess I must remove the URLs first in Webmaster Tools, and then robots it up?

also, the

Disallow: /?


won't 'exclude' everything, right? Just URLs that begin with a question mark immediately after the domain?
11:33 pm on Feb 9, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


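The pattern is to be treated as "begins with...". So Disallow: /? matches only URLs whose path-plus-query starts with /? and leaves everything else alone.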
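
A minimal Python sketch of that "begins with" rule, in case it helps to see it run. The helper names `robots_target` and `is_disallowed` are made up for this example, and it deliberately ignores `*` and `$` wildcards, Allow lines, and per-user-agent groups, so treat it as an illustration rather than a full robots.txt parser.

```python
from urllib.parse import urlsplit


def robots_target(url):
    """Return the part of the URL a Disallow rule is compared against:
    the path, plus '?' and the query string when there is one."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + "?" + parts.query if parts.query else path


def is_disallowed(url, disallow_rules):
    """True if the URL's path-plus-query begins with any non-empty rule."""
    target = robots_target(url)
    return any(rule and target.startswith(rule) for rule in disallow_rules)


rules = ["/?"]  # the rule from this thread

print(is_disallowed("http://mysillydomain.com/?php=somecrappyphpstuff", rules))  # True
print(is_disallowed("http://mysillydomain.com/page.php?id=1", rules))            # False
print(is_disallowed("http://mysillydomain.com/", rules))                         # False
```

The second URL is not blocked because its path-plus-query starts with /page.php, not with /?.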