Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
quick robots.txt question
mike2010




msg:4643551
 7:26 pm on Feb 8, 2014 (gmt 0)

This tells Google not to index any URL where my domain is immediately followed by a ?, right?

Disallow: /?

For example, if it was mysillydomain.com/?php=somecrappyphpstuff

This stuff wouldn't get indexed anymore?

I'd also like to block the same URLs with .htaccess, or tell Google not to access them anymore (since they're garbage URLs anyway). But since they're dynamic PHP URLs, they don't listen to "301 REDIRECT" in .htaccess.

 

phranque




msg:4643585
 10:57 pm on Feb 8, 2014 (gmt 0)

You can exclude googlebot from crawling that content using robots.txt but this will not control indexing.
Also note that when you exclude a bot from requesting a URL then whatever you do in .htaccess is irrelevant.

tangor




msg:4643604
 11:48 pm on Feb 8, 2014 (gmt 0)

To carry phranque's thought further: if you really need to 301 something in .htaccess, do NOT disallow it in robots.txt.

lucy24




msg:4643636
 2:18 am on Feb 9, 2014 (gmt 0)

they don't listen to "301 REDIRECT" in .htaccess

A robot-- any robot, whether it's the world's leading search engine or a passing Ukrainian-- isn't obliged to follow a redirect. It can come back later, or not at all. But it can't ignore the redirect and barge on in to the originally requested URL. Redirects, unlike robots.txt, don't work on the honor system.
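A rough way to see this point in action is a throwaway local server, sketched below in Python rather than Apache. Everything here is an assumption made for illustration: the handler name, the rule "301 any request that has a query string", and the helper fetch_without_following are all hypothetical and are not anyone's actual .htaccess setup. The client deliberately does not follow redirects, and all it gets back is the 301 and a Location header, never the originally requested content.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class RedirectQueryHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server-side redirect rule."""

    def do_GET(self):
        parts = urlsplit(self.path)
        if parts.query:
            # Any URL with a query string gets a 301 to the bare path.
            self.send_response(301)
            self.send_header("Location", parts.path)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"real page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet.
        pass


def fetch_without_following(path):
    """Request path from a temporary local server; never follow redirects."""
    server = HTTPServer(("127.0.0.1", 0), RedirectQueryHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        resp = conn.getresponse()
        result = (resp.status, resp.getheader("Location"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_without_following("/?php=somecrappyphpstuff"))
    print(fetch_without_following("/"))
```

The robot is free to ignore the Location header, but the server decides what is sent back for the original URL, which is the difference from robots.txt.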

mike2010




msg:4643700
 1:54 pm on Feb 9, 2014 (gmt 0)



You can exclude googlebot from crawling that content using robots.txt but this will not control indexing.
Also note that when you exclude a bot from requesting a URL then whatever you do in .htaccess is irrelevant.


I need like 5 minutes to analyze that...

Why must they behave so awkwardly..

So I guess I must remove the URLs first in Webmaster Tools, and then Robots it up?

Also, the rule

Disallow: /?

won't 'exclude' everything, right? Just URLs that have a question mark immediately after the domain?

g1smd




msg:4643778
 11:33 pm on Feb 9, 2014 (gmt 0)

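The pattern is to be treated as "begins with...". So Disallow: /? matches only URLs whose path-plus-query begins with /? and leaves everything else crawlable.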
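As a minimal sketch of that "begins with" rule, here is a small Python check. The function name is_disallowed is made up for this example, and the sketch is deliberately simplified: it handles plain prefixes only, and ignores Allow lines, user-agent groups, and the * and $ wildcards that some crawlers support.

```python
def is_disallowed(url_path, disallow_patterns):
    """Return True if url_path begins with any non-empty Disallow pattern.

    url_path is everything after the host name, query string included,
    for example "/?php=somecrappyphpstuff".
    """
    return any(p and url_path.startswith(p) for p in disallow_patterns)


if __name__ == "__main__":
    patterns = ["/?"]  # the single rule from this thread
    candidates = [
        "/?php=somecrappyphpstuff",
        "/",
        "/page.php?id=1",
        "/dir/?x=1",
    ]
    for candidate in candidates:
        print(candidate, is_disallowed(candidate, patterns))
```

Under these assumptions only the first candidate is matched. The home page itself and any URL with a query string deeper in the site are not touched by Disallow: /?.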


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved