Hi,
I need help blocking (noindex) a page (index.php) from search engines. Because this is a CMS, the pages have URLs like this:
/index.php?option=com_content&view=article&layout=form&Itemid=29
I would like to block ALL of these pages, no matter what the query string (?....) is.
How do I do this in robots.txt?
I have tried both index.php and index.php*, but I still see such URLs in Google's search index, even after submitting them through the Webmaster Tools URL removal feature, which seemed to accept them.
I also have URLs that are not based on index.php, and I want those to be the only ones indexed by search engines.
Secondly, I have many URLs of the form /getpage?page=67, /getpage?page=109, etc. I have since changed these to append some extra information, and I want to remove the old URLs that don't have the appended info.
For example:
/getpage?page=109 exclude this form
/getpage?page=109:video keep (index) this form
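To make the behaviour I'm after concrete, here is a rough Python sketch of the two rules. This is not robots.txt syntax (that is the part I'm unsure about), only a description of which URLs should stay indexed. The function name should_be_indexed is made up for illustration, and I'm assuming the old /getpage form always has a purely numeric page value.

```python
import re


def should_be_indexed(url: str) -> bool:
    """Return True if the URL should remain in the search index.

    Hypothetical helper, only meant to spell out the desired rules.
    """
    # Split the site-relative URL into path and query string.
    path, _, query = url.partition("?")

    # Rule 1: anything served by /index.php is excluded,
    # no matter what the query string is.
    if path == "/index.php":
        return False

    # Rule 2: /getpage is excluded only in its old form, where the page
    # value is a bare number (assumption: old URLs are always numeric).
    if path == "/getpage" and re.fullmatch(r"page=\d+", query):
        return False

    # Everything else should stay indexed.
    return True


if __name__ == "__main__":
    samples = [
        "/index.php?option=com_content&view=article&layout=form&Itemid=29",
        "/getpage?page=109",
        "/getpage?page=109:video",
    ]
    for sample in samples:
        print(sample, "index" if should_be_indexed(sample) else "exclude")
```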
Thanks!