In March 2013 we were running Joomla 1.5, and Google Webmaster Tools showed 39 pages indexed. In April 2013 our webmaster upgraded Joomla to version 2.5 (we changed nothing on our website other than upgrading to the latest stable version).
Within a week of installing that version, Google showed 600 pages indexed in Webmaster Tools.
They look like this one: index.php?option=com_content&view=article&id=75&Itemid=261 or this one: www.example.com/example/159-news/latest-news/66-the-example.html, and they have no business being in Google's index (we had URL rewriting in place, etc., and never created those pages).
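To make the two URL shapes above concrete, here is a rough Python sketch that flags a URL as one of the unwanted duplicates. The function name `looks_like_duplicate` and the exact patterns are assumptions based only on the two samples above (a raw `index.php?option=com_content` URL, or a path segment that starts with a numeric id such as `159-news`); it expects absolute URLs and would need adjusting for any other shapes.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed pattern: a path segment that begins with a numeric id,
# e.g. "159-news" or "66-the-example.html".
NUMERIC_SEGMENT = re.compile(r"^\d+-")

def looks_like_duplicate(url):
    """Return True if an absolute URL matches one of the two duplicate shapes."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Shape 1: non-rewritten Joomla content URL.
    if parts.path.endswith("index.php") and query.get("option") == ["com_content"]:
        return True
    # Shape 2: rewritten URL that still carries numeric ids in its segments.
    return any(NUMERIC_SEGMENT.match(seg) for seg in parts.path.split("/"))

samples = [
    "http://www.example.com/index.php?option=com_content&view=article&id=75&Itemid=261",
    "http://www.example.com/example/159-news/latest-news/66-the-example.html",
    "http://www.example.com/about-us.html",  # hypothetical legitimate page
]
for url in samples:
    print(looks_like_duplicate(url), url)
```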
When we saw those pages indexed by Google (how did we see it? By typing the command site:example.com, which made Google show us a few of those pages), we decided to go back to Joomla 1.5, because we realized right away that there was a bug with the upgrade. Unfortunately the damage was done: Google had those pages in its index because of a bug in our CMS upgrade.
Since then our goal has been to remove those URLs from Google's index, because we now have duplicate content for all our pages, which is killing our SEO. We disappeared from the rankings for all our keywords. To give you an example, for certain keywords we were on page 2, and within a few weeks of this issue we dropped to page 70!
Right now we are trying to remove all the duplicate-content pages one by one in Google Webmaster Tools, but we still have 350 pages listed (instead of over 600, which means we managed to remove some of them with the URL removal tool). We are having a very hard time figuring out the addresses of the remaining pages to remove.
Google will not list them when we type the site:example.com command! The inurl command doesn't give a list of those pages either, so we really have to guess, and guessing over 300 page addresses to remove is impossible.
At the beginning, the URL Parameters section in Webmaster Tools showed us a list of sample URLs, and when we typed site:example.com we saw a few more listed, and then we made a few guesses... but we are now stuck.
My question is: how can we remove the 300 pages that Google has in its index that are duplicate content and are causing the penalty we have? So far we have added Disallow / * rules (for the numbers 0 to 9) to robots.txt, and we are using URL Parameters in GWT, where we have told Google "No URLs" for every Itemid it found, but we have no clue whether this is going to work...
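To make the robots.txt part concrete, here is a rough Python sketch of how such rules can be checked offline against a few URLs. The rule list is an assumption about what we mean above (one prefix rule per leading digit, plus a hypothetical rule for the raw index.php URLs), and `is_blocked` is just an illustrative helper name. It uses the standard-library `urllib.robotparser`, which does plain prefix matching and does not interpret `*` wildcards inside a path, so the rules are written without them.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: "Disallow: /0" ... "Disallow: /9" block any path whose first
# segment starts with a digit; "/index.php" is a hypothetical extra rule.
RULES = (
    ["User-agent: *"]
    + [f"Disallow: /{digit}" for digit in range(10)]
    + ["Disallow: /index.php"]
)

def is_blocked(url, rules=RULES, agent="Googlebot"):
    """Return True if the given rules forbid the agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(rules)
    return not parser.can_fetch(agent, url)

for url in [
    "http://www.example.com/159-news/latest-news/66-the-example.html",
    "http://www.example.com/index.php?option=com_content&view=article&id=75&Itemid=261",
    "http://www.example.com/about-us.html",  # hypothetical legitimate page
]:
    print("blocked" if is_blocked(url) else "allowed", url)
```

This only checks whether a rule matches a URL as Python's parser sees it; it says nothing about how quickly Google drops URLs that are already indexed.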
Is there anything else we could do? And if we are on the right track, how long does it take for those pages to be removed from Google's index?
Thank you for your help and comments,
[edited by: ergophobe at 3:47 pm (utc) on Aug 30, 2013]
[edit reason] domain exemplified [/edit]