What you are talking about is the value of a parameter, and you cannot control how Google handles a parameter with a certain value within the URL Parameters option in Google Webmaster Tools.
So, from your example:
you can tell Google to ignore the parameter "Itemid", but you cannot tell Google to ignore "Itemid" only when it has a value of 1.
Therefore, Google will recognise a URL with &Itemid=1 unless you have specified within WMT to ignore the Itemid parameter altogether.
From what I understand, I cannot set a value for a parameter (e.g. 1); I have to set a parameter, and a parameter is something such as Itemid? Is that correct?
By the way, would "news" be a valid parameter?
|From what I understand, I cannot set a value for a parameter (e.g. 1); I have to set a parameter, and a parameter is something such as Itemid? Is that correct? |
Yes, this is correct: in Google Webmaster Tools you can only set up the parameter name (you cannot set the value).
|By the way, would "news" be a valid parameter? |
Anything after the ? in a URL that is on the left side of an = sign is a query string parameter. Parameters are normally separated with & (ampersand); the first parameter of the query string, which comes immediately after the ?, does not have an & in front of it.
So, if you have ?news=somevalue or &news=somevalue in the URL, then yes, news is a parameter.
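As a quick illustration, the split described above can be reproduced with Python's standard library (the URL below just reuses the Joomla-style example from this thread):

```python
from urllib.parse import urlparse, parse_qs

url = "http://example.com/index.php?option=com_content&view=article&id=129&Itemid=32"

# Everything after the "?" is the query string
query = urlparse(url).query
print(query)  # option=com_content&view=article&id=129&Itemid=32

# Each name on the left side of an "=" is a parameter;
# parse_qs maps parameter names to lists of values
params = parse_qs(query)
print(sorted(params))  # ['Itemid', 'id', 'option', 'view']
```

Google Webmaster Tools lets you configure handling only for the parameter names listed here (option, view, id, Itemid), not for any individual value.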
Thank you for the clarification, but what parameter do I set when I have a page with no ? in the URL?
Here is an example : http://www.mywebsite/my-website/my-website/2-news/newsflash/65-testimonial-2.html
[edited by: aakk9999 at 1:33 pm (utc) on Aug 24, 2013]
[edit reason] Delinked URL, please use example.com in the future to avoid auto linking [/edit]
|http: //www.mywebsite/my-website/my-website/2-news/newsflash/65-testimonial-2.html |
This URL has no parameters. What exactly are you trying to do?
Here is the issue I have.
Google has indexed web addresses like this one.
I managed to find those using the site:example.com command on Google, but I don't know if there are any left in Google's index.
If there are I would like to remove those using the URL parameter.
Is it possible even though I don't have a ? in the URL?
If so, what parameter should I write?
URLs in this format do not have parameters, and hence it is not possible to remove them using URL Parameters in Google Webmaster Tools.
If you do not have too many of these URLs, then the way to remove them is to block them in robots.txt and then use "Remove URLs" in Google Webmaster Tools.
You can check if any have remained by doing the following query:
|I managed to find those using the site:example.com command on Google, but I don't know if there are any left in Google's index. |
which should give you a list of URLs that Google has indexed from the "whats-included" folder.
I think you may have misunderstood what the "parameters" feature is for.
#1 A "query string" is the part of a URL after the ?. If there is no question mark, the URL has no query string and the parameters area does not apply.
#2 Within the query string, each separate named item is a parameter. So in
the whole element "this=123&that=456" is the query string, while "this" and "that" are parameters.
#3 Some parameters affect the content of a page, for example
might create a page showing hotels in Atlanta. Others just change the way the content is displayed, like
Right here at WebmasterWorld, each forums page has a parallel form with a parameter called something like "printfriendly". Again, no effect on the substantive page content, it just looks different.
The purpose of the "parameters" feature is to tell the search engine which parameters make a difference. Most of the time it can figure this out for itself, but sometimes you have to correct it. On rare occasions a parameter will show up that you don't even use; this comes from following links that might say "open in a new window" or similar. Nothing to do with your page, so make sure the search engine ignores it.
If the "parameters" feature in webmaster tools is empty, it almost certainly means the robot has never found any parameters.
|Is it possible even though I don't have a ? in the URL? |
No. You will need to use meta robots noindex on the page itself, or set an X-Robots-Tag header with a value of noindex via httpd.conf/.htaccess [assuming Apache hosting] or a server-side scripting language such as PHP.
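For example, assuming Apache hosting with mod_headers enabled, a minimal .htaccess sketch might look like this (the file pattern here is purely illustrative; adjust it to the pages you actually want deindexed):

```apache
# Send a noindex header for every .html file served (illustrative pattern only)
<FilesMatch "\.html$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
```

The on-page equivalent is `<meta name="robots" content="noindex">` in the document's `<head>`.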
[edited by: Robert_Charlton at 4:06 am (utc) on Aug 26, 2013]
[edit reason] fixed typo at poster's request [/edit]
In the left menu dropdowns, look for Remove URLs, where you can remove URLs or even directories that you do not want indexed, but you need to read their requirements. The URLs do need to exist on your site in order to remove them from indexing; they should be noindexed but NOT blocked in robots.txt.
So if you are deleting, moving or renaming those pages, a redirect would be better. The URL Parameters setting is not for removing indexed pages from search.
|The URLs do need to exist on your site in order to remove them from indexing |
Are you sure? My impression was that if you're removing an area, you can have it removed from cache and index right away, rather than waiting for Google to discover that it's gone.
Maybe more accurate to say, the URLs need to have existed at some time in the past. That is, they can't be removed from the index if they weren't in it in the first place.
:: detour to check ::
GWT didn't object to a removal request for a wholly imaginary directory. It's still listed as "pending"; I'll see if the computer explodes when it discovers the directory was never in their index in the first place.
Thank you for your replies, but I am now confused. Is the URL Parameters function going to help me wipe out all my duplicate content pages without having to use the URL removal tool and do them one by one?
|Are you sure? My impression was that if you're removing an area, you can have it removed from cache and index right away, rather than waiting for Google to discover that it's gone. |
I got the info from notes in a 2007 .txt file back in my GWT stuff, but I just went and looked it up again. Good thing, too; here it is from the horse's mouth:
URL removal requests expire after 90 days, after which the content may appear in our search results again. To remove a page or image from the index completely, you must do one of the following:
Make sure the content is no longer live on the web. Requests for the page must return an HTTP 404 (not found) or 410 status code.
Block the content using a robots.txt file.
Block the content using a meta noindex tag.
To remove a directory and its contents, or your whole site, you must ensure that the pages you want to remove have been blocked from crawling using a robots.txt file. Returning a 404 isn't enough, because it's possible for a directory to return a 404 status code, but still serve out files underneath it. Using robots.txt to block a directory ensures that all of its children are disallowed as well. Note that disallowing crawling with the robots.txt may not always prevent the URL itself from appearing in our search results.
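As a sketch of the robots.txt option quoted above, blocking a whole directory and everything under it looks like this (the directory name is hypothetical; use the real path you want removed):

```
# robots.txt at the site root; blocks /newsflash/ and all of its children
User-agent: *
Disallow: /newsflash/
```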
More information at the link above.
URL parameters cannot de-index URLs.
[edited by: phranque at 9:27 pm (utc) on Aug 25, 2013]
[edit reason] fixed url [/edit]
I was commenting on the "URLs need to exist" line. I've just gone back and checked; the computer doesn't seem to care a whit if you ask them to remove a URL that never existed in the first place. So I've now got the directory /foobars/ listed as "removed".
:: counting on fingers ::
Wonder if they'll try to crawl /foobars/ come November now that they've been told to ignore it for three months?
|is the URL Parameters function going to help me wipe out all my duplicate content pages |
No. The parameter function applies only to parameters.
I had a closer look at your first post. You never actually say that you're getting duplicate content in Google. And I don't understand how
could be different from
is the same page as
which seemed to be what you're saying.
If the problem is that your CMS is returning valid pages when you request an invalid parameter value (not name), then that's a CMS problem, not a Google SEO problem. We can still help, but it has to be asked in a different way.
Thank you for all your replies, but I have one last question...
When I set the URL Parameter to "No URLs" when asked the question "Which URLs with this parameter should Googlebot crawl?", is it the equivalent of using the URL removal tool?
The only difference being that with the URL Parameters tool, Google can handle one parameter covering hundreds of URLs at once, instead of me having to remove each URL one by one with the URL removal tool.
You don't appear to *have* URL parameters.
I do. Is it the same as the URL removal? And will the page be removed from the index after a while with the URL Parameter?
How long does it take for those to be removed from the index once the URL Parameter is set? Is it a matter of days or months?
I'm sorry, I don't think you're understanding this.
I will try to join up member22's posts in this thread to see if we can work out what his problem is.
In his opening post, the sample URL is as follows:
The above URL has four URL parameters which are: option, view, id, Itemid
Selecting "No URLs" for the "Itemid" parameter will still leave Google free to index URLs similar to the above, just without the "Itemid" parameter variations.
In his 4th post, when asked for clarification, the sample URL that member22 wants removed was given as follows:
As netmeg and others said, the above URL has no parameters and hence URL Parameters option of WMT will not affect this URL.
member22, if both of these two types of URLs have to be removed from the Google index, there are other ways of doing it besides using the URL Removal tool in WMT.
In this case you will need to explain your problem better and in more detail, so that others who want to help you can actually understand what your problem is.
Thank you aakk9999 for the clarification.
That is correct: I want to remove URLs with (option, view, id, Itemid) in them, such as this one: index.php?option=com_content&view=article&id=129&Itemid=32
from Google's index, and I am wondering if selecting "No URLs" in the URL Parameters will remove those (I believe so, but can someone confirm?).
Then it is true that I also want to remove pages like this one: http://www.mywebsite/my-website/my-website/2-news/newsflash/65-testimonial-2.html (I am not sure Google still has any in its index, but let's imagine...).
I understand that I cannot use the URL Parameters function for that. I have used Disallow: /*news and/or Disallow: /*6 and it seems to work. (Is that a good way to do it for that type of address?)
Finally, how long does it take for the URL Parameters tool to remove pages such as this one, index.php?option=com_content&view=article&id=129&Itemid=32, from the index once I have set the URL Parameter to "No URLs"?
[edited by: aakk9999 at 6:00 pm (utc) on Aug 28, 2013]
[edit reason] Unlinked URL [/edit]
|I understand that I cannot use the URL Parameters function for that. I have used Disallow: /*news and/or Disallow: /*6 and it seems to work. (Is that a good way to do it for that type of address?) |
It may work for now, but Google has been known to index disallowed URLs based on inbound links and their surrounding text, so removing the block from robots.txt and using noindex on the page or in the server header is a better way to go.
I don't know anything about using the noindex in the server header. What does it mean and how does it work?
Is it a way of removing, let's say, all the pages that have /news in their web address from Google's index, even though Google finds those URLs based on inbound links and surrounding text?
"Using the noindex in the server header" means using the X-Robots-Tag HTTP response header, and it does the same thing as a meta robots element, except it also works for non-HTML documents.
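As a hedged sketch of the /news case asked about above, assuming Apache with mod_setenvif and mod_headers enabled (the "/news" pattern is an assumption; adjust it to the site's actual URL structure):

```apache
# Flag any request whose URL path contains "/news"
SetEnvIf Request_URI "/news" NOINDEX_NEWS
# Send the noindex header only for flagged requests
Header set X-Robots-Tag "noindex" env=NOINDEX_NEWS
```

Note that for Google to see this header, the pages must remain crawlable, i.e. not disallowed in robots.txt.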
|...what parameter do I set when I have a page with no ? in the URL. |
I've not been able to follow this discussion very closely, so please forgive me if I'm incorrect or this has already been covered... but I get the sense, looking at this comment and at some of these URLs, that the "prettier" ones, without question marks, might simply be rewrites by the CMS (or manual rewrites enabled by the CMS) of the URLs with the query strings and parameters.
They were probably rewritten by the CMS, but not 301 redirected. To greatly oversimplify... and it's late at night, so this may be sloppy... when a CMS makes the same page accessible via multiple URLs and the content is identical, the ideal way to get rid of the duplicates is to 301 redirect the URLs with parameters to the user-friendly version of the URL with the same content.
Has the whole question of canonicalization been considered?
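If that is indeed the situation, a hedged mod_rewrite sketch (Apache assumed; both URLs here are just the examples from this thread, so substitute the real ones) could 301 the parameterised form to the pretty one:

```apache
RewriteEngine On
# Match the exact query string of the duplicate, parameterised URL
RewriteCond %{QUERY_STRING} ^option=com_content&view=article&id=129&Itemid=32$
# Redirect to the friendly URL; the trailing "?" drops the old query string
RewriteRule ^index\.php$ /2-news/newsflash/65-testimonial-2.html? [R=301,L]
```

A rule like this would need to be generated per page (or generalised with backreferences), which is why CMS-level canonicalization is usually the cleaner fix.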
Following your answer to my post yesterday, I am wondering if selecting "No URLs" in the URL Parameters will remove those types of address (option=com_content&view=article&id=129&Itemid=32) from Google's index, or do I need to use a noindex in the server header to make sure it is removed (as JD_Toims mentioned for the other web address)?
How long does it take with the URL Parameter for pages to be removed from the index ?
I currently have pages indexed in Google with the following description: "A description for this result is not available because of this site's robots.txt – learn more." Is it because of the disallow I have, or because of the URL Parameter that I set? And is it the last step before the URL is removed from the index?
Thank you for your help,
OK, the URL Parameters function does not necessarily REMOVE anything from the index. It's just a *suggestion* to Google about how to handle that parameter. I have at least a dozen URL parameters defined in GWT for one of my ecommerce sites (for well over a year) and some of them are still in the index.
Hmm... that is what worries me, because yours are still there after a year. I have about 600 to remove, and if the URL Parameter doesn't take my suggestion into account I will never get the duplicate content penalty removed... (maybe I will, but if it takes years it is a problem).
It took a few days to index those 600 pages (due to a bug when I upgraded Joomla), but you are telling me it can take years to remove them, which is terrible...
So do you recommend, in addition to the URL Parameter, using the URL removal tool for the URLs I have discovered... or should I let the URL Parameter do its job, or is the disallow /* in robots.txt going to work...
In other words, what is best to use: the disallow in robots.txt, the URL removal tool, or the URL Parameters? Or should I use all three at once to give myself the best chance of getting the penalty removed as quickly as possible?
# In .htaccess or httpd.conf (requires mod_headers); note there is no colon after the header name
Header set X-Robots-Tag "noindex"