|URL Encoding Preference?|
Does Google prefer + over %20?
| 1:27 am on Jan 6, 2008 (gmt 0)|
Does anyone have any reason to think Google prefers URLs encoded with a + over %20?
Here's why I ask: I have a site where, over the years, some sections have encoded the spaces between words in the URL as %20, while other sections have encoded them as +.
Today I did a site: search for one of the folders that uses %20 and noticed that the pages were listed in this order: first, all pages with no %20 in the URL, then all pages with a %20. There are 50 pages in this section (one for each state), and the pages for the single-word states came first, followed by all of the two-word states.
But when I did the exact same site: search on another section of my site that also has 50 state pages but uses + for encoding, the result set was mixed, i.e., the single-word states were not all at the top with the two-word states at the bottom.
I don't assume that Google always orders the results of a site: search by relative authority, but there is some evidence of it, because your strongest pages are usually near the top of the list. If so, is Google giving less authority to my pages whose URLs are encoded with %20?
| 6:43 pm on Jan 6, 2008 (gmt 0)|
I always avoid underscores and spaces in URLs. They are always bad news.
I always use hyphens or dots between words.
That's an interesting observation with the + and it doesn't surprise me at all.
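For anyone who wants to try the hyphen approach, here is a minimal sketch in Python. The helper name `hyphenate` and the sample names are made up for illustration, and it assumes plain ASCII names; it is not taken from any site discussed in this thread.

```python
import re


def hyphenate(name: str) -> str:
    """Lowercase a name and join its words with hyphens (hypothetical helper)."""
    # Treat anything that is not an ASCII letter or digit (spaces,
    # underscores, punctuation) as a word boundary.
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "-".join(word.lower() for word in words)


print(hyphenate("New York"))        # new-york
print(hyphenate("North_Carolina"))  # north-carolina
```

A file name built this way has no spaces left, so neither %20 nor + ever appears in the URL.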
| 12:23 am on Jan 7, 2008 (gmt 0)|
Spaces in URLs are poor coding, in my view.
| 2:35 am on Jan 7, 2008 (gmt 0)|
No. The URLs don't really have spaces; the words in the file names do.
My observation was that all URLs encoded with rawurlencode (%20) ranked lower in a site: search than URLs that needed no encoding (Illinois). Yet when the space was encoded with the + sign rather than %20, it did not appear to affect the order of results in the site: search.
Again, I know site: rankings aren't real rankings, but why would this happen?
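To make the two encodings concrete, here is a small sketch. It uses Python's standard library rather than PHP: `quote` is assumed here as the rough counterpart of rawurlencode and `quote_plus` as the counterpart of urlencode, and the sample name is only an illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

name = "New York"

# Percent-encoding: the space becomes %20 (what rawurlencode produces).
percent_form = quote(name)

# Form-style encoding: the space becomes + (what urlencode produces).
plus_form = quote_plus(name)

print(percent_form)  # New%20York
print(plus_form)     # New+York

# Both forms decode back to the same text when the matching decoder is used.
print(unquote(percent_form) == unquote_plus(plus_form))  # True
```

One caveat: strictly speaking, + stands for a space only in form-encoded query strings. In the path part of a URL it is a literal plus sign, so how a server or crawler treats it there can vary.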
| 3:09 am on Jan 7, 2008 (gmt 0)|
The answer to that question would probably be deep within Google's code for retrieving site: queries. I doubt we can come up with anything definitive here, but my guess is that if we knew the cause, it would not be evidential or helpful for the website owner.
Note that the search allinurl:%20 returns many results that do not contain a "%" character. It seems to be treated like any generic separator. The only separator that is indexed on its own seems to be "_", the underscore character.