www.example.com/folder/file+name+12.html
and when I search:
allinurl:folder site:www.example.com
Google shows me all the pages it has indexed in my /folder/ folder, but when I search:
allinurl:file site:www.example.com
or
allinurl:name site:www.example.com
or
allinurl:12 site:www.example.com
Google says there are no results.
RFC2396 - Uniform Resource Identifiers (URI) [faqs.org] section 2.2
Jim
The plus "+", dollar "$", and comma "," characters have been added to those in the "reserved" set, since they are treated as reserved within the query component.
(G.2. Modifications from both RFC 1738 and RFC 1808)
...so, for the part of a URL that is a query string, they're okay. Google does not evaluate plus signs as spaces, though. Right now I'm working for a customer that's spending a very large amount of money on a CMS that uses plus signs in URLs (not in query strings), so I've tested it, and all tests so far have been negative. They'll probably be okay anyway, though; it's not that essential to them.
So if it's before the "?" character then a+b is "a+b", but if it's after the "?" character it's "a b"?
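For what it's worth, here is a small sketch of how that distinction usually plays out on the server side, using Python's standard urllib.parse. The "+ means space" rule comes from HTML form encoding (application/x-www-form-urlencoded), not from the path part of a URL. The query parameter "q" is made up for illustration, and none of this says anything about how Google treats the URL.

```python
from urllib.parse import urlsplit, unquote, parse_qs

# Hypothetical URL: the path is the one from this thread, the query is invented.
url = "http://www.example.com/folder/file+name+12.html?q=a+b"
parts = urlsplit(url)

# Path: only percent-escapes are decoded, so "+" stays a literal plus sign.
path = unquote(parts.path)

# Query: form decoding turns "+" into a space.
query = parse_qs(parts.query)

print(path)
print(query)
```

So before the "?" the string a+b stays "a+b", and after it a form-decoding server reads "a b".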
I agree in principle, but surely the same goes for "/index.html" being different from "/" and Google do not honour that part of HTTP.
In other words, I don't think we should assume that Google want to follow HTTP in the way that you or I would expect.
Note: While the topic is interesting, that doesn't mean that keywords in URLs are a useful part of ranking for words and phrases in Google.
For the web server, yes. For Google, query strings are just strings - they don't parse the query. Anyway, they are probably able to do it, as they do sometimes highlight partial URLs in the SERPs like this:
www.example.com/xml[b]tree[/b]view?this=that&bla=bla (term: tree) - but generally, words in URLs have to be separated by ".", "/", or "-" to be highlighted under the snippet.
(Note that highlighting terms under the snippet in the SERPs is not necessarily related to ranking or PR. Identifying a text string is one thing; using it is another.)
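To make the separator point concrete, here is a rough sketch of what "words separated by '.', '/' or '-'" would mean for the URL from the first post. This is only an illustration of the behaviour described in this thread, not Google's actual tokenizer; the function name url_words and the separator sets are assumptions.

```python
import re

def url_words(url, separators="./-"):
    """Split a URL on the given separator characters and drop empty pieces."""
    pattern = "[" + re.escape(separators) + "]"
    return [w for w in re.split(pattern, url) if w]

url = "www.example.com/folder/file+name+12.html"

# With only ".", "/" and "-" as separators, "file+name+12" stays one word.
observed = url_words(url)

# If "+" were also a separator, "file", "name" and "12" would be separate words.
with_plus = url_words(url, separators="./-+")

print(observed)
print(with_plus)
```

Under the first split, "folder" is a word of its own but "file", "name" and "12" are not, which would be consistent with the allinurl: results reported at the top of the thread.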
Claus, I agree. I don't see any particular reason why "+" should be treated in the same way as "-" or as "_", nor why those characters are treated as they are. After all, URLs are just strings of characters used to find IP addresses and nowadays to talk to Web servers.