Using +'s in URLs ...

... Google doesn't seem to recognise them

         

internetheaven

8:43 pm on Sep 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thousands of my URLs look like this:

www.example.com/folder/file+name+12.html

and when I search:

allinurl:folder site:www.example.com

Google shows me all the pages it has indexed in my /folder/ folder, but when I search:

allinurl:file site:www.example.com
or
allinurl:name site:www.example.com
or
allinurl:12 site:www.example.com

Google says there are no results.

ciml

5:07 pm on Sep 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I can't think why I've never checked this before; I think I always assumed that "+" would be treated as " ", just like "-" is.

But no, from your results it looks like "+" in the search matches "+" in the page. So file+name+12 is just as much a word as fileznamez12, if it's in a URL.

sun818

5:11 pm on Sep 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



With Netscape 4, spaces have to be replaced with +. Otherwise, Netscape 4 would read the URL only up to the first space, giving you a 404 error or taking you to the wrong location. I would be surprised if Google could not work out that a plus in a URL is the same as a space.

jdMorgan

6:33 pm on Sep 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not up to us to pick what we like. To assure compatibility, stick with the rulebook:

RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax [faqs.org], section 2.2

Jim

internetheaven

7:32 pm on Sep 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> It's not up to us to pick what we like. To assure compatibility, stick with the rulebook:

That'll teach me to use Ukrainian companies to write scripting for me .... you'd think people who do this for a living would have to have some sort of training wouldn't you?

claus

8:10 pm on Sep 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The plus "+", dollar "$", and comma "," characters have been added to those in the "reserved" set, since they are treated as reserved within the query component.
(G.2. Modifications from both RFC 1738 and RFC 1808)

...so, for the part of a URL that is a query string, they're okay. Google does not evaluate plus signs as spaces, though. Right now I'm working for a customer that's spending a very large amount of money on a CMS that uses plus signs in URLs (not in query strings), so I've tested it, and all tests so far have been negative. They'll probably be okay anyway, though; it's not that essential to them.
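
To make the path versus query string distinction concrete, here is a minimal Python sketch using the standard library's urllib.parse. It only shows how one common server-side decoder treats "+" in each part of a URL; it says nothing about what Google does. The helper name decode_path_and_query and the q parameter are made up for illustration.

```python
from urllib.parse import urlsplit, unquote, parse_qs

def decode_path_and_query(url):
    """Decode the path and the query string of a URL separately."""
    parts = urlsplit(url)
    # unquote() only resolves %XX escapes, so a "+" in the path stays a "+".
    path = unquote(parts.path)
    # parse_qs() follows the form-encoding convention, where "+" means a space.
    query = parse_qs(parts.query)
    return path, query

# Hypothetical URL: the thread's file name, plus an invented "q" parameter.
path, query = decode_path_and_query(
    "http://www.example.com/folder/file+name+12.html?q=file+name+12"
)
print(path)   # /folder/file+name+12.html
print(query)  # {'q': ['file name 12']}
```

The "+ means space" rule comes from HTML form encoding of query strings, which is why the same characters come out differently on either side of the "?".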

ciml

10:03 am on Sep 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> stick with the rulebook

So if it's before the "?" character then a+b is "a+b", but if it's after the "?" character it's "a b"?

I agree in principle, but surely the same goes for "/index.html" being different from "/" and Google do not honour that part of HTTP.

In other words, I don't think we should assume that Google want to follow HTTP in the way that you or I would expect.

Note: While the topic is interesting, that doesn't mean that keywords in URLs are a useful part of ranking for words and phrases in Google.

internetheaven

10:44 pm on Sep 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> that doesn't mean that keywords in URLs are a useful part of ranking for words and phrases in Google.

Are you suggesting that file names are no longer a ranking factor?

claus

10:20 am on Sep 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> So if it's before the "?" character then a+b is "a+b", but if it's after the "?" character it's "a b"?

For the web server, yes. For Google, query strings are just strings - they don't parse the query. Anyway, they are probably able to do it, as they do sometimes highlight partial URLs in the SERPs like this:

www.example.com/xmltreeview?this=that&bla=bla
(search term: tree; only "tree" is highlighted)

- but generally, words in URLs have to be separated by ".", "/", or "-" to be highlighted under the snippet.

(Note that highlighting terms under the snippet in the SERPs need not be related to ranking or PR. Identifying a text string is one thing; using it is another.)
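
As a rough illustration of the separator observation above, here is a toy Python sketch that splits a URL into "words" on ".", "/" and "-" only. The separator set is an assumption taken from this thread, and url_words is a made-up helper; this is not Google's tokenizer, and it does not model the occasional partial-match highlighting mentioned above.

```python
import re

# Assumed separator set, taken from the observation in this thread: ".", "/" and "-".
# "+" is deliberately not included, so it stays inside a word.
SEPARATORS = re.compile(r"[./-]")

def url_words(url):
    """Split a URL into 'words' on the assumed separators, dropping empty pieces."""
    return [w for w in SEPARATORS.split(url) if w]

print(url_words("www.example.com/folder/file+name+12.html"))
# ['www', 'example', 'com', 'folder', 'file+name+12', 'html']
print(url_words("www.example.com/folder/file-name-12.html"))
# ['www', 'example', 'com', 'folder', 'file', 'name', '12', 'html']
```

Under this model "folder" is a word in the first URL but "file", "name" and "12" are not, which is consistent with the allinurl: results in the opening post.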

ciml

10:14 am on Sep 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



internetheaven, I don't suggest that keywords in URLs carry no weight, just that they're not important. Well, they matter more than bold text I suppose. ;-)

Claus, I agree. I don't see any particular reason why "+" should be treated in the same way as "-" or as "_", nor why those characters are treated as they are. After all, URLs are just strings of characters used to find IP addresses and nowadays to talk to Web servers.