%20/spaces in URLs - How does it affect a site and spiders

         

hannamyluv

3:30 pm on Apr 19, 2009 (gmt 0)

A long, long time ago, when I first got into the tech field, I had a co-worker who was adamant that files saved on the server not have spaces in their names. Something about how it could foul up the system and that it was better to just avoid them. I have always followed suit, as it seemed to make sense.

In the internet world, I know that 99% of people replace spaces with dashes, underscores or plus signs. That makes sense as well, since different browsers treat the space differently: some insert %20 while others leave it as a space.

So now I have a client who has rather sloppily just thrown category names, spaces and all, into their URLs. I have pleaded with them to change it as I am certain it will affect spidering somehow (and to be honest, the number of indexed pages has plummeted). Problem is, I am not 100% positive that it really is an issue, and if it is, I am not sure why it affects the spiders or the site's performance.

This is the first time I have run into a client that has done this, honestly. Seems like every other developer just knows you don't leave spaces in the URL. I can't even seem to find any info on this.

Anyway, is the %20 an issue, really? Or am I just being anal? Why exactly is it an issue if it is one?

phranque

7:36 am on Apr 25, 2009 (gmt 0)

it's a problem if the keyword following the %20 is important since it won't get separated from the numeric code.
as described here:
Is %20 a Problem in URLs [webmasterworld.com]

g1smd

8:19 am on Apr 25, 2009 (gmt 0)

Never use underscores or spaces in URLs.

Google does not treat two_words as being two words, and underscores visually disappear in underlined links.

Spaces get converted to %20 and that makes%20the%20URL%20very%20hard%20to%20read.

Use hyphens, dots, commas, colons, the plus sign, whatever; avoid both spaces and underscores.

As for whether spaces are allowed by the server filesystem, that depends on the operating system running the server. Filenames are not URLs; the two things are only 'associated'. Avoid spaces there too, if you can.
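
To illustrate the %20 conversion mentioned above, here is a small Python sketch using the standard library's urllib.parse. The "Green Widgets" path is only an example value, and real browsers and servers may differ in exactly when they encode or decode.

```python
from urllib.parse import quote, quote_plus, unquote

# Example path segment containing a space (illustrative value only).
path = "Green Widgets"

# A literal space is not valid in a URL, so it is percent-encoded as %20.
encoded = quote(path)
print(encoded)            # Green%20Widgets

# Form-style encoding represents the space as a plus sign instead.
print(quote_plus(path))   # Green+Widgets

# Decoding turns %20 back into a regular space.
print(unquote(encoded))   # Green Widgets
```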

BillyS

12:24 pm on Apr 25, 2009 (gmt 0)

Agree with g1smd... adding that if you change these, then redirect the old URLs to the new ones.
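
A rough Python sketch of that redirect idea, assuming the new URLs are simply the old ones with each run of spaces replaced by a hyphen. The function name redirect_for and the example paths are hypothetical; on a real site this logic would more likely live in the server's rewrite rules or the application's router.

```python
import re
from urllib.parse import quote, unquote

def redirect_for(request_path, separator="-"):
    """Return (301, new_path) if the path contains whitespace, else None."""
    # %20 arrives encoded, so decode it to a regular space first.
    decoded = unquote(request_path)
    if not re.search(r"\s", decoded):
        return None  # already clean, no redirect needed
    # Collapse each run of whitespace into a single separator.
    new_path = re.sub(r"\s+", separator, decoded)
    # Re-encode anything else that needs escaping; "/" is left as-is.
    return 301, quote(new_path)

print(redirect_for("/Green%20Widgets/Blue%20Widget.html"))
print(redirect_for("/about.html"))
```

A permanent (301) status is used here so that spiders can carry the old URL over to the new one rather than treating them as two separate pages.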

rocknbil

5:18 pm on Apr 25, 2009 (gmt 0)

So now I have a client who has rather sloppily just thrown category names, spaces and all, into their URLs. I have pleaded with them to change it.

As a programmer, I always turn these around: if a problem arises out of user error, it means there is a deficiency in my programming I have overlooked. I find it the "path of least resistance" to just program the fix into it. The "average user" doesn't understand half of what you tell them, much less remember it.

$separator = '-';

$url_title =~ s/\s+/$separator/g;

or more popular,

$url_title = preg_replace('/\s+/',$separator,$url_title);

Done.

As for the underscore issue, I have one site that uses underscores, not dashes, and I have read all the points about underscores. They are all perfectly valid and by better coders than I; still, while I agree on the hyperlink complaint, I'm still on the fence about how it affects SEO and user comprehension. This site is doing extremely well "as is."

Most of its users can't even find the address bar, much less remember how to type

example.com/Green Widgets

But even if they can, see point #1 about path of least resistance:

(After decoding, %20 becomes a regular space)

$separator = '_';

if ($url_title =~ /\s+/) {
    $url_title =~ s/\s+/$separator/g;
}

or

if (preg_match('/\s+/',$url_title)) {
    $url_title = preg_replace('/\s+/',$separator,$url_title);
}

<dons flame suit>
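
For anyone not on Perl or PHP, the same fix might look like this in Python. This is only a sketch: the function name url_title is borrowed from the variable above, and the decoding and trimming steps are assumptions about how the incoming value arrives, not part of the original snippets.

```python
import re
from urllib.parse import unquote

def url_title(raw, separator="-"):
    """Decode %20-style escapes, then turn each run of whitespace into one separator."""
    # Assumption: the value may still be percent-encoded when it reaches us.
    decoded = unquote(raw).strip()
    return re.sub(r"\s+", separator, decoded)

print(url_title("Green%20Widgets"))       # Green-Widgets
print(url_title("Green   Widgets", "_"))  # Green_Widgets
```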

dailypress

1:44 am on May 6, 2009 (gmt 0)

I have done the same with old images: Image 1.jpg, Image 2.jpg and so on. Unfortunately I don't have the time to fix all of them and would rather work on new pages.

Anyway, none of these images were indexed, or at least I haven't received Google traffic for any of them. The only pics I received traffic for were the ones properly named!

Luckily I haven't put spaces in my URLs except for the direct links to the old images: example.com/image%201.jpg

maximillianos

12:55 pm on May 6, 2009 (gmt 0)

We've had this problem arise out of error as well. It does not seem to affect ranking (as some such pages rank #1 for decent keyword terms), but it definitely does not look pretty.

Or as my landscaper calls my front yard, "a visual eyesore". ;-)