Forum Moderators: Robert Charlton & goodroi

URL Construction - keywords in folder name or page name?

Simsi

4:41 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm constructing a new site and have read conflicting opinions on this, although quite a few comments are possibly outdated now.

Is it best to construct my internal URLs like:

/dir1/dir2/key-words/

or

/dir1/dir2/key-words.php

Cheers

Ian

tedster

6:25 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IMO there's not going to be any ranking difference. However, not showing the extension ".php" is the smartest way to go because it future-proofs your URLs against any possible technology changes.

Simsi

7:48 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Tedster. Fair point but one thing I read confused me, admittedly from a 2006 post, but someone said that Google looks at the number of directory levels and assigns more relevance where a match occurs nearer the root. The inference was that 2 nested levels is better than 3. Any thoughts?

tedster

8:32 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That has never been my opinion or my observation. I've read that "theory" in several places around the web, and I think it came from observing toolbar PR estimates in the old days, rather than any real data from testing.

jd01

9:09 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I usually try to go with what makes the most sense to me personally on any URL pattern... Here's advice I've given about it previously:

On a site with products, all of which may have unique 'topics' (different brands), electronics, for example, something like /MP3_Players/Apple/iPod might make sense for a website with multiple products and brands, but on a site carrying exclusively 'Apple' products /Apple/MP3_Players/iPod or simply /Apple/iPod might make more sense. IMO: It really depends on the exact application and the contents of the website.

I absolutely agree with tedster's statement about not showing the extension...

To add to what he said, it also makes it so you can 'mask' your underlying technology from the public, which makes it a bit tougher for some 'script kiddie' to 'hack' or mess with.

Personally, almost everything I do now is extensionless, query_stringless, and does not contain the trailing /. mod_rewrite is one of my 'things', though, so going with 'completely clean' URLs might not be an option for you. If you are at all 'scripting friendly' I would suggest taking it up, because you can use URLs like my example (all capitalized) and install 'error correction' using mod_rewrite, so if someone types the lowercase version the URL 'corrects' to the right location...

I actually use the same ruleset for basic spelling / spacing corrections too, which is great for inbound links when the webmaster placing the link can't type and doesn't understand the importance of copy / paste!

ADDED:
BTW: I would always try to keep the most important keyword (or 'theme') closest to the root, because it may or may not matter at this time, but it makes the most sense to me... Remember everything you do helps Google 'organize' your site into their structure, and it makes sense to me for example.com to be organized as follows /MainCategory/SubCategory/SubSubCategory/InfoAboutATopic
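
The capitalization 'error correction' described above can be sketched roughly as follows. This is Python rather than mod_rewrite, since the actual ruleset is not shown in the thread; the path list, the `correct_case` name and the status codes are assumptions made for illustration only.

```python
# Hypothetical inventory of canonical (capitalized) paths.
# A real site would build this from its own list of pages.
CANONICAL_PATHS = [
    "/MP3_Players/Apple/iPod",
    "/MainCategory/SubCategory/InfoAboutATopic",
]

# Index each canonical path by its lowercase form for case-insensitive lookup.
_BY_LOWER = {path.lower(): path for path in CANONICAL_PATHS}


def correct_case(requested_path):
    """Return (status, path).

    200 if the request is already canonical,
    301 if it differs from a canonical path only by case,
    404 if no canonical path matches.
    """
    canonical = _BY_LOWER.get(requested_path.lower())
    if canonical is None:
        return 404, requested_path
    if canonical == requested_path:
        return 200, canonical
    return 301, canonical


print(correct_case("/mp3_players/apple/ipod"))
```

The point of answering with a 301 rather than simply serving the page is that only one spelling of the URL ever returns content, so the lowercase and capitalized versions are not both live.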

Simsi

7:20 am on Sep 5, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks guys. I'll go with that then and yes, I will be using Mod Rewrite. JD01...you mention that you don't use a trailing slash. Any particular reason?

jd01

8:23 am on Sep 5, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Honestly, I use capitalization and no trailing slash because I think it looks cleaner and more professional on a URL... I also serve headers with PHP, so Last-Modified, ETag, Expires, Content-Length, etc. are generated and served.

Here's an example of what I look at re URLs:
/main-category/sub-category/info-about-a-topic/
/main_category/sub_category/info_about_a_topic/
/maincategory/subcategory/infoaboutatopic/
/maincategory/subcategory/infoaboutatopic
/MainCategory/SubCategory/InfoAboutATopic

I personally think the last looks more professional to visitors (human), even though all say 'the same thing' and could (will probably be) considered 'essentially the same' by a search engine.

Basically, I do some things to try to give a site 'clean URLs' and appear 'static' to most visitors (including search engines), even if it's dynamic. There are still some ways to tell if it's 'static' or 'dynamic' but they aren't 'obvious' or easy to find for a large percentage of the visitors, even if they check server headers.
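
As a rough illustration of generating those headers for a dynamic page: the sketch below is Python, not the PHP the poster uses, and the function name, the hash-based ETag scheme and the one-hour default lifetime are all assumptions, not details from the thread.

```python
import hashlib
import time
from email.utils import formatdate


def static_style_headers(body, last_modified_ts, max_age=3600, now_ts=None):
    """Build the headers a static file would normally carry.

    body             -- response body as bytes
    last_modified_ts -- Unix timestamp of the last content change
    max_age          -- seconds until Expires (assumed default: one hour)
    now_ts           -- current Unix time; defaults to time.time()
    """
    if now_ts is None:
        now_ts = time.time()
    # Assumed ETag scheme: a quoted, truncated hash of the body.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
        "ETag": etag,
        "Expires": formatdate(now_ts + max_age, usegmt=True),
        "Content-Length": str(len(body)),
    }


headers = static_style_headers(b"<html>...</html>", last_modified_ts=0, now_ts=0)
for name, value in headers.items():
    print(name + ": " + value)
```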

Simsi

10:40 am on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks jd. Can see the logic in capitalisation, being similar to the general advice for formatting AdWords URLs.

HuskyPup

1:29 pm on Sep 8, 2009 (gmt 0)



I personally think the last looks more professional to visitors

No disrespect, however whenever I see such a URL I always wonder if the person who decided to do it that way either didn't know what they were doing or had not considered the future problems it may create for another new team member adding to the site.

I have seen more problems created by extensions like this and by people uploading incorrect files, i.e. htms when they should have been htmls etc.

To me consistency is the key. Personally I couldn't work that way across many sites, however if it works for you, fine. I ain't changing mine though :-)

tedster

6:14 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I agree with Husky. Mixed case causes trouble.

Technical trouble: only Windows servers are case insensitive with regard to the filepath. All other servers, including those used by the search engines, are case sensitive. Over time there is often confusion about what the canonical form of the url actually is.

Human trouble: Not only in-house, but with those sites who link to you, or any reporting you may get in offline media. All this means you can lose credit for significant backlink power over time. I see this all the time with sites on Windows servers.

I'm a very energetic champion of using all lower case in the file path.

Simsi

6:57 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have temporarily set it so either works - lower or upper. Is that OK for SEO or are they seen as duplicate?

tedster

7:09 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not OK for SEO - see Canonical URL Issues [webmasterworld.com]

Search engines do "try" to work around canonical URL problems, but you really need to help them out for best results.
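
One common way to "help them out" is to pick a single form (all lowercase here, per the advice above) and permanently redirect every other capitalization to it. A minimal sketch in Python follows; the `canonical_redirect` name is hypothetical and example.com is a placeholder, and in practice this would usually be done at the server level.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url):
    """Return the URL to 301-redirect to, or None if the path is already lowercase.

    Only the path is lowercased; the query string and fragment are left alone.
    """
    parts = urlsplit(url)
    lowered = parts.path.lower()
    if lowered == parts.path:
        return None
    return urlunsplit((parts.scheme, parts.netloc, lowered, parts.query, parts.fragment))


print(canonical_redirect("http://example.com/Dir1/Dir2/Key-Words/"))
print(canonical_redirect("http://example.com/dir1/dir2/key-words/"))
```

With a redirect like this in place, only one version of each URL returns content, which is the difference from letting "either work".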

jd01

7:23 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think most people should follow your advice.
(No Joke)

No disrespect to either of you either, but I think some of us know enough Mod_Rewrite and PHP to make it so if pages are allowed to be added dynamically we can probably do it without a headache for the person adding the pages and if they are added via FTP there is only a minimal one...

IOW If the page is 'static' they might have to visit a PHP page that updates the .htaccess file for them when they add a new page (minor headache), but if the page is added dynamically, it can be done 'on-the-fly' with PHP.

From a 'site management' perspective, most people might not be able to pull it off, but if they can, from a visitor perspective they look better. (I keep thinking about the stinking visitors, most of whom are not webmasters, and the overall image / professionalism a site (including the URLs) presents to them for some reason.)

Of course most of my new sites are AJAX, so you'll probably think I'm really a rank amateur if I tell you they run off the root and aren't search engine friendly... But they're for me and the visitors I am creative enough to find, not sites I'm building for someone else where SEO is priority Number One.

jd01

7:33 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I forgot to refresh...

I have temporarily set it so either works - lower or upper.

What tedster said & absolutely not for SEO & if you can't automate the process to correct capitalization 'on the fly' then it's probably better to follow the advice of tedster and HuskyPup... Go with lowercase, because unless you really know how to do it, they're right, you'll probably just end up with a headache.

Simsi

8:26 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm OK thanks guys. Will go for lower. Interesting: if Google treats capitalisation variations as different URLs, that opens up much better options for experimenting with side-by-side comparisons of the effect of content changes, without any impact from the url/path. Useful.

ZydoSEO

8:45 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I prefer using lowercase w/ hyphens as word separators in folder and page names also. For domain names I avoid using hyphens to keep them short, easy to type, easy to remember. But most people are not going to bother to remember a particular page name on your site, so in the case of folder/page names I go with what the search engines (especially G) prefer which is hyphen as a word separator.

As Tedster pointed out, MOST operating systems (especially UNIX/Linux) are case sensitive. Windows is probably the biggest exception, but Windows servers make up only a small percentage of the web servers on the web. Most sites are Apache/Unix (or some derivative like Linux) based.

jd01

9:04 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Human trouble: Not only in-house, but with those sites who link to you, or any reporting you may get in offline media. All this means you can lose credit for significant backlink power over time. I see this all the time with sites on Windows servers.

I guess I didn't state this part very clearly in my previous posts and think I should have...

My URLs that contain capitalization are error-corrected for some spelling mistakes and typos, as well as all transposition and capitalization errors, through the entire file path.

EG You can link to or type in any of the following:
/URL/Example (correct location)
/URL/Example/
/url/expamle.htm
/url-example/
/ulR_example

And you will end up at the correct location. I guess I should have made more clear I think this amount of Mod_Rewrite is *essential* when you are using the URL structure I do, most specifically for the reasons tedster points out in the preceding quote.
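
The actual mod_rewrite ruleset is not shown in the thread, but the general idea can be sketched in Python under some loud assumptions: the path list is made up, the separators /, - and _ are treated as interchangeable (which the examples above suggest), and `difflib` fuzzy matching with a 0.8 cutoff stands in for whatever spelling correction the poster really uses.

```python
import difflib
import re

# Hypothetical inventory of canonical paths.
CANONICAL_PATHS = ["/URL/Example", "/MainCategory/SubCategory/InfoAboutATopic"]


def _key(path):
    """Reduce a path to a comparison key.

    Lowercases, drops a trailing slash and a trailing extension such as .htm,
    then removes the separators /, - and _.
    Caveat: two different pages could collapse to the same key.
    """
    path = path.lower().rstrip("/")
    path = re.sub(r"\.[a-z0-9]+$", "", path)
    return re.sub(r"[/_-]+", "", path)


_BY_KEY = {_key(path): path for path in CANONICAL_PATHS}


def correct_path(requested):
    """Return the canonical path to redirect to, or None if nothing is close enough."""
    key = _key(requested)
    if key in _BY_KEY:
        return _BY_KEY[key]
    # Assumed similarity threshold; tune it to trade missed typos for false matches.
    close = difflib.get_close_matches(key, list(_BY_KEY), n=1, cutoff=0.8)
    return _BY_KEY[close[0]] if close else None


for attempt in ["/URL/Example/", "/url/expamle.htm", "/url-example/", "/ulR_example"]:
    print(attempt, "->", correct_path(attempt))
```

Any corrected request would then be answered with a 301 to the returned path, so inbound links with typos still pass their credit to the one correct location.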

(Actually, when I started installing the error correction is when I went to caps. I figured if I was going to run everything through an error check so I didn't have to worry about people's ability to type a URL into their browser or link to my site I could probably make them look good at the same time... and if you're not smart enough to understand you can only use a certain extension for the 'real file location' on any of my sites, then you're probably not smart enough to be on my team very long, so I don't worry about that too much. ;)

I'm a very energetic champion of using all lower case in the file path.

I'll side with you for 99.9% of the people out there building sites, but there's a few of us who really enjoy pushing the limits and know enough to be able to pull it off... (Besides, you're 'old school' (LOL) - It's a good way to be, and I really do appreciate all your posts, even if we don't always agree on everything!)

jd01

10:08 pm on Sep 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I prefer using lowercase w/ hyphens as word separators in folder and page names also.

Since I'm rollin on this topic:
As far as hyphens go: these days I prefer using either /, _ or the whole phrase as one word over ever using a hyphen. There is one site I built with hyphens and have since switched to / or _. (I actually probably don't write the phrase all as one word too often, but have on occasion. Anyway, the following is the basic structure I normally use, with - to _ error correction in place, of course!)

So, words which should not be counted as a single phrase are separated with a / and words which should be are separated with an _.

EG
topic-example-info/the-page-name

Would probably look like:
(In all lowercase, just to stay with the consensus.)

topic_example/info/the_page_name