.PHP or .HTML? – That is the Question

PR1 ranks higher than PR3 – What to do?

         

spina45

1:09 pm on Nov 30, 2006 (gmt 0)

10+ Year Member



My site is 6 years old and the most well-known in its category. I have 1000s of incoming links. Until recently my SE rankings were excellent.

Background:
Four years ago my web developer installed a file into my OsCommerce shop that would convert product URLs from: product_info.php?products_id=nn to: product-nn.html and list them in a sitemap linked to my homepage. This was done to achieve better SE rankings. Within a month 100s of my individual product pages started to appear in the SEs and sales took off.

Over the years I’ve noticed that 90% of the URLs that appear in SEs were the .php version and NOT the .html. This seemed odd because the .php URLs produce a PR1 while the .html URLs produce a PR3. Weird?!

In June ’06 a well known company listed my website in a print publication and misprinted my domain name. I found the domain was available so I purchased it along with another that I thought was cool. I then pointed both new domain names to my existing website. But I did NOT do a 301 redirect because I didn’t know any better.

About 3 months later my site’s ranking plummeted in Google and I began to research why. I determined (though I’m not 100% sure) that Duplicate Content was the culprit.

I modified my .htaccess with a 301 for both domains, including www and non-www, and waited until the next noticeable “update” in Google. Bummer! No change in my rankings.
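
For reference, a host-level 301 of the kind described here is often written along the following lines. This is only a sketch: www.example.com stands in for the real primary domain (the thread never names it), and it assumes Apache with mod_rewrite enabled and the rules placed in the root .htaccess.

```apache
RewriteEngine On

# Any request whose Host header is not the canonical name
# (the extra domains, the non-www variant) gets one permanent redirect.
# www.example.com is a placeholder, not the poster's real domain.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

One thing worth checking with a setup like this: by default, mod_rewrite rules in a parent directory's .htaccess are not applied inside a subdirectory that has its own .htaccess containing rewrite rules. A rule like the one above in the root may therefore not cover URLs under such a subdirectory unless it is repeated there (with the subdirectory added to the target) or inheritance is turned on with RewriteOptions Inherit.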

I’ve now been reexamining everything about my site to eliminate Duplicate Content.

So here’s my question: should I standardize on product_info.php?products_id=nn or on product-nn.html?

Note: there are many incoming links from sites, blogs, articles, etc that point to the .php URL version.

Any insight would be appreciated as to what is going on, how long I’ll have to wait and what to do to regain my once stellar rankings in Google.

spina45

6:39 am on Dec 1, 2006 (gmt 0)

10+ Year Member



I'm sorry. I didn't know I had two .htaccess files.

There is one in my root directory and another in my /store directory.

I see the code below in .htaccess in the /store directory...

RewriteEngine on
RewriteRule product-(.*).html$ /store/product_info.php?products_id=$1

This corresponds to the "sitemap_products.php" contribution.
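
A slightly tighter form of that rule, as a sketch only. It assumes product IDs are purely numeric (the "nn" in the URLs suggests so, but nothing in the thread confirms it) and that the file stays in /store/.htaccess.

```apache
RewriteEngine on

# ^ and $ anchor the match to the whole path (relative to /store/),
# \. matches a literal dot, and [0-9]+ accepts digits only.
# [L] stops rule processing once this rule has matched.
RewriteRule ^product-([0-9]+)\.html$ /store/product_info.php?products_id=$1 [L]
```

With the original pattern, the unanchored start and the unescaped dot mean a path such as my-product-12xhtml would also be rewritten to product 12. The anchored version accepts only product-12.html.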

So, I'm sorry, he DID modify my .htaccess. I guess I need to get some sleep! It's 1:40AM!

Thanks for all your help.

spina45

6:47 am on Dec 1, 2006 (gmt 0)

10+ Year Member



> Is your whole site php based?

Actually, no. When I first launched I only had an MSFrontPage site (don't ridicule me!). When I added the OsCommerce shop, I kept the MSFP portion, with links into the store.

I've had two sitemaps, one for the non-store that I create manually and the other for the store products that is generated via "sitemap_products.php"

Decius

8:00 am on Dec 1, 2006 (gmt 0)

10+ Year Member



To be totally honest, I don't see the purpose of the conversion that you are showing us.

That just converts your urls from PHP urls to HTML urls but doesn't include any keywords in it such as product titles and what not. The only possible benefit you are getting then is a site that looks static instead of dynamic, which is minimal.

All the current information still stands, but you should consider pressing your developer to look into actually including keywords in the HTML urls. This would give your site all completely new pages that at first won't carry any PR, and is risky, but is definitely a better long term idea.

--------

The code you have shown us is quite standard. Next he will probably add:

RewriteRule category-(.*).html$ /store/category.php?category_id=$1

or something along those lines which will accomplish the same thing, just for the categories.

But again, there is very little benefit to this unless you put the whole actual category name in there. If he does not know how to do this, just have him look up some RewriteRule examples on Google. It's quite cool and fun once you get to know it, and extreeeeeeeeemely useful.

tedster

8:27 am on Dec 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's some really sound advice from Decius.

Take this opportunity to write some keywords into the urls. Not only can it help you in ranking (some times more than others, depending on Google's tweaks to their algo), it also helps to draw clicks from the user when your urls end up on their SERP.

spina45

1:55 pm on Dec 1, 2006 (gmt 0)

10+ Year Member



> you should consider pressing your developer to look into actually including keywords in the HTML urls.

I have considered this but...

1a) Will the 301 preserve the 1000s of links external people have posted directly to my "product_info.php?products_id=nnn" pages?

1b) Will the old PR flow into the new Keyword URL?

2) I'm worried about losing existing rank AND if I do a wholesale change to all my URLs could Google consider this spam? I've read that adding too many new pages too quickly is bad.

3) If Google saw that a URL and the anchor text contained the exact same (or very similar) keywords, couldn't they assume I'm gaming the algorithm and weigh me down in the rankings?

jtara

4:16 pm on Dec 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> If Google saw that a URL and the anchor text contained the exact same (or very similar) keywords, couldn't they assume I'm gaming the algorithm

I think they'd assume your site is well-organized and logical.

tedster

5:55 pm on Dec 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From my experience, a 301 redirect eventually will transfer both PR and backlink influence, as long as the redirect is just "one hop", and not a chain of redirects that eventually ends up at the new url. However, it can take a while for Google to evaluate and trust the new redirects -- 4-6 weeks in some cases.
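
To illustrate the "one hop" point: the idea is that an old URL redirects straight to the final address, host name included, rather than first to the right host and then to the right path. A rough sketch for /store/.htaccess, assuming the product-nn.html form is the one being kept, that product IDs are numeric, and with www.example.com as a placeholder host:

```apache
RewriteEngine on

# 1) A direct client request for the old dynamic URL gets a single 301
#    to the final URL. THE_REQUEST holds the original request line, so
#    this condition is not satisfied by the internal rewrite in step 2
#    and the two rules do not loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/store/product_info\.php\?products_id=([0-9]+)(&|\s)
RewriteRule ^product_info\.php$ http://www.example.com/store/product-%1.html? [R=301,L]

# 2) The static-looking URL is served internally by the real script.
RewriteRule ^product-([0-9]+)\.html$ /store/product_info.php?products_id=$1 [L]
```

The trailing "?" on the redirect target drops the old query string. As written, the condition only matches when products_id is the first parameter; links that put other parameters ahead of it would need a looser pattern. If the final URLs contain product names, a pattern alone cannot supply the name, so the 301 would more likely have to be issued by the script itself after looking the name up.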

If you are not in trouble right now, and you already have multiple urls in play, then I can appreciate your reluctance. It is worth considering, however. Done properly, I've seen solid improvements from using more human-readable urls. Of course when technical errors get in the way, then you can have a mess to fix.

spina45

6:24 pm on Dec 1, 2006 (gmt 0)

10+ Year Member



> as long as the redirect is just "one hop", and not a chain of redirects

Considering Google has indexed my URLs in two formats: 1) "product_info.php?products_id=nnn" and, 2) "product-nnn.html" (both serving the same content), do you think it is beneficial to introduce yet a third URL 3) "keywordwidgetname.html" and 301 the other two to this new URL?

It appears my Google rankings are currently being suppressed, so I'd like to do what is best long term. But I'd rather not have to wait a year or more!

Decius

10:15 pm on Dec 1, 2006 (gmt 0)

10+ Year Member



Long term, what you most recently suggested is exactly what you should do in my opinion. Forward everything via 301 to the new keyword URLs.

I doubt Google will think you are spamming if you alter your site structure.

Also, it is possible that Google will think you are spamming if you make the links match the urls too much. This is my personal belief. Therefore, focus on making sure the URLs are perfect (since you do not want to change those over and over) and then tweak the titles and links to be variations of that. This will provide an environment where everything matches up topically, but does not look duplicated.

Remember, the more you make it look like a human went in and wrote every single url, every single title, and every single link, the less likely you are to trip filters.

jtara

10:46 pm on Dec 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Considering Google has indexed my URLs in two formats: 1) "product_info.php?products_id=nnn" and, 2) "product-nnn.html" (both serving the same content), do you think it is beneficial to introduce yet a third URL

Absolutely.

You currently have duplicate content indexed.

When you are done, you will have a single URL indexed for each product.

The 301 tells Google to forget about the old locations, and use the new ones. You will no longer have your pages indexed using two different sets of URLs.

Feel free to change the structure to something more logical while you are at it.

I don't know if you have simplified things for the post, but I notice that you have no top-level "products" or "store", etc. I would certainly do that, to distinguish your product pages from other miscellaneous parts of your site structure - "about us", "privacy policy", "shipping", etc.

spina45

2:47 am on Dec 2, 2006 (gmt 0)

10+ Year Member



> doubt Google will think you are spamming if you alter your site structure.

In what way?

> Therefore, focus on making sure the URLs are perfect

Should I separate words within URL?
Should I use a dash -?
Should I use an underscore _?
Is there an ideal number of characters?
All Upper Case?
All lower case?
Mixed?
What matters most?

> I don't know if you have simplified things for the post

Yes I did. domain.com/store/product_info.php?products_id=nnn

Decius

3:24 am on Dec 2, 2006 (gmt 0)

10+ Year Member



domain.com/store/product_info.php?products_id=nnn

->

domain.com/producttype/nnn/productname.htm

Separate all spaces with dashes, remove all special characters, lowercase it all so it looks cleaner.

domain.com/store/category.php?category_id=nnn

->

domain.com/producttype-plural/nnn/categoryname.htm

That sounds just about right.

spina45

3:46 am on Dec 2, 2006 (gmt 0)

10+ Year Member



MY CURRENT: domain.com/store/product_info.php?products_id=nnn

YOUR SUGGESTION: domain.com/producttype/nnn/productname.htm

Are you saying to segment products into separate directories?
And then put each product into its own subdirectory?
That seems like it could get rather L-O-N-G.

The "nnn" is my current unique product identifier. If I use "productname.html" there is no need for "nnn." CORRECT?

jtara

5:21 am on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think you should use whatever makes sense to you and people in your industry.

Do you refer to products by name? By some industry-standard number? By catalog number? By an alphanumeric code?

I'd stick to one (or two) level(s) under "catalog", "products", etc.
Often, products cross categories. If this is the case, don't put the products under categories! Make separate category indices. The indices should point to the product pages under a unified scheme.

Of course, I am going to illustrate my preference for no file type. Add ".html" if you insist...

Example:

example.com (home page)
    about_us
    contact
    privacy
    ordering
    customer_service
    downloads
    manuals
        459z456
    catalog
        widgets
            frenulated_widgets
            frapulated_widgets
        wadgets
        fragets
            client_side_fragets
            server_side_fragets
    products
        0x435q (or) super_excelsior
        459z456

Try to give each product a UNIQUE URL. I think it's useful to distinguish your catalog - which may have categories, and subcategories, and have multiple references to the same products - from your product pages.

On the other hand, it may be conventional in your industry to have a linearly-sequenced catalog. You are probably going to incur some duplicate-content penalty in this case. But perhaps not. Catalog pages are often != product data pages.

Forget the SEO. Think about your customers. IMO, that's the best SEO.

Decius

7:46 am on Dec 2, 2006 (gmt 0)

10+ Year Member



domain.com/producttype/nnn/productname.htm

This is not 2 directories... it only looks that way.

domain.com/producttype -> domain.com/product.php3
/nnn ->?productid=nnn
/productname.html -> toss it

The above shows how htaccess should convert the URL
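
As a sketch of that mapping in mod_rewrite terms, assuming the rule sits in the site's root .htaccess, that "widgets" is the product-type segment (a made-up name), and that the osCommerce script mentioned earlier in the thread is the target rather than the product.php3 placeholder above:

```apache
RewriteEngine on

# /widgets/123/blue-widget.htm  ->  /store/product_info.php?products_id=123
# Only the numeric segment is captured; the trailing name is matched
# but never passed on to the script ("toss it").
# "widgets" is a hypothetical product-type segment.
RewriteRule ^widgets/([0-9]+)/[a-z0-9-]+\.html?$ /store/product_info.php?products_id=$1 [L]
```

Because the name is ignored, /widgets/123/anything.htm would return the same page as the real URL. To avoid creating new duplicate URLs, the script would probably need to 301 any request whose name part does not match the product's actual name.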

g1smd

2:57 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Always avoid spaces and underscores in filenames and URLs.

Use only hyphens, dots, or commas to separate words.

If you can use all lower case for the URL, then do that. CamelCaps can cause problems, especially on IIS.

jtara

5:01 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



CamelCaps cause problems not only with IIS, but with users. Use all lower case, and use a rewrite rule to force everything to lower-case, so that a user can get away with typing CamelCaps in a URL if they want, but will still get the page they are looking for.
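
On Apache, a force-to-lower-case rule of this kind is usually built on mod_rewrite's internal tolower map. A hedged sketch follows; the map name "lc" is arbitrary, and the RewriteMap line has to go in the main server or virtual-host configuration because that directive is not permitted in .htaccess.

```apache
# --- server or <VirtualHost> configuration ---
# Declares a map named "lc" (arbitrary name) backed by the built-in
# tolower function.
RewriteMap lc int:tolower

# --- root .htaccess ---
RewriteEngine On

# If the requested path contains any upper-case letter, redirect
# once to the all-lower-case equivalent of the same path.
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule ^(.*)$ /${lc:$1} [R=301,L]
```

This would also redirect requests for any real files or directories whose names contain capitals, so those would need renaming or an exclusion first. On shared hosting without access to the server configuration, the map cannot be declared and the lower-casing would have to happen in the script instead.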

Why only dashes, periods, and hyphens? I think underscores most convey the visual appearance of spaces, and this can make things clearer to your users. Commas are awful, IMO.

OK, now I see what Decius is trying to do - include both a product ID and name in the URL, but the product name is just a decoration and not actually used to identify the page to serve. Cute. I like the idea, but I'd use something other than a "/" to separate them, then. The "/" is confusing, and implies a hierarchy to the user.

Again, try to follow the conventions of your industry. If your industry uses names, use names. If not, don't. Try to avoid a totally meaningless product ID in any case. Try to avoid using a product ID forced on you by your database or CMS. Use a meaningful product code - the same thing that would appear in a paper catalog.

Catalogs, indices, etc. are separate. Don't organize your product page URLs under a taxonomy, especially if the same products are likely to be categorized under multiple headings. Make their URLs flat. products/123, products/456, not products/widgets/123. You can have one or more taxonomies elsewhere in your URL tree that refer to product pages.

[edited by: jtara at 5:10 pm (utc) on Dec. 2, 2006]

g1smd

5:09 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> Why only dashes, periods, and hyphens? I think underscores most convey the visual appearance of spaces, and this can make things clearer to your users. Commas are awful, IMO. <<

The problem is that a search for "two words" will only match "two-words" and "two.words" and "two,words" and "two words" but will NOT match "two_words".

Only a search for "two_words" will match "two_words".

Additionally, spaces in a URL are converted to %20 and that makes them%20very%20hard%20to%20read.

Hence the recommendation to use only hyphens, dots, or commas when separating words in a URL.

(Caution: highlighting of keywords in the title, snippet, and URL, in the SERPs is only a display process and does not indicate a search algorithm match.)

jtara

5:15 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> The problem is that a search for "two words" will only match "two-words" and "two.words" and "two,words" and "two words" but will NOT match "two_words".

But the words being separated will surely be found in the product page as well, right? And without the underscores. Search generally doesn't rely on the URL to find things.

Maybe this is good SEO (or maybe it isn't...) but I don't see underscores in URLs as preventing search from finding things people are looking for.

g1smd

5:29 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is the other way about. The literal underscore is matched in a search because computer programs often use them in procedure and function names.

Google wants to allow programmers to find computer code examples, and so they treat the underscore as a literal match: like define_print_options [google.com] as opposed to define print options [google.com] etc.

spina45

6:10 pm on Dec 2, 2006 (gmt 0)

10+ Year Member



Okay, this is my CURRENT product path...

domain.com/store/product_info.php?products_id=nnn

What is the consensus on moving to this path...

domain.com/store/widget-name-nnn.html

VERSUS THIS...

domain.com/store/widgets/widget-name-nnn.html

I don't understand including "widgets/" in the path other than adding a keyword to the url.

Also, if "widgets/" is not really a directory structure, what exactly is it?

I'm sorry, I'm not a programmer just a dude who's trying to fix his Google Grief and wants to "measure twice and cut once" in terms of making big changes. Thanks to all who are providing their input.

jtara

6:34 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> It is the other way about. The literal underscore is matched in a search because computer programs often use them in procedure and function names.

I realize that.

I was pointing out that if you have, say, "red_widgets" in the URL, you probably have "red widgets" in the actual content.

Google will still be able to find "red widgets".

If you need keywords in URLs for Google to be able to find things, you have bigger problems.

Now, the presence or absence of keywords in the URLs might have SEO implications. Whether those are positive or negative from one day to the next is anybody's guess. Which is why I say make it make sense to you and your users, and forget the SEO implications.

> I don't understand including "widgets/" in the path other than adding a keyword to the url.

Nor do I...

> Also, if "widgets/" is not really a directory structure, what exactly is it?

It is simply part of the URL. But, by convention, users read it as a component of a tree-structured hierarchy.

In the early days, URLs were mapped 1-1 to a directory structure on the server's disk. Dynamic page serving and mod_rewrite free us from this structure, and allow us to structure URLs in any meaningful (or meaningless) way.

I say, go with the flow, and use "/" to represent hierarchy. That's why I said earlier that it would be confusing to use a "/" to separate a product number from a product name.

Give some thought as to whether the term "catalog" is really appropriate here. I think in most cases, "product" or "products" might be better. Your catalog is something else - it IS typically organized as a hierarchy and may well have multiple-level subcategories. Your catalog might simply consist of index pages that link to product pages, might include some heading material, might be a reproduction of your paper catalog (in which case you will have to be careful about duplicate content - but I wouldn't be all that paranoid about it).

You probably have one or more separate indices on your site for locating products using a hierarchy. But this is distinct from the product pages themselves, and I think it makes sense to keep the URLs for the latter as flat as possible - UNLESS, say, your company has different divisions, with entirely separate products, or you have completely different product lines. Again, whatever makes sense for your company and your industry.

One final thought - consider whether you might have - or might some day have - multiple bits of information for each product. A product information page, a catalog page, a data sheet, a material safety sheet, a manual, downloads, etc. If so, give some thought as to how to organize this. As in: inside-out, or outside-in?

e.g. example.com/products/45tzq67/data_sheet, example.com/products/45tzq67/manual, or: example.com/manuals/45tzq67, example.com/data_sheets/45tzq67, etc.

You don't have to be a programmer. Sit down and think about what really makes sense for your users. The programmers will sort it out, unless you come up with something REALLY goofy!

[edited by: jtara at 6:52 pm (utc) on Dec. 2, 2006]

g1smd

6:48 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For a page about green widgets, having green.widgets or green-widgets or green/widgets in the URL might help, or might not.

Having green_widgets in the URL can never help.

Decius

9:46 pm on Dec 2, 2006 (gmt 0)

10+ Year Member



Keywords in the URL are very useful as tedster pointed out (reread his post)

They are bold when they appear in search engine listings and Yahoo and MSN and even Google use them to determine keyword strength. Don't overdo it by stuffing it with keywords, but the product name will suffice.

I don't think you should spend too much time worrying about how it will "look" to users or other people in your business as jtara is saying. This is not a priority according to me. The average user does not attempt to understand how your website is organized.

As I stated above:

domain.com/product-type/nnn/product-name.html
domain.com/product-types/nnn/category-name.html

This provides anyone who looks at it a topical hierarchy - you have product-name which belongs to product-type.

You have category-name which carries product-types.

I believe Google will like this very much as well.

If you do:

domain.com/product-type/nnn-product-name.html
domain.com/product-types/nnn-category-name.html

This is also acceptable.

Decius

9:56 pm on Dec 2, 2006 (gmt 0)

10+ Year Member



Additional note:

domain.com/product-type/nnn-product-name.html
domain.com/product-types/nnn-category-name.html

Is worse than the url I suggested IMO... this is because if you want to stick in any additional variables:

domain.com/product-type/nnn-yyy-product-name.html
domain.com/product-types/nnn-yyy-category-name.html

It gets sticky, because you are already using dashes for spaces in the category name.

I would stick with:

domain.com/product-type/nnn/product-name.html
domain.com/product-types/nnn/category-name.html

I think you should run this by your programmer before responding here to get a more thorough idea of what is implied by all this.

g1smd

10:13 pm on Dec 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you do run dynamic URLs that "look" like folder-based URLs then DO make sure that the "index URL" for each folder DOES produce some meaningful content: even if it is just a 404 error page with links to various sections of the site.

I hate trying to traverse up the folder tree, to get to some other place on the site, only to receive a "virtual folder listings have been deactivated" (or similar) error message.
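
One way to make sure a "folder" URL answers with something useful, sketched for a root .htaccess. The names are illustrative assumptions: "widgets" is a hypothetical product-type segment, 12 a hypothetical category ID, and category.php?category_id= is the category script used as an example earlier in the thread.

```apache
RewriteEngine on

# /widgets (no trailing slash) is sent to the one canonical form.
RewriteRule ^widgets$ /widgets/ [R=301,L]

# /widgets/ shows the category listing instead of an error message.
# Category ID 12 is a made-up value for illustration.
RewriteRule ^widgets/$ /store/category.php?category_id=12 [L]

# /widgets/123/ (an ID but no page name) falls back to the listing.
RewriteRule ^widgets/[0-9]+/?$ /widgets/ [R=301,L]
```

Each virtual folder level then resolves to a page a visitor can navigate from, which is the behaviour described above for people who trim URLs by hand.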

Decius

11:55 pm on Dec 2, 2006 (gmt 0)

10+ Year Member



That is not the way you are supposed to navigate a site, and is certainly not the way the site is spidered. It is via sitemap, inbound links, or in-site links. This is the way Google will spider it as well as far as I know. There is no benefit to catering to users that choose to navigate this way, although a 404 should naturally happen if you have indexing off.

g1smd

12:06 am on Dec 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A lot of designers are clueless at providing a hierarchical "breadcrumb" navigation trail and/or "related products" links, so if I am on a page like www.shoes.com/running-shoes/mens/adidas/track/lightning/7/blue/ and I now want to look at the Nike stuff, and you haven't provided a quick way for me to get there, then I am going to try accessing www.shoes.com/running-shoes/mens/nike/ and see what I get.

Decius

1:00 am on Dec 3, 2006 (gmt 0)

10+ Year Member



Then this is poor linking: Nike should be made available via a tree link of some sort if you indeed provide products in that manner. To assume in this day and age that you can alter a URL and expect to find categorically sorted information that isn't linked to by the webmaster is not very logical.

jtara

4:35 am on Dec 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> To assume in this day and age that you can alter a URL and expect to find categorically sorted information that isn't linked to by the webmaster is not very logical.

I do it all the time. And it works.

The most common case is when a company has come out with a new version of software, and I want to download it, get the manual, etc. They don't always get the links into all the index pages right away. If the site is laid-out logically, I can take a good guess and most of the time it will be right.

Better that users should be able to find stuff that you've forgotten to link than not.

I do think the URL should make sense to the user. It would be different if the URL wasn't displayed to the user in the URL bar - but it is.

If nothing else, it can show your users that you are organized. Or not.

This 62-message thread spans 3 pages.