
Google SEO News and Discussion Forum

This 64 message thread spans 3 pages: page 2 of 3
What File format with extension is best for SEO?
ganeshjacharya



 
Msg#: 4521761 posted 7:49 am on Nov 23, 2012 (gmt 0)

What file format with extension is best suited by search engines?
  • page_name.html
  • pagename.html
  • page_name
  • pagename

 

lucy24




 
Msg#: 4521761 posted 5:11 am on Nov 24, 2012 (gmt 0)

It doesn't cost any extra to have an htm at the end and it was a convention I have used since 2000. I don't think NOT having an htm will affect the rank.

The "uhm" was a reference to "I agree" (with post about no extensions) ... followed by an example with extensions. And I've been using .html since, oh, 1994 or thereabouts, so neener-neener ;)

Now personally, I see an extensionless URL, my first impulse is to tell it to get back into the server and put some clothes on. But then, I don't like the idea of every URL having to be explicitly rewritten every time or it won't work. I've compromised by letting directories end in / even though that too calls for a rewrite. Only difference is, you don't have to code it yourself.

People would also need to use voice to cite URLs. Suppose I am talking to someone on phone and that person is over a Laptop.

Dictating an URL over the telephone is a task not to be undertaken lightly. Pop it in an e-mail instead. Especially, ahem, if the person is already in front of a computer.

Remember when TV commercials first started including www addresses and they always always always said "backslash"? OK, so that's http:\\www.example.com\stuff\morestuff ... and you wonder why their site doesn't get any visitors.

And, er, yes, I did do a double-take at Pen Island. Ouch.

incrediBILL




 
Msg#: 4521761 posted 6:12 am on Nov 24, 2012 (gmt 0)

People would also need to use voice to cite URLs.


Not a concern of mine as I always send people links via email, facebook or IM, I don't spell out URLs, not in this decade anyway. More importantly if I sent it to them somewhere it's archived, like email or FB, they have it after the phone call ends and they also have a way to contact me back regarding it if needed.

Not only that, I now have their email address too :)

sunnyujjawal



 
Msg#: 4521761 posted 6:12 am on Nov 24, 2012 (gmt 0)

Always avoid underscores and spaces in URLs. Hyphens are fine.

Agreed - Matt also speaks about this in a video.

setzer



 
Msg#: 4521761 posted 6:13 am on Nov 24, 2012 (gmt 0)

page-name is the most popular. Rarely do I see page_name used.

incrediBILL




 
Msg#: 4521761 posted 6:19 am on Nov 24, 2012 (gmt 0)

People would also need to use voice to cite URLs.


Spaces typically aren't in URLs or shouldn't be, so I would assume it's an underscore but the URL displays in the status area of most browsers when you mouse over it so regardless of whether or not you can tell if it's an underscore or part of the underline it's plainly visible elsewhere on the screen.

Also, the hyphen and underscore are on the same key on a standard keyboard so it's really not much of a challenge to tell someone how to type it in either way IMO.

Dan01



 
Msg#: 4521761 posted 6:39 am on Nov 24, 2012 (gmt 0)

The "uhm" was a reference to "I agree" (with post about no extensions) ... followed by an example with extensions.


Again, I agree that it doesn't affect the rank. I still use the htm.

ganeshjacharya



 
Msg#: 4521761 posted 6:43 am on Nov 24, 2012 (gmt 0)

Not a concern of mine as I always send people links via email


What if one is on a busy train or in an elevator and someone urgently needs a URL? I guess there are many places one might get stuck. Suppose one gets a call from an important lead, someone who has a mobile phone and is waiting to key in the URL, and for some reason their email client or yours is stuck?

Suppose one is in a third world country where people don't have email configured on their mobile phones? Or there could be some super busy person who dislikes mobile phones but is around a computer with internet? I know an actor who has publicly said he does not carry a mobile phone with him.

Another point: URLs need to be simple and easy to recall. Suppose they need to be used in a print campaign? In these situations, URLs like example.com/sportscar that are short and *rational* may be useful.

[edited by: ganeshjacharya at 7:06 am (utc) on Nov 24, 2012]

Dan01



 
Msg#: 4521761 posted 6:47 am on Nov 24, 2012 (gmt 0)

And I've been using .html since, oh, 1994


I don't remember Windows allowing four-letter extensions until Windows 98. I don't think Windows 3.x allowed it. That is why I chose to use three-letter extensions. My programming days go back to 1976 when I taught myself BASIC BTW. :)

[edited by: Dan01 at 8:01 am (utc) on Nov 24, 2012]

Lame_Wolf




 
Msg#: 4521761 posted 7:12 am on Nov 24, 2012 (gmt 0)

From a pure SEO standpoint "pagename" would be best


Define best.
I have URLs that are like...

SometimesLikeThis.html
or-sometimes-like-this.html
and_some_like_this.html

and they all rank #1
You cannot get better than #1, so how is one better than the other?


Minimal number of characters used in conveying the exact keyword(s) between the four options provided for pagename.


How is that better from an SEO standpoint?

To me, SEO is about reaching #1. If I have reached that with either...
SometimesLikeThis.html
or-sometimes-like-this.html
and_some_like_this.html

...then I've done my job.
I try not to make any URL too long. I've made that mistake before and it can cause all sorts of problems.


I still prefer .html extensions on pages I am unlikely to need to update and that don't have dynamic content. A quick php_copy gives me a static .html copy that my server gives precedence to automatically without any redirecting in htaccess, it doesn't get much easier (or faster) than that.


I use .html too. I don't know why, but I have never liked .htm

It's all preference, search engines understand them all just fine and there are better places to make gains with your SEO efforts.


True.

At the same time, you have to think of your audience.
Personally, I wouldn't want to be sending out a newsletter aimed at children with a URL of www.example.com/penisland.html or www.example.com/penisland

Lame_Wolf




 
Msg#: 4521761 posted 7:19 am on Nov 24, 2012 (gmt 0)

My programming days go back to 1976 when I taught myself BASIC BTW
You old fart ;) Showing your age now.

Do you remember COBOL too? (Compiles Only Because Of Luck.)

Dan01



 
Msg#: 4521761 posted 7:52 am on Nov 24, 2012 (gmt 0)

I didn't learn COBOL. I think that was more business oriented, but it was around when I learned FORTRAN and then PASCAL.

I am trying to remember my first computers. HP was "the" engineering calculator back then. I had a CASIO hand held computer that I programmed a horse racing handicapping program on. I used the COMMODORE 64 while everyone was using the Apple 2 (82). I never got into Apple - I went PC with DOS (I could build my own a lot cheaper). I bought a hand-held scanner (B&W) back in the early 90s. It was around $150, but the color ones were $300. :0 I didn't see flat-bed scanners back then at Computer City.

[edited by: Dan01 at 8:11 am (utc) on Nov 24, 2012]

Lame_Wolf




 
Msg#: 4521761 posted 7:57 am on Nov 24, 2012 (gmt 0)

I used the COMMODORE 64 while everyone was using the Apple 2.
I was on a ZX Spectrum and that was an odd way of trying to learn programming.

Although I never programmed in COBOL, they did where I worked, and I think they still do today for some of their systems.

Dan01



 
Msg#: 4521761 posted 8:27 am on Nov 24, 2012 (gmt 0)

In college they had an IBM 360 that everyone networked to. I used punchcards the first year. For Pascal I used the Commodore with a dongle. Later I logged onto an HP 1000 (I think that was it) to make 2-D animations.

g1smd




 
Msg#: 4521761 posted 9:55 am on Nov 24, 2012 (gmt 0)

I prefer "pagename.html" ... ... I can create a static copy of my php pages and upload them to the server to completely get rid of the CMS or php script.

What the file is called has nothing to do with what URLs your site uses. You can use physical .html files on the server and one line of code associates the extensionless URLs with each file.
RewriteRule ^(([^/]+/)*[^/.]+)$ /$1.html [L]
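For anyone who wants to see what that pattern does and does not match, here is a rough Python sketch that applies the same regular expression to a few paths. The function name map_to_file is made up for illustration, and the sketch assumes the .htaccess situation where the leading slash has already been stripped before matching. It only mimics the pattern, not Apache itself.

```python
import re

# Same pattern as the RewriteRule above: any number of "folder/" segments,
# then a final segment that contains no slash and no dot.
PATTERN = re.compile(r"^(([^/]+/)*[^/.]+)$")


def map_to_file(url_path):
    """Return the .html file an extensionless URL would map to, or None.

    Hypothetical helper: it imitates the rule's substitution "/$1.html".
    """
    relative = url_path.lstrip("/")  # assumption: per-directory matching
    match = PATTERN.match(relative)
    if match is None:
        return None  # rule does not apply; the request is left alone
    return "/" + match.group(1) + ".html"


if __name__ == "__main__":
    for path in ["/sportscars", "/cars/bmw", "/page.html", "/cars/"]:
        print(path, "->", map_to_file(path))
```

Paths that already carry an extension, or that end in a slash, fall through untouched, which is why physical .html files and directory URLs can coexist with the extensionless ones.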

But then, I don't like the idea of every URL having to be explicitly rewritten every time or it won't work.

Every URL request is rewritten. The default server action is:
RewriteRule (.*) /$1 [L]
It's hardly a great effort to alter that default URL to file mapping.

On the other topic, I started with a teleprinter and punched tape via the telephone in 1979, then the Research Machines 380Z in 1980, TRS-80 in 1981, and the BBC Micro in 1982 or thereabouts. First job used some sort of Apple machine. Second job had display terminals on an IBM System 36 or 38. Didn't use a PC until much later. Started with DOS 3.1 and changed to DOS 3.3 very soon after. First laptop had two 3.5-inch floppy disks and NO hard drive. Heck!

lucy24




 
Msg#: 4521761 posted 11:35 am on Nov 24, 2012 (gmt 0)

I don't remember Windows allowing four-letter string extensions until Windows 98.

Fortunately I have never needed to worry about That Other Platform's extension format.

:: snrk ::

Weirdly it's only in recent years that I've used extensions at all. I mean in real life, not on the web. It's part of that OS X thing. If I can't see the extension I don't know whether there really is one, or whether the computer will try to open the file in ... well, I've forgotten its name but it's some arcane utility that I swear I have never used for any purpose. Adding insult to injury, it will then tell me that application X can't open this file and in fact won't run on this computer at all. Well, I never ASKED it to open the file. Hmph.

Every URL request is rewritten. The default server action is:
RewriteRule (.*) /$1 [L]
It's hardly a great effort to alter that default URL to file mapping.

But you still have to take a conscious and deliberate action. With directory-slashes it happens without any human involvement.

Oh, and I guess it's time to put in a reminder that we didn't invent computers. For every 87-year-old getting their first tablet there's someone like my 81-year-old father who probably still speaks fluent Fortran though he hasn't had to use it in a good many years. You know the type: the ones who will happily spend two hours running up the code to do a one-time job that you could do by hand in 45 minutes.

Never got past BASIC myself. Any other language, including php and javascript, I'm in "three words and the rule for forming plurals" territory.

But I still expect to see
:: exerting superhuman strength to turn back to thread's original topic ::
an extension at the end of a www page name. Query strings, no, yuk. But a modest html is always in fashion.

Tiggerito



 
Msg#: 4521761 posted 2:56 pm on Nov 24, 2012 (gmt 0)

This is a good read and has changed my viewpoint on URLs.

I think there are a few key points to URLs:

    Easy for search engines to understand
    Nice for users to see (online and offline)
    Easy for users to explain verbally
    Easy to type in


The ideal solution will cover all. Here are my opinions.

Extensions are no longer needed and only detract from all the requirements. So don't use them.

Google may be able to work out that pagename is two words, but why take the risk? Make their job easier by providing a word divider.

Google now considers underscore as a word divider so it is an option. To me the decision is if you want people to be able to see the syntax of the URL. In many cases the underscore becomes invisible for online links. This does make the link look nicer though.

As many systems are case sensitive with the URL path we have to deal with it.

Thinking about all this, here's my current idea of best practices:

Use underscores as a divider for words. This looks tidy as they look like spaces when links are underlined. And search engines don't have to guess what the words are. This is actually a change on my previous thinking of dashes being the best.

Write using the natural case for what you're saying. Again, to make your URLs look nice online. e.g. capitalise when it makes sense.

Don't bother with extensions. Developers from the 90s might know what they mean (I remember Mosaic), but it has no significance today. It's just fluff that makes explaining a URL more complex.

Notice that all the above is about making the URLs look nice for the user.

Now to what I think is a more important concept. Let people make mistakes without them knowing it.

We have the use of 301 redirects and the canonical tag that let us support multiple URLs as if they are the same one. So use it. Either way you can make a bunch of URLs act as one. Redirect or canonicalise:

    With underscores
    With dashes
    Without either
    All case variations
    Versions with/without extensions


Basically, if someone explained the URL over the phone or in a pub, make sure every mistake made by the listener would still work.

This also means you could explain to staff that they don't have to say all the dash/hyphen/underscore stuff. Just read out the text and it will work.

e.g. you could have all the following URLs work:

    page_name
    page-name
    pagename
    pagename.html
    pagename.htm
    page-name.htm
    pagename.php
    etc.


But they all indicate the correct version via a 301 or canonical tag.
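Here is a minimal sketch of that idea in Python, just to make the logic concrete. Everything in it is an assumption for illustration: the page list, the helper names lookup_key and resolve, and the choice to treat the underscore version as the canonical one. A real site would do this in its server config or CMS.

```python
import re

# Hypothetical list of canonical page names for the site.
CANONICAL_PAGES = ["page_name", "Sports_Cars"]


def lookup_key(path):
    """Reduce a requested path to a forgiving key.

    Drops a trailing .htm/.html/.php, removes dashes and underscores,
    and ignores case, so common mistakes collapse to the same key.
    """
    name = path.strip("/")
    name = re.sub(r"\.(html?|php)$", "", name, flags=re.IGNORECASE)
    name = re.sub(r"[-_]", "", name)
    return name.lower()


KEY_TO_CANONICAL = {lookup_key(page): page for page in CANONICAL_PAGES}


def resolve(path):
    """Return (status, location) for a requested path.

    200 if the path is already canonical, 301 if it is a known variant,
    404 if nothing matches.
    """
    canonical = KEY_TO_CANONICAL.get(lookup_key(path))
    if canonical is None:
        return (404, None)
    target = "/" + canonical
    if path == target:
        return (200, target)
    return (301, target)


if __name__ == "__main__":
    for path in ["/page_name", "/page-name.htm", "/PageName.php", "/other"]:
        print(path, "->", resolve(path))
```

One thing to watch with this approach: two different pages can collapse to the same key once dividers and case are removed, so the lookup table needs checking for collisions before relying on it.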

Something else I do. For offline media you know that people have to read and type in the URLs (or use QRCodes). I quite often create a custom URL to not only make it easy for the user but to also let me track the source.

Tiggerito



 
Msg#: 4521761 posted 3:09 pm on Nov 24, 2012 (gmt 0)

And the first computer I programmed was a Sharp...

[old-computers.com...]

I also remember a job for my dad where my code had to be punched on cards to enter it into the computer. It had tape drives like you see in the original Italian Job movie.

And I remember writing assembly via rem statements on a VIC 20, or was it the ZX81?

Tell you what, I don't miss it. I love the fact I can now focus on the features I'm developing and not the code.

Dan01



 
Msg#: 4521761 posted 9:48 pm on Nov 24, 2012 (gmt 0)

Google now considers underscore as a word divider so it is an option.


The way I remember it - Google sees the underscore as a space.

Never use a space though. I remember (this happened several times) when a link was transferred (perhaps via email - I can't remember right now) that the space was replaced with several characters (like %@&). When that happened the URL was unusable and gave a PAGE NOT FOUND. I was with a friend who had an email with this. I replaced the characters with a space and it worked. But how many people will do that?

I guess the bottom line is: the underscore is the best.

As for the extensions: I don't think it matters to the search engines. I don't think it matters as much as the dash/underscore debate. BUT... I still use the .htm for the reasons I said above.

Before I used a CMS (and I still might use simple HTML pages from time to time) I would store the pages in a folder with images. Sometimes the images would be the same or a similar name. An htm or html would help me quickly identify the webpages from the images and other files.

I don't think there is a reason to use .htm or html for SEO.

g1smd




 
Msg#: 4521761 posted 10:16 pm on Nov 24, 2012 (gmt 0)

Hyphen always trumps underscore or space for several reasons mentioned above, especially usability.

lucy24




 
Msg#: 4521761 posted 11:36 pm on Nov 24, 2012 (gmt 0)

Never use a space though. I remember (this happened several times) when a link is transferred (perhaps via email - I can't remember right now) that the space was replaced with several characters

All characters except a very short list are percent-encoded. Space is %20. Percent itself is %25. (You'll see this in nested query strings that have been encoded twice.) But generally they are disencoded on arrival; the user does not know or care whether you've got an explicit function to deal with it, or the browser and/or server does it on its ownsome.

If you ever post a link on a forum-- not this one, duh, but forums that allow links-- and it doesn't work, try percent-encoding any unusual characters like parentheses. Works a treat. (For the two people who didn't already know: It's percent followed by the hexadecimal number of the character. For example %3C paired with %3F in language of your choice.)
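To try this out without doing the hexadecimal by hand, here is a small sketch using Python's standard urllib.parse module. The file names are invented examples; only quote and unquote come from the library.

```python
from urllib.parse import quote, unquote

# A space becomes %20 and a literal percent sign becomes %25.
space_encoded = quote("my page.html")
percent_encoded = quote("100%")

# Encoding something that is already encoded turns %20 into %2520,
# which is the "encoded twice" effect seen in nested query strings.
encoded_twice = quote(space_encoded)

# Parentheses are not on the short list of characters left alone.
parens_encoded = quote("page_(draft).html")

# Decoding reverses the process: percent plus two hex digits
# becomes the character with that code.
decoded = unquote("my%20page.html")
php_open_tag = unquote("%3C%3F")

if __name__ == "__main__":
    print(space_encoded)
    print(percent_encoded)
    print(encoded_twice)
    print(parens_encoded)
    print(decoded)
    print(php_open_tag)
```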

Lame_Wolf




 
Msg#: 4521761 posted 12:17 am on Nov 25, 2012 (gmt 0)

Hyphen always trumps
So do I after a curry ;)

Dan01



 
Msg#: 4521761 posted 1:59 am on Nov 25, 2012 (gmt 0)

Good info Lucy.

ganeshjacharya



 
Msg#: 4521761 posted 2:26 am on Nov 25, 2012 (gmt 0)

BUT... I still use the .htm


What if tomorrow a new sub-section is to be created for the desired topic in concern?

Suppose

pages.htm has now grown large and needs to be divided into

pages.htm/newpage1.htm
pages.htm/newpage2.htm etc?

The usage of pages.htm towards the category seems a bit odd. But on the other hand, if a page name without an extension was used, it could then be easily divided into further pages.

Say

pages
(without an extension ".htm")

can be then easily divided into

pages/newpage
pages/newpage1/subsection
pages/newpage2/subsection
pages/newpage2/subsection/subsubsection

etc

Sgt_Kickaxe




 
Msg#: 4521761 posted 3:12 am on Nov 25, 2012 (gmt 0)

Vic-20 was where I wrote my first actual program, though I owned one of everything at the time. PEEK and POKE commands saved to a tape deck, yikes!

Anyways, traffic to my index page from Google has doubled this week, though it seems to be at the expense of some other pages. Wouldn't it be ironic if it had something to do with a flattening of the web to devalue excessive internal linking? We'd have to pick out URLs and extensions (if any) carefully.

Dan01



 
Msg#: 4521761 posted 3:40 am on Nov 25, 2012 (gmt 0)

pages.htm has now grown large and needs to be divided into

pages.htm/newpage1.htm
pages.htm/newpage2.htm etc?


I am not sure what you are talking about, but I wouldn't name a folder with a dot.

Dan01



 
Msg#: 4521761 posted 6:32 am on Nov 25, 2012 (gmt 0)

ganeshjacharya, What would prevent me from making a new folder without the .htm ?

ganeshjacharya



 
Msg#: 4521761 posted 6:58 am on Nov 25, 2012 (gmt 0)

I am not sure what you are talking about, but I wouldn't name a folder with a dot.


Yes, that is what I wrote too.

the usage of pages.htm towards the category seems a bit odd.

ganeshjacharya



 
Msg#: 4521761 posted 7:24 am on Nov 25, 2012 (gmt 0)

ganeshjacharya, What would prevent me from making a new folder without the .htm ?


While structuring a website that uses hierarchical URLs... a topic can have deeper sub and sub-sub topics.

e.g. example.com/sportscars might initially start by talking about sports cars, and then the topic may have to expand further to discuss example.com/sportscars/BMW and example.com/sportscars/Bugatti.

In this case, if we start with example.com/sportscars.htm and then need to divide this topic further, we will need to start with an empty example.com/sportscars/ and then add a redirect to example.com/sportscars.htm etc., which takes additional effort.

g1smd




 
Msg#: 4521761 posted 7:51 am on Nov 25, 2012 (gmt 0)

Starting with extensionless example.com/sportscars then moving to example.com/sportscars/BMW and example.com/sportscars/Bugatti means that a request for example.com/sportscars will be automatically handled by the server as an external redirect to example.com/sportscars/ and the second request would then be fulfilled by the file at example.com/sportscars/index.ext
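The two-step sequence described above can be sketched in Python like this. It is only a toy model of the behaviour, not server code: the DIRECTORIES set, the handle function, and the choice of index.html for the index file (written as index.ext above) are all assumptions for illustration.

```python
# Hypothetical set of paths that now exist as folders on the server.
DIRECTORIES = {"/sportscars"}

# Assumption: the index file is index.html; the post leaves the extension open.
INDEX_FILE = "index.html"


def handle(path):
    """Return (status, target) for a request, mimicking the described flow.

    A folder requested without its trailing slash gets an external redirect
    to the slash version; the slash version is served from the index file.
    """
    if path in DIRECTORIES:
        return (301, path + "/")
    if path.endswith("/") and path.rstrip("/") in DIRECTORIES:
        return (200, path + INDEX_FILE)
    return (404, None)


if __name__ == "__main__":
    first = handle("/sportscars")  # old extensionless URL
    second = handle(first[1])      # browser follows the redirect
    print(first)
    print(second)
```

So an old link to the extensionless URL keeps working after the page becomes a folder, at the cost of one extra request.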

If you're using a CMS and internal rewrites, the code changes for that are also very easy.

Dan01



 
Msg#: 4521761 posted 8:15 am on Nov 25, 2012 (gmt 0)

pages/newpage
pages/newpage1/subsection
pages/newpage2/subsection
pages/newpage2/subsection/subsubsection


I don't think you would gain any more rank by that.


You have a webpage - newpage or newpage.htm

I don't know what additional effort you need.

[edited by: Dan01 at 8:23 am (utc) on Nov 25, 2012]

ganeshjacharya



 
Msg#: 4521761 posted 8:21 am on Nov 25, 2012 (gmt 0)

If you're using a CMS and internal rewrites, the code changes for that are also very easy.


I agree. But why put in extra effort if just avoiding the extension serves the purpose?


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved