Google SEO News and Discussion Forum

301-redirect though URLs don't change?
deeper

5+ Year Member



 
Msg#: 4515006 posted 12:26 pm on Nov 2, 2012 (gmt 0)

Hi,
I have to manage a redesign of old static sites whose pages have .html URLs. As the new CMS will be WordPress, and WP doesn't allow .html URLs for pages, I am thinking about a 301 redirect.

But I don't feel very good about a 301, because it is meant to manage a migration, i.e., changing URLs. In my case it would be the opposite: I would use it to keep the former .html URLs.
How will Google treat that?

Would you do this?

 

DitaP



 
Msg#: 4515006 posted 3:04 pm on Nov 2, 2012 (gmt 0)

WordPress does allow .html URLs; you just need to set this up in the permalink settings.

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 3:27 pm on Nov 2, 2012 (gmt 0)

Add to your .htaccess:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

Then install the plugin: 'Custom permalinks'...go into settings and set custom structure to: /%postname%

301'ing is bad if not needed because I think you lose 10% of page juice with each 301. Plus if you 301 stack, you can get into trouble with google.

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4515006 posted 4:23 pm on Nov 2, 2012 (gmt 0)

We've discussed the many deficiencies in that "standard issue" Wordpress code many times before, the last being only a few weeks ago. Do not use that code in its present form.

There is no reason why page URLs have to end .htm, .html, .php, .asp or any other such extension. Take this opportunity to "go extensionless". It will mean you can have much simpler rewrite code (and you can then also ditch the extremely inefficient -d and -f checks).

[edited by: g1smd at 4:26 pm (utc) on Nov 2, 2012]

deeper

5+ Year Member



 
Msg#: 4515006 posted 4:24 pm on Nov 2, 2012 (gmt 0)

@ditaP:
WP only allows .html for posts not for pages, check it.

@smithaa02:
Thanks for the code, which should be placed in the htaccess, but what does it do?

Where do I fill in all the page URLs like www.website.de/apples.html
www.website.de/bananas.html
...
so that WP replaces its URLs with the former .html URLs?

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4515006 posted 4:28 pm on Nov 2, 2012 (gmt 0)

That code does nothing for .html URLs.

Additionally, htaccess can never "make" a URL. No. It does the exact opposite. To "make" a URL you must link to it from the pages of your site. htaccess kicks in only after the link is clicked. By then it is too late to change the URL. When a rule matches the requested URL it either issues a redirect to a different URL and your browser then makes a new request for the new URL, OR it uses an internal rewrite to fetch content from a non-default location inside the server. This depends on whether the RewriteRule has been configured as a redirect or as a rewrite.
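
A minimal sketch of the two cases, for an .htaccess file in the site root. The paths `old-page.html`, `new-page`, `apples` and `/content/apples.php` are hypothetical and only there to show the syntax difference.

```apache
RewriteEngine On

# External redirect: the R=301 flag makes the server answer with a 301,
# and the browser then sends a brand-new request for /new-page.
RewriteRule ^old-page\.html$ /new-page [R=301,L]

# Internal rewrite: no R flag, so the browser still shows /apples
# while the server fetches the content from a different internal path.
RewriteRule ^apples$ /content/apples.php [L]
```

In both cases the rule only reacts to a request that has already arrived; neither rule changes the links printed on your pages.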

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 5:06 pm on Nov 2, 2012 (gmt 0)

Deeper, that is the standard wp .htaccess code that pipes all url requests to index.php which does the grunt work in making almost any url structure you want work.

Once you install that .htaccess code, then install that plugin 'custom permalinks', then configure it (per my original instructions), you will be able to call your wp pages whatever. Just edit the page...tab out of the title to get the url box to appear then type in whatever.html, click save and you are good. This way your new url matches your old url.

G1smd...that will work for .html urls...I use it all the time on my wp sites.

I disagree about wp making those urls...figuratively it does with the rewrite and without a redirect. Google doesn't know the difference (or doesn't care) so it is not a big issue IMO.

I agree that for new sites, you can get away with not inserting the html extensions.

The advantage of adding them for an old site is that you don't have to 301 (and don't lose 10%(?) of your page juice from old links).

Wordpress also has the annoying habit of inserting trailing slashes onto the end of pages. With custom permalinks you can avoid this nonsense and possible seo complications. mydomain.com/test and mydomain.com/test/ are technically different urls, so IMO trailing slashes should be used sparingly and only for true directories.

WP is a great system but it gets some other seo aspects wrong as well... Its canonicalization is buggy...its rel next/prev is incorrect, it creates tons of dupe pages (especially with categories/author pages/keywords) if you're not careful and it inserts annoying feed pages into your header. But with the correct fixes (mostly to functions.php) and plugins, it is a very slick platform that serves seo well.

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4515006 posted 8:38 pm on Nov 2, 2012 (gmt 0)

I disagree about wp making those urls...figuratively it does with the rewrite and without a redirect.

You can't make an url figuratively. It is either there or it isn't. What WP is doing is SERVING CONTENT from one location while pretending to live at a different location.

@deeper: The bottom line is this. Even if the only change you make is in removing the final ".html", you have still changed the URL. That means people requesting blahblah.html (from outdated bookmarks or similar) must be redirected to the naked "blahblah" form. You can't have both forms floating around unless you truly aren't concerned with Duplicate Content.
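
A hedged sketch of such a redirect, assuming the site really has moved to extensionless URLs and that these lines sit above the WordPress block in the root .htaccess. It is not taken from any post in this thread, and a home page requested as index.html would need its own rule.

```apache
RewriteEngine On

# Test what the client actually asked for (THE_REQUEST), not a path
# produced by an earlier internal rewrite, so the rule cannot loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+)\.html[?\s]
# %1 is the path captured above without ".html"; the query string,
# if any, is carried over to the new URL by default.
RewriteRule ^ /%1 [R=301,L]
```

With this in place a request for /blahblah.html gets a 301 to /blahblah, so only one form of the URL stays in circulation.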

deeper

5+ Year Member



 
Msg#: 4515006 posted 9:55 pm on Nov 2, 2012 (gmt 0)

@smithaa02:
Obviously your solution needs both the rewrite-code and the plugin.
Why not just use a plugin like
[wordpress.org...] or
[wordpress.org...] ?

@lucy24:
I know why it is necessary... DC and losing backlinks.
But I still don't know the best solution in my case...

And I still don't feel good about using a 301. Does nobody share this feeling about using a URL migration tool in order to keep the URLs?

A plugin on the other hand makes me dependent "for lifetime" and there will be changes of WP and the theme.

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 10:19 pm on Nov 2, 2012 (gmt 0)

Lucy...correct me if I'm talking about totally the wrong subject...but I still don't see the problem.

The .htaccess script (which is common not only with wordpress ( [codex.wordpress.org...] ), but with drupal and other CMS'es), merely asks Apache...does a file with that name already exist? If not, send the request to a php script (index.php) which will match the url values to database values to see if the page exists. If it does it will be served up no problem...if not, wp displays a 404 page.

The end user or google doesn't have a clue that this happens nor cares...even if they examined the request and response headers...because this is orchestrated by apache behind the scenes.

There won't be duplicates nor stemming. If I create a wp page that says sample.html...only that is served. Not sample.htm nor sample (the latter two result in 404's assuming the .htaccess is configured as I showed). On my wordpress setup sample.html/ redirects immediately to sample.html which is perfect. A lot of people configure their wordpress installs like this and don't have any problems.

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 10:27 pm on Nov 2, 2012 (gmt 0)

Deeper...there is no way for wordpress to rewrite say sample.html to index.php without help from apache (and therefore a .htaccess). This is a common configuration for WP and is detailed on their site here:

[codex.wordpress.org...]

You can install other plugins, but they will depend on the .htaccess lines of code (easy to copy and paste in). I have played with ".html on PAGES" myself before and like it. I have messed with a lot of custom url rewriting scripts, but 'Custom Permalinks' is still my favorite.

If you install this, it will be easy to keep your old urls and avoid the 10-15 percent page juice loss that Tedster said Cutts said you get with 301's (the alternatives being 404's, which orphan way too much page juice, or 301 stacking, which doesn't work).

As for your being dependent on the plugin...I wouldn't worry about it. The plugin is free...simple and has survived many updates. Even if orphaned by the author and facing a serious wp security upgrade, it wouldn't be difficult to hack it or to find a different plugin. Custom permalinks has worked great with many wordpress themes and wp versions for the wp sites we've done and I highly recommend it.

What should be clarified is that this is a two stage process. The .htaccess configuration removes the need for index.php in the url, and custom permalinks gets rid of the need for a trailing slash and lets you set the url to be whatever you want. Most WP users do the former, but not the latter...but should do both.

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4515006 posted 1:25 am on Nov 3, 2012 (gmt 0)

There won't be duplicates nor stemming. If I create a wp page that says sample.html...only that is served. Not sample.htm nor sample (the latter two result in 404's assuming the .htaccess is configured as I showed).

Once you are in WordPress, the "real" underlying name of the page isn't html any longer though is it? It's .php. So you're not rewriting "blahblah" to serve content from "blahblah.html" or "blahblah.php" or extension of your choice. You're rewriting "blahblah.html" to serve content from "blahblah.php". Sure, this can be done. If you know what you are doing, you can achieve just about anything with mod_rewrite.

But among Good Rewriting Practices, making extension A serve content from extension B is probably pretty near the bottom of the list.

Now, personally-- as a user-- I don't care for extensionless URLs in any case. Whenever I see one my first reaction is to tell it to go home and put some clothes on. But that's me ;)

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 2:28 am on Nov 3, 2012 (gmt 0)

Sure, this can be done. If you know what you are doing, you can achieve just about anything with mod_rewrite.
Not to be nitpicky...but this isn't some run of the mill hack. It is the official recommended implementation from wordpress and is very common (any wordpress site that doesn't have index.php prefixed in the urls is using this).

indyank

WebmasterWorld Senior Member



 
Msg#: 4515006 posted 3:33 am on Nov 3, 2012 (gmt 0)

I have to manage a redesign of old static sites whose pages have .html URLs. As the new CMS will be WordPress, and WP doesn't allow .html URLs for pages, I am thinking about a 301 redirect.


Are you sure you only want to migrate your static sites to "Pages" in wordpress? Why not Posts?

Also try this code in your functions.php for pages:

add_action('init', 'my_page_permalink', -1);
function my_page_permalink() {
    global $wp_rewrite;
    // Append .html to the Page permalink structure unless it already ends with it.
    if ( strstr($wp_rewrite->get_page_permastruct(), '.html') != '.html' ) {
        $wp_rewrite->page_structure = $wp_rewrite->page_structure . '.html';
    }
}

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4515006 posted 4:55 am on Nov 3, 2012 (gmt 0)

this isn't some run of the mill hack. It is the official recommended implementation from wordpress

We've discussed the many deficiencies in that "standard issue" Wordpress code many times before, the last being only a few weeks ago. Do not use that code in its present form.

Widespread problem: Just because a particular set of rules is built into or recommended by some big name CMS doesn't mean it's good code. Every time I think I've seen the worst, along comes someone quoting a rule that makes me shed even bigger tears. Heck, Apache itself has been known* to give examples containing "(.*)exact-text-here" -- and I don't mean examples of What Not To Do.


* Known to me, at least. If I ever find a specimen again I'll bookmark it.

deeper

5+ Year Member



 
Msg#: 4515006 posted 2:56 pm on Nov 3, 2012 (gmt 0)

@smithaa02:
I understand that I need htaccess anyway, but I don't understand why the plugin custom permalinks additionally needs the htaccess code you mentioned. The plugin description at WP says nothing about additional htaccess code, and thousands seem to use the plugin without that code. No one seems to have problems with the index.php.

Could you explain why you prefer custom permalinks compared to other plugins? Small efficient code, not influencing loading speed, what is your experience concerning that?
Or other reasons?

@indyank:
There are some reasons why "static made" posts are not a good idea, for example, when adding an onsite-blog. Of course everything can be fixed anyhow, but I'm not a coding expert and like things simple and efficient. Besides my pages are pages and Google should identify that correct in the code, not reading there "post" blabla.

That code does the same as the html-on-pages-plugin, just adds .html to all pages, to ALL and ALWAYS? Please explain, what it does.

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 3:52 pm on Nov 3, 2012 (gmt 0)

@Deeper...just a quick primer on what does what.

The .htaccess lines will do away with the need to have index.php in your urls which is bad, bad, bad. In fact if you go to /wp-admin/options-permalink.php already, it will probably already ask if you want to install those exact lines of code.

This is a great improvement, but wp still adds a lot of junk to the urls like categories and trailing slashes. You can customize this somewhat at /wp-admin/options-permalink.php but not to the extent that you can create .html pages.

Again what this does is to enable a permalink for each page to negate the need for index.php to be displayed (apache still pipes the url to index.php so the page can be generated dynamically on the fly which is a standard methodology for CMS's).

'Custom permalinks' then takes this a step further because it allows you to set at /wp-admin/options-permalink.php a custom url structure that doesn't have a trailing slash.

As for other plugins...they may do the trick...it's been a while since I played with them. My issue was mostly that they would only apply in some circumstances (like pages only and not posts...I wanted this everywhere). Some custom url plugins even had trailing slashes still (which was bad bad).

None of the plugins should influence speed too much. WP still has to do a permalink lookup regardless in the db, and these plugins just let you specify that your permalink can be for example /fakedir/fakedir2/fakepage.html. I guess overall my preference for the plugin is that it just works...there may be fine alternatives out there now.

If you are worried about speed, install w3 total cache (wp plugin) and have your web hosting company enable memcache...this will help A LOT.

As for pages vs posts...I think you are right to use pages. Logically, posts are designed to be amalgamated into collections/archives/snippets/slideshows/etc...which is not what you're after, so pages are more appropriate. Plus, if you're not careful, it is easier to create duplicate content with posts compared to pages because of how WP is structured.

I believe indyank's code still depends on the .htaccess implementation as again there is no way for wp without the help of apache/.htaccess to implement url rewriting.

deeper

5+ Year Member



 
Msg#: 4515006 posted 5:38 pm on Nov 3, 2012 (gmt 0)

@smithaa02:
Thanks for your comprehensive comments and your patience. Maybe you have a bit more.

I won't use index.php in my urls ever, that's sure.
The "bad need" of index.php in the url: Is that something I will notice with my eyes when creating new pages or something going on in the background of WP?

Why should I ever be faced with index.php in my urls apparently, when I

-set "postname" in the custom permalinks,
-edit the slug for every page individually when creating pages,
-add .html by using a plugin,
-tell WP my site URL (without index.php) in the general settings and
-set a static front page in WP



Nice to hear that the plugins are simple and fast. Thanks for the tip with memcache, I will ask my hoster.

As they are simple they should be easy to fix in the WP editor when - maybe some day - a plugin doesn't work with a new WP version and is not supported any more, right?

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4515006 posted 11:24 pm on Nov 3, 2012 (gmt 0)

index.php in the url: Is that something I will notice with my eyes when creating new pages or something going on in the background of WP

That is precisely the definition of an URL. It's what you see with your eyes. It can be completely independent of where the content really lives.

indyank

WebmasterWorld Senior Member



 
Msg#: 4515006 posted 3:04 am on Nov 4, 2012 (gmt 0)

There are some reasons why "static made" posts are not a good idea, for example, when adding an onsite-blog. Of course everything can be fixed anyhow, but I'm not a coding expert and like things simple and efficient.


Posts will be static as long as you don't update them. Even pages are dynamically generated in wordpress as the content is stored in a database.

You can read this to understand the real difference between Posts and Pages in wordpress.

[en.support.wordpress.com...]

Besides my pages are pages and Google should identify that correct in the code, not reading there "post" blabla.


For google it doesn't matter whether content is served via Posts or Pages and it wouldn't try to know it either.

indyank

WebmasterWorld Senior Member



 
Msg#: 4515006 posted 3:09 am on Nov 4, 2012 (gmt 0)

The code is a simple one. It just appends .html to the Page permalink structure using the wordpress URL rewrite object.

I believe indyank's code still depends on the .htaccess implementation as again there is no way for wp without the help of apache/.htaccess to implement url rewriting.


no, it doesn't depend on .htaccess.

phranque

WebmasterWorld Administrator phranque us a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4515006 posted 3:34 am on Nov 4, 2012 (gmt 0)

whenever you are using permalinks there is a standard wordpress mod_rewrite section added to .htaccess that internally rewrites all requested non-file/non-directory URLs to index.php.


Using Permalinks « WordPress Codex:
http://codex.wordpress.org/Using_Permalinks [codex.wordpress.org]


http://codex.wordpress.org/Using_Permalinks#Creating_and_editing_.28.htaccess.29
Creating and editing (.htaccess)

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

indyank

WebmasterWorld Senior Member



 
Msg#: 4515006 posted 3:45 am on Nov 4, 2012 (gmt 0)

When I said that the code doesn't depend on .htaccess, I meant it doesn't require you to have any custom rules of your own in .htaccess. Yes, wordpress does have a standard mod_rewrite section and you will continue to have them in your .htaccess.

phranque

WebmasterWorld Administrator phranque us a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4515006 posted 3:59 am on Nov 4, 2012 (gmt 0)

thank you for clarifying that.

essentially it means "it all happens in index.php".

deeper

5+ Year Member



 
Msg#: 4515006 posted 10:04 am on Nov 5, 2012 (gmt 0)

@indyank:
Thanks for the code. So, this would make me independent from a plugin.
But then ALL pages would necessarily have .html? Just to know...

And after updating my theme I must add the code again, right?

ergophobe

WebmasterWorld Administrator ergophobe us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4515006 posted 6:51 pm on Nov 5, 2012 (gmt 0)

Whew! There's a lot in this thread. There's one potentially really dangerous piece of advice, though, so let me hit that one first.

As for pages vs posts...I think you are right to use pages.


Emphatically, absolutely and definitely do NOT use pages if you are converting a lot of content and importing your existing URL structure. This is a BAD IDEA.

While this is formally true:

index.php which does the grunt work in making almost any url structure you want work.


there are problems. The "grunt work" is actually done by /wp-includes/canonical.php, but yes it is true that Wordpress as a system handles this internally these days rather than in a massive generated .htaccess file like it did early on. And it's also true that Wordpress can deal with any URL structure.

That said, it will deal with some of them very poorly, in particular if you are using Pages. If you have Page URLs in the form %postname% or %category%/%postname% Wordpress won't (or as of 2011 would not) scale well. I forget the reasons, but I believe it has to do with the fact that Post slugs need to be unique (regardless of what the rest of the path is, the last element must be unique, so it's a simple lookup), whereas Page slugs do not have this requirement, dramatically increasing the difficulty of the lookup, especially with a deep URL structure. I'm not sure I have that right, but here's what Sam Wood (aka Otto, core contributor to Wordpress) has to say:

Once you have about 50-100 static Pages or so, and you’re using an ambiguous custom structure, then the system tends to fall apart. Most of the time, the ruleset grows too large to fit into a single mySQL query, meaning that the rules can no longer be properly saved in the database and must be rebuilt each time. The most obvious effect when this happens is that the number of queries on every page load rises from the below 50 range to 2000+ queries, and the site slows down to snail speed.

source: [ottopress.com...]
See also: [ottopress.com...] and [core.trac.wordpress.org...] and [wordpress.org...]

The general advice is that if you think you're going to have thousands of Pages in the WP sense of that, or very very large numbers of Posts, you should do the lookup based on numeric data (date, post id) at the beginning of the URL rather than the post slug or category. Since you're importing an existing URL structure, that can't happen. Therefore, Pages are a Bad Idea(tm).

Which leads me to
indyank wrote:
Are you sure you only want to migrate your static sites to "Pages" in wordpress? Why not Posts?


I would be more emphatic and say that, in fact, you should be sure that you want Posts not Pages. Definitely NOT Pages.

Honestly, compared to that site-threatening issue, I think the rest is details.... but I'm into details.

Does nobody share this feeling about using a URL migration tool in order to keep the URLs?


I think people are sympathetic to that. It's a legitimate concern. The question is weighing your options. Generally, I agree with g1smd - file extensions on URLs are *evil*. It's only going on 16 years that Tim Berners-Lee laid this out (see "Axiom of URI Opacity"). No new site should do this. But you don't have a new site and the question is whether or not now is the time to make that transition and you can argue it both ways. If it simplifies technology upgrades present (maybe) and future (definitely), then maybe it's the time. If it's a major headache and you are uncomfortable rolling out a huge number of 301s all at once, maybe it isn't the time.

You could roll it out with the same URL structure you currently have and then change the URLs in batches internally until you have gotten rid of the extensions and then can get rid of any corresponding plugins. In general, Wordpress has come a huge way since some years ago when it was likely to be sending out soft 404s and 302s and all that. Nowadays, if you have a URL alias and you change it from within Wordpress, it will automatically handle the 301. Obviously, you have to check your own setup, especially when you're doing something custom like this, but overwhelmingly Wordpress will now handle these sorts of things intelligently and do a decent job of canonicalization below the domain level.

-------

g1smd wrote:
htaccess can never "make" a URL.


smithaa02 wrote:
I disagree about wp making those urls..


g1smd never said that "wp" can't make URLs, but that .htaccess can't. Of course, for internal links, WP is going to generate a lot of URLs (navigation, internal pingbacks, etc), the user will generate others (manual links inside a post) and, naturally, external sites will create URLs to your site. Of these three types of URLs, WP only really controls the first type, so you still have to deal with the others. Of course, it's important to get your internal linking structure right, because that is how search engines will crawl your site and will ultimately determine which URLs they have in their index, but in the case of a legacy site with old URLs inbound from other sites, you'll never have full control of who "makes" the URL and where. So .htaccess can serve as a traffic director, but it won't "make" anything.

----

g1smd wrote:
That code does nothing for .html URLs.


smithaa02 wrote:
that will work for .html urls


Again, it will work for .html URLs (and .asdfghwer URLS too) if those URL aliases are in the WP database and are a valid lookup path. But the .htaccess doesn't do anything for .html URLs. It only tests to see whether the URL points to a file or a directory, does a relatively expensive lookup to the filesystem for each of these checks, and if they fail, passes it to index.php which then does some preliminary checks and passes things on to canonical.php for all the URL parsing. So while that code will work for .html URLS given a WP setup designed to take that into account, the standard WP .htaccess does nothing with .html URLS except pass them on like any other URL.

So it's an apples and oranges discussion, which I think is leading to some disagreement where there need be none.
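
For reference, here is the standard block quoted elsewhere in this thread, with comments added that follow the description above. The comments are an interpretation for illustration; the directives themselves are unchanged.

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# A request that is already for index.php is left alone.
RewriteRule ^index\.php$ - [L]
# Filesystem check 1: only continue if the URL is not an existing file.
RewriteCond %{REQUEST_FILENAME} !-f
# Filesystem check 2: only continue if it is not an existing directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else, .html or not, is handed to index.php unchanged.
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

Nothing in these lines looks at the extension; whether /apples.html resolves to a page is decided later, inside WordPress.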

----------

smithaa02
The .htaccess script (which is common not only with wordpress ( [codex.wordpress.org...] ), but with drupal and other CMS'es),


Lucy24
doesn't mean it's good code.


It is indeed standard and common to WP and Drupal and is used without problem on millions of sites. True, that doesn't mean it's "good" code - the KSES code that was shared across Wordpress, Drupal, Moodle and others made them all vulnerable to XSS attacks in 2010. But in general, a site will work fine with the distro .htaccess and compared to the Pages/Posts issue, this is minor.

g1smd's point in mentioning it, is that the standard .htaccess file for these CMSes has somewhat expensive file system lookups that can be avoided and you can gain some easy efficiency by modernizing your .htaccess file to the more efficient one that JDMorgan, g1smd and others banged out. See the relevant threads on

WP: [webmasterworld.com...]
Drupal: [webmasterworld.com...]
Joomla: [webmasterworld.com...] and [webmasterworld.com...]


What WP is doing is SERVING CONTENT from one location while pretending to live at a different location.


This only makes sense if you do not think in terms of Tim Berners-Lee's original Axiom of URI Opacity

The only thing you can use an identifier for is to refer to an object. When you are not dereferencing, you should not look at the contents of the URI string to gain other information.


He follows up with an example in somewhat simpler language:

For example, within an HTTP identifier, even when access is made to the object, the client machine looks at the first part of the identifier to determine which server machine to talk to and from then on the rest of the string is defined to be opaque to the client. That is the client does not look inside it, it can not deduce an information from the characters in that identifier.

[w3.org...]

It is a violation of the Axiom of URI Opacity to think in terms of "pretending to live at a different location." A URL can't pretend anything. It simply points to a resource in a way that should be opaque to the user and the user agent. It's up to the script and the server to decide how this maps to a given resource which may or may not exist as a file. In other words, the URL is not "pretending" to point to a file on the file system, but is actually pointing to an object that is constructed in a black box either from a simple file lookup or a thousand rows from a hundred tables in six different databases.

Once you are in WordPress, the "real" underlying name of the page isn't html any longer though is it? It's .php.


I'm not sure what you mean by "name" here since we have URLs, meta titles, H1s and other things, but no names. Assuming you mean URL, I must again emphatically say NO!, the URL in the address bar is the "real" location of the page. The server can dereference that URL any way it wants and to see page.html being the "fake" one and index.php being the real one can't be so -- the PHP files in Wordpress have no content so they are not pages in any sense at all. Again to say otherwise violates the Axiom of Opacity. Neo, there is no real and fake. There is only the URL.

deeper

5+ Year Member



 
Msg#: 4515006 posted 7:34 pm on Nov 5, 2012 (gmt 0)

Puh, I have to think about some things and then will answer.

Thanks for your comprehensive thoughts.
.

[edited by: Robert_Charlton at 7:58 pm (utc) on Nov 5, 2012]

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4515006 posted 10:55 pm on Nov 5, 2012 (gmt 0)

@ergophobe Post #4516013 is music to my ears. You "get it". Your interpretation of my comments about "making" URLs and processing of ".html" requests, etc, is spot on.

Extensionless URLs have to be the way to go. I'd be tempted to build a custom PHP module that redirects requests for old style URLs to new extensionless URLs, especially where the new URLs have a path that begins with a unique ID number before the slug text. The site can then operate in a much more efficient way, especially if the standard junk WP htaccess code is replaced with something more usable.

I also find discussion of "real location" and "pretending" to be confusing and a bit unhelpful. Jim had a simple boiler plate text which he, and I, repeated often.

It went along the lines of... "URLs" exist "out there" on the web. "Here" inside the server there's just "paths" and "files". The two are not at all the same thing. They are merely "related" by the action of the server software. htaccess cannot "make" a URL. URLs are made in the href="" part of the links on your pages. htaccess does its thing only after that link has been clicked and a request sent to the server. The server can respond with a redirect telling the browser to make a new request for a different URL. Alternatively, content can be fetched from a non-default location inside the server, one that is different to that suggested by the path part of the initial external URL request, and this action is commonly known as an internal rewrite.

It is vital that you have a clear idea about the differences between URLs "out there" on the web and files and paths "here" inside the server and the differences between external redirects and internal rewrites. The confusing thing is that both actions can be configured with a RewriteRule with only minor syntax differences between the two.

smithaa02

5+ Year Member



 
Msg#: 4515006 posted 12:03 am on Nov 6, 2012 (gmt 0)

Ergophobe...fascinating stuff!

I did not know about posts being faster than pages or that non-numeric URLs had bloat issues.

I did some more research into the url convention issue. Apparently in WP 3.3 this was fixed ( [core.trac.wordpress.org...] )? WP 3.3 certainly did seem zippier than 3.2.

I turned on mysql logging and tested this for some test urls and I only got 30 requests per page (not bad). This site had lots of custom permalinks with the rewrite turned on and the 'custom permalinks' module activated. I certainly didn't see all the requests I had expected to based on one of those articles.

So hopefully (I could be wrong) these issues are all fixed now, as long as you have the latest WordPress?

I checked Jim's improved .htaccess code and it is intriguing. The logic is solid...if you have 1000 images on a page, then doing 1000 file checks on the server to see if those images exist is a waste, so images (and other non-CMS files) should be excluded first.

Here is his code for WP, which replaces the official recommended code:

RewriteEngine on
#
# Unless you have set a different RewriteBase preceding this point,
# you may delete or comment-out the following RewriteBase directive
# RewriteBase /
#
# if this request is for "/" or has already been rewritten to WP
RewriteCond $1 ^(index\.php)?$ [OR]
# or if request is for image, css, or js file
RewriteCond $1 \.(gif|jpg|ico|css|js)$ [NC,OR]
# or if URL resolves to existing file
RewriteCond %{REQUEST_FILENAME} -f [OR]
# or if URL resolves to existing directory
RewriteCond %{REQUEST_FILENAME} -d
# then skip the rewrite to WP
RewriteRule ^(.*)$ - [S=1]
# else rewrite the request to WP
RewriteRule . /index.php [L]
#
# END wordpress

I did try this, however, on a big WP site and honestly I didn't notice a difference. I ran an online speed test before and after, and the change made no measurable difference. I also monitored the load on the server and that didn't change either. Perhaps newer versions of Apache have ways of caching 'if file exists' queries? I don't know, but in my case the results were inconsequential. Apparently, the WP team considered making Jim's code the default, but were worried that you could then no longer make things like dynamic images or dynamic files that end in image extensions.

As for general performance, I have oodles of WP sites on dedicated servers, and I don't experience performance issues (well, I did before 3.3), so I hope we're not making mountains out of molehills. Deeper, if you're worried about performance you might ask your web hosting company for specs and their experience with WP, and find out what your estimated traffic will be. If you control the server you can test it with benchmarks...but I would think you should be fine, and if you get slammed you should be able to find a better server.

As for your specific question about index.php appearing or not appearing in the URL:

Basically, if you don't modify your .htaccess file you will be stuck with:

e.g. /index.php/url1/ or /index.php/url2/

If you have already installed the .htaccess file, then you can do:

/url1/ or /url2/

If you install 'custom permalinks', you can then do:

/url1.html and /url2.html

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4515006 posted 12:18 am on Nov 6, 2012 (gmt 0)

...and for best performance I'd be using /url1 and /url2 with RewriteRule patterns that explicitly filter extensionless requests for direct processing by index.php without all that -f and -d stuff.

The performance gain is small on most sites, but certainly comes into play when traffic is high.
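
As a rough sketch of what such a pattern could look like: the directory names below are the default WordPress ones, and the assumption that no content URL contains a dot or a trailing slash is mine for illustration. It has not been tested against a live WordPress install, so treat it as a starting point only.

```apache
RewriteEngine On

# Leave WordPress's own directories alone
# (assumes the default wp-admin, wp-content and wp-includes names).
RewriteRule ^wp-(admin|content|includes)(/|$) - [L]

# Hypothetical pattern: only requests whose path segments contain no dot,
# with no trailing slash, are handed to index.php. Requests for images,
# CSS, JS and index.php itself all contain a dot, so they never match
# and no -f or -d filesystem checks are needed.
RewriteRule ^[^./]+(/[^./]+)*$ /index.php [L]
```

With this in place a request for /url1 goes straight to index.php, while /images/logo.gif is never touched by the second rule at all.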

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved