| This 42 message thread spans 2 pages: < < 42 ( 1  ) || |
|301-redirect though URLs don't change?|
I have to manage a redesign of some old static sites whose pages have .html URLs. The new CMS will be WordPress, and since WP doesn't allow .html URLs for pages, I am thinking about a 301 redirect.
But I don't feel very good about a 301, because it is meant to manage a migration, i.e., changing URLs. In my case it would be the opposite: I would use it to keep the former .html URLs.
How will Google treat that?
Would you do this?
|It is vital that you have a clear idea about the differences between URLs "out there" on the web and files and paths "here" inside the server |
It always amazes me how well TBL had thought this stuff out by 1996, and how widely ignored it is 16 years later! The Axiom of Opacity is fundamental.
|I did some more research into the url convention issue. Apparently in WP 3.3 this was fixed ( [core.trac.wordpress.org...] )? WP 3.3 certainly did seem zippier than 3.2. |
Thanks so much! I didn't know this had been fixed. It persisted for years in WP. Glad to hear I can take it off my list of concerns.
It's too late to edit my original post, but hopefully everyone who reads it will read on to your link.
|if you have 1000 images on a page, then doing 1000 file checks... didn't notice a difference. |
Honestly, I think to notice a difference, you would need to be i/o constrained. If you have low traffic or you're CPU constrained, I think it would be rare to see much of a difference. But if you have lots of concurrent users reading and perhaps even writing files (e.g. image uploads or simply writing comments to the DB in a setup where the DB server and web server are on the same disk), you might start to see something there.
you really tied all the loose ends together there...
The performance loss for the -f and -d tests depends a lot on the web server OS and how it is tuned.
For example, BSD with sufficient memory normally works well for this with its dynamically optimized disk caching.
The existence check is only "expensive" the first time.
Aren't axioms fun? They're the scientist's equivalent of "Because I say so, that's why." But calling something an axiom doesn't make it so; this isn't a+b = b+a we're talking about. You could with equal propriety use the linguistic model of marked vs. unmarked. There's more than one way to approach the same subject.
The file index.php may not in and of itself have user-viewable content. What it does have is a physical existence on the server.
Isn't Otto saying that the problem of scaling complex URL structures to a large number of pages has been fixed in WordPress 3.3?
Anyway, I would prefer to migrate static HTML pages to WordPress Posts, unless you have something like a hierarchy of Pages, as WordPress Posts are not designed to be hierarchical.
|And after updating my theme I must add the code again, right? |
That piece of code goes into the functions.php file of your theme. When you change themes, you will have to make sure that the code is retained in the functions.php file of your new theme.
When you are merely updating your theme to a higher version, you might not have to do it unless the theme developers change the content of functions.php. Good theme developers usually provide a separate <custom>_functions.php file for you to add your own custom code. If there is one, the code goes there and you needn't worry about it when updating your theme.
Thanks to you all, excellent forum.
So the slowdown with pages is fixed whichever permalink setting I choose in WP, and even if I create or edit the slug manually? Or only when using "%postname%" as the custom permalink structure in WP?
OK, URL extensions are not good for technical and performance reasons, but doing a real URL migration just because of that...
What about the trailing slash?
After inserting the code and finishing the rebuild with WP, I will add some new pages which do not exist at the moment. These pages then MUST have .html too, I don't have a choice, right? It's prefixed by the code.
Trailing slash URL denotes a folder or the index page in a folder.
Non-trailing-slash URL is a page (however if the name matches a folder the DirectorySlash directive performs a redirect adding the slash).
URL with extension is a file (but might be a page).
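Here is a rough sketch of those three conventions in Python, in case it helps to see them side by side. The function name and the returned labels are made up for illustration, and this only looks at the shape of the URL path; it says nothing about what the server actually does with the request.

```python
import posixpath
from urllib.parse import urlsplit


def classify_url(url: str) -> str:
    """Label a URL by the shape of its path only (hypothetical helper)."""
    path = urlsplit(url).path
    # A trailing slash (or an empty path) reads as a folder or its index page.
    if path == "" or path.endswith("/"):
        return "folder-or-index"
    # An extension on the last path segment reads as a file.
    last_segment = posixpath.basename(path)
    if posixpath.splitext(last_segment)[1]:
        return "file-like"
    # Anything else reads as a page.
    return "page"
```

For example, `classify_url("http://www.example.com/folder/this-page")` gives `"page"`, while the same path with `.html` on the end gives `"file-like"`.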
|Isn't Otto saying that the problem of scaling complex URL structures to a large number of pages has been fixed in WordPress 3.3? |
Yes, my bad and already caught by smith. Sorry about that - going too fast. It's now relatively hard to break this:
|In theory, you could break this by making lots and lots of Pages, if you also made their hierarchy go hundreds of levels deep and thus make the loop operation take a long time. |
|this isn't a+b = b+a we're talking about. |
No, it's really more of a Really Well-Thought-Out Guideline than an Axiom. I have to say that for me TBL just consistently gets it right on URLs. He had the foresight to see that websites with URLs would be around for a very long time, but the technology that runs those sites would change over time.
A fundamental rule of good design is programmatic abstraction. The Axiom of Opacity is really just a corollary to the idea of an API to a class having a set of public methods and properties which should tell you nothing about what happens inside the class itself. Lots of experience creating stable and maintainable code has shown the importance of this sort of abstraction that allows you to completely change the underlying technology without breaking the API.
So when you look at a URL schema that tells the user or user agent something about the underlying technology, it's a mistake in my book.
|What it does have is a physical existence on the server. |
Yup, and that's none of the user's business. This is why the Axiom of Opacity is so important. Since I have nothing in my URL that is intimately tied to the underlying technology, I can completely change the technology, do away with the index.php file entirely, and the user need not be inconvenienced in any way by such low-level implementation details.
I get this flexibility and maintainability advantage precisely because the "page" location is not index.php and is not tied to a specific location on the file system. Rather, it's in the place pointed to by the URL.
To reiterate in abbreviated form what g1smd said, URLs are URLs, and system paths are system paths. It's hard for me to see it any other way.
|After inserting the code and finishing the rebuild with WP, I will add some new pages which do not exist at the moment. These pages then MUST have .html too, I don't have a choice, right? It's prefixed by the code. |
Yes, the .html extension gets appended (suffixed) to all page permalinks, including those of the new pages that you create.
There is a way to get around it but it gets complex.
How many HTML pages do you have currently? If there are only a few, you could retain and serve them as static HTML files on your new setup and go extensionless for your new pages in WordPress. I can't think of an easier solution right now.
|What it does have is a physical existence on the server. |
One of the most important bits of the server configuration is the ServerName or ServerAlias directive. It defines (in conjunction with the DNS settings on the DNS server) what hostname requests the server will respond to, instead of (err, in addition to) requests for a particular IP address.
Another important directive is the one that defines DocumentRoot. This defines the base folder on the server hard drive that will be web-accessible. Everything above that (the server OS, etc.) will not be accessible from the web.
If DNS and servers didn't act the way they did, then rather than accessing a site using
http://www.example.com/folder/this-page you would be using
http://188.8.131.52/c:Documents and Settings/Users/Jim Doe/My Documents/Websites/Shop Web Site/May 2012 Version/folder/page.html to access the site and when Jim changes his PC out for a Mac, the entire folder structure "above the site" would be different.
We take this functionality for granted, but URLs are a reference system used out on the web and files are a separate reference system used inside the server. They are related merely by the action of the server software mapping URL requests to internal server locations. The server doesn't make URLs for files. The server responds to URL requests by looking for particular files inside the server. A rewrite works "exactly backwards" to the way that most people seem to think about what is happening. A rewrite causes the server to look for a file that is not the default file suggested by the path part of the URL. Once you "get" that, you can do so much more with your site's technical setup.
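To make that concrete, here is a toy model of the mapping in Python. It is only a sketch of the idea, not how Apache works internally: the document root, the rewrite table and the `resolve` function are all invented for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical document root; on a real server this comes from the config.
DOCUMENT_ROOT = PurePosixPath("/var/www/example")

# Hypothetical rewrite table: requested URL path -> internal file to use instead.
REWRITES = {
    "/folder/this-page": "index.php",
}


def resolve(url: str) -> PurePosixPath:
    """Return the internal file the toy server would look for."""
    url_path = urlsplit(url).path
    target = REWRITES.get(url_path)
    if target is None:
        # Default mapping: the file suggested by the path part of the URL.
        # (No protection against ".." here; this is a toy, not a server.)
        target = url_path.lstrip("/")
    return DOCUMENT_ROOT / target
```

The request goes in as a URL and a file location comes out. The rewrite entry changes which file is looked up; the URL the visitor sees is untouched, and no file ever "gets" a URL.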
It's a comment that Jim used to post on a regular basis here. I'll readily admit that I didn't "get" it until at least the third reading.
It is not only one site; there are several to relaunch. At the moment none of them has more than 50 pages. But two or three of them should grow slowly but steadily - that's one of the reasons for a CMS now.
Maybe an onsite blog will be added.
So there are two ways:
Using the plugin, retaining the existing pages with .html, while new additional pages won't get the .html. Then I would have two different URL types for pages on one site, which is not so nice in my eyes, but I would get rid of the .html at least partly.
Or using your code, so that all pages, including future new ones, will have .html. One URL type consistently, but a growing number of .html URLs.
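The two options can be compared in a few lines of Python. This is just an illustration of the resulting URL patterns: the slugs are invented, and the "/slug/" form for extensionless pages is an assumption about the permalink setting, not something taken from the plugin or the functions.php code.

```python
# Hypothetical slugs of the pages that exist today as static .html files.
LEGACY_SLUGS = {"about", "contact"}


def page_url(slug: str, html_everywhere: bool) -> str:
    """URL path a page would get under each of the two options."""
    # Option 2 (html_everywhere=True): every page, old or new, ends in .html.
    # Option 1 (html_everywhere=False): only the old pages keep .html.
    if html_everywhere or slug in LEGACY_SLUGS:
        return f"/{slug}.html"
    # Assumed extensionless form for new pages under option 1.
    return f"/{slug}/"
```

Under either option `page_url("about", ...)` stays `/about.html`, so the existing URLs do not change. The only difference is whether a new page such as "news" becomes `/news/` or `/news.html`.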