
HTML Forum

    
Moving into 2013 from 1999
OLD Html to CMS etc
GodLikeLotus

10+ Year Member



 
Msg#: 4544822 posted 5:27 am on Feb 12, 2013 (gmt 0)

OK, where to begin? I have been running a few successful sites for the last 10 years or so; however, they were built using a very old version of Netscape (who?) Composer. For you new guys, this is a single-page editor, and many thousands of hours of work have gone into building one site of around 900 static HTML pages. They are mainly articles and directory pages, built with loads of cells and tables.

Just to add more problems here, each page is NOT exactly the same in terms of the design template. Some of the navigation at the bottom of the pages is different, again on some pages.

Where do we even begin to start in terms of bringing our sites up to date?

My uneducated guess, is that we are going to have to cut and paste everything we have created in order to move forward here.

Any advice would be much appreciated.

 

incrediBILL

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4544822 posted 5:53 am on Feb 12, 2013 (gmt 0)

My uneducated guess, is that we are going to have to cut and paste everything we have created in order to move forward here.


That's one approach - or hire a guy to write a script to extract the data from all your pages and import it into a new format.

The nice thing about doing it in automation is if you find you made a mistake you correct it and run it again, you're not 1K pages of manual labor into the task when you start over.

lucy24

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4544822 posted 9:04 am on Feb 12, 2013 (gmt 0)

Some of the navigation at the bottom of the pages is different, again on some pages.

That's trivial. Just chop it off-- you can find a script to do that* --and reapply something global. It's the meat of the pages that will be the problem. I'll bet you have style names that mean one thing on one page and something entirely different on another page.


* She said, glibly, although I haven't even got one of those clever text editors that can change a bunch of files concurrently without opening them. I just keep telling people they exist, no problem, everyone else can find one.
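
A minimal sketch of the "chop it off and reapply something global" idea, in Python. It assumes the old bottom navigation starts at a recognisable marker; the `<!-- bottom nav -->` comment and the replacement footer are hypothetical, so the pattern would need adjusting to whatever the real pages contain.

```python
import re

# Hypothetical marker: assumes the old bottom navigation begins at this comment.
NAV_START = re.compile(r"<!--\s*bottom nav\s*-->", re.IGNORECASE)
BODY_END = re.compile(r"</body\s*>", re.IGNORECASE)


def replace_bottom_nav(html: str, new_footer: str) -> str:
    """Cut everything from the nav marker up to </body> and insert new_footer."""
    start = NAV_START.search(html)
    if start is None:
        return html  # marker not found: leave the page untouched for manual review
    end = BODY_END.search(html, start.end())
    if end is None:
        return html  # no closing body tag after the marker: also left untouched
    return html[:start.start()] + new_footer + html[end.start():]
```

Pages that come back unchanged are the ones to open by hand, since their navigation did not match the assumed marker.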

4serendipity

10+ Year Member



 
Msg#: 4544822 posted 8:55 pm on Feb 12, 2013 (gmt 0)

I would suspect that it would be fairly easy to strip out the content. I would think that using a combination of curl and regular expressions to extract the content would be a good place to start.

The difficulty of the automated process would increase with the level of inconsistency in the site's pages. However, I'd suspect that even having to tweak a script a good bit would be preferable to copying and pasting ~900 pages. Also, with the automated approach, you could run the extracted html through tidy to help get outdated markup up to snuff.
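
As a rough illustration of the extraction step, here is a sketch in Python. Note that it swaps the curl-plus-regular-expressions approach for the standard library's `html.parser`, which tends to cope better with nested tables; the class and function names are made up for this example, and it assumes the pages are already on disk or fetched into a string.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect the <title> text and the visible text chunks of a page."""

    SKIP = {"script", "style"}  # text inside these tags is not content

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif self._skip_depth == 0:
            self.chunks.append(text)


def extract(html: str):
    """Return (title, list of text chunks) for one page of old markup."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.chunks
```

This only pulls out text, so links, images and headings would need extra handling before the result is usable as a new page body.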

GodLikeLotus

10+ Year Member



 
Msg#: 4544822 posted 10:35 pm on Feb 12, 2013 (gmt 0)

Thanks for some of the advice here. I am more than a little concerned about the URLs. Over the years we have been linked to by many good quality web sites, however the URLs look something like this:

domain.com/article-directories/blue-widgets.html

It has been suggested to us that we have a custom-built WordPress template designed, and then we gradually move the content over. Can the URLs be kept the same or do we need to set up hundreds of redirects?

I find the thought of moving everything quite nerve-racking.

I am well aware that the outcome could be disastrous if done wrong, or we upset the Google machine, but I just don't know what to do.

For years we have stuck with the idea of having the best site within our niche, which we have done and continue to do. But we still create and publish new articles using very old html.

tedster

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4544822 posted 11:06 pm on Feb 12, 2013 (gmt 0)

You can be more at ease about Google than just a few years ago. The number of ranking factors that are not dependent on your mark-up is way up. This doesn't mean that you can't mess things up - it's still certainly possible. But it is not as common as it once was. Just develop your new site in a test environment and be fastidious about quality control before launch.

One big place to mess up is definitely URL problems, especially introducing new but unintended versions of URLs that still resolve. This kind of thing can act like a slow poison to a site, rather than an immediate and obvious issue - and that makes it nasty.
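
One way to look for unintended URL variants before launch is to reduce every URL found in a crawl or server log to a canonical key and see which keys collect more than one spelling. The Python sketch below assumes a particular notion of "same page" (www and non-www hosts, `index.html` versus the bare directory, and query strings all treated as equivalent); those rules are assumptions and would need matching to the actual site.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def canonical_key(url: str) -> str:
    """Reduce a URL to host + path so equivalent spellings compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    # Scheme, query string and fragment are deliberately ignored.
    return host + path


def find_duplicates(urls):
    """Map each canonical key to its URL variants, keeping only keys with 2+."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Any key that shows up in the result is a page reachable under several addresses, which is the situation to fix with a redirect or a canonical link.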

Whether you can easily preserve existing URLs or not depends on many factors, but you are best off to look for ways not to introduce URL changes - especially at the same time that you introduce new mark-up. Of course, some people just bite the bullet and change everything at once, thinking "I'll just get the pain over with and then deal with whatever happens."

That's a brave approach but I've seen businesses end up happy with the final results. At least it lets you move the focus back to content and off technical issues a bit faster. But if you go this way, you should project an income drop and have a financial "war chest" available, just in case you need to bump up advertising after a loss in organic traffic.

Swanny007

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4544822 posted 1:15 am on Feb 13, 2013 (gmt 0)

Ideally you don't want any URLs to change. So redirects are less desirable than keeping the URL the same. Google can handle design/template changes no problem. But change a bunch or all of your URLs and you will be hurting...

Personally I wouldn't use a CMS unless there was a strong need for it. In fact, I've taken a WordPress site and turned it into a static HTML site simply because it gave me more control over the layout and I didn't really need people commenting on the articles anyway. There is another level of issues with running a CMS: script updates, script vulnerabilities, the additional load on the server, comment spam, database backups, etc. but there are also benefits. Personally I shy away from CMSs unless absolutely necessary.

I say if you're used to doing the old HTML pages then keep doing it, but update the template, and keep the URLs the same.

If you use an app like Dreamweaver you can do a global search-and-replace for sections of the template code (stuff in the head section, etc.) that would save a bunch of time, but in reality you would have to manually check every page when you're done to make sure the process worked right. Then again, it doesn't matter how you convert the site, you should check every single page for accuracy when you're done.

Definitely the most fool-proof way to do the conversion is one page at a time. Is it possible to use SSI/PHP includes for the new template, so you can include head.html, header.html, footer.html etc. to set up the template so you can easily change it in the future?
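
To make the include idea concrete, here is a small Python sketch that wraps a page's content in a skeleton using SSI `include` directives for head.html, header.html and footer.html. The `/includes/` directory and the function name are assumptions for illustration, and the server would have to be configured to parse the resulting files for SSI.

```python
import html

# Skeleton for a converted page. The /includes/ paths are hypothetical.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<title>{title}</title>
<!--#include virtual="/includes/head.html" -->
</head>
<body>
<!--#include virtual="/includes/header.html" -->
{content}
<!--#include virtual="/includes/footer.html" -->
</body>
</html>
"""


def build_page(title: str, content: str) -> str:
    """Return a full page: escaped title, raw content, shared parts via SSI."""
    return PAGE_TEMPLATE.format(title=html.escape(title), content=content)
```

With this layout a later change to the navigation or footer means editing one include file rather than every page.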

incrediBILL

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4544822 posted 2:18 am on Feb 13, 2013 (gmt 0)

If you can't maintain the original paths, the URL issues can be easily resolved with what's called an Apache RewriteMap, where you list all the original URLs and all the new destinations in a single file. It can also be done in a PHP script. Either way, you 301 from the old location to the new.
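
A plain-text map for Apache's `RewriteMap` (the `txt:` map type) is just one "old new" pair per line separated by whitespace, so it is easy to generate from a list of moved pages. The Python sketch below shows one way to produce such a file; the new paths in the usage example are hypothetical, and the `RewriteMap` declaration itself, which belongs in the server or virtual host configuration, is not shown.

```python
def build_rewrite_map(redirects: dict) -> str:
    """Render {old_path: new_path} as the text of a RewriteMap txt file."""
    lines = ["# old path -> new path, one pair per line"]
    for old, new in sorted(redirects.items()):
        # Keys and values are whitespace-separated, so neither may contain any.
        if any(ch.isspace() for ch in old + new):
            raise ValueError(f"whitespace not allowed in map entries: {old!r} -> {new!r}")
        if old == new:
            continue  # the URL did not change, so there is nothing to redirect
        lines.append(f"{old} {new}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mapping from an old static URL to a new location.
    print(build_rewrite_map({
        "/article-directories/blue-widgets.html": "/articles/blue-widgets/",
    }), end="")
```

Keeping the mapping in one generated file also makes it easy to rerun after a correction, in the same spirit as automating the conversion.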

Also, installing a canonical tag in the new page with the absolute path, to give Google very clear instructions that this is the new path name, seems to speed up the transition process.
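
For illustration, a canonical tag with an absolute URL can be generated per page with a few lines of Python. The site root passed in and the function name are assumptions for the example.

```python
import html
from urllib.parse import urljoin


def canonical_tag(site_root: str, path: str) -> str:
    """Build a <link rel="canonical"> element with an absolute, escaped URL."""
    absolute = urljoin(site_root, path)
    return f'<link rel="canonical" href="{html.escape(absolute, quote=True)}">'
```

The element goes in the head section of the new page, pointing at the one address that page should be known by.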

Of course, some people just bite the bullet and change everything at once, thinking "I'll just get the pain over with and then deal with whatever happens."


I've done it both ways, incremental changes or one big change, which is like ripping off a bandaid. The advantage of incremental changes is that it becomes easier to spot the effect of smaller changes and reverse them. When you change it all at once it's like launching a new site: you can't depend on what it did before, except hope those 301 redirects help keep what ranking you had intact.

Assuming the new site is all done with nice clean HTML5 and CSS3 it'll probably do well.

Whether you use a CMS makes a negligible difference unless you get very heavy traffic and use a shared server, which can become really problematic under load. I would suggest a CMS with the ability to export to a static HTML cache, the best of both worlds. I believe there are add-ons that do this for WordPress.

FWIW, as long as you have the ability to export to static HTML, and you test up front to make sure it works right, I personally wouldn't go down that path until the site showed it was slowing down. I know some people running a CMS live on very heavy traffic loads and it performs very nicely, but it also doesn't have a bunch of crazy 3rd party add-ons installed.

GodLikeLotus

10+ Year Member



 
Msg#: 4544822 posted 6:37 pm on Feb 18, 2013 (gmt 0)

Thanks again for the comments and suggestions. As the site in question does not have high volumes of traffic, fewer than 1000 unique visitors each, we have elected NOT to move to WordPress or a CMS.

Instead we are having a new template design done in HTML5 and CSS3 as suggested here. We then propose to manually move each page individually to the new design, whilst keeping the original URL. Although this will take time, it does allow us to take a fresh look at every page of the site, and make any changes that may improve the page/article.

The plan is to start with the articles that get the least traffic, allowing us to see if the new design and code improves any of our rankings.

Will post a follow up here shortly.

swa66

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4544822 posted 8:25 pm on Feb 18, 2013 (gmt 0)

Good choice IMHO.

I'd suggest to make good use of SSI (server side includes) so that footers, headers, navigation etc. are included from an easier to manage spot.

What I do nowadays:
Every page includes footer, head, header, navigation and advertising from a file with a very slowly increasing version number (it actually only increases when I need to edit all pages for the change).
Each of those files is set to parse SSIs again as needed and/or includes the references to more volatile things like the CSS: those get a version number based on a YYYYMMDDXX numbering scheme so that I can let them be cached "eternally". When I update them I only have a few files to update that reference them.
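
The YYYYMMDDXX scheme can be sketched as a small Python helper that bumps the two-digit counter when a file changes again on the same day and resets it on a new day. Starting the counter at 01 and the function name are assumptions; the post does not say how the counter is chosen.

```python
from datetime import date


def next_version(current: str, today: date) -> str:
    """Return the next YYYYMMDDXX version string after `current`."""
    prefix = today.strftime("%Y%m%d")
    if current.startswith(prefix):
        counter = int(current[8:]) + 1  # same day: bump the XX part
        if counter > 99:
            raise ValueError("more than 99 versions in one day")
    else:
        counter = 1  # new day: assumed to restart at 01
    return f"{prefix}{counter:02d}"
```

A stylesheet would then be referenced under a name carrying that number (for example a hypothetical `style.2013021801.css`), so a new version is a new URL and the old one can stay cached indefinitely.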

Aside from that, I make sure all the files are valid XML so that if I ever need to automate a change, I can, and I have the tools to do so easily.
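
A quick way to check that a file is at least well-formed XML is to run it through a strict XML parser. The Python sketch below uses the standard library's `xml.etree.ElementTree`; the function name is made up for this example, and it checks well-formedness only, not validity against any schema.

```python
import xml.etree.ElementTree as ET


def xml_error(markup: str):
    """Return None if markup parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

Named HTML entities such as `&nbsp;` are not defined in plain XML, so a strict parser like this one will report pages that use them unless they are written as numeric references.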

Swanny007

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4544822 posted 2:21 am on Feb 19, 2013 (gmt 0)

Excellent plan, GodLikeLotus. You'll definitely want to use some kind of includes for the template setup so you are a little more "future-proof" for this sort of thing down the road.

I include head.php in the head section, header.php just after the opening body tag, and footer.php just before the closing body tag. It has saved so much time in the long run, I use that everywhere (I mainly use PHP but you can use SSI includes and stick with .shtml or .html or whatever).


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved