Forum Library, Charter, Moderators: lorax & rogerd

WordPress Forum

    
Trying out HTML website to Wordpress Conversion
What to watch out for? Post & Page Links?
contentmaster

10+ Year Member



 
Msg#: 4652430 posted 9:01 am on Mar 9, 2014 (gmt 0)

I am attempting to convert a 200+ page HTML website to WordPress. I have basic knowledge of WordPress and have successfully installed it in a separate folder on my HTML website: www.mysite.com/WRH

I am willing to put time and energy into doing this carefully so that it becomes a learning experience. After this, I plan to convert another 500+ page website to WordPress too. That said, I keep running into doubts as I read more about HTML to WordPress conversion and the link issues one may face.

To begin with, I have a simple theme in place, and I have created the top nav pages for the WordPress site and filled in content from the original website. My HTML website contains several articles in different categories, and I have to create these on the WordPress website.

My Questions:
1. My html pages follow this link structure:
www.mysite.com/home-improvement/index.htm
www.mysite.com/home-improvement/study-room.htm and so on

If I add the article pages as separate posts under the category home improvement, what should I select as the permalink structure for the pages and the posts so that I do not face any broken links when I move WordPress files to my root folder?

2. What is a slug, and how do I set it for WordPress pages and posts?

3. Once I am done working on the theme and have moved the entire WRH folder to the main folder (where I currently have all my html pages), what changes should I make to the settings so that my website shows my WordPress theme?

4. In case I find errors after the move, can I go back to the previous setting so that my current site does not suffer?

5. Can you suggest a resource that explains the HTML to WordPress migration in a systematic, step-by-step manner, with all the necessary things one should watch out for?

Thanks in advance.

 

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 1:28 pm on Mar 11, 2014 (gmt 0)

IMHO - loosen up on the reins. :)

Move a chunk (50?) of the 200+ and don't try to change how WordPress writes the URLs. Do your homework on how to use Permalinks [codex.wordpress.org] and choose the best option for you. Use a redirect plugin like Redirection to reroute old URLs to the new ones. Let the dust settle and then move another chunk of the 200.
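To make the redirect step a bit more concrete, here is a minimal Python sketch that maps the old `.htm` paths from the first post to the URLs WordPress would likely produce. It assumes the permalink structure is set to `/%category%/%postname%/` and that each category `index.htm` moves to the default `/category/<name>/` archive. The helper names and the Apache-style `Redirect 301` output are illustrative assumptions only; the Redirection plugin keeps its own rules in the WordPress admin, so treat this as a planning aid rather than that plugin's format.

```python
from posixpath import splitext

def new_path(old_path: str) -> str:
    """Map an old static path to its assumed WordPress URL."""
    stem, _ext = splitext(old_path.strip("/"))   # e.g. "home-improvement/study-room"
    parts = stem.split("/")
    if parts[-1] == "index":
        section = parts[:-1]
        if not section:
            return "/"                           # site root index page
        # Category listing page -> category archive (assumed default base).
        return "/category/" + "/".join(section) + "/"
    # Article page -> post under /%category%/%postname%/ (assumed structure).
    return "/" + "/".join(parts) + "/"

def redirect_lines(old_paths):
    """Build one Apache-style 301 rule per old path."""
    return [f"Redirect 301 {p} {new_path(p)}" for p in old_paths]

old = ["/home-improvement/index.htm", "/home-improvement/study-room.htm"]
for line in redirect_lines(old):
    print(line)
```

Generating the list up front also makes it easy to move pages in chunks: only the rules for the chunk you have already rebuilt in WordPress need to go live.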

You will likely take a hit in the SERPs for a bit while the SEs figure out what you did but you will more than likely bounce right back. You can add an SEO plugin like Yoast's SEO for WordPress and gain more control if your theme doesn't already provide it. My experience has been that if I get the change over quickly, then I am able to focus on using the new platform and building on what it offers and it has always improved site performance.

The more you try to wrangle WordPress into doing exactly what you want, the more issues you will create for yourself later. Let it do its thing.
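On question 2 above: a slug is the URL-friendly form of a title, i.e. the last part of a post or page address. WordPress generates it from the title automatically, and it can be changed when editing the post or page. The sketch below only approximates that behaviour in Python for planning purposes (WordPress's own routine is PHP and handles many more cases); `slugify` is a made-up helper name.

```python
import re
import unicodedata

def slugify(title: str) -> str:
    """Rough approximation of a title-to-slug conversion (not WordPress's code)."""
    # Fold accented characters to their ASCII base where possible.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then turn every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")

print(slugify("Study Room Ideas & Tips"))   # study-room-ideas-tips
```

If the old file name was `study-room.htm`, setting the post slug to `study-room` keeps the new address as close to the old one as the WordPress permalink structure allows.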

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 1:39 pm on Mar 12, 2014 (gmt 0)

Move a chunk (50?) of the 200+ and don't try to change how WordPress writes the URLs.


That sounds like a good idea. Does that mean that the pages I move at one go will open up with the Wordpress theme and the others will show the old HTML pages?

Sorry if I'm asking a silly question. This is way confusing for me :(

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 8:38 pm on Mar 12, 2014 (gmt 0)

NP - not a silly question. :)

Yes. Just copy the text into WordPress one page at a time - just try a few to see how it works and deal with whatever issues come up. It's about getting familiar with the process that works for you.

Strip the formatting (insert your copy with the editing window set to Text) if you can and use WordPress to format the page (your theme and custom CSS can replicate the original design/formatting if you want to take the time to do that).

You can still link to your old HTML pages.

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors of the Month



 
Msg#: 4652430 posted 2:35 pm on Mar 19, 2014 (gmt 0)

I am pretty sure that there is an importer plugin out there that will let you grab all your html from your static site and will convert it into wordpress posts.

I remember using it on a site and it was OKAY but that site had so much BAD html* that it took me a while to get everything cleaned up.

So if I were going to do a large site I might try looking for that plugin and testing it out.

*By bad html I mean they had a TON of colored text to emphasize certain words and the colors were slightly off. So some text might have one color hex code and another piece of text might have a different hex code. Some links were red, some were blue, some orange, some purple. It was like a Leprechaun puked up a rainbow.


Also, each page had a hard coded bio of the author on it, but there were variations in each bit, so I couldn't just do a simple find and replace.

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 8:23 am on Mar 20, 2014 (gmt 0)

I am pretty sure that there is an importer plugin out there that will let you grab all your html from your static site and will convert it into wordpress posts.


Can you give me a name that I can search for and try out?
Right now, I am manually creating the pages as posts, planning the category structure, etc. However, carrying out this large, time-consuming exercise for a 500+ page website is unimaginable!

Well, the 500+ pages website has been made without CSS. So what we're really dealing with is tons of HTML coding in the Head and Body tags!

I remember using it on a site and it was OKAY but that site had so much BAD html* that it took me a while to get everything cleaned up.


What is involved in cleaning up the code once the plugin is used to import the pages?

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors of the Month



 
Msg#: 4652430 posted 4:29 am on Mar 22, 2014 (gmt 0)

Hi again, Contentmaster:

"Can you give me a name that I can search for and try out? "

I wish I could but it was sometime last year and I don't have access to that website any more (and I am pretty sure I removed the plugin after I imported the content anyway).

Maybe the plugin was HTML Import 2 ? I would probably give that one a try.

"What is involved in the cleaning up the code once the plugin is used to import the pages?"

Well, that depends on what you have in your html coded pages.

If I remember correctly, you should also find a search and replace plugin (not sure which one) so that you can remove unwanted code from several posts at the same time.

The best thing is to try it out and see how much code that you might need to remove gets copied over.

By the way: I think it is a lot easier if all of the content on your html pages is in a single div. Then you can tell the importer plugin to grab the content from only within a div named "content" (or whatever the name of the div on your html page is), and it will ignore content in other divs.
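As an illustration of the "grab only what is inside the content div" idea, here is a small Python sketch using only the standard library. It is not how the importer plugin works internally; it just shows why a consistently named wrapper makes extraction easy. The assumption here is that the div is marked with `id="content"`, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser

class DivTextExtractor(HTMLParser):
    """Collect the text found inside <div id="..."> and nothing else."""

    def __init__(self, div_id: str):
        super().__init__()
        self.div_id = div_id
        self.depth = 0          # 0 means we are outside the target div
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1                     # nested div inside the target
        elif dict(attrs).get("id") == self.div_id:
            self.depth = 1                      # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1                     # leaving a div inside the target

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str, div_id: str = "content") -> str:
    parser = DivTextExtractor(div_id)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)

page = """
<html><body>
<div id="nav"><a href="/">Home</a></div>
<div id="content"><h1>Study Room</h1><div class="tip">Use warm light.</div></div>
<div id="footer">Copyright</div>
</body></html>
"""
print(extract_text(page))
```

Navigation and footer text never reach the output because they sit outside the target div, which is the same reason a consistent wrapper helps any importer.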

You did put all your content into a single, consistently named div, right?

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 10:07 am on Mar 22, 2014 (gmt 0)

You did put all your content into a single, consistently named div, right?


:( I don't think so Planet13. Tell me if what I'm trying to achieve is impossible given the current circumstances...

Would it make sense to shut down the current website and start afresh by setting up WordPress and redoing the whole thing? (worried - overwhelmed - confused-)

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 12:23 pm on Mar 22, 2014 (gmt 0)

contentmaster,
I found the plugin Planet13 referred to here: [wordpress.org...]

It says (I have not tested it) it will import well-formed HTML pages that have some form of consistent structure like a <div> OR even <body>. No matter what there will likely be some clean up involved.

I suggest you try it on a small batch of pages. They don't need to be live - put a copy of the live pages in a test directory. Look at the code and find some code that is common - it may be a <div> or it may be a <td> but surely you have something that defines the content area you wish to import.

Download the plugin, try importing and see what you get. If there are issues, look for consistency in them: something you can search for and replace. Then locate a search and replace program that works on your local computer and try cleaning up the files before the import.
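A pre-import cleanup pass of that kind could look something like the Python sketch below. It assumes the clutter is old-style `<font>` tags plus `color` / `bgcolor` attributes, which is a guess based on the colored-text problem described earlier in the thread; the patterns and the `clean` helper are illustrative only. Regular expressions are a crude tool for HTML, so run anything like this on copies and check the results.

```python
import re

# <font ...> and </font> tags (the text between them is kept).
FONT_TAG = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)

# color=... / bgcolor=... attributes, quoted or unquoted.
# Caveat: this would also hit the same pattern appearing in visible text.
COLOR_ATTR = re.compile(
    r"""\s+(?:color|bgcolor)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def clean(html: str) -> str:
    html = FONT_TAG.sub("", html)      # drop the font tags themselves
    html = COLOR_ATTR.sub("", html)    # drop leftover color attributes
    return html

sample = (
    '<td bgcolor="#ffffcc"><font color="#FF0001">Sale</font> '
    "ends <font color=red>soon</font></td>"
)
print(clean(sample))   # <td>Sale ends soon</td>
```

Once the formatting is stripped, the WordPress theme and its CSS can decide how emphasis and links look across the whole site.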

Delete the pages out of WordPress and try again. Keep doing this on a small batch until you get it to work reasonably well, then you can bring in other pages. But always work in small batches until you're confident no big issues will pop up.

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 12:51 pm on Mar 22, 2014 (gmt 0)

Great! Thanks a ton. I'll try experimenting with HTML Import 2. I hope it makes my job somewhat easier.

One question:
My Questions:
1. My html pages follow this link structure:
www.mysite.com/home-improvement/index.htm
www.mysite.com/home-improvement/study-room.htm and so on


I've set up the pages, categories and post permalinks. I need one more thing -

Since my current website has an index.htm page for each category of articles (all the articles under that category are listed here), what is the easiest method to have a category index page?

Will this index page get automatically updated each time a new post / article is added under the category?

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 2:23 pm on Mar 24, 2014 (gmt 0)

In WordPress language, each Category has what is referred to as an Archive page. It is the home where all of the posts assigned to that category can be accessed. It works just like you would expect - most recent at the top, with pagination as you set it in Settings. This is automatically updated based on what posts are public and published.

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 2:37 pm on Mar 24, 2014 (gmt 0)

Thanks, but can I rename each archive page as index.htm so that I can replicate my website's link structure? I tried to rename a page but instead of index.htm it automatically kept changing to index-1.htm

I also need to add some content to each index page followed by the links to the individual article pages? Is this possible? (Thanks for all your guidance).

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 4:59 pm on Mar 24, 2014 (gmt 0)

Out of the box - no you cannot. You'll need to create custom pages to do what you're looking for. Quite possible there's a theme out there that already does this or maybe even a plugin but you'll need to research to find one.

A word of caution: the further away from the normal WordPress URI structure and functionality you venture, the harder it will be to maintain the illusion you create. :)

tangor

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors of the Month



 
Msg#: 4652430 posted 2:41 am on Mar 25, 2014 (gmt 0)

lorax is correct: Do the WP way... it is NOT an index.html top down website. It is a blog/cms software. The end functionality for the user is the same, but the actual structure is quite different.

Ultimately, I'd take the hit on the site change to a new url structure and move on from there. YMMV

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors of the Month



 
Msg#: 4652430 posted 4:03 am on Mar 25, 2014 (gmt 0)

Agree with lorax and tangor; by doing things the wordpress way, you will better avoid problems down the road.

There are a few things you can do as well to try and help yourself out:

1) I like the text editor Notepad++, and if you are on a Windows machine (or even Linux, where it can run under Wine) you might want to download it. It is very good at find and replace, so if you have blocks of code that need to be removed from a bunch of pages, you can use its find-and-replace feature.

2) See if there is a sitemap tool that can spider your html site (you might already have a sitemap anyway). Then it will be a little bit easier to do 301 redirects (if you need to) from the old URLs to the new ones.

I think the last time I needed to get a sitemap for an html site, I used Xenu's Link Sleuth (I think... not 100% sure). I am pretty sure you can use it to generate a list of all your pages. If not, I am sure there is something out there that can help.
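If you have a local copy of the html site, a list of pages can also be built without spidering anything. The Python sketch below simply walks a folder and prints the site-relative path of every `.htm` / `.html` file; it is an alternative to a link checker, not a description of what Xenu does. The `list_pages` name and the demo folder contents are made up for the example.

```python
import tempfile
from pathlib import Path

def list_pages(site_root, extensions=(".htm", ".html")):
    """Return site-relative URL paths for every HTML file under site_root."""
    root = Path(site_root)
    paths = []
    for file in root.rglob("*"):
        if file.is_file() and file.suffix.lower() in extensions:
            paths.append("/" + file.relative_to(root).as_posix())
    return sorted(paths)

# Demo on a throwaway folder standing in for a local copy of the site.
with tempfile.TemporaryDirectory() as tmp:
    section = Path(tmp) / "home-improvement"
    section.mkdir()
    (section / "index.htm").write_text("<html></html>")
    (section / "study-room.htm").write_text("<html></html>")
    (section / "logo.gif").write_bytes(b"")      # non-HTML files are ignored
    for url_path in list_pages(tmp):
        print(url_path)
```

The resulting list is the "old URL" column of a redirect plan; a file walk will not find pages that exist only as server-side rewrites, so a spider is still the better check for a live site.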

tangor

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors of the Month



 
Msg#: 4652430 posted 6:26 am on Mar 25, 2014 (gmt 0)

There is no easy way to do this; however, a client and I did manage to convert a 25,000 page site to WP in the following fashion:

HTML to txt (i.e., everything stripped out ... everything!), loaded onto a non-internet-facing (password-protected) dev site, so there were no worries about the live html/index version continuing to function (no redirects required).

He wrote a Txt to WP insert (no skin off my nose) which generated all the pages and WP link hierarchy. I suspect there are any number of similar plugs out there that do the same.

And then the work began... adding internal links, sub-linking, pics... a general nightmare. But the text and the WP content management system was satisfied and the blueprint needed was in place. AND...

Since it was not "google faced" the site continued as present and, when the html/index version was offed and the WP version was inserted the full redirect for KEY pages was in place.

A six week hit at 25% loss was rewarded with a 220% increase result that lasted until February. He's now at 100% above the initial transfer, a loss of 100%-ish.

If you intend to do this live between web and wp just make sure that every new active page has been redirected (making the html version disappear).
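One way to keep track of that during a live, chunk-by-chunk move is a small coverage check like the Python sketch below. It takes the list of old pages already rebuilt in WordPress and the redirect map, and reports any old page without a rule, plus any rule whose target is itself redirected. The paths are hypothetical examples in the style of the first post, and `audit_redirects` is a made-up helper, not part of any plugin.

```python
def audit_redirects(old_paths, redirects):
    """Return (missing, chained).

    missing: old paths that have no redirect rule yet.
    chained: rule sources whose target is itself redirected elsewhere.
    """
    missing = sorted(p for p in old_paths if p not in redirects)
    chained = sorted(src for src, dst in redirects.items() if dst in redirects)
    return missing, chained

# Hypothetical data: pages already moved, and the rules set up so far.
old_paths = [
    "/home-improvement/index.htm",
    "/home-improvement/study-room.htm",
    "/home-improvement/attic.htm",
]
redirects = {
    "/home-improvement/index.htm": "/category/home-improvement/",
    "/home-improvement/study-room.htm": "/home-improvement/study-room/",
}

missing, chained = audit_redirects(old_paths, redirects)
print("No redirect yet:", missing)
print("Redirect chains:", chained)
```

Running it after each batch gives a short to-do list before the old html files for that batch are taken down.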

You can do the same using cut and paste a few pages at a time, inserting pic links as needed, etc., redirecting, and not going nuts.

Nothing wrong with WP. Not a damn thing! But some clients... well it did make a difference and I am convinced that standardization... whether it is ordinary HTML or WP does have something to be admired.

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors of the Month



 
Msg#: 4652430 posted 2:43 pm on Mar 25, 2014 (gmt 0)

How did you do the original html to text transfer stripping out the html tags? What program did you use?

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 7:52 pm on Mar 25, 2014 (gmt 0)

Nice stats/share tangor. Appreciate the real world numbers.

contentmaster

10+ Year Member



 
Msg#: 4652430 posted 8:49 am on Mar 26, 2014 (gmt 0)

I think the last time I needed to get a sitemap for an html site, I used Xenu link sleuth


Downloaded Xenu, seems quite useful. Thanks.

You can do the same using cut and paste a few pages at a time, inserting pic links as needed, etc., redirecting, and not going nuts.


That sounds like a good idea.. not sure if I can manage not going nuts, though!

Thanks all (lorax, Planet13, tangor). This is a lot of valuable information to digest and I have my hands full for the next couple of months!

Thanks again.

lorax

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4652430 posted 5:20 pm on Mar 26, 2014 (gmt 0)

You're quite welcome. Let us know how it goes!

tangor

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors of the Month



 
Msg#: 4652430 posted 5:26 am on Mar 27, 2014 (gmt 0)

How did you do the original html to text transfer stripping out the html tags? What program did you use?

Oddly enough the prog was titled html2txt, but there are dozens of them out there... this one would just read a folder at a time (or whatever was selected) output to sameasorginalname.txt to make it easy.
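For anyone who cannot find a ready-made tool, the same folder-at-a-time idea can be sketched in a few lines of Python with the standard library. This is not the html2txt program mentioned above, just an illustration of the approach: strip every tag, skip script and style blocks, and write a `.txt` file with the same base name next to each page. The class and function names are invented for the example, and reading the files as UTF-8 is an assumption (older sites may use another encoding).

```python
from html.parser import HTMLParser
from pathlib import Path

class TextOnly(HTMLParser):
    """Keep text nodes, skipping the contents of <script> and <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skipping = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skipping:
            self.lines.append(text)     # each text node becomes its own line

def html_to_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)

def convert_folder(folder):
    """Write page.txt next to every page.htm / page.html in folder."""
    written = []
    for page in sorted(Path(folder).iterdir()):
        if page.is_file() and page.suffix.lower() in (".htm", ".html"):
            html = page.read_text(encoding="utf-8", errors="replace")
            target = page.with_suffix(".txt")
            target.write_text(html_to_text(html), encoding="utf-8")
            written.append(target.name)
    return written
```

Calling `convert_folder("home-improvement")` on a local copy would leave `study-room.txt` beside `study-room.htm`, ready to be pasted or imported as plain text.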

All best wishes!

Again, if you are going to make the change, I'd make it WP all the way. Saves on the antacids at a later date!

Planet13

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors of the Month



 
Msg#: 4652430 posted 5:36 pm on Mar 28, 2014 (gmt 0)

Thank you, tangor

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved