My very question too, Sly. What are the consequences of changing static to dynamic content? Are there any pointers from someone who has gone through the change? Is the best approach a mix of static and dynamic?
First, build the front end to the database so the URLs look like normal static files; that is, carry the parameters in the URL path itself. Don't use ?a=b unless it is for a search interface or something similar. Instead of [site...] make it [site...] and parse the information out of the URL.
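A rough sketch of what "parse the information out of the URL" could look like in Python. The /catalog/<category>/<id>.html layout, the example.com host and the function name are made up for illustration; your own namespace will differ.

```python
from urllib.parse import urlsplit


def parse_static_style_url(url):
    """Split a path such as /catalog/books/42.html into lookup parameters.

    The /catalog/<category>/<id>.html layout is a hypothetical example.
    Returns None when the URL does not match that layout.
    """
    path = urlsplit(url).path
    # Drop the empty segment produced by the leading slash.
    segments = [s for s in path.split("/") if s]
    if len(segments) != 3 or segments[0] != "catalog":
        return None
    category, filename = segments[1], segments[2]
    # The ".html" suffix is cosmetic; the number before it is the record id.
    item_id, _, extension = filename.rpartition(".")
    if extension != "html" or not item_id.isdecimal():
        return None
    return {"category": category, "id": int(item_id)}
```

The returned dictionary holds the same values a ?category=books&id=42 query string would have carried, so the database lookup behind it does not need to change.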
Next, make sure all your old URLs still work, if necessary by setting up redirects for each and every one of your old files. The best solution would probably be to maintain your old namespace while moving from static to dynamic content.
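One way to keep track of the redirects, sketched in Python. The paths in the table are invented examples, and how you hook this into your server depends entirely on your setup; the point is just a lookup table plus a check that no old file was forgotten.

```python
# Hypothetical mapping from old static file paths to their new locations.
OLD_TO_NEW = {
    "/widgets.html": "/catalog/widgets/",
    "/about_us.htm": "/about/",
}


def resolve_legacy_path(path):
    """Return (301, new_location) for a moved page, or None if it has not moved.

    301 is the HTTP status for a permanent redirect.
    """
    target = OLD_TO_NEW.get(path)
    if target is None:
        return None
    return 301, target


def unmapped(old_paths):
    """List old paths that have no redirect yet, so none gets forgotten."""
    return sorted(p for p in old_paths if p not in OLD_TO_NEW)
```

Running unmapped() over a listing of the old site's files before the switch shows which pages would otherwise turn into dead links.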
How about leaving the old HTML files there, making some new dynamic ones, and linking from the index page to both so they both get crawled? Or am I just setting myself up for a PageRank of 0 for having duplicate content?
There's really no trick here. In the olden days (i.e., before summer 2002) there was always the question of whether or not a dynamic page would get crawled.
That's not the case anymore - just keep the querystrings short and make it so that there's no "superfluous" data in the strings.
To make a map, just make a master/detail list and the bot will crawl the master list - you should have one of those anyway.
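For what it's worth, a master list can be as plain as this Python sketch. The /detail/<id>.html link pattern and the record fields ("id", "name") are assumptions for the example, not anything your database has to match.

```python
from html import escape


def build_master_list(records):
    """Return a plain HTML page with one link per record (the "master" list).

    Each record is assumed to be a dict with "id" and "name" keys, and the
    /detail/<id>.html link pattern is a hypothetical example.
    """
    items = []
    for rec in records:
        href = "/detail/%d.html" % rec["id"]
        items.append('<li><a href="%s">%s</a></li>' % (href, escape(rec["name"])))
    return (
        "<html><head><title>Site index</title></head><body><ul>\n"
        + "\n".join(items)
        + "\n</ul></body></html>"
    )
```

Every detail page is then one click away from a single page the bot can reach.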
Even if you aren't much of a data developer, this project sounds easy: one table with a dozen or so fields in it for all of the site information. You could probably get someone to make it for you for under $300 (TRANSLATED: 30 hours of redundant data entry).
There must be a trick here, Grumpus. I just acquired a client who had no backlinks, but who has other (paid for) pages pointing to them that are ASP, PHP and SQL based. Those links are not being picked up by Google, and that is currently my reason for being here. I suspect that Google is not quite there yet as far as dynamic databases are concerned, and I would like to give him/her/it some good advice. Should I tell them to use a complicated system of dynamic retrieval, or should I advise static HTML pages from the get-go? I know that we can alter the appearance of dynamic pages, but is it worth the effort?
As far as I understand, only links from pages with PR>=4 are shown by Google.
I believe it is worth the effort to cloak dynamic pages as static, because it gives the user a meaningful and understandable namespace. If it helps being indexed by SEs that is an added benefit.
I have many pages in Google which are like this:
These typically show one back link from a PR5 and carry PR4 themselves. But some of these are from PR4s and show no backlinks but are still in the database. I think you get into trouble where your pages look like this:
Gbot seems to have more trouble with multiple variables or avoids them altogether. IMHO, that is precisely where you want to make the URLs look like this:
Another site I'm involved with is in a database form and has absolutely no trouble getting pages spidered. That site uses 3 or 4 variables per page, but we rewrite the URLs so that the variables look like subdirectories. So I'd have to conclude that databases where the URLs look "normal" are the way to go.
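To illustrate the "variables as subdirectories" idea on the link-writing side, here is a small Python sketch. The variable names and their order are hypothetical; the only rule is that the position in the path tells you which variable it is.

```python
from urllib.parse import quote

# Hypothetical variable order; the position in the path identifies the variable.
PATH_ORDER = ("region", "category", "item")


def build_path(params):
    """Turn {"region": "uk", "category": "books", "item": 42} into /uk/books/42/."""
    # safe="" makes quote() escape "/" inside a value so it cannot add a segment.
    segments = [quote(str(params[name]), safe="") for name in PATH_ORDER]
    return "/" + "/".join(segments) + "/"
```

Links written this way carry three variables without a single "?" or "&" in sight.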
Any evidence that Google doesn't like cloaking of database pages to HTML?
Any reason they might not in the future?
My web site, with hundreds of pages, consists nearly exclusively of dynamic Active Server Pages with .asp? extensions. I am well ranked, and IMO Google follows all links as long as the parameter list behind the "?" is not too long and the links are not too deeply buried. Unless there is a way of transforming dynamic pages into static ones automatically and without any major effort or problems, I don't think it is worth doing.
If the database you are constructing is ODBC compliant or Access driven, you can use several Windows desktop programs that can create web pages based on a template page and content pulled from the database. It took me half a day to create and debug the template page. After the system was set up, it took five minutes to publish 2000 pages. ;)
Sticky me if you want a list of DB to static HTML/Web programs.
An alternative solution is to use some sort of third party CMS software which generates static pages from the database and simply updates the static pages when the database is updated (or at certain intervals).
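If you'd rather roll your own than buy a program, the template-plus-database approach is not much code. This Python sketch swaps in SQLite for the ODBC/Access source mentioned above, and the "pages" table, its columns and the template are all invented for the example.

```python
import sqlite3
import tempfile
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; $title and $body are filled in per row.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def publish(conn, out_dir):
    """Write one static .html file per row of the (assumed) pages table."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    rows = conn.execute("SELECT slug, title, body FROM pages ORDER BY slug")
    for slug, title, body in rows:
        html = PAGE.substitute(title=escape(title), body=escape(body))
        path = out_dir / (slug + ".html")
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


def demo():
    """Publish a one-row in-memory database to a temporary folder."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (slug TEXT, title TEXT, body TEXT)")
    conn.execute(
        "INSERT INTO pages VALUES (?, ?, ?)",
        ("widgets", "Widgets & Co", "All about widgets."),
    )
    with tempfile.TemporaryDirectory() as folder:
        paths = publish(conn, folder)
        result = paths[0].name, paths[0].read_text(encoding="utf-8")
    conn.close()
    return result
```

Re-running publish() whenever the database changes (or on a schedule) gives the same effect as the CMS approach: the bot only ever sees plain static files.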
Most of the trick here is to make sure that the database outputs to something that looks like plain old HTML when you look at the source. The name of the page seems irrelevant (asp/php/etc.) so long as what the bot sees looks like HTML.
I think the lingering problems many folks are having getting dynamic pages indexed are because the (resulting) page source is just too damned complicated. On all the sites where I've heard complaints, it is invariably a page with so much muck that the bot gets confused.
Another common problem is not making a dynamic title to go along with the dynamic content. Googlebot hits a few pages and sees lots of words on the page, but the title on all of them is "My Site: The Best Site on the Web!". Therefore, they are all about the same thing despite the content, so there's no need to index more than one page.
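Pulling the title from the same record as the content is a one-liner in most languages. A Python sketch, where the record fields ("name", "category") and the site name are placeholders:

```python
from html import escape

SITE_NAME = "My Site"  # hypothetical site name


def page_title(record):
    """Build a <title> that leads with the page's own subject.

    The record is assumed to be a dict with "name" and "category" keys.
    """
    return "<title>%s - %s | %s</title>" % (
        escape(record["name"]),
        escape(record["category"]),
        escape(SITE_NAME),
    )
```

Two different records now produce two different titles, with the site name pushed to the end where it does the least harm.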
And, as mentioned, there can't be stuff in the URL that doesn't affect the content of the page. In other words, don't use session IDs, click tracking codes, or other things like that.
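If the extra parameters are already baked into your links, something like this Python sketch can strip them before the links go out on the page. The parameter names in the set are guesses at common ones; substitute whatever your own scripts actually append.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of parameters that do not change what the page shows.
NOISE_PARAMS = {"sid", "sessionid", "phpsessid", "clickid"}


def clean_url(url):
    """Drop session and tracking parameters, keeping everything else in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NOISE_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

The same page then always has the same URL, no matter which visitor or click led to it.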
Everyone seems to be making "work" out of nothing. Do it clean and uniformly in the first place and you won't be worried about "tricks" to get the bot in later.