I'm a little nervous because of my database newbieness, so it would be great if I could run my thinking by someone. Thanks for any replies :)
Two days in, I currently have a table in my database set up so that you can call up individual widget pages via www.mysite.com/?page=fluffy-wudgets and so on. I am pulling various elements of the page, for example sections of the body text (e.g. headings), from the database. I also have things like a category field in the database table, so that I will also be able to use the page to call up information on categories.
As far as I can tell, I can build the site entirely off one PHP page, by putting different variables in the URL. Should I be using more pages? Would it be a disadvantage to have a site based on one template PHP file, or should I create multiple pages? So instead of using the same page to display categories I could also make categories.php. Does it make a difference to search engines or server load, for instance?
To be honest, I want some reassurance that I'm going about things the right way, as there seem to be so many possibilities with making a database-driven site that I could end up anywhere ;)
I'm progressing pretty well - I'm already starting to get all excitable and wonder how I ever got on without using databases ;)
>>Those pages can look like static pages without parameters with a little Apache knowledge and a little PHP coding.
Something I've been thinking about, but I can't use .htaccess (and I can't/don't want to change hosts), so I was hoping that if I kept the variables to one per URL (i.e. ?widget=fluffy-green or ?category=free-widgets) this would not present any major problems with search engine indexing.
Am I wrong in this assumption? I guess I might be able to simulate mod_rewrite with some clever coding on my template pages if it was really necessary.
Hope this helps,
George
I can use custom error pages, and this was what I was actually considering. However, if I use a 404 that sends 200 codes, then what happens when someone gets a 'genuine' 404?
I can see that there are ways around this problem, but as I'm designing the site as a way of learning mysql, I think I might see how search engines handle the single variable in the url, and worry about url rewriting further down the line.
1 - do I have the page named in $_SERVER['REQUEST_URI']? yes -> 4, no -> 2
2 - don't change the 404 headers (the server has already set them)
3 - output some pretty 404 page that you want & die()
4 - output 200 OK headers
5 - query database for the page
6 - output page
that way you can do www.widgets.com/mywidgets.htm
--> 404 page --> 200 headers --> output the page for mywidgets.htm
it's seamless to the user because it's not using a client side redirect - everything is server side
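The six steps above could be sketched roughly like this. It is only an illustration: the $pages array stands in for the database, resolve_404() is a made-up helper name, and whether the server actually lets the script override the 404 status depends on the host's configuration.

```php
<?php
// Hypothetical stand-in for the database table: request path => page content.
$pages = [
    '/mywidgets.htm' => ['title' => 'My Widgets', 'body' => 'All about my widgets.'],
];

/**
 * Decide what the custom 404 script should send for a requested URI.
 * Returns [status line, HTML body].
 */
function resolve_404(string $requestUri, array $pages): array
{
    // Drop any query string so only the path is looked up (step 1).
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || !isset($pages[$path])) {
        // Steps 2-3: leave the 404 status alone and show a friendly error page.
        return ['HTTP/1.1 404 Not Found', '<h1>Page not found</h1>'];
    }
    // Steps 4-6: the page is known, so answer 200 and output it.
    $page = $pages[$path];
    $html = '<title>' . htmlspecialchars($page['title']) . '</title>'
          . '<h1>' . htmlspecialchars($page['title']) . '</h1>'
          . '<p>' . htmlspecialchars($page['body']) . '</p>';
    return ['HTTP/1.1 200 OK', $html];
}

// In the real 404 script this would be driven by the live request.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'])) {
    [$status, $html] = resolve_404($_SERVER['REQUEST_URI'], $pages);
    header($status);
    echo $html;
}
```

Keeping the decision in a function that returns the status line makes it easy to check that genuine misses still come back as 404.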
HOWEVER - if you DO want to play with variables in URLs - know that they LOOK odd to human visitors :-) Oh, and they are likely to all show up as one page in your logs if you get a "summary" log from your virtual host. And for search engines, make sure you have a different <title> and <h1> tag on every page (as well as the obvious content differences).
With a custom 404, I guess all my links have to point at the pages that don't really exist? So, for example, if I have superwidget.com/?widget=super-widget I actually have to link to superwidget.com/widget/super-widget.htm and let the 404 deliver the correct page?
If I understand correctly, as the links are currently auto-generated from the database content, I have to convert them into the new format before they are output on the page.
Also, wouldn't the redirect from the real page with variables to the fake page with folders cause the url in the address bar to change, and wouldn't this in turn cause the search engines to think that the urls with variables are the 'correct' ones?
I'm thinking myself round in circles a bit here, so any clarification would be greatly appreciated.
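For what it's worth, converting the auto-generated links could be as small as the sketch below. The function names widget_url() and widget_link() are invented for illustration, and the /widget/ prefix and .htm suffix simply mirror the superwidget.com/widget/super-widget.htm example in the post.

```php
<?php
/**
 * Build the "fake folder" URL for a slug stored in the database.
 */
function widget_url(string $slug): string
{
    return '/widget/' . rawurlencode($slug) . '.htm';
}

/**
 * Build a complete anchor tag pointing at the fake URL.
 */
function widget_link(string $slug, string $label): string
{
    return '<a href="' . htmlspecialchars(widget_url($slug)) . '">'
         . htmlspecialchars($label) . '</a>';
}

// Prints: <a href="/widget/super-widget.htm">Super Widget</a>
echo widget_link('super-widget', 'Super Widget'), "\n";
```

Since no ?widget= URL is ever linked or redirected to, the address bar only ever shows the folder-style URL.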
blah blah
<a href="/widgets.htm">link</a>
blah blah
but.... widgets.htm doesn't exist:
click --> 404 page
the 404 page queries the database:
"is widgets.htm in the database?"
--> YES! It is!
--> Gives 200 OK headers
--> return the content (note the URL still says /widgets.htm)
that is to say - you never ever actually have any ?page= in your URL - anywhere. It's just not actually needed.
:-) enjoy it - it definitely works because I do it for:
[mysite.foo...]
[mysite.foo...]
[mysite.foo...]
[mysite.foo...]
... none of which files actually exist on my server - all are actually from the database - but the user never finds out
I didn't have access to an .htaccess file either, but I got around the problem on a smaller site by using a switch.
Basically, I exploded the url like this:
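// Split the script path on "/" and take the fourth segment as the page name.
// Index 3 assumes a fixed directory depth; adjust it to match your own URLs.
$path_info = explode("/", $_SERVER['PHP_SELF']);
$page = isset($path_info[3]) ? $path_info[3] : "";
switch ($page)
{
    case "green":
        $widget = "fluffy-green";
        break;
    case "blue":
        $widget = "blue";
        break;
    default:
        // Unknown page name: no widget selected.
        $widget = "";
        break;
}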
In the navigation, I echoed what the url was supposed to be and added the $widget variable where it needed to be to get a url like this:
my-site.com/dir/$widget
The $page variable is so when someone clicks on a url, the script knows which page to deliver.
Yes, this is a big time hack, and I don't know how it'd work on a site of any size, but it does work if you can't do anything with the .htaccess file.
And, if anyone has a better way of doing this, I'd LOVE to know how.
all extensions are .html
no params in the url
100% db driven
no 404 -> 200
no mod_rewrite
example of a page
<?php
// ID of this page's row in the database
$page_id = 1;
// add more vars here
include $_SERVER['DOCUMENT_ROOT'] . "/templates/pagetemp.php";
?>
You want to add 100 vars to your url, go crazy, add them into there. The admin scripts generate the pages when you make the entry into the db. Site looks 100% static with 0 static content.
where does the $page_id variable go in your code?
www.site.com/templates/pagetemp.php?page_id=1
which is a popular format, instead I have
www.site.com/pagename.html
of which the content is
<?php
// ID of this page's row in the database
$page_id = 1;
include $_SERVER['DOCUMENT_ROOT'] . "/templates/pagetemp.php";
?>
In both cases I am passing the variable page_id to the template, and the template uses that variable to get and display the data from the db. The first uses the classic method of query strings; the second embeds the variables in an actual html page that is parsed via PHP.
sorry, is that clearer grnidone?
<added>When you add content to the database there has to be some kind of interface. When that interface inserts the data into the db, it finds out the page_id for the row and actually writes it into an html file which it creates for that page.
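A rough sketch of what such an admin step might look like follows. The function names build_stub() and write_stub() are made up for illustration, the temp directory stands in for the web root, and it assumes (as described above) that the server is set up to parse .html files as PHP.

```php
<?php
/**
 * Return the contents of a stub page that hands $pageId to the shared template.
 */
function build_stub(int $pageId): string
{
    return "<?php\n"
         . "\$page_id = {$pageId};\n"
         . "include \$_SERVER['DOCUMENT_ROOT'] . \"/templates/pagetemp.php\";\n";
}

/**
 * Write the stub to $dir/<name>.html and return the path written.
 */
function write_stub(string $dir, string $name, int $pageId): string
{
    // Keep only safe filename characters so a page title cannot escape $dir.
    $safe = preg_replace('/[^a-z0-9-]+/', '-', strtolower($name));
    $path = rtrim($dir, '/') . '/' . trim($safe, '-') . '.html';
    if (file_put_contents($path, build_stub($pageId)) === false) {
        throw new RuntimeException("Could not write $path");
    }
    return $path;
}

// Example: the admin script has just inserted row 1 for "Fluffy Widgets".
$path = write_stub(sys_get_temp_dir(), 'Fluffy Widgets', 1);
echo basename($path), "\n"; // fluffy-widgets.html
```

The stub itself never changes after it is written; editing the row in the database changes what the template displays.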
if you have generated 1000 pages via the database that you want to access - then you really should use a 404 or mod_rewrite...
if you have 10 fixed name pages that you are just storing in the database for easy editing - then your scheme is great :-)
>>if you have generated 1000 pages via the database that you want to access - then you really should use a 404 or mod_rewrite...
I disagree completely, I can have 1000 pages or 10000, doesn't matter to me, the admin scripts manage everything. Menus, pages, sitemap, the works and all of my straight html pages rank quite nicely.
Actual space taken is only a couple of k, if that, per page. Why do you think there is a drawback?
>>Why do you think there is a drawback?
Because you are physically creating files - which must physically exist BEFORE the user requests them. Take the example of this:
I have a database full of products and descriptions of them
People go to widgets.foo/something.htm
Where something.htm is parsed for the "something", and then I find the most relevant product from the database, and return a page about that one...
So, I have in fact a very, very large number of possible pages - at least one for every word in the database, and more when you figure in combinations of more than one word.
With a large product database - I'd soon have a number of possible pages in the millions.
So, if I used your technique - I'd need to physically generate ALL those millions of pages on the server!
And I've tried creating very very large numbers of files on a server root before now! :-) Not pretty.
So - that's why I said - your technique is suitable for a small number of fixed name pages which you just want to edit - for something with a lot of pages - you really need mod_rewrite or a 404 trick.
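To make the "parse something.htm and find the most relevant product" idea concrete, here is a deliberately naive sketch. The $products array stands in for the database, the helper names are invented, and counting matching words is just one assumed way to define "most relevant"; the post does not say how the real site does it.

```php
<?php
// Hypothetical product rows; in practice these would come from the database.
$products = [
    ['name' => 'Fluffy Green Widget', 'description' => 'A soft green widget for the home.'],
    ['name' => 'Blue Steel Widget',   'description' => 'A heavy-duty blue widget.'],
];

/** Pull the "something" out of a path such as /something.htm. */
function slug_from_uri(string $requestUri): string
{
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    return strtolower(pathinfo($path, PATHINFO_FILENAME));
}

/** Return the product whose text mentions the most words from the slug, or null. */
function best_product(string $slug, array $products): ?array
{
    $words = array_filter(preg_split('/[^a-z0-9]+/', $slug));
    $best = null;
    $bestScore = 0;
    foreach ($products as $product) {
        $text = strtolower($product['name'] . ' ' . $product['description']);
        $score = 0;
        foreach ($words as $word) {
            // Plain substring match: crude, but enough to show the idea.
            if (strpos($text, $word) !== false) {
                $score++;
            }
        }
        if ($score > $bestScore) {
            $best = $product;
            $bestScore = $score;
        }
    }
    return $best;
}

$match = best_product(slug_from_uri('/green-fluffy.htm'), $products);
echo $match === null ? 'no match' : $match['name'], "\n";
```

Nothing is written to disk here: any word combination in the URL resolves at request time, which is why the number of possible pages can run into the millions without a file for each.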
I've been trying to write a suitable custom 404 page. At the moment i'm doing something along these lines:
Check the URL to look for a variable in a (nonexistent) folder structure. If there are any, extract them and call the appropriate content from the database. My problem at the moment is that although I'm delivering the correct pages, I can't seem to get the correct response codes. Despite my best efforts, status is 404 and location is 404.php.
I'm thinking that rather than using $widget to determine the content to be delivered, could I not use /widget/ instead? What I mean is, rather than looking at a query string in a URL like superwidget.com/template.php?widget=fluffy-green, could I not just get my page to extract the variables from superwidget.com/template.php/widget/fluffy-green.html instead? This way I wouldn't have to worry about response codes.
I understand that the page in question will not actually exist, however for some reason if I go to superwidget.com/template.php/anything-else-here, the server returns a 200 OK code, so I wouldn't have to worry about status codes. The server returns a 200 for any url with any amount of characters after the page name, even if I make it look like a folder, as long as the folder doesn't exist ;)
One final thing - if I call up a page that grabs content from the database, there is no 'location' in the response headers. I assume this does not present problems from a spidering point of view?
Thanks again for helping a database newbie :)
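The template.php/widget/fluffy-green.html idea above might be read along these lines. This is only a sketch: params_from_path_info() is an invented name, reading the segments as name/value pairs is an assumption, and it relies on the server filling in $_SERVER['PATH_INFO'] for the part of the URL after the script name, which not every host does.

```php
<?php
/**
 * Turn "/widget/fluffy-green.html" into ['widget' => 'fluffy-green'].
 * Segments are read as name/value pairs; a trailing .htm/.html is dropped.
 */
function params_from_path_info(string $pathInfo): array
{
    // Split on "/" and discard the empty pieces left by leading or doubled slashes.
    $segments = array_values(array_filter(explode('/', $pathInfo), 'strlen'));
    $params = [];
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $value = preg_replace('/\.html?$/', '', $segments[$i + 1]);
        $params[$segments[$i]] = $value;
    }
    return $params;
}

// On a live request the string comes from PATH_INFO; the fallback is sample data.
$params = params_from_path_info($_SERVER['PATH_INFO'] ?? '/widget/fluffy-green.html');
print_r($params);
```

The extracted value should still be checked against the database before use, since anything typed after template.php arrives here.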
On existing large sites mod_rewrite it and be done with it, there is no point to this system.
The point to this system is a ground up solution, just a CMS with SEO in mind. You want to start a site, sure, here you go. If I walked in to work on an existing large site I would use mod_rewrite or possibly some other solution.
The thing I don't see as to your method vince3 is it sounds as if you don't actually have any pages. I doubt that is true. So why do people request all of these bogus urls that you have to handle them the way you do?
Search engines need to spider all of the pages and index them, and actually ranking them matters more than just spidering them. I don't see the necessity for the 404 way. Where are all these URLs that don't belong to pages, or is this a post-production workaround for a pre-existing problem?
Summary:
Using the Jatar_K method of fixed files with include() [JK]
vs.
Using the 404 -> 200 method with include() [V3]
Comparison :
Totally transparent to browser: V3 JK
Fast : V3 JK
Allows easy content management : V3 JK
Suitable for less than 1000 pages : V3 JK
Fastest (server speed) : JK
No need for physical files : V3
Requires PHP permission to write to root : JK
Suitable for more than 10000 pages : V3
No need for .htaccess : JK
Suitable for picture files : V3 (JK? [with changes])
Really pretty root possible : V3 (.htaccess and 404.php only - if the 404.php includes your template within!)
Consolidates your debt and saves $$$$ a month: Neither :-)
Gives you a discount at the bar: J/K
:-) As I see it - anyway!
[edited by: vincevincevince at 9:11 pm (utc) on July 23, 2003]