Forum Moderators: coopster

another mysql question

just want to check i'm on the right path ;)

         

pixel_juice

10:01 pm on Jul 17, 2003 (gmt 0)

10+ Year Member



I'm building a site to help me learn mysql. My test site is just a site about something I enjoy, albeit in a highly competitive area.

I'm a little nervous because of my database newbieness, so it would be great if I could run my thinking by someone. Thanks for any replies :)

2 days in, I currently have a table in my database setup so that you can call up individual widget pages via www.mysite.com/?page=fluffy-wudgets and so on. I am calling various elements of the page for example sections of the body text (e.g. headings) from the database. I also have things like a category field in the database table, so that I will also be able to use the page to call up information on categories.

As far as I can tell, I can build the site entirely off one php page, by putting different variables in the url. Should I be using more pages? Would it be a disadvantage to have a site based on one template php file, or should I create multiple pages? So instead of using the same page to display categories I could also make categories.php. Does it make a difference to search engines or server load for instance?

To be honest, I want some reassurance that I'm going about things the right way as there seems like so many possibilities with making a database-driven site that I could end up anywhere ;)

olwen

10:31 pm on Jul 17, 2003 (gmt 0)

10+ Year Member



Your approach sounds good.

Those pages can look like static pages without parameters with a little Apache knowledge and a little PHP coding. There are several ways of doing this which have been discussed in this Forum.

pixel_juice

9:16 am on Jul 18, 2003 (gmt 0)

10+ Year Member



Thanks olwen :)

I'm progressing pretty well - I'm already starting to get all excitable and wonder how I ever got on without using databases ;)

>>Those pages can look like static pages without parameters with a little Apache knowledge and a little PHP coding.

Something I've been thinking about, but I can't use htaccess (and I can't/don't want to change hosts) so I was hoping that if I kept the variables to one per url (i.e. ?widget=fluffy-green or ?category=free-widgets) that this would not present any major problems with search engine indexing.

Am I wrong in this assumption? I guess I might be able to simulate mod-rewrite with some clever coding on my template pages if it was really necessary.

gbaker123

10:22 pm on Jul 22, 2003 (gmt 0)

10+ Year Member



Limiting the url variables should help you in the search engines. Keep it to one or at most two. However, I would seriously consider changing hosts down the road. Get one with .htaccess ability. I went from URL variables to mod_rewrites and the number of pages indexed went up exponentially.

Hope this helps,
George

vincevincevince

10:29 pm on Jul 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if you can't use .htaccess - how about setting a custom 404? some virtual hosting supports this through a web-management system - may be worth a quick query...

with the 404 you just send out 200 headers, then return the content - and it's as if the page was in fact found

pixel_juice

1:04 am on Jul 23, 2003 (gmt 0)

10+ Year Member



>>how about setting a custom 404?

I can use custom error pages, and this was what I was actually considering. However, if I use a 404 that sends 200 codes, then what happens when someone gets a 'genuine' 404?

I can see that there are ways around this problem, but as I'm designing the site as a way of learning mysql, I think I might see how search engines handle the single variable in the url, and worry about url rewriting further down the line.

vincevincevince

7:57 am on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



for your 404:

1 - is the page named in $_SERVER['REQUEST_URI'] in the database? yes -> 4, no -> 2
2 - don't change the 404 headers (as already made)
3 - output some pretty 404 page that you want & die()
4 - output 200 OK headers
5 - query database for the page
6 - output page

that way you can do www.widgets.com/mywidgets.htm
--> 404 page --> 200 headers --> output the page for mywidgets.htm

it's seamless to the user because it's not using a client side redirect - everything is server side
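
A minimal sketch of the steps above, assuming PHP 7.1 or later and that the host runs this script as its custom 404 page. The names find_page and handle_missing, and the $pages array standing in for the MySQL lookup, are hypothetical and not from the thread:

```php
<?php
// Hypothetical stand-in for the database lookup: maps a requested path
// to page content. A real handler would query MySQL here instead.
function find_page(string $path, array $pages): ?string
{
    return $pages[$path] ?? null;
}

// Decide the status code and body for a request that reached the 404 handler.
function handle_missing(string $requestUri, array $pages): array
{
    // Strip any query string so only the path is looked up.
    $path = parse_url($requestUri, PHP_URL_PATH);
    $body = is_string($path) ? find_page($path, $pages) : null;

    if ($body === null) {
        // Steps 2-3: keep the 404 status and show a friendly error page.
        return [404, '<h1>Page not found</h1>'];
    }
    // Steps 4-6: the page exists in the "database", so answer 200 with it.
    return [200, $body];
}

// Example data; the path and content are made up for illustration.
$pages = ['/mywidgets.htm' => '<h1>My widgets</h1>'];

[$status, $body] = handle_missing($_SERVER['REQUEST_URI'] ?? '/mywidgets.htm', $pages);
http_response_code($status); // must run before any output is sent
echo $body;
```

Doing the lookup before choosing the status code means a genuinely missing page still gets a real 404, while pages found in the database are served with 200.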

HOWEVER - if you DO want to play with variables in URLs - know that they LOOK odd to human browsers :-) oh, and they are likely to come out all as one page in your logs if you get a "summary" log from your virtual host. and for search engines - make sure you have a different <title> and <h1> tag on every page (as well as the obvious content differences)

pixel_juice

9:05 am on Jul 23, 2003 (gmt 0)

10+ Year Member



Thanks vincevincevince. I have to admit I'm a little confused by the custom 404 idea.

With a custom 404, I guess all my links have to point at the pages that don't really exist? So, for example, if I have superwidget.com/?widget=super-widget I actually have to link to superwidget.com/widget/super-widget.htm and let the 404 deliver the correct page?

If I understand correctly, as the links are currently auto-generated based on the database content, I have to modify these before they are outputted on the page into the new format.

Also, wouldn't the redirect from the real page with variables to the fake page with folders cause the url in the address bar to change, and wouldn't this in turn cause the search engines to think that the urls with variables are the 'correct' ones?

I'm thinking myself round in circles a bit here, so any clarification would be greatly appreciated.

vincevincevince

4:53 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



in your page:

blah blah
<a href="/widgets.htm">link</a>
blah blah

but.... widgets.htm doesn't exist:

click --> 404 page

the 404 page queries the database:

"is widgets.htm in the database?"
--> YES! It is!
--> Gives 200 OK headers
--> return the content (note the URL still says /widgets.htm)

that is to say - you never ever actually have any ?page= in your url - anywhere. it's just not actually needed.

:-) enjoy it - it definitely works because i do it for:
[mysite.foo...]
[mysite.foo...]
[mysite.foo...]
[mysite.foo...]
... none of which files actually exist on my server - all are actually from the database - but the user never finds out

grnidone

5:03 pm on Jul 23, 2003 (gmt 0)



re: template and navigation

I didn't have access to an .htaccess file either, but I got around the problem on a smaller site by using a switch.

Basically, I exploded the url like this:


$path_info = explode("/", $_SERVER['PHP_SELF']);

$page = $path_info[3];


and then made a switch for the 'page variable'


switch ($page)
{
    case "green":
        $widget = "fluffy-green";
        break;
    case "blue":
        $widget = "blue";
        break;
}

In the navigation, I echoed what the url was supposed to be and added the $widget variable where it needed to be to get a url like this:

my-site.com/dir/$widget

The $page variable is so when someone clicks on a url, the script knows which page to deliver.

Yes, this is a big time hack, and I don't know how it'd work on a site of any size, but it does work if you can't do anything with the .htaccess file.

And, if anyone has a better way of doing this, I'd LOVE to know how.

Timotheos

5:28 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You might just try making your own .htaccess file and seeing if it works. I'm on a shared server and my host was no help in setting up a simple redirect. So I just opened up notepad, made my own file named .htaccess, put in the redirect command and stuck it in the main directory. Worked great.

jatar_k

6:00 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



i use scripts to generate html pages for db pages

all extensions are .html
no params in the url
100% db driven
no 404 -> 200
no mod_rewrite

example of a page

<?php
$page_id = 1;
// add more vars here
include $_SERVER['DOCUMENT_ROOT'] . "/templates/pagetemp.php";
?>

You want to add 100 vars to your url, go crazy, add them into there. The admin scripts generate the pages when you make the entry into the db. Site looks 100% static with 0 static content.

grnidone

6:19 pm on Jul 23, 2003 (gmt 0)



I missed something in that example, Jatar...

where does the $page_id variable go in your code?

jatar_k

6:30 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I could have

www.site.com/templates/pagetemp.php?page_id=1

which is a popular format, instead I have

www.site.com/pagename.html

of which the content is

<?php
$page_id = 1;
include $_SERVER['DOCUMENT_ROOT'] . "/templates/pagetemp.php";
?>

In both cases I am passing the variable page_id to the template and the template is using that variable to get and display the data from the db. The first one using the classic method of query strings and the second embedding the variables into an actual html page that is parsed via php.

sorry, is that clearer grnidone?

<added>When you add content to the database there has to be some kind of interface. When that interface inserts the data into the db, it finds out the page_id for the row and actually writes it into an html file which it creates for that page.
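
A rough sketch of what the file-writing step of such an admin script might look like, assuming PHP 7.1 or later and a server configured to run .html files through PHP, as the posts above imply. The function names build_stub and write_stub, the slug rule, and the temp-directory target are all hypothetical; the thread does not show the real admin scripts:

```php
<?php
// Build the contents of a stub page that sets $page_id and includes the template.
function build_stub(int $pageId): string
{
    return "<?php\n"
        . '$page_id = ' . $pageId . ";\n"
        . 'include $_SERVER[\'DOCUMENT_ROOT\'] . "/templates/pagetemp.php";' . "\n";
}

// Write the stub to $dir/$slug.html and return the path written.
// The slug is restricted to a safe character set so it cannot escape $dir.
function write_stub(string $dir, string $slug, int $pageId): string
{
    if (!preg_match('/^[a-z0-9-]+$/', $slug)) {
        throw new InvalidArgumentException("Unsafe page name: $slug");
    }
    $path = rtrim($dir, '/') . '/' . $slug . '.html';
    if (file_put_contents($path, build_stub($pageId)) === false) {
        throw new RuntimeException("Could not write $path");
    }
    return $path;
}

// Example: page_id 1 was just inserted, so create its stub.
// A temp directory is used here; a real site would target the document root.
$written = write_stub(sys_get_temp_dir(), 'pagename', 1);
```

In a real setup the page id would come from the insert itself (for example the row's auto-increment value), and PHP would need write permission on the target directory.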

vincevincevince

7:12 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jatar_k - what you just said is a great idea - the drawback being that you actually have to make /page.php physically exist on your server.

if you have generated 1000 pages via the database that you want to access - then you really should use a 404 or mod_rewrite...

if you have 10 fixed name pages that you are just storing in the database for easy editing - then your scheme is great :-)

jatar_k

7:18 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



if you have generated 1000 pages via the database that you want to access - then you really should use a 404 or mod_rewrite...

I disagree completely, I can have 1000 pages or 10000, doesn't matter to me, the admin scripts manage everything. Menus, pages, sitemap, the works and all of my straight html pages rank quite nicely.

Actual space taken is only a couple of k, if that, per page. Why do you think there is a drawback?

Timotheos

7:43 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry jatar_k, I don't get it either. I'm feeling dense. Wouldn't you have to have another html page with $page_id=2 and so on and so on?

vincevincevince

7:43 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why do you think there is a drawback?

Because you are physically creating files - which must physically exist BEFORE the user requests them. Take the example of this:

I have a database full of products and descriptions of them
People go to widgets.foo/something.htm
Where something.htm is parsed for the "something", and then i find the most relevant product from the database, and return a page about that one...

So, I have in fact a very, very large number of pages possible - at least one for every word in the database, and more when you figure in more than one word combined.

With a large product database - i'd soon have a number of possible pages in the millions.

So, if I used your technique - I'd need to physically generate ALL those millions of pages on the server!

And I've tried creating very very large numbers of files on a server root before now! :-) Not pretty.

So - that's why I said - your technique is suitable for a small number of fixed name pages which you just want to edit - for something with a lot of pages - you really need mod_rewrite or a 404 trick.

pixel_juice

7:53 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



Wow, thanks for all the responses everyone :)

I've been trying to write a suitable custom 404 page. At the moment i'm doing something along these lines:

Check the url to look for a variable in a (non-existent) folder structure. If there are any, extract them and call the appropriate content from the database. My problem at the moment is that although I'm delivering the correct pages, I can't seem to get the correct response codes. Despite my best efforts, status is 404 and location is 404.php.

I'm thinking that rather than using $widget to determine the content to be delivered, could I not use /widget/ instead? What I mean is, rather than looking at a query string in a url like superwidget.com/template.php?widget=fluffy-green, could I not just get my page to extract the variables from superwidget.com/template.php/widget/fluffy-green.html instead? This way I wouldn't have to worry about response codes.

I understand that the page in question will not actually exist, however for some reason if I go to superwidget.com/template.php/anything-else-here, the server returns a 200 OK code, so I wouldn't have to worry about status codes. The server returns a 200 for any url with any amount of characters after the page name, even if I make it look like a folder, as long as the folder doesn't exist ;)
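
For illustration, a small sketch of pulling the variables out of the extra path text, assuming the server passes it to the script as $_SERVER['PATH_INFO'] (whether it does depends on the server configuration). The function name parse_path_info and the name/value pairing rule are assumptions for the example, not something from the thread:

```php
<?php
// Turn a PATH_INFO string such as "/widget/fluffy-green.html" into
// an array of name/value pairs like ['widget' => 'fluffy-green'].
function parse_path_info(string $pathInfo): array
{
    // Drop a trailing .html or .htm so it does not end up in the value.
    $trimmed = preg_replace('/\.html?$/', '', trim($pathInfo, '/'));
    $segments = $trimmed === '' ? [] : explode('/', $trimmed);

    $vars = [];
    // Segments are read in pairs: name, value, name, value, ...
    // A trailing segment without a partner is ignored.
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $vars[$segments[$i]] = $segments[$i + 1];
    }
    return $vars;
}

// PATH_INFO is only set when extra path text follows the script name,
// so a sample value is used as a fallback here.
$vars = parse_path_info($_SERVER['PATH_INFO'] ?? '/widget/fluffy-green.html');
```

The extracted values should still be checked against the database before use, and an unknown widget name should get a real 404 rather than an empty page with a 200.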

One final thing - if I call up a page that grabs content from the database, there is no 'location' in the response headers. I assume this does not present problems from a spidering point of view?

Thanks again for helping a database newbie :)

jatar_k

8:33 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Timotheos, yes you do.

On an existing large site, just use mod_rewrite and be done with it; there is no point to this system there.

The point to this system is a ground up solution, just a CMS with SEO in mind. You want to start a site, sure, here you go. If I walked in to work on an existing large site I would use mod_rewrite or possibly some other solution.

What I don't see about your method, vince3, is that it sounds as if you don't actually have any pages. I doubt that is true. So why do people request all of these bogus urls, such that you have to handle them the way you do?

Search engines need to spider all of the pages and index them, and actually ranking is more important than just being spidered. I don't see the necessity for the 404 way. Where are all these urls that don't belong to pages, or is this a post production workaround for a pre-existing problem?

vincevincevince

8:47 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jatar_k - don't worry too much about this (the line between SEO and spam is often a cause of contention!) :-) as we now can agree - there are reasons to use both of the methods :-)

Summary:
Using the Jatar_K method of fixed files with include() [JK]
vs.
Using the 404 -> 200 method with include() [V3]

Comparison :

Totally transparent to browser: V3 JK
Fast : V3 JK
Allows easy content management : V3 JK
Suitable for less than 1000 pages : V3 JK

Fastest (server speed) : JK
No need for physical files : V3
Requires php to have permission to write to root : JK
Suitable for more than 10000 pages : V3
No need for .htaccess : JK
Suitable for picture files : V3 (JK? [with changes])
Really pretty root possible : V3 (.htaccess and 404.php only - if the 404.php includes your template within!)
Consolidates your debt and saves $$$$ a month: Neither :-)
Gives you a discount at the bar: J/K

:-) As I see it - anyway!

[edited by: vincevincevince at 9:11 pm (utc) on July 23, 2003]

Timotheos

8:55 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jatar_k and vince,

Thanks, the fog has lifted :-)

Tim