Forum Moderators: open


Dynamic pages always the same file

Will a website be correctly indexed if all the pages are displayed by the same file?

         

MrCrowley

5:37 pm on Apr 13, 2004 (gmt 0)

10+ Year Member



Hi all,

I developed a templating system for building websites that always uses the same PHP file but behaves differently depending on 4 variables.

The content of these pages is very different from one page to another. I built it this way to make the code easy to maintain and use.

I wonder though if the pages will be indexed by Google, since the file is always the same. The URL might look like this: main.php?ID=2&l=e&dv=3&t=tprof, so you can see the URL is not very long and never will be. But it is always main.php, which is the script that loads the templates and modules.

Does anyone have any hints on what I could do to make sure my pages are included in Google's index?

Thank you

Mart

the_nerd

6:48 pm on Apr 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google will never become your friend with this site structure.

AFAIK Google doesn't even index pages with more than two URL params. (If one parameter is named e.g. "id" or "sessionid", that alone is enough to keep you out of search engines.)

And even if they do, your pages would be pretty useless, because (as a rule of thumb) every "plain link" to an inner page costs you about one PR unit (an internal link from a PR 5 page most often results in a PR 4 page if there aren't too many links on the page). Every URL parameter means another PR unit off, so a link like ....mypage.php?p1=1&p3=3 from a PR 5 page will result in a PR 2 page.

Search WebmasterWorld for "rewriting" - that's what you need if you want to use just one parameter-driven template. It lets you keep your site structure and please the search engines as well.

doc_z

8:06 pm on Apr 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi MrCrowley and welcome!

Google is getting better at spidering dynamic pages. However, as already mentioned, 4 parameters might be too much. If you're using Apache, mod_rewrite might be a solution, i.e. using URLs like '/main/2/e/3/tprof.html'.
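A minimal .htaccess sketch of that suggestion (assuming mod_rewrite is available on the host; the parameter names are taken from the URL in the question, the character classes are just a guess at what the values look like):

```apache
# Rewrite /main/2/e/3/tprof.html internally to main.php?ID=2&l=e&dv=3&t=tprof
RewriteEngine On
RewriteRule ^main/([0-9]+)/([a-z]+)/([0-9]+)/([a-z]+)\.html$ main.php?ID=$1&l=$2&dv=$3&t=$4 [L,QSA]
```

The site's own menu links would then use the /main/... form, so spiders only ever see the static-looking URLs.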

Every URL-Parameter means another PR-Unit off, so a link like ....mypage.php?p1=1&p3=3 from a PR5 page will result in a PR2 page.

PR is just based on the linking structure and nothing else. Therefore, it doesn't matter if the pages are static, dynamic with one parameter or dynamic with 4 parameters. However, there is a problem with the PR displayed in the toolbar for dynamic pages with an '?'.

jsbeads

1:22 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



I'm assuming that main.php?ID=2&l=e&dv=3&t=tprof means something.

Write your script to create the page title based on the variables. Not sure how you'd do this in PHP; in ASP I just do something like this.

<%
SCRIPT_NAME = Request.ServerVariables("SCRIPT_NAME")
' SCRIPT_NAME is a virtual path like "/main.asp", so compare the tail
If Right(SCRIPT_NAME, 8) = "main.asp" Then
    PageTitle = "Widgets " & Request.QueryString("catid") & " @ Bob Jones Widgets"
    Keyword_Info = Request.QueryString("cat") & " "
End If
%>

<html>
<head>
<title><%= PageTitle %></title>
<meta name="description" content="<%= Keyword_Info %> Widget and Widget Supplies ">

In my case CAT is the category name. I even take this further by pulling the item description out of the database when I go to a detail page.
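Since MrCrowley's site is PHP, a rough PHP equivalent of the ASP above might look like this (the function names are mine; the "catid"/"cat" parameters and the "Widgets"/"Bob Jones" wording are just carried over from the ASP example):

```php
<?php
// Sketch of the same idea in PHP: build a distinct <title> and meta
// description from the query string, so every dynamic URL gets its
// own title instead of sharing one across thousands of pages.
function make_page_title(string $catid): string {
    return "Widgets " . $catid . " @ Bob Jones Widgets";
}

function make_keyword_info(string $cat): string {
    return $cat . " ";
}

$pageTitle   = make_page_title($_GET['catid'] ?? '');
$keywordInfo = make_keyword_info($_GET['cat'] ?? '');

// Then emit them in the template head, e.g.:
//   echo '<title>' . htmlspecialchars($pageTitle) . '</title>';
//   echo '<meta name="description" content="' . htmlspecialchars($keywordInfo) . ' Widget and Widget Supplies">';
```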

ogletree

1:41 pm on Apr 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



GG has been very vocal about not using ID= at all. He has also said a lot about not using 4 variables. It could work, but I would not risk it. From my experience Google is very careful with those kinds of sites and therefore does not spider them as often or as deep.

type site:webmasterworld.com GoogleGuy dynamic

encyclo

1:52 pm on Apr 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



[webmasterworld.com...]

message 12 ;)

ogletree

2:03 pm on Apr 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You have to watch how he phrases things. Words like "doing better" usually mean they normally dislike it but are loosening those restrictions. Until he says dynamic URLs are treated exactly the same and cause no problems whatsoever, I'd hold off - and even then I'd maybe start using them a year later.

MrCrowley

6:25 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



Hmm... I don't think I have access to mod_rewrite, since I am hosted by a third party.

I guess I should change the name of the variable ID.

Maybe I could even combine all the vars into a single string to make Google think there is only one var, and then extract the values with a regexp. But I think it is stupid to have to do it... even to change the names of the pages with mod_rewrite. I mean, the pages are all the same, and there are just as many of them even if they do not have vars in their addresses. I'd just be giving myself trouble to make Google think that my website isn't dynamic.
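A sketch of that combine-and-extract idea (the token format and function names are made up for illustration; as the thread suggests, mod_rewrite is the cleaner route):

```php
<?php
// Pack ID, l, dv and t into one token, then pull them apart again
// with a regular expression. The id-lang-div-table format here is
// arbitrary -- any unambiguous delimiter scheme would do.
function pack_vars(int $id, string $l, int $dv, string $t): string {
    return "{$id}-{$l}-{$dv}-{$t}";
}

function unpack_vars(string $p): ?array {
    if (preg_match('/^(\d+)-([a-z]+)-(\d+)-([a-z]+)$/', $p, $m)) {
        return ['ID' => (int)$m[1], 'l' => $m[2], 'dv' => (int)$m[3], 't' => $m[4]];
    }
    return null; // malformed parameter
}
```

A URL would then look like main.php?p=2-e-3-tprof, with a single parameter for the spider to cope with.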

The vars in my templating sys are as follows:
l is for language ( I live in a bilingual country!)
dv is for division (some businesses have many divisions with different visuals and menus and all. This var keeps everything well organised in a single database)
t is for the table in which the record is
ID is for the id of the record.

So basically I have a website that is very well organised with 3 levels of submenus. The pages output can be completely different from one another: all the content changes, the title, the meta tags, etc. Pages from the same table but with different IDs can have different templates too; main.php displays each record in its own way if necessary.

By the way, all these addresses that I want indexed are present as is in the website's menu <a> tags and are not parsed through javascript.

Ahhhh.... man. I understand everything you guys tell me, and I thank you for your replies. I just hate to have to conform to the requirements of Google :-) They give me extra work...

ogletree

6:30 pm on Apr 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Only do it if you are having problems getting spidered on a regular basis or deep pages don't get spidered. Google has been working to fix this for some time and does it better than anybody else. There are people out there with very ugly URLs that do fine. I would try to get my PR up first. Get some links from authority sites that get spidered a lot.

gaouzief

6:32 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



One of my sites works exactly like yours (multiple params on a single PHP page).

Believe me, you're better off using mod_rewrite from the start - I lost a lot of time (over a year) of near total absence from Google because I didn't rewrite URLs at first.

Now that I use mod_rewrite, all of the site has been indexed.

MrCrowley

6:46 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



I think I am stuck... my website is hosted externally (host chosen by my client) and I do not have access to httpd.conf :(

Any way around this problem?

Thanks all

doc_z

7:37 pm on Apr 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You just need to place an .htaccess file in your directory. You should have a look at the Apache Web Server forum [webmasterworld.com].

Nikke

9:46 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



I have three sites running very much the same structure as yours, with top-ranking results in the Google SERPs.

However, they only carry one or two parameters. Most often one:
index.php?ID=443

This has worked perfectly for us. However, when a glitch in my publishing system made me lose a lot of ID numbers and I ended up with IDs ranging over 6 digits, the new pages never made it anywhere in the SERPs. After backtracking to 4-digit IDs, the SERPs improved greatly, and the same pages now rank within the top five.

A very important step here is to make sure you use descriptive title tags. Don't make the mistake of showing the same title on lots of pages - especially not if using ID= in URLs.

And try to lose at least two of those params while you're at it.

[edited by: Nikke at 11:41 pm (utc) on April 14, 2004]

lunarboy1

11:22 pm on Apr 14, 2004 (gmt 0)

10+ Year Member



In addition, Google usually assumes a long string of numbers (or anything in the id= param) to be a session ID and ignores it. I have a site with

index.php
display.php?widgetid=100

and have over two hundred different widget pages, and each one is spidered after about 3 months.

richmarcia

1:28 am on Apr 16, 2004 (gmt 0)



I use mod_rewrite on one of my sites but the spiders still get stuck. PHP session IDs are being sent through the URLs - I've been told this is the problem. Does anyone have a script to fix this so Google will index my dynamic pages?
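(The usual culprit here is PHP's transparent session-id propagation, which appends PHPSESSID to every link. A sketch of switching it off, assuming the host allows these ini overrides - they can also go in php.ini or a per-directory .htaccess:)

```php
<?php
// Stop PHP from appending ?PHPSESSID=... to URLs automatically.
// Sessions then rely on cookies only, so spiders see clean links.
ini_set('session.use_trans_sid', '0');
ini_set('session.use_only_cookies', '1');
```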

Rich

celenoid

1:46 am on Apr 16, 2004 (gmt 0)

10+ Year Member



I've (somewhat surprisingly) had pages indexed with a crazy number of variables in the URL. PR doesn't travel well, but they turn up ok in the SERPS.

I agree, the ID var is a problem. I had a site with 2K+ pages, most of which used a single id var (i.e. display.php?id=3). Never indexed. Changed the var name on a whim one day and bam - hundreds of pages began being indexed in Google...

Be extra careful if your template page reads the query string to pull in your page content. These vars are easily identifiable as SQL parameters and can be exploited very easily if you're not careful. Be certain to escape your vars in every possible way. (Sure you've thought of this already - sorry, can't help my security background kicking in!) Good luck!
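To make that warning concrete, a minimal sketch (function name mine) of validating a query-string ID before it goes anywhere near SQL - a prepared statement, shown commented out, is the stronger option:

```php
<?php
// Never drop a query-string value straight into SQL. Casting and
// validating the value is the simplest defence against injection.
function safe_record_id($raw): ?int {
    // Only accept plain non-negative integers for the ID parameter.
    return ctype_digit((string)$raw) ? (int)$raw : null;
}

// Better still, bind it through a prepared statement, e.g. with PDO:
//   $stmt = $pdo->prepare('SELECT * FROM pages WHERE id = ?');
//   $stmt->execute([safe_record_id($_GET['ID'])]);
```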

MrCrowley

5:49 pm on Apr 16, 2004 (gmt 0)

10+ Year Member



I began to use mod_rewrite and it works like a charm to make the URLs appear to be standard HTML. I haven't put the sites online yet to see how the search engines behave, but I am confident :)

As for session IDs, I have a solution of sorts. In my templating system, all (and I mean all!) the links are created with a single class to which I pass parameters (basically the variables to pass in the GET, but also the protocol, etc.).

I know that all the main pages of my templating system are depending on 4 variables. Some pages depend on more vars, but they are mainly used to display information that is not part of the main content of the website, for example they could be parameters passed to a search function, so I don't need these pages to be indexed.

Also, I don't need session-dependent pages to be indexed, since they depend on user input (for example a user profile). The navigation of the website never depends on the session.

Now, my class that builds the links checks the parameters and builds links accordingly: if a session is open, or if there are more than the 4 base vars (or if I don't want the addresses transformed at all, via a config variable), the link will be standard PHP with GET vars.

On the other hand, if there is no session and there are only the 4 base vars, the link will be rewritten to something else (tmain23e2.htm for example). Since this link-creating code occurs in only one place in the whole program, if I have to change my address pattern, I do it only once. It is easier to keep the links 'synchronized' with the .htaccess file that way.
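In outline, the link builder described above might look something like this (the function name, the var-set check and the rewritten pattern are all illustrative, not the actual code; the rewritten form must of course match whatever the .htaccess rule expects):

```php
<?php
// Central link builder: emit a rewritten, static-looking URL only
// when there is no session and exactly the four base vars are
// present; otherwise fall back to a plain query string.
function build_link(array $vars, bool $sessionOpen): string {
    $base = ['ID', 'l', 'dv', 't'];
    $isBaseOnly = count($vars) === 4 && !array_diff(array_keys($vars), $base);
    if ($sessionOpen || !$isBaseOnly) {
        return 'main.php?' . http_build_query($vars);
    }
    return "main/{$vars['ID']}/{$vars['l']}/{$vars['dv']}/{$vars['t']}.html";
}
```

Keeping this in one place means changing the address pattern is a one-line change plus the matching .htaccess rule.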

Mart