Forum Moderators: Robert Charlton & goodroi
In the early days we had MS-DOS; the first Windows versions were built on top of that operating system, just to give users the impression they weren't typing the 'difficult' commands DOS required.
Just a little spin-off...
Now we have Apache (and IIS, but shh!) playing that role: it runs and serves your website. The most basic version of a website just sends out .html pages... no scripts, nothing at all.
Install a script engine such as PHP in your Apache configuration and you can build dynamic pages... with some knowledge you could even serve the .php files as .html (as far as I know). With that kind of setup, the search engines just don't see that it's a PHP file, right?
If you use parameters in your URLs, you can get into serious trouble with today's search engines. That's why most of them advise against putting a session identifier in the URL: every user gets a different one.
Some webmasters use a parameter called 'id='; that's a dangerous one, because search engines again may think the id= parameter is a session ID. That's why you're better off using another parameter name for identifying different pages, such as articles.
Think about this: you've got 'articles/reader.php?a=83'. IMO that's a page and a unique URL. Be happy if it's indexed in the SERPs, but what if a user called Joe Doe links to that URL/page with an extra parameter, like 'articles/reader.php?a=83&trick=joohoo'?
To be honest, if you do that on my pages, the script just serves the ?a=83 article... the same page, BUT at a different URL! Search engines do see this as duplicate content. That's why you should code your .php, .asp, etc. pages so that if a user adds parameters that have no effect, the script returns a 404.
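The "unknown parameter means 404" idea can be sketched like this (in Python for brevity; the same check is easy in PHP). The whitelisted name 'a' comes from the ?a=83 example above; everything else is illustrative:

```python
# Sketch of the "unexpected query parameter -> 404" idea.
# 'a' matches the /articles/reader.php?a=83 example; the rest is made up.
from urllib.parse import parse_qs, urlparse

ALLOWED_PARAMS = {"a"}  # the only query parameter this page understands

def status_for(url: str) -> int:
    """Return 200 if every query parameter is whitelisted, else 404."""
    params = set(parse_qs(urlparse(url).query))
    return 200 if params <= ALLOWED_PARAMS else 404

print(status_for("/articles/reader.php?a=83"))               # 200
print(status_for("/articles/reader.php?a=83&trick=joohoo"))  # 404
```

With a check like this, the &trick=joohoo link from outside never creates a second indexable copy of the article.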
You can also do things with the robots.txt file. I have some pages that use extra parameters for sorting features... but those are only meant for humans, and the search engines could see duplicate content on these pages, so I added the following Disallow to robots.txt: /articles/index.php?s=
Google understands this: it will crawl all the pages in /articles/ and notice that it can index /articles/index.php?paging=1,2,3 etc., but not /articles/index.php?paging=1&s=DESC
I just noticed that some search engines still don't understand the meaning of a robots.txt file... but they're waking up; they're putting man-hours into building better robots.txt handling into their engines.
In my opinion it's fine that search engines crawl the pages that are Disallowed by robots.txt... but if a page is disallowed, crawling may be OK, indexing is forbidden! I like the reports in Google Sitemaps where you can see the pages, including dynamic ones that are disallowed. So don't think 'hey, they are overriding the rules.'
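The crawl-versus-index distinction being made here can also be expressed in markup: a robots.txt Disallow blocks crawling of the URL entirely, while a robots meta tag lets the bot fetch the page but asks it not to index it (a sketch; the `follow` value is optional):

```html
<!-- robots.txt "Disallow" stops the URL being crawled at all;
     this meta tag allows crawling but asks engines not to index the page. -->
<meta name="robots" content="noindex, follow">
```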
That's what I wanted to share with you. Writing it was a real English exam for me; I hope you understand at least a little.
In which case no one should _ever_ have any problems with ?something=x attached to pages. Not even a little wobble.
If Google is doing a validity check of that nature, they would need to have the entire site spidered. Do they always spider a site fully, in a timely manner, to detect "bum" IBLs of this nature? And what about the fact that pages may actually exist outside of what the site allows Google to reach through the links on the site?
Isn't it nice that the web is so large and always changing?
So you have a choice: prevention, or suffer any possible cold and wait it out.
Thus the saga of the Wacky Wobble World continues.
[edited by: theBear at 1:50 am (utc) on Sep. 11, 2006]
There are several threads currently running on WebmasterWorld dealing with this issue.
[webmasterworld.com...]
and
[webmasterworld.com...]
[edited by: theBear at 2:16 am (utc) on Sep. 11, 2006]
Anyway, I was very busy with the robots.txt and <meta noindex> options; let me explain.
I've got some pages with 25 links to published articles; the URL is /articles/index.php
I give users the opportunity to sort the list on different columns... this option generates URL parameters, e.g. /articles/index.php?sort=date
I didn't want these URLs to be indexed by Googlebot, for two reasons:
1: the possibility of duplicate content;
2: IMO the SERPs need logical URLs... better for the searcher/user.
I have some AdSense on these pages, and now you see the problem... I disallowed * in robots.txt for these sorting URLs... and guess what...
AdSense has some problems with that!
A lot of you know this already, but to respond to this topic: be careful with <meta noindex> on pages that carry AdSense... I fixed this with:
# Robots.txt
User-agent: *
Disallow: /articles/?s=
Disallow: /articles/index.php?s=
User-agent: Mediapartners-Google
Allow: /articles/?s=
Allow: /articles/index.php?s=
Language frequently makes understanding difficult.
One of the sites I work on has many, many pages produced by a highly customised CMS. We have far more of the site off limits to indexing than we allow the search engines to index, but it is not off limits to viewing by the AdSense bot.
It is very easy for us to design user-friendly pages that would lead to the generation of massive amounts of duplicate content.
The example of sorting is one: you could make most of those options available as form actions rather than as query-string variables on a URL. Someone will correct me if I'm wrong, but I don't think the SEs currently do forms. If they do, I've got a bit of work to do.
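A sketch of that idea: sort options submitted via a POST form create no new crawlable URLs, unlike ?sort= links (the field names here are made up for illustration):

```html
<!-- Sorting via POST: the list URL stays /articles/index.php,
     so no ?sort=... variants exist for a spider to collect. -->
<form action="/articles/index.php" method="post">
  <select name="sort">
    <option value="date">Date</option>
    <option value="title">Title</option>
  </select>
  <input type="submit" value="Sort">
</form>
```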
We try to keep as much out of the robots.txt file as possible. We already have more than enough places to make mistakes. Mistakes? What, we make mistakes? Surely you are jesting ;).
If you have ever owned a site in the past, say 4 years ago, where the URLs looked like
www.somesite.com/product.cfm?p=1&c=2&blah=whatever, look at your old access_logs: you will find that Google had no problem whatsoever indexing these pages. They still have no problem indexing them. Of course, session IDs never helped.
I personally believe that their filters downgrade sites with such URLs based on the user experience. Their index looks a lot cleaner with normal URLs.
Actually, when you get right down to it, the various search engines don't even have to show the URL.
I don't even think that Google's filters downgrade pages just because they have an attached query string.
As g1smd says, it's the other things that get you: causing duplicate content, and tons of pages that show the same title and meta information. Think about all the search routines that pump out page after page titled "keyword1 keyword2 etc... results for mydomain.com".
"Sessionid" is a synonym for index spam. Don't give the bot more names for the same stuff; while the bot doesn't care, the rest of the system will have more room for real content.
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]
Obviously, it would be silly if the webmaster himself linked to his own pages in this manner. But what if some bad person from outside put thousands of links to the site like this... is there a possibility Google would follow them, download the pages, and detect duplicates?
Hi, r3nz0, good example with robots.txt! But what would you do if you need some of the pages with ? to be listed, but some not? For example:
http://example.com/index.php?x=1 - listed
but
http://example.com/index.php?z=2 - not listed
http://example.com/index.php?x=1&y=2 - not listed
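One way to get that split, relying on robots.txt's plain prefix matching (and, for Googlebot, its documented * wildcard extension), is to disallow only the unwanted patterns. A sketch, not tested against every engine:

```
User-agent: *
# blocks ?z=... entirely
Disallow: /index.php?z=
# blocks ?x=1&y=... but leaves plain ?x=1 crawlable,
# since no rule is a prefix of /index.php?x=1
Disallow: /index.php?x=1&y=
# Googlebot also understands wildcards, which would catch y= in any
# position in the query string:
# Disallow: /index.php?*y=
```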
ashear, indeed, you're right. I personally didn't have a problem with search engines listing my dynamic pages (even the ones with 3+ variables). The problem we are discussing is unseen: we think that Google and others might be downloading unneeded dynamic pages, then detecting the duplicate content, then dropping those pages. I doubt they would actually admit it, but we are at least trying to get out of them an idea of how they deal with that issue.
It would.
But most forum, cart, CMS, and other types of dynamic sites DO have these flaws built into them - by the bucketload.
Check out what I wrote about vBulletin, as just one example, just a few months ago.
[webmasterworld.com...]
They do get seen. They do get indexed. They fade away after a few weeks if nowhere on the site links to the same URL.
They are seen as being duplicate content.
I have tested what happens, several times, over the last 18 months or more.
phpBB, if not tamed, will spam the index.
There are all kinds of ways to cause duplicate content problems for a site.
Default server setups _can_ leave multiple paths to a site. When those paths get discovered, a single link to the site can (if the internal linking does not specify a full path for all of the links on its pages) cause both the site to be duplicated and the site itself to actually tell Google that all of those duplicate pages are for real.
There have been multiple 1000+-message threads related to that situation. These huge threads have occurred on and off for several years.
www/non-www issues; sites being replicated on an IP address as well as on www/non-www; and some folks finding the site also appearing on mail.domain and on parked sites.
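The usual fix for the www/non-www half of that, assuming Apache with mod_rewrite enabled (example.com is a placeholder hostname), is a site-wide 301 to one canonical host:

```
# .htaccess sketch: redirect every non-www request to the www hostname,
# so only one copy of each URL exists for the search engines.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The same pattern also covers the bare-IP and mail.domain duplicates if the condition is "anything that is not the canonical host".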
Also look up Googlewash.
[edited by: theBear at 1:57 am (utc) on Sep. 12, 2006]