dynamic sites

         

DREAM

5:54 am on Jan 9, 2001 (gmt 0)



hi,

I am working on submitting some sites built in Cold Fusion to the search engines.

I have been told dynamically generated pages are not indexed by robots.

What is the reality with dynamic sites and search engines?

Who indexes them and who doesn't, and are robots really unable to follow links which have question marks in them?

I am new at this so I would really appreciate some help! :)

mousemoves

11:14 am on Jan 9, 2001 (gmt 0)



Hi Dream,
It is suggested that robots do not follow anything after the question mark. You can read this at [info.webcrawler.com...] I'm not sure whether it is in an RFC or not; my research is still in its infancy.

I know you can remove the question mark using a script. Off the top of my head, I think you split the URL string into an array and replace the question mark with a forward slash.

Also, it seems Google is indexing past the question mark now. Google, it seems, is indexing the universe! Anyway, hopefully an expert will be along shortly to answer your question in more detail, as a better explanation would be very helpful for me as well!
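For anyone who wants to see the question-mark idea spelled out, here is a minimal sketch in Python rather than Cold Fusion. The function name query_to_path and the example URL are made up for illustration, and this is only one way of doing the replacement, not a tested recipe for any particular engine.

```python
from urllib.parse import urlsplit, parse_qsl, quote

def query_to_path(url: str) -> str:
    """Turn 'info.cfm?id=7&cat=news' into 'info.cfm/id/7/cat/news'.

    Hypothetical helper: names and values from the query string are
    appended to the path as slash-separated segments.
    """
    parts = urlsplit(url)
    # parse_qsl returns [(name, value), ...] and drops blank values by default
    pairs = parse_qsl(parts.query)
    # Percent-encode each piece so a "/" inside a value cannot add a segment
    segments = [quote(piece, safe="") for pair in pairs for piece in pair]
    path = parts.path.rstrip("/")
    if segments:
        path = path + "/" + "/".join(segments)
    prefix = f"{parts.scheme}://{parts.netloc}" if parts.netloc else ""
    return prefix + path

print(query_to_path("http://example.com/info.cfm?id=7&cat=news"))
```

The server-side script then has to turn the slashes back into variables, which is the part the later replies deal with.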

grnidone

3:18 pm on Jan 9, 2001 (gmt 0)



Welcome to the forums, Dream.

You are correct. Spiders generally don't follow dynamic content. In fact, they pretty much don't follow anything except PLAIN text and PLAIN HTML links. (Frames are among the things spiders don't follow.)

Luckily, there are some workarounds.

The following posts all talk about URL rewrites and dynamic pages:


Mod Rewrite Tutorial URL (Apache)
[webmasterworld.com]


Mod Rewrite and Clustering (also Apache)
[webmasterworld.com]


Impact of URL rewriting
[webmasterworld.com]


Symbols in urls
[webmasterworld.com] (Excellent post, from back when I had the same question.)

The workarounds that I have found documentation for online are usually for Apache. I know from personal experience that Vignette StoryServer also has a workaround in its documentation.

This should get you started.

-G

cirelle

7:42 pm on Jan 9, 2001 (gmt 0)



Hi Dream -- Welcome

The Vignette StoryServer has its own engine to produce HTML pages from dynamic content, but from working with it in the past, I have found it generally pre-generates pages, and modifications must be "published" before the content is seen on the page. The following is a StoryServer URL:

[yourdomain.com...]

From what I remember, the values (following article/) indicate 0 = use cached page, 1120 = template number. Beyond that I am not sure. Not to mention SS is a $250K-plus product.

.cfm files face the same issues as .asp files from what I've seen. Not that I've seen it all by any stretch.

Some creative individuals have used the same methodology with cfm and asp as storyserver uses (enter data then publish) but you tend to restrict your really dynamic content to search pages while other content is distributed as static pages.
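The "enter data then publish" approach can be sketched roughly like this. It is written in Python purely for illustration (a CF or ASP version would follow the same shape), and the publish function, the slug/title/body field names, and the page template are all assumptions of mine, not anything StoryServer actually does internally.

```python
from html import escape
from pathlib import Path

def publish(articles, out_dir):
    """Write one static .html file per article and return the paths written.

    Hypothetical sketch: `articles` stands in for rows pulled from the
    database, each a dict with 'slug', 'title' and 'body' keys.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for article in articles:
        title = escape(article["title"])
        body = escape(article["body"])
        page = out / f"{article['slug']}.html"
        page.write_text(
            f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1><p>{body}</p></body></html>",
            encoding="utf-8",
        )
        written.append(page)
    return written
```

Spiders then only ever see plain .html files, and the pages change only when publish is run again.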

The other option is to force your server to dynamically generate html pages. I know it can be done on NT.

my 2c

c

Fusioneer

12:14 am on Jan 10, 2001 (gmt 0)




Personally I have had mixed results indexing dynamic pages, although I have seen Google, AV, and FAST all with .cfm, .asp, and "?" URLs in their databases.

I have written a cloaking script in Cold Fusion and did some experiments with generating doorway pages out of a database with randomized content so they would not be recognized as such.

The URLs were encoded to hide the fact they were dynamic pages, e.g.

[domain.com...]

For any of you Cold Fusion junkies (there's a lot of Apache in this forum, I've noticed), CF exposes the URL path in a variable called CGI.PATH_INFO.

Convert this into a forward-slash-delimited list, define a start point, strip out unnecessary info, and you can pass as many variables as you want through the URL with slashes.
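Roughly, the three steps look like this. The sketch is in Python rather than CFML so it can be run anywhere; parse_path_info is a made-up name, and treating the segments as alternating name/value pairs is my assumption about the layout, not necessarily how the script described here does it.

```python
def parse_path_info(path_info: str, script_name: str = "info.cfm") -> dict:
    """Recover variables from a slash-delimited path.

    Hypothetical sketch of: split into a list, define a start point,
    strip unnecessary info, then read the rest as name/value pairs.
    """
    # 1. Convert the path into a slash-delimited list
    segments = [s for s in path_info.split("/") if s]
    # 2. Start point: everything after the script name, if it is present
    start = segments.index(script_name) + 1 if script_name in segments else 0
    values = segments[start:]
    # 3. Strip the decoy file name on the end (e.g. index.html)
    if values and values[-1].lower().endswith((".html", ".htm")):
        values = values[:-1]
    # Pair up alternating segments: name, value, name, value, ...
    return dict(zip(values[0::2], values[1::2]))

print(parse_path_info("/info.cfm/id/7/cat/news/index.html"))
```

An odd segment left over at the end is silently dropped by zip, so a real script would probably want to validate the count.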

Before you get too excited... it does not seem to work with all engines. AV now seems to recognize that "info.cfm" is the document, rather than the index.html, and requests it WITHOUT the appended slash-delimited variables.

I have a possible workaround but am working on other things at the moment ;)

Comments?

PeteU

12:33 am on Jan 10, 2001 (gmt 0)



Yes, you could get rid of info.cfm entirely. Put something like
RewriteRule !^(.*txt)$ info.cfm
into your .htaccess file (use whatever the path to your script is), then parse the rest of the URL in the CF script itself from the request_uri variable.
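To make the effect of that rule concrete: every request whose path does not end in "txt" is handed to the one script, which then works out what was asked for from the original URI. Below is a small Python imitation of that routing decision. It only mimics the pattern match; it is not how Apache or CF implement it, and route is a hypothetical name.

```python
import re

# Same pattern as in the rule; the leading "!" there negates the match,
# so the rewrite applies to every path that does NOT end in "txt".
TXT_PATTERN = re.compile(r"^(.*txt)$")

def route(request_uri: str):
    """Return ("static", path) for *.txt requests, else ("info.cfm", segments)."""
    # Drop any query string and the leading slash before matching
    path = request_uri.split("?", 1)[0].lstrip("/")
    if TXT_PATTERN.match(path):
        return ("static", path)
    # Everything else goes to the script, which parses the segments itself
    segments = [s for s in path.split("/") if s]
    return ("info.cfm", segments)

print(route("/robots.txt"))
print(route("/products/42/index.html"))
```

So robots.txt is still served as a plain file, while a URL with no visible script name at all ends up inside the script.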

DREAM

1:11 am on Jan 10, 2001 (gmt 0)



Thanks for your comments everyone (I have been to a few forums and posted and nothing has come of it!). :)

I am following up on the leads with the tech guys. The answers I find will solve a big problem because most of what we build is in CF.

What do you think of investing in cloaking software to work with CF pages?

Fusioneer

2:14 pm on Jan 10, 2001 (gmt 0)



By the way, I am talking about CF running under NT/IIS, so mod_rewrite cannot be used. However, there are alternatives... ;)

In my opinion Cold Fusion runs cloaking software quite well - the real success of your enterprise depends on the quality of your spider database and obviously how well your cloaked content is optimized.

We are still testing, but have written a custom cloaking tag for Cold Fusion that apparently is working beautifully combined with a little URL rewriting. It is fast and easy to combine into an existing site, so if you have CF experience already you shouldn't find it too hard to come up with a decent script. Check against IPs rather than UAs and you will be better off.
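On the IP versus UA point: the User-Agent header is just a string any client can send, whereas the remote address is much harder to fake, so matching it against a list of known spider networks is the more dependable check. A minimal Python sketch of that lookup follows. The networks listed are reserved documentation ranges standing in for a real spider database, and is_known_spider is a made-up name.

```python
import ipaddress

# Placeholder entries only: these are reserved documentation ranges,
# standing in for whatever a real spider database would contain.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]

def is_known_spider(remote_addr: str) -> bool:
    """Return True if the request's remote address is in a listed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable address, so treat it as an ordinary visitor
        return False
    return any(addr in network for network in SPIDER_NETWORKS)
```

As said above, the result is only as good as the list behind it.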