IMHO, scripting languages are all very fast, and will not be your bottleneck in delivering content to the users.
The biggest time consumer will be the database and data retrieval.
There are many variables to tweak and adjust to retrieve data from a DB fast.
Some of these include:
-Server Processor power.
-Server RAM and speed of the hard drives.
-Database normalization.
-Building proper Indexes.
-Building proper SQL retrieval statements.
When that is all done and perfected, you should be able to achieve results similar to WebmasterWorld, eBay, and Google.
Hint: #1 Probably takes a professional. Maybe even a team.
Hint: #2 Oracle is an excellent solution for a DB. The problem is the cost of implementation. Do some research, call Oracle and Sun. You probably want a Sun SPARC system; although I hear they have gone down in price, they are still quite pricey.
But to answer the question in a Perl forum :) Perl/CGI is extremely fast, stable, and not that hard to learn.
Hope that helped.
Paul
One more question, a little more specific to the site: if we were going to do a PayPal database, would you use the same languages and database? Could we get a database already designed for this type of application, and if so, where?
I am a little unclear about what you mean by a PayPal database. If you mean a database to store information about clients who use PayPal, you can use the same database. I actually cannot imagine a reason to use a different database, except to give you a headache :)
If you are worried about hackers stealing your data, that is a legitimate concern and should be dealt with seriously. But creating another DB for this purpose will not solve this problem.
...if we did it in PHP and 6 months from now we want to go to an Oracle database with Perl/CGI
>>did php on a site and it seems to have a problem with load times
Scripting languages don't create load times, programmers create load times.
You might be conflating two separate decisions here: DB and scripting.
Oracle or Sybase are both very robust, powerful and expensive solutions. Either one of the two will do.
Perl or PHP will do it for you. If you are using millions of rows there will have to be a lot of testing done. Scalability will be your biggest issue. If you have sloppy queries or big loops you can easily slow the site to a crawl.
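To make the "sloppy queries or big loops" point concrete, here is a minimal sketch. It uses Python with the standard sqlite3 module purely as a stand-in for Perl or PHP talking to Oracle or Sybase. The customers/orders schema and all function names are made up for illustration. The first version fires one query per customer from a script loop; the second asks the database once and lets it do the grouping.

```python
import sqlite3

def make_demo_db(n_customers=200, orders_per_customer=5):
    """Build a throwaway in-memory database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    db.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    db.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(i, f"customer{i}") for i in range(n_customers)],
    )
    db.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(c, 10.0) for c in range(n_customers)
         for _ in range(orders_per_customer)],
    )
    return db

def totals_with_loop(db):
    """Sloppy: one extra query per customer, so N+1 round trips."""
    totals = {}
    for (cid,) in db.execute("SELECT id FROM customers").fetchall():
        row = db.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
            (cid,),
        ).fetchone()
        totals[cid] = row[0]
    return totals

def totals_with_join(db):
    """Better: a single statement; the database does the grouping."""
    rows = db.execute(
        "SELECT c.id, COALESCE(SUM(o.amount), 0) "
        "FROM customers c LEFT JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.id"
    )
    return dict(rows.fetchall())
```

Both functions return the same totals. With an in-memory database the gap is small, but against a networked server every extra round trip adds latency, which is how a loop like the first one can drag a site down once the row counts grow.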
PaulPaul, it is a PERL and PHP forum ;)
[edited by: jatar_k at 9:13 pm (utc) on Aug. 29, 2002]
[edit reason] no url drops see TOS [/edit]
We try to keep things general and not discuss specific URLs.
Whenever you need it, you know where to find us. ;)
<added>not a bad site
If it were mine, though, I would still go with a *nix-based program because of the cost.
I imagine, and have seen, huge performance variances in databases based on who is maintaining them and who is writing code against them.
I have SQL Server on a dual P3 800 system with 1.5 gig of RAM and it is a dog. Mostly cuz I ain't got a clue...
Which I guess just moots most of my not so pointed points here.
Todd
I am by no means an expert on either, but one mistake I see a lot of people falling into is expecting the database to be simply a raw storage facility with all the work being done by the scripts.
A good relational database of any flavour is designed to store and manipulate data using the inherent relationships. It makes sense to me to use those strengths to their best advantage.
Unfortunately, some people expect the scripting language to do all the grunt work, whereas it is really best at retrieving and sending data from/to the database.
I use ASP and Access for small projects and find that I can speed things up quite considerably by carefully designing the database to run internal queries and then using ASP to get access to the results of those queries. I would suggest a similar approach.
Of course the scope of the project bowpay is talking about would require a database far more robust than Access, but the principles remain the same. Use each component, database and scripting, to do what it does best and your system should be quite speedy regardless of choice of database or scripting language.
This is assuming of course that you are/have/get competent professionals for each part.
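A rough sketch of that division of labour follows, written in Python with sqlite3 rather than ASP and Access (an assumption made only so the example is self-contained). The sales table, the sales_by_region view, and the function names are hypothetical. A view plays the part of the "internal query": it is defined once inside the database, and the script simply reads its results.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('north', 100), ('north', 50), ('south', 70);
    -- The "internal query": defined once, inside the database.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region;
""")

def report_from_view(conn):
    """The script only fetches finished results from the database."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()

def report_in_script(conn):
    """The grunt-work version: pull every row and add it up in the script."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return sorted(totals.items())
```

Both reports come out the same, but the second one ships every row to the script first, which is exactly the pattern described above as a common mistake.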
Onya
Woz
When the recordset is returned you can run multiple queries and sorts on it without having to go back to the database.
It also has an expire mechanism, so you can have the set refreshed at an interval set in your code.
Once you write, though, the set should be killed and refreshed.
From what I gathered this is the middle tier of data.
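Here is a minimal sketch of that idea in Python. The RecordsetCache class and its method names are invented for illustration and are not the API of any particular product. It keeps the rows in memory, refetches after a configurable interval, and is thrown away explicitly after a write.

```python
import time

class RecordsetCache:
    """Hold query results in memory with an expiry time (hypothetical helper)."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch        # callable that really hits the database
        self._ttl = ttl_seconds    # refresh interval set in your code
        self._clock = clock        # injectable so the expiry can be tested
        self._rows = None
        self._loaded_at = None

    def rows(self):
        """Return the cached rows, refetching if they are missing or expired."""
        now = self._clock()
        if self._rows is None or now - self._loaded_at >= self._ttl:
            self._rows = list(self._fetch())
            self._loaded_at = now
        return self._rows

    def invalidate(self):
        """Call after any write so the next read goes back to the database."""
        self._rows = None
```

Sorting or filtering the cached set is then ordinary in-memory work, for example sorted(cache.rows()) or a list comprehension over cache.rows(), with no extra trip to the database until the set expires or is invalidated.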
Hemsell,
This feature, querying a returned recordset, has been available for a while in most scripting languages, and I use it on my site using ColdFusion.
But in my practice, I have found that language speed is no longer an issue. Processors and RAM have come a long way; I think the slowest part of a web server nowadays will be the bandwidth used to connect the server to the end user. Make sure your host is located on a backbone and has redundant connections with all the large data carriers.
As already mentioned above, the server is the top priority for speed. Hardware (number of processors, GB of RAM) can be very pricey, so you will need to carefully design the server based on your expected usage and business plan.
Also, optimised Oracle functions, stored procedures, etc. on the DB server will have an impact on performance.
I recall reading a book entitled The Zen of Code Optimization a few
years ago. The author mentioned a code contest where folks were to
take a perfectly good implementation of the computer game "Life"
written in standard C and optimize it for performance. The winner's
entry was 300 TIMES faster than the original program, and also written in standard C.
One thing to keep in mind is that there are often tradeoffs in making
a program faster. Often the way you do so code wise makes a program
less maintainable. One example is unrolling loops; i.e. instead of a
do...while or perform...until construct, one just repeats the code as many
times as it executes (maybe with goto logic based on a flag value; note most NEVER use goto logic). Unrolling
loops was one technique the winner of the coding contest used. However,
code like that would likely increase the time and cost of maintenance
over the lifetime of the program. And people time generally costs more
than faster hardware when you look at total lifecycle cost of a system.
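For anyone who has not seen loop unrolling, here is a small sketch of what it looks like. It is written in Python only for readability; the contest entry was in C, and in an interpreted language the unrolled version may well not be faster at all. The function names are made up. The point is how much more code there is to read and maintain for the same result.

```python
def sum_rolled(values):
    """Plain loop: one addition per pass."""
    total = 0
    for v in values:
        total += v
    return total

def sum_unrolled(values):
    """Same result, with the loop body repeated four times per pass."""
    total = 0
    n = len(values)
    i = 0
    limit = n - (n % 4)      # largest multiple of 4 that fits in n
    while i < limit:
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    while i < n:             # handle the 0 to 3 leftover items
        total += values[i]
        i += 1
    return total
```

The unrolled version needs a separate clean-up loop for the leftovers, and a mistake in either the limit calculation or the index offsets would be easy to make and hard to spot, which is the maintenance cost being described.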
Second, one really needs to profile to see where the bottlenecks are,
and where the most speed improvement for the time and effort expended
are likely to come from. This applies not only to individual programs,
but the entire system, including hardware components as well.
As touched on in an earlier post, it does no good to make your program faster if your
server is bottlenecking on the network connection.
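As one way to profile an individual program, here is a minimal sketch using Python's standard cProfile and pstats modules. The three small functions are made-up stand-ins: slow_lookup sleeps to imitate a slow database or network call, and fast_math does a bit of pure computation.

```python
import cProfile
import io
import pstats
import time

def slow_lookup():
    time.sleep(0.05)   # stands in for a slow database or network call
    return 42

def fast_math():
    return sum(i * i for i in range(1000))

def handle_request():
    return slow_lookup() + fast_math()

def profile_report(func, top=5):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()
```

Calling profile_report(handle_request) and printing the report should show the time piling up under slow_lookup rather than fast_math, which tells you that tuning the arithmetic would be wasted effort. A profile of one program says nothing about the network or the disks, so the whole system still needs measuring separately.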
Likewise, there are often optimizations one can do to software
configurations that will make a big difference. Making sure the
interpreted byte code from a script is kept in -- and executed from --
the webserver's cache after the first execution, rather than reinterpreted
during each execution of the script is one example.
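In practice this job is done by the server software itself (mod_perl and the PHP opcode caches are the usual examples), so the following is only a toy sketch of the principle in Python. The cache dictionary and both function names are hypothetical. A script is compiled to byte code once, the compiled object is kept in memory, and it is recompiled only when the file's modification time changes.

```python
import os

_code_cache = {}   # path -> (mtime, compiled code object)

def load_compiled(path):
    """Return a compiled code object, recompiling only when the file changes."""
    mtime = os.stat(path).st_mtime_ns
    cached = _code_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]               # cache hit: no parsing, no compiling
    with open(path, "r", encoding="utf-8") as f:
        code = compile(f.read(), path, "exec")
    _code_cache[path] = (mtime, code)
    return code

def run_script(path, variables=None):
    """Execute the cached byte code in a fresh namespace and return it."""
    namespace = dict(variables or {})
    exec(load_compiled(path), namespace)
    return namespace
```

Each request still gets its own namespace, so state does not leak between runs; only the parse-and-compile step is skipped after the first execution.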
Consider what the slowest pieces of hardware are in a typical server.
Often it's the hard drive. One solution may be to place the database
on one fast SCSI hard drive, and the web server and scripts on another.
However, an even better gain in system speed/thruput may be gained by
adding enough RAM to keep the entire database in memory -- and configuring
the server so the database is indeed resident in memory, yet with
updates going to disk in case of power failure.
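A real database server does this through its own cache and logging settings, so take the following only as a toy illustration of the idea, using Python's sqlite3 module. The MemoryResidentDB class is hypothetical. At startup the on-disk database is copied into RAM, reads are answered from the in-memory copy, and every write goes to disk as well so nothing is lost in a power failure.

```python
import sqlite3

class MemoryResidentDB:
    """Serve reads from RAM and write through to disk (illustrative only)."""

    def __init__(self, disk_path):
        self.disk = sqlite3.connect(disk_path)
        self.mem = sqlite3.connect(":memory:")
        self.disk.backup(self.mem)     # copy the on-disk database into RAM

    def read(self, sql, params=()):
        """Answer queries from the in-memory copy only."""
        return self.mem.execute(sql, params).fetchall()

    def write(self, sql, params=()):
        """Apply the change to disk first, then to the in-memory copy."""
        with self.disk:                # commits on success
            self.disk.execute(sql, params)
        with self.mem:
            self.mem.execute(sql, params)
```

Writing to disk first means a crash can never leave the in-memory copy ahead of what is safely stored. This sketch assumes a single process; with several processes writing, the copies would drift apart and you would want the database server's own caching instead.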
Keep in mind that there may be practical limits of what one can do with
any given platform of hardware and software. Rather than trying to
get a level of performance a given specific infrastructure is simply
incapable of or not suited for, consider if another "tool" would be right
for the job. Maybe several servers behind a load balancer, for instance,
rather than trying to milk yet more performance out of a single server.
Moreover, while I too share a desire to find the "ultimate" language
to program in, to me, real world considerations like existing code
libraries, maturity of the code base, relative prevalence of security issues for
a language and the likely total life cycle cost of development in language
"X" vs. "Y" overshadow a theoretical advantage one language has over
another assuming guru level coding in either.
This brings up an interesting point. If you are talking code speed or
developer productivity, the skill of the developer must be taken into
account. Just because it's theoretically possible for a person to write
faster code in language "X" vs. language "Y" doesn't mean you will
realize the speed gain.
Moreover, I think it better to choose from among the languages that are popular
and considered good choices -- and become as expert in that language as
you can if your focus is on writing fast code. However, once you find
out what's involved in writing code with an emphasis on max speed, you
may decide it's wiser, more cost effective, and takes less total development
time to simply improve the hardware/software/server/network configuration.
Several years ago I worked as a contractor at a major nationwide
brokerage firm. Their computers ran at over 100% expected capacity during
the day. They had an online transaction they performed continuously
that really ate up a lot of their computer capacity. They had already had a lot
of folks working on a solution to improve performance -- for about 2 months.
A solution was decided upon and I was assigned to do the fix. When I looked
at what they wanted to do, I realized the solution would be a maintenance
nightmare. After I shared this with my project manager, he asked me
to see if I could think up anything better. I did. They could not
believe I came up with a simple solution that lots of very smart folks,
including system programmers, DBAs and senior programmers had missed.
I told them it wasn't because I was smarter -- it's because I simply
asked myself, while thinking about the problem, a simple question whose
answer led to a greatly simplified solution that would not only
perform better, but take less time and eliminate the maintenance
nightmare the original solution would have caused. They hadn't thought
to ask themselves that question -- because it was foreign to how they
thought about the subject domain. Better (or fresh) thinking about the
subject domain may lead to insights that yield much improved solutions.
They were quite pleased with the solution I implemented for them. It reduced resource
utilization by that online transaction by 69%. To them, that was
a smashing success, as the resources used by that transaction by all
their nationwide brokers all day every day was bogging down their systems greatly.
But alas, no bonus for me....sniff...
My point is that hard thinking about a problem often gives the best gains.
Doing things well up front during problem definition and design stages
often results in a faster, more trouble-free system than if one simply
chooses a fast language or just jumps into coding furiously. Think
the data model and task/page flow factors through well, and lots of the speed
gains will be realized without even trying hard.
After I implemented the above solution that resulted in the 69% reduction
in computer resources to do the same work as before, I decided to see
if there was anything else I could do. There was.
Buried deep in the code (they had copylibs/includes that called
copylibs/includes ad nauseam) was a logic error that resulted in
two calls to the database where only one was required. Since this
was in the main loop of code that was performed over and over again,
this simple one line code fix further cut resource utilization by about half.
Had someone taken a look up front to see where the bottlenecks were,
rather than assuming a big performance improvement project was
required, thousands of dollars in labor cost could have been avoided
with a two line code change. It took me about four hours to find and fix the
logic error. They had spent many months using several highly paid people
designing a solution to the problem before I was hired on and assigned the
task. Had I found and fixed the initial logic error up front, the
whole project would have been unnecessary. But that was not my option,
as I was simply hired in and assigned the project.
This goes back to my earlier advice to focus on getting good at
whatever language (and configuring of server software) you choose
and profiling the system and its components to see where the bottlenecks are.
Moreover, when one looks at the results of profiling, one has to use
one's brain to ask questions like "are these the results I would expect -- and what
might be wrong with this picture?" to find the type of logic error I found.
Hope the above is helpful food for thought -- and speed increases!
Take care,
Louis
[edited by: jatar_k at 4:49 pm (utc) on Sep. 5, 2002]