Forum Moderators: coopster & phranque

Fastest Web scripting language?

         

bowpay

7:32 am on Aug 29, 2002 (gmt 0)

10+ Year Member



OK guys, I've been reading the forum and getting some ideas on what scripting language I should use for my database. My main problem is that right now all I know is HTML and PHP. I need a database able to hold millions upon millions of records while providing the fastest load times available. This site will track very valuable information, so it also has to be set up for ecommerce. My question is: what language should I use? PHP, CGI, ASP... I did a site in PHP and it seems to have a problem with load times. I hear CGI is very hard to learn, and I would probably have to hire out for that. We are going to use Oracle, or at least want to. The site's front end will be done in HTML because to me it seems the fastest. Any suggestions would be greatly appreciated. If this is too hard, we are looking to hire a programmer. Our site, although I cannot say more right now, will be a major dot-com company if we can work out all the kinks. If you would like to know more, please post or send me an email.

PaulPaul

8:02 am on Aug 29, 2002 (gmt 0)

10+ Year Member



I think the real question you are asking is, what is the fastest way to retrieve data from a database containing millions of records.

IMHO, scripting languages are all very fast, and will not be your bottleneck in delivering content to the users.

The biggest time consumer will be the database and data retrieval.

There are many variables to tweak and adjust to retrieve data from a DB fast.

Some of these include:
-Server Processor power.
-Server RAM and speed of the hard drives.
-Database normalization.
-Building proper Indexes.
-Building proper SQL retrieval statements.
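
As a rough illustration of the indexing point in the list above, here is a minimal sketch in Python using the built-in SQLite module (not Oracle, which the thread is discussing). The `payments` table, its columns, and the index name are made up for the example. It asks the database for its query plan before and after an index is added on the column being searched.

```python
import sqlite3

# Hypothetical schema for illustration only; the thread never specifies one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO payments (account, amount) VALUES (?, ?)",
    ((f"acct{i % 1000}", i * 0.01) for i in range(10_000)),
)


def plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


query = "SELECT amount FROM payments WHERE account = ?"

# Without an index, the database has to scan the whole table.
plan_before = plan(query, ("acct42",))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_payments_account ON payments (account)")
plan_after = plan(query, ("acct42",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, and other databases have their own tools for this, but the idea carries over: check how a query is executed before blaming the scripting language.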

When that is all done and perfected, you should be able to achieve results similar to WebmasterWorld, eBay, and Google.

Hint #1: This probably takes a professional, maybe even a team.
Hint #2: Oracle is an excellent solution for a DB. The problem is the cost of implementation. Do some research, and call Oracle and Sun. You probably want a Sun SPARC system; although I hear they have gone down in price, they are still quite pricey.

But to answer the question in a PERL forum :) Perl/CGI is extremely fast, stable, and not that hard to learn.

Hope that helped.

Paul

bowpay

8:11 am on Aug 29, 2002 (gmt 0)

10+ Year Member



I was afraid of that :) We plan on getting a very nice setup; I'm just trying to cut costs now so we can get this database up for the time being to try things out. One more question, a little more specific to the site: if we were going to do a PayPal database, would you use the same languages and database? Where could we get a database already designed for this type of application, or could we?

PaulPaul

8:49 am on Aug 29, 2002 (gmt 0)

10+ Year Member



>>One more question, a little more specific to the site: if we were going to do a PayPal database, would you use the same languages and database? Where could we get a database already designed for this type of application, or could we?

I am a little unclear about what you mean by a PayPal database. If you mean a database to store information about clients who use PayPal, you can use the same database. I actually cannot imagine a reason to use a different database, except to give you a headache :)

If you are worried about hackers stealing your data, that is a legitimate concern and should be dealt with seriously. But creating another DB for this purpose will not solve the problem.

bowpay

8:54 am on Aug 29, 2002 (gmt 0)

10+ Year Member



Actually, the business is like PayPal, so we need a database just like PayPal's. But my problem is that all I know is HTML and PHP.

PaulPaul

9:01 am on Aug 29, 2002 (gmt 0)

10+ Year Member



A typical DB, any will do. Most financial institutions run on Sybase. Sybase also comes with a very high startup cost, but is considered the best DB for financial applications within the financial community.

bowpay

9:05 am on Aug 29, 2002 (gmt 0)

10+ Year Member



So doing it in PHP right now shouldn't be done? I just wanted to get this site working so we could turn the key next month and start the business. I guess what I'm asking is: can we design it with PHP and upgrade later without losing our database? If so, we will design in PHP.

PaulPaul

9:18 am on Aug 29, 2002 (gmt 0)

10+ Year Member



PHP is fine.

All the major scripting languages (CGI, PHP, CF, ASP) are "fast" enough.

The only real advantage of PHP, when compared to other scripting languages such as ASP or ColdFusion, is that it is open-source and cross-platform, and has extremely small startup costs.

bowpay

9:19 am on Aug 29, 2002 (gmt 0)

10+ Year Member



OK, so last question of the day, lol: if we did it in PHP and six months from now we want to go to an Oracle database with Perl/CGI, will we lose any information, and will the site be able to be transferred over easily?

PaulPaul

9:27 am on Aug 29, 2002 (gmt 0)

10+ Year Member



Changing from PHP to Perl/CGI would require all web scripting to be rewritten. Whenever you change programming languages, at the very least some rewriting is inevitable, although most of the time this ends up being beneficial in the long run.

>>...if we did it in PHP and six months from now we want to go to an Oracle database with Perl/CGI

Just to be clear, we both understand that PHP can easily be set up to work with Oracle, right?

bowpay

9:40 am on Aug 29, 2002 (gmt 0)

10+ Year Member



I do now, lol. I'm not a programmer and don't have much knowledge of it; I do design work, ideas, logos, etc. My partner handles all the back end. Of course he doesn't know CGI either, which is why we didn't know if we should wait, get the funding, then build the site, or just build it now with PHP and the resources we have.

jatar_k

4:54 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld bowpay,

>>did php on a site and it seems to have a problem with load times

Scripting languages don't create load times, programmers create load times.

You may be overlooking the fact that you have two separate decisions to make here: DB and scripting.

Oracle or Sybase are both very robust, powerful and expensive solutions. Either one of the two will do.

Perl or PHP will do it for you. If you are using millions of rows there will have to be a lot of testing done. Scalability will be your biggest issue. If you have sloppy queries or big loops you can easily slow the site to a crawl.
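
To make the "sloppy queries or big loops" point concrete, here is a minimal sketch in Python with SQLite. The `users` and `orders` tables and both function names are hypothetical, invented only for this example. The first function issues one query per user inside a loop; the second gets the same answer from a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
""")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i, 10.0) for i in range(1, 101) for _ in range(3)],
)


def totals_with_loop():
    """Sloppy version: one query for the users, then one more per user."""
    result, queries = {}, 1
    users = conn.execute("SELECT id, name FROM users").fetchall()
    for user_id, name in users:
        row = conn.execute(
            "SELECT SUM(total) FROM orders WHERE user_id = ?", (user_id,)
        ).fetchone()
        queries += 1
        result[name] = row[0]
    return result, queries


def totals_with_join():
    """Same answer from a single query that lets the database do the join."""
    rows = conn.execute("""
        SELECT u.name, SUM(o.total)
        FROM users u JOIN orders o ON o.user_id = u.id
        GROUP BY u.id, u.name
    """).fetchall()
    return dict(rows), 1


loop_result, loop_queries = totals_with_loop()
join_result, join_queries = totals_with_join()
print(loop_queries, join_queries)
```

With a hundred users the difference is invisible; with millions of rows and a network hop to the database on every query, the looped version is the kind of thing that slows a site to a crawl.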

PaulPaul, it is a PERL and PHP forum ;)

bowpay

9:08 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



I understand; what you're saying makes sense. We will start the design work in PHP. <snip> is the site I designed in PHP; however, I think because of all the stuff we have on it, it may take longer than average to load. It just seems as if it doesn't parse files as fast as some of the other sites. Again, it's probably our coding not being clean enough. <snip> is only the second site I have done in PHP, so I'm still learning. Thank you to both of you for really helping me out. I really appreciate this. :) This site rocks.

[edited by: jatar_k at 9:13 pm (utc) on Aug. 29, 2002]
[edit reason] no url drops see TOS [/edit]

jatar_k

9:16 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Sorry bowpay, we do not allow URLs; see the TOS [webmasterworld.com]

We try to keep things general and not discuss specific URLs.

Whenever you need it, you know where to find us. ;)

<added>not a bad site

bowpay

9:36 pm on Aug 29, 2002 (gmt 0)

10+ Year Member



Oops, sorry about that; I didn't read that. Won't happen again, I just wanted to show what I did. :)
Thanks again

jatar_k

9:41 pm on Aug 29, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



no problem and you're welcome.

Hemsell

1:32 am on Aug 30, 2002 (gmt 0)

10+ Year Member



Are you going to be doing a lot of selects or a lot of inserts?
That changes everything.
I just went to a developer conference on data access via ADO.NET.
There are quite a few benefits in the way it handles memory and recordsets. It also allows you to use XML and relational tables.
The gist of the class was using the methods of classes without creating instances of the classes. They load into memory when they are first called and remain there until the app closes.
When the recordset is returned, you can run multiple queries and sorts on it without having to go back to the database.
It has an expire mechanism too, so you can have the set refreshed at an interval set in your code.
Once you write, though, the set should be killed and refreshed.
From what I gathered, this is the middle tier of data.
It was a pretty good class for a free class: 5 hours of C# and ASP.NET and very little dog-N-pony.
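
The general idea described above (keep a result set in memory, sort and filter it there, refresh it on a timer, and throw it away after a write) can be sketched in a few lines. This is a plain Python illustration of the pattern, not ADO.NET code; the class name `CachedResultSet`, its methods, and the sample rows are all assumptions made up for the example.

```python
import time


class CachedResultSet:
    """Keep query results in memory; refetch after ttl_seconds or a write."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that runs the real query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._rows = None
        self._loaded_at = None
        self.fetch_count = 0         # how often the database was actually hit

    def rows(self):
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._ttl
        if expired:
            self._rows = list(self._fetch())
            self._loaded_at = now
            self.fetch_count += 1
        return self._rows

    def invalidate(self):
        """Call after any write so the next read goes back to the database."""
        self._loaded_at = None


# Stand-ins for a real query and a real clock.
fake_now = [0.0]


def fake_fetch():
    return [("alice", 30), ("bob", 10)]


cache = CachedResultSet(fake_fetch, ttl_seconds=60, clock=lambda: fake_now[0])

# Two different sorts on the cached set, but only one trip to the "database".
by_amount = sorted(cache.rows(), key=lambda row: row[1])
by_name = sorted(cache.rows(), key=lambda row: row[0])
print(cache.fetch_count)
```

The trade-off is staleness: anything read from the cache can be up to `ttl_seconds` old unless every write path remembers to call `invalidate()`.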

If it were mine, though, I would still go with a *Nix-based program because of the cost.
I imagine, and have seen, huge performance variances in databases based on who is maintaining them and who is writing code against them.
I have SQL Server on a dual P3 800 system with 1.5 gig of RAM and it is a dog, mostly cuz I ain't got a clue...
Which I guess just moots most of my not-so-pointed points here.

Todd

Woz

2:01 am on Aug 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One thing to consider is that, as jatar_k says, there are two issues here, Database and Scripting.

I am by no means an expert on either, but one mistake I see a lot of people falling into is expecting the database to be simply a raw storage facility with all the work being done by the scripts.

A good relational database of any flavour is designed to store and manipulate data using the inherent relationships. It makes sense to me to use those strengths to their best advantage.

Unfortunately, some people expect the scripting language to do all the grunt work, whereas it is really best at retrieving and sending data from/to the database.

I use ASP and Access for small projects and find that I can speed things up quite considerably by carefully designing the database to run internal queries and then using ASP to get access to the results of those queries. I would suggest a similar approach.

Of course the scope of the project bowpay is talking about would require a database far more robust than Access, but the principles remain the same. Use each component, database and scripting, to do what it does best and your system should be quite speedy regardless of choice of database or scripting language.
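
A small sketch of that division of labour, in Python with SQLite rather than ASP and Access: the `sales` table and the `region_totals` view are hypothetical names for the example. The view plays the role of a stored internal query, and the script only reads its results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 5), ("north", 7), ("south", 1)],
)

# Script does the grunt work: every row travels from the database to the script.
script_totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    script_totals[region] = script_totals.get(region, 0) + amount

# Database does the grunt work: the query lives in the database as a view,
# and the script fetches one summary row per region.
conn.execute("""
    CREATE VIEW region_totals AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region
""")
db_totals = dict(conn.execute("SELECT region, total FROM region_totals"))

print(script_totals == db_totals)
```

Both approaches give the same totals; the difference is how much data has to leave the database to get there.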

This is assuming of course that you are/have/get competent professionals for each part.

Onya
Woz

PaulPaul

2:34 am on Aug 30, 2002 (gmt 0)

10+ Year Member



>>When the recordset is returned you can run multiple queries and sorts on it without having to go back to the database.
>>It has an expire mechanism also so you can have the set refreshed at an interval set in your code.
>>Once you write though the set should be killed and refreshed.
>>From what I gathered this is the middle tier of data.

Hemsell,

Querying a returned recordset has been available for a while in most scripting languages, and I use it on my site with ColdFusion.

bowpay

5:08 am on Aug 30, 2002 (gmt 0)

10+ Year Member



Well, now I am totally lost, lol :) But I know what we need to do: just keep up the marketing and pitch this to a venture capitalist who will take the idea. I don't have the type of backing you guys are talking about. Need to find 1 million dollars. Easy enough; where is the VentureCapitalWorld website? lol. Thanks once again, guys. You have been helpful, up to the point you lost me. :)

Woz

5:27 am on Aug 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Haha, I think you have it, bowpay. The type of system you are talking about is serious stuff, and would require serious dollars to recruit the expertise if you don't already have it. Maybe a partner or venture capital is the way to go.

Wish you luck, let us know how you fare.

Onya
Woz

v_1_c

4:57 pm on Aug 31, 2002 (gmt 0)

10+ Year Member



I have found PHP to be a lot faster than ASP, but I've heard that JSP (Java Server Pages) is good when it comes to large-scale sites.

v1c

Lisa

7:23 am on Sep 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Speed in any language is determined by how close the language can get to machine code. C or assembly language would be the fastest since they get the closest. However, C is not a scripting language. For scripting languages, I would say look for one that can compile. Microsoft's ASP.NET stuff, I believe, compiles, so if you are using a Windows platform I would go that direction. But if you are going Unix-based I would say PHP, because it has the Zend Optimizer. If the scripting language needs to be parsed and interpreted every time it is run, then it will be slower.

But in my practice, I have found that language speed is no longer an issue. Processors and RAM have come a long way; I think the slowest part of a web server nowadays will be the bandwidth used to connect the server to the end user. Make sure your host is located on a backbone and has redundant connections with all the large data carriers.

bowpay

7:45 am on Sep 5, 2002 (gmt 0)

10+ Year Member



Thanks Lisa, that really helps. Well, if any of you programmers out there want to be part owners in an Internet payment business like PayPal, let me know; we need you. We will probably run it in PHP with an Oracle database. If you are wondering about the business, put it this way: do you see my name? Add an extension on to it, lol. I hope you mods don't get mad at that. The business plan is just about done, and I have contacted some huge Internet companies and am waiting for some of them to respond. I'm afraid two people starting this company is going to be just a little too hard. Well, once again, thank you all for helping me out, and have a great day!
Brian

aspdaddy

8:58 am on Sep 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IMO you should first be looking at building components written in C++ or Java for this, not scripting languages.

As already mentioned above, the server is the top priority for speed. Hardware (no. of processors, GBs of RAM) can be very pricey, so you will need to carefully design the server based on your expected usage and business plan.

Also optimised Oracle functions, stored procs etc on the db server will have an impact on performance.

stlouislouis

2:54 pm on Sep 5, 2002 (gmt 0)

10+ Year Member



FWIW, a few thoughts...first about an individual program, then about a system:

I recall reading a book entitled The Zen of Code Optimization a few
years ago. The author mentioned a code contest where folks were to
take a perfectly good implementation of the computer game "Life"
written in standard C and optimize it for performance. The winner's
entry was 300 TIMES faster than the original program, and also written in standard C.

One thing to keep in mind is that there are often tradeoffs in making
a program faster. Often the way you do so code-wise makes a program
less maintainable. One example is unrolling loops; i.e. instead of a
do...while or perform...until construct, one just repeats the code as many
times as it executes (maybe with goto logic based on a flag value; note most NEVER use goto logic). Unrolling
loops was one technique the winner of the coding contest used. However,
code like that would likely increase the time and cost of maintenance
over the lifetime of the program. And people time generally costs more
than faster hardware when you look at total lifecycle cost of a system.

Second, one really needs to profile to see where the bottlenecks are,
and where the most speed improvement for the time and effort expended
are likely to come from. This applies not only to individual programs,
but the entire system, including hardware components as well.
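
For the software side of that advice, a profiler does the measuring for you. Below is a minimal sketch using Python's standard `cProfile` and `pstats` modules; the three functions are hypothetical stand-ins, with `time.sleep` playing the part of a slow database call.

```python
import cProfile
import io
import pstats
import time


def slow_lookup():
    time.sleep(0.05)  # stand-in for a slow database round trip


def fast_render():
    return "".join(str(i) for i in range(1000))


def handle_request():
    slow_lookup()
    fast_render()


profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In a report like this the time is dominated by the lookup, not the rendering, which is the cue to look at the database or the query before touching the page code.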

As touched on in an earlier post, it does no good to make your program faster if your
server is bottlenecking on the network connection.

Likewise, there are often optimizations one can do to software
configurations that will make a big difference. Making sure the
interpreted byte code from a script is kept in -- and executed from --
the webserver's cache after the first execution, rather than reinterpreted
during each execution of the script is one example.

Consider what the slowest pieces of hardware are in a typical server.
Often it's the hard drive. One solution may be to place the database
on one fast SCSI hard drive, and the web server and scripts on another.
However, an even better gain in system speed/thruput may be gained by
adding enough RAM to keep the entire database in memory -- and configuring
the server so the database is indeed resident in memory, yet with
updates going to disk in case of power failure.

Keep in mind that there may be practical limits of what one can do with
any given platform of hardware and software. Rather than trying to
get a level of performance a given specific infrastructure is simply
incapable of or not suited for, consider if another "tool" would be right
for the job. Maybe several servers behind a load balancer, for instance,
rather than trying to milk yet more performance out of a single server.

Moreover, while I too share a desire to find the "ultimate" language
to program in, to me, real world considerations like existing code
libraries, maturity of the code base, relative prevalence of security issues for
a language and the likely total life cycle cost of development in language
"X" vs. "Y" overshadow a theoretical advantage one language has over
another assuming guru level coding in either.

This brings up an interesting point. If you are talking code speed or
developer productivity, the skill of the developer must be taken into
account. Just because it's theoretically possible for a person to write
faster code in language "X" vs. language "Y" doesn't mean you will
realize the speed gain.

Moreover, I think it better to choose from among the languages that are popular
and considered good choices -- and become as expert in that language as
you can if your focus is on writing fast code. However, once you find
out what's involved in writing code with an emphasis on max speed, you
may decide it's wiser, more cost effective, and takes less total development
time to simply improve the hardware/software/server/network configuration.

Several years ago I worked as a contractor at a major nationwide
brokerage firm. Their computers ran at over 100% expected capacity during
the day. They had an online transaction they performed continuously
that really ate up a lot of their computer capacity. They had already had a lot
of folks working on a solution to improve performance -- for about 2 months.

A solution was decided upon and I was assigned to do the fix. When I looked
at what they wanted to do, I realized the solution would be a maintenance
nightmare. After I shared this with my project manager, he asked me
to see if I could think up anything better. I did. They could not
believe I came up with a simple solution that lots of very smart folks,
including system programmers, DBAs and senior programmers had missed.

I told them it wasn't because I was smarter -- it's because I simply
asked myself, while thinking about the problem, a simple question whose
answer led to a greatly simplified solution that would not only
perform better, but take less time and eliminate the maintenance
nightmare the original solution would have caused. They hadn't thought
to ask themselves that question -- because it was foreign to how they
thought about the subject domain. Better (or fresh) thinking about the
subject domain may lead to insights that yield much improved solutions.

They were quite pleased with the solution I implemented for them. It reduced resource
utilization by that online transaction by 69%. To them, that was
a smashing success, as the resources used by that transaction by all
their nationwide brokers all day every day was bogging down their systems greatly.

But alas, no bonus for me....sniff...

My point is that hard thinking about a problem often gives the best gains.
Doing things well up front during problem definition and design stages
often results in a faster, more trouble-free system than if one simply
chooses a fast language or just jumps into coding furiously. Think
data model and task/page flow factors thru well, and lots of the speed
gains will be realized without even trying hard.

After I implemented the above solution that resulted in the 69% reduction
in computer resources to do the same work as before, I decided to see
if there was anything else I could do. There was.

Buried deep in the code (they had copylibs/includes that called
copylibs/includes ad nauseum) was a logic error that resulted in
two calls to the database where only one was required. Since this
was in the main loop of code that was performed over and over again,
this simple one line code fix further cut resource utilization by about half.
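
The story does not say what the actual logic error looked like, so the following is only a hypothetical reconstruction of the general shape of such a bug, written in Python. `CountingDatabase` and both `process_*` functions are invented for the example; the counter shows how a redundant call inside the main loop doubles the database traffic.

```python
class CountingDatabase:
    """Fake database that counts how many times it is called."""

    def __init__(self, rows):
        self._rows = rows
        self.calls = 0

    def get(self, key):
        self.calls += 1
        return self._rows.get(key)


def process_with_bug(db, keys):
    total = 0
    for key in keys:
        if db.get(key) is not None:   # first call: existence check
            total += db.get(key)      # second call fetches the same row again
    return total


def process_fixed(db, keys):
    total = 0
    for key in keys:
        value = db.get(key)           # single call, result reused
        if value is not None:
            total += value
    return total


rows = {f"txn{i}": i for i in range(1000)}
keys = list(rows)

buggy_db = CountingDatabase(rows)
fixed_db = CountingDatabase(rows)
buggy_total = process_with_bug(buggy_db, keys)
fixed_total = process_fixed(fixed_db, keys)

print(buggy_db.calls, fixed_db.calls)
```

Both versions return the same total, which is why this kind of error survives testing: nothing is wrong with the output, only with the cost of producing it.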

Had someone taken a look up front to see where the bottlenecks were,
rather than assuming a big performance improvement project was
required, thousands of dollars in labor cost could have been avoided
with a two-line code change. It took me about four hours to find and fix the
logic error. They had spent many months using several highly paid people
designing a solution to the problem before I was hired on and assigned the
task. Had I found and fixed the initial logic error up front, the
whole project would have been unnecessary. But that was not my option,
as I was simply hired in and assigned the project.

This goes back to my earlier advice to focus on getting good at
whatever language (and configuring of server software) you choose
and profiling the system and its components to see where the bottlenecks are.

Moreover, when one looks at the results of profiling, one has to use
one's brain to ask questions like "are these the results I would expect -- and what
might be wrong with this picture?" to find the type of logic error I found.

Hope the above is helpful food for thought -- and speed increases!

Take care,

Louis

[edited by: jatar_k at 4:49 pm (utc) on Sep. 5, 2002]
[edit reason] fixed broken b and i tags [/edit]