All I hear is how slow it is
| 8:49 am on Aug 11, 2002 (gmt 0)|
I wrote a script that ran through a 500kb html file, removed all of the tables (330 as it turned out), wrote each table to its own file and then wrote the remainder back to the original file.
All of this was done very quickly (no actual numbers).
So where does it slow down?
| 12:29 pm on Aug 11, 2002 (gmt 0)|
Each computing language is designed for its set purpose. Perl in general is a great parsing and reporting language. You might be using lots of regex to parse through the HTML, and I believe the regex engine in Perl is written in C for optimisation. Therefore, if that's where your program spends most of its time, you will not see much slowdown. In fact, coding the entire thing in C might make its execution slower, because it is easier to make mistakes when you try to hack something up in C.
On the other hand, try to write a Perl-only program that does lots of computation. For example, matrix transformation or cryptographic problems, where there are lots of calculations in tight loops. You will find scripting languages are far slower than compiled languages.
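To make the "calculations in tight loops" case concrete, here is a minimal sketch in Python, used only as a stand-in for any interpreted scripting language (the thread is about Perl, and the same shape of test could be written there). The function names `matmul` and `time_matmul` are made up for the illustration. Every multiply-add in the inner loop goes through the interpreter, which is where this kind of code tends to lose ground to compiled languages.

```python
import time


def matmul(a, b):
    """Multiply two square matrices (lists of lists) with plain nested loops."""
    n = len(a)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            # Innermost loop: n multiply-adds, each one interpreted.
            for k in range(n):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result


def time_matmul(n):
    """Return the seconds taken to multiply two n x n matrices."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The work grows with n**3, so doubling n costs roughly eight times more.
    for n in (20, 40, 80):
        print(f"n={n}: {time_matmul(n):.4f} s")
```

Timings depend entirely on the machine, so none are quoted here; the point is only that the cost sits in the interpreted inner loop rather than in a C-backed routine such as the regex engine.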
| 4:50 am on Aug 12, 2002 (gmt 0)|
One of the big things that makes Perl slow is that it usually runs as CGI. This means a new process is created for each call to a Perl script from a web page. Thus, if you have 1,000 people accessing your guestbook at the same time, 1,000 different processes are created. Process creation is somewhat slow, and many of them at once can really slow things down.
That's why languages such as ASP and PHP are taking hold so well. They are more efficient.
| 5:21 am on Aug 12, 2002 (gmt 0)|
To back Rich up: Perl executes through CGI, which is an out of process technology which spawns a new process (an expensive operation) for every request. ASP, at least, is in process, which means it does not require a new process to be spawned to execute each request.
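A rough way to see the out-of-process cost is to time the same trivial "request" handled two ways: as a function call inside an already running interpreter, and by spawning a fresh interpreter each time, the way plain CGI does. This is a sketch in Python rather than Perl, and the names `handle_request`, `time_in_process` and `time_out_of_process` are invented for the illustration.

```python
import subprocess
import sys
import time


def handle_request():
    """A trivial request handler: the real work is deliberately tiny."""
    return "hello"


def time_in_process(requests):
    """Handle every request inside the current, already running process."""
    start = time.perf_counter()
    for _ in range(requests):
        handle_request()
    return time.perf_counter() - start


def time_out_of_process(requests):
    """Start a fresh interpreter per request, as plain CGI would."""
    start = time.perf_counter()
    for _ in range(requests):
        subprocess.run(
            [sys.executable, "-c", "print('hello')"],
            capture_output=True,
            check=True,
        )
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 20
    print(f"in process:     {time_in_process(n):.4f} s for {n} requests")
    print(f"out of process: {time_out_of_process(n):.4f} s for {n} requests")
```

With a handler this small, nearly all of the out-of-process time is interpreter start-up rather than useful work, which is the overhead being described.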
I haven't used PHP recently (my last experience was with PHP3). I think it also ran through CGI, out of process, but my PHP applications were just administrative backends and did not really test the performance issues. I will say, though, that we coded the high-performance front end of that app in C/Apache and didn't even consider coding it in PHP because we knew it would be performance critical.
All three of these languages are interpreted. This is a very large performance hit, because an interpreted language must translate the text to machine code at run time (while a compiled language does the translation when the code is compiled). It depends on the application, of course, but it can have a huge performance impact.
You really only notice this stuff if you are in a performance-stressed situation. If you are running just one process or have a small number of visitors, all three languages can perform quite adequately. You may need lots of traffic for the performance difference to show up.
This traps new web application developers all the time. You produce a great Perl application, test it and really shake it out. It works fine. You deploy it with some kind of promotion campaign and suddenly you are in big-time performance hell. All that interpreted code and out-of-process execution cause some kind of geometric progression of performance degradation as the number of requests goes up. I've been there, and it is not pretty. Do not be fooled by the apparently adequate performance of these tools in low-traffic situations into using them in a performance-intensive application.
The cool thing about perl, php and asp is that they are so accessible and hacker/developer friendly. You can easily and quickly throw together an application. They are also great for prototyping more performance intensive apps before translating them to C or a high performance compiled language.
The difference can be dramatic (orders of magnitude) in a high-performance situation, and unnoticeable in a low-performance one.
I am working on an ASP app as we speak in another window. It is great and I would recommend it (or Perl or PHP) in many situations. But, not on a significant web app that expects lots of traffic running complex logic.
| 5:28 am on Aug 12, 2002 (gmt 0)|
If you like perl and speed is an issue use mod_perl.
| 5:30 am on Aug 12, 2002 (gmt 0)|
I don't know much about mod_perl. Why do you say that? Is it compiled? Is it in process? What are its advantages?
| 5:41 am on Aug 12, 2002 (gmt 0)|
It is compiled once at the initial call, and the byte code sits in RAM thereafter. Because of this, mod_perl scripts have to be well written to prevent memory leaks. However, mod_perl has been clocked at speeds faster than compiled C and php4.
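The "compile once, keep the byte code in RAM" idea can be sketched as a small cache keyed by script name. This is a Python analogy, not mod_perl's actual mechanism; `run_script`, `code_cache` and `stats` are names invented for the example.

```python
# Compiled code objects live here for the lifetime of the server process.
code_cache = {}
stats = {"compiles": 0}


def run_script(name, source, env):
    """Run a script, compiling it only the first time it is requested."""
    code = code_cache.get(name)
    if code is None:
        # First call: pay the compilation cost and remember the result.
        code = compile(source, name, "exec")
        code_cache[name] = code
        stats["compiles"] += 1
    # Later calls reuse the cached code object with fresh variables.
    exec(code, env)
    return env


if __name__ == "__main__":
    source = "total = sum(range(n))"
    for n in (4, 5, 6):
        print(run_script("sum_script", source, {"n": n})["total"])
    print("compiles:", stats["compiles"])
```

Passing a fresh `env` on each call keeps one request's variables from piling up in the long-lived process, which is the kind of care the memory-leak warning is about.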
| 5:43 am on Aug 12, 2002 (gmt 0)|
cool. jit perl. Like java.
| 7:15 am on Aug 12, 2002 (gmt 0)|
PHP can now run as a pure Apache module too. Most installations do now.
Side by side, mod_perl and PHP are almost identical for speed. If you use a db, PHP gains a bit with its compiled SQL routines, whereas Perl uses the slow, bloated DBI package.
Even in a cgi environ, spawning a new process is not all that expensive under *nix any more. Most of the work is still cpu side, and with faster processors, the issue is declining. How many file accesses does spawning a new process cost Linux?
Still an issue, though, is disk access. Although we've seen rpm raised to 7200 most places and even 10k for some scsi drives, the overall throughput is still limited. I've got a 60 gig 5400 rpm drive with 8meg cache, and a 120gig 7200rpm drive with 8meg cache, and it's very hard to tell them apart speed-wise. The difference is negligible. However, I can tell the second the system wants to go to the old 15giger with only 512k cache. Cache is "cash".
I've learned there is a real art form to tweaking speed out of Perl code. As far as I can tell, that art is only learned from experience. I've not run into any holy grails of reference. There are just some things you can do and not do to speed it up.
The biggest of these is to realize that you can't "bolt on speed" after the code is written. In C or ML, you can pretty much fake it until crunch time and then optimize the code you have. With Perl, however, the speed has to go into the design process. There's only so much you can do after the fact without rewriting major sections of code.
The real joy of Perl, though, is the speed with which you can develop code. You can build some powerful apps in short order.
As an old-school BASIC programmer of many years, I find ASP interesting. I just can't see doing it with all the problems that MS servers have had and continue to have. Seems like you'd spend more time trying to figure out if the problem was yours or the system's. The reliability and speed aren't there (you can always tell an MS server because of the speed, or lack thereof).
So far, we've been down 8 hours here this year ( 6 because of the data center outage last month, and 2hrs for sys upgrades - not one time because the system itself failed) - that's pretty reliable.
| 6:35 am on Aug 13, 2002 (gmt 0)|
If you are comfortable using Perl (or if Perl functions are an issue) and speed is critical -- take a look at this page:
| 2:11 pm on Aug 15, 2002 (gmt 0)|
I thought originally mdharrold was not necessarily asking about server-side scripting environment for perl, but just execution time in general... Anyway.
Spawning a new process might be getting cheaper these days, as the OS overhead for context switching, setting up a new memory space, etc. is fixed while computers get faster each day. However, there are much more expensive costs associated with spawning a new process, which long-running application servers do not suffer.
First of all, there is dynamic library linking time, if some of the modules you load require lots of .DLL or .so files. With C++ dynamic libraries especially, it can be quite expensive for the C++ stub to re-establish all the pointers in memory.
Then there is application-level overhead, for example making a connection to the database. It might not be a concern with flat files or MySQL, but it can be slow to connect to some heavyweight DB backends. If each CGI process needs to open a new connection, the overhead adds up.
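The connection overhead is the easiest of these to picture. In a long-running process the connection can be opened once and reused by every request, as in this minimal sketch. It is in Python with the standard-library `sqlite3` module standing in for a heavyweight backend (an in-memory SQLite connection is actually cheap to open), and `get_connection`, `handle_request` and `stats` are names invented for the example.

```python
import sqlite3

# Lives as long as the server process does; a CGI script would lose it
# at the end of every request and have to reconnect.
_connection = None
stats = {"connects": 0}


def get_connection():
    """Open the database connection on first use, then keep reusing it."""
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(":memory:")
        stats["connects"] += 1
    return _connection


def handle_request(n):
    """Answer one request using the shared connection."""
    conn = get_connection()
    return conn.execute("SELECT ? * 2", (n,)).fetchone()[0]


if __name__ == "__main__":
    print([handle_request(i) for i in range(5)])
    print("connections opened:", stats["connects"])
```

However many requests are handled, the connection is opened once, whereas the plain CGI model would open one per request.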
There is more application-level overhead: initialising global variables from XML files, initialising a CORBA interface to make IIOP calls, etc. Some of this overhead won't be solved by using mod_php, but it is definitely solvable with mod_perl or FastCGI.
I personally prefer FastCGI because it allows me to write the CGI backend in whatever language I wish (in fact I dislike Perl, but that's another story). mod_perl runs the Perl interpreter inside the Apache process (like mod_php), whereas FastCGI links the webserver (not necessarily Apache) with an out-of-process, long-running CGI application using TCP, Unix domain sockets, or Win32 named pipes.
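The long-running-backend idea can be sketched without the real FastCGI wire protocol: a worker does its expensive start-up once, then answers many requests over a socket. The Python below uses a made-up one-line-per-request protocol, and a thread stands in for the separate backend process to keep the example self-contained. `worker`, `start_worker` and `send_request` are invented names, not part of any FastCGI library.

```python
import socket
import threading


def worker(conn):
    """Long-running backend: initialise once, then serve requests in a loop."""
    # One-time start-up cost (config parsing, DB connections, and so on).
    config = {"greeting": "hello"}
    served = 0
    with conn:
        reader = conn.makefile("r", encoding="utf-8", newline="\n")
        for line in reader:  # one request per line until the peer closes
            served += 1
            reply = f"{config['greeting']} {line.strip()} (request {served})\n"
            conn.sendall(reply.encode("utf-8"))


def start_worker():
    """Start the backend and return the web-server side of the socket."""
    server_side, worker_side = socket.socketpair()
    threading.Thread(target=worker, args=(worker_side,), daemon=True).start()
    return server_side


def send_request(sock, name):
    """Send one request line and read back one reply line."""
    sock.sendall((name + "\n").encode("utf-8"))
    data = b""
    while not data.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.decode("utf-8").strip()


if __name__ == "__main__":
    sock = start_worker()
    print(send_request(sock, "alice"))
    print(send_request(sock, "bob"))
    sock.close()
```

The request counter surviving between calls shows that state persists in the backend, which is what lets initialisation be paid once instead of on every request.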