Moderators: coopster & jatar k & phranque

Perl Server Side CGI Scripting Forum

This 31 message thread spans 2 pages.
Unstable web layout when mixing utf-8 inputs/outputs.
css perl changing layout utf-8 unicode Linux
Paymaan




msg:3802941
 10:34 pm on Dec 8, 2008 (gmt 0)

Hi Everybody.

I am building a new website, and everything is written in utf-8. I have several js and css files, along with template html files. The output is xml and xhtml, and I have tried to write everything strictly.

Generally, my perl program reads from a csv text database and mixes some of it with html templates, which also contain css and javascript references to files I wrote or to libraries like reflection and prototype.

I have two servers to run and test these. The first is local, on my own computer running Linux Mint, Apache2, perl 5.8.8 etc. The second is a Debian server on which the final results will be hosted.

1. When I browse a static html file containing the same css and js etc., without passing the template through the perl file (there is one static index.html containing the same as what perl generates as index.html), there is no problem. Everything is stable, and a thousand refreshes on either of the two servers show no difference or instability.

2. When I browse the same code generated by my perl program, every now and then, when I refresh the page, it shows a little difference (a change in font size, line height etc.).

I guess this may be because I mix many files and a few of them might not be utf-8 (although I have tried to read them all and save them as utf-8 on Linux), so that the generated output misses some of the css info. But why only sometimes? Why doesn't this happen every time?

Can anybody please help?!

Paymaan.

 

phranque




msg:3816715
 7:03 am on Dec 31, 2008 (gmt 0)

do all your markup and stylesheets validate correctly?
do you ever get errors on your javascript console?
have you tried capturing the cached documents to see how they differ from a working version?
have you tried using the unix file command to verify that the data contained in your files is actually utf-8?
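
As a complement to the file command, here is a minimal Perl sketch that checks whether raw bytes decode as strict UTF-8, using the core Encode module. The helper names is_valid_utf8 and file_is_valid_utf8 are made up for illustration; the script takes the files to check as command-line arguments.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return 1 if the raw bytes decode cleanly as strict UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when a CHECK value is given
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# Read a file as raw bytes (no decoding layer) and check it.
sub file_is_valid_utf8 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;             # slurp mode: read the whole file at once
    my $bytes = <$fh>;
    close $fh;
    return is_valid_utf8($bytes);
}

# Usage: perl check_utf8.pl template.html site.css prototype.js
for my $path (@ARGV) {
    printf "%s: %s\n", $path,
        file_is_valid_utf8($path) ? 'valid UTF-8' : 'NOT valid UTF-8';
}
```

Note that a pure ASCII file also passes this check, since ASCII is a subset of UTF-8.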

krugs




msg:3816741
 8:03 am on Dec 31, 2008 (gmt 0)

Very doubtful it has anything to do with the UTF8 format or perl. It's probably a browser issue. Try a different browser than the one you have been using and see if it does the same thing.

Paymaan




msg:3816747
 8:20 am on Dec 31, 2008 (gmt 0)

Thanks for answering me, finally. This issue persists on different OSes and different browsers. I use Firefox 3, IE7 and Safari (on Windows). Although the layouts differ a little, that particular problem persists on all of them.

Could it be that some of my javascript (maybe, only maybe, script.aculo.us or prototype) is not in utf8 while the rest is?

wildbest




msg:3816761
 9:08 am on Dec 31, 2008 (gmt 0)

It might be a BOM (byte order mark) issue.

If your perl program generates a utf-8 encoded document but your server's content-type header defaults to charset=iso-8859-1, for instance, then your layout may change rather unpredictably, as the browser doesn't know how to render those invisible BOM bytes.

You have to ensure that the default content-type header output by both your servers is set to utf-8.
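
To see whether a template or data file actually starts with a BOM, a small sketch like the following may help. It assumes the content has been read as raw bytes; has_utf8_bom and strip_utf8_bom are illustrative names, not functions from any library.

```perl
use strict;
use warnings;

# The UTF-8 encoding of U+FEFF (the byte order mark) is the three bytes EF BB BF.
my $UTF8_BOM = "\xEF\xBB\xBF";

# Return 1 if the raw byte string starts with a UTF-8 BOM, 0 otherwise.
sub has_utf8_bom {
    my ($bytes) = @_;
    return substr($bytes, 0, 3) eq $UTF8_BOM ? 1 : 0;
}

# Return the byte string without a leading UTF-8 BOM, if one is present.
sub strip_utf8_bom {
    my ($bytes) = @_;
    $bytes =~ s/\A\xEF\xBB\xBF//;
    return $bytes;
}
```

Applying strip_utf8_bom to each template before the pieces are concatenated would keep stray BOM bytes out of the middle of the generated page.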

Paymaan




msg:3816791
 10:50 am on Dec 31, 2008 (gmt 0)

How can I find out which type is being generated?

wildbest




msg:3816796
 11:22 am on Dec 31, 2008 (gmt 0)

Well, use some server header tool.

For instance, google for server header checker. Enter the full URI of the document you are checking server headers for. In your case that will be one of the pages showing layout differences, i.e. one that passes the template through the perl file before the html is generated and served to the browser. The test must show Content-Type: text/html; charset=utf-8.

If server headers are set to utf-8 then you might need to check some other bottleneck.

Say, check whether your perl program correctly pulls and represents utf-8 encoded data from your database. BTW, are you sure your database is correctly set up to store utf-8 encoded data? One possibility is that the SQL files are written in Unicode with a BOM, which your MySQL cannot interpret correctly.
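
The header check described above can also be done locally. This is a rough sketch assuming LWP (libwww-perl) is installed; charset_of is an illustrative helper name and the URL in the usage comment is a placeholder. A HEAD request is used so that only the headers sent by the server are reported.

```perl
use strict;
use warnings;

# Extract the charset parameter from a Content-Type header value.
# Returns undef when no charset is declared.
sub charset_of {
    my ($content_type) = @_;
    return undef unless defined $content_type;
    return lc $1 if $content_type =~ /;\s*charset\s*=\s*"?([^\s;"]+)"?/i;
    return undef;
}

# Usage: perl check_header.pl http://example.com/cgi-bin/index.pl
if (my $url = shift @ARGV) {
    require LWP::UserAgent;    # loaded lazily, only when a URL is given
    my $ua  = LWP::UserAgent->new(timeout => 10);
    my $res = $ua->head($url); # HEAD: headers only, no body
    die 'Request failed: ' . $res->status_line . "\n" unless $res->is_success;

    my $ct      = $res->header('Content-Type');
    my $charset = charset_of($ct);
    printf "Content-Type: %s\n", defined $ct      ? $ct      : '(none)';
    printf "charset: %s\n",      defined $charset ? $charset : '(not declared)';
}
```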

phranque




msg:3816797
 11:24 am on Dec 31, 2008 (gmt 0)

you haven't stated specifically if you have a content type meta tag in your document head - something like this:
<meta http-equiv="content-type" content="text/html; charset=utf-8">

by the way, this thread probably has quite a bit more useful information than you need about utf-8:
Character encoding, entity references and UTF-8 [webmasterworld.com]

phranque




msg:3816798
 11:33 am on Dec 31, 2008 (gmt 0)

my perl program reads from a csv text database

are you sure your database is correctly setup to store utf-8 encoded data?

i misread the OP to mean text in csv files, not necessarily an actual db.
the db must indeed be configured for utf8.
if you are using mysql this may help:
MySQL :: MySQL 5.0 Reference Manual :: 9.1 Character Set Support [dev.mysql.com]

Paymaan




msg:3817123
 9:45 pm on Dec 31, 2008 (gmt 0)

Well, let me clear up some issues here. First, I have tested the site on two different servers: one local on my Linux Mint, which for sure doesn't alter the headers, and one on the real server of the site, which is Debian.

Both sites show the same behaviour. Then, about the database: it is written by myself and is a kind of CSV, or better said, a pipe-separated text file in utf-8.

And yes, all my html/xhtml files contain the correct content type, both in the xml header and in the html header, something like this:

--------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

-------

This is included on every page.

So what else might be the reason?

Something really simple came to my mind: I use xhtml for all tags, so does something like "text/xhtml; charset=utf-8" even exist?

Paymaan




msg:3817125
 9:49 pm on Dec 31, 2008 (gmt 0)

And these are the response headers from the Web Developer plugin in Firefox:

Date: Wed, 31 Dec 2008 21:47:56 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.4 with Suhosin-Patch mod_perl/2.0.3 Perl/v5.8.8
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html

200 OK

coopster




msg:3817144
 10:28 pm on Dec 31, 2008 (gmt 0)

Content-Type: text/html

Now go back up and read what wildbest said earlier.

phranque




msg:3817168
 12:03 am on Jan 1, 2009 (gmt 0)

what he said!

i just checked on an IIS-hosted site i've been working on and i thought i was ok because i have the following meta tag in the head:
<meta http-equiv="content-type" content="text/html; charset=utf-8">

if i do a "lwp-request -eSd 'http://example.com/'" i get the following Content-Type headers:
Content-Type: text/html
Content-Type: text/html; charset=utf-8

however if i use the firefox Web Developer plugin the response headers only show:
Content-Type: text/html

so it looks like IIS is adding its own HTTP Response header and that is getting precedence over the meta tag.
maybe the same thing is happening with your apache servers.
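
One possible explanation for the second Content-Type line, worth checking before blaming the server: by default LWP parses the head of an HTML response and adds headers derived from the markup (such as meta http-equiv and link elements) to its header listing. The sketch below uses HTML::HeadParser, the module LWP relies on for this, to show a Content-Type value coming purely from markup. The sample HTML is made up for illustration.

```perl
use strict;
use warnings;
use HTML::HeadParser;    # from the HTML-Parser distribution, which LWP depends on

# A made-up document head with a content-type meta tag.
my $html = <<'HTML';
<html><head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Example</title>
</head><body></body></html>
HTML

my $parser = HTML::HeadParser->new;
$parser->parse($html);

# This value comes from the markup, not from any web server.
my $from_meta = $parser->header('Content-Type');
print "Content-Type synthesized from <meta>: $from_meta\n";
```

When using LWP::UserAgent directly, calling $ua->parse_head(0) turns this behaviour off, so that only the headers actually sent by the server are listed.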

wildbest




msg:3817371
 4:17 pm on Jan 1, 2009 (gmt 0)

Thank you, coopster.

phranque, yes, not just IIS: with every web server (incl. Apache) the response header takes precedence over the meta tag if BOMs are at hand, because those byte order marks are sent just after the server headers, i.e. before any meta tag included in the html head section.

Paymaan, if you have no access to the web server settings where your site will be hosted, you can make a small modification to your template/perl program mix. You might want to include something like this:

<?perl
Header (type=>'content',val=>'text/html; charset=utf-8'); # HTTP Content-type header
?>

Hope that will resolve your issue.

coopster




msg:3817406
 5:06 pm on Jan 1, 2009 (gmt 0)

Thank you, coopster.

Well, I'm thinking Paymaan overlooked that part so I wanted to bring it back to the focus.

Paymaan, if you have no access to web server settings where your site will be hosted

And if you do you may want to consider using Apache to enforce the charset delivery.
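
As a sketch of what that could look like, Apache's core AddDefaultCharset directive adds a charset parameter to text/plain and text/html responses that do not already declare one. Where it goes (main configuration, a virtual host, or an .htaccess file) depends on the setup, and using it in .htaccess assumes the server permits FileInfo overrides.

```apache
# Server config, virtual host, directory section, or .htaccess
# (the latter only if AllowOverride FileInfo is in effect).
AddDefaultCharset utf-8
```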

Paymaan




msg:3817421
 6:21 pm on Jan 1, 2009 (gmt 0)

Thank you all. Phranque, I issued this command and received something even more suspicious:

lwp-request -eSd 'http://example.com/'
GET http://example.com/ --> 200 OK
Connection: close
Date: Thu, 01 Jan 2009 18:07:03 GMT
Accept-Ranges: bytes
ETag: "7ac0e6-367f-9c6bb1c0"
Server: Apache/2.0.54 (Debian GNU/Linux) FrontPage/5.0.2.2635 mod_python/3.1.3 Python/2.3.5 PHP/4.3.10-16 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_perl/1.999.21 Perl/v5.8.4
Content-Length: 13951
Content-Type: text/html
Content-Type: text/html; charset=utf-8
Last-Modified: Sun, 28 Dec 2008 10:13:35 GMT
Client-Date: Thu, 01 Jan 2009 18:13:44 GMT
Client-Response-Num: 1
Link: </css/site3print.css>; /="/"; media="print"; rel="stylesheet"; type="text/css"
Link: </css/site3.css>; /="/"; media="screen"; rel="stylesheet"; type="text/css"

Almost the same happened with the http://localhost/ server, and its content type matches. Also, the Web Developer plugin in Firefox gives utf-8 as the content encoding type on both servers, so do you still think this can be a BOM problem?

And wildbest, you pointed out something which I guess might lead us to something useful. I have my start page in two versions: one is a static, all-html page, and the other is the same but passes through the perl script. The static page is always the same, with no problem. You made me think that my perl script gives the wrong header before the templates, so is that the problem? Let me check this and I will report back in a few minutes.

[edited by: phranque at 7:10 am (utc) on Jan. 2, 2009]
[edit reason] exemplified/unlinked urls [/edit]

Paymaan




msg:3817427
 6:38 pm on Jan 1, 2009 (gmt 0)

OK, I don't know what to say yet. I changed my html header subroutine to this:

sub html_header
{
    # $HEADER is a global flag: send the header only once per request
    if ($HEADER != 1)
    {
        $HEADER = 1;
        print "content-type: text/html; charset=utf-8\n\n";
    }
}

and Firefox now says the content-type is text/html and the charset is utf-8. Also, I have the line "use utf8" at the start of my perl script.

But the same problem still happens on my localhost. Any other suggestions?

I have access to my own localhost Apache server, but the main server is not easily accessible, and I would prefer to resolve the problem without altering the server, if that is possible at all.

Something else: I have two content types produced, one inside the templates and one in the Perl script. Could this be a source of the problem?

wildbest




msg:3817439
 7:25 pm on Jan 1, 2009 (gmt 0)


if ($HEADER != 1)

What do you check with that if statement? There is no point because you would have the default header anyway. You have to overwrite it!

From what you've posted I can see that your web server's content-type header is:
Content-Type: text/html

This must read:
Content-Type: text/html; charset=utf-8.

Paymaan




msg:3817443
 8:00 pm on Jan 1, 2009 (gmt 0)

Wildbest, that if only checks whether the header has been sent before or not, preventing double headers, especially for errors.

The content-type includes charset=utf-8, and twice, especially since I added the charset to the perl script.

wildbest




msg:3817457
 8:41 pm on Jan 1, 2009 (gmt 0)

checks to see if the header is sent before or not

Yeah, but it's sent... So you do nothing to change it to utf-8!

It's a wrong approach to avoid "headers already sent" error!

If you have some heavy scripting/checks before html output you have to open a buffer, do whatever you have to do, get buffer contents into variable, clean buffer, send headers, print that variable. Voila...
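
In a Perl CGI script, the buffering approach described above could be sketched roughly as follows: build the whole page in a variable first, then send the header and the body together. build_page and render_response are hypothetical names standing in for the template and database processing.

```perl
use strict;
use warnings;

# Build the whole page in memory; nothing is sent to the client here.
# Hypothetical stand-in for the real template and database processing.
sub build_page {
    my (%args) = @_;
    my $body = '';
    $body .= "<html><head><title>$args{title}</title></head>\n";
    $body .= "<body><p>$args{text}</p></body></html>\n";
    return $body;
}

# Return the complete CGI response: header, blank line, then body.
# If building the page dies, an error page is returned instead,
# so the header is produced exactly once either way.
sub render_response {
    my (%args) = @_;
    my $body = eval { build_page(%args) };
    $body = "<html><body><p>An error occurred.</p></body></html>\n"
        unless defined $body;
    return "Content-Type: text/html; charset=utf-8\n\n" . $body;
}

print render_response(title => 'Example', text => 'Hello');
```

Because nothing is printed until the page is complete, there is no need for a flag that tracks whether the header has already been sent.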

Paymaan




msg:3817472
 9:10 pm on Jan 1, 2009 (gmt 0)

Maybe I do not understand exactly what you mean, but my program simply checks a few conditions before giving the whole template out. If any error happens, it will instead give the header out with an error message and quit.

If no problem happens, it will give the header and the output. That's all. So where am I making a mistake? Please describe it so I can understand.

wildbest




msg:3817496
 10:18 pm on Jan 1, 2009 (gmt 0)

Paymaan, make sure your code doesn't send any output before you change the server content-type header to utf-8. I understand that this is very general advice, but I'm afraid that's all I can do without going too much into scripting details.

Paymaan




msg:3817558
 12:16 am on Jan 2, 2009 (gmt 0)

The first thing my code sends to output is that Content-Type: line, now including charset=utf-8, but the instability still exists with dynamic pages.

[edited by: Paymaan at 12:28 am (utc) on Jan. 2, 2009]

krugs




msg:3817564
 12:28 am on Jan 2, 2009 (gmt 0)


Also, I have the line "use utf8" at the start of my perl script.

That has nothing to do with the output of your perl program; it enables UTF8 in the source code.

Quoted from the utf8 pragma documentation:

Do not use this pragma for anything else than telling Perl that your script is written in UTF-8.

[perldoc.perl.org...]
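
To illustrate the distinction, here is a small sketch assuming the script itself is saved as UTF-8: "use utf8" only affects how string literals in the source are read, while input and output need explicit decoding and encoding, for example through I/O layers or the core Encode module. The file name data.txt in the comment is hypothetical.

```perl
use strict;
use warnings;
use utf8;                        # only says: this source file is UTF-8
use Encode qw(encode decode);

my $text = "café";               # a string of characters, thanks to "use utf8"

# Output must be encoded explicitly, e.g. with an I/O layer on STDOUT ...
binmode(STDOUT, ':encoding(UTF-8)');

# ... and bytes read from files must be decoded on the way in:
# open my $in, '<:encoding(UTF-8)', 'data.txt' or die "Cannot open: $!";

# The same conversions done by hand:
my $bytes = encode('UTF-8', $text);    # characters -> UTF-8 bytes
my $chars = decode('UTF-8', $bytes);   # UTF-8 bytes -> characters

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);
```

Mixing decoded strings with undecoded bytes in one output is a common source of garbled characters, so it helps to decode everything on input and encode once on output.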

krugs




msg:3817566
 12:36 am on Jan 2, 2009 (gmt 0)


The first thing my code sends to output is that Content-Type: line, now including charset=utf-8, but the instability still exists with dynamic pages.

I still think this is a browser/server/caching problem. The fact that the pages are delivered dynamically by your script might be contributing to the problem. But the fact that it seems to happen only occasionally in thousands of page refreshes suggests it's nothing to worry about either. Few people are going to refresh your page like that over and over.

The static page is probably cached locally on your computer, and the dynamic page is not. Every time you refresh the dynamic page, the page and the style sheet have to be fetched and parsed, and sometimes there is a bit of a timing issue where the page loads before the style sheet loads. There might be something in your html/css that also contributes to the problem.

Paymaan




msg:3818383
 9:18 am on Jan 3, 2009 (gmt 0)

I am still wondering why exactly this happens, and I am sure something somewhere in my code should be changed.

If it were once in a thousand times, as you say, there would be no concern. Actually, IE on Windows shows worse than that when it gets the page first and the images and CSS later, and I am not too concerned with the IE problem.

But the one we are talking about happens something like once every 5 to 10 refreshes, and when it happens, while nothing major breaks, most of the page content except one div is shifted about half a centimeter upwards. This is obvious to see, which makes me very uncomfortable.

wildbest




msg:3818429
 12:01 pm on Jan 3, 2009 (gmt 0)

actually IE on windows shows worse than that when it gets the page first and images and CSS later and I am not too much concerned with IE problem.

I still don't understand this. How does the browser get the page first and then the images/css later?! You should not allow this! You have to use output buffering to process the templates and db, and only when the final html is generated should you get the output buffer contents and print them!

But this one we talk happens something like every 5 to 10 times

If the default browser zoom level is different from 100% and if your template uses 1 px divs to simulate graphical elements like arrows or whatever, then yes, upward or downward shifts are possible as well.

Paymaan




msg:3818767
 6:10 am on Jan 4, 2009 (gmt 0)

About the first problem, which has not been the concern of this thread, I guess you are right, but as I remember I buffer the page and then output it. The CSS is inside the templates, not inside the Perl, and it has never had any problems on any of my other sites based on the same script. But none of them has been in Unicode Perl.

That problem is that when IE loads the pages, sometimes the paragraphs and images are not laid out correctly (text over an image, over a table, and an image over another image!), and when you refresh the page, everything is fixed from then on.

As for the main problem and the zoom level: all zoom levels are normal, and I don't use 1px divs. I only simulate a spacer in one of the templates, and that is with a png image. The problem happens on every dynamic page.

krugs




msg:3819024
 8:43 pm on Jan 4, 2009 (gmt 0)

I still think the problem has nothing to do with perl or CGI, but maybe it does. The fact that your perl code is UTF8 is also not a perl/cgi problem; at best it's a browser problem or some issue related to timing or caching. Maybe run the html code that gets generated by your scripts and templates through some html/css validators and see if there are things that need fixing.

phranque




msg:3819160
 1:53 am on Jan 5, 2009 (gmt 0)

have you tried capturing the document generated by your script and analyzing it to verify that the data contained within is actually 100% utf-8?
the unix file command can help you determine the encoding of file content.

i would imagine the css and javascript content should also be served utf8 encoded.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved