
Home / Forums Index / Local / Foo
Forum Library, Charter, Moderators: incrediBILL & lawman

Foo Forum

What is the worst mess you have cleared up?
Bad design, insecure code, incompetent SEO, bad coding, whatever

 7:47 pm on Oct 3, 2011 (gmt 0)

Can anyone beat this?

A site that I was originally asked to make a minor change to, but which, when I looked at it, I said needed a complete re-write.

Let's start with the security. There was no protection against SQL injection: user inputs were concatenated with strings to create the SQL. I do not know enough about PHP sessions to know if they were being used properly. None of that really mattered, because phpMyAdmin was running (a very old version, needless to say) without requiring a password. All you needed to know was the URL.
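For anyone wondering what the fix looks like: the cure for string-concatenated SQL is parameterized queries. A minimal sketch, in Python with sqlite3 rather than the site's PHP/MySQL, with a made-up `users` table purely for illustration:

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: the input is concatenated straight into the SQL text,
    # so name = "x' OR '1'='1" matches every row.
    return conn.execute(
        "SELECT id FROM users WHERE name = '" + name + "'"
    ).fetchall()

def find_user_safe(conn, name):
    # Parameterized: the driver passes the value separately from the
    # SQL text, so the same payload is treated as a literal name.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2 -- injection dumps the table
print(len(find_user_safe(conn, payload)))    # 0 -- no user has that name
```

In PHP the equivalent would be PDO or mysqli prepared statements; the principle is identical either way: the query text and the user-supplied values travel separately.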

The SEO was good, too. The same meta description and keywords on every single page. For good measure, both of those, and the title, were repeated as meta http-equiv=keywords (and =description, and =title).

Each of the main product pages appeared in several variants on different urls: the basic page, with an image gallery, a printable version, etc. All spiderable, with nothing to indicate which was canonical.

The PHP was in a class of its own - and that is not a reference to OOP. I really, really hope it was auto generated. The original developer did not seem to like using include, so the only thing that was included on most pages was a database class. There was a LOT of stuff repeated.

The CMS was clever. Each type of page had its own folder in /admin containing a complete set of standalone admin scripts. Well, not quite standalone, because they did use include for the database class and a set of input validators. All the validator functions did was check that the input was non-blank, except for the image validators, which checked size as well.

The only connection with the rest of the CMS was that they all checked the same settings in the PHP session to verify the user was logged in (all boilerplate code, repeated on each and every page).

The database structure was inventive. The home page, the "about us" page, the T & C, the front page of each section, and some settings each got their own one-row table. As this did not create enough tables, some pages were spread across multiple tables with the same primary key to tie them back together. There are good uses for SQL one-to-one relationships, but this was not one of them.

I persuaded the owner of the site that he was better off if I did a new site from scratch: same text and images, old URLs redirected, etc.

Once the new site was up, the previous developer contacted the site owner to warn him that the meta keywords were missing and that this could damage his search engine rankings.



 8:21 pm on Oct 3, 2011 (gmt 0)

Some sites are beyond repair and it's better to start over from scratch. But in some cases you don't have a choice and you have to deal with incompetent code and spaghetti structure created by people who know just enough to be dangerous. This is when I hate my work.


 8:33 pm on Oct 3, 2011 (gmt 0)

A 30-page site using a CMS that was so botched it exposed more than 1000 URLs. Duplicate content in the extreme.

Another site that, in order to write out hundreds of paragraphs of content, used a massive chunk of PHP code (reading from the database, then outputting as HTML) repeated hundreds of times with only minor changes; never a loop or counter to be seen.
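For contrast, here is roughly what those hundreds of copy-pasted read-then-echo blocks collapse into once a loop is used. A hypothetical sketch in Python with sqlite3 (the original was PHP/MySQL; the `paragraphs` table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paragraphs (position INTEGER, body TEXT)")
conn.executemany("INSERT INTO paragraphs VALUES (?, ?)",
                 [(1, "First paragraph."),
                  (2, "Second paragraph."),
                  (3, "Third paragraph.")])

# One query and one loop replace N copy-pasted read-then-echo blocks;
# adding a paragraph is now a database row, not another slab of code.
html = "".join(
    "<p>%s</p>" % body
    for (body,) in conn.execute(
        "SELECT body FROM paragraphs ORDER BY position")
)
print(html)
```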

Product pages with more than 30 parameters in the URL, only TWO of which had any effect on what was displayed.

Parameter based URLs where some of the parameter values were used to write data directly to the HTML page, such as the page title or section headings. You could type in a slightly different URL and show a page with the title "We sell overpriced junk". With a link from elsewhere, said URL could be indexed too.
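The defensive version of that pattern is to never reflect raw parameter values into markup: whitelist where possible and escape everything else. A sketch in Python (the `ALLOWED_SECTIONS` set and parameter names are made up for illustration):

```python
import html

# Hypothetical whitelist of section names the site actually has.
ALLOWED_SECTIONS = {"widgets", "gadgets", "contact"}

def page_title(params):
    """Build a page title from URL parameters without reflecting
    arbitrary visitor-supplied text into the markup."""
    section = params.get("section", "")
    if section not in ALLOWED_SECTIONS:
        section = "widgets"          # fall back to a known page
    # html.escape is belt-and-braces on top of the whitelist.
    return "<title>%s</title>" % html.escape(section.capitalize())

print(page_title({"section": "widgets"}))
# The injected "title" from a doctored URL never reaches the page:
print(page_title({"section": "We sell overpriced junk"}))
```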

A site supposedly using search-engine-friendly URLs, but hovering over any on-site link revealed a long non-www URL with multiple parameters - certainly not an SEF URL. Clicking the link showed the new page, and the browser URL bar showed a www URL that looked friendly. But how did we get there? The Live HTTP Headers extension for Firefox showed a series of three or four chained redirects invoked whenever any link was clicked. Total disaster.
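Chains like that are usually fixable by collapsing each source URL straight to its final destination, so every hop becomes a single redirect. A small illustrative sketch in Python (the URLs are invented):

```python
def flatten_redirects(redirects):
    """Collapse chains like a -> b -> c into single hops a -> c.
    `redirects` maps each source URL to its immediate target."""
    flat = {}
    for src in redirects:
        seen, target = {src}, redirects[src]
        # Follow the chain to its end, guarding against loops.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        flat[src] = target
    return flat

chain = {
    "http://example.com/page?id=1": "http://example.com/page-1",
    "http://example.com/page-1": "http://www.example.com/page-1",
    "http://www.example.com/page-1": "http://www.example.com/page-1/",
}
flat = flatten_redirects(chain)
print(flat["http://example.com/page?id=1"])
# every entry now points straight at the final URL in one hop
```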

This is a thread that could run and run...


 8:51 pm on Oct 3, 2011 (gmt 0)

Worst mess? How about one that never got to be a mess? Inside of two minutes (phone call) I stopped 'em and asked: "Are you out of your mind? Count me OUT!"

Some jobs just aren't worth the money, no matter how much is thrown at you.


 9:03 pm on Oct 3, 2011 (gmt 0)

Some jobs just aren't worth the money, no matter how much is thrown at you.


At my age I can do without the hassle. I reject any work that looks like it may cause any sort of trouble. I am a very generous person. I let others do this. ;)


 10:33 pm on Oct 3, 2011 (gmt 0)

Worst mess I would dearly love to clean up, if only I could get past trifling obstacles like not knowing a word of Java and having no clue as to the putative webmaster's password (why can't he do like on TV and just use his birthday?):

<br><nobr>&nbsp;&nbsp;&nbsp;<input type="Radio" name="font" value="NunacomU, Ballymun RO" checked>
<font face="verdana, helvetica, arial, sans-serif" size="2" color="#000080"><b>&nbsp;Unicode :</nobr>
<br><nobr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Pigiarniq, </nobr>
<br><nobr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NunacomU or </nobr>
<br><nobr>&nbsp;&nbsp;&nbsp;<input type="Radio" name="font" value="prosyl" >
<font face="verdana, helvetica, arial, sans-serif" size="2" color="#000080"><b>&nbsp;&nbsp;ProSyl or </nobr>

(with some line breaks added by me)

This particular page only has three-and-a-half nested tables (that is, four <table...> and three </table> ). The main page has a total of 45 (22 open, 23 close). I'm kidding. They only nest four deep. That's counting plenty of these:

<table border="0" cellpadding="0" cellspacing="0" width="100%" height="10">
<td height="10" align="center" bgcolor="#000080"><img SRC="/images/singlepix.gif" width="10" height="10"></td>

If you are thinking that this sounds suspiciously like {border: 10px solid #000080;} you are entirely right.

I couldn't find any of the pages that give their charset as 1252, though they exist. In general it's

<%String current_browser = request.getHeader("User-Agent");
if(current_browser.indexOf("MSIE") >= 0){%>
<META HTTP-EQUIV="Content-Type" content="NO-CACHE; text/html; charset=iso-8859-1">
<%}else if(current_browser.indexOf("Mozilla") >= 0){%>
<META HTTP-EQUIV="Content-Type" content="NO-CACHE; text/html; charset=UTF-8">

How this is intended to work is anyone's guess. I do remember that back in the day, MSIE couldn't read the Charset declaration, so maybe they decided not to bother.
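One plausible reason the MSIE branch comes first, not stated in the post: every legacy IE user-agent string also contains "Mozilla", so testing for "Mozilla" first would send IE users down the wrong branch. A quick Python demonstration (the UA strings are typical historical examples, not taken from the site):

```python
# Both legacy browsers announce themselves as "Mozilla", so the MSIE
# test must run first or every visitor falls into the Mozilla branch.
IE6_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
FIREFOX_UA = "Mozilla/5.0 (Windows NT 5.1; rv:1.9) Gecko/20100101 Firefox/3.6"

def pick_charset(ua):
    """Mimic the JSP's branch order: MSIE checked before Mozilla."""
    if "MSIE" in ua:
        return "iso-8859-1"
    if "Mozilla" in ua:
        return "UTF-8"
    return "UTF-8"

print(pick_charset(IE6_UA))      # iso-8859-1
print(pick_charset(FIREFOX_UA))  # UTF-8
```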

Honestly, it's like the proverbial idiot child. I never stop wondering what the java beans will come up with next.


 10:42 pm on Oct 3, 2011 (gmt 0)

Two of the sites I built for two different customers, which I let go when they hired in-house programmers, are now messed up beyond repair.

The homepages are now reminders of designs from 10 years ago ... with left- and right-scrolling text which you can only read in its entirety after seeing it pass three or four times. The table borders are so loud that the site is all skeleton; it looks like it hasn't "eaten" anything, well, basically since I left.

I used to show them to new customers proudly, now I have to let them go.


 10:50 pm on Oct 3, 2011 (gmt 0)

Oh, heck. You reminded me of a site where the content had originally been typed using Lotus Notes or something similar. It had then been imported into Microsoft Word (we're talking back in the Windows 98 era here) and saved as an HTML page, and then each page had later been re-edited several times using Microsoft FrontPage, Microsoft Word, and assorted other editors over a period of several years.

Each word on each page was surrounded by between 6 and 10 <font> tags, with conflicting and repeating styles all intertwined. Wrapped around each word were multiple mso:normal tags too. Headings were all done with more of the same bloat, plus a bunch of <b> tags, nested several deep and on a word-by-word basis. There was not an <h-something> tag to be seen anywhere.

The meta keywords tag consisted of more than 1000 words. The title tag was 100 words long and stuffed with multiple HTML font and other such tags.

Re-coding the pages as headings, paragraphs, lists, and links, styling them with some basic CSS, and keeping the exact same text content (after fixing quite a few spelling and grammar errors) reduced the file sizes from ~200-400 KB down to ~6-20 KB per HTML page.


 12:42 am on Oct 4, 2011 (gmt 0)

That description had me laughing out loud. Which is probably not how you felt at the time.

Did you even try to work with the html, or just export it as plain text and start over? :)


 6:31 am on Oct 4, 2011 (gmt 0)

Since it was mostly the same tags repeated over and over, a series of find and replace operations mostly did the trick. Some editing involved RegEx matches for <size ="X"> type issues, and it took less than half an hour or so to fix each page.
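For the curious, the kind of regex find-and-replace described above might look like this in Python (the sample markup is invented, loosely modelled on the snippets quoted in this thread; real Word-generated HTML needs a longer list of patterns than these two):

```python
import re

# Invented sample of the nested-font, nested-bold bloat being described.
bloated = ('<font face="Georgia" size="2"><font color="#000000">'
           '<b><b>Widgets</b></b></font></font>')

# Strip <font ...> / </font> tags and collapse runs of nested <b> tags.
# Looping until nothing changes handles arbitrary nesting depth.
clean = bloated
while True:
    stripped = re.sub(r'</?font[^>]*>', '', clean)
    stripped = re.sub(r'<b>(<b>)+', '<b>', stripped)
    stripped = re.sub(r'(</b>)+</b>', '</b>', stripped)
    if stripped == clean:
        break
    clean = stripped
print(clean)  # <b>Widgets</b>
```

A pass through the W3C validator afterwards, as described below, catches whatever the patterns missed.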

I believe that for some pages I exported the content as plain text and then simply added HTML tags where needed.

What I do recall is using the W3C HTML validator to make sure I had gotten them all. Additionally, a week after uploading the tidied pages, many of them were on Google's first page of results, rather than languishing dozens of pages down the list.


 4:50 pm on Oct 4, 2011 (gmt 0)

LOL . . . every post so far has described, almost to the letter, a typical day for me. :-)

<----- Janitor of the Internet

Worst-case scenarios are those that bear the hallmarks of work outsourced task by task.

- Design a site using placeholder Lorem Ipsum text and limited-vertical-height background images, which lock it into a fixed pixel width and height, so that when it renders with variable content you have to add tons of useless JavaScript and patched-up CSS to make it sort of work.

- From that, build a badly constructed site, no validation, patch it up with a ton of IE conditionals and as many hacks as your Google finger can find. Use tables "to their fullest capacity" for layout. Be sure there are tons of <br><br> (or <br/><br/>, because we all "know" that HTML is "dead") and &nbsp;'s everywhere. Oh, and exclude those useless H1 tags; they are ugly, and we want our tabbed pages to look "clean". :/

- Now put all that in a CMS.

- Now add social widgets.

- Now connect it with a mailing list database because we're too busy to enter the data manually.

- Now add RSS feeds.

- Insert a google calendar and show us how to use it.

- Integrate every form into SalesForce or some other "free" third party widget.

- Your CMS is too difficult for us to understand, we want WordPress because it's the hottest thing since sliced bread. Berate previous developer for using anything else.

- Now change the design, because we're smart and have browsed through a free template directory and have found one that will efficiently bog down our site with Cufon and 5 or 6 Javascript libraries to make our home page thumbs bounce up and down like a carnival ride. (Genius piece of work that it is, does anyone else find the widespread usage of the Simplepress theme a bit annoying?)

- Apply several hundred hacks and patches to make it all at least look the same due to the previous step.

- Stand back and wonder why it dud'n't wurk.

- Call me, or someone like me, to clean up the mess.

I deal with these every day.


 4:22 am on Oct 6, 2011 (gmt 0)

@g1smd, half an hour per page! It could have been worse, but it does not sound fun. At least you confirmed that clean HTML does help in the SERPs.

@rocknbil, so these sites have been worked on by multiple people? Same with the site I started the thread about: at least four developers over the years.


 4:56 pm on Oct 6, 2011 (gmt 0)

Most of the time, yes, but there are a few that looked like someone's "learning on the web" experience.

Speaking of "bad SEO", here's a good one.

Company sells widgets and has over 24 franchises in a specific geo area. Due to "conflict of interests," this was outsourced to a local SEO company that someone recommended. The site has showcases for each of the franchises.

The meta descriptions and titles thrown at us for the specific locations were something like "Come to Location A for the Best Widgets in town."

So for a moment let's forget G's guidelines to avoid using Stupid Fluff Words like "best" and "largest" in your meta info, let's look at the franchisees. How would Location B react to seeing Location A's main title as "Best in the area?"

Overall the SEO documents looked like they were picked out of a cat's a- uh, picked out of the air. Really sad, and people are making good money for this garbage!


 7:28 pm on Oct 6, 2011 (gmt 0)

It was a firm in Florida asking for help with AdWords. Honest to god, this firm turned over several mil, but the e-commerce cart was a joke.

In all honesty, I told them to redesign the entire site, then said goodbye.


 8:19 am on Oct 10, 2011 (gmt 0)

The first site I did for money, I took one look at and told them right off they had to ditch it.

The "webmaster" apparently had been set up with one of those expensive-type hosts, something along the lines of what you might expect from Quicken, except worse. He was reselling to my new client at a rate of something like $40 a month. The site itself was built on a really bad template: a couple of images, some cheesy JS gimmicks... errors galore. Just a giant mess.


 9:21 pm on Oct 10, 2011 (gmt 0)

At least my client was paying for the expensive hosting directly.

He still has it, I have no idea why. He is paying for good shared hosting, and extra for multiple IP addresses, and is currently running one functional site - and that is entirely static. He is now also paying another host for hosting for the site I redid for him (the one I started this thread about).


 5:13 pm on Oct 11, 2011 (gmt 0)

I just went through a similar mess. The user was editing the site in Word from Office XP (aka Word 2002).

With a domain of under 10 characters a link to news.htm was several thousand characters! Each monthly change he did left further dust bunnies in the html, increasing the bloat.

The default homepage index was 1.21 MB :-) After cleansing it dropped to a reasonable size. A small site of a few pages dropped from over 20 MB to under 1.5 MB!


 9:34 pm on Oct 28, 2011 (gmt 0)

I found an interesting error message recently (I added two line breaks).
An error occurred at line: 14 in the jsp file: /users/../includes/header.inc 

Generated servlet error:
[javac] Compiling 1 source file

C:\Program Files\Tomcat 5.0\work\Catalina\localhost\_\org
cannot resolve symbol
symbol : variable nsInputFieldStyle
location: class org.apache.jsp.users.login_jsp

OK, I'm not real strong on Windows, but doesn't C:\ mean your personal local hard drive?

btw, I've edited that quotation slightly. It isn't actually an error message; it's the first of 64 error messages. Overall, this site doesn't seem to be really strong on coding for the possibility of errors. There is a Java equivalent of "catch {}", isn't there?


Elsewhere in these Forums I asked a serious question about domain name registration, mainly to confirm a hunch. While double-checking something I mistyped an URL, leading to further sniffing around. Results are so over-the-top, they had to go here.

I know it's pretty standard to buy up a bunch of domain names and then redirect them all to the one you like best. Or, ahem, the one you think g### will like best. I don't think it's supposed to go like this, though.

Pay close attention:

name1 comes with .com, .org, .net and .uk extensions. (It's not really .uk. I made that up.)
name2 comes with the same four extensions.
All come in with- and without-www form. I make that sixteen domain names. The with-and-without forms always behave exactly the same, so let's call it 8.

name1.org, name1.net and name2.uk are registered to the "real" owner. (That is: if it were a business, this would be the owner of the physical store.)
name1.uk is registered to, I think, the person who physically built the site in the first place.
The other four variants are registered to, let's say, Employee. By some oversight he is also the contact person for all eight. To make up for this weird consistency, they use four different registrars. (Possibly five. It's in another country and I don't know the names.)

name1.net and name1.uk (but not name1.org) are parked on one of those mega-servers. They redirect (302, not 301) to www.name1.org (with www).
name2.uk is parked on the same mega-server. It redirects (302) to name1.com (without www).
All three (or six, if you prefer) have robots.txt files in their respective names, blocking everyone from everything.

The remaining five (or ten) names live on a dedicated server. Nobody has a robots.txt. There is no further redirecting; you can use any of the ten names and get served the same content. This probably explains why almost every line of the jsp code includes the element "sessionData.getUrlBase()" instead of a single named constant that you could change if your preferred domain name changed.

File under: Things You Can Only Get Away With If You Are The Sole Occupant Of Your Niche.

We will not talk about the part where pages require UTF-8 encoding to display correctly, but to enter text you have to manually change your browser's encoding to Latin-1 and try to ignore the way everything outside the input window temporarily changes to garbage.
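That UTF-8 versus Latin-1 dance is classic mojibake, and a short Python experiment shows exactly why the manual encoding switch "works":

```python
# Text stored as UTF-8 bytes but decoded as Latin-1 garbles everything
# outside plain ASCII -- which is what the input form forces you to do.
original = "café"
as_utf8_bytes = original.encode("utf-8")     # b'caf\xc3\xa9'
misread = as_utf8_bytes.decode("latin-1")    # what the Latin-1 view shows
print(misread)                               # cafÃ©

# Round-tripping the garbage back recovers the original text, which is
# why manually switching the browser's encoding lets input get through:
print(misread.encode("latin-1").decode("utf-8"))  # café
```

The real fix, of course, is declaring and using one encoding end to end, not asking visitors to toggle their browser.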


 2:15 pm on Nov 2, 2011 (gmt 0)

Just took over a site. Big company, honest. The same as above: the same titles, keywords, and descriptions on every page. No SEO; both the non-www and www versions live, index.htm live. They bought several domains around the name (I think six) and put the exact same site up on all of them, but the worst part was that the developer owned all of the domains. Checkout was PayPal. Traffic was coming from TV commercials, magazine advertisements, etc.

This company didn't have a merchant account so one had to be set up.

It has taken several weeks to get this done. I got the domains moved to a new registrar under their name, changed the main site's DNS to the new server, and fixed the duplicate issues. I am in the process of getting an SSL certificate installed, and have bought a cart for CC processing.

I think one of the biggest issues new website owners face is that they think they own their domain names but really don't. This company was really shocked when I asked them if they owned the domains and then told them the news.


 4:41 am on Nov 4, 2011 (gmt 0)

As my father would say, Now I understand everything. A bit of additional snooping reveals

<html xmlns:v="urn:schemas-microsoft-com:vml"

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="./{snip, snip}/filelist.xml">
<link rel=Edit-Time-Data href="./{snip, snip}/editdata.mso">
<!--[if !mso]>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
<title>Nw[ot3ymJ5 wvJDt4nw5</title>

I have obviously missed a few chapters in the development of Microsoft Word, Microsoft Office, Internet Explorer and/or FrontPage. Apparently one of the four can read the page designer's mind and deduce without prior instructions that the title is to be in a specific legacy font.* There's a bunch of @font-face later, but nothing before the title where it might be useful. In fact, the body text contains no non-Roman type at all. Unless you count the nonbreaking spaces, which are mysteriously required to be in the unicode font NunacomU.

* It is perfectly intelligible when fed into the proper transcoder. Which, incidentally, we are very lucky to have. I recently went looking for a legacy Greek transcoder-- which ought to be much easier to find, considering the numbers-- and found nothing but Word plug-ins.


 6:00 am on Nov 4, 2011 (gmt 0)

If you see traces of MS Office code on a web site, burn the whole thing down and start over with the insurance money ;)


 7:40 am on Nov 4, 2011 (gmt 0)

This thread shows that Google et al. often have to work very hard in order to be able to find intelligible life on the web.


 9:05 pm on Nov 4, 2011 (gmt 0)

12+ hours later, I'm still trying to figure out whether the above was a typo or just Too Subtle For Me :(


 7:41 am on Dec 10, 2011 (gmt 0)

I thought g1 was exaggerating a couple months back when he wrote:

Each word on each page was surrounded by between 6 and 10 <font> tags, with conflicting and repeating styles all intertwined.

I have just met the following-- on an individual person's solo site, dating back only a year or two all told. Line breaks added by me so you get the full glory:

<div id="content3">
<div style="display:block" >
<div style="display:block" >
<font face="Georgia, serif" color="#296717" size="6">
<font size="2">
<font color="#000000">
<font color="#000000" size="6">
<font size="2">
<font face="Impact, Charcoal, sans-serif"> Above is {product being marketed}!</font>
<br />
<font face="Impact, Charcoal, sans-serif">Each one of these {container type} contains</font>
<font face="Impact"><br />
{amount} of {product}<br />
to {more stuff}! Order now !</font>

There are a further four </div> tags which I have not disentangled. In one of the document's six external style sheets (to go with the five external javascript files) I find

float: right;
width: 43%;
margin-top: 15px;

div#content1 , div#content2 , div#content3 {
padding: 5px 10px;
overflow: hidden;
position: relative;

The site owner wants a new logo for his product, and likes my painting style. D'you suppose I could throw in a new web page while I'm at it?

Postscript: There was a recent thread asking about using CSS, Javascript or something else to auto-generate Copyright tags. I think the conclusion was that this really ought to be hard-coded. But at the bottom of the page is, yup,

<div style="display:block" >Content copyright <script type='text/javascript'>if (typeof(getCopyrightDate)=='function') document.write(getCopyrightDate(2009, null, '-')); else document.write((new Date()).getFullYear());</script>. {product name} !. All rights reserved.</div>

Got a vague idea the "product name" is supposed to be the page title, but the owner forgot, so it's called "Home_Page". I assume it's the editor's default.

Oh, and the whole thing is php, not html. I have no idea why.

Idly wondering what his htaccess looks like

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved