Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

URL structure redesign

most of the long, complex URLs are already indexed

         

online

2:23 pm on Jan 4, 2007 (gmt 0)

10+ Year Member



Hello everyone, really need your advice here, please, simply can't figure what to do at all.

The site is an online retail site, 600+ pages, been online for almost a year, and Google has about 300-400 of them indexed by now (approx. - I just used the "site:" command); last week it showed ~200.

It wouldn't be so bad, but the URLs look horrible: they are very long and complex, and you can find anything there - underscores, commas, about four "&" symbols in most of them... anything but keywords. I guess it's not a disaster for SEO, but it doesn't seem right at all, especially considering there's not enough keyword content for product items.

Do you think mod_rewrite and 301 redirects to new "user-friendly" URLs with keywords (same content on the pages) would be something worth doing here? Or should I stick with the current URLs so as not to ruin current rankings, since this is my first time doing SEO for this client? ;)

Also, from your experience, what are the effects of a redesign with 301s? While Google finds the new pages, indexes them and drops the old ones - do the rankings of the new pages fall compared to the old ones? If so, what time period for "recovery" should be anticipated?

look forward..
trying not to bite nails ..
online

online

5:05 pm on Jan 4, 2007 (gmt 0)

10+ Year Member



Both outcomes seem quite realistic to me in theory -

1st, when Google, as it indexes the new pages, gradually transfers its "opinion" of the old pages to the new ones without hurting rankings during this period;

and 2nd - "complete disaster"..;) I mean positions drop and it takes a long, long time to recover.

However, I've had no experience with it so far.
C'mon guys, any suggestions? I know you've been through this before ;)...

Patrick Taylor

5:32 pm on Jan 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I haven't done this myself, and wouldn't want to either - especially for a client to whom you are professionally accountable. For a year-old site, I think the time has passed and risks outweigh the potential benefits. There are lots of successful e-commerce sites with horrible URLs.

If the client is insisting, then I'd call in an expert and distance myself from the job.

man in poland

5:37 pm on Jan 4, 2007 (gmt 0)

10+ Year Member



If you have such horrid URLs, I suggest you bite the bullet and get them switched over to search-engine-friendly (and USER-FRIENDLY) URLs as soon as possible. The longer you leave it, the harder it will be. If you are in a competitive field, you really will be at a big disadvantage without decent URLs in the long run.

However, do take your time to read everything you possibly can about mod_rewrite and 301s - it's not that easy to get a handle on initially, but makes good sense with time.

Finally, make sure you run a headers checker on your new setup to check that old URLs are redirecting correctly with 301s, not 302s. Check, check and check again. I have a five-year-old site. After the first year I sorted out my horrid URLs to better ones using mod_rewrite. Last year I had to 301 quite a few pages for some technical reasons. Both transitions went perfectly, without any drops in ranking, but it took time and very careful planning. I am sure you can do it, and best of luck!
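For what it's worth, the sort of rule involved looks roughly like this - a hedged sketch only, where "product.php", "cat", "id" and "widgets" are made-up names standing in for whatever the site's real script and query string are:

```apache
RewriteEngine On

# 301 an old dynamic URL to its new keyword URL. Note the explicit
# R=301: a bare [R] defaults to 302 (temporary), which is exactly the
# mistake a headers checker should catch. The trailing "?" on the
# target strips the old query string from the redirect.
RewriteCond %{QUERY_STRING} ^cat=([0-9]+)&id=([0-9]+)$
RewriteRule ^product\.php$ /widgets/%1/%2/? [R=301,L]
```

A headers checker (or `curl -I "http://www.example.com/product.php?cat=1&id=2"`) should then report `301 Moved Permanently` with a `Location:` header pointing at the friendly URL.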

man in poland

5:44 pm on Jan 4, 2007 (gmt 0)

10+ Year Member



Patrick, I really have to disagree with you on this one - your recommendation seems a little defeatist to me! Sure, I agree that if one's technical ability is not that high, it makes absolute sense to call in an expert, but in principle a good webmaster should be doing everything in their power to improve their clients' site rankings...

I agree that there are plenty of profitable e-commerce sites with horrid urls, but this is precisely where a start-up e-commerce site can make a jump on the 'big guys' - by being smarter!

jdMorgan

6:21 pm on Jan 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Doing things in the right order and at the right time is everything:
  • Add code to rewrite (not redirect) the new friendly URLs to the old unfriendly ones needed by your script(s).
  • Change the links on your pages to use those new friendly URLs.
  • Get your responsive linking partners to link to the new friendly URLs.
  • Let this sit awhile, until you see the new URLs appear consistently in the SERPs for important pages.
  • Add code to 301 (permanently) redirect the unfriendly URLs to the friendly ones to handle non-updated inbound links.

    Don't take exceptional measures to do this fast or all at once, or you can "pull the rug out from under your site" in search. Proceed slowly and very deliberately with regard to your top-ranking pages and main landing pages.

    Someone here (I wish I could remember who, so as to give credit) has argued that starting with updating the links on your lowest-level, least-important pages (at step 2 in the list above) is a good plan, and I tend to agree -- build new internal supports for your top pages before removing the old supports. On a per-page basis, consider this a balancing act between maintaining the PageRank/link-pop support for a page, and avoiding long-term duplicate (old & new) URLs for the same page. This should work well for sites with a small number of well-ranked landing pages, and lots of supporting pages below -- for example, an e-commerce site with a few "main" pages and categories, and lots of product pages below that.
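    The first and last steps above might look something like this in .htaccess - a sketch under assumed names ("product.php", "cat", "id" and "widgets" are all hypothetical; substitute the site's real script and parameters):

```apache
RewriteEngine On

# Step 5, added only LATER: 301 old unfriendly URLs to the friendly
# ones. Matching against THE_REQUEST (the literal client request line)
# rather than QUERY_STRING keeps this rule from looping with the
# internal rewrite below, which changes QUERY_STRING but never
# THE_REQUEST.
RewriteCond %{THE_REQUEST} ^GET\ /product\.php\?cat=([0-9]+)&id=([0-9]+)\ HTTP
RewriteRule ^product\.php$ /widgets/%1/%2/? [R=301,L]

# Step 1: internally rewrite the new friendly URLs to the old script.
# No [R] flag, so this is a rewrite, not a redirect - the visitor (and
# spider) sees only the friendly URL.
RewriteRule ^widgets/([0-9]+)/([0-9]+)/$ /product.php?cat=$1&id=$2 [L]
```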

    Jim

    Patrick Taylor

    6:42 pm on Jan 4, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    man in poland, I actually agree with you. I practice nice clean URLs myself, but the context of the question seemed to be "hmmmm... I don't really like my client's URLs... maybe I should change them." Apologies if I've misread this, but that's how it seemed.

    Jim's advice below your post is what I meant by 'get an expert'. For some, the procedure would be technically difficult and would most likely hurt traffic for a while. So if this drop is going to cost the client some money, then really it should be requested by the client (though of course I don't know what the contract is in this case) and undertaken by someone professionally competent, with the client's acceptance of a likely drop in traffic for a while.

    I have one or two Zen Cart installs on the go (not my sites) and to attempt to clean up the URLs on those sites would be a big step to take - even if a good one in the very long run.

    man in poland

    7:58 pm on Jan 4, 2007 (gmt 0)

    10+ Year Member



    Hi Patrick,

    No offense intended and I trust none taken! I'm not familiar with out-of-the-box type scripts like the Zen shopping cart. The mess I made at the beginning of my site was all my own mess, and I knew how to sort it out. Trying to fix someone else's mess - well, you're right - I'm not up for that and nor should a webmaster be handling that sort of 'learning' on a client's time and money (IMHO)

    The risks/rewards clearly need to be spelled out to the client before proceeding. I have to say that if I were the client, my first question would be - "Why did you not tell me a year ago?"

    Decius

    8:20 pm on Jan 4, 2007 (gmt 0)

    10+ Year Member



    jdMorgan: I am not certain I agree with your suggestion entirely because, according to my understanding of it, for a while two URLs will point to the same content. I would be much more afraid of being nicked for duplicate content than of having the old URLs drop quickly out of the index for a short period of time.

    jdMorgan

    8:48 pm on Jan 4, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Yes, there's a lot of fear and doubt about duplicate content. In my opinion, duplicate content is not penalized or even filtered unless there are signs of intentional duplication of content. Think of all the sites out there where the same page resolves for example.com/page, www.example.com/page, and either of those URLs with *any* arbitrary query string appended. For search engines which support query strings in their index, this means that millions of all-static sites have an essentially-infinite number of URLs pointed to each of their pages.

    So even discounting the query string problem, most sites still have four duplicates of their home page available:
    example.com
    www.example.com
    example.com/index.html
    www.example.com/index.html
    (Substitute htm, shtml, php, asp, or cfm if you like)

    This is because the Webmasters haven't taken any steps to rectify these duplicates caused by default server configurations and the behaviour of common HTML authoring tools.

    Some of these sites even link to all of their own home page URL variants through ignorance or incompetence, and I don't see any evidence that any of them are actually "penalized" for it.
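    Rectifying those defaults takes only a couple of rules - a sketch assuming Apache with mod_rewrite, a www-canonical preference, and "index.html" as the DirectoryIndex name (adjust for your own setup):

```apache
RewriteEngine On

# Externally-requested /index.html (on either hostname) -> canonical
# home page, in a single hop. Checking THE_REQUEST avoids a loop when
# DirectoryIndex serves index.html internally for "/". This rule comes
# first so example.com/index.html doesn't take two jumps.
RewriteCond %{THE_REQUEST} ^GET\ /index\.html\ HTTP
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]

# Any other non-www request -> www.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```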

    So to quantify my statement, if you have four or fewer duplicates of any given page in your site's URL-space, the worst that will happen to you is that the PageRank/Link-pop of a page may be split across one or more of those URLs. Since I don't know the logarithmic base that G uses to calculate PR, I can't say what effect that might have on what you see in the Google Toolbar PR display, but in the linear PR domain we can assume that two URLs for one page could -- worst-case, assuming both are equally linked-to -- divide the linear page PR by two, splitting it equally between the two URLs. Four URLs, also equally-linked, could divide the page's linear PR by four.
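    The worst-case arithmetic here is simple enough to sketch (illustrative only - the real PR formula and the Toolbar's logarithm base are unknown, as noted above; 8 below is purely an assumed placeholder):

```python
import math

def linear_pr_per_url(page_pr: float, n_duplicate_urls: int) -> float:
    """Worst case: a page's linear PR split evenly across its duplicate URLs."""
    return page_pr / n_duplicate_urls

def toolbar_pr(linear_pr: float, base: float = 8.0) -> float:
    """Toolbar-style (logarithmic) view of a linear PR value.
    The base Google actually uses is unknown; 8 is an assumption."""
    return math.log(linear_pr, base)

# Two equally-linked URLs halve the linear PR; four quarter it.
print(linear_pr_per_url(100.0, 2))  # 50.0
print(linear_pr_per_url(100.0, 4))  # 25.0

# In the log (Toolbar) domain, a 2-way split costs only log_base(2):
# about a third of a "toolbar point" if the base really were 8.
print(round(toolbar_pr(100.0) - toolbar_pr(50.0), 2))  # 0.33
```

Which is one way of seeing why a temporary split is survivable: the visible (logarithmic) damage is much smaller than the linear split suggests.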

    However, we are discussing replacing old URLs with new ones here anyway, so a temporary split in PR is a given -- It *is* going to happen. As I stated above, the trick is to watch the changes being picked up by the search engines, and when they're solidly locked-on to a new URL, then complete the redirection for that old URL as soon as possible. So, there is an important timing element needed to minimize the disruption and limit it to a short period of time.

    If your site already has the split-PR problem described above, and if you fix that at the same time as re-architecting your site's URLs, then you can come out way ahead in the end.

    Jim

    [edited by: jdMorgan at 8:50 pm (utc) on Jan. 4, 2007]

    Patrick Taylor

    9:04 pm on Jan 4, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    man in poland, all is well.

    > I have to say that if I were the client, my first question would be - "Why did you not tell me a year ago?"

    I'm not sure what there would have been to tell, and this does of course highlight the immaturity of the web and much that goes with it (compared to other long-established professions). My guess is that most 'contracts' between website owners and whoever builds the sites are very vague on such matters as URL design and exactly how a site will comply with Google guidelines, how it might perform in one search engine or another, etc etc, not to mention who is actually responsible if (say) a site is dropped, penalised, filtered, whatever... for reasons that no-one may even fully understand.

    So really, if you were paid for the job a year ago and everyone is happy, think hard before opening up a can of worms like redoing all the site's URLs - unless it's part of a 'brief' with an agreed aim in mind, and you know what you're doing.

    man in poland

    9:06 pm on Jan 4, 2007 (gmt 0)

    10+ Year Member



    Jim,

    I agree with your point about fixing the split-PR issue, but I would suggest doing this at a different time rather than at the same time as redirecting old URLs to new ones. My experience has been that it's best to fix one thing, wait until it beds down, then fix another, particularly if that involves htaccess files, regular expressions and the like. It's really tempting, say, to fix the non-www to www issue (or vice versa) and restructure your site architecture with new URLs at the same time, but this can often cause 'double jump' issues, whereby the server gives three responses to an old URL - 301, 301, 200. In my experience, Google can handle one jump, but has difficulties with two. It probably thinks it is being spoofed...

    I'm probably not being very clear here but the bottom line is my approach would be:

    fix one thing at a time, check, leave for at least a few weeks, then fix the next thing.

    I've seen countless examples in these forums (fora!) where rankings tank and the webmaster struggles to know what the reasons are, because the drop coincides with a whole raft of changes made at once. But there's a thread on Changelogs which I think covers this better than I could...

    man in poland

    9:22 pm on Jan 4, 2007 (gmt 0)

    10+ Year Member



    Hi Patrick,

    I would have to argue that a year ago search-engine friendly urls were nothing new. It seemed to start gaining a lot of coverage around 2002, with a decent number of articles emerging at that time. I'd say a webmaster would have had an excuse if asked "why did you not tell me a year ago?" in 2003 or 2004. But not now, surely?

    But point taken. Anyone who tackles this needs to know what they are doing. As webmastering becomes more complicated, it does get harder and harder to keep on top of it all. However, I'd have to say that creating sites with good spiderable architecture should be an absolute basic that a client should expect of any webmaster. It shouldn't be the obligation of the client to ask questions about things he's personally not expected to know! The onus is on the provider - the webmaster in this case.

    I'm rambling here, but I hope my point is clear-ish. When I buy a car, I can probably choose the colour I like and aircon or not. But I'm not expected to have a degree in engineering so I can quiz the manufacturer about the engine. I expect it to have the latest accepted technology behind it - fuel consumption, emissions and so on. Surely the same goes for websites?

    Patrick Taylor

    9:59 pm on Jan 4, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Well, if you're approached for an installation of an open source shopping cart (on a tight fixed budget), how much time are you going to spend explaining the ins and outs of URL design for search engines? It will mean very little to the client unless you spend hours and hours of unpaid time before you even get the job - especially when you believe the URLs are actually spiderable, just not very pretty.

    Decius

    10:01 pm on Jan 4, 2007 (gmt 0)

    10+ Year Member



    jdmorgan: I think the www and non-www idea is a non-issue. Google will have accounted for that for every site already. Also, I think that is very different from having two entirely different URLs point to exactly the same content on the same server. This, to me, is the definition of duplicate content created by incorrect internal linking.

    It is abhorrent to me to permit viewing of the same content via two different URLs. The www and non-www issues are definitely problems (in regard to PR), which is why it is recommended that you never permit users to see your site both ways. All of the sites I run make sure you view every single page in the exact form I have arbitrarily set:

    www.domain.com/pagename/id/keyword.htm

    If you try to view it without the www, it does a 301 to the www version. If you try to view it by PHP, or without the keywords or even without the htm, it 301's to the correct page.

    So, I would have to say I think it is incorrect to advise people to have two different dynamic URLs pointing to the same content. This is probably exactly what Google is attempting to police, regardless of one's intentions. This is especially tricky ground given the outcries of people who are being penalized for duplicate content and have no idea why.

    In my opinion, you should set up the htaccess for the new URLs, then 301 all the old ones to the new ones and hope Google picks it up ASAP without much of a burp. Anything else is far more risky, IMO.

    [edited by: Decius at 10:03 pm (utc) on Jan. 4, 2007]

    jdMorgan

    5:03 am on Jan 5, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    MIP,

    > It's really tempting say to fix the non-www to www issue (or vice-versa) and restructure your site architecture with new urls (for example) but this can often cause 'double jump' issues, whereby the server gives 3 responses to an old url - 301, 301, 200.

    It's not necessary to use multiple redirects. You can do it all at once if the code is properly implemented.

    Actually, given a choice, I'd prefer to fix the canonical domain problem before even starting the URL re-design, as long as there was a considerable ranking difference and it favored my preferred domain. Otherwise, I'd prefer to build up that ranking difference before forcing it with redirects. This makes a good project while you go to all the URL re-design meetings and briefings. And of course, the best cure is prevention: On a new site, install the domain canonicalization redirect before the first page or even the robots.txt file goes up.
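    Doing it "all at once" just means each old URL jumps straight to its final form in a single 301 - a sketch with the same hypothetical names as earlier in the thread ("product.php", "cat", "id", "widgets"):

```apache
RewriteEngine On

# Old dynamic URL, on either hostname -> canonical host + friendly URL
# in ONE 301, instead of non-www -> www (301) -> friendly (301).
RewriteCond %{THE_REQUEST} ^GET\ /product\.php\?cat=([0-9]+)&id=([0-9]+)\ HTTP
RewriteRule ^product\.php$ http://www.example.com/widgets/%1/%2/? [R=301,L]

# Everything else: canonicalize the hostname.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Because the specific redirect precedes the host-canonicalization rule, no old URL ever makes the "double jump" MIP describes.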

    Decius,

    Google has often gotten confused about domain canonicalization, since www and non-www are different domains and they must use back-end processing to detect that the two domains are aliases. Sometimes that processing is apparently defective, maybe because their confidence threshold is set too high for cross-domain comparisons of extremely-dynamic pages, or maybe they just don't get it all done before the next re-spidering phase, but problems crop up from time-to-time and we can't take it for granted. A 301 from the non-preferred domain to the preferred domain is best practice.

    I'm not advising people to intentionally link to their own pages using multiple URLs, but rather stating that it is a fact that your internal links (all updated) and inbound links (relatively few updated) will and must co-exist. Based only on my own experience, the SEs like to see your site pointing to its new URLs before they see a massive number of redirects from old URLs to those new URLs. Because of all the latencies in the Web and in the SEs' indexing itself, I think it's best to let them find your internally-consistent link changes, and then roll in the 301's to correct the obsolete inbounds. Otherwise, because of the indexing and datacenter latencies, you risk them fetching a bunch of 301 redirects for URLs that are (apparently) still present on your own site -- actually, still present in their stored 'snapshot' of your site, containing a mixture of older and recently-updated pages, which to them represents 'current reality'. It wouldn't surprise me if that 'snapshot' were distributed too, with pieces of it older and newer at different datacenters, and internally inconsistent over intervals in the hours-to-daily range.

    Ignoring all these temporal intricacies, and differences of opinion on the above, the one thing that I'd say is critical is to *not* do the redirects first -- before updating all your own internal links. That would almost certainly put the "site quality meter" well into the red zone.

    Jim