One of the first things I learned when I worked with a big site was exactly the point that Wheeler makes here: content is not king, STRUCTURE is king. It's a lesson that I've taken with me to every site I work on now, of any size. If the technical structure - or even the information architecture - is a mess, then crawling, indexing and ranking for any content will be compromised.
So even though quality content is essential, I try to ensure that there's a clear and unambiguous structure in place as a primary foundation step. That way, powerful content can be found and appreciated - both now and in the future.
|So even though quality content is essential, I try to ensure that there's a clear and unambiguous structure in place as a primary foundation step. |
Could you elaborate on what you would consider a "clear and unambiguous structure"?
what sorts of things do you see people do that might sabotage their structure?
I don't think it takes a massive site to make structure more important than content - any site's SERPs depend on its crawlability.
(Oh, OK, an exception is where you have a 4 page site - it's kinda hard to hide all the pages from the crawler in that situation :) )
|Could you elaborate on what you would consider a "clear and unambiguous structure"? |
Have a clear taxonomy and hierarchical structure.
|what sorts of things do you see people do that might sabotage their structure? |
I often see people putting too many unrelated links on a page in an effort to reduce click depth. Having too many choices isn't clear to the user, and it isn't clear to Google. In general, what works well for the user works well for Google.
It's useful to remember the variety of possible meanings of "structure":
There are many ways to use structure in website design.
If you use more than a few dimensions and combine them, you end up in a mess of navigational pages.
If you use a strict hierarchic structure, you end up with a huge number of pages per level or with a scary level depth.
In my opinion a good structure must fit the specific section of the page. Getting this wrong can cause unintended consequences on a huge scale.
The challenge is to automate the creation of navigational elements and URL structures by recognising important patterns and semantic relations. Standardising on-page elements and reducing the variety of elements could be a first step to organise huge sites.
I've seen (and walked away from) sites where people are trapped by a crap (custom) CMS that they're heavily invested in and that is killing their site, so it's interesting to read that even in a company with Microsoft's resources there isn't the willingness to start again and do it right. This poor guy has to 'make do and mend' - even when you have multiple CMSs for one site that have generated 20 million URLs to get to 14,000 pages of content! Mind you, it's Microsoft so maybe not that surprising :)
If you want to see some good examples of this, visit some .edu and .gov websites. Many of them have lots of pages with unique useful content, but they don't rank as high as they deserve because the sites are poorly designed.
In mega-sites, the website's chaotic nature often reflects the size and complexity of the business itself. Upgrading the website in a significant way can require corporate change - and that's a tough nut to crack. Internal company politics can get in the way - as well as nearly antique "models" of how business should be run. A good example of a company that runs differently and therefore succeeded on the web is Zappos.
It's a funny thing, but even a one person "web team" can show the same patterns. It's as if that person has several internal fragments and they don't communicate well with each other. One fragment might be very visitor oriented and the other is search engine focused. But unless the person pushes each fragment toward a robust understanding of its focus area, they will undermine each other.
A classic example of this is following a particular list of "SEO actions" in a rote fashion - adopting the latest SEO buzz blindly. Many smaller sites actually hurt themselves trying to do "PR sculpting" or using the canonical link tag. It would have been better for those webmasters if they had never heard of the concepts.
When it comes to the microsoft.com site, I am a little surprised that they still use the URL to track the user's click path. They know about it, and they don't use the canonical link element (or haven't been able to). They have canonical issues all over the place, beginning with "/" and "/default.aspx".
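To make the "/" versus "/default.aspx" problem concrete, here's a minimal Python sketch of collapsing duplicate URL variants into one canonical form. The default-document names and the tracking parameter names (`clickpath`, `ref`) are assumptions made up for illustration, not anything microsoft.com is known to use.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed for illustration only: typical default documents and
# hypothetical click-path tracking parameters.
DEFAULT_DOCS = {"default.aspx", "index.html", "index.htm"}
TRACKING_PARAMS = {"clickpath", "ref"}

def canonical_url(url: str) -> str:
    """Return one canonical form for the URL variants that serve the same page."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    # "/default.aspx" and "/" are the same page: keep the directory form.
    if last.lower() in DEFAULT_DOCS:
        path = head + "/"
    # Drop tracking parameters, keep everything else in its original order.
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), ""))

print(canonical_url("http://Example.com/default.aspx?clickpath=home"))
print(canonical_url("http://example.com/products/Default.aspx?id=5&ref=nav"))
```

The same function could feed either a 301 redirect rule or the href of a canonical link element; which one fits depends on how much control you have over the platform.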
Wheeler really has his hands full just trying to prioritize that pile of problems!
Do most companies such as Microsoft (the type you refer to as mega sites) "get" SEO, btw? As in..working with very good consultants or having a great in-house team in place, already?
or are they hopelessly behind on this "new internet stuff"?:D
I don't have any corporate experience with megasites, but I've done a lot of work for a particular school at our friendly local major university. It's so many different kinds of nightmare, I can't even begin.
IMHO, methinks Microsoft is primarily concerned with deploying support information to their end users and enabling efficient crawling to save on site running costs; it's a given that their customers do not necessarily use Bing
and in any case, Bing, developed by a different team, would probably be designed to crawl all sites by rules for scalability reasons,
ergo SEO means something different for Microsoft
As a complete entity, no - huge corporations do not "get" SEO. As a matter of fact, for most big corporations there really is no one entity that you can call "the corporation". It's a fiction, a trick of language to even think of a corporation as "a complete entity". The systems are too diverse and only loosely connected.
Huge companies often have great in-house teams and great SEOs (Marshall Simmonds for the NY Times for example) but the corporate structure often hampers their work.
I usually take on big site consulting with a different mind-set. I need to be extremely pragmatic to get immediate wins for them, even while I work for corporate change. Some huge sites are making major strides in SEO these days. Others can only rank well for their own brands and trademarks - and some not even that much.
The difference with a company like Zappos or Amazon is that they were built for the web from the beginning.
Along these lines, Derrick Wheeler posted his "Mega SEO Oath" - pretty funny
|Until search engines and all XX,XXX people in my organization get smarter, |
I will continue to ask others to do more work than they would otherwise do
…until everyone hates me.
But I will work hard and do smart things that eventually get results
…so everyone loves me.
Until the algorithm changes
…and everyone hates me again.
In these circumstances, how much involvement should the CEO of a major organisation have with the SEO? I noticed Derrick's comments about Steve Ballmer, CEO of Microsoft, being the only one with overall responsibility for the website.
It almost smacks of holistic organisation at this level, and then finessing the sites on the web.
If the website is not the only presence of the company (like Zappos or Amazon) then I think the CEO should simply empower the web team and personally get out of their way. [**ahem, Ballmer **]
It depends how big the company is, and especially how much revenue is tied to the web - but in many, or possibly most cases, I think the website needs someone at the C-level.
Sorry Tedster - I don't exactly agree that for mega sites, structure is the king.
We did a textbook restructuring and turned a search-engine-hostile mega site with a couple of million pages (dynamic URLs, keyword cannibalization, dup content, etc.) into an SE-friendly site with flat IA, good interlinking, unique content, etc.
Google came, saw, recrawled, indexed everything fine - then it started dropping pages.
Bottom line: LOADS OF LINK EQUITY is the REAL Queen for mega sites! One can have a perfectly structured megasite, but it will implode if not backed by a massive link building campaign.
In the transition of moving WebmasterWorld to a new server [webmasterworld.com...] we lost one superb message in this thread that I remember... but I don't remember who posted it. :( It was an example of a hierarchical directory structure and a discussion of possible sample categories.
I know it was a lot of work and hope you have a copy of it. If you do, I'd appreciate seeing it posted again. I think it would help the discussion quite a bit.
@ Robert Charlton
Have a clear taxonomy and hierarchical structure.
Aren't Taxonomy and Hierarchy synonyms?
|Bottom line: LOADS OF LINK EQUITY is the REAL Queen for mega sites! One can have a perfectly structured megasite, but it will implode if not backed by a massive link building campaign. |
I'll disagree with the above. Links can only get you so far, then the rest is up to you.
|Massive link building campaign. |
Tell me, what would be a massive link campaign for a site with 25 million pages in the index?
You don't chase links when you have a megasite - they chase you!
I recently updated my site to be much better structured and while I may be a small fish in a big pond (several hundred URLs) I think how URLs relate may have a lot to do with structure. I'll go over what I've done and, whether it's relevant or not, I wouldn't mind constructive criticism, though at the same time I think it may give some people insight. Honestly I think many sites have very poor URL structures, which may be down to the owner/developer's inability to understand the content as a whole or simply the lack of perception that it's not organized. Often when working with clients I have to remind them that the way we look at a site is limited to us as others aren't building it.
I've structured my URLs the following way...
domain / section / area / page
We all know domains, duh. Depending on your site's content the way you organize content may vary. For example there may be a few "section-like" pages like about and contact alongside a huge number of very specific pages in, say, a product catalog.
Some "areas" of my site are direct pages themselves and I try to make "pages" not end with a trailing slash while "areas" that contain "pages" obviously end with a trailing slash; such "areas" contain an index with their own content to help human visitors better understand the "page" content, with a bullet list of URLs.
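As a rough illustration of the domain / section / area / page convention described above, here's a small Python sketch that reads a URL's path and reports whether it is an "area" (trailing slash) or a "page" (no trailing slash). The function name and the example.com URLs are hypothetical; only the trailing-slash rule comes from the post.

```python
from urllib.parse import urlsplit

def describe_url(url: str) -> dict:
    """Report the hierarchy levels of a URL and whether it is an "area" or a "page"."""
    path = urlsplit(url).path or "/"
    # Non-empty path segments, in order: section, area, page, ...
    levels = [segment for segment in path.split("/") if segment]
    # Convention from the post: areas end with a slash, pages do not.
    kind = "area" if path.endswith("/") else "page"
    return {"kind": kind, "levels": levels}

print(describe_url("http://example.com/catalog/widgets/"))
print(describe_url("http://example.com/catalog/widgets/blue-widget"))
```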
Clean looking URLs with keywords seem to be the way to go...
I also use a hierarchy of links (#location) that clearly shows how URLs relate both to search engines as well as to human visitors.
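Reading the "hierarchy of links" above as a breadcrumb trail (that reading is an assumption), here's a minimal Python sketch that derives one from the URL path alone. The label formatting and the example.com URL are made up for illustration; a real site would more likely pull labels from its database.

```python
from urllib.parse import urlsplit

def breadcrumb_trail(url: str) -> list:
    """Build (label, href) pairs from the site root down to the given URL."""
    parts = urlsplit(url)
    root = f"{parts.scheme}://{parts.netloc}/"
    trail = [("Home", root)]
    segments = [segment for segment in parts.path.split("/") if segment]
    href = root
    for position, segment in enumerate(segments):
        is_last = position == len(segments) - 1
        # Intermediate levels are "areas" and get a trailing slash;
        # the final level keeps whatever form the original URL had.
        keep_bare = is_last and not parts.path.endswith("/")
        href += segment + ("" if keep_bare else "/")
        trail.append((segment.replace("-", " ").title(), href))
    return trail

for label, href in breadcrumb_trail("http://example.com/catalog/widgets/blue-widget"):
    print(label, href)
```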
Also I moved to a database which allowed me to automate a site index. With a mega-site you'd still be able to automate it with the URL limit on each site index page. I just don't see the point of having tons of content and no way to easily manage all of it. I think there's too much developer and way not enough designer in the mix. A developer makes something work, sure, though you need a designer who understands development, humans and spiders to bridge between whoever is paying to have the site and the people building and maintaining it.
On top of that I think having a proper understanding of HTML (e.g. proper use of headers and keeping your content at the top of the source with non-content below your content) also goes a long way to help search engines understand what is content and how various pages relate. Not having tons of excess code also helps, giving a greater content-to-code ratio.
If a detailed page has a URL that has the same format as a less detailed though more mainstream page (e.g. about) then not enough work was done to make URLs humanly readable. Details still matter even at a large scale. I don't know if file extensions for content come into play, though offhand I think not having them would be beneficial: your keywords would weigh heavier in shorter URLs, which may suggest the page is more on-topic.
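On the automated site index with a URL limit per page mentioned above, a minimal Python sketch might look like this. The limit of 100 links per index page and the sample paths are arbitrary assumptions; the post doesn't say what limit was used.

```python
def paginate_site_index(urls, per_page=100):
    """Split a site's URLs into ordered index pages of at most per_page links each."""
    ordered = sorted(urls)
    return [ordered[start:start + per_page] for start in range(0, len(ordered), per_page)]

# Hypothetical data: 250 product pages pulled from a database.
sample_urls = [f"/catalog/item-{number:03d}" for number in range(250)]
index_pages = paginate_site_index(sample_urls, per_page=100)
print(len(index_pages), [len(page) for page in index_pages])
```

Sorting first keeps related URLs next to each other on the same index page, which is the cheapest way to make the index reflect the URL hierarchy.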
|You don't chase links when you have a megasite - they chase you! |
Very well said. Another thing is not every site with millions of pages is a megasite unless you have something unique or good to offer, that naturally attracts links...
|I also use a hierarchy of links (#location) that clearly shows how URLs relate both to search engines as well as to human visitors. |
I have a question / issue with this statement.
Could not two different human visitors have different ideas of what "clear" is?
Take a company that sells car parts.
One person might consider the following as a clear structure:
part type -> Make -> Model -> Year
(all manufacturers of the part appear on the same page)
One might consider this as clear:
part type -> Manufacturer -> Part
(i.e., all parts made by XYZ company on one page)
another might find this more useful:
Function -> part
(e.g., All products that increase fuel economy on one page, or all products that boost torque on one page, or all products for improving the sound system / audio quality on one page)
So when it comes to structure / navigation, how do we know whose version of "clear" to use?
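To show that the three car-parts structures above are just different orderings of the same data, here's a small Python sketch that nests one catalog under any chosen sequence of facets. The parts, manufacturers and field names are all invented for illustration.

```python
# Hypothetical catalog records; every value here is made up.
PARTS = [
    {"name": "Acme air filter A1", "type": "air filter", "manufacturer": "Acme",
     "make": "Ford", "model": "Focus", "year": 2008, "function": "fuel economy"},
    {"name": "Zeta air filter Z9", "type": "air filter", "manufacturer": "Zeta",
     "make": "Ford", "model": "Focus", "year": 2008, "function": "fuel economy"},
    {"name": "Acme exhaust E5", "type": "exhaust", "manufacturer": "Acme",
     "make": "Honda", "model": "Civic", "year": 2006, "function": "torque"},
]

def build_tree(items, facets):
    """Nest item names under the given facet order, one dict level per facet."""
    tree = {}
    for item in items:
        node = tree
        # Walk (or create) one level per facet except the last...
        for facet in facets[:-1]:
            node = node.setdefault(item[facet], {})
        # ...and collect item names in a list under the last facet.
        node.setdefault(item[facets[-1]], []).append(item["name"])
    return tree

by_vehicle = build_tree(PARTS, ("type", "make", "model", "year"))
by_maker = build_tree(PARTS, ("type", "manufacturer"))
by_function = build_tree(PARTS, ("function",))
print(by_vehicle)
print(by_maker)
print(by_function)
```

Generating the trees is the easy part. Deciding which one becomes the primary crawlable navigation, and which ones stay as secondary views, is the judgment call the question is really about.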
One other thing:
Matt Cutts has said in his videos and interviews that a human site map is a good thing. He has also said that faceted navigation CAN be a bad thing.
So maybe there should be one "left-hand" navigational structure, and an alternative structure for how the site map is laid out?
|Aren't Taxonomy and Hierarchy synonyms? |
They're in the same territory, but they're not synonyms. You can create a hierarchy of pages that uses a crappy taxonomy for labeling the various levels. Conversely, you can have an excellent taxonomy but arrange those buckets in an ambiguous hierarchy.
|Tell me, what would be a massive link campaign for a site with 25 million pages in the index? |
You don't chase links when you have a megasite - they chase you!
Consideration needs to be given to whether the pages index and rank for the query terms in the first place. Relying on brand to build up links may work for some, but not others, and time is an issue.
Not all mega sites get easily linked to in the areas that count, so it may require some help.
Agreed. Although millions of links will "chase" the mega-site, those links may not be optimal. This is especially true given the chaotic canonical problems that often exist.
Creative and highly useful content (i.e. highly link-worthy), placed at an unambiguous URL, plus a touch of SM promotion for that content - these are the most effective approaches I see. This requires first getting closer to the market, knowing what information people need to have clarified and so on.
If the taxonomy is correct, and if tightly related topics support each other internally, then large link equity is not needed, only strong trusted authority links.
The internal linking on a domain of that size works positively to magnify rankings on its own, so long as there are a reasonable number of high level, inbound trusted links to the domain and unique content on most pages.
Love this thread. I'd like to make a recommendation that the next time Brett updates the sections here at WW, that he considers a section just for mega sites.
We mega site webmasters have unique challenges, and it gets tiring having people who don't understand these challenges make outlandish accusations because they are used to SEO'ing their little 4 page websites. Since few of us want to display our sites in our profiles, it's hard to identify who is working with a large site and who isn't.
To me, this is the biggest weakness of WW.
|So maybe there should be one "left-hand" navigational structure, and an alternative structure for how the site map is laid out? |
Yes, there can (and should) be several concurrent or alternative navigation structures, but not too many of them. You have to think through carefully what they're for, and be careful that they don't detract from each other.
And I wouldn't make the human site map my home page, unless it's a very small site. Optimizing a very large site, IMO, is all about data organization and prioritization.
For purposes of discussion, I'm sloppily going to use "PageRank" interchangeably with link juice, link equity, link love, etc.
Thinking just about top-down nav structure for a moment... you want to distribute your homepage link juice wisely, which means that you don't want to give any page a much larger share of link juice than it needs, nor do you want to give it a much smaller share than it needs. While you also have lower level category pages as inbound link destinations, it's helpful to start with a logical home page structure....
In the example that had been posted previously in this thread (which got lost in the server migration)... there was a suggestion that all products be funneled through one "Products" link from home. I don't think that was a wise idea... it's much too narrow.
What the poster suggested as his second level pages... main product categories... were in fact a much better set of links for home. You need to organize your products into main categories and then subcategories, and to link to those main categories from home.
What often happens, though, is that there's a huge temptation to link to your subcategories from home as well. Some product sites also link to individual products from home. Some of this can be done for emphasis, but the more links you have on your home page, the smaller is the share of PageRank-associated link equity that is distributed by each link to the pages below.
So the more subcategories you link to from home, the less link juice each of your main categories gets... and very quickly, unless you're very careful, you put yourself into a position where, if you link to very many subcategories from home, you've almost got to link to all of them. This, IMO, simply doesn't scale very well in terms of semantic or structural clarity, and it fights usability.
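To put rough numbers on the dilution described above, here's a toy Python calculation. It assumes the simplified textbook PageRank model, where a page passes on a damped share of its score split evenly across its outgoing links. The 0.85 damping factor and the link counts are illustrative assumptions; how Google actually weights links isn't public.

```python
DAMPING = 0.85  # damping factor from the textbook PageRank model; an assumption here

def share_per_link(page_score: float, links_on_page: int) -> float:
    """Score passed through each link when a page splits its damped score evenly."""
    return DAMPING * page_score / links_on_page

# Home page score normalised to 1.0 for the comparison.
for link_count in (10, 50, 200):
    print(link_count, "links ->", share_per_link(1.0, link_count), "per link")
```

In this toy model a home page with only 10 main-category links passes each one 8.5% of its score, while a home page carrying 200 links passes each one a bit over 0.4%.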
Just as a user might have a hard time picking out the right link from several hundred on a page, so might Google. A hierarchical structure with a well thought out taxonomy, with deep links only to your most important pages, is a much more strategic way to go.
The home page of dmoz.org [dmoz.org...] is a page I've pointed to over the years as a good example of a page where deep links to popular subcategories work well for users and for search engines.
IMO, small sites can't link to as many subcategories from home as megasites do because they don't have the link juice to support that much navigation.
I frankly don't want to get into much greater detail about what I do, because we've all got our secret sauces. But I see that there's a lot of stuff being pushed around the web about having very "flat" sites to preserve PageRank between levels, and I think that the link equity lost in the drop between levels is insignificant compared to the semantic confusion that many sites have with a hundred or more nav links on a page.
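A back-of-the-envelope version of the flat versus hierarchical comparison, as a Python sketch under the same kind of simplified model (the score passed on is damped by an assumed 0.85 and split evenly across links). The page counts are made up, and the model ignores everything else that feeds a page's score, such as external links and cross-linking between pages.

```python
DAMPING = 0.85  # assumed damping factor, as in the textbook PageRank model

def equity_at_depth(home_score: float, links_per_level: list) -> float:
    """Score reaching one page at the bottom of a chain of evenly split levels."""
    score = home_score
    for link_count in links_per_level:
        score = DAMPING * score / link_count
    return score

# 10,000 product pages either linked straight from home,
# or reached through 100 categories of 100 products each.
flat = equity_at_depth(1.0, [10_000])
two_level = equity_at_depth(1.0, [100, 100])
print("flat:", flat)
print("two-level:", two_level)
print("ratio:", two_level / flat)
```

Here the extra level costs each product page about 15% of what the flat layout would pass it, while the home page drops from 10,000 links to 100.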
|I think that the link equity lost in the drop between levels is insignificant compared to the semantic confusion that many sites have with a hundred or more nav links on a page. |
Very intelligent way to look at the process. I take supportive pillar topics in a hierarchical structure over flat file structures for this reason any day. Often 1 link is enough to gain top 5 rankings for long tail terms, and that trust and relevance is passed up to the level above it (and related content on the same level if related links and breadcrumbs are used well.)
WMT sitemaps plus a good hierarchy might be a good start as noted above. Pushing sitewide links into a 25 million page website to achieve a good spread of link juice would seem to me a bit of a tall order, even for Microsoft with a PR 9 or 10 home page.
Best to reserve this strategy for fewer areas of important emphasis. Mega sites with strong brand names probably have it easy for PR0 pages.
The thing that strikes me is the principles are the same for small and mega websites. What's different here is the magnitude of persons, departments and procedures that find it difficult to come together, and the longer it goes on the harder it becomes to alter the legacy build-up - even for Microsoft, I'd say. So it needs to be done right at the very start of the web build process. Digging one's way out is no fun.