page size

faster indexing?

         

giga

6:23 am on Nov 8, 2004 (gmt 0)

10+ Year Member



I have a theory that I'm sure most of you can confirm or deny. We typically build large websites full of A LOT of pages with real informative data, constructed for the end user. Another common trait of our sites is that they are typically graphics-heavy, meaning we have almost as much beautifully rendered graphics for the template of each page as the amount of data itself. The challenge we are having is that we would like to get as many pages indexed as soon as possible. I theorize, based on past results, that if I build some POS ugly website with simple color-coded borders (rather than shaded gif backgrounds) and simple html text rather than rendered smooth fonts, Google will eagerly gobble up tens of thousands of pages, whereas with the graphics-heavy site it seems to slowly chew and cautiously swallow each page. One example: we have a forum with 2 versions. One is a completely text-only version minus our background template (i.e. NO graphics), and Google indexes and adds each page as a backlink almost immediately and by the hundreds; however, it tends to grab only a few of the templated versions of the same pages. Based on this, does Google want us to build simple, non-user-friendly sites just to guarantee we can get all 100,000+ pages spidered? Or am I looking at this all wrong?

Giga

Patrick Taylor

12:47 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As I understand it, Google only looks at the code. It definitely doesn't care what the page looks like, and doesn't actually download images to look at. I would imagine it makes no difference to Google what file sizes your images are. I think "small pages" means small only in terms of the size of the actual page code that you see in the browser source. Apparently it is a good thing to have a higher proportion of indexable text content in relation to total page code but exactly what difference it makes, I don't know.

giga

6:27 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



No, from our experiments and experience, if you took two equal sites (in terms of PR and SEO strength), put them side by side, and had one site be text only and the other cluttered up with high-quality graphics and the same text, the text-only site would be indexed much quicker than the second (I think...). What do you think?

BigDave

8:15 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I cannot imagine that anyone at google would take the time to code such a thing. It makes absolutely no sense, and I have seen no evidence of it.

While that does not mean that it isn't happening, I would be much quicker to believe that it is other factors related to the sites.

For one thing, fancier web pages tend to have fancier ways of linking. If you love your graphics, there is a good chance that you also like to use JS in your links, or you are linking using images.

I could easily see Google having a preference for text links over image links. I don't necessarily agree with it, but I can understand it.
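To see which of these link styles a page actually uses, you can count them. The following is a minimal Python sketch, not anything Google is known to run. The names `LinkAudit` and `audit_links` and the three categories are made up for illustration, and treating any link with an `onclick` attribute as a JS link is an assumption.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Sort each <a href> on a page into text, image-only, or JavaScript links."""

    def __init__(self):
        super().__init__()
        self.counts = {"text": 0, "image": 0, "javascript": 0}
        self._in_link = False
        self._is_js = False
        self._has_text = False
        self._has_img = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            href = (attrs["href"] or "").strip().lower()
            self._in_link = True
            # Assumption: a javascript: URL or an onclick handler marks a JS link.
            self._is_js = href.startswith("javascript:") or "onclick" in attrs
            self._has_text = False
            self._has_img = False
        elif tag == "img" and self._in_link:
            self._has_img = True

    def handle_data(self, data):
        if self._in_link and data.strip():
            self._has_text = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            if self._is_js:
                self.counts["javascript"] += 1
            elif self._has_text:
                self.counts["text"] += 1
            elif self._has_img:
                self.counts["image"] += 1
            # Links with neither text nor an image are left uncounted.
            self._in_link = False


def audit_links(html):
    """Return a dict with the number of text, image, and JavaScript links."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    sample = (
        '<a href="/thread/1">Plain text link</a>'
        '<a href="/thread/2"><img src="btn.gif"></a>'
        '<a href="javascript:go(3)">Scripted link</a>'
    )
    print(audit_links(sample))
```

Running this over the templated and the text-only version of the same page would show whether the two differ in how they link, not just in how they look.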

What I have seen make the most difference in how fast a site is indexed is the number of deep links and the variety of pages receiving them. Is it possible that one of the forums just happens to have more people deep linking to threads?

How many different data points do you have (different sites) that show the varying behaviors?

Do your graphics intensive sites use any other tricks of the trade that you do not use on the simple sites?

How well do the different sites rank in Google? Higher ranking means that more people find it, which means more people will deep link to things of interest....

What else can you show me to back up your hypothesis?

giga

4:38 am on Nov 9, 2004 (gmt 0)

10+ Year Member



Here are a few things that support this theory:

-We run a vBulletin forum, which comes with a feature that makes more SEO-friendly pages in addition to the normal ones. They are very barebones-looking, with no graphics at all. I figure that the developers did this for a reason. Google used to favor our regular forum pages over these SEO-friendly versions, but lately this is no longer the case. I've noticed this change on other sites besides ours as well.

-Some of our competing sites are very plain looking, with a bare minimum of extra graphics, fancy HTML, etc. Many of them seem to do abnormally well considering in many cases their SEO is minimal at best.

-Blogs and similar sites that typically contain several pages of content and are not very graphic-intensive seem to be favored in the SERPs.

So in essence, I'm thinking that Google has a "real content" to "HTML code" ratio. The more actual data on a page compared to HTML formatting, JavaScript code, image maps, etc., the better it fares. Has anyone else noticed this?
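For anyone who wants to put a number on this "real content" to "HTML code" idea, here is a rough Python sketch. Whether Google uses any such ratio is not established anywhere in this thread; the function name `content_ratio` and the choice to count characters (instead of bytes or words) are assumptions made for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html):
    """Return visible-text characters divided by total source characters."""
    if not html:
        return 0.0
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) / len(html)


if __name__ == "__main__":
    lean = "<p>Hello world</p>"
    heavy = (
        '<table><tr><td><font face="Arial"><b>Hello world</b></font></td></tr></table>'
        "<script>var x = 1;</script>"
    )
    print(round(content_ratio(lean), 2), round(content_ratio(heavy), 2))
```

Comparing the figure for a regular forum page against its barebones twin at least tells you how different the two really are in markup weight, which is a separate question from how many graphics they reference.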

Tech2004

6:10 am on Nov 9, 2004 (gmt 0)

10+ Year Member



Yes, you are right. The text-only site is more SE friendly, but generally all SEs are programmed to ignore the pictures, JavaScript, Flash, etc. that bog down most typical browsers. If you want to see your site as a SE (search engine) sees it, try the Poodle Predictor:

[gritechnologies.com...]

There are lots more sites like this, but this one seems to operate similarly to Google. Also, I have been told there is a 100K Page size limit that most SE obey. (Google for certain) This size is the actual file size of the page as viewed from its local folder...and does not involve image content.

Powdork

6:19 am on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also, I have been told there is a 100K Page size limit that most SE obey. (Google for certain)
Not so anymore. Look at any dmoz page that is over 101 k. Then pick some text from the bottom of the page and search for it in quotes. Not only will the dmoz page show up, but so will all the clones.
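To check where your own pages stand relative to that figure, you only need the size of the HTML source, since images are separate files. The sketch below is in Python; the function names are made up, and the 100 KB default is simply the number quoted in this thread, which is disputed here, not a documented limit.

```python
def source_size_kb(html, encoding="utf-8"):
    """Size of the HTML source alone, in kilobytes (images are not counted)."""
    raw = html.encode(encoding) if isinstance(html, str) else html
    return len(raw) / 1024


def exceeds_limit(html, limit_kb=100):
    """True if the source is larger than limit_kb (100 is the thread's figure)."""
    return source_size_kb(html) > limit_kb


def tail_past_limit(html, limit_kb=100, encoding="utf-8"):
    """Return the part of the source that falls beyond limit_kb.

    Pick a phrase from this tail and search for it in quotes to see
    whether the engine indexed that far down the page.
    """
    raw = html.encode(encoding) if isinstance(html, str) else html
    return raw[int(limit_kb * 1024):].decode(encoding, errors="ignore")


if __name__ == "__main__":
    page = "<html><body>" + "<p>filler</p>" * 10000 + "<p>last line</p></body></html>"
    print(round(source_size_kb(page), 1), exceeds_limit(page))
    print(tail_past_limit(page)[-40:])
```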

BigDave

7:57 am on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Look at what you wrote and you might find some other possible answers. You even mention many things which you could consider, but it seems you already have your mind made up.

-We run a vBulletin forum, which comes with a feature that makes more SEO-friendly pages in addition to the normal ones. They are very barebones-looking, with no graphics at all.

Well, if the pages are made specifically for the search engine, why would you be adding graphics?

But a much better question would be: What other differences are there with these pages that makes them more appealing to search engines?

Most forums are quite search engine unfriendly. These pages are a kludge to get them to rank higher rather than fix the underlying problem.

I figure that the developers did this for a reason.

Yes, they were concentrating on hitting the main SEO points with these pages, not designing them for people. Try adding some images to these pages and see if that causes them to disappear.

You are comparing dissimilar things.

Google used to favor our regular forum pages over these SEO-friendly versions, but lately this is no longer the case. I've noticed this change on other sites besides ours as well.

Gosh, you call them SEO-friendly. Do you suppose that they are ranking better because they are designed to rank better in *all ways*? Actually, I believe that they are intended to be SE-friendly. Who the hell cares if they are friendly to the optimizers?

-Some of our competing sites are very plain looking, with a bare minimum of extra graphics, fancy HTML, etc. Many of them seem to do abnormally well considering in many cases their SEO is minimal at best.

Here you mention "extra graphics, fancy HTML, etc.", but you are concentrating on the graphics, while you admit that there are additional differences.

As for the minimal SEO, Google isn't trying to rate pages on how good the SEO is; they are trying to rate the page on their own criteria of what a good page is.

-Blogs and similar sites that typically contain several pages of content and are not very graphic-intensive seem to be favored in the SERPs.

You are comparing completely different categories of pages. Blogs are different from forums are different from stores are different from ...

I'm not saying that you are wrong. I'm saying that you don't have the data to convince me.

giga

8:08 am on Nov 9, 2004 (gmt 0)

10+ Year Member



There are two of us posting as "Giga"; I'm the more blunt of the two. I think you are severely missing our point. This is NOT about SEO; this is about Google's apparent indexing and spidering of barebones text vs a user-friendly, graphically enhanced webpage. Our example in this case is vBulletin. One feature of vBulletin is that it offers a search engine friendly version of the message board (a complete clone of the same data, supposedly friendlier to search engines). Well, it appears to be working, as those pages tend to get indexed incredibly faster than a regular (non-SE-friendly) forum page. We are NOT trying to discuss SEO tactics, simply what the difference is, specifically, that Google's spider smiles upon in the two versions. I thought it was the obvious lack of graphics; perhaps something else?... As for the comparison with competitors' sites, some of the sites which have over 100,000 pages indexed are TERRIBLE looking with almost no graphics at all, text based, and using color codes as the only life on the page. Since you claim that Googlebot is blind to all of this, then why would it favor dog food dish A over dog food dish B?

I know the answer is obvious; it's just not clear yet... I mean, if there was a clear guideline, perhaps we could all build our pages according to the spider's standard and get two times as many pages into the index that much faster...

Orrrr perhaps we should build two completely separate versions of our websites, one nice pretty one for the visitors, and an archaic, manual-looking one for Google?

(btw our site has well over 100,000 relevant pages all related to our topic, and is very similar to our competitors'; the big difference is the looks and possibly the size of the page..)

giga

8:40 am on Nov 9, 2004 (gmt 0)

10+ Year Member



Yes, they were concentrating on hitting the main SEO points with these pages, not designing them for people.

And when those pages are indexed and appear in the SERPs, guess who sees them? Any webmaster would prefer that incoming traffic sees nice-looking pages that match the rest of the site, as opposed to bare text-only content with practically no formatting. I'm sure the vBulletin designers realize this. So why would they nonetheless choose to make those pages the way they did?

As for the minimal SEO, Google isn't trying to rate pages on how good the SEO is; they are trying to rate the page on their own criteria of what a good page is.

The whole point of SEO is to attempt to make pages match these criteria. And the whole point of this thread is to determine if one of these criteria is a high content-to-fluff ratio.

You are comparing completely different categories of pages. Blogs are different from forums are different from stores are different from ...

Umm, that's my point. What characteristics do blogs have that make them fare so well compared to other categories of pages? One possible explanation is that they tend to have a lot of content with relatively little graphics and HTML formatting.

Patrick Taylor

9:50 am on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I theorize, based on past results, that if I build some POS ugly website with simple color-coded borders (rather than shaded gif backgrounds) and simple html text rather than rendered smooth fonts, Google will eagerly gobble up tens of thousands of pages, whereas with the graphics-heavy site it seems to slowly chew and cautiously swallow each page.

It isn't clear what your theory actually is. Google doesn't recognise ugliness nor does it do anything with images other than read the <img> tag. Obviously a piece of text is better content, as Google sees it, than an image.

You later go on to talk (whoever "you" is) about content to code ratio, which is not the same as "graphic heavy". Incidentally I wasn't aware that blogs performed better than other types of pages. They're just pages like any other.

giga

10:05 am on Nov 9, 2004 (gmt 0)

10+ Year Member



Typical responses for WWW, obviously missing the point of my question. I could care less about ugly, pretty, or anything else regarding one's personal taste. I'M ASKING IF THE EXTRA CODE CREATED BY SUCH TEMPLATES CREATES A NEGATIVE DAMPENING EFFECT IN REGARDS TO THE SPEED AT WHICH GOOGLE INDEXES PAGES versus the same page minus the bells and whistles...

Patrick Taylor

11:06 am on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



missing the point of my question

Yes. It must be the way you tell it.

Anyway, it is possible to build pages with a very high content to code ratio indeed. Hardly any markup at all, in fact. People say this is a good thing to do - especially those who argue for validating code, accessible pages, separation of content from style, etc. I am one. However, I have never experienced, seen or read any hard evidence that this produces a better performing page for search engines, either in ranking terms or in speed of indexing (the second of which makes no sense anyway). In a one-off case it would in any event be almost impossible to prove one way or the other.

BigDave

6:55 pm on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are two of us posting as "Giga"; I'm the more blunt of the two

Then, for god's sake, get two different accounts! It isn't like they cost you anything.

As long as you are posting under one nick, I will continue to treat you as one person.

I think you are severely missing our point. This is NOT about SEO; this is about Google's apparent indexing and spidering of barebones text vs a user-friendly, graphically enhanced webpage

Nope, I have understood your point from the start. You are complaining that your graphics intensive pages are not spidered and indexed as quickly as bare bones pages.

My arguments have been strictly that there are quite easily other factors that are more likely to be causing the delay. Or more accurately, there are other factors leading to some sites being indexed faster.

And by the way, this IS about SEO. Why would you think that getting indexed as soon as possible is not important in terms of search engine optimisation?

Our example in this case is vBulletin. One feature of vBulletin is that it offers a search engine friendly version of the message board (a complete clone of the same data, supposedly friendlier to search engines). Well, it appears to be working, as those pages tend to get indexed incredibly faster than a regular (non-SE-friendly) forum page. We are NOT trying to discuss SEO tactics, simply what the difference is, specifically, that Google's spider smiles upon in the two versions. I thought it was the obvious lack of graphics; perhaps something else?...

Yes, that was my point. There are many, many things different about those pages. You are just grabbing at the first thing that is apparent *to you* and assuming that must be what is causing that.

Dynamically generated pages, such as those in forums, toss up many roadblocks to the spiders. The SE friendly pages are designed to remove all those roadblocks.

You seem to have missed my suggestion to try adding some graphics to those pages to see what happens. This would be the true test. Don't compare apples to steak, which is what you are currently doing.

As for the comparison with competitors' sites, some of the sites which have over 100,000 pages indexed are TERRIBLE looking with almost no graphics at all, text based, and using color codes as the only life on the page. Since you claim that Googlebot is blind to all of this, then why would it favor dog food dish A over dog food dish B?

Because that blind dog still has its sense of smell and taste. Google likes the on-page and off-page factors for that page.

You have never answered whether you use JS links or image links. How about the URLs of the pages that you want indexed, do they pass variables? Are there session IDs?
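The URL questions here can be checked mechanically. The following Python sketch flags URLs that pass variables or look like they carry a session ID. The function name `url_flags` and the list of parameter names are illustrative guesses, not a list any search engine is known to use; adjust them to whatever your forum software actually emits.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical list of query parameter names that often carry session IDs.
SESSION_PARAMS = {"s", "sid", "sessionid", "phpsessid", "jsessionid"}


def url_flags(url):
    """Report how many query variables a URL passes and whether one looks like a session ID."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    names = {name.lower() for name in params}
    return {
        "param_count": len(params),
        # Some servers put the session in the path as ";jsessionid=...".
        "has_session_id": bool(names & SESSION_PARAMS)
        or ";jsessionid=" in parts.path.lower(),
    }


if __name__ == "__main__":
    for url in (
        "http://example.com/showthread.php?t=123&s=abcdef0123",
        "http://example.com/archive/index.php/t-123.html",
    ):
        print(url, url_flags(url))
```

Feeding it the URLs of the pages that get indexed quickly and the ones that do not would show whether the two groups differ on these points.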

I know the answer is obvious; it's just not clear yet...

It is quite clear. It is discussed here often enough. Clean search engine friendly navigation and deep links from already indexed sites. Oh yeah, also make sure your domain has a clean bill of health before buying it.

Umm, that's my point. What characteristics do blogs have that make them fare so well compared to other categories of pages? One possible explanation is that they tend to have a lot of content with relatively little graphics and HTML formatting.

That is a very minor characteristic of a blog. An almost overwhelming characteristic is that bloggers tend to link freely to each other when they find what the other person wrote is of interest. Hell they even run RSS feeds to each other which generate tons of links.

I'M ASKING IF THE EXTRA CODE CREATED BY SUCH TEMPLATES CREATES A NEGATIVE DAMPENING EFFECT IN REGARDS TO THE SPEED AT WHICH GOOGLE INDEXES PAGES versus the same page minus the bells and whistles...

No, that wasn't what you asked. You asked specifically about images. Images can be added to content pages with very minimal code bloat. You can make a nice looking page without going nuts on the bloat, but it may not work out to be as perfect as a designer would like.

Is your added code hand tweaked, or is it mostly thrown together in a wysiwyg editor and jammed into the forum software? Do you move everything you can into .css and .js files?

Code bloat can make a minor difference, but it is more likely to be in regard to ranking rather than crawl speed. It is far more likely that it is the linking pattern and the URLs that make the difference.

the_nerd

8:13 pm on Nov 9, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



TERRIBLE looking with almost no graphics at all, text based, and using color codes as the only life on the page. Since you claim that googlebot is blind to all of this, then why would it favor dog food dish A. over dog food dish B?

a. Looks like you describe exactly what's needed for a successful site. Who needs fancy graphics, after all? I have no idea about design, but I bet a real good designer can go a long way with just colors, font size and nice positioning. Don't we all hate pages that go "pling" "pling" .... until all those fancy little graphics pieces have loaded - remember? There are still people out there with modems. They don't give a damn about your rounded borders. They want the meat. At once.

b. you mentioned nicely rendered fonts. Probably gifs or jpgs. Won't help a lot in search engines (might be the reason for different ranking - hiding the text in graphics. They won't OCR it for you). Nobody will see your very nice graphically enhanced website.
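If headings or body text on your pages are rendered as images, a quick check is to list the images that offer a crawler no text at all. This Python sketch assumes the `alt` attribute is the only text a spider can read from an `<img>` tag, which is a simplification; the names `ImageTextCheck` and `images_without_alt` are made up for illustration.

```python
from html.parser import HTMLParser


class ImageTextCheck(HTMLParser):
    """Record the src of every <img> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            # A missing, empty, or whitespace-only alt counts as no text.
            if not (attrs.get("alt") or "").strip():
                self.missing_alt.append(attrs.get("src") or "")


def images_without_alt(html):
    """Return the src values of images that carry no alt text."""
    parser = ImageTextCheck()
    parser.feed(html)
    parser.close()
    return parser.missing_alt


if __name__ == "__main__":
    page = '<img src="title.gif"><img src="logo.gif" alt="Acme Widgets">'
    print(images_without_alt(page))
```

Any rendered-font title that shows up in this list is text the search engine never gets to read.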