Forum Moderators: martinibuster
According to the W3C validators, the CSS validates, as do all of the XHTML pages (900+).
2,580 pages in MSN index
939 in Google
93 in Yahoo
The site is now 15 months old with 13 topics covered. For one particular topic I've got three on-line educational institutions that seem to use my site as part of their lesson plan. Even Wikipedia has a link to that section.
After the April 1st update, Yahoo gave me over 300 referrals a day, which I considered good as an entry point. The June update turned that into 1 or 2 visitors a day, despite the fact they still had 400+ pages in their index. With this July update, the scene is even more dismal.
I have a high-quality trio of content sites, all hand-written by myself and linked to by many .org/.edu sites, and Yahoo has really helped all three of them.
not all doom and gloom....
I would have to say that I think the results are starting to stabilize too. I know Yahoo said it would roll out over the next two weeks, but this guess of mine is based on the June update, which settled in over a period of about three days. If the same pattern holds true, those getting pushed down will continue down and those getting pushed up might see more improvement.
On the night of the update, I went to 11 pages indexed. The pages that were indexed rank(ed) much (20 or so positions) better for some very competitive keywords. The next afternoon I returned to my normal numbers of indexed pages (around 2K), but not my normal positions.
Today, I was looking through my indexed pages to see if I could find any difference between the pages that were ranking and the pages that weren't, so I started viewing the caches of my pages. This is the interesting part... only a very small number of pages have a cache, and those are the pages that are ranking!
My conclusion is this: If Y! updated the algo and did not have a cache of the page, they could not apply the algo to the page. So the reason most of my pages do not rank is that they changed the algo and have no way to apply the new algo until they revisit my site... This would also explain the 1-2 weeks for the full algo to roll out, because any page that is not currently in the cache will have to be refetched, and that will take some time.
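The "no cache, no new score" hypothesis above can be sketched in a few lines. This is purely illustrative: `new_algo_score` is a made-up stand-in for whatever Yahoo's updated ranking function actually does, and the data structures are hypothetical.

```python
# Hypothetical sketch: a re-scoring pass can only run over pages
# whose content the engine has on hand. Uncached pages get no score
# and effectively vanish from ranked results until refetched.

def new_algo_score(content: str) -> float:
    """Stand-in for an updated ranking function (pure guesswork).
    Toy signal: distinct-word count of the page text."""
    return len(set(content.lower().split()))

def rescore_index(index: dict) -> dict:
    """Apply the new algo to every page we have a cached copy of.
    Pages with no cached copy are skipped, so they hold no rank
    until the crawler revisits them."""
    scores = {}
    for url, cached_content in index.items():
        if cached_content is None:   # indexed, but cache missing/dropped
            continue                 # must wait for a refetch
        scores[url] = new_algo_score(cached_content)
    return scores

index = {
    "site.com/a": "widgets and blue widgets for sale",
    "site.com/b": None,              # listed in the index, no cache
}
ranked = rescore_index(index)        # only site.com/a gets a score
```

On this model, the two-week rollout window is simply the time it takes the crawler to refetch everything that had no usable cached copy.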
hope this is just an algo change by Y! and they don't see this increase as a potential red flag
As jd01 said, the handful of pages that Slurp spidered with a 304 are all doing very well. The other 500 pages that Slurp gave the old "200", which mostly held top-10 rankings, are simply gone.
I added a page totally about a specific thing. I have a couple of other pages that mention the thing, and those are the two that rank. The specific-thing-focused page is shown as indexed, but it is not ranking. At worst it should be the second result listed, even if it is brand new. So maybe there is a black hole.
And now that I wrote that, when I search for the term plus site:mysite.com, the page that is listed first in the regular results for the term doesn't appear at all, and the new page shows up first.
<closes Yahoo search page, decides to look again in a week...>
Here is a little of how I think it works... I do not work for a SE, but I have worked with DBs quite a bit, and this is how I think things could be structured for efficiency.
1. Index pages. These pages are stored in a 'site' or 'domain' section of the DB.
This would be where the results for site-specific searches come from... Why? Relatively small page count, easily searched for any information using almost any query.
2. Page Caches. A simple link to the location of the 'saved' or, ahem, cached page stored with specific 'site' or 'domain' information.
3. Index results. A completely separate set of results. Here I would not store the entire page, but rather a condensed value derived from the page after applying the algo to it. When applying the algo, I would assign values according to what the page contains, and very probably convert all information to a numeric or alphanumeric value rather than full text, like a large-scale scoring system.
This set of data would include all signs of quality stored as shortened numeric or alpha-numeric values, and would also include an assigned value based on what the algo determined to be the theme, focus, or purpose of the page (whatever you would like to call it.)
In doing this I would have the ability to quickly search for the value of the search query and then order my results by the quality values that have been defined by the application of the algo.
I think this is a close (very short) explanation of what they are doing. It would also explain why searching for a keyword or query within a domain returns different information than searching the full index. Pages without the quality value (or whatever you would like to call it) assigned would not show in the full index results: because they have no value assigned, they cannot be more valuable than the pages that do have a value assigned.
The time it takes to assign a value must be considerable, even with extensive servers, so it is very possible that pages are indexed for a period of time before the algo can be applied to assign a value to each individual page. So pages that are indexed show in the site: results, because they are indexed for the site, but they will not show in the full index results, because they have not yet been given a weighted value.
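The two-tier structure described in points 1-3 can be sketched as a toy class. Everything here is hypothetical (class and method names, and the word-count stand-in for the "algo"); the point is only that a site: query can hit the raw per-domain index immediately, while full-web results include only pages that have already been through the scoring pass.

```python
# Hypothetical two-tier index: site: queries search the raw per-domain
# store; full-web results are restricted to pages with a precomputed score.

from collections import defaultdict

class ToySearchIndex:
    def __init__(self):
        self.by_site = defaultdict(dict)  # domain -> {url: page text}
        self.scores = {}                  # url -> condensed quality value

    def index_page(self, domain, url, text):
        # Crawled pages land here first and are instantly
        # visible to site:-restricted searches.
        self.by_site[domain][url] = text

    def apply_algo(self, url, text):
        # The expensive offline pass: condense the page into a
        # numeric value (toy stand-in: word count).
        self.scores[url] = len(text.split())

    def site_search(self, domain, term):
        return [u for u, t in self.by_site[domain].items() if term in t]

    def web_search(self, term):
        hits = []
        for pages in self.by_site.values():
            for url, text in pages.items():
                # Unscored pages are simply invisible here.
                if term in text and url in self.scores:
                    hits.append(url)
        return sorted(hits, key=lambda u: -self.scores[u])
```

A freshly indexed page shows up in `site_search` right away but stays out of `web_search` until `apply_algo` has run for it, which matches the behavior several posters describe: pages counted under site: but absent from the regular results.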
Just my (much oversimplified) thoughts on what I would do if I were designing a SE and how I would apply the algo individually on the way in to be able to generate the number of results necessary in a very short time on the way out.
Added: The main point was that without the cached page, there is no way to assign a value to the pages based on the new algo... The only way is to refetch the pages and then apply the algo, hence the two-week timeline.
Added 2: This would also explain why dup filters, etc. are never applied until the end of an update... Why compare factors from every page when you can shorten the job and only compare pages whose factors are within X% of each other in each 'shortened' theme value, based on the algo?
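The dup-filter shortcut in "Added 2" could look something like this. All names are hypothetical, and for simplicity exact-match buckets stand in for the "within X%" window: bucket pages by their condensed theme value, then run the expensive pairwise comparison only inside each bucket instead of across the whole index.

```python
# Hypothetical dup filter: compare pages only within the same
# condensed "theme value" bucket, never across the whole index.

from collections import defaultdict
from itertools import combinations

def theme_value(text: str) -> int:
    """Stand-in for the algo's condensed theme score: identical
    word sets always land in the same bucket."""
    return hash(frozenset(text.lower().split())) % 1000

def near_duplicates(pages: dict) -> list:
    buckets = defaultdict(list)
    for url, text in pages.items():
        buckets[theme_value(text)].append((url, text))
    dupes = []
    for group in buckets.values():
        # Pairwise comparison happens only inside a bucket, so the
        # cost scales with bucket size, not total index size.
        for (u1, t1), (u2, t2) in combinations(group, 2):
            if t1 == t2:
                dupes.append((u1, u2))
    return dupes
```

With N pages spread across many buckets, this avoids the N-squared blowup of comparing every page against every other, which is one plausible reason such filters would run as a late, separate pass of an update.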
My conclusion is this: If Y! updated the algo and did not have a cache of the page, they could not apply the algo to the page.
You might be onto something here...
My site dropped to around 83 pages. I just looked: all of the pages that I have in the index were cached before the April 1st update. Prior to that update, I had around 85 pages in the Yahoo index.
I also looked at the pages that were cached, and these were definitely cached before April 1st. I know that to be true because I made a blanket change to the site before April, and I see the old-style pages. The only page with a current cache is the home page.
Every other page that I created since then has vanished - poof!
I sincerely hope this is not a form of sandbox where, if your pages were created after April 1st, you have to wait a few months until the next update for them to appear!?
A cruel retrospective April Fools joke? Tell me I'm wrong please!
My tinfoil hat tells me that Y! is nuking as many commercial sites as possible to force us all into Y!SM and Paid Inclusion.
Some searches have way more Y! Directory listed sites dominating the top listings, but that's no more than an assurance of some level of quality and relevancy, even by virtue of the topical category placement. And though it may seem unfair to some, those sites are a whole lot better than some of the swill that was sitting in some top spots before what seems to be a clean sweep for those searches.
There's a delay happening in indexing new sites, but isn't there a possibility that there may be some kind of backlink analysis that needs doing before the final scoring rolls in for them?
For my sites that went up, one common aspect is that they have strong on page work.
mb - I'm assuming this is still an update in progress, and perhaps more layered than what we've previously seen on Yahoo, so I haven't watched too closely... but this is what I'm seeing thus far on an established site whose index page rankings are due to core keyword optimization on page, plus inbound links boosting the core words and various modifiers. It's in a moderately to extremely competitive and spammy area.
Generally, but not always, the index page is up a little and the inside pages are down a little. I attribute the index page jumps to inbound links that kicked into our Google rankings in roughly late June. The inside pages have more specific on-page work, but far fewer external inbounds than the home page has.
File naming conventions matter...
I can't say that they don't, but I see our index page going up for searches driven by inbound links, and the interior pages, with filename matches, going down a few notches on the same searches. I wouldn't generalize based on this observation, though. It could just be that we've worked on links to the index page this time around, but have let the interior pages sit as is, and more competition has entered the arena.
On page 2 - 3 of one of the searches, incidentally, Yahoo had previously returned 9 identical pages (judging by idiosyncratic filename) from different sites. Many were redirect pages. They've gotten the count down to 3 so far on this update, so that's obviously an area they're working on.
[edited by: Robert_Charlton at 6:54 am (utc) on July 22, 2005]
Google seems to be just sandboxing new domains ad infinitum, which from a searcher's point of view means they get no new sites to peruse. If a webmaster redesigns their website, it seems they get penalised too. And in terms of spam, I'm still at a loss with one of the major travel-based websites that is running AdWords on the right and piling the Google index with pages optimised for every city in the world on the left. Then, due to the volume of pages indexed, it's considered an authority site and as a result appears ahead of the hotels who have built websites that actually show some real content. The crazy thing is that when you drill down to the offers pages on these hotels, they frequently offer better rates than the "guaranteed low rate" of the affiliates. So the end-user experience is sites that look like spider food with lousy offers.
My tinfoil hat tells me that Y! is nuking as many commercial sites as possible to force us all into Y!SM and Paid Inclusion. Y!'s financials the other day were pretty bleak and their stock reacted accordingly; they are desperate to generate some money somehow for Q3.
This is quite untimely for Y! considering Google is finishing up the new PageRank formula which is, for the most part, increasing interior PageRank throughout the web. People all over are praising the fact that their little green bar is one notch higher, and then Y! goes and shoots themselves in the foot with this. If this is Y!'s answer to the Bourbon Update, which Google (correctly) predicted would cause a major increase in Adwords revenue, then they are in for a big shock...fool me once, shame on you, but we won't be fooled twice. Especially when there's such a big gap in user traffic.
This is quite untimely for Y!
It's been pointed out several times already that the update is rolling out over two weeks. No sense complaining that dinner tastes lousy when it's still sitting in the refrigerator marinating.
Rather than complain, which does nothing to help improve your situation, it would be more productive to do some real analysis about what kind of things are happening.
Superficial remarks like "Y loves affiliate sites" and "Y hates affiliate sites" are not only contradictory but do nothing to further the knowledge base.
I was discussing this with a friend, and he had some interesting comments:
I don't bother whining about it because there's nothing to do except learn your lesson and carry on. In the Y! search blog some guy was complaining about the update killing his "legit site," and he posted the URL, and it's way past grey. It has a ton of KW landing pages called "search pages"... so some people just cannot deal with the fact that, just because their grey / fast-and-loose stuff used to work, it doesn't work any longer.
I have to agree to a certain extent. There are some so-called legit sites that are playing it way fast and loose and deserve the boot, I've seen the URLs. But there are some that are right on the line and that's harder to deal with.
The important thing to keep in mind is that this update is not over. Superficial complaints don't help, but meaningful analysis is gold.
[edited by: martinibuster at 5:43 pm (utc) on July 22, 2005]
It's been pointed out several times already that the update is rolling out over two weeks
I'm just saying that it was a bad idea for them to start it now, because if it really does take two weeks to finish, you will see a heavier reliance on Google. These two weeks will be essential to their future. I'll admit that most of our keywords dropped in the Y! rankings, but I hardly worry about it; 80% of our profit comes from Google listings. I'm more worried for Y! than I am for myself... Maybe I'm just blowing things out of proportion. Wouldn't be the first time. SEO is like a rollercoaster for me...
That's fair enough, but meanwhile the results are a junkyard and surfers will stop using them. I'm certainly glad I'm not holding the stock.
Sometimes, when something isn't broken, why fix it?
The results currently are not an improvement. The sector I'm in has so much junk in it that the Yahoo results are laughable.
It's a shame I can't give you exact examples, but some of the results return lists of the same site. It's obvious that external linking with anchor-text keywords plays a massive part in the new algo, which is why I'm seeing sites with thousands of sub-sub-sub-domains listed one after the other: Yahoo sees them all as different sites.
IMO they have just taken a step backwards at a time when they could be advancing at a real pace.
It really is the best thing to observe and share notes about meaningful observations.
I've had a problem since the June update, so any improvement from zero is great.
For those who have constructive feedback, there is a feedback email address. I have sent in mine already, telling them about the dropped pages and the fact that I don't show up for my own site name.
I mentioned the April 1st update and perhaps a rollback of the index to around that timeframe. I do know for a fact that the only pages I have cached are from before April's update (except for the home page). I can tell because the site now uses a different template, and the old one shows in the cached pages.