This 85 message thread spans 3 pages.
OS Commerce & Google - The Indexing Nightmare!
Has anyone experienced problems getting osCommerce sites indexed in Google? I'm only really assuming it's an osCommerce issue, as there is no other logical explanation. We have:
> Rewrote the URLs to .html format (3 months ago)
> Implemented dynamic meta tags (3 months ago)
> Implemented an XML Google Sitemap (2 weeks ago)
However, Google is still only indexing the old dynamic URLs and the old generic meta data (title and description).
I don't think it's a time issue, as other Google Sitemaps that we submitted at the same time have had the desired effect.
Can this be an osCommerce issue, or has Google crossed the site off their Christmas card list? (The Google PR hasn't changed, and no unethical linking strategies are being used.)
Thanks in advance!
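For reference, the XML Google Sitemap mentioned above is a simple format. A minimal sketch with placeholder URLs (the namespace shown is the one Google used for its Sitemaps program at the time; the rewritten .html forms are what should be listed, never the old dynamic ones):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <!-- placeholder URL: list only the canonical, rewritten form -->
    <loc>http://www.example.com/widgets.html</loc>
    <lastmod>2006-05-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```

A Sitemap tells Google which URLs you consider canonical, but it does not stop the crawler from also finding the old dynamic URLs through your own internal links.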
Do the "new" pages have equal or better links pointing at them? If not, they may be out-competed by the old URLs.
Many thanks treeline,
The existing incoming links all point to the homepage, and we have just recently started a deep linking campaign with links pointing to the new rewritten .html URLs. No dynamic URL has links pointing to it.
Can't see this being the problem!
*Implemented dynamic meta tags (3 months ago)*
So how does this work? Does it insert the search phrase?
A lot of osCommerce sites have had trouble being indexed for the last 6 months or so.
I see many such sites that have lost between 80 and 99% of their indexed page count.
UKSEOconsultant, about six months ago I adopted osCommerce. I did not use it as-is; it's mostly an integration with an existing system, so I never used the product pages and can't comment on them. The category pages were crawled every day, but only the main page of the osC install ever made it to the index. Eventually I dropped even the osC category pages, and now I only use its backend, database structure, and order processing sections; the front end is an HTML mod_rewrite setup and has no problem at all being indexed.
I checked a hundred times and could not detect any problem with the osC pages, and I have no clue what kept them from being indexed. But that was about the time Google was working on Big Daddy, so perhaps that was the reason.
|A lot of osCommerce sites have had trouble being indexed for the last 6 months or so. |
g1smd, I know you've been watching the SERPs like a hawk. Can you tell us if you see anything common amongst those sites? Is there something in the code that may be causing this?
Well, nothing more than the fact that many osCommerce sites have multiple URLs for the same bit of content, many have session IDs, non-unique title and meta description per page, and many URLs that return "you are not logged in" type errors - just like the stuff that I wrote about vBulletin and PHPbb just a few weeks ago.
The software is very clever, and very flexible, but the interface presented to search engine spiders is poorly thought out (ummmm, NO thought has gone into it at all) and very badly implemented.
|However, Google is still only indexing the old dynamic URL's and old generic meta (title & descrip). |
Is it possible that the dynamic URIs are still browsable? If so, that may be one of the major issues. When you implement a rewrite, you should not be able to browse to those dynamic URIs anymore, they should be 301'ing to the new rewritten URI.
I'm going to assume that you've updated all internal links to reflect the rewritten URIs? And, that you've addressed any other issues that may cause a spider to get into a dynamic URI instead of the rewritten one?
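To illustrate the pattern being described, here is a minimal `.htaccess` sketch. The URL shapes (`product_info.php?products_id=NN` and `/product-pNN.html`) are illustrative assumptions, not necessarily what any particular osCommerce SEO contrib produces:

```apache
RewriteEngine On

# Externally 301 the old dynamic URL to the rewritten .html form.
# Matching on THE_REQUEST (the raw client request line) rather than on
# QUERY_STRING avoids a redirect loop with the internal rewrite below,
# because internal rewrites never change THE_REQUEST.
RewriteCond %{THE_REQUEST} \s/product_info\.php\?products_id=([0-9]+)\s
RewriteRule ^product_info\.php$ /product-p%1.html? [R=301,L]

# Internally map the rewritten URL back onto the real script.
RewriteRule ^product-p([0-9]+)\.html$ /product_info.php?products_id=$1 [L]
```

The trailing `?` on the 301 target strips the old query string. Without the `THE_REQUEST` condition, the internal rewrite would recreate the dynamic URL, re-trigger the 301, and loop forever.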
I've been using osCmax lately, so I've kind of forgotten some of the names from the standard osCommerce install; it's been about a year since we used the standard build, so bear with me.
Which SEO mod are you using? I think the one I used was called Easy SEO URLs v2 or something (it comes prepacked with osCmax). It requires you to reset the cache after every product add or edit; if you don't, every product added since the last reset will use the regular URL structure, not the rewritten one, so if you aren't careful you end up with half the site rewritten and half not.
If I remember right, the popular Google Sitemap and Froogle mods for the standard build also didn't use the rewritten URL structure, since they pulled straight from the database.
I never used any of the meta mods; I just edited the template so the product name echoed out as the title, and then used the short description contrib to populate the meta description.
Also check and make sure you have sessions turned off for spiders in your admin configuration.
Of course you probably don't want to hear this, but since it was the first osCommerce project I did, there were a lot of mistakes, mainly the site map and Froogle map using the dynamic URLs while the site itself used the rewritten URLs. Even so, that site ranks really well, with some product pages actually outranking the manufacturer's own pages, and it has held steady for a year. Then again, the domain was 4 years old and was a hobby site turned into a store, so it was well linked and I got really, really lucky, lol.
But there are some of my mistakes; hopefully you can learn from them.
I follow these topics with great interest. In reference to this particular product, I've watched various topics come and go over the past couple of years.
Let me paint a scenario and then you tell me if I'm "barking up the wrong tree". From this point forward, think footprints.
So, a particular piece of software becomes very popular amongst certain groups. Because of its popularity, it gains more exposure, not only from a marketing perspective but from an indexing one too.
The number of users increases steadily over time. The number of sites built using that software increases steadily over time.
During that time period, Googlebot is indexing anything and everything it can get its hands on, and this includes both static and dynamic content.
Due to inherent problems with the software, certain things begin to happen. In this case, we have rewritten URIs and dynamic URIs being indexed simultaneously from a variety of queries. Duplicate content. Footprints.
If you were a Search Quality Engineer (SQE), would you want your spider to get trapped within an environment where you would be indexing the same content under 10-15 different URIs?
So, are we possibly seeing a cleansing of the index based on certain technical issues? It's easier to filter based on the footprints and of course you're going to get caught in the net, there's no way around it. Or is there? ;)
P.S. I've watched Google (and other search engines) wipe out entire networks over the years based on footprints. They are easy to detect and it's unfortunate but there is going to be unintentional fallout.
Let me paint a scenario and then you tell me if I'm "barking up the wrong tree"
Great point. And, to be honest, on the whole most osCommerce shops aren't essential to the index: two clicks from Fantastico and you have your own online "business". And, like you said, why risk having a spider get trapped, especially gathering "useless" information?
Any estore is easy to spot, not just osCommerce, lol. And for the conspiracy-minded: wouldn't Google be better off if those sites out to make money paid for their traffic via AdWords?
I'll go dig up the store logs and give them a quick look to see if anything has changed drastically over the past few months, but considering I haven't got a panicked email about traffic drop-offs from the store owner, everything is probably about the same. So it may be that domain and link age are the way around it; let me double check.
pageoneresults, I'm not sure if you're barking up the wrong tree or not, but even if you are, the point you raise is key when deciding to implement any generic solution, prepackaged, cms, shopping cart, blog, or forum, it's all the same issue.
I don't see the issue as being one of footprints per se, but rather of Failure. That's failure to fix the errors of these applications to generate decent, search engine friendly, long term stable urls.
The only application I've seen that does that pretty much straight out of the box is wordpress (maybe others are finally now catching on), the rest require a lot of tweaking and hacks to the supposed 'seo' mods that supposedly 'fix' these issues. And some are just a disaster when they do try to implement these fixes out of the box, like the vbulletin stuff for example.
But that's not the only failure point, the real one is from people who don't spend the time or money to fix these errors, then fill up the google etc forums for the next few years complaining about their site not being indexed, whatever. This is to me probably the single most annoying thing I come across year in and year out here, especially the abject refusal to even admit that this was and probably still is the case.
I haven't seen any real indication of the footprint phenomenon, but I have seen plenty of indication that if you fix your URLs asap, use mod_rewrite well to redirect to the fixes, and so on, the benefits are unmistakable.
Re the topic of this thread: do yourself a huge favor and dump osCommerce. It has some nice offshoots; Zen Cart is well regarded, and I don't know the other one that was mentioned, but I do know that I have NEVER seen a worse-executed open source application than osCommerce. Rumour has it that the key developers have sort of gotten to like the fact that they get paid to fix their own work, or something to that effect.
I use zen cart. One of my sites is fully indexed and ranks well.
Thanks Pageoneresults and others!
The dynamic URLs are using a 301 redirect to the new HTML pages, and spider sessions are set to off.
The server headers look as follows; I can't see anything wrong with them myself:
#1 Server Response: <snip>
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Sun, 21 May 2006 14:56:15 GMT
Server: Apache 3
Set-Cookie: cookie_test=please_accept_for_session; expires=Tue, 20-Jun-06 14:56:15 GMT; path=/catalog/; domain=<snip>
Set-Cookie: osCsid=c1d4826240264f1962394cea0600b282; path=/catalog/; domain=<snip>
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Redirect Target: <snip>
#2 Server Response: <snip>
HTTP Status Code: HTTP/1.1 200 OK
Date: Sun, 21 May 2006 14:56:15 GMT
Server: Apache 3
Set-Cookie: cookie_test=please_accept_for_session; expires=Tue, 20-Jun-06 14:56:15 GMT; path=/catalog/; <snip>
Set-Cookie: osCsid=0395ad236823f335aab1e47045147011; path=/catalog/; <snip>
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
[edited by: lawman at 3:31 pm (utc) on May 21, 2006]
[edit reason] No Urls Please [/edit]
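The header dump above can be sanity-checked mechanically. Below is a small Python sketch (the function name and plain-text input format are my own assumptions, modeled on how header-checker tools print their output) that pulls out the two details worth watching here: the status code, and whether an osCsid session cookie is still being set on every response:

```python
# Hedged sketch: parse a raw response-header block, as pasted from a
# header-checker tool, and report (status_code, sets_oscsid).

def check_headers(raw):
    """Return (status_code, sets_oscsid) for one raw header block."""
    status = None
    sets_oscsid = False
    for line in raw.strip().splitlines():
        line = line.strip()
        # Some checker tools prefix the status line; strip that if present.
        if line.startswith("HTTP Status Code: "):
            line = line[len("HTTP Status Code: "):]
        if line.startswith("HTTP/"):
            status = int(line.split()[1])  # e.g. "HTTP/1.1 301 Moved ..."
        elif line.lower().startswith("set-cookie:") and "osCsid=" in line:
            sets_oscsid = True
    return status, sets_oscsid


if __name__ == "__main__":
    sample = """HTTP/1.1 301 Moved Permanently
Set-Cookie: osCsid=c1d4826240264f1962394cea0600b282; path=/catalog/"""
    print(check_headers(sample))  # (301, True)
```

That the 301 responses above are still setting an osCsid cookie suggests a session is being started for every request, which is worth verifying against the spider-session setting in the admin configuration.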
Well, since I was able to see the URI and do some initial research, you've got some issues on your hands.
For one, you have blank pages returning a 200 status. You have popup pages being indexed. You have some fatal errors in the html. The list could get quite lengthy upon a full blown discovery.
If you did all this three months ago, you may be dealing with Google's time factor in getting things sorted out after the rewrite. You've got enough going on there to confuse the heck out of a spider. It's going to take some time for a spider to sort out what you've done.
My advice? Hire someone who knows exactly what they are doing. Have them do a complete discovery of before and after. There will need to be some cleanup from the previous incarnation and also from the current one. This is not a push-button fix. You first need to undo what is being done and then redo it the correct way.
Google is going to continue to index the old URIs for a few months. At first, they will go supplemental. Then they will slowly start to disappear. During this process, the new URIs are getting indexed and may go supplemental too. But, once they get fully indexed, the supplemental tags should go away. This will all take time. I'd plan on at least 6-9 months to sort everything out that you have going on.
Personally, I would have consulted with someone who specializes in this area before making the change. You may have added insult to injury and it is really difficult to tell without a full discovery.
I have seen similar problems with osCommerce on many other sites. It needs a lot of work to fix things properly.
Getting Google to deindex stuff they had already found at the "wrong" URL was the most difficult of all the tasks.
Many of the problems come from basic design errors within the package itself. It needs a lot of work to make it SE friendly.
|Personally, I would have consulted with someone who specializes in this area before making the change. You may have added insult to injury and it is really difficult to tell without a full discovery. |
You've hit the nail square on the head here. It's my belief that failure to do exactly this is directly responsible for a vast majority of so-called 'google is broken' etc type comments over the years.
This is exactly what I saw: implementing a complete correction and permanent seo url solution to a prepackaged product, especially one as bad as oscommerce, is not trivial, and the odds of someone who isn't fairly well versed in mod_rewrite etc doing this successfully are very low. But, sadly, so are the odds of people actually doing what's necessary and paying someone to help them.
For one, you have blank pages returning a 200 status. You have popup pages being indexed. You have some fatal errors in the html.
Kinda sounds like contribs/mods not jibing with each other. You have to be really careful when you add in contribs from the osCommerce site; none of them take into account any other mods/contribs you may have installed, and it gets really tricky adjusting for each new contrib. You have to know exactly what the previous one(s) changed and how to make each work with the others as well as with the base system.
Even if you are running a base install, the code on the contrib site can be really iffy at times. Other than a few I really trust, most of the time I just take the extra time and write my own version, so at least I know where to look for the mistakes when it doesn't work. Very rarely, if ever, is there an easy copy-and-paste mod for osCommerce, no matter what people may say, and after you add one contrib there almost never is.
I would dump the store to a testing server or domain with a clean install and rebuild the changes in very carefully, making sure everything is working together, correct, and error-free. I'm not smart enough on the indexing stuff to even hazard a guess at what to do after the store is working right, but I would get the store to 100% before worrying about anything else.
I did get a dedicated OS commerce company to do this for me.. someone I actually found on this forum!
Well, I have a few sites that use oscommerce. One uses a Mod-rewrite (static url) and another does not. It's funny because the one without the mod-rewrite ranks better than the other.
Both of these stores contain some of the same products. So it is interesting to see how the different pages rank.
In fact, the mod_rewrite one is having a lot of supplemental issues with Google, and it's taking forever to get new pages indexed.
I agree there are issues with the oscommerce stores. However, prior to Big Daddy, my osc stores ranked very, very well.
With so many sites reporting google issues (not just osc stores), I am not sure jumping in and changing everything is such a good idea.
Hopefully google will fix their latest buggy update.
You also need to look at duplicate content issues: the same content indexed at more than one URL.
I see that osCommerce is full of that sort of stuff.
yeah, out-of-the-box OSC installations are riddled with dup content problems.
Indeed. The best way (imho) is to carve all the links out and hard-link into it from a regular site setup.
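One concrete form of the duplicate-content cleanup, for the session-ID variant, is a `.htaccess` sketch (hedged: the rule below is my own illustration, not an osCommerce contrib) that 301s any URL still carrying an osCsid query parameter to a session-free address. Note the simplification: this form drops the entire query string, which is only safe once the rewritten URLs carry all the page-identifying information in the path.

```apache
RewriteEngine On

# If the query string contains an osCommerce session ID, 301 to the same
# path with no query string at all (simplification: see note above).
RewriteCond %{QUERY_STRING} (^|&)osCsid= [NC]
RewriteRule ^(.*)$ /$1? [R=301,L]
```

Because the redirect target has no query string, the condition cannot match a second time, so the rule does not loop.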
What is the point of having the XML Sitemap if Google doesn't use it? I have read where they say the bot will crawl, but will not necessarily index, everything.
Does anyone else find this statement odd? You found a page but decide not to index it? Why don't you index it? Oh we can't tell you that.
They will not index pages that have a low amount of "trust".
Matt Cutts made reference to this a week or so ago.
Google's trust stuff has been obvious for almost 1 year now. Especially after jagger. And bourbon for technical errors. The Matt Cutts comments simply served to openly admit what has become increasingly obvious over the last year, and what google themselves had published in their patent application of march last year.
The only people I see in denial about this are the ones who think that it's the same google with some light window dressing type changes. And whose sites not only do not have any trust, they will never have it. And who will never fix their site's technical errors, let alone admit that they exist, or should be fixed.
|I did get a dedicated OS commerce company to do this for me.. someone I actually found on this forum! |
LOL, that's really funny. I did one OSCommerce install, and it will be the last one I ever do, a repair actually. Anyone who can work with that product and not be totally repulsed... and then who decides it's a good idea to form a 'dedicated oscommerce' company, is probably one of the developers I've heard about who have come to realize they have the name recognition that will make people use their product, and the product itself is so bad that you have to pay a dedicated company to implement it. And I'll bet at least one employee of that company is an oscommerce developer.
zencart etc forked because oscommerce was so bad. To me, anybody who gets paid to setup shopping carts who does not exert every possible effort to guide you away from oscommerce is highly suspect. And is definitely not a good programmer, since no good programmer I know of would want to touch that garbage. You couldn't pay me any amount to ever work on any oscommerce install again, I consider implementing that product, as a developer, a highly unethical action, since I know how bad it is, I could and would never allow any client to use it.
And as you can see by pageoneresult's quick overview of the site in question, in fact that company did exactly as bad a job as I would expect, since nobody good would use or specialize in that product.
And just for the record, I use almost all open source / free software products, for development, desktop, and hosting.
I helped patch an osCommerce install last year, and you are right. It was a nightmare of duplicate content, crap meta data, poor spidering potential, and rubbish HTML code.
The thing that really bit was the lack of standardised methods across the package. Different bits did the same task in completely different ways.
The package has several huge stylesheets, yet parts of it are stuffed full of font tags.
Almost no thought.... No. No. Absolutely no thought had gone into how the damn thing interfaces with search engines.
There is no attempt to keep "useless" pages or duplicate content out of the index.
The same things are true of many such programs, scripted packages: CMS, carts, forums, etc, even stuff like vBulletin and PHPbb fare little better.
>> And who will never fix their site's technical errors, let alone admit that they exist, or should be fixed. <<
I recently spent several hours in on/off PM exchanges about a site showing obvious duplicate content problems (from "same meta description on every page" design).
Even after showing explicit search examples that screamed "big problem here", I was brushed off with "that can't be the problem".
Having seen and fixed the same problem on several dozen websites alone this year, I can assure you that it is the problem.....
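The "same meta description on every page" problem described above is easy to detect in bulk once you have the pages fetched. A small Python sketch (the names and the naive regex are illustrative; a real audit would use a proper HTML parser) that groups pages by meta description and reports any description shared by more than one URL:

```python
import re
from collections import defaultdict

# Hedged sketch: group already-fetched pages by their meta description so
# that boilerplate ("same description on every page") shows up immediately.
# The naive regex assumes name= comes before content= in the tag.
DESC_RE = re.compile(
    r'<meta\s+name=["\']description["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)

def duplicate_descriptions(pages):
    """pages: {url: html}. Return {description: [urls]} shared by 2+ URLs."""
    groups = defaultdict(list)
    for url, html in pages.items():
        match = DESC_RE.search(html)
        groups[match.group(1) if match else "(no description)"].append(url)
    return {desc: urls for desc, urls in groups.items() if len(urls) > 1}
```

Any group with more than one URL is a candidate duplicate-content footprint; the fix is per-page, database-driven descriptions rather than a single hard-coded string in the template.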
What shopping cart do you all recommend over oscommerce?