
Google SEO News and Discussion Forum

    
I changed my site structure. Ooooops
deborahbaker




msg:4515566
 8:29 am on Nov 4, 2012 (gmt 0)

I have a very well-established site which has been live since around 1996. It was on page 1 of Google for many keywords, normally in position 1, in a not very competitive market. It is the ultimate site for its topic.

In March this year I changed the site structure. Initially it was a landing page, e.g. www.mydomain.com, with CubeCart in a folder called 'store', plus a few old established htm pages to beef it up.

I decided to get rid of the htm landing page and move the store home page from mydomain.com/store/index.php to mydomain.com/index.php. I set up redirects through .htaccess to keep the back links and page rank.

Since then the site has gradually dropped in the rankings, and visitors have dropped from 7000 unique to 1300 unique. I checked all my files and noticed that I had not altered my robots.txt file, which was stopping spiders accessing the /images/ folder. Originally that folder held just images, but of course when I restructured the site it became one with lots of functional CubeCart files in it. So I thought that was the reason. I have changed the robots.txt file to give the spiders full access.
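For anyone wanting to see how a rule like that behaves, here is a minimal Python sketch using the standard library's urllib.robotparser. The two rule sets and the example.com paths are made up for illustration and are not the actual robots.txt from this site, and Python's parser is not Google's. The point is only that a Disallow on /images/ blocks every file under that folder, scripts included, while an empty Disallow blocks nothing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt resembling the one described: /images/ is blocked.
OLD_RULES = """\
User-agent: *
Disallow: /images/
"""

# Hypothetical relaxed version: an empty Disallow permits everything.
NEW_RULES = """\
User-agent: *
Disallow:
"""


def can_fetch(rules, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Paths are illustrative only.
    for path in ("/index.php", "/images/logo.gif", "/images/cart.js"):
        url = "http://www.example.com" + path
        print(path, can_fetch(OLD_RULES, url), can_fetch(NEW_RULES, url))
```

Running it against your own file is a matter of pasting the real rules into one of the strings and listing the paths your pages depend on.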

My questions are:

1. Does anyone know if the robots file could have caused this to happen?
or
2. Should I move the CubeCart files back to their original position in the site structure and put back the htm landing page, as it was probably more SEO friendly? Do you think an htm home page is a better SEO option than the CubeCart index.php page?
Or
3. Could it be something else I have not thought of? It just seems too coincidental, as the site has had these positions for 20 years and was not affected at all by Penguin or Panda.

Any help would be really appreciated.
Thanks

 

SevenCubed




msg:4515621
 4:49 pm on Nov 4, 2012 (gmt 0)

These days it's really difficult to know anything with certainty regarding Google. My GUESS would be that it is just the behaviour that Google exhibits after a site structure change. The site is typically dropped back to a staging region in the SERPs, reassessed, then floated back to the area where it used to be if the changes aren't too drastic. That has always been my experience with changes, going back 4-5 years. I doubt it has anything to do with the Google turmoil of recent times. In the past I knew I might have to wait as much as 3 weeks for it to bounce back; these days I have no idea what the timeline might be.

Out of curiosity have you noticed if your home page has been replaced with an inner page and that page appears maybe somewhere around page 7 (no not because I love that number!) in the SERPs?

smallcompany




msg:4515622
 5:05 pm on Nov 4, 2012 (gmt 0)

I set up redirects through .htaccess to keep the back links and page rank.
What did you do for this exactly? Sometimes technical mistakes block search engines or send them in the wrong direction.
If it was as simple as you've described - from subfolder to the root of your domain - you should be able to achieve it with a single line in your .htaccess.
Also, have you checked what G says for your robots.txt in WMT?

brinked




msg:4515637
 5:56 pm on Nov 4, 2012 (gmt 0)

Nobody here can really give you the answer you're looking for. There are just too many variables in play.

When you change your site structure you are usually changing a lot of different things on your site. Your site will need to be thoroughly reviewed.

There is some correlation between Panda and site navigation, and poor navigation can cause a Panda penalty.

I had a client a few months ago who was in a similar situation to yours. It turned out he had enabled rel canonical on some pages, but it was in fact canonicalizing all pages to one page on the site, causing a massive site de-index. Such a simple thing was overlooked for many months.

deborahbaker




msg:4515828
 11:17 am on Nov 5, 2012 (gmt 0)

Hi, thanks for the replies.

Out of curiosity have you noticed if your home page has been replaced with an inner page and that page appears maybe somewhere around page 7 (no not because I love that number!) in the SERPs?
Not that I can see.

What did you do for this exactly? Sometimes technical mistakes block search engines or send them in the wrong direction.
If it was as simple as you've described - from subfolder to the root of your domain - you should be able to achieve it with a single line in your .htaccess.
I have a lot of redirects in my .htaccess file because I had three landing pages and I have another URL pointing to this domain, so it's a bit complicated. I think the .htaccess file is okay because I got the hosting company to do it for me, but I am not 100% sure.

Also, have you checked what G says for your robots.txt in WMT?
I had a look and asked WMT to fetch the file. It said "Crawl postponed because robots.txt was inaccessible". I don't understand this, as it is a basic robots.txt file and is just sitting there in my root directory waiting to be spidered! So I ran the "test" function and this is the result.

Test results

Url: <Mod's note. Removed live link to actual site. Likely to remove subsequent discussion referring to actual site itself.>
Googlebot: Allowed by line 2: Disallow: (Detected as a directory; specific files may have different restrictions)
Googlebot-Mobile: Allowed by line 2: Disallow: (Detected as a directory; specific files may have different restrictions)

So is it okay or not, do you think? WMT has now totally confused me.

I had a client a few months ago who was in a similar situation to yours. It turned out he had enabled rel canonical on some pages, but it was in fact canonicalizing all pages to one page on the site, causing a massive site de-index. Such a simple thing was overlooked for many months.
This is something I have to do too - canonicalise. Any advice or tutorial on this would be appreciated.

One other thing: when I checked WMT it said "no sitemaps", so I entered the URL and asked it to test the sitemap, and it said it was okay. When I go back in again it says "no sitemaps". I checked my sitemap and got an error: "Missing "charset" attribute for "text/xml" document". I did a bit of research and was even more confused as to whether to use an internal DTD or external DTD or whatever.

My question is, can anyone give me the exact statement I need to insert into my sitemap please? Also, is this what is stopping Google seeing the sitemap? I used an online site to produce the sitemap.

[edited by: Robert_Charlton at 5:16 pm (utc) on Nov 5, 2012]
[edit reason] removed domain name, fixed paragraph formatting [/edit]

Marketing Guy




msg:4515856
 1:40 pm on Nov 5, 2012 (gmt 0)

At what rate has the traffic diminished?

Generally, changing the URL structure of a site and 301ing old to new will result in some traffic loss initially, which should return over a number of weeks to near normal again (there is a PR loss when following a redirect, so rankings probably won't be 100% what they were).

A very simple example: I recently changed a site from WWW to non-WWW. The result was about a 40% drop in traffic after a few days, with a 10% increase week on week after that for perhaps 4-5 weeks. The site stabilised at around 95% of normal traffic at the end.

That's the trend you should be seeing if the issue is related to your redirects. If the pattern is different, then the problem most likely is with something else.

Could be anything really. I just had a quick look at the site you mentioned above and the couple of things that stood out were:

1) Server was down - a few connection attempts just now and pages didn't load.

2) Use of misspellings in optimisation.

Potentially those could send out negative signals to Google. However, I'd say Google is smart enough to ignore the occasional server outage, so unless that's a consistent problem, I don't think it would affect your rankings. The misspellings may be a minor sign of optimisation and could be associated with spam sites that list variations of keywords. However, for an old site like yours, it's also likely that the age / authority outweighs this potential issue.

On further inspection, it looks like your CMS is duplicating some content. I won't post the full URLs as mods will likely snip all references, but:

1) Is product URL - /product_###.html
2) Is index.php page - /index.php?_a=viewProd&productId=###&review=write
3) Is variation on product URL - /product_###.html?review=read

So, I'd hazard a guess that a combination of the structure changes and multiple URLs for the same content is confusing Google. The dilution effect would mean that an individual page may not rank as well. And if Google is slowly discovering the new URLs, then the impact might be a gradual decline in traffic.

The URL parameters feature in Google Webmaster Tools can help here (you can use it to get Google to ignore URL variation number 3).

Solving the remainder of the issues depends on what the actual problem is. Which of the two remaining URLs is the "proper" version? The index.php or the folder structure version? Either 301 old to new, or use rel=canonical to resolve it.
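One way to see how the three variants collapse to a single page is a small normalisation sketch in Python. It assumes, purely for illustration, that the /product_###.html form is the preferred version and that the review parameter never changes the page content. Neither is confirmed in this thread, and the helper names and example.com URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters do not change what the page shows.
IGNORED_PARAMS = {"review"}


def canonical_url(url):
    """Map a URL variant to the form assumed here to be the preferred one."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    path = parts.path

    # Assumption: index.php?_a=viewProd&productId=N is the same page as
    # product_N.html in the same folder.
    if (path.endswith("/index.php")
            and query.get("_a") == "viewProd"
            and "productId" in query):
        path = path[: -len("index.php")] + "product_" + query["productId"] + ".html"
        query.pop("_a")
        query.pop("productId")

    kept = [(k, v) for k, v in query.items() if k not in IGNORED_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


def group_duplicates(urls):
    """Return {canonical: [variants]} for every page reachable by 2+ URLs."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return {canon: found for canon, found in groups.items() if len(found) > 1}
```

Feeding it a list of URLs taken from a crawl or from server logs gives a quick picture of which pages are reachable under more than one address, which are the candidates for a 301 or rel=canonical.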

Hope this helps!
Scott

deborahbaker




msg:4515863
 1:58 pm on Nov 5, 2012 (gmt 0)

On further inspection, it looks like your CMS is duplicating some content. I won't post the full URLs as mods will likely snip all references, but;

1) Is product URL - /product_###.html
2) Is index.php page - /index.php?_a=viewProd&productId=###&review=write
3) Is variation on product URL - /product_###.html?review=read
I had no idea this was happening. Also, I don't know how you found it out, but thanks. I did disable the review feature in CubeCart a while ago by editing some of the files. Do you think this is what has caused these additional URLs? If so, if I change the files back to allow reviews, do you think it will resolve this issue? If I do change the files back, could you check by whatever methods you used to find this issue in the first place? How did you do it, by the way?

BTW, the misspellings aren't misspellings. I have tried to cover the English and American versions of the keywords, which are spelt differently.

Marketing Guy




msg:4515867
 2:08 pm on Nov 5, 2012 (gmt 0)

You can check by grabbing the first paragraph of text from a product page and Googling it. Different pages may return different results - the example I looked at returned the variations I showed. Another didn't (maybe Google hasn't indexed the variations of these yet), but did return the same copy used by another website.

Go through analytics, find the pages that have dropped the most, and repeat this process for each. You'll eventually see a pattern of issues (the content being presented on multiple URLs).

No idea about the review feature with CubeCart, never used the CMS. But I have resolved similar issues via WMT / redirects as outlined above.

It'll take a while to notice any impact of changes - you need to wait for Google to reindex the pages first. But your traffic logs should show some improvement if you make the changes.

I understand your use of different spellings, but to be honest I think you can pick one style and run with it. Google should be able to handle it and rank the page for appropriate terms. It's like optimisation vs optimization - personally I'd go with whatever is your local spelling. Going off on a tangent and possibly a different discussion, but I think you need to weigh up user experience vs SEO on this one. It may have been the case in the past that SEO won out, but I think nowadays UE is more important. Personal opinion of course, but it might be worth testing to see how your conversions go once all the tech stuff is sorted out. :)

g1smd




msg:4515916
 4:44 pm on Nov 5, 2012 (gmt 0)

move the store home page from mydomain.com/store/index.php to mydomain.com/index.php

URLs should never include the index filename as it is a duplicate of the variant without it.
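As a rough illustration of that rule, the Python sketch below rewrites a URL that ends in an index filename to its bare folder form, which is the address a redirect would normally point at. The list of index filenames and the example.com URLs are assumptions for the example; a real server may be configured with different ones.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the filenames the server treats as a folder index.
INDEX_NAMES = ("index.php", "index.htm", "index.html")


def strip_index(url):
    """Return the URL without a trailing index filename, if it has one."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last in INDEX_NAMES:
        return urlunsplit(
            (parts.scheme, parts.netloc, head + "/", parts.query, parts.fragment)
        )
    return url


if __name__ == "__main__":
    print(strip_index("http://www.example.com/store/index.php"))
    print(strip_index("http://www.example.com/index.php"))
```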

lucy24




msg:4515925
 5:13 pm on Nov 5, 2012 (gmt 0)

I checked my sitemap and got an error: "Missing "charset" attribute for "text/xml" document". I did a bit of research and was even more confused as to whether to use an internal DTD or external DTD or whatever.

Is it xml? It should look like this, minus the {braces}. Note the first line. Question marks are part of the format.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="{blahblah}">
<url>
<loc>http://www.example.com/</loc>
{ <changefreq>monthly</changefreq> }
</url>
{ et cetera }
</urlset>

blahblah = http: // www.sitemaps.org/schemas/sitemap/0.9 (without spaces). I thought this was the people that made my sitemap, but it's the exact format from Google's how-to page. NOT your own sitename here! That goes in place of "example.com".

{ changefreq } one of several optional tags. They don't do anything, but you might include them for your own information.

{ et cetera } repeat the url package for each page you want to list. You do not need to list every single page; in fact you don't need a sitemap at all unless you've got backwaters that a robot might not find.

Oh, yes, and unless your URLs contain non-ASCII characters --not likely if your site is English-- the charset declaration won't change anything. It will just make the search engines happy.
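A quick way to check that a sitemap file has this shape is to parse it. The Python sketch below uses only the standard library; the sample document is the template above with the placeholders filled in, and the function name is hypothetical. It checks only the body of the file (root element, namespace, loc entries), not how the server delivers it.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# The template from the post with the placeholders filled in.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <changefreq>monthly</changefreq>
  </url>
</urlset>
"""


def sitemap_locs(xml_bytes):
    """Return every <loc> value, or raise ValueError if the root is wrong."""
    root = ET.fromstring(xml_bytes)
    expected = "{" + SITEMAP_NS + "}urlset"
    if root.tag != expected:
        raise ValueError("unexpected root element: " + root.tag)
    loc_tag = "{" + SITEMAP_NS + "}loc"
    return [loc.text.strip() for loc in root.iter(loc_tag) if loc.text]


if __name__ == "__main__":
    print(sitemap_locs(SAMPLE))
```

If the file is not well-formed XML, ET.fromstring raises a ParseError with the line and column of the problem, which is often enough to find a broken entry.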

Robert Charlton




msg:4516045
 8:50 pm on Nov 5, 2012 (gmt 0)

deborahbaker - I've edited one of your earlier posts to remove a link to your site, which has allowed some other members to look at it and give their feedback. As the Google Forum Charter [webmasterworld.com] explains, we ask members not to post such links, for their own protection as well as the protection of everyone here. We don't generally offer public site reviews.

Since you posted the link, though, we now have a situation where some comments in this thread are based on observation by those who have seen your site, and some by those who haven't. As it's quite a bit of work to clean that up, I'm going to let those stand, and I'm going to add a comment of my own.

When I took a look, I saw that much of the content on your site is not unique. Though I won't post the specific phrases, one thirteen-word text string I searched for in quotes returned 21,800 pages containing the exact text. Almost everything else I saw on the site is quoted on numerous other sites. I don't know whether your content is highly derivative, or whether you've simply been scraped a lot. Perhaps the topic is more popular than you think, or perhaps the prevailing business model in your niche doesn't encourage originality.

It's possible that the change in your structure has caused Google to reevaluate your site... that seems to be Google's pattern now... and that the dupe content may be causing your site to drop. This is apart from the technical problems it appears you're having.

g1smd




msg:4516068
 10:12 pm on Nov 5, 2012 (gmt 0)

There's a recent NYT article that explains (part way through) how a US political site got dinged for specific types of on-site duplicate content in recent years. Very illuminating.

mslina2002




msg:4516081
 11:13 pm on Nov 5, 2012 (gmt 0)

I was not able to see the url but was able to find your site. Well, what I believe is your site.

My first thought was what Robert Charlton posted.
It's possible that the change in your structure has caused Google to reevaluate your site...
Usually when you have a dramatic change in site structure there will be a full recrawl and hence more discoveries by more savvy bots.

One thing also: some pages you should "noindex, follow", for example your search results pages and your "tell a friend" pages. (If the site I see is yours, I see about 4000 or more "tell a friend" pages in the index.)
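For reference, "noindex, follow" is normally set with a robots meta tag in the page head. The Python sketch below, using only the standard library, reads that tag out of a page so a batch of pages can be checked. The class and function names are hypothetical, and the HTML snippet is a made-up example rather than markup from the site discussed here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {
                d.strip().lower() for d in content.split(",") if d.strip()
            }


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(robots_directives(page))
```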

deborahbaker




msg:4516237
 10:05 am on Nov 6, 2012 (gmt 0)

Is it xml? It should look like this, minus the {braces}. Note the first line. Question marks are part of the format.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="{blahblah}">
<url>
<loc>http://www.example.com/</loc>
{ <changefreq>monthly</changefreq> }
</url>
{ et cetera }
</urlset>

blahblah = http: // www.sitemaps.org/schemas/sitemap/0.9 (without spaces). I thought this was the people that made my sitemap, but it's the exact format from google's how-to page. NOT your own sitename here! That goes in place of "example.com".
Thanks for that. Yes, that is exactly what I have, but GWT is still saying "no sitemaps found for this site".
Does anyone think it is because I have redirected www.mydomain.com/index.htm to www.mydomain.com/index.php? Is this bad practice?

deborahbaker - I've edited one of your earlier posts to remove a link to your site
Sorry, it slipped in when I copied and pasted the results.

When I took a look, I saw that much of the content on your site is not unique
Yeah, you are right, and I know about unique content, but I think due to the age etc. of this site it has not been an issue. I have been gradually replacing it. You know, when this site was developed Google did not even exist. It was just Yahoo, and since its birth it has had number 1 placements. You are right, I did copy some of the content, because duplicate content was not an issue then. But a lot of it is my original content which has since been duplicated by other sites. Also, my site gets plenty of new unique content: every time a product is uploaded there is a huge description added. I really don't think this is the reason for its recent problems, and that is one of the reasons I haven't bothered too much about the content (if it ain't broke, don't fix it, which is exactly what I wish I hadn't done 6 months ago, as I wouldn't be in this position now). I am sure it is one of the other reasons above that I am trying to resolve, but I do take your point and I am going to change all the content. Incidentally, since I changed the robots.txt file my site is climbing back up the rankings. But I want to resolve all my above issues, so please, if anyone can help, post here.

One thing also: some pages you should "noindex, follow", for example your search results pages and your "tell a friend" pages. (If the site I see is yours, I see about 4000 or more "tell a friend" pages in the index.)
Thanks, I hadn't thought of this. I really have opened up a whole can of worms that I now have to deal with.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved