
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Need Help! - no data in WMT - is it robots.txt or https?
Something has happened to my search query data, I need help getting it back
Scarlett1985




msg:4413700
 11:13 pm on Feb 2, 2012 (gmt 0)

This is going to sound really strange as I'm not very technically minded and I don't have all that much information to give. But I really need some help with this if someone is able to ...

I've been looking after a client's website, basically just doing a bit of SEO here and there. They are away at the moment, so I have no contact with them or their developers, but something has gone wrong on their site. The other day the search query data in Webmaster Tools suddenly stopped coming through. No rankings have dropped and I'm still getting traffic data in Analytics, but there is nothing for impressions or CTR in Webmaster Tools. It's completely flat-lined.

I resubmitted the site for verification in Webmaster Tools, but this seemed to have no effect. Now, this is where my knowledge of things starts to fade: there are a lot of pages blocked by robots.txt (Webmaster Tools reports 10,000). The site is a movie streaming site, so I can understand there are quite a few pages they want to block. I'm not sure this is the problem, though, as I was getting the Webmaster Tools data fine for months and nothing has changed in the robots.txt that I know of.

Another thing that might be an issue is that the entire site is https. I'm assuming that the http version redirects to the https one, because the https version is the only one that is indexed. I'm confused, though, as all of the backlinks point to the http version. And I've just realised that they don't even have a sitemap! Also, the robots.txt is called [mysite.com...] The more I read about this stuff, the more I realise how much of a mess this site is. The main thing I need, though, is for the search query data to start coming through again; I can deal with the rest later. It's really strange, because everything was working fine before and now it has suddenly stopped, and I don't think any changes have been made.

Does anyone have any suggestions about how I can get this working again? Any help would be much appreciated. And, remember - I'm not all that familiar with coding!
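
The assumption in the post above, that the http version redirects to the https one, can be checked without any access to the server. The following is a minimal sketch, not a definitive tool: the function names are made up for illustration, and www.example.com stands in for the real site. It sends a single HEAD request without following redirects and reports whether the response points at an https address.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Status codes that signal a redirect; 301 and 308 are the permanent ones.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def points_to_https(status: int, location: Optional[str]) -> bool:
    """True if the response is a redirect whose Location header is an https URL."""
    if status not in REDIRECT_CODES or not location:
        return False
    return urlsplit(location).scheme == "https"


def fetch_without_following(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status, Location header).

    http.client never follows redirects on its own, so the first
    response from the server is what comes back.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `points_to_https(*fetch_without_following("http://www.example.com/"))` with the real hostname substituted would confirm or rule out the redirect. The status code matters as well: a 301 marks the move as permanent, which is the form generally recommended when old backlinks point at the http address.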

 

lucy24




msg:4413767
 2:50 am on Feb 3, 2012 (gmt 0)

Setting aside the possibility that Google has just got the hiccups and everything will be back to normal tomorrow (which really does happen)...

"Also, the robots.txt is called http://mysite.com/robots.txt."


OK, now you understand why the forum rules insist on the specific form "example.com". Everything else turns into a clickable link and nobody can see what you typed or, in this case, figure out what you mean. I really hope you don't mean that the actual name of the file is, in full,
http://www.example.com/robots.txt
?! (Let's stipulate that the final . just sneaked in for grammatical reasons.) Not that that would cause the problems you describe; it would just be as if robots.txt didn't exist at all.

But that brings up one easy thing to check: Have you looked in the "crawler access" area of GWT? That's where you can see the robots.txt they're currently working from, and feed in random URLs to see if Googlebot can get in.
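
The same kind of test can be approximated offline. Here is a minimal sketch using Python's standard `urllib.robotparser`; the rules in `SAMPLE_ROBOTS_TXT` and the `is_allowed` helper are invented for illustration, since the real file is not shown in this thread. Python's parser does plain prefix matching and does not implement Google's wildcard extensions, so the GWT tool remains the authority on what Googlebot will actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the real site's file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /stream/
Disallow: /account/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/",
        "https://www.example.com/stream/movie-1",
    ):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, url) else "blocked"
        print(url, "->", verdict)
```

Pasting the site's actual robots.txt into `SAMPLE_ROBOTS_TXT` and trying a handful of URLs that used to show up in the search query report would show whether any of them fall under a Disallow rule.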

Also look in the Parameters area, if the site uses them. It's in the same general area as "crawler access".

A sitemap isn't always necessary. If all parts of the site are nicely linked, preferably via direct HTML rather than tucked away in JavaScript, Googlebot will find them anyway.

***

Whoops! Gotta go. When I went to GWT to make sure I was calling things by the right names, I found a batch of new probably-spurious links from a probably-garbage site near the top of the list. Off to investigate.

Scarlett1985




msg:4413777
 4:00 am on Feb 3, 2012 (gmt 0)

Hi Lucy24 - as you can tell, I'm pretty new to all of this. You're right, the final . was just a grammatical error.

I'm really hoping that this is just a Google hiccup, although it has been like this for over a week now. I took a look at the crawler access section of WMT, and if I put in the https URL (the one that is ranking) and hit 'test', it gives me a 'not in domain' message. Is that strange when it's the https pages that are indexed?

There are a few parameters listed but I'm not really sure what they mean ...
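
One possible reading of the 'not in domain' message, offered as an assumption rather than a diagnosis: if the site was added to WMT under its http:// address, then an https:// URL counts as belonging to a different site, because the scheme is part of the site's identity. The sketch below illustrates that comparison; `in_verified_site` is a made-up helper and the example.com addresses are placeholders.

```python
from urllib.parse import urlsplit


def in_verified_site(verified_site: str, url: str) -> bool:
    """True if url shares scheme and host with verified_site and sits under its path."""
    site, target = urlsplit(verified_site), urlsplit(url)
    same_scheme = site.scheme == target.scheme
    same_host = site.netloc.lower() == target.netloc.lower()
    under_path = (target.path or "/").startswith(site.path or "/")
    return same_scheme and same_host and under_path
```

Under this assumption, `in_verified_site("http://www.example.com/", "https://www.example.com/movie")` is false even though the hostnames match, which would fit both the error message and data that stops arriving for a site that is indexed only under https. If that is what happened, adding and verifying the https:// address as its own site in WMT would be the thing to try.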


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved