[webmasterworld.com...]
The URLs pertaining to my website that all point to my index page take the following form.
www.mydomain.com/?S=AC3%26Document=document
www.mydomain.com/?S=AC3%26Document=document
www.mydomain.com/?SID=xRSUNVW8R9P44HSYQ6UWED&
www.mydomain.com/?S=AC3%26Document=document
www.mydomain.com/default.asp?S=AC3&am
www.some-other-URL.com/go.php?id=aHR0cDovL3d3dy5jcmVkaXRjaGFtcGlvbi5jb20v
www.some-other-URL-2.com/go.php?id=aHR0cDovL3d3dy5jcmVkaXRjaGFtcGlvbi5jb20v
www.some-other-URL-3.com/file/callink.php?linkid=3
I have emailed Google but have received no reply. I am unsure what I can do to A) eliminate the incorrect URLs that appear to originate from my site and B) eliminate the mirror URLs that originate from unrelated websites.
Any help would be greatly appreciated.
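A side note on the go.php links listed above: the id value looks like Base64, which is one common way for redirect scripts to carry the destination URL. Here is a minimal Python sketch for checking where such a link points, assuming the parameter really is plain Base64. The helper name decode_redirect_target is made up for illustration, and links keyed on a database number (like linkid=3) cannot be decoded this way.

```python
import base64
from urllib.parse import urlsplit, parse_qs


def decode_redirect_target(link, param="id"):
    """Return the Base64-decoded value of `param`, or None if it does not decode.

    Assumption: the redirect script stores the destination URL as plain
    Base64 in the query string. That is a guess based on how the ids look,
    not something confirmed by the sites involved.
    """
    # Add a scheme if missing so urlsplit separates host from query.
    if "://" not in link:
        link = "http://" + link
    values = parse_qs(urlsplit(link).query).get(param)
    if not values:
        return None
    raw = values[0]
    # Note: parse_qs turns '+' into a space, so ids containing '+'
    # would need extra handling before decoding.
    # Restore any '=' padding that was stripped from the value.
    raw += "=" * (-len(raw) % 4)
    try:
        return base64.b64decode(raw, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


if __name__ == "__main__":
    print(decode_redirect_target(
        "www.some-other-URL.com/go.php?id=aHR0cDovL3d3dy5jcmVkaXRjaGFtcGlvbi5jb20v"
    ))
```

If the decoded value is your own home page, that at least confirms the link is a redirect to your site rather than a copy of it.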
They are in the supplemental index, with a cache date of 1969. I've just added those URLs to my robots.txt to be disallowed and submitted them to the URL removal tool. I also notice a few other domains that redirect to my site in the supplemental index, but obviously I can't do anything about those. Finally, I see that my domain name without www is listed - I suppose I should read the threads on what people recommend doing for that.
I'd be very surprised if any of this is the reason my rankings dropped Sept 22, but I'd be happy if removing those URLs returns the rankings nonetheless.
addendum:
When you search for the name of my site: it ranked #1 before Sept 22, and since then it has ranked anywhere from 6th to 25th for the site name. Today, if I add &filter=0 to the search string, I'm first. Not sure if that means anything or not.
I used the removal page around Dec 1st, and most of the stuff was removed within a day. A directory I had somehow submitted for removal twice, once as /directory/ and once as /directory, took a while before it was removed. Exactly 7 days after that was gone, my site is back at #1 for its index page and 'obvious' search term.
Inner pages are still hit and miss, but better than before for sure.
Not out of the woods yet, but being back where I was still #1 with &filter=0 is nice.
I've implemented a dynamic robots.txt that serves one of two different robots.txt files based on the subdomain:
www.example.com/robots.txt ---> accept robots.txt
valid.example.com/robots.txt ---> accept robots.txt
invalid.example.com/robots.txt ---> deny robots.txt
where the deny robots.txt is simply:
User-agent: *
Disallow: /
As far as the behavior for the index.htm of each subdomain, I've implemented the following:
www.example.com ---> valid main domain index.htm
valid.example.com ---> valid subdomain index.htm
invalid.example.com ---> 404
I'm not sure if it's better to do a 404 on an invalid subdomain or a 301 redirect to the main domain. My hope is the 404 with a deny all robots.txt will get rid of my invalid subdomain problem.
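The setup described above can be sketched roughly as follows. This is only an illustration in Python, not the poster's actual implementation: the function name respond, the host list, and the contents of the accept robots.txt (assumed here to allow everything) are all assumptions. Only the deny file is taken from the post.

```python
# Hypothetical host list; the real subdomain names are not given in the thread.
VALID_HOSTS = {"www.example.com", "valid.example.com"}

# Assumed accept file: an empty Disallow means nothing is blocked.
ACCEPT_ROBOTS = "User-agent: *\nDisallow:\n"
# The deny file quoted in the post.
DENY_ROBOTS = "User-agent: *\nDisallow: /\n"


def respond(host, path):
    """Return (status, body) for a request, keyed on the Host header."""
    # Ignore case and any :port suffix in the Host header.
    host = host.lower().split(":")[0]
    valid = host in VALID_HOSTS

    if path == "/robots.txt":
        # Every host gets a robots.txt; only the contents differ.
        return 200, (ACCEPT_ROBOTS if valid else DENY_ROBOTS)
    if not valid:
        # Invalid subdomains return 404 for everything else.
        return 404, "Not Found"
    # Placeholder for the real index page of a valid host.
    return 200, f"index page for {host}"


if __name__ == "__main__":
    for h in ("www.example.com", "valid.example.com", "invalid.example.com"):
        print(h, respond(h, "/robots.txt")[1].splitlines()[1], respond(h, "/index.htm")[0])
```

Swapping the 404 branch for a 301 to the main domain would be a one-line change, but which of the two gets the invalid subdomains dropped sooner is exactly the open question here.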
Unfortunately, I have so many invalid subdomains in Google's index that I can't use their tool to remove all of them. I guess I'll just have to wait to see if they get removed through subsequent crawls.
Also, I tried using an absolute URL in a robots.txt and both the validator and Google's removal tool choked on it.
From MY experience, Google takes a long time on 404s and 301s. Removing them, no matter how much of a pain, is better.
c
How does that work? I mean, how were you able to do that?
There was a request by Googleguy to send examples using a special keyword in the email subject line. See message 336 of the following thread:
[webmasterworld.com...]
Yeah, I posted it ;). I hope they're working on it at least.
Easy there... not letting you escape with just a brief comment ;). Can you please share more, either here or via PM?
thanks,
A lot of this has to do with redirecting. Isn't cloaking some kind of redirecting? So why don't they just ban meta redirects? There is no long-term use for them anyway.
Furthermore, the remaining redirect sites that are showing in the site:mysite.com search are cached November 2, whereas my own pages are cached earlier this week. This MAY indicate that Google is correcting the problem (if the algorithm is no longer requesting those pages). Perhaps another update or two will show them gone! Just speculation :)
I suppose it may only indicate that those pages were deemed mirrors/redirects, and not worthy of a recrawl. Fingers crossed for the best.
Not yet. About a week ago I created a robots.txt to remove clutter from the Google index (old, outdated, incorrect, and duplicate urls). Google has removed those and recrawled my site. Unfortunately, the internal pages are still indexed with url only, and the home page is not showing at all in a site:mysite.com search. The redirects are still there, as I mentioned, with cache of Nov 2.
So for now, I am not showing any improvement in my position, but I did get my inbound links back this week!
I am also finding that the inurl: command may not be working normally on some datacenters. I have been periodically monitoring the inurl:tracker2.php search for changes in the way those urls were indexed. Recently I noticed the urls listed by url only (possibly the result of a penalty/action by Google). Now when I run the search inurl:tracker2.php, I get a 403 Forbidden Access page from Google when I click past the first page. Very odd to me. Wonder what's up with that.
C
Additional Info: The Forbidden Access page is telling me "... we can't process your request right now. A computer virus or spyware application is sending us automated requests, and it appears that your computer or network has been infected." Again, this message appears only when I click pages beyond the first page of serps for the inurl:tracker2.php search. All other inurl: searches I have tested (for comparison) are performing normally.
My laptop is brand new (just got it yesterday), so there is no virus causing "repeated requests" for that particular search. And, I myself have only attempted that search two or three times and I get the same result from another computer. I wonder what, if anything, it may signify w.r.t. the future of tracker2.php/redirects/hijacking.
I cleared my cookies and it still happens... so is this redirect being done based on my IP address then?
No, but now I see a bunch more redirects to my site. Where's my dust pan? Women's work is never done!
Google clearly has a mess on their hands. I had convinced myself that the redirects containing tracker2 were the ones responsible for the removal of my index page. However, today I discovered that a simple search for www.mydomain.com does show my title and description, but if you mouse over the url, or click the "Google Cache" link, it is apparent that this url is a very simple redirect of the following form:
[some-other-site...]
No tracker2! Google actually thinks my original url was replaced by this one. Of course, searching site:mydomain.com reveals all the other redirects that Google thinks are mine. As I mentioned, those have a Nov. 2 cache date. Maybe they will go away soon. No new redirects have been tied to my site in the past week or so.
allinurl: index.php (any other than 1st page) =>
"... we can't process your request right now. A computer virus or spyware application is sending us automated requests, and it appears that your computer or network has been infected."
allinurl: index.html => normal result
Yup, Google is well buggered (as we say in Yorkshire).
energylevel: the redirect to G.co.uk is also normal for myself and is, AFAIK, based on IP geo-location.
Seeing major changes on 216.239.39.104 with respect to redirects to my site. There are no redirects remaining when I do a site:mysite.com search. Furthermore, the tracker2.php urls are listed with urls only...no title/descriptions pulled from the sites they hijacked. The tracker2.php urls that once were showing in site:mysite.com search are still in the index, just no longer tied to my site. I think we are seeing some changes in the right direction with respect to this hijacking fiasco.
Chris
Forgot to mention - I was also able to get one url removed through the DMCA process, and the results were posted on chillingeffects. It looks like one of the others is still there, although it was included in the DMCA notice as well. Of course, the host took the pages down too, but I don't know how much that feeds into this.