Forum Moderators: Robert Charlton & goodroi

Google indexing pages my site doesn't have

odd PHP dynamic pages showing up

ddd5280

3:52 pm on Jun 3, 2005 (gmt 0)

10+ Year Member



When I do "site:example.com" in Google I get back more than just a list of the pages I have indexed on Google. I also get extra dynamic pages I don't even have, for example:

www.example.com/index.php?kid=32&catname=Nebraska

I use ColdFusion and not PHP, and I have no such variables as "kid" or "catname" in my site.

The shared hosting ColdFusion server I am on does have PHP installed, which might be a clue. I asked my host and they have no clue. Another clue is that I do have pages on each state, like Nebraska.

I did a search to see if anyone is linking to my site that way too, and I found nothing.

Anyone else getting this?

[edited by: ciml at 6:51 pm (utc) on June 3, 2005]
[edit reason] Examplified [/edit]

diamondgrl

11:30 am on Jun 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, what does Google show in the cache for these pages?

Drumat5280

2:05 pm on Jun 7, 2005 (gmt 0)

10+ Year Member



No cache, just "similar pages" under the title.

Rx Recruiters

3:11 pm on Jun 7, 2005 (gmt 0)

10+ Year Member



Hello,

I have had exactly the same problem. When I do a search inurl:mysite.com, it returns a ton of php pages that have been spidered by Google, but are not part of my domain. This has happened, not only on my main domains, but on parked domains as well!

I assume that this is some sort of "hi-jack"? It has only shown up in the past couple of months, and in that time period my Google refs went from 8000 a day to 2 or 3. (I was always ranked #1 for a ton of keywords, have been around since 1998, and am the "Most Popular" site in my cat in Yahoo, but have now fallen completely out of Google for the most popular keywords.) I didn't make any changes to the site during that time period, so I can only assume that the php hijacking, as well as the 302 redirect problems that started showing up, are to blame.

Thanks for any comments - This is my first post here, but I have been reading the forums for years.

Thomas

ddd5280

4:36 pm on Jun 7, 2005 (gmt 0)

10+ Year Member



Rx Recruiters,

Glad to hear that I am not the only one with this odd issue.

My site is only two months old, so I can't tell you if it is affecting my rankings.

Maybe we can figure this out together. Do you use subdomains like I do for each state in the USA?

Are you getting the exact same variables in the URL as I am?

Is there any way we can find the scraper site?

Rx Recruiters

4:57 pm on Jun 7, 2005 (gmt 0)

10+ Year Member



ddd5280 - I will have to research it some more, but truthfully, I don't know if we can do anything about someone else doing this to our sites. They seem to proliferate too fast. This should be something that Google can help with (or should address with their spidering), the same as with the 302 redirect indexing.

I have webpages on both my sites for targeting each state. Such as www. domain name .com /alabama-keyword-keyword.htm or something along that line. I wonder if that is coincidental that we both target state names, but in a particular industry niche.

ddd5280

5:06 pm on Jun 7, 2005 (gmt 0)

10+ Year Member



Rx Recruiters,

Are you getting 51 backlinks, one for each state and DC, the same as I am?

Is there a way to find out who is linking to my site in this fashion so I can maybe tell Google about it?

Rx Recruiters

4:22 am on Jun 8, 2005 (gmt 0)

10+ Year Member



I am getting all kinds of php backlinks, and the inurl: results change from day to day.

I was told that this was a form of php hijacking, but I don't know if there is anything we can do about it, or if that is even what killed my rankings.

I'm tackling the 302 redirects and the dup indexes (http:// and http://www) for now, but will keep digging on the php problem. It seems strange that other search engines don't have that problem, just Googlebot?

ramachandra

10:25 am on Jun 8, 2005 (gmt 0)

10+ Year Member



I too can see two php links with the “allinurl” operator: one of the links redirects to my website and the other to a link update page. Is it really affecting ranking? How can I remove these links?

shri

10:39 am on Jun 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does your software give a valid 404 error code when it cannot find the required content?

This has happened to us in this scenario.

1) Take a CMS driven site - A on IP address x

2) Take another CMS driven site - B on IP address y. It could be anyone's site...

Point site B to IP address x. Can happen if you move servers or someone misconfigures their IP address.

Bot fetches b.com/index.php?contentid=123

Your server rewrites it to a.com/index.php?contentid=123 with a 301

Your CMS does not return a proper 404 "not found". It returns a 200 with some vague error message. Know what? The bot thinks this is a valid URL on your site.

Be very careful with .htaccess and rewrite magic functions.
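
For anyone who wants to test this on their own server, here is a rough Python sketch of the check, not a definitive tool. It requests a URL that cannot exist and looks at the raw status code without following redirects. The helper names (`bogus_url`, `probe_status`, `classify_missing`) and the `index.php?kid=` pattern are my own assumptions, borrowed from the example URL at the top of the thread.

```python
import http.client
import uuid
from urllib.parse import urlsplit


def bogus_url(base):
    """Build a URL that should not exist on the site (hypothetical pattern)."""
    return f"{base.rstrip('/')}/index.php?kid={uuid.uuid4().hex}"


def probe_status(url, timeout=10):
    """Do one GET and return (status, Location header); redirects are NOT followed."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target, headers={"User-Agent": "soft-404-check"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def classify_missing(status):
    """Classify the status code returned for a page that should be missing."""
    if status in (404, 410):
        return "ok"          # server correctly says the page is gone
    if status == 200:
        return "soft-404"    # error page served as if it were real content
    if status in (301, 302, 303, 307, 308):
        return "redirect"    # check where Location points, then probe that too
    return "other"
```

Something like `classify_missing(probe_status(bogus_url("http://www.example.com"))[0])` should come back as "ok" on a well-behaved server. A "soft-404" or "redirect" result means a bot could treat made-up URLs as real pages on your site.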

Reid

11:17 am on Jun 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



allinurl:ww*.mysite.com

will show all results with ww*.mysite.com IN THE URL

That's all, it is not limited to your site. You are searching for a particular query string "www.mysite.com" in the URL of the results.

You have 52 subdomains? one for each state?

How many results will you get IN THE URL when you search for your main domain with allinurl? You get 53. The main and each sub has that query string 'in the url'.

You see appended type links like w*w.somesite.mysite.com?
This does NOT add up to a hijack.

If it has a cache of your page and your page title and snippet but some other URL, then that IS a hijack.
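
To make the difference concrete, here is a deliberately simplified Python model of the two operators. It is only a sketch of the idea (substring match anywhere in the URL versus a match on the host), not how Google actually tokenizes or evaluates queries. The function names and sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def matches_allinurl(url, terms):
    """Simplified allinurl: every term appears somewhere in the full URL."""
    lowered = url.lower()
    return all(term.lower() in lowered for term in terms)


def matches_site(url, domain):
    """Simplified site: the URL's host is the domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


# Hypothetical URLs: two on the site itself, one on somebody else's server
urls = [
    "http://www.example.com/",
    "http://nebraska.example.com/",
    "http://directory.example.net/out.php?url=www.example.com",
]

in_url = [u for u in urls if matches_allinurl(u, ["example.com"])]
on_site = [u for u in urls if matches_site(u, "example.com")]
```

In this model `in_url` keeps all three URLs, while `on_site` keeps only the first two. The third URL merely mentions the domain in its query string, which is the kind of result that looks alarming in an allinurl listing without being a hijack.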

stuartc1

2:37 pm on Jun 8, 2005 (gmt 0)

10+ Year Member



Another explanation could be that if the site was owned by someone before you, Google could be showing outdated cached links... probably a long shot. You could look at the Wayback Machine site for old pages!

theBear

3:18 pm on Jun 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Be even more careful with custom error pages, and NEVER point to your home page for any error message using the ErrorDocument directive.

Check the codes issued using a header checker (Google can be your friend).
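
If you would rather not rely on an online header checker, a minimal one can be sketched in Python along these lines. It sends HEAD requests and records the status code of every hop in a redirect chain. The names `head` and `redirect_chain`, the hop limit, and the `fetch` parameter are my own assumptions. Some servers answer HEAD differently from GET, so treat the output as a hint rather than proof.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head(url, timeout=10):
    """Send one HEAD request and return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def redirect_chain(url, fetch=head, max_hops=5):
    """Return [(url, status), ...] for each hop, stopping at the first non-redirect."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
    return hops
```

Run `redirect_chain` on a URL that should not exist. A chain that ends in 404 is what you want. A chain that goes through a 301 or 302 and ends in 200 on your home page is the pattern warned about above. The `fetch` argument is only there so the chain logic can be tried with a stub instead of a live server.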