Forum Moderators: open

Message Too Old, No Replies

MySQL - parsing an index of email addresses

.... looking for duplicate domains

         

trillianjedi

3:25 pm on Nov 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not sure if this is possible. Under a site member table, I have an email address column. I'm looking for dupicates, but by domain and not full address (so the part after the "@").

Some of these will be obvious (hotmail, gmail, yahoo etc addresses), but I'm looking for duplicates.

Is there any way that I can build an SQL query that will pull out a list of email addresses where there is more than one on the same domain?

Thanks,

TJ

Nutter

3:46 pm on Nov 29, 2005 (gmt 0)

10+ Year Member



This worked on a quick test in Access with a table named tbl and field named email.

SELECT DISTINCT(RIGHT(LCASE(email), LEN(email) - INSTR(email, '@'))) FROM tbl;

The email addresses were example1@gmail.com, example2@gmail.com, caseexample@GmAiL.CoM, bob@example.com, yahootesting123@yahoo.com, & testagainwithabadaddressbutcapitalthistime@YAHOO.com.

Returned values
example.com
gmail.com
yahoo.com

Edit - after re-reading your question, I don't think I really answered it. I'll have to mess with it some more...

trillianjedi

3:28 pm on Dec 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Nutter, thanks!

I don't think I really answered it

I think perhaps you did actually?

Your query left out example.com as it was unique?

That's exactly what I need. I'll have a play with your query.

TJ

trillianjedi

3:32 pm on Dec 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm I'm getting an error:-


MySQL said: Documentation
#1064 - You have an error in your SQL syntax near '( `email_address` ) - INSTR( `email_address` , '@' ) ) ) FROM subscriber LIMIT ' at line 1

Here's my actual query:-


SELECT DISTINCT (
RIGHT( LCASE( `email_address` ) , LEN(
`email_address`
) - INSTR( `email_address` , '@' ) )
)
FROM subscriber
LIMIT 0 , 30

Nutter

3:55 pm on Dec 2, 2005 (gmt 0)

10+ Year Member



Oops, typo. It should have been LENGTH( rather than LEN( . It's that Visual Basic code creeping back.

SELECT DISTINCT (
RIGHT( LCASE( email ) , LENGTH( email ) - INSTR( email, '@' ) )
)
FROM tbl
LIMIT 0 , 30

And it does return example.com. I've tried to get it to only return duplicates, but can't figure out how.

BarryStCyr

4:14 pm on Dec 2, 2005 (gmt 0)

10+ Year Member



Try this:

SELECT DISTINCT (Right(LCase([emailaddress]),Len([emailaddress])-InStr([emailaddress],'@'))) AS Expr1, Count(([emailaddress])) AS Expr2
FROM email
GROUP BY (Right(LCase([emailaddress]),Len([emailaddress])-InStr([emailaddress],'@')))
HAVING (((Count(([emailaddress])))>1));
Hope it helps

trillianjedi

4:22 pm on Dec 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Barry, but I'm getting an error with that too:-

#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[`email_address`]),Len([`email_address`])-InStr([`email_address`

Exact Query:-


SELECT DISTINCT (
Right( LCase( [ `email_address` ] ) , Len(
[ `email_address` ]
) - InStr( [ `email_address` ], '@' ) )
) AS Expr1, Count( (
[ `email_address` ]
) ) AS Expr2
FROM `subscriber`
GROUP BY (
Right( LCase( [ `email_address` ] ) , Len(
[ `email_address` ]
) - InStr( [ `email_address` ], '@' ) )
)
HAVING (
(
(
Count( (
[ `email_address` ]
) )
) >1
)
)
LIMIT 0 , 30

BarryStCyr

4:28 pm on Dec 2, 2005 (gmt 0)

10+ Year Member



Check all the function names for your implementation of SQL. For example, some use LEN and some use LENGTH.

trillianjedi

4:30 pm on Dec 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My bad - sorry, I should have spotted that ;-)

Thanks, testing now....