jeudi 30 janvier 2020

Receiving no output when NOT clause is used in SQL to detect duplicate records under same values

Im new to SQL and trying to understand how i can find if similar fields exist across different values in a column:

My code:

WITH LOC_CTE AS
(SELECT DISTINCT fa.AccountKey, dce.EmailAddress , dcp.Phone , dpo.BrandName, dpo.BrandId
FROM Fact.Account fa
LEFT JOIN Dim.PetOwner dpo ON dpo.PetOwnerKey = fa.PetOwnerKey
LEFT JOIN Dim.RatingAddress dra ON dra.RatingAddressKey = fa.RatingAddressKey
LEFT JOIN Dim.CustomerEmail dce ON dce.CustomerEmailKey = fa.CurrentCustomerEmailKey
LEFT JOIN Dim.CustomerPrimaryPhone dcp ON dcp.CustomerPrimaryPhoneKey = fa.CurrentCustomerPrimaryPhoneKey
)


SELECT * FROM (
SELECT  LOC_CTE.*
   ,row_number() over (partition by EmailAddress,Phone,BrandId order by BrandName) as seqnum
      from LOC_CTE
           ) Q
where seqnum >= 1;

The output gives me:

AccountKeyEmailAddress  Phone     BrandName BrandId seqnum
389082  0160150@m**.edu   2098579**    AP       4   1
430223  01co***st**e@msn.com8655677**9  AP      4   1
430224  01cor**s**e@msn.com 8655677**9  AP      4   2
531443  01mar**01@gmail.com 73278**63   AP      4   1
544454  01mu**rmyra@gmail.com 8133**793 AP      4   1
548374  03b**y13@gmail.com  30130**93   AP      4   1
216594  03d**ado@gmail.com  50363**34   AP      4   1
377919  03d**ado@gmail.com  50363**734  AP      4   2
486237  03d**ado@gmail.com  50363**734  HP      3   3
532010  041**hen@gmail.com  85749**455  AP      4   1
365925  05bla**ro*e21@gmail.com 682365**51  AP  4   1
365926  05bla**ro*e21@gmail.com 682365**51  HP  3   2

So as can be observed in my output, I am able to detect duplicate records and sequence of its duplicity. Bu what I want is that my final report also detects if the same person(email and phone number) is repeated under a different brand. For example if you notice in the last two rows of the ouput it is the same individual but under 2 different brands and brand ID. But I just want to isolate these instances and generate an output which only shows the original duplicate along with the duplicate that comes up under a different brand/brandID. How do I do this?

I tried to edit the last line of my clause and integrated a NOT clause but I did not receive any output and neither any error. My syntax may be wrong but hope this helps with the logic

where seqnum >= 1 AND NOT BrandID = BrandID ;

Aucun commentaire:

Enregistrer un commentaire