I have a very long data.table consisting of claims and clients data:
head(claimsData)
claimID subCoverageKey clientID amount clDate subCoverageKeyClaims product gender incurred censored
1: IP_00297141-5f49-4979-8f20-4d929bf42504 IP_Accommodation client-98655-2 2281 2018-04-09 ffdef3f3-2996-4d1a-bf51-a78b43029079 Product 1 M 2281 TRUE
2: IP_00297141-5f49-4979-8f20-4d929bf42504 IP_Upgrade client-98655-2 2281 2018-04-09 4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb Product 1 M 2281 TRUE
3: IP_00297141-5f49-4979-8f20-4d929bf42504 IP_Upgrade client-98655-2 2281 2018-04-09 4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb Product 1 M 2281 TRUE
4: IP_0032b73e-6371-40d6-a971-b8eed7943b94 Daily_cash client-47058-1 4805 2017-11-09 0a9ee55e-31b1-46f8-a0d4-91154e6c0998 Product 1 M 12011725 TRUE
5: IP_0032b73e-6371-40d6-a971-b8eed7943b94 IP_Accommodation client-47058-1 4805 2017-11-09 ffdef3f3-2996-4d1a-bf51-a78b43029079 Product 1 M 4805 TRUE
6: IP_0032b73e-6371-40d6-a971-b8eed7943b94 IP_Upgrade client-47058-1 4805 2017-11-09 4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb Product 1 M 4805 TRUE
Now i want to aggregate the data by clientID and i want to flag those clients whose overall claims have been more than the predefine tresholdHCC:
results <- claimsData[, .("NOclaims" = length(amount), "claimsAmount" = sum(amount),"claimsIncurred" = sum(incurred)), by = clientID
][, HCC := ifelse(claimsIncurred > tresholdHCC,1,0)
][ , claimsCappedHCC := as.numeric(claimsIncurred)][ HCC == 1, claimsCappedHCC := tresholdHCC]
I get an warning message:
Warning message: In
[.data.table(claimsData[, .(NOclaims = length(amount), claimsAmount = sum(amount), : Coercing 'character' RHS to 'double' to match the type of the target column (column 6 named 'claimsCappedHCC').
and the results are not as expected, since client-47058-1 should have been flagged as HCC. What is going on?
results
clientID NOclaims claimsAmount claimsIncurred HCC claimsCappedHCC
1: client-98655-2 9 9953 2600716 0 2600716
2: client-47058-1 30 41783 12291356 0 12291356
3: client-76465-1 33 74122 49882641 1 40000
4: client-101595-2 24 69960 20643477 0 20643477
5: client-79164-3 3 2492 2078186 0 2078186
DATA:
structure(list(claimID = c("IP_00297141-5f49-4979-8f20-4d929bf42504",
"IP_00297141-5f49-4979-8f20-4d929bf42504", "IP_00297141-5f49-4979-8f20-4d929bf42504",
"IP_0032b73e-6371-40d6-a971-b8eed7943b94", "IP_0032b73e-6371-40d6-a971-b8eed7943b94",
"IP_0032b73e-6371-40d6-a971-b8eed7943b94"), subCoverageKey = c("IP_Accommodation",
"IP_Upgrade", "IP_Upgrade", "Daily_cash", "IP_Accommodation",
"IP_Upgrade"), clientID = c("client-98655-2", "client-98655-2",
"client-98655-2", "client-47058-1", "client-47058-1", "client-47058-1"
), amount = c(2281, 2281, 2281, 4804.69, 4804.69, 4804.69), clDate = structure(c(17630L,
17630L, 17630L, 17479L, 17479L, 17479L), class = c("IDate", "Date"
)), subCoverageKeyClaims = c("ffdef3f3-2996-4d1a-bf51-a78b43029079",
"4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb", "4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb",
"0a9ee55e-31b1-46f8-a0d4-91154e6c0998", "ffdef3f3-2996-4d1a-bf51-a78b43029079",
"4390be79-dc1e-4f7a-a0c0-f548c0b9ffcb"), product = c("Product 1",
"Product 1", "Product 1", "Product 1", "Product 1", "Product 1"
), gender = c("M", "M", "M", "M", "M", "M"), incurred = c(2281,
2281, 2281, 12011725, 4804.69, 4804.69), censored = c(TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE)), row.names = c(NA, -6L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x7ffbf78114e0>)
Aucun commentaire:
Enregistrer un commentaire