Thursday, July 6, 2017

Trouble using an if/else statement in R to append to lists for outlier detection

When I append to the lists, d and e end up with the same contents even though they are filled under different conditions. Also, each append adds the entire data set instead of just the specific point.
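
A minimal sketch of the two symptoms, assuming both come from the approx() call itself: its arguments never mention the loop variable, and xout is the whole column, so every branch appends the same two full-length vectors. The vectors px and py and the lists d_demo and e_demo are made up for illustration and are not part of the original code.

```r
# Hypothetical toy data standing in for cd1 and cd2
px <- c(1, 2, 3, 4)
py <- c(10, 20, 30, 40)

# approx() returns a list with components x and y, each as long as xout
res <- approx(x = px, y = py, xout = px)
str(res)

# The call does not depend on the loop variable v, so both branches
# append exactly the same thing whenever they run
d_demo <- c()
e_demo <- c()
for (v in c(-1, 5)) {
  if (v > 0) {
    d_demo <- c(d_demo, approx(x = px, y = py, xout = px))
  } else {
    e_demo <- c(e_demo, approx(x = px, y = py, xout = px))
  }
}

identical(d_demo, e_demo)  # TRUE: same contents despite different conditions
length(d_demo)             # 2: one full $x vector and one full $y vector
```

With 100 rows, each iteration therefore adds a complete $x and $y pair of length 100 to whichever list the branch selects.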

The purpose of the code is to decide whether a point is an outlier based on its distance from the smoothed lowess value (dis in the code), and then to append to the list the y value (cd2) that corresponds to that distance.

Code:
cd1 = runif(100,1,100)
cd2 = runif(100,1,100)
smth_ln = lowess(cd1,cd2)
dis = smth_ln$y - cd2
data_frame = data.frame(cd1,cd2,dis)
low_lim = 1.5*(quantile(dis,.25))
up_lim = 1.5*(quantile(dis,.75))
c = c()
y = c()
# Split the residuals into positive (c) and non-positive (y) values
for (x in dis[1:100]) {
  if (x > 0) { c = c(c, x) } else { y = c(y, x) }
}
up_sig1 = quantile(c,.75) + (1.5*(quantile(c,.75) - quantile(c,.25)))
low_sig1 = quantile(c,.25) - (1.5*(quantile(c,.75) - quantile(c,.25)))
up_sig2 = quantile(y,.75) - (1.5*(quantile(y,.75) - quantile(y,.25)))
low_sig2 = quantile(y,.25) + (1.5*(quantile(y,.75) - quantile(y,.25)))
d = c()
e = c()
# Intended: d gets the values within the limits, e gets the rest
if (low_sig2 > low_sig1) {
  for (x in data_frame$dis[1:100]) {
    if (x <= up_sig1 && x >= low_sig2) {
      d = c(d, approx(x = data_frame$cd1, y = data_frame$cd2, xout = data_frame$cd1))
    } else {
      e = c(e, approx(x = data_frame$cd1, y = data_frame$cd2, xout = data_frame$cd1))
    }
  }
} else {
  for (x in data_frame$dis[1:100]) {
    if (x <= up_sig1 && x >= low_sig1) {
      d = c(d, approx(x = data_frame$cd1, y = data_frame$cd2, xout = data_frame$cd1))
    } else if (x <= up_sig2 && x >= low_sig2) {
      d = c(d, approx(x = data_frame$cd1, y = data_frame$cd2, xout = data_frame$cd1))
    } else {
      e = c(e, approx(x = data_frame$cd1, y = data_frame$cd2, xout = data_frame$cd1))
    }
  }
}

The result is:

$x
  [1] 11.447428 44.291567 92.809770 36.063389 90.596517 12.722653
  [7] 25.756582 69.232233 13.776838 62.360205  5.477240 32.229693
  [13] 59.814434 46.254613 81.739447 35.020277 90.421646 45.154622
  [19] 83.973407 60.404087 92.881599 86.821291 61.502499  8.282610
  [25] 55.860109  4.636936 43.634686 81.330728 43.396142 66.789648
  [31] 53.322288  6.365995 24.340246 92.574642 25.689096 49.681125
  [37] 50.054399  7.831072 52.773898 66.842364 76.399255 49.782069
  [43] 92.129942 10.874742  7.969397 33.511227 81.130801  2.609641
  [49] 67.775461 24.730233 67.182756 24.675746 73.598423 48.631472
  [55] 50.904886 58.838220 23.738821 81.478225 48.911970 14.794760
  [61] 15.043253 15.394312 84.929685 54.762099 95.675635 47.696485
  [67] 27.312439 39.857976 86.275431 63.573144 19.129964 42.821685
  [73] 47.054810 50.198058 68.435317 21.340840 56.184376 56.185090
  [79] 35.897770 23.723093 21.142765 50.976963 75.527709 98.059314
  [85] 51.778683 47.452870 66.543085 99.921337 38.275734 82.211106
  [91]  4.178087 99.423369 99.851362 45.074104 87.583087 94.087865
  [97] 40.419044 46.156591 43.573101 34.250272

 $y
  [1] 86.757834 64.795089 63.206636 12.064656 93.110171 28.473002
  [7] 73.584784 54.240773 69.469404 79.814226 74.217999 19.589156
  [13] 99.575540 55.535001 71.650742 49.581719  2.161870 31.103456
  [19] 49.212733 75.218275 89.114477 99.061958 86.896058 29.882293
  [25] 65.026437 34.973215 38.109474 33.469248 90.604875 36.196422
  [31] 13.699698 38.543386 64.492879 86.664067  5.022857 99.444511
  [37] 65.626650 18.911719  1.858937 61.926040 79.125846 88.960847
  [43] 81.827462 53.719003 32.714238 41.364405 93.739879 70.428402
  [49]  4.212821  6.227203 22.559080 11.222974 35.417615 83.539601
  [55] 88.293480 96.963779 88.229092 70.728419 66.119922 14.747876
  [61] 49.630652 68.671306 68.638475 21.645112 47.031634 56.109030
  [67] 14.061691 73.549199 24.073812 36.245393 52.683881 37.711940
  [73] 59.509533 74.041737 92.785747 20.732096 67.699491 99.789727
  [79] 63.344462 93.016976 53.180088 13.865976 73.338913 64.992383
  [85] 81.954478 13.842078  2.241327 35.427582 49.562590 61.433470
  [91] 16.502319 32.160169 77.223827 85.748099 22.660180 46.331218
  [97] 43.919372 87.545919 26.479644 27.889094

This $x and $y pair is repeated one hundred times.
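
One possible way to collect only the matching cd2 values is to drop the loop and use logical indexing. The following is a sketch under stated assumptions, not a definitive fix. It assumes that each residual should stay aligned with its own row (lowess() returns its output sorted by x, so the rows are sorted first), and that a single pair of Tukey fences on dis (quartiles plus or minus 1.5 times the interquartile range) is an acceptable rule, which simplifies the separate positive and negative limits in the question. The names ord, fence_low, fence_high and is_inlier, and the fixed seed, are introduced here for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
cd1 <- runif(100, 1, 100)
cd2 <- runif(100, 1, 100)

# lowess() returns points sorted by x, so sort the data the same way
# to keep each residual next to the row it belongs to
ord <- order(cd1)
data_frame <- data.frame(cd1 = cd1[ord], cd2 = cd2[ord])
smth_ln <- lowess(data_frame$cd1, data_frame$cd2)
data_frame$dis <- smth_ln$y - data_frame$cd2

# Tukey fences on the residuals (an assumed rule, see the text above)
q <- quantile(data_frame$dis, c(0.25, 0.75))
fence_low <- q[[1]] - 1.5 * IQR(data_frame$dis)
fence_high <- q[[2]] + 1.5 * IQR(data_frame$dis)

# Logical indexing picks the cd2 value of each matching row,
# rather than appending the whole column on every iteration
is_inlier <- data_frame$dis >= fence_low & data_frame$dis <= fence_high
d <- data_frame$cd2[is_inlier]   # values within the fences
e <- data_frame$cd2[!is_inlier]  # values flagged as outliers
```

Here d and e are plain numeric vectors whose lengths add up to the number of rows, so each point lands in exactly one of them.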
