I have several decades of twice-daily data with the following structure:
str(Raw.Data)
'data.frame': 709400 obs. of 7 variables:
$ V1: int 254 1 2 3 9 4 4 4 4 4 ...
$ V2: Factor w/ 448 levels "0","100","1000",..: 1 40 11 448 286 4 24 23 20 17 ...
$ V3: Factor w/ 18039 levels "","-1","-10",..: 99 15749 6714 18039 13326 4244 4221 12375 14708 16000 ...
$ V4: Factor w/ 3509 levels "","-1","-10",..: 3503 3034 3496 1 2176 3496 1219 2878 33 149 ...
$ V5: Factor w/ 1295 levels "","-1","-10",..: 1092 1273 1019 1 992 1295 1254 40 187 192 ...
$ V6: int NA 353 99999 NA 230 99999 163 202 238 262 ...
$ V7: int NA 99999 0 NA 40 99999 50 40 70 60 ...
In a spreadsheet-like format, the first day of data looks like this:
254 0 1 JUN 1957 NA NA
1 94823 72520 40.50N 80.22W 353 99999
2 2000 2000 99999 13 99999 0
3 PIT ms NA NA
9 9780 353 234 105 230 40
4 10000 157 99999 99999 99999 99999
4 8500 1566 143 64 163 50
4 7000 3168 34 -133 202 40
4 5000 5815 -127 -266 238 70
4 4000 7483 -231 -270 262 60
4 3000 9517 -414 99999 258 150
4 2500 10726 -530 99999 260 170
4 2000 12128 -638 99999 271 230
254 12 1 JUN 1957 NA NA
1 94823 72520 40.50N 80.22W 353 99999
2 1000 1500 1690 15 7 0
3 PIT ms NA NA
9 9770 353 168 113 135 40
4 10000 153 99999 99999 99999 99999
4 8500 1537 119 89 216 80
4 7000 3133 16 4 221 70
4 5000 5779 -132 -182 249 90
4 4000 7444 -240 -314 262 90
4 3000 9469 -414 99999 272 120
4 2500 10682 -511 99999 289 130
4 2000 12097 -608 99999 291 150
4 1500 13868 -630 99999 291 160
4 1000 16400 -611 99999 298 110
I want to reorganize the data so that the first day of data is reduced to this:
0 1 JUN 1957 9780 353 234 105 230 40
12 1 JUN 1957 9770 353 168 113 135 40
To do this I need cells 2:5 for rows that begin with "254" and cells 2:7 for rows that begin with "9".
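The selection described above can also be written without a loop. The following is only a minimal sketch on a toy stand-in for Raw.Data: the object `raw`, the output column names (`hour`, `day`, `month`, `year`, `sfc1` to `sfc6`) and the one-to-one pairing of "254" and "9" rows are assumptions for illustration, not something the real file is known to guarantee.

```r
# Toy stand-in for Raw.Data (hypothetical: two soundings, a few rows each).
raw <- data.frame(
  V1 = c(254L, 1L, 9L, 4L, 254L, 1L, 9L, 4L),
  V2 = c("0", "94823", "9780", "10000", "12", "94823", "9770", "10000"),
  V3 = c("1", "72520", "353", "157", "1", "72520", "353", "153"),
  V4 = c("JUN", "40.50N", "234", "99999", "JUN", "40.50N", "168", "99999"),
  V5 = c("1957", "80.22W", "105", "99999", "1957", "80.22W", "113", "99999"),
  V6 = c(NA, 353L, 230L, 99999L, NA, 353L, 135L, 99999L),
  V7 = c(NA, 99999L, 40L, 99999L, NA, 99999L, 40L, 99999L),
  stringsAsFactors = TRUE   # mimic the factor columns shown by str()
)

# Convert factor columns to character so values, not level codes, are copied.
fac <- sapply(raw, is.factor)
raw[fac] <- lapply(raw[fac], as.character)

hdr <- raw[raw$V1 == 254, 2:5]   # cells 2:5 of rows that begin with 254
sfc <- raw[raw$V1 == 9, 2:7]     # cells 2:7 of rows that begin with 9

# Assumption: every 254 row is followed by exactly one 9 row, in order.
stopifnot(nrow(hdr) == nrow(sfc))

out <- cbind(hdr, sfc)
names(out) <- c("hour", "day", "month", "year", paste0("sfc", 1:6))
rownames(out) <- NULL
print(out)
```

If some soundings lack a "9" row, the row counts differ and the check stops the script, so the pairing would need to be done per sounding instead.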
I developed the following code, but it doesn't even make it through the first if statement in the first iteration of the for loop. Maybe this is a problem with data type or indexing?
leng <- dim(Raw.Data)[1]
Processed.Data <- as.data.frame(matrix(0, ncol = 10, nrow = 42000))
i <- 1:leng
count <- 1
for (i in 1:leng) {
  if (Raw.Data[i, 1] == 254) {
    # header row: copy cells 2:5
    Processed.Data[count, 1:4] <- Raw.Data[i, 2:5]
  } else if (Raw.Data[i, 1] == 9) {
    # surface row: copy cells 2:7
    Processed.Data[count, 5:10] <- Raw.Data[i, 2:7]
  }
  count <- count + 1
}
When the code is run I get the following warning messages:
1: In if (Raw.Data[i, 1] == 254) { :
the condition has length > 1 and only the first element will be used
2: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 1 has 709400 rows to replace 1 rows
3: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 2 has 709400 rows to replace 1 rows
4: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 3 has 709400 rows to replace 1 rows
5: In `[<-.data.frame`(`*tmp*`, count, 1:4, value = list(V2 = c(1L, :
replacement element 4 has 709400 rows to replace 1 rows
6: In `[<-.factor`(`*tmp*`, iseq, value = 99L) :
invalid factor level, NA generated
7: In `[<-.factor`(`*tmp*`, iseq, value = 3503L) :
invalid factor level, NA generated
8: In `[<-.factor`(`*tmp*`, iseq, value = 1092L) :
invalid factor level, NA generated
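One possible reading of these warnings, offered as an assumption rather than a confirmed diagnosis: if the `if` line is run on its own after `i <- 1:leng`, then `i` is still a vector and the comparison returns one value per row. Separately, factor columns store level codes, and assigning a value that is not an existing level yields NA. The toy vectors below (`v1`, `f`, `g`) are hypothetical and only illustrate those two behaviours.

```r
v1 <- c(254L, 1L, 9L)      # toy stand-in for Raw.Data$V1

i <- 1:3                   # a vector index, as left behind by `i <- 1:leng`
n_vec <- length(v1[i] == 254)      # one comparison result per element

i <- 1L                    # a single index, as inside the for loop
n_scalar <- length(v1[i] == 254)   # exactly one comparison result

# A factor stores integer level codes; the values live in its levels.
f <- factor(c("353", "99999", "230"))
codes <- as.integer(f)                   # level codes, not the values
values <- as.integer(as.character(f))    # the values themselves

# Assigning a value that is not an existing level gives NA (with a warning).
g <- factor(c("a", "b"))
g[1] <- suppressWarnings("z")
suppressWarnings(g[2] <- "z")
```

Converting the factor columns with `as.character()` before copying avoids both the level-code surprise and the invalid-level NA.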
Any help resolving just one of my many problems will be greatly appreciated!
P.S. If you have some ideas of how to insert blank rows for missing dates that might save me an extra question later.
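For the missing-date idea, one common approach is to build the full grid of expected date/hour slots and left-join the observations onto it. This is a rough sketch only: the table `obs`, its `date`, `hour` and `value` columns, and the third observation are made up for illustration, and it assumes the day, month and year cells have already been combined into a `Date`.

```r
# Hypothetical processed table with one sounding missing (2 June, hour 12).
obs <- data.frame(
  date  = as.Date(c("1957-06-01", "1957-06-01", "1957-06-02")),
  hour  = c(0L, 12L, 0L),
  value = c(9780L, 9770L, 9765L)   # third value is invented for the example
)

# Every date/hour combination that should exist in twice-daily data.
full <- expand.grid(
  hour = c(0L, 12L),
  date = seq(min(obs$date), max(obs$date), by = "day")
)

# Left join: expected slots with no observation are kept and filled with NA.
filled <- merge(full, obs, by = c("date", "hour"), all.x = TRUE)
filled <- filled[order(filled$date, filled$hour), ]
print(filled)
```

The rows with NA in `value` are the blank rows for the missing soundings.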
Thank you!
Evan