Sunday, October 9, 2016

In R, is there a better way than a for loop over a dataframe to create a 3D binary vector?

I have a dataframe with a character column, and I need to convert its values to a 3D binary vector based on the value.

 DF = data.frame(Names = c("A1","A2","A3","A4"), TestScore = c("100 pts","NA","50 pt","75 pt."))

I need to create a function that splits TestScore down to just the numeric value (100, NA, 50, 75) and then creates a 3D binary vector of three categories, where the 2nd and 3rd categories are average vs. best student. Any 0 or NA would be considered the first category, NoScore. I need this 3D vector to analyze the relationship between NoScore and another variable.
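One loop-free way to get from the raw strings to three categories is sketched below, using only base R. The cutoff of 90 between "Average" and "Best" and the helper names `parse_score` and `score_category` are assumptions made for illustration; the question does not state a threshold.

```r
# Example data from the question
DF <- data.frame(Names = c("A1", "A2", "A3", "A4"),
                 TestScore = c("100 pts", "NA", "50 pt", "75 pt."),
                 stringsAsFactors = FALSE)

# Keep the leading run of digits; strings without one (such as "NA") become NA
parse_score <- function(x) {
  x <- as.character(x)
  digits <- sub("^\\s*([0-9]+).*$", "\\1", x)
  digits[!grepl("^\\s*[0-9]", x)] <- NA_character_
  as.integer(digits)
}

# Map scores to three categories; best_cutoff = 90 is a hypothetical threshold
score_category <- function(score, best_cutoff = 90) {
  label <- ifelse(is.na(score) | score == 0, "NoScore",
           ifelse(score < best_cutoff, "Average", "Best"))
  factor(label, levels = c("NoScore", "Average", "Best"))
}

DF$Score <- parse_score(DF$TestScore)
DF$ScoreCat <- score_category(DF$Score)
```

Both helpers work on whole columns at once, so no explicit loop over rows is needed.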

The only way I can think of is to run a for loop over the rows of the dataframe, split the TestScore, and then use if/else. But that does not create a 3-dimensional vector: it creates two variables (Ind and Cat) that hold values for the score, whereas I think I should use my [i,j,k] array to store values of 0 or 1.
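For three mutually exclusive categories, a three-dimensional `[i,j,k]` array is probably more than is needed: an n-by-3 matrix of 0/1 indicators, one row per student, carries the same information. A minimal sketch, assuming the categories are already available as a factor (the `score_cat` values and the `to_indicator` helper are hypothetical):

```r
# Hypothetical categories, one per student
score_cat <- factor(c("Best", "NoScore", "Average", "Average"),
                    levels = c("NoScore", "Average", "Best"))

# Build an n x 3 integer matrix with exactly one 1 per row
to_indicator <- function(f) {
  m <- outer(as.integer(f), seq_along(levels(f)), "==") * 1L
  colnames(m) <- levels(f)
  m
}

ind <- to_indicator(score_cat)

# The NoScore column is the 0/1 variable to relate to another variable
no_score <- ind[, "NoScore"]
```

The `no_score` vector can then be passed to functions such as `table()` or used as a predictor in a model.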

Here is what I have so far for processing the dataframe:

 my_awards_array <- array(0, dim=c(movie_nrow,3,3))

convert_Awards <- function(df){
    library(stringr)
    # Split at the first space: the count goes to X1, the remaining text to X2
    awards_split_df = data.frame(str_split_fixed(df$Awards," ",2), stringsAsFactors=FALSE)
    str(awards_split_df)

    ############################################################################
    ## First column contains the number to convert as the value for binary vector
    ############################################################################

    ### Second column would be eliminated ("win" or "wins." "win.")
    awards_split_df$X2 <- NA
    # Entries that are not whole numbers become NA here
    awards_conv_df <- data.frame(lapply(awards_split_df,as.integer))
    colnames(awards_conv_df) = c("AwardsNum","AwardsType")
    awards_final_df <- awards_conv_df

    # Create the result columns up front so the loop can fill them row by row
    awards_final_df$AwardsCat <- NA_character_
    awards_final_df$AwardsInd <- NA_real_

    for (i in seq_len(nrow(awards_final_df))) {
        # Test one row at a time; is.na() on the whole column is not a valid if() condition
        num <- awards_final_df$AwardsNum[i]
        if (is.na(num) || num == 0) {
            awards_final_df$AwardsCat[i] = "NoAwards"
            awards_final_df$AwardsInd[i] = 0
        } else if (num > 0 && num <= 5) {
            awards_final_df$AwardsInd[i] = 1
            awards_final_df$AwardsCat[i] = "SomeAwards"
        } else {
            awards_final_df$AwardsInd[i] = 2
            awards_final_df$AwardsCat[i] = "ManyAwards"
        }
    }

    awards_final_df
}
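The row-by-row loop can also be replaced by a single vectorized call. A sketch with `cut()`, keeping the same boundaries as the loop (0 or missing, 1 to 5, more than 5); the sample `awards_num` values are made up for illustration:

```r
awards_num <- c(NA, 0L, 3L, 12L)   # made-up award counts

# (-Inf, 0] -> NoAwards, (0, 5] -> SomeAwards, (5, Inf] -> ManyAwards
awards_cat <- cut(awards_num, breaks = c(-Inf, 0, 5, Inf),
                  labels = c("NoAwards", "SomeAwards", "ManyAwards"))

# cut() leaves NA as NA, so missing counts are assigned explicitly
awards_cat[is.na(awards_num)] <- "NoAwards"

# Integer codes 0, 1, 2, matching AwardsInd in the loop
awards_ind <- as.integer(awards_cat) - 1L
```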
