Friday, 1 December 2017

multiple actions in if statement - R

I'm trying to perform multiple actions in an if statement. For example:

x <- 1

if (x == 1) {
  paste("First")
  1*1                    #multiple actions
} else if (x == 2) {
  paste("Second")
  2*2
} else {
  ("Nothing")
}

 [1] 1        #what I'm getting

 [2] "First"
      1       #what I want to get

In this case only the second expression in the branch was printed to the console. Any ideas how I can run and show all the actions between if and else if?
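A minimal sketch of one way to get both lines, assuming the goal is simply to show each value on the console. At top level R auto-prints only the value of the whole if expression, which is the value of the last expression evaluated in the chosen branch, so the earlier paste() result is computed and then discarded. Calling print() explicitly avoids that. The wrapper function `describe` is a hypothetical name used only for illustration.

```r
# Hypothetical wrapper, used here only so the branches can be called repeatedly
describe <- function(x) {
  if (x == 1) {
    print("First")   # explicit print: shown even though it is not the last expression
    print(1 * 1)
  } else if (x == 2) {
    print("Second")
    print(2 * 2)
  } else {
    print("Nothing")
  }
  invisible(NULL)    # return nothing visible so only the print() calls show output
}

describe(1)
# [1] "First"
# [1] 1
```

If the values are needed later rather than just displayed, returning them together, for example as list("First", 1), may be a better fit than printing.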

Using the output of if statement (cmp) in a cp command

I am trying to use the output of an if statement as input for a cp command, but it does not seem to work. My code is:

if cmp "$2/$filename" "$fname"; then
    cp "$fname" "$2/$filename.JPG"
fi

I think the cp statement is incorrect, as I expect cmp to report only the difference between the files and the cp to run if there is a difference. I have already copied the files with a for loop; now I am trying to rename and copy the files that differ but have the same names.

Is there a way to rewrite this code without if-else?

I have this code:

float s = 0.0;
if (b > 240.0) 
{
    s = 780.0 + (b - 240.0) * 100.0;
}
else if (b > 90.0) 
{
    s = 80.0 + (b - 90.0) * 5.0;
}
else 
{
    s = (b < 10.0) ? (b / 10.0) : (b - 10.0);
}

Is there a way to rewrite this without if-else statements, even at the cost of computing something that is not used?

R matching 2 data frames and creating a new variable based on both data frames

I have a requirement where I need to look at values in one data frame, check them against another data frame, and create a new categorical column. This seems to be something as simple as a vlookup, but I cannot work out how to do it in R.

My data is big, with thousands of rows and many columns. Below I have created sample data in a similar format.

#####Generating sample data

library("plyr")
library("data.table")
library("rpart")

set.seed(1200)

id <- 1:1000
ibd <- sample(1:15,1000,replace = T)
bills <- sample(1:20,1000,replace = T)
nos <- sample(1:80,1000,replace = T)
stru <- sample(c("A","B","C","D"),1000,replace = T)
v1 <- sample(1:80,1000,replace = T)
v2 <- sample(1:80,1000,replace = T)
v3 <- sample(1:80,1000,replace = T)
v4 <- sample(1:80,1000,replace = T)
v5 <- sample(1:80,1000,replace = T)
v6 <- sample(1:80,1000,replace = T)
v7 <- sample(1:80,1000,replace = T)
v8 <- sample(1:80,1000,replace = T)
v9 <- sample(1:80,1000,replace = T)
v10 <- sample(1:80,1000,replace = T)
a1 <- sample(1:80,1000,replace = T)
b1 <- sample(1:80,1000,replace = T)
type <- sample(1:15,1000,replace = T)
value <- sample(100:1000,1000,replace = T)

df1 <- data.frame(id,ibd,bills,nos,stru,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,a1,b1,type,value)

num_var <- c("bills","nos","v1","v2","v3")

v0 <- num_var

ibda <- sort(rep(1:15,4),decreasing = F)
billsa <- sample(5:15,60,replace = T)
nosa <- sample(15:60,60,replace = T)
v1a <- sample(10:70,60,replace = T)
v2a <- sample(20:70,60,replace = T)
v3a <- sample(20:70,60,replace = T)

df2 <- data.frame(ibda,billsa,nosa,v1a,v2a,v3a)

bills_ibd1 <- sort(df2[ibda == 1,"billsa"])

As you can see, bills_ibd1 contains 5, 10, 13, 15. I want to check the bills values in df1 where ibd == 1 against these values and create a categorical variable "bills_cat" in df1, coded as below:

if (ibd == 1 & bills <= 5) bills_cat = 1
if (ibd == 1 & bills > 5 & bills <= 10)  bills_cat = 2
if (ibd == 1 & bills > 10 & bills <= 13)  bills_cat = 3
if (ibd == 1 & bills > 13 & bills <= 15)  bills_cat = 4
if (ibd == 1 & bills > 15)  bills_cat = 5

Note: bills_ibd1 is generated from df2, and I would have such a vector for each ibd value and each column variable.

Written this way, I would need many if statements, and I noticed that bills_cat gets overwritten by each later statement.

Is there a simpler, better way of achieving this? I need to categorise each variable in df1 at the ibd level based on the values from df2. Please suggest.
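One possible approach, sketched on a small stand-in data set rather than the data above: keep one sorted vector of cut points per ibd group and let findInterval() assign the category, instead of writing one if statement per interval. The values below and the name `breaks_by_ibd` are hypothetical and chosen only to make the result easy to check by hand. With left.open = TRUE the intervals are closed on the right, which matches the "> lower & <= upper" rule in the question.

```r
# Small stand-in data (hypothetical values, not the data generated above)
df1 <- data.frame(ibd   = c(1, 1, 1, 1, 1, 2, 2),
                  bills = c(3, 5, 8, 14, 20, 4, 9))
df2 <- data.frame(ibda   = c(1, 1, 1, 1, 2, 2, 2, 2),
                  billsa = c(10, 5, 15, 13, 6, 2, 12, 8))

# One sorted vector of cut points per ibd group, e.g. 5, 10, 13, 15 for ibd 1
breaks_by_ibd <- lapply(split(df2$billsa, df2$ibda), sort)

df1$bills_cat <- NA_integer_
for (g in names(breaks_by_ibd)) {
  rows <- df1$ibd == as.numeric(g)
  # findInterval() counts how many cut points lie strictly below each value
  # (because left.open = TRUE), so adding 1 gives category codes 1..5
  df1$bills_cat[rows] <- findInterval(df1$bills[rows],
                                      breaks_by_ibd[[g]],
                                      left.open = TRUE) + 1L
}

df1$bills_cat
# [1] 1 1 2 4 5 2 4
```

The same loop could be repeated over the other variables in num_var. Note that if a group contains duplicated cut points, some category codes will simply never occur for that group.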

Finding out if the 4th character in a string is a number or a letter in R

This follows on from the question linked below:

How to test if the first three characters in a string are letters or numbers in r?

How do I extend it to also check that the 4th character is numeric? For instance, an example of my data frame is as follows.

ID   X
1   MJF34
2   GA249D
3   DEW235R
4   4SDFR3
5   DAS3
6   BHFS7

So again, I want the first three characters in the string to be letters, and I also want the 4th to be a digit between 0 and 9. If this rule is met, I want the first three letters of the X variable pasted into a new column. If not, I want the new column to say "FR". Hence the final dataset is as follows.

ID    X       Y
1    MJF34   MJF 
2    GA249D  FR
3    DEW235R DEW
4    4SDFR3  FR
5    DAS3    DAS
6    BHFS7   FR

What I have so far that checks the first three letters is:

sub_string<-substr(df$X, 1, 3)

df$Y<-ifelse(grepl('[0-9]',sub_string), "FR", sub_string)

I have tried to extend it to check the 4th character, but it doesn't seem to work.

sub_number<-substr(df$X, 4, 4)
df$Y<-ifelse(grepl('[0-9]',sub_string) && !grepl('[0-9]',sub_number), "FR", sub_string)

I'm probably doing something obviously wrong but can't seem to figure it out. Thanks in advance.
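A sketch of one way to express the rule, assuming "letters" means ASCII A-Z or a-z: a single anchored regular expression can test the first three characters and the 4th together. Two things seem to go wrong in the attempt above. First, && is not vectorised: it looks at a single value, and recent R versions raise an error when it is given a longer vector, so the element-wise operators & or | are needed inside ifelse(). Second, the row should become "FR" when either condition fails, which is an "or" rather than an "and".

```r
# The example data from the question
df <- data.frame(ID = 1:6,
                 X  = c("MJF34", "GA249D", "DEW235R", "4SDFR3", "DAS3", "BHFS7"),
                 stringsAsFactors = FALSE)

# TRUE when the string starts with three letters followed by one digit
ok <- grepl("^[A-Za-z]{3}[0-9]", df$X)

# Keep the first three letters where the rule holds, otherwise "FR"
df$Y <- ifelse(ok, substr(df$X, 1, 3), "FR")

df$Y
# [1] "MJF" "FR"  "DEW" "FR"  "DAS" "FR"
```

Keeping the two-step structure from the question would also work, for example by testing `grepl('[0-9]', sub_string) | !grepl('[0-9]', sub_number)` as the "FR" condition.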

Igraph Invalid Indexing Error when using Ifelse

I have two vectors of nodes:

bad_node_pair
+ 2/2 vertices, named:
[1] 1949 1967

remaining_nodes
+ 5/? vertices, named:
[1] 1947 1948 1949 1967 1968

I test whether the bad_node_pair exists in the remaining_nodes, and if so, return the ones that do appear:

bad_node_pair[names(bad_node_pair) %in% names(remaining_nodes)]
+ 2/2 vertices, named:
[1] 1949 1967

However, when I put this in a loop, I get:

ifelse(
  bad_node_pair[names(bad_node_pair) %in% names(remaining_nodes)],
  print(1),
  print(0)
)
[1] 1
Error in `[<-.igraph.vs`(`*tmp*`, test & ok, value = c(1, 1)) : 
  invalid indexing

It prints the answer, but throws that error.

What is going on?


Data for bad nodes:

df1 <- read.table(header=T, text=" from   to
8 1949 1967")
bad_g <- graph.data.frame(df1, directed=FALSE)
bad_node_pair <- V(bad_g)

Data for good nodes:

df2 <- read.table(header=T, text=" from   to
1 1947 1948
2 1947 1949
3 1947 1967
4 1947 1968
5 1948 1949
6 1948 1967
7 1948 1968
8 1949 1968")
g <- graph.data.frame(df2, directed=FALSE)
remaining_nodes <- V(g)
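A likely explanation, offered as a sketch rather than a confirmed diagnosis: ifelse() expects a logical vector as its first argument and builds its result by assigning into a copy of that argument. Here the first argument is a vertex sequence rather than TRUE/FALSE values, so the assignment appears to reach igraph's own indexing method, which rejects it. The "[1] 1" is just the side effect of print(1) being evaluated before the failure. Testing on the logical vector produced by %in% avoids the problem. To stay self-contained, the example below uses plain character vectors as stand-ins for names(bad_node_pair) and names(remaining_nodes); `bad_names` and `remaining_names` are hypothetical names.

```r
# Stand-ins for names(bad_node_pair) and names(remaining_nodes)
bad_names       <- c("1949", "1967")
remaining_names <- c("1947", "1948", "1949", "1967", "1968")

# Logical vector: which bad nodes are still among the remaining nodes?
present <- bad_names %in% remaining_names

# A single yes/no decision calls for if () with any() or all()
flag <- if (any(present)) 1 else 0

# One value per node calls for ifelse() on the logical vector itself
flags <- ifelse(present, 1, 0)

flag
# [1] 1
flags
# [1] 1 1
```

With the real objects, `present` would be computed as `names(bad_node_pair) %in% names(remaining_nodes)`, and the vertex sequence would only be indexed afterwards if the matching vertices themselves are needed.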

How can I use IF & MAX together?

How do I perform the following in Excel? Currently column C displays the MAX value of D4:F4. However, I need it to display an alternate value.

For instance, if the MAX value comes from column D, it should display "TRAN"; if it comes from column E, it should display "VERT"; and if it comes from column F, it should display "LONG".


Thanks in advance