This is my dataset. It contains several countries, several models per country, the year, and the price and volume.
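import pandas as pd

data_dic = {
    "Country": [1, 1, 1, 1, 2, 2, 2, 2],
    "Model": ["A", "B", "B", "A", "A", "B", "B", "A"],
    "Year": [2005, 2005, 2020, 2020, 2005, 2005, 2020, 2020],
    "Price": [100, 172, 852, 953, 350, 452, 658, 896],
    "Volume": [4, 8, 9, 10, 12, 6, 8, 9],
}
df = pd.DataFrame(data_dic)
# Sort by Model, Year and Country for display
print(df.sort_values(["Model", "Year", "Country"]))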
   Country Model  Year  Price  Volume
0        1     A  2005    100       4
4        2     A  2005    350      12
3        1     A  2020    953      10
7        2     A  2020    896       9
1        1     B  2005    172       8
5        2     B  2005    452       6
2        1     B  2020    852       9
6        2     B  2020    658       8
I would like to obtain the following, where 1) column "Difference_Price" is the difference in price between the years 2005 and 2020 for each combination of Country and Model (e.g., Country 1, Model A) and 2) column "Difference_Volume" is the corresponding difference in volume.
import pandas as pd

data_dic2 = {
    "Country": [1, 1, 1, 1, 2, 2, 2, 2],
    "Model": ["A", "B", "B", "A", "A", "B", "B", "A"],
    "Year": [2005, 2005, 2020, 2020, 2005, 2005, 2020, 2020],
    "Price": [100, 172, 852, 953, 350, 452, 658, 896],
    "Volume": [4, 8, 9, 10, 12, 6, 8, 9],
    "Difference_Price": [853, 680, 680, 853, 546, 206, 206, 546],
    "Difference_Volume": [6, 1, 1, 6, 3, 2, 2, 3],
}
# Build a DataFrame and sort by Model, Year and Country for display
print(pd.DataFrame(data_dic2).sort_values(["Model", "Year", "Country"]))
   Country Model  Year  Price  Volume  Difference_Price  Difference_Volume
0        1     A  2005    100       4               853                  6
4        2     A  2005    350      12               546                  3
3        1     A  2020    953      10               853                  6
7        2     A  2020    896       9               546                  3
1        1     B  2005    172       8               680                  1
5        2     B  2005    452       6               206                  2
2        1     B  2020    852       9               680                  1
6        2     B  2020    658       8               206                  2
My full dataset has up to 50 countries and up to 10 models, with years ranging from 1990 to 2030. I am still unsure how to handle conditions on three columns at once (i.e., Country, Year and Model) so that the Price and Volume columns are subtracted automatically for each combination.
Thanks!
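One possible approach, sketched under a few assumptions rather than offered as the definitive answer: select the rows for the two years of interest, index both selections by Country and Model so that pandas aligns them, subtract, and merge the result back onto the full table. The helper name `add_year_differences` and its parameters are hypothetical. The sketch assumes each (Country, Model, Year) combination appears at most once, and it takes the absolute value of the difference, because the expected table shows 3 for Country 2, Model A, where 9 - 12 would otherwise give -3.

```python
import pandas as pd

data_dic = {
    "Country": [1, 1, 1, 1, 2, 2, 2, 2],
    "Model": ["A", "B", "B", "A", "A", "B", "B", "A"],
    "Year": [2005, 2005, 2020, 2020, 2005, 2005, 2020, 2020],
    "Price": [100, 172, 852, 953, 350, 452, 658, 896],
    "Volume": [4, 8, 9, 10, 12, 6, 8, 9],
}
df = pd.DataFrame(data_dic)


def add_year_differences(df, start_year, end_year, value_cols=("Price", "Volume")):
    """Hypothetical helper: add Difference_<col> columns per (Country, Model).

    Assumes at most one row per (Country, Model, Year).
    """
    keys = ["Country", "Model"]
    cols = list(value_cols)
    # One row per (Country, Model) for each of the two years
    start = df.loc[df["Year"] == start_year].set_index(keys)[cols]
    end = df.loc[df["Year"] == end_year].set_index(keys)[cols]
    # Subtraction aligns on the (Country, Model) index; abs() is an
    # assumption taken from the expected output in the question
    diff = (end - start).abs().add_prefix("Difference_").reset_index()
    # A left merge keeps every original row, in its original order
    return df.merge(diff, on=keys, how="left")


result = add_year_differences(df, 2005, 2020)
print(result.sort_values(["Model", "Year", "Country"]))
```

With the full dataset, the same call should work for any pair of years, e.g. `add_year_differences(df, 1990, 2030)`. Rows from other years receive the difference computed for their Country and Model, and a combination that is missing one of the two years ends up with NaN in the new columns. Drop the `.abs()` call if signed differences are wanted.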