Tuesday, 27 November 2018

How to calculate the average of two files using awk and grep

I have the 2 following files:

points:

John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18

teams:

Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3

I would like to calculate the average points of each team. I came up with a solution that uses only two awk commands, but I would like to do it more efficiently (without for loops and if statements).

Here is what I did:

#!/bin/bash

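awk 'BEGIN { FS="," }
      # teams (first file): store names and teams by line number
      FNR==NR { a[FNR] = $1; b[FNR] = $2; n = FNR; next }
      # points (second file): scan the stored names for a match
      { for(i = 1; i <= n; ++i) { if(a[i] == $1) print b[i], $2 } }' teams points > output.txt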

In this first awk command, I separate the teams (team1, team2, team3) from the names and create a new file containing only the teams and the corresponding points of each member (hence the need for a for loop and an if statement).
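One way the scan over every stored line might be avoided is to key the array on the player name rather than the line number, so each line of points needs a single lookup. This is only a minimal sketch, assuming every name is unique and appears in both files; the array name `team`, the variable `pairs`, and the scratch directory are illustrative and not part of the script above:

```shell
#!/bin/bash
# Recreate the two sample files in a scratch directory (illustrative setup).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > points
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > teams

# teams (first file): remember the team of each player, keyed by name.
# points (second file): look the team up directly and print "team points".
pairs=$(awk -F, '
    FNR == NR  { team[$1] = $2; next }
    $1 in team { print team[$1], $2 }
' teams points)

printf '%s\n' "$pairs"
```

The pattern `$1 in team` plays the role of the if statement, and the associative array replaces the inner for loop, so the work per line of points no longer grows with the size of teams.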

Secondly:

awk 'BEGIN { FS=" "; 
              count_team1 = 0; 
              count_team2 = 0; 
              count_team3 = 0;
              average_team1 = 0; 
              average_team2 = 0; 
              average_team3 = 0 } 

        /team1/  { count_team1 = count_team1 + 1; average_team1 = average_team1 + $2 }
        /team2/  { count_team2 = count_team2 + 1; average_team2 = average_team2 + $2 }
        /team3/  { count_team3 = count_team3 + 1; average_team3 = average_team3 + $2 }


      END { print "The average of team1 is: " average_team1 / count_team1;
            print "The average of team2 is: " average_team2 / count_team2; 
            print "The average of team3 is: " average_team3 / count_team3 }' output.txt

In this second awk command, I simply create variables to count the members of each team and other variables to hold the total points of each team. It is easy to do since my new file output.txt contains only the teams and the scores.

This solution works, but as I said before, I would like to do it without a for loop and an if statement. I thought of dropping FNR==NR and using grep -f for the matching, but I didn't get any conclusive results.
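An alternative that avoids FNR==NR altogether might be to let the standard `join` utility do the matching and leave only the aggregation to awk. The following is a rough sketch rather than a definitive answer, assuming names are unique, contain no commas, and appear in both files; the file names `teams.sorted` and `points.sorted`, the arrays `sum` and `n`, and the variable `averages` are made up for illustration:

```shell
#!/bin/bash
# Recreate the two sample files in a scratch directory (illustrative setup).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > points
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > teams

# join requires both inputs to be sorted on the join field (the name, field 1).
sort -t, -k1,1 teams  > teams.sorted
sort -t, -k1,1 points > points.sorted

# join emits "name,team,points"; awk accumulates a sum and a count per team,
# then prints one average per team. The final sort only fixes the output order.
averages=$(join -t, teams.sorted points.sorted |
    awk -F, '
        { sum[$2] += $3; n[$2]++ }
        END { for (t in sum) print "The average of " t " is: " (sum[t] / n[t]) }
    ' | sort)

printf '%s\n' "$averages"
```

The remaining for loop runs once per team in the END block, not once per input line, and the team names no longer need to be hard-coded.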
