Wednesday, January 15, 2020

Python MapReduce: How do I add a conditional statement?

I am new to MapReduce and am trying to find the average rating for films in the MovieLens 100k dataset. I have a working program that computes the average rating for each movie, but I only want to do this for movies that have more than 100 reviews. How can I add a conditional statement to do this?

from mrjob.job import MRJob

class PopularMovieAvgReview(MRJob):
    def mapper(self, key, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield movieID, float(rating)

    def reducer(self, movieID, rating):
        total = 0
        numElements = 0 
        for x in rating:
            total += x
            numElements += 1
        yield movieID, total / numElements

if __name__ == '__main__':
    PopularMovieAvgReview.run()
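
One possible approach, sketched here as an assumption rather than a definitive answer: the reducer already counts the ratings for each movie, so the condition can be checked on that count before yielding. A reducer in mrjob is a generator, so it may simply yield nothing for movies that fall below the threshold. The helper name `average_if_popular` and the constant `MIN_REVIEWS` are hypothetical names introduced for illustration; the logic is kept outside the MRJob class so it can be run without Hadoop or mrjob installed.

```python
MIN_REVIEWS = 100  # assumed threshold: keep movies with more than 100 reviews


def average_if_popular(movie_id, ratings, min_reviews=MIN_REVIEWS):
    """Yield (movie_id, average) only if the movie has more than min_reviews ratings."""
    total = 0.0
    num_elements = 0
    for rating in ratings:  # ratings may be a one-pass iterator, as in a reducer
        total += rating
        num_elements += 1
    # The conditional: emit a result only for sufficiently reviewed movies.
    if num_elements > min_reviews:
        yield movie_id, total / num_elements


# Inside the MRJob subclass, the reducer could delegate to the helper:
#     def reducer(self, movieID, rating):
#         yield from average_if_popular(movieID, rating)

if __name__ == '__main__':
    # Hypothetical data: one movie with 101 ratings, one with exactly 100.
    print(list(average_if_popular('50', iter([4.0] * 101))))
    print(list(average_if_popular('1682', iter([3.0] * 100))))
```

Because the count is only known after all ratings for a key have been seen, the check belongs in the reducer rather than the mapper. Equivalently, the `if numElements > 100:` test could be placed directly before the final `yield` in the original reducer.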
