Wednesday, November 6, 2019

How can I compare different arrays and eliminate specified values?

I have three arrays from which I would like to eliminate inconsistent data. There are two inconsistencies that I want to find:

  1. unavailable data, which is marked as ':' in the dataset
  2. the row data (e.g. a country) must exist in all arrays. If it does not, the data is not consistent enough for analysis

First, I tried to specify what the inconsistencies in the arrays are. Then, I tried to create three for-loops, one to analyse each array. Finally, I wanted to state which rows should be eliminated based on the inconsistencies found.

While trying this, I ran into two problems:

  1. The first problem concerns the length of the arrays. The three arrays differ in length. Although I sorted the arrays alphabetically, it is difficult to find out whether a country exists in the other arrays, because it may sit at a different position in each (e.g. i=12 and j=14). How can I check whether a country is present in an array regardless of its index?
  2. I assume I should use i, j and k in the loops, but I have no idea how to set them up so that they find the inconsistencies
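
For the first problem, one possible sketch uses MATLAB's built-in `ismember`, which tests membership independently of position. It assumes that each array is a cell array whose first column holds the country name as a char vector; the sample values below are hypothetical and only there to make the example runnable.

```matlab
% Hypothetical sample data (assumption: cell arrays, country name in column 1).
pop = {'Austria', '8.8'; 'Belgium', '11.4'; 'Denmark', '5.8'};
gdp = {'Belgium', '470'; 'Denmark', '350'; 'France', '2700'};

% inGdp(r) is true when the country in row r of pop also appears in gdp.
% rowInGdp(r) is the matching row index in gdp, or 0 when there is no match.
[inGdp, rowInGdp] = ismember(pop(:,1), gdp(:,1));

% Countries from pop that are missing in gdp.
missingInGdp = pop(~inGdp, 1);
```

Because `ismember` returns the matching row index as its second output, the rows do not need to be at the same position in both arrays, and no nested i/j loop is required for the lookup.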

My code:

nodata = ':';

invalid = any(pop(:,1) ~= gdp(:,1) | pop(:,1) ~= fp(:,1) | gdp(:,1) ~= fp(:,1))

for i = 1:length(pop)

    for j= 1:length(gdp)

        for k = 1:length(fp)

            if (:,2:end == nodata) | (:,1 == invalid)

                % Delete entire row = []

            end
        end
    end
end

I know this code does not work. What it should do is eliminate every row that contains inconsistent data.
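
The intended behaviour can be sketched without loops, under some assumptions that are not stated above: `pop`, `gdp` and `fp` are cell arrays, column 1 holds the country name, and the remaining columns hold values stored as char vectors, with ':' marking missing data. The sample values and the helper `hasNoData` are hypothetical.

```matlab
nodata = ':';

% Hypothetical sample data (assumption: all cells are char vectors).
pop = {'Austria', '8.8'; 'Belgium', '11.4'; 'Denmark', '5.8'};
gdp = {'Belgium', '470'; 'Denmark', '350'; 'France', '2700'};
fp  = {'Austria', '1.2'; 'Belgium', ':'; 'Denmark', '0.9'; 'France', '2.1'};

% Step 1: drop every row that has the no-data marker in any data column.
hasNoData = @(T) any(strcmp(T(:,2:end), nodata), 2);
pop = pop(~hasNoData(pop), :);
gdp = gdp(~hasNoData(gdp), :);
fp  = fp(~hasNoData(fp), :);

% Step 2: keep only the countries that remain in all three arrays.
common = intersect(intersect(pop(:,1), gdp(:,1)), fp(:,1));
pop = pop(ismember(pop(:,1), common), :);
gdp = gdp(ismember(gdp(:,1), common), :);
fp  = fp(ismember(fp(:,1), common), :);
```

Removing the rows with missing data first means that a country with ':' in one array is also dropped from the other two in the second step. With the sample data above, only the Denmark row remains in each array.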
