Saturday, May 12, 2018

Why is my linear regression starting from 0?

My linear regression model's y-intercept is wrong. I've tested it against sklearn, which returns an intercept of 61.9195775448, while mine returns 0.51611686, even though both are fit on the same dataset, and I can't figure out why. Here's the code for the linear regression with gradient descent, followed by a sketch of the comparison:

def cost_func_linear(X, y, theta):
    # Despite the name, these are the two gradient terms of the mean squared
    # error cost, with respect to theta[0] and theta[1], averaged over m samples.
    m = len(X)
    cost1 = 1.0/m * sum([(theta[0] + theta[1]*X[i] - y[i]) for i in range(m)])
    cost2 = 1.0/m * sum([(theta[0] + theta[1]*X[i] - y[i])*X[i] for i in range(m)])
    return cost1, cost2

def linearRegression(X, y, alpha, threshold):
    iteration = 0
    converge = False
    theta = [0, 0]

    while not converge:
        cost = cost_func_linear(X, y, theta)
        J = cost[0]  # gradient for theta[0], used below as the convergence measure

        # simultaneous gradient-descent update of both parameters
        v0 = theta[0] - alpha*cost[0]
        v1 = theta[1] - alpha*cost[1]
        theta[0] = v0
        theta[1] = v1

        # stop once the theta[0] gradient barely changes between iterations
        newJ = cost_func_linear(X, y, theta)[0]
        if abs(J - newJ) <= threshold:
            converge = True

        iteration += 1

    print('Iterations: ', iteration)
    # predicted values of the fitted line (X is assumed to be a NumPy array)
    return theta[0] + theta[1]*X
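
For reference, here is a minimal sketch of the kind of side-by-side comparison described above. The synthetic data, the learning rate, and the convergence threshold are placeholders I've assumed, not the original dataset or settings; only the sklearn calls (LinearRegression, fit, intercept_, coef_) are the standard API.

import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data standing in for the real dataset (assumed values, not the original data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 60.0 + 2.0*X + rng.normal(0, 1.0, size=100)

# Fit with the gradient-descent code above; alpha and threshold are guesses.
predictions = linearRegression(X, y, alpha=0.01, threshold=1e-9)

# Baseline fit with sklearn for comparison (note the column-vector reshape it expects).
reg = LinearRegression().fit(X.reshape(-1, 1), y)
print('sklearn intercept:', reg.intercept_)
print('sklearn slope:', reg.coef_[0])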

This is the output (a plot titled "Linear regression"):
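
The plot can presumably be reproduced with matplotlib along these lines, reusing X, y, and predictions from the comparison sketch above; the labels and styling are guesses.

import matplotlib.pyplot as plt

# Reuses X, y, and predictions from the comparison sketch above.
plt.scatter(X, y, s=10, label='data')
plt.plot(X, predictions, color='red', label='gradient descent fit')
plt.title('Linear regression')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()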

Can someone explain what's causing this?
