# Python Programming Tutorials

##### Linear Regression following Sentdex's tutorials

Hello. I am trying to do some machine learning on some bitcoin data, specifically linear regression. The full code is below. To plot the result on a graph, I want y to be the price 14.5 days after the corresponding x: first the old, actual values of y, followed by the new values of y, which are the predictions. To do this I need to separate the values of X that already have a known price 14.5 days later from the values of X that do not, and so need a prediction. I performed a shift on the data, meaning some Xs have a value for y 14.5 days ahead and some don't.
Why 14.5 days? The data set is about 1450 days long and I shifted the label column back by 0.01 of its length (0.01 × 1450 days = 14.5 days). Hopefully I communicated what I was trying to say alright.
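As a small illustration of what the negative shift does, here is a sketch on toy prices. The values and the name `shift_by` are made up for the example (it stands in for `forecast_out`); the only number taken from my real run is 20998.

```python
import pandas as pd

# forecast_out from the real run, in minutes, converted to days.
forecast_out_minutes = 20998
days = forecast_out_minutes / (60 * 24)
print(round(days, 2))  # 14.58

# Toy prices; the real data has one row per minute.
toy = pd.DataFrame({"Weighted_Price": [10.0, 11.0, 12.0, 13.0, 14.0]})
shift_by = 2  # hypothetical stand-in for forecast_out

# Negative shift: row i gets the price from row i + shift_by.
toy["label"] = toy["Weighted_Price"].shift(-shift_by)
print(toy)

# Rows that already have a future price vs. rows that still need a prediction.
has_label = toy[toy["label"].notna()]
no_label = toy[toy["label"].isna()]
print(len(has_label), len(no_label))  # 3 2
```

The last `shift_by` rows end up with `NaN` labels, and those are the rows I would want to predict for.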

    import pandas as pd
    import math
    import numpy as np
    from sklearn import preprocessing, svm
    from sklearn.model_selection import cross_validate
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from statistics import mean
    import matplotlib.pyplot as plt
    from matplotlib import style

    df = pd.read_csv("coinbaseUSD_1-min_data_2014-12-01_to_2019-01-09.csv")
    df['date'] = pd.to_datetime(df['Timestamp'],unit='s').dt.date
    print("calculating...")
    forecast_col = 'Weighted_Price'
    forecast_out = int(math.ceil(0.01*len(df))) #forecast_out = 20998 = 20998 minutes = 14.5 days
    df['label'] = df[forecast_col].shift(-forecast_out)
    df = df[['date', 'Weighted_Price', 'label']]
    df.dropna(inplace=True)

    X = np.array(df['Weighted_Price'], dtype = np.float64)
    y = np.array(df['label'], dtype=np.float64)
    X_lately = X[-forecast_out:]
    X = X[:-forecast_out:]

    def best_fit_line(X, y):
        m = (((mean(X) * mean(y)) - mean(X*y)) / ((mean(X) * mean(X)) - mean(X*X)))
        c = mean(y) - (m * (mean(X)))
        return m, c

    m, c = best_fit_line(X, y)
    print(m, c)
    regression_line = [(m*values) for values in X]
    plt.scatter(X, y)
    plt.plot(X, regression_line)
    plt.show()

So what have I tried? The offending lines are these:

    X_lately = X[-forecast_out:]
    X = X[:-forecast_out:]

That is what sentdex did in the video series, but I get the error:

    ValueError: operands could not be broadcast together with shapes (1871868,) (1892866,)

This doesn't work with:

    m = (((mean(X) * mean(y)) - mean(X*y)) / ((mean(X) * mean(X)) - mean(X*X)))

Is that because the slicing makes X and y different lengths? I'm not sure.
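To test that guess, here is a minimal reproduction with toy arrays. The values and the small `forecast_out` are made up, not my real data. Slicing `X` but leaving `y` alone gives the same kind of error as soon as the two are multiplied:

```python
import numpy as np

X = np.arange(10, dtype=np.float64)      # toy stand-in for Weighted_Price
y = np.arange(10, dtype=np.float64) + 1  # toy stand-in for label
forecast_out = 3                         # hypothetical small value

X = X[:-forecast_out]  # X loses its last 3 values, y is untouched
print(X.shape, y.shape)

try:
    X * y  # the same element-wise product used inside mean(X*y)
except ValueError as err:
    message = str(err)
    print(message)
```

In my real run the two shapes differ by 1892866 - 1871868 = 20998, which is exactly `forecast_out`, so the lengths do seem to go out of step at that slice.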
What am I doing wrong?