Linear Regression

Tags: Python, MachineLearning
Created: Jan 5, 2023 05:13 PM
Last updated: July 15, 2023
Note: The code below was tested in a Colab environment.

Importing libraries

  • Code
    • import numpy as np
      import seaborn as sns
      import matplotlib.pyplot as plt
      from sklearn.model_selection import train_test_split
      from sklearn.preprocessing import LabelEncoder, PolynomialFeatures
      from sklearn.linear_model import LinearRegression
      from sklearn.metrics import mean_squared_error

Loading the data

  • Code
    • # Load the 'mpg' sample dataset that ships with seaborn
      DF = sns.load_dataset('mpg')
      DF.info()

Data preprocessing

  • Code
    • # Drop rows where horsepower is null
      DF.dropna(subset=['horsepower'], inplace=True)
      # Encode the categorical 'origin' column (LabelEncoder expects a 1-D input)
      encoder = LabelEncoder()
      DF['origin_state'] = encoder.fit_transform(DF['origin'])
      print('origin:', DF['origin_state'].unique())
      # Drop the object (string) columns
      DF.drop(columns=['name', 'origin'], inplace=True)
      print("**" * 20)
      DF.info()
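As a minimal sketch of what the label-encoding step does, the snippet below fits `LabelEncoder` on a short hand-written list that stands in for the `origin` column. The list is an assumption for illustration, not data read from the dataset. The encoder assigns integer codes in sorted order of the labels, so the codes do not express any real ranking.

```python
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the 'origin' column (illustrative values only)
origin_sample = ['usa', 'japan', 'europe', 'usa', 'japan']

encoder = LabelEncoder()
codes = encoder.fit_transform(origin_sample)

# classes_ holds the labels in sorted order; each label's index is its code
mapping = {label: code for code, label in enumerate(encoder.classes_)}
print(mapping)
print(codes.tolist())
```

Because the integers are arbitrary, a linear model would read them as an ordered quantity. If the encoded column is ever used as a feature, one-hot encoding (for example `pandas.get_dummies`) is a common alternative.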

Data visualization 1 (pairplot)

  • All columns
    • sns.pairplot(DF)
      plt.show()
  • Selected columns
    • sns.pairplot(DF, vars=['mpg', 'displacement', 'weight'])
      plt.show()
  • A specific column
    • sns.pairplot(DF, y_vars=['mpg'])
      plt.show()

Data visualization 2 (heatmap)

  • All columns
    • sns.heatmap(DF.corr())
      plt.show()
  • A specific column
    • sns.heatmap(DF.corr()[['mpg']])
      plt.show()

Printing correlation coefficients

  • All columns
    • DF.corr()
  • A specific column
    • DF.corr()[['mpg']]
  • A specific column (sorted by absolute value)
    • DF.corr()[['mpg']].sort_values(by='mpg', key=abs, ascending=False)
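`DataFrame.corr()` computes the Pearson correlation by default. The sketch below works that coefficient out by hand and checks it against `np.corrcoef`. The two arrays are synthetic stand-ins for `weight` and `mpg`, and the numbers used to generate them are made up for illustration. `pearson_r` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for two columns (not the real dataset)
weight = rng.normal(3000, 800, size=200)
mpg = 46 - 0.0076 * weight + rng.normal(0, 4, size=200)

def pearson_r(a, b):
    """Pearson r: covariance divided by the product of the standard deviations."""
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float((a_c * b_c).sum() / np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum()))

r_manual = pearson_r(weight, mpg)
r_numpy = float(np.corrcoef(weight, mpg)[0, 1])
print(round(r_manual, 4), round(r_numpy, 4))
```

A value near -1 or +1 indicates a strong linear relationship, which is why sorting by absolute value is a reasonable way to shortlist candidate features.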

Simple Regression

  • Example
    • X = DF[['weight']]
      y = DF['mpg']
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2045)
      RA = LinearRegression()
      RA.fit(X_train, y_train)
      print('weight:', RA.coef_)
      print('bias:', RA.intercept_)
      # score() returns the coefficient of determination (R^2)
      print('R^2 Score:', RA.score(X_test, y_test))
      y_hat_test = RA.predict(X_test)
      mse = mean_squared_error(y_test, y_hat_test)
      print('MSE / RMSE: %.2f / %.2f' % (mse, np.sqrt(mse)))
      # Compare the distributions of actual and predicted mpg
      plt.figure(figsize=(9, 6))
      ax1 = sns.kdeplot(y_test, label='y_test')
      ax2 = sns.kdeplot(y_hat_test, label='y_hat_simple', ax=ax1)
      ax3 = sns.kdeplot(y_train, label='y_train', ax=ax1)
      plt.legend()
      plt.show()
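To make the fitted `weight` and `bias` values less of a black box, here is a rough sketch of the closed-form least-squares solution for one feature, along with the R^2 and RMSE that the example prints. It runs on synthetic data, and the generating constants are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1500, 5000, size=300)               # synthetic 'weight'
y = 46.0 - 0.0076 * x + rng.normal(0, 4, size=300)  # synthetic 'mpg'

# Closed-form ordinary least squares for a single feature:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
intercept = y.mean() - slope * x.mean()

y_hat = intercept + slope * x
ss_res = ((y - y_hat) ** 2).sum()     # residual sum of squares
ss_tot = ((y - y.mean()) ** 2).sum()  # total sum of squares
r2 = 1 - ss_res / ss_tot
rmse = np.sqrt(ss_res / len(y))
print('slope:', slope, 'intercept:', intercept)
print('R^2: %.3f / RMSE: %.2f' % (r2, rmse))
```

For a single feature, `LinearRegression` should arrive at the same slope and intercept. The sketch evaluates on the data it was fitted to, whereas the example above scores on a held-out test split.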

Polynomial Regression

  • Example
    • X = DF[['weight']]
      y = DF['mpg']
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=6691)
      # Fit the feature expansion on the training data, then reuse it for the test data
      poly = PolynomialFeatures(degree=2, include_bias=False)
      X_train_poly = poly.fit_transform(X_train)
      X_test_poly = poly.transform(X_test)
      RA = LinearRegression()
      RA.fit(X_train_poly, y_train)
      print('weight:', RA.coef_)
      print('bias:', RA.intercept_)
      # score() returns the coefficient of determination (R^2)
      print('R^2 Score:', RA.score(X_test_poly, y_test))
      y_hat_test = RA.predict(X_test_poly)
      mse = mean_squared_error(y_test, y_hat_test)
      print('MSE / RMSE: %.2f / %.2f' % (mse, np.sqrt(mse)))
      # Compare the distributions of actual and predicted mpg
      plt.figure(figsize=(9, 6))
      ax1 = sns.kdeplot(y_test, label='y_test')
      ax2 = sns.kdeplot(y_hat_test, label='y_hat_poly', ax=ax1)
      ax3 = sns.kdeplot(y_train, label='y_train', ax=ax1)
      plt.legend()
      plt.show()
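A small sketch of what `PolynomialFeatures(degree=2, include_bias=False)` produces may help here. The tiny arrays below are made-up values, not rows from the dataset. Each input value `x` becomes the pair `[x, x^2]`, and the linear model is then fitted on those two columns.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Made-up single-feature inputs for illustration
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train)  # learn the feature layout on the training data
X_test_poly = poly.transform(X_test)        # apply the same layout to the test data

print(X_train_poly)  # columns: x, x^2
print(X_test_poly)
```

Calling `fit_transform` on the training data and `transform` on the test data is the usual convention for scikit-learn transformers, since it keeps anything learned during fitting tied to the training set.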

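Multiple Regression

  • Example
    • X = DF[['weight', 'displacement', 'horsepower']]
      y = DF['mpg']
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2045)
      RA = LinearRegression()
      RA.fit(X_train, y_train)
      print('weight:', RA.coef_)
      print('bias:', RA.intercept_)
      # score() returns the coefficient of determination (R^2)
      print('R^2 Score:', RA.score(X_test, y_test))
      # Predict with this model so the plot does not reuse predictions from an earlier cell
      y_hat_test = RA.predict(X_test)
      mse = mean_squared_error(y_test, y_hat_test)
      print('MSE / RMSE: %.2f / %.2f' % (mse, np.sqrt(mse)))
      plt.figure(figsize=(9, 6))
      ax1 = sns.kdeplot(y_test, label='y_test')
      ax2 = sns.kdeplot(y_hat_test, label='y_hat_multiple', ax=ax1)
      ax3 = sns.kdeplot(y_train, label='y_train', ax=ax1)
      plt.legend()
      plt.show()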
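The three regression examples repeat the same split, fit, and score steps. One possible way to keep each set of predictions tied to the model that produced it is a small helper like the one sketched below. `fit_and_score` is a hypothetical name, and the data is synthetic with made-up coefficients, so the printed numbers say nothing about the mpg dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def fit_and_score(X, y, random_state=2045):
    """Split the data, fit a linear model, and return test-set metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state)
    model = LinearRegression().fit(X_train, y_train)
    y_hat_test = model.predict(X_test)  # predictions always come from this model
    mse = mean_squared_error(y_test, y_hat_test)
    return {'r2': model.score(X_test, y_test), 'mse': mse, 'rmse': float(np.sqrt(mse))}

# Synthetic data: three features that all influence the target
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([-3.0, -1.0, -0.5]) + 23.0 + rng.normal(0, 1.0, size=200)

one = fit_and_score(X[:, :1], y)  # first feature only (simple regression)
three = fit_and_score(X, y)       # all three features (multiple regression)
print('one feature   :', one)
print('three features:', three)
```

On this synthetic data the three-feature model should show a lower RMSE, because the target was generated from all three features. Whether the same holds for the mpg columns has to be checked on the real data.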