 Linear Regression in Python

## Complete Detailed Tutorial on Linear Regression in Python

```python
import seaborn as sns

# Load the iris dataset that ships with seaborn
iris = sns.load_dataset('iris')
iris
```
`iris = iris[['petal_length', 'petal_width']]`
```python
X = iris['petal_length']
y = iris['petal_width']
```
```python
import matplotlib.pyplot as plt
# Scatter plot of petal width against petal length
plt.scatter(X, y)
plt.xlabel("petal length")
plt.ylabel("petal width")
```
```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=23)
```
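To make the `test_size = 0.4` setting concrete: the split shuffles the rows and holds out 40% of them for testing, so the 150 iris rows become 90 training rows and 60 test rows. The sketch below mimics that idea in plain Python. `simple_split` is a hypothetical helper written for illustration, and its shuffle order will not match what scikit-learn produces for the same seed.

```python
import random

def simple_split(rows, test_size=0.4, seed=23):
    """Shuffle a copy of rows and hold out a fraction for testing (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)       # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_size)   # number of held-out rows
    return shuffled[n_test:], shuffled[:n_test]

# Row indices 0..149 stand in for the 150 iris rows
train, test = simple_split(range(150))
print(len(train), len(test))  # 90 60
```

Fixing the seed plays the same role as `random_state`: rerunning the code gives the same split every time.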
`X_train`
```python
import numpy as np

# scikit-learn expects a 2D feature array of shape (n_samples, n_features)
X_train = np.array(X_train).reshape(-1, 1)
X_train
```
```python
X_test = np.array(X_test).reshape(-1, 1)
X_test
```
`from sklearn.linear_model import LinearRegression`
`lr = LinearRegression()`
`lr.fit(X_train, y_train)`
```python
c = lr.intercept_
c
```
`-0.3511327422143744`
```python
m = lr.coef_
m
```
`array([0.41684538])`
```python
# Predict by hand with the fitted line y = m*x + c
Y_pred_train = m*X_train + c
Y_pred_train.flatten()
```
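For a single feature, the intercept `c` and slope `m` reported by `LinearRegression` are the ordinary least-squares solution, which has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept is the mean of y minus the slope times the mean of x. A minimal sketch in plain Python follows. `fit_line` is a hypothetical helper and the data points are made up for illustration, not taken from iris.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one feature (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c

# Made-up points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
preds = [m * x + c for x in xs]  # same form as m*X_train + c
print(m, c)  # 2.0 1.0
```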
```python
y_pred_train1 = lr.predict(X_train)
y_pred_train1
```
```python
import matplotlib.pyplot as plt

plt.scatter(X_train, y_train)
plt.plot(X_train, y_pred_train1, color='red')
plt.xlabel("petal length")
plt.ylabel("petal width")
```

```python
y_pred_test1 = lr.predict(X_test)
y_pred_test1
```
For multiple linear regression we switch to a second dataset. It has 7 columns in total: six features and the target, `charges`. Let’s see the dataset first:

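```python
import pandas as pd

# The file path is not shown in the source; "insurance.csv" is an assumed name
df = pd.read_csv("insurance.csv")
df
```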

```python
# Convert each text column to integer category codes
df['sex'] = df['sex'].astype('category')
df['sex'] = df['sex'].cat.codes
df['smoker'] = df['smoker'].astype('category')
df['smoker'] = df['smoker'].cat.codes
df['region'] = df['region'].astype('category')
df['region'] = df['region'].cat.codes
```
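`.cat.codes` replaces each label with the integer position of its category. For string columns, pandas orders the categories by sorted unique value by default, so the codes follow alphabetical order. A rough plain-Python sketch of that mapping is shown here, using a hypothetical `encode` helper and made-up values rather than the real column:

```python
def encode(values):
    """Map each label to the index of its sorted category (hypothetical helper)."""
    categories = sorted(set(values))
    lookup = {label: code for code, label in enumerate(categories)}
    return [lookup[v] for v in values], categories

codes, categories = encode(["yes", "no", "no", "yes"])
print(categories)  # ['no', 'yes']
print(codes)       # [1, 0, 0, 1]
```

Keeping the `categories` list around is useful, since it is the only record of which integer stands for which label.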
`df.isnull().sum()`
```
age         0
sex         0
bmi         0
children    0
smoker      0
region      0
charges     0
dtype: int64
```
```python
X = df.drop(columns='charges')
X
```
`y = df['charges']`
```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=23)
```
```python
lr_multiple = LinearRegression()
lr_multiple.fit(X_train, y_train)
```
```python
c = lr_multiple.intercept_
c
```
`-11827.733141795668`
```python
m = lr_multiple.coef_
m
```
`array([  256.5772619 ,   -49.39232379,   329.02381564,   479.08499828, 23400.28378787,  -276.31576201])`
```python
y_pred_train = lr_multiple.predict(X_train)
y_pred_test = lr_multiple.predict(X_test)
```
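With several features, `predict` evaluates the intercept plus the dot product of the coefficients with each feature row. The sketch below reproduces that arithmetic in plain Python, using the intercept and coefficients printed above rounded to two decimals. The feature row is hypothetical, `predict_row` is an illustrative helper, and the code assumes smoker code 1 means "smoker".

```python
# Rounded values from the fitted model's printed intercept and coefficients
intercept = -11827.73
coefs = [256.58, -49.39, 329.02, 479.08, 23400.28, -276.32]
feature_names = ["age", "sex", "bmi", "children", "smoker", "region"]

def predict_row(row, coefs, intercept):
    """Intercept plus the dot product of coefficients and features (hypothetical helper)."""
    return intercept + sum(w * x for w, x in zip(coefs, row))

# Hypothetical person, with values in the same order as feature_names
row = [30, 1, 25.0, 0, 0, 2]
non_smoker = predict_row(row, coefs, intercept)

# Same person with the smoker code flipped to 1 (assumed to mean "smoker")
smoker_row = row[:4] + [1] + row[5:]
smoker = predict_row(smoker_row, coefs, intercept)

# The gap equals the smoker coefficient, up to floating-point rounding
print(smoker - non_smoker)
```

Reading the coefficients this way shows why `smoker` dominates the model: changing that single code shifts the predicted charge by far more than a one-unit change in any other feature.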
```python
from sklearn.metrics import r2_score

r2_score(y_test, y_pred_test)
```
`0.7911113876316933`
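`r2_score` returns the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. A score of about 0.79 therefore means the model accounts for roughly 79% of the variance in `charges` on the test set. A minimal sketch of the formula in plain Python follows, with made-up numbers and a hypothetical helper named `r2`:

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total variation
    return 1 - ss_res / ss_tot

# Made-up targets and predictions, not taken from the insurance data
score = r2([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(round(score, 3))  # 0.949
```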