[Machine Learning] KNN 으로 편가르기

Notice

Recent Posts

Recent Comments

Link

« 2025/02 »
일	월	화	수	목	금	토
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28

Tags more

Archives

Today

Total

관리 메뉴

pbj0812의 코딩 일기

[Machine Learning] KNN 으로 편가르기 본문

인공지능 & 머신러닝/Machine Learning

[Machine Learning] KNN 으로 편가르기

pbj0812 2021. 5. 7. 08:27

0. 목표

- KNN 으로 편가르기

1. 실습

1) 데이터 생성

- 두 팀 생성

import numpy as np

x = np.linspace(0, 5, 6)
y = np.linspace(0, 100, 101)
xx, yy = np.meshgrid(x, y)
xxx = np.reshape(xx, (-1, ))
yyy = np.reshape(yy, (-1, ))

x2 = np.linspace(15, 20, 6)
y2 = np.linspace(0, 100, 101)
xx2, yy2 = np.meshgrid(x2, y2)
xxx2 = np.reshape(xx2, (-1, ))
yyy2 = np.reshape(yy2, (-1, ))

2) 데이터 프레임화

import pandas as pd

df = pd.DataFrame({'x' : xxx, 'y' : yyy})
df2 = pd.DataFrame({'x' : xxx2, 'y' : yyy2})

df['class'] = 'red'
df2['class'] = 'blue'
df3 = pd.concat([df, df2], axis = 0)

df3.head()

3) 시각화

import seaborn as sns

sns.scatterplot(data = df3, x = "x", y = "y", hue = 'class', legend = "full")

4) train, test 데이터 분류

- 7:3

from sklearn.model_selection import train_test_split

X = df3[['x', 'y']]
Y = df3['class']
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.30)

5) 모델 생성 및 학습

from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=7)
model.fit(x_train, y_train)

6) 모델 평가

- 100점

y_pred = model.predict(x_test)

from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

2. 토이 프로젝트

- 랜덤한 위치에 데이터를 생성하여 분류

1) 데이터 생성

import random

x_toy = []

for i in range(20):
    x_toy.append(random.randint(0, 20))
    
y_toy = []
for i in range(20):
    y_toy.append(random.randint(0, 100))
    
df_new = pd.DataFrame({'x' : x_toy, 'y' : y_toy})

df_new.head()

2) 모델 생성

- 1. 에서의 모든 데이터를 학습 데이터로 사용

model_toy = KNeighborsClassifier(n_neighbors=7)
model_toy.fit(X, Y)

3) 확인

y_toy = model.predict(df_new)

df_new['class'] = y_toy

df_new.head()

4) 시각화

- 기존 데이터와 더하여 시각화

def case(x):
    if x == 'red':
        return 'red2'
    else:
        return 'blue2'
        
df_new['new_class'] = df_new.apply(lambda df_new : case(df_new['class']), axis = 1)

df_new.drop(columns = 'class', inplace = True)
df_new.rename(index = {'new_class' : 'class'}, inplace = True)

df_new.rename(columns = {'new_class' : 'class'}, inplace = True)

df_final = pd.concat([df3, df_new], axis = 0)

sns.scatterplot(data = df_final, x = "x", y = "y", hue = 'class')

3. 참고

- 머신러닝 - 6. K-최근접 이웃(KNN)

- numpy.meshgrid

- [Python] 구조의 재배열, numpy.reshape 함수

- Seaborn을 사용한 데이터 분포 시각화

- seaborn.scatterplot

- pandas.DataFrame.drop

- [Python pandas] DataFrame의 칼럼 이름 바꾸기 : df.columns = [], df.rename(columns)

저작자표시 비영리 동일조건

'인공지능 & 머신러닝 > Machine Learning' 카테고리의 다른 글

[Machine Learning] pycaret tutorial 따라하기 (0)	2021.05.12
[Machine Learning] KMeans 를 통한 자동 편 가르기 (0)	2021.05.11
[Machine Learning] XGBoost 를 이용한 집값 예측 (0)	2021.05.06
[Machine Learning] RandomForest 를 이용한 집값 예측 (2)	2021.05.05