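pbj0812's Coding Diary
[kaggle] Completing the Intro to Machine Learning Course
0. Contents
- An introductory machine learning course. It starts with reading and preprocessing data with Pandas, moves on to building and training models with Decision Tree and Random Forest, and ends with how to evaluate them.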
- The exercises run in the Jupyter notebooks that kaggle provides internally.
1) How Models Work
2) Basic Data Exploration
3) Your First Machine Learning Model
4) Model Validation
5) Underfitting and Overfitting
6) Random Forest
7) Machine Learning Competitions
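Lessons 4 and 5 in the list above rest on one observation: an error measured on the rows a model was trained on says little about how it does on new rows. The sketch below illustrates this on synthetic data. The column names `area` and `rooms`, the price formula, and the noise level are all made up for illustration and are not the course dataset.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a housing table (hypothetical columns, not the course data)
rng = np.random.default_rng(0)
n_rows = 500
X = pd.DataFrame({
    "area": rng.uniform(50, 300, n_rows),
    "rooms": rng.integers(1, 8, n_rows),
})
# Price depends on the features plus noise that no model can learn
y = 1000 * X["area"] + 5000 * X["rooms"] + rng.normal(0, 20000, n_rows)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

# Unrestricted tree: it may keep splitting until every training row is isolated
model = DecisionTreeRegressor(random_state=1)
model.fit(train_X, train_y)

# In-sample error: the tree has already seen these rows
train_mae = mean_absolute_error(train_y, model.predict(train_X))
# Out-of-sample error: rows held back from training
val_mae = mean_absolute_error(val_y, model.predict(val_X))

print(f"Training MAE:   {train_mae:,.0f}")
print(f"Validation MAE: {val_mae:,.0f}")
```

The training MAE should come out far below the validation MAE, which is why the final code scores its models on `val_X` and `val_y` rather than on the training split.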
1. Final code
# Code you have previously used to load data
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
# Path of the file to read
iowa_file_path = '../input/home-data-for-ml-course/train.csv'
home_data = pd.read_csv(iowa_file_path)
# Create target object and call it y
y = home_data.SalePrice
# Create X
features = ['LotArea', 'YearBuilt', '1stFlrSF', '2ndFlrSF', 'FullBath', 'BedroomAbvGr', 'TotRmsAbvGrd']
X = home_data[features]
# Split into validation and training data
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
# Specify Model
iowa_model = DecisionTreeRegressor(random_state=1)
# Fit Model
iowa_model.fit(train_X, train_y)
# Make validation predictions and calculate mean absolute error
val_predictions = iowa_model.predict(val_X)
val_mae = mean_absolute_error(val_y, val_predictions)
print("Validation MAE when not specifying max_leaf_nodes: {:,.0f}".format(val_mae))
# Using best value for max_leaf_nodes
iowa_model = DecisionTreeRegressor(max_leaf_nodes=100, random_state=1)
iowa_model.fit(train_X, train_y)
val_predictions = iowa_model.predict(val_X)
val_mae = mean_absolute_error(val_y, val_predictions)
print("Validation MAE for best value of max_leaf_nodes: {:,.0f}".format(val_mae))
# Set up code checking (learntools is kaggle's exercise-checking package)
from learntools.core import binder
binder.bind(globals())
from learntools.machine_learning.ex6 import *
print("\nSetup complete")
from sklearn.ensemble import RandomForestRegressor
# Define the model. Set random_state to 1
rf_model = RandomForestRegressor(random_state=1)
# fit your model
rf_model.fit(train_X, train_y)
# Calculate the mean absolute error of your Random Forest model on the validation data
rf_val_mae = mean_absolute_error(val_y, rf_model.predict(val_X))
print("Validation MAE for Random Forest Model: {}".format(rf_val_mae))
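The code above sets `max_leaf_nodes=100` as the "best value" without showing how that number was picked. One common approach, sketched here under assumptions, is to fit one tree per candidate size and keep the size with the lowest validation MAE. The helper `get_mae`, the candidate list, and the synthetic `area` and `age` columns are introduced only for this illustration, so the winning size printed here need not be 100.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y):
    """Fit a tree capped at max_leaf_nodes leaves and return its validation MAE."""
    model = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes, random_state=1)
    model.fit(train_X, train_y)
    return mean_absolute_error(val_y, model.predict(val_X))


# Synthetic stand-in data (hypothetical columns, not the Iowa housing file)
rng = np.random.default_rng(0)
n_rows = 1000
X = pd.DataFrame({
    "area": rng.uniform(50, 300, n_rows),
    "age": rng.uniform(0, 100, n_rows),
})
y = 1000 * X["area"] - 500 * X["age"] + rng.normal(0, 20000, n_rows)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

# Candidate tree sizes, from very shallow to nearly unrestricted
candidates = [5, 25, 50, 100, 250, 500]
scores = {n: get_mae(n, train_X, val_X, train_y, val_y) for n in candidates}

# Keep the size whose validation MAE is lowest
best_size = min(scores, key=scores.get)

for n, mae in scores.items():
    print(f"max_leaf_nodes={n:>3}  validation MAE={mae:,.0f}")
print("Best max_leaf_nodes:", best_size)
```

Very small trees tend to underfit and very large ones tend to overfit, so the lowest validation MAE usually sits somewhere between the extremes of the candidate list.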
2. Certificate
3. References