(Logistic Regression Classification) Hands-on: Predicting Diabetes Onset in the Pima Indians Dataset


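- A dataset commonly used to practice Logistic Regression in machine learning

- When running in Colab, the file must first be read from local storage or Google Drive, and then split into training data, test data, and so on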
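As a rough illustration of the split mentioned above, here is a minimal sketch that shuffles the rows and holds out a test portion. The helper name `train_test_split`, the `test_ratio` and `seed` parameters, and the stand-in array are assumptions made for this example; they are not part of the original script, which relies on `validation_split` in `model.fit` instead.

```python
import numpy as np

def train_test_split(x, t, test_ratio=0.2, seed=0):
    """Shuffle rows, then split features and labels into train and test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))          # shuffled row indices
    n_test = int(len(x) * test_ratio)      # number of held-out rows
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return x[train_idx], x[test_idx], t[train_idx], t[test_idx]

# Stand-in data (not the real CSV): the last column plays the role of the label.
demo = np.arange(90, dtype=float).reshape(10, 9)
x_demo, t_demo = demo[:, 0:-1], demo[:, [-1]]

x_train, x_test, t_train, t_test = train_test_split(x_demo, t_demo)
print(x_train.shape, x_test.shape, t_train.shape, t_test.shape)
```

Note that Keras's `validation_split` takes the last fraction of the rows without shuffling first, so an explicit shuffled split like this can behave differently when the file is ordered.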

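Mount

Connecting a storage device to a specific directory so that it can be used

When the training data is stored in Google Drive and you want to use it for deep learning development in Colab:

Google Drive --- Mount ---> Colab. Google Drive must be mounted on a specific directory in Colab.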

# Mount Google Drive in Colab
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.optimizers import SGD, Adam

from google.colab import drive
drive.mount('/content/gdrive/') # Directory in Colab where Google Drive will be mounted

# Move to the working directory inside Google Drive
import os
working_dir = 'dataset'

colab_default_dir = '/content/gdrive/MyDrive/Colab Notebooks/'
original_dir = os.getcwd()

try:
    os.chdir(colab_default_dir) # Move to the default Colab directory in Google Drive
    if not os.path.exists(working_dir):
        os.mkdir(working_dir)
    os.chdir(working_dir) # Move to the working directory
    print('current dir = ', os.getcwd())
except Exception as err:
    os.chdir(original_dir)
    print(str(err))
    
# Create the training data
import numpy as np
try:
    loaded_data = np.loadtxt('./diabetes.csv', delimiter = ',')
    x_data = loaded_data[:,0:-1]   # every column except the last is a feature
    t_data = loaded_data[:,[-1]]   # the last column is the label
    print("x_data.shape = ", x_data.shape)
    print("t_data.shape = ", t_data.shape)
except Exception as err:
    print(str(err))

# Build the model
model = Sequential()
model.add(Dense(t_data.shape[1], input_shape = (x_data.shape[1],), activation = 'sigmoid'))

# Compile the model
model.compile(optimizer = SGD(learning_rate=0.01), loss = 'binary_crossentropy', metrics = ['accuracy'])
model.summary()
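The single `Dense` layer with a sigmoid activation above is what makes this model a logistic regression, and `binary_crossentropy` is the loss it is trained on. The following is a minimal plain-Python sketch of those two computations for one sample. The weights, bias, and input values are made up for illustration, and the clipping constant `eps` is an assumption rather than a value taken from Keras.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability of the positive class for one sample: sigmoid(w . x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def binary_crossentropy(t, y, eps=1e-7):
    """Loss for one sample with label t (0 or 1) and predicted probability y."""
    y = min(max(y, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

# Hypothetical weights and one made-up sample, for illustration only.
w, b = [0.5, -0.25], 0.0
x = [2.0, 4.0]

y = predict_proba(x, w, b)  # z = 0.5*2.0 - 0.25*4.0 + 0.0 = 0.0
print(y, binary_crossentropy(1, y))
```

Training with SGD adjusts `w` and `b` so that the average of this loss over the training samples decreases.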

# Train the model
hist = model.fit(x_data, t_data, epochs=500, validation_split = 0.2, verbose = 2)

# Evaluate the model (accuracy)
model.evaluate(x_data, t_data)
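Because the model was compiled with `metrics = ['accuracy']`, `model.evaluate` returns the loss followed by the accuracy. For a sigmoid output, that accuracy amounts to thresholding each predicted probability at 0.5 and comparing it with the label. Here is a small sketch of that calculation; the `accuracy` helper and the sample values are hypothetical and only stand in for real model output.

```python
def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p > threshold else 0 for p in probs]
    correct = sum(int(p == t) for p, t in zip(preds, labels))
    return correct / len(labels)

# Made-up probabilities and labels, for illustration only.
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
print(accuracy(probs, labels))
```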

# Visualize loss and accuracy
import matplotlib.pyplot as plt
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.grid()

plt.plot(hist.history['loss'], label = 'train loss')
plt.plot(hist.history['val_loss'], label = 'validation loss')
plt.legend(loc = 'best')
plt.show()

plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.grid()

plt.plot(hist.history['accuracy'], label = 'train accuracy')
plt.plot(hist.history['val_accuracy'], label = 'validation accuracy')
plt.legend(loc = 'best')
plt.show()
