
[Reinforcement learning]#1. Introduction

쟈니유 2023. 3. 30. 16:19

Let's reinforce


#1. Introduction 

 

▶️ What is reinforcement learning 

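Reinforcement learning assigns a reward to each state, positive (+n) for desirable outcomes and negative (-m) for undesirable ones, so that the agent learns on its own to take good actions.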

 

▶️ Mars rover example

(s, a, R(s), s') = state, action, reward, and the new state reached after taking the action
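
One such tuple could be sketched in Python as follows. The six-state track, the reward values, and the names `Transition`, `REWARDS`, and `step` are illustrative assumptions and do not come from the notes above.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    state: int       # s: state before the action
    action: str      # a: action taken ("left" or "right")
    reward: float    # R(s): reward received in state s
    next_state: int  # s': state reached after the action


# Hypothetical rewards for a track with states 1..6 (values are assumed).
REWARDS = {1: 100.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 40.0}


def step(state: int, action: str) -> Transition:
    """Move one cell left or right, staying inside the track."""
    delta = -1 if action == "left" else 1
    next_state = min(max(state + delta, 1), 6)
    return Transition(state, action, REWARDS[state], next_state)


print(step(4, "left"))
```

Each call to `step` produces one (s, a, R(s), s') record, which is the unit of experience a reinforcement learning algorithm works with.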

 

▶️ The return in reinforcement learning 

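  • Discount factor (gamma, γ): weights a reward received t steps in the future by γ^t, so every extra action (move) effectively carries a cost. In finance, the same idea reflects the declining value of money over time.

  • The return of an action differs depending on the state, so these returns can be used to guide which action to take.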
To summarize, the return in reinforcement learning is the sum of the rewards that the system gets, 
weighted by the discount factor, where rewards in the far future are weighted by the discount factor raised to a higher power.
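
Written out, the return is R_1 + γ·R_2 + γ²·R_3 + ... A minimal Python sketch of that sum is shown here; the function name `discounted_return`, the reward sequence, and γ = 0.5 are assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Assumed example: three steps with no reward, then a reward of 100.
rewards = [0.0, 0.0, 0.0, 100.0]
print(discounted_return(rewards, gamma=0.5))  # 0.5**3 * 100 = 12.5
```

The later a reward arrives, the higher the power of γ applied to it, so with γ < 1 a reward that is reached quickly counts for more than the same reward reached slowly.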

 

 

▶️  Making decisions: Policies in reinforcement learning 

Policy(pi)

pi(state) = action 

 

Goal 

Find a policy pi that tells you what action to take in every state so as to maximize the return 
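
As a rough illustration of this goal, the sketch below builds a policy for a small one-dimensional track by comparing the return of "always go left" against "always go right" from each state. This is a deliberately simplified comparison of two fixed strategies, not a general method for finding an optimal policy, and the track layout, reward values, γ = 0.5, and all names are assumptions.

```python
GAMMA = 0.5  # assumed discount factor

# Hypothetical track with states 1..6; states 1 and 6 end the episode.
REWARDS = {1: 100.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 40.0}
TERMINAL = {1, 6}


def return_from(state, direction):
    """Discounted return when moving in one fixed direction until a terminal state."""
    total, discount = 0.0, 1.0
    while True:
        total += discount * REWARDS[state]
        if state in TERMINAL:
            return total
        state += -1 if direction == "left" else 1
        discount *= GAMMA


def pi(state):
    """Policy: pick the direction with the larger return from this state."""
    return max(("left", "right"), key=lambda a: return_from(state, a))


policy = {s: pi(s) for s in range(2, 6)}
print(policy)  # {2: 'left', 3: 'left', 4: 'left', 5: 'right'}
```

Under these assumed numbers, state 5 prefers the nearby smaller reward because the larger reward on the left is discounted too heavily by the time it is reached.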

 

Markov Decision Process (MDP)