# Determining the top-priority goal among multiple goals
## Exploration
### Uncertain, but there might be better results
* What it means for an RL agent to "think"
* Reaching a difficult state does not necessarily mean that the reward must be high. (But!)
* It may be necessary to move away from the structure of current RL, in which the score is measured in proportion to the target result.
## Comparison of results
* Already adopted in many papers.
### Pareto-dominating policy
* Doesn't it end up at the top result after all?
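
One way to probe the question above is a minimal sketch of Pareto dominance over per-objective returns. The policy names and return values below are hypothetical, and the sketch assumes higher is better on every objective. It suggests that a Pareto comparison can leave several non-dominated policies, so it may not collapse to a single "top result" unless a priority among the goals is also given.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(returns: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of policies that no other policy dominates."""
    return [
        name
        for name, r in returns.items()
        if not any(
            dominates(other, r)
            for other_name, other in returns.items()
            if other_name != name
        )
    ]


# Hypothetical returns per policy: (objective 1, objective 2).
policy_returns = {
    "pi_a": (10.0, 2.0),
    "pi_b": (6.0, 6.0),
    "pi_c": (2.0, 9.0),
    "pi_d": (5.0, 5.0),  # worse than pi_b on both objectives
}

front = pareto_front(policy_returns)
print(front)  # ['pi_a', 'pi_b', 'pi_c']
```

Here `pi_d` is removed because `pi_b` dominates it, but `pi_a`, `pi_b`, and `pi_c` remain mutually incomparable. Picking one of them still requires deciding which goal has top priority.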