WindyGridWorldQLearning (source code download, category: Other / LabView)

Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
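The contents of the download itself are not shown on this page, so the following is only a minimal sketch of what tabular Q-learning on a windy gridworld might look like. It assumes the standard layout from Sutton and Barto's textbook (a 7 x 10 grid, per-column upward wind, a reward of -1 per step, and an absorbing goal state). The names `step`, `q_learning` and `greedy_path_length`, and the values of `alpha`, `gamma`, `epsilon` and `episodes`, are illustrative choices, not taken from the package. Setting `gamma=1.0` corresponds to the non-discounted, absorbing case mentioned above. The constant step size is a simplification; the convergence theorem requires suitably decaying learning rates.

```python
import random

# Assumed layout: the standard windy gridworld (not confirmed by this page).
ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]  # upward push applied in each column
START, GOAL = (3, 0), (3, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply the move plus the wind of the current column, clipped to the grid."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row = min(max(row + d_row - WIND[col], 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    next_state = (new_row, new_col)
    return next_state, -1.0, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # One-step target: the absorbing goal contributes no future value.
            target = reward if done else reward + gamma * max(q[next_state])
            # Move the current estimate a fraction alpha toward the target.
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_path_length(q, max_steps=1000):
    """Follow the greedy policy from START; stop at GOAL or after max_steps."""
    state, steps = START, 0
    while state != GOAL and steps < max_steps:
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, _ = step(state, action)
        steps += 1
    return steps


if __name__ == "__main__":
    table = q_learning()
    print("greedy path length from start:", greedy_path_length(table))
```

Only one entry of the table changes per step, which matches the basic setting described above. The epsilon-greedy policy is one simple way to keep every action sampled in every visited state.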

File name: WindyGridWorldQLearning
