Topic: [Original] Some speculations prompted by brain science -- 鸿乾
When I get time, I will read and comment on the following:
"Path Integral Reinforcement Learning"
http://homes.cs.washington.edu/~etheodor/papers/LearningWorkshop11.pdf
1.
Related searches - Uploads www.pudn.com Programmers' United Development Network
s.pudn.com/search_uploads.asp?k=lingo
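... optimal, and on this basis, taking roadside constraints, dynamic obstacle avoidance, and shortest path as the fitness function, proposes (322KB, .... 72. jiqixuexi.rar - This is part three of the complete collection of game theory algorithms: machine learning; the other algorithms will be released one after another. ..... What it does is turn an integral equation into a difference equation, or turn the integral in an integral equation into a finite sum, ...
Mathematics related to machine learning and computer vision - godenlove007's column - Blog channel ...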
blog.csdn.net/godenlove007/article/details/8510392
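January 16, 2013 – Mathematics related to machine learning and computer vision, part one (the following is reposted from an article on the personal page of an MIT expert, ... And in statistics, marginalization and integration are even more inseparable; however, in analytic form ..... The directory path must not contain Chinese characters or spaces, must begin with a letter, and should not be too long.
ScienceNet - "Lie Group Machine Learning" by Li Fanzhang et al. - blog post of the University of Science and Technology of China Press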
blog.sciencenet.cn/blog-502977-684746.html
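April 28, 2013 – Judging from historical experience, research on machine learning should take "cognitive science as the foundation and mathematical methods as the means, ... approach, and along such a path build the theory, technology, methods, and application system of machine learning".
A speech by Liu Kefeng, chair of the Department of Mathematics, Zhejiang University - bluenight's column - Blog channel ...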
blog.csdn.net/chl033/article/details/4888555
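November 27, 2009 – The way physicists learn mathematics may be worth borrowing from; Witten and his colleagues probably never do ... Although Feynman's path integral still lacks a rigorous mathematical foundation, the theory, because of its physical ...
Frontier hot topics in machine learning – Deep Learning - 大枫叶_HIT - Blog channel - CSDN ...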
blog.csdn.net/datoubo/article/details/8596444
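February 20, 2013 – Deep learning is a new area of machine learning research, motivated by building neural networks that simulate the way the human brain analyzes and learns ... A particular property of such a flow graph is its depth: the length of the longest path from an input to an output. .... Visits: 1889; Points: 113; Rank: beyond a thousand miles ...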
2.
The following paper:
"Path Integral Reinforcement Learning"
http://homes.cs.washington.edu/~etheodor/papers/LearningWorkshop11.pdf
Abstract—Reinforcement learning is one of the most fundamental
frameworks of learning control, but applying it to
high dimensional control systems, e.g., humanoid robots, has
largely been impossible so far. Among the key problems are
that classical value function-based approaches run into severe
limitations in continuous state-action spaces due to issues of
function approximation of value functions, and, moreover,
that the computational complexity and time of exploring high
dimensional state-action spaces quickly exceeds practical feasibility.
As an alternative, researchers have turned to trajectory-based
reinforcement learning, which sacrifices global optimality
in favor of being applicable to high-dimensional state-action
spaces. Model-based approaches, inspired by ideas of differential
dynamic programming, have demonstrated some success if models
are accurate, but model-free trajectory-based reinforcement
learning has been limited by problems of slow learning and the
need to tune many open parameters.
In this paper, we review some recent developments of
trajectory-based reinforcement learning using the framework of
stochastic optimal control with path integrals. The path integral
control approach transforms the optimal control problem into
an estimation problem based on Monte-Carlo evaluations of a
path integral. Based on this idea, a new reinforcement learning
algorithm can be derived, called Policy Improvement with
Path Integrals (PI2). PI2 is surprisingly simple and works as
a black box learning system, i.e., without the need for manual
parameter tuning. Moreover, it learns fast and efficiently in very
high dimensional problems, as we demonstrate in a variety of
robotic tasks. Interestingly, PI2 can be applied in model-free,
hybrid, and model-based scenarios. Given its solid foundation in
stochastic optimal control, path integral reinforcement learning
offers a wide range of applications of reinforcement learning
to very complex and new domains.
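
To make the abstract's central idea a bit more concrete (turning a control problem into a Monte-Carlo estimation problem), here is a minimal sketch in the spirit of PI2. It is my own toy illustration, not the algorithm as given in the paper: it uses a simplified, episode-level form of cost-weighted averaging over noisy rollouts, with no time-dependent weighting and no dynamic movement primitives. The names `rollout_cost`, `pi2_style_update`, and `optimize`, the toy quadratic cost, and the values of `sigma` and `h` are all assumptions chosen for illustration.

```python
import math
import random


def rollout_cost(theta):
    """Hypothetical toy cost: squared distance of the parameters from a fixed target."""
    target = [1.0, -2.0, 0.5]
    return sum((t - g) ** 2 for t, g in zip(theta, target))


def pi2_style_update(theta, cost_fn, n_rollouts=20, sigma=0.3, h=10.0, rng=random):
    """One cost-weighted averaging step over noisy rollouts (simplified sketch)."""
    # Sample Gaussian exploration noise for each rollout.
    noises = [[rng.gauss(0.0, sigma) for _ in theta] for _ in range(n_rollouts)]
    # Evaluate the cost of each perturbed parameter vector (Monte-Carlo rollouts).
    costs = [cost_fn([t + e for t, e in zip(theta, eps)]) for eps in noises]
    # Normalize costs to [0, 1] and exponentiate: low cost gives a large weight.
    c_min, c_max = min(costs), max(costs)
    spread = (c_max - c_min) or 1.0
    weights = [math.exp(-h * (c - c_min) / spread) for c in costs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # The update is the probability-weighted average of the exploration noise.
    delta = [sum(p * eps[i] for p, eps in zip(probs, noises))
             for i in range(len(theta))]
    return [t + d for t, d in zip(theta, delta)]


def optimize(theta0, cost_fn, n_updates=200, seed=0):
    """Repeat the update from a fixed seed so that runs are reproducible."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(n_updates):
        theta = pi2_style_update(theta, cost_fn, rng=rng)
    return theta
```

For example, `optimize([0.0, 0.0, 0.0], rollout_cost)` should end up near the toy target. Note that no gradient of the cost is ever computed: the update only needs sampled costs, which is what lets this family of methods be used model-free.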
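- Related replies
🙂A close look at the nervous systems of mosquitoes and flies: extremely efficient meta-learning machines 1 鸿乾 940 characters 2013-05-10 18:19:38
🙂Continuing the speculation: the relationship between the meta-learning machine and 韩盾's Autoencoder 鸿乾 1069 characters 2013-05-06 10:52:42
🙂Found the Google paper on human faces and cat faces; the link is here 鸿乾 287 characters 2013-05-09 10:29:27
🙂Path integrals vs. machine learning
🙂I think you are not focusing on scientific intuition or counter-intuition 4 益者三友 1548 characters 2013-05-04 14:13:16
🙂Burst out laughing 川普 291 characters 2013-05-07 21:54:42
🙂A flower first, then a careful analysis: the hope is precisely to get closer to the truth about the brain's internal circuits through discussion 3 鸿乾 1397 characters 2013-05-06 09:28:37
🙂Could one think about it this way 1 川普 742 characters 2013-05-07 22:21:33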