RL|day4-isaaclab
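Reinforcement Learning|Study Notes
1 min read
(2 min read total)
RL|day1-sim environment setup
1 min read
RL|day2-Basic Concepts and the Bellman Equation
2 min read
RL|day3-Bellman Equation and Bellman Optimality Equation
1 min read
RL|day4-isaaclab
1 min read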
RL|day4-isaaclab
yifan
January 18, 2026
1 min read
study
Previous subpost
No older subpost!
Parent post
Reinforcement Learning|Study Notes
Next subpost
RL|day3-Bellman Equation and Bellman Optimality Equation
Course Overview
Series Articles
🎯 RL Basics
🚀 Classic Algorithms
🔬 Deep Reinforcement Learning
Learning Path
Subposts
Chapter 1: Reinforcement Learning Basic Concepts
1. Basic Elements: Grid World as an Example
1.1 State
1.2 Action
1.3 State Transition
1.4 Policy
1.5 Reward
2. Interaction Process and Evaluation
2.1 Trajectory
2.2 Return
2.3 Episode
3. Markov Decision Process (MDP)
3.1 Sets
3.2 Dynamics / Model
3.3 Policy
3.4 Markov Property
4. MDP vs MP
RL-004