#reinforcementlearning
Posts
The Whole Paper Fits in One Sigmoid: Implementing the SDAR Gate
Shoaibali Mir
Jun 14
#machinelearning #reinforcementlearning #python #aws
1 comment
5 min read

Four Models in One Training Loop: Architecting SDAR on AWS (Before Renting a Single GPU)
Shoaibali Mir
Jun 6
#aws #machinelearning #reinforcementlearning #mlops
5 min read

How to Add Live Telemetry and Failure Diagnosis to Isaac Lab, MuJoCo, or Gazebo Training in Under 5 Minutes
SimTooReal
Jun 6
#ai #robotics #mujoco #reinforcementlearning
4 min read

Why robotics RL training pipelines fail at scale
Robosynx
May 30
#robotics #machinelearning #reinforcementlearning #simulation
4 min read

ARTIST: RL-Powered Tool Use for LLM Agents Explained
Jangwook Kim
May 27
#reinforcementlearning #llmagents #tooluse #agenticai
9 min read

Q-Learning for Games: Teaching an Agent Tic-Tac-Toe Through Self-Play
Berkan Sesen
May 11
#reinforcementlearning #gametheory
14 min read

Your RL Agent Failed a 12-Step Task. Which Step Was Wrong? (The Supervision Problem in Agentic RL)
Shoaibali Mir
May 31
#machinelearning #reinforcementlearning #llm #aws
2 comments
5 min read

Value Iteration vs Q-Learning: Dynamic Programming Meets RL
Berkan Sesen
May 4
#reinforcementlearning #optimisation #dynamicprogramming
12 min read

Solving CartPole Without Gradients: Simulated Annealing
Berkan Sesen
Apr 23
#reinforcementlearning #optimisation
13 min read

The Cross-Entropy Method: Solving RL Without Gradients
Berkan Sesen
Apr 21
#reinforcementlearning #optimisation
1 reaction
12 min read

Self-Learning AI Agents; Architectures and Challenges
Vishal Uttam Mane
Apr 21
#selflearningai #aiagents #agentarchitecture #reinforcementlearning
1 reaction
1 comment
3 min read

Policy Gradients: REINFORCE from Scratch with NumPy
Berkan Sesen
Apr 8
#reinforcementlearning #deeplearning #optimisation
16 min read

Deep Q-Networks: Experience Replay and Target Networks
Berkan Sesen
Apr 6
#reinforcementlearning #deeplearning #optimisation
18 min read

Q-Learning from Scratch: Navigating the Frozen Lake
Berkan Sesen
Apr 4
#reinforcementlearning #optimisation
11 min read

Evolution Is Back: A New Way to Fine‑Tune LLMs
Ankit Dey
May 4
#ai #reinforcementlearning #machinelearning #coding
1 reaction
7 min read