Dave Troy @davetroy

**The Internet is Crack** @theinternetiscrack@mastodon.social · 6d

The Internet is Crack @theinternetiscrack@mastodon.social

Artificial Intelligence: What It Really Is (and Isn’t)

We had the pleasure of speaking with Prof. Michael Littman (Brown University) about the fundamentals of AI and reinforcement learning. We explore whether today’s AI systems are truly “intelligent”—or just faking it impressively well.

Listen here: https://youtu.be/N3TpwsMVeRg

#ArtificialIntelligence #AI #ReinforcementLearning

**Georg Weissenbacher** @GeorgWeissenbacher@fediscience.org · Jul 8

Jul 8

Georg Weissenbacher @GeorgWeissenbacher@fediscience.org

A nice #arstechnica article on #ReinforcementLearning in #LLMs

https://arstechnica.com/ai/2025/07/how-a-big-shift-in-training-llms-led-to-a-capability-explosion/

Ars Technica · Jul 7How a big shift in training LLMs led to a capability explosionBy Timothy B. Lee

**Ars Technica News** @arstechnica@c.im · Jul 7

Jul 7

Ars Technica News @arstechnica@c.im

How a big shift in training LLMs led to a capability explosion https://arstechni.ca/FerG #reinforcementlearning #imitationlearning #explainers #Features #AI

Ars Technica · Jul 7How a big shift in training LLMs led to a capability explosionBy Timothy B. Lee

**Erik Jonker** @ErikJonker@mastodon.social · Jul 7

Jul 7

Erik Jonker @ErikJonker@mastodon.social

Good article how reinforcement learning improved current AI models. Also illustrates that LLMs today are not just imitating.
https://arstechnica.com/ai/2025/07/how-a-big-shift-in-training-llms-led-to-a-capability-explosion/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social
#AI #reinforcementlearning

Ars Technica · Jul 7How a big shift in training LLMs led to a capability explosionBy Timothy B. Lee

**Assn for Computing Machinery** @ACM@mastodon.acm.org · Jun 6

Jun 6

Assn for Computing Machinery @ACM@mastodon.acm.org

"Intelligence is figuring out how the world works rather than waiting for someone to tell you how the world works."

Join us as we hear from Andrew Barto and Richard Sutton, the 2024 #ACMTuringAward recipients as they discuss their work on #ReinforcementLearning.

https://vimeo.com/1085726612

Vimeo2024 ACM A.M. Turing Award, Andrew Barto and Richard SuttonBy CACM

**Python Weekly** @python_discussions@mastodon.social · Jun 6

Jun 6

Python Weekly @python_discussions@mastodon.social

This Python class offers a multiprocessing-powered Pool for efficiently collecting and managing experience replay data in reinforcement learning.

https://github.com/NoteDance/Pool

Discussions: https://discu.eu/q/https://github.com/NoteDance/Pool

GitHubGitHub - NoteDance/Pool: reinforcement learning, deep reinforcement learning, RLreinforcement learning, deep reinforcement learning, RL - NoteDance/Pool

#programming #python #reinforcementlearning

**Dr. Carlotta A. Berry, PhD** @drcaberry@blacktwitter.io · May 26

May 26

Dr. Carlotta A. Berry, PhD @drcaberry@blacktwitter.io

#BlackInRobotics workshop series #ROS #ROS2 #Robot #Robotics #STEM #STEAM #BlackSTEM #BlackSTEAM #Drone #ComputerVision #Drones #AI #MachineLearning #Neuralnetworks #ReinforcementLearning #Learning

**Dr. Carlotta A. Berry, PhD** @drcaberry@blacktwitter.io · May 26

May 26

Dr. Carlotta A. Berry, PhD @drcaberry@blacktwitter.io

#BlackInRobotics workshop series #ROS #ROS2 #Robot #Robotics #STEM #STEAM #BlackSTEM #BlackSTEAM #Drone #ComputerVision #Drones #AI #MachineLearning #Neuralnetworks #ReinforcementLearning #Learning

**LavX News** @lavxnews@ioc.exchange · May 11

May 11

LavX News @lavxnews@ioc.exchange

AI Progress: Understanding the Myths Behind the Intelligence Explosion

As the AI landscape evolves, the concept of an 'intelligence explosion' raises questions about the future of AI development. This article delves into the current state of AI, addressing misconceptions...

https://news.lavx.hu/article/ai-progress-understanding-the-myths-behind-the-intelligence-explosion

#news #tech #DataScience

**trndgtr.com** @trndgtr@mastodon.social · Apr 27

Apr 27

trndgtr.com @trndgtr@mastodon.social

Training AI to Persuade? - Jeremie & Edouard Harris on JRE

#reinforcementlearning #ai #aipersuasion

**Victoria Stuart** @persagen@mastodon.social · Apr 22 *

Apr 22 *

Victoria Stuart @persagen@mastodon.social

[AGI discussion, DeepMind] Welcome to the Era of Experience
https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf
https://old.reddit.com/r/MachineLearning/comments/1k4zr1i/r_deepmind_welcome_to_the_era_of_experience

* threshold of new era in AI that promises unprecedented level of ability
* new generation of agents will acquire superhuman capabilities, learning predominantly f. experience
* paradigm shift, accompanied by algorithmic advancements in RL, will unlock new supra-human capabilities

#Google #DeepMind #AI

**Antonio Lieto** @antoniolieto@fediscience.org · Apr 1

Apr 1

Antonio Lieto @antoniolieto@fediscience.org

Happy birthday to Cognitive Design for Artificial Minds (https://lnkd.in/gZtzwDn3) that was released 4 years ago!

Since then its ideas have been presented and discussed widely in the research fields of AI/Cognitive Science/Robotics and - nowadays - both the possibilities and the limitations of: #LLMs, #GenerativeAI and #ReinforcementLearning (already envisioned and discussed in the book) have become a common topic of research interests in the AI community and beyond.
Similarly also the topic concerning the evaluation - in human-like and human-level terms - of the current AI systems has become a critical theme related to the problem Anthropomorphic interpretation of AI output (see e.g. https://lnkd.in/dVi9Qf_k ).
Book reviews have been published on ACM Computing Reviews (2021) https://lnkd.in/dWQpJdkV and on Argumenta (2023): https://lnkd.in/derH3VKN

I have been invited to present the content of the book in over 20 official scientific events in international conferences, Ph.D Schools in US, China, Japan, Finland, Germany, Sweden, France, Brazil, Poland, Austria and, of course, Italy.

A news I am happy to share is that Routledge/Taylor & Francis contacted me few weeks ago for a second edition! Stay tuned!

The #book is available in many webstores:
- Routledge: https://lnkd.in/dPrC26p
- Taylor & Francis: https://lnkd.in/dprVF2w
- Amazon: https://lnkd.in/dC8rEzPi

@academicchatter @cognition
#AI #minimalcognitivegrid #CognitiveAI #cognitivescience #cognitivesystems

**Python Weekly** @python_discussions@mastodon.social · Mar 31

Mar 31

Python Weekly @python_discussions@mastodon.social

Implemented 18 RL Algorithms in a Simpler Way

https://github.com/FareedKhan-dev/all-rl-algorithms

Discussions: https://discu.eu/q/https://github.com/FareedKhan-dev/all-rl-algorithms

GitHubGitHub - FareedKhan-dev/all-rl-algorithms: Implementation of all RL algorithms in a simpler wayImplementation of all RL algorithms in a simpler way - FareedKhan-dev/all-rl-algorithms

#programming #python #reinforcementlearning

**WetHat** @WetHat@fosstodon.org · Mar 18

Mar 18

WetHat @WetHat@fosstodon.org

The article provides good insights into industry leaders such as Waymo, DeepMind, and Amazon demonstrate the transformative power of Reinforcement Learning (RL).

Takeaways:
RL drives autonomy and innovation across industries, but challenges like interpretability remain pivotal.
Hybrid systems that blend RL and symbolic reasoning hint at breakthroughs in high-level decision-making.

https://www.computer.org/publications/tech-news/community-voices/reinforcement-learning-in-agentic-systems

IEEE Computer Society · Mar 6Reinforcement Learning in Agentic SystemsThis article explores the role of RL in agentic systems and showcase its transformative impact across industries.

#ReinforcementLearning #ArtificialIntelligence #AI

**nemo™** @nemo@mas.to · Mar 8

Mar 8

nemo™ @nemo@mas.to

Check out QwQ-32B! A 32B parameter model that rivals DeepSeek-R1 in performance, thanks to Reinforcement Learning. It excels in math, coding & general problem-solving. Open-weight on Hugging Face & ModelScope! Explore the future of AGI!
https://qwenlm.github.io/blog/qwq-32b/ #AI #ReinforcementLearning #MachineLearning #Qwen #OpenSource

Qwen · Mar 5QwQ-32B: Embracing the Power of Reinforcement LearningQWEN CHAT Hugging Face ModelScope DEMO DISCORD Scaling Reinforcement Learning (RL) has the potential to enhance model performance beyond conventional pretraining and post-training methods. Recent studies have demonstrated that RL can significantly improve the reasoning capabilities of models. For instance, DeepSeek R1 has achieved state-of-the-art performance by integrating cold-start data and multi-stage training, enabling deep thinking and complex reasoning. Our research explores the scalability of Reinforcement Learning (RL) and its impact on enhancing the intelligence of large language models.

**Dr. Anna Latour** @anna@mathstodon.xyz · Mar 4

Mar 4

Dr. Anna Latour @anna@mathstodon.xyz

My colleagues at TU Delft are seeking to hire a postdoc to work on Applied Planning and Scheduling under Uncertainty, with applications in modelling supply chain scenarios for offshore wind farm installation: https://careers.tudelft.nl/job/Delft-Postdoc-in-Applied-Planning-and-Scheduling-under-Uncertainty-2628-CD/814890902/

careers.tudelft.nlPostdoc in Applied Planning and Scheduling under UncertaintyPostdoc in Applied Planning and Scheduling under Uncertainty

#AcademicMastodon #PostdocLife #Hiring

**Jason Yip** @jchyip@mastodon.online · Feb 27

Feb 27

Jason Yip @jchyip@mastodon.online

#ReinforcementLearning Triples Spot’s Running Speed https://spectrum.ieee.org/ai-institute

IEEE Spectrum · Feb 21Reinforcement Learning Triples Spot’s Running SpeedBy Evan Ackerman

**Tero Keski-Valkama** @tero@rukii.net · Feb 13

Feb 13

Tero Keski-Valkama @tero@rukii.net

How to formulate exploration-exploitation trade-off better than all the hacks on top of Bellman equation?

We can first of all simply estimate the advantage of exploration by Monte-Carlo in a swarm setting: Pitting fully exploitative agents against fully exploitative agents which have the benefit of recent exploration. This can be easily done by lagging policy models.

Of course the advantage of exploration needs to be divided by the cost of exploration, which is linear to the number of agents used in the swarm to explore at a particular state.

Note that the advantage of exploration depends on the state of the agent, so we might want to define an explorative critic to estimate this.

What's beautiful in this formulation is that we can incorporate autoregressive #WorldModels naturally, as the exploitative agents only learn from rewards, but the explorative agents choose their actions in a way which maximizes the improvement of the auto-regressive World Model.

It brings these two concepts together as sides of the same coin.

Exploitation is reward-guided action, exploration is auto-regressive state transition model improvement guided action.

Balancing the two is a swarm dynamic which encourages branching where exploration has an expected value in reward terms. This can be estimated by computing the advantage of exploitative agents utilizing recent exploration versus agents which do not, and returning this advantage to the points of divergence between the two.

#mathematics #ReinforcementLearning #RL

**WetHat** @WetHat@fosstodon.org · Feb 13

Feb 13

WetHat @WetHat@fosstodon.org

DeepSeek R1: All you need to know

The article covers various aspects of the model, from its architecture to training methodologies and practical applications. The explanations are mostly clear and detailed, making complex concepts like Mixture of Experts (#MoE) and reinforcement learning easy to understand.

https://fireworks.ai/blog/deepseek-r1-deepdive

DeepSeek R1: All you need to know 🐳DeepSeek R1: All you need to know 🐳

#DeepSeekR1 #AI #MachineLearning

**Sean Patrick** @seanpatrickphd@wandering.shop · Feb 3

Feb 3

Sean Patrick @seanpatrickphd@wandering.shop

New instance, new #introduction!

I'm a #DataScientist with a background in #ReinforcementLearning and #ElectricalEngineering. Well, that's what my resume says, but really I'm a #poet and a SF/F #writer. I love to play #DnD and other #TTRPGs.

I use they/them pronouns, and "Dr." not "Mr.", please and thank you.

I maintain a blog at www.seanpatrick.phd which includes a current list of publications, including my debut sonnet collection, "Love, Death, and Other Surprises."

Recent searches

Search options

Administered by:

Server stats:

#reinforcementlearning