Search Results for author: Karthik Valmeekam

Found 9 papers, 2 papers with code

Chain of Thoughtlessness: An Analysis of CoT in Planning

no code implementations • 8 May 2024 • Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution.

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

no code implementations • 12 Feb 2024 • Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples--ranging from multiplication to simple planning--there persists a widespread belief that LLMs can self-critique and improve their own solutions in an iterative fashion.

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

no code implementations • 2 Feb 2024 • Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Kaya Stechly, Mudit Verma, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers.

Can Large Language Models Really Improve by Self-critiquing Their Own Plans?

no code implementations • 12 Oct 2023 • Karthik Valmeekam, Matthew Marquez, Subbarao Kambhampati

We evaluate a planning system that employs LLMs for both plan generation and verification.

On the Planning Abilities of Large Language Models: A Critical Investigation

2 code implementations • 25 May 2023 • Karthik Valmeekam, Matthew Marquez, Sarath Sreedharan, Subbarao Kambhampati

We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs in LLM-Modulo settings where they act as a source of heuristic guidance for external planners and verifiers.

On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark)

no code implementations • 13 Feb 2023 • Karthik Valmeekam, Sarath Sreedharan, Matthew Marquez, Alberto Olmo, Subbarao Kambhampati

On this benchmark, we evaluate LLMs in three modes: autonomous, heuristic and human-in-the-loop.

Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

no code implementations • 28 Oct 2022 • Lin Guan, Karthik Valmeekam, Subbarao Kambhampati

We propose two practical methods that can learn to model any kind of behavioral attributes from ordered behavior clips.

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

2 code implementations • NeurIPS 2023 • Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati

PlanBench provides sufficient diversity in both the task domains and the specific planning capabilities.

Common Sense Reasoning, World Knowledge

RADAR-X: An Interactive Mixed Initiative Planning Interface Pairing Contrastive Explanations and Revised Plan Suggestions

no code implementations • 19 Nov 2020 • Karthik Valmeekam, Sarath Sreedharan, Sailik Sengupta, Subbarao Kambhampati

Decision support systems seek to enable informed decision-making.

Decision Making
