Improving Language Model Reasoning with Contrastive Chain-of-Thought Prompting

#promptengineering #chatgpt #largelanguagemodels
Abstract: Despite the success of chain of thought in enhancing language model reasoning, the underlying process remains less well understood. Although logically sound reasoning appears inherently crucial for chain of thought, prior studies surprisingly reveal minimal impact when using invalid demonstrations instead. Furthermore, the conventional chain of thought does not inform language models on what mistakes to avoid, which potentially leads to more errors. Hence, inspired by how humans can learn from both positive and negative examples, we propose contrastive chain of thought to enhance language model reasoning. Compared to the conventional chain of thought, our approach provides both valid and invalid reasoning demonstrations, to guide the model to reason step-by-step while reducing reasoning mistakes. To improve generalization, we introduce an automatic method to construct contrastive demonstrations. Our experiments on reasoning benchmarks demonstrate that contrastive chain of thought can serve as a general enhancement of chain-of-thought prompting.
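
To make the idea in the abstract concrete, here is a minimal sketch of how a contrastive prompt could be assembled: one demonstration carries both a valid and an invalid rationale, followed by the new question. Everything named here is an assumption for illustration, not the paper's code: the function names, the prompt labels ("Explanation", "Wrong explanation"), and the trick of building the invalid rationale by shuffling the numbers inside the valid one.

```python
import random
import re


def make_invalid_rationale(rationale: str, seed: int = 0) -> str:
    """Build a deliberately wrong rationale by permuting the numbers in a valid one.

    Illustrative heuristic only (an assumption), not necessarily the paper's procedure.
    """
    numbers = re.findall(r"\d+", rationale)
    if len(set(numbers)) < 2:
        return rationale  # fewer than two distinct numbers: nothing to permute
    rng = random.Random(seed)
    shuffled = numbers[:]
    while shuffled == numbers:  # reshuffle until the order actually changes
        rng.shuffle(shuffled)
    replacements = iter(shuffled)
    # Swap each number in the text for the next one from the shuffled list.
    return re.sub(r"\d+", lambda _match: next(replacements), rationale)


def build_contrastive_prompt(
    demo_question: str,
    valid_rationale: str,
    answer: str,
    new_question: str,
    seed: int = 0,
) -> str:
    """Assemble a one-shot prompt with a valid and an invalid demonstration."""
    invalid_rationale = make_invalid_rationale(valid_rationale, seed)
    return (
        f"Question: {demo_question}\n"
        f"Explanation: {valid_rationale}\n"
        f"Answer: {answer}\n"
        f"Wrong explanation: {invalid_rationale}\n"
        "\n"
        f"Question: {new_question}\n"
        "Explanation:"
    )


if __name__ == "__main__":
    prompt = build_contrastive_prompt(
        demo_question="James writes 3 pages to 2 friends twice a week. "
        "How many pages does he write in 52 weeks?",
        valid_rationale="He writes 3 * 2 * 2 = 12 pages a week, "
        "so over 52 weeks he writes 12 * 52 = 624 pages.",
        answer="624",
        new_question="A shop sells 4 boxes of 6 eggs. How many eggs is that?",
    )
    print(prompt)
```

The resulting string would be sent to a language model as-is; the model call is left out because it depends on the provider you use.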

⏩ Paper Title: Contrastive Chain-of-Thought Prompting
⏩ Authors: Yew Ken Chia, Guizhen Chen, Luu Anh Tuan, Soujanya Poria, Lidong Bing
⏩ Organisations: DAMO Academy, Alibaba Group, Singapore; Singapore University of Technology and Design; Nanyang Technological University, Singapore

⏩ IMPORTANT LINKS

Enjoy reading articles? Then consider subscribing to a Medium membership. It is just $5 a month for unlimited access to all free and paid content.

*********************************************

Tools I use for making videos :)

#techviz #datascienceguy #deeplearning #ai #transformers #summarisation #machinelearning

About Me:
I am Prakhar Mishra, and this channel is my passion project. I am currently pursuing an MS (by research) in Data Science. I have 4+ years of industry experience in Data Science and Machine Learning, with a particular focus on Natural Language Processing (NLP).