Prompt engineering

Prompt engineering is a concept in artificial intelligence, particularly natural language processing (NLP). In prompt engineering, the description of the task is embedded in the input, e.g., as a question, instead of being given implicitly. Prompt engineering typically works by converting one or more tasks to a prompt-based dataset and training a language model with what has been called "prompt-based learning" or just "prompt learning".[1][2] Prompt engineering may also start from a large "frozen" pretrained language model, where only the representation of the prompt is learned, with what has been called "prefix-tuning" or "prompt tuning".[3][4]
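As an illustration of converting a task into a prompt-based dataset, the following minimal Python sketch rewrites labelled sentiment examples as pairs of prompt text and target word. The template wording, the label words, and the names `Example`, `to_prompt`, and `build_prompt_dataset` are assumptions made for this example and do not come from the cited works.

```python
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    label: int  # assumed label scheme: 0 = negative, 1 = positive


# Hypothetical template: the task description is written into the input itself.
TEMPLATE = "Review: {text}\nIs this review positive or negative?"

# Hypothetical mapping from class ids to the words the model should produce.
LABEL_WORDS = {0: "negative", 1: "positive"}


def to_prompt(example: Example) -> tuple[str, str]:
    """Turn one labelled example into a (prompt, target) text pair."""
    prompt = TEMPLATE.format(text=example.text)
    target = LABEL_WORDS[example.label]
    return prompt, target


def build_prompt_dataset(examples: list[Example]) -> list[tuple[str, str]]:
    """Convert a whole classification dataset into prompt-based form."""
    return [to_prompt(e) for e in examples]


if __name__ == "__main__":
    data = [Example("Great film.", 1), Example("Dull and slow.", 0)]
    for prompt, target in build_prompt_dataset(data):
        print(prompt, "->", target)
```

The resulting text pairs could then be used to train or query a language model; that step is outside the scope of this sketch.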

The GPT-2 and GPT-3 language models[5] were important steps in prompt engineering. In 2021, multitask prompt engineering using multiple NLP datasets showed good performance on tasks not seen during training.[6] Prompts that include a chain of thought elicit signs of reasoning in language models.[7] Adding "Let's think step by step" to the prompt may improve the performance of a language model on multi-step reasoning problems.[8]
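The two prompting styles mentioned here can be sketched as plain string construction in Python. This is only an illustration: the "Q:"/"A:" layout, the "The answer is" suffix, and the function names are assumptions for this example, and no language model is called.

```python
# Trigger phrase quoted in the text above.
STEP_BY_STEP = "Let's think step by step."


def zero_shot_cot_prompt(question: str) -> str:
    """Append the step-by-step trigger phrase to a single question."""
    return f"Q: {question.strip()}\nA: {STEP_BY_STEP}"


def few_shot_cot_prompt(demos: list[tuple[str, str, str]], question: str) -> str:
    """Prefix the question with worked examples that spell out their reasoning.

    Each demo is a (question, reasoning, answer) triple.
    """
    parts = [f"Q: {q}\nA: {reasoning} The answer is {answer}."
             for q, reasoning, answer in demos]
    parts.append(f"Q: {question.strip()}\nA:")  # the model continues from here
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(zero_shot_cot_prompt("What is 2+3?"))
    print()
    demos = [("What is 1+1?", "1 plus 1 is 2.", "2")]
    print(few_shot_cot_prompt(demos, "What is 2+2?"))
```

In the first style the model receives no worked examples, only the trigger phrase; in the second, the demonstrations show the kind of intermediate reasoning the model is expected to imitate.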

A description of PromptSource, a development environment and repository for natural language prompts, reported that over 2,000 public prompts for around 170 datasets were available in February 2022.[9]


  1. Alec Radford; Jeffrey Wu; Rewon Child; David Luan; Dario Amodei; Ilya Sutskever (2019). "Language Models are Unsupervised Multitask Learners". Wikidata Q95726769.
  2. Pengfei Liu; Weizhe Yuan; Jinlan Fu; Zhengbao Jiang; Hiroaki Hayashi; Graham Neubig (28 July 2021). "Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing". arXiv:2107.13586. Wikidata Q109286554.
  3. Xiang Lisa Li; Percy Liang (August 2021). "Prefix-Tuning: Optimizing Continuous Prompts for Generation". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers): 4582–4597. doi:10.18653/V1/2021.ACL-LONG.353. Wikidata Q110887424.
  4. Brian Lester; Rami Al-Rfou; Noah Constant (November 2021). "The Power of Scale for Parameter-Efficient Prompt Tuning". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: 3045–3059. arXiv:2104.08691. doi:10.18653/V1/2021.EMNLP-MAIN.243. Wikidata Q110887400.
  5. Tom B. Brown; Benjamin Mann; Nick Ryder; et al. (28 May 2020). "Language Models are Few-Shot Learners". arXiv. arXiv:2005.14165. doi:10.48550/ARXIV.2005.14165. ISSN 2331-8422. Wikidata Q95727440.
  6. Victor Sanh; Albert Webson; Colin Raffel; et al. (15 October 2021). "Multitask Prompted Training Enables Zero-Shot Task Generalization". arXiv:2110.08207. Wikidata Q108941092.
  7. Jason Wei; Xuezhi Wang; Dale Schuurmans; Maarten Bosma; Ed Chi; Quoc Viet Le; Denny Zhou (28 January 2022). "Chain of Thought Prompting Elicits Reasoning in Large Language Models". arXiv:2201.11903. doi:10.48550/ARXIV.2201.11903. Wikidata Q111971110.
  8. Takeshi Kojima; Shixiang Shane Gu; Machel Reid; Yutaka Matsuo; Yusuke Iwasawa (24 May 2022). "Large Language Models are Zero-Shot Reasoners". arXiv:2205.11916. doi:10.48550/ARXIV.2205.11916. Wikidata Q112124882.
  9. Stephen H. Bach; Victor Sanh; Zheng-Xin Yong; et al. (2 February 2022). "PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts". arXiv:2202.01279. Wikidata Q110839490.