Extract Model Names from Papers

Background

The following prompt tests an LLM's ability to perform an information extraction task: identifying the model names mentioned in machine learning paper abstracts. This is a common step in research analysis and literature review workflows.

Task Description

The goal is to identify and extract specific AI/ML model names mentioned in research paper abstracts, returning them in a structured format for further analysis or processing.

Prompt Structure

Base Prompt

Your task is to extract model names from machine learning paper abstracts. Your response is an array of the model names in the format ["model_name"]. If you don't find model names in the abstract or you are not sure, return ["NA"]

Abstract: {input}
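
In code, the {input} placeholder is typically filled with the abstract text before the request is sent. A minimal sketch, assuming a simple string template (the variable names are illustrative):

python
# Prompt template; {input} is replaced with the abstract text at request time.
PROMPT_TEMPLATE = (
    "Your task is to extract model names from machine learning paper abstracts. "
    'Your response is an array of the model names in the format ["model_name"]. '
    "If you don't find model names in the abstract or you are not sure, return [\"NA\"]\n\n"
    "Abstract: {input}"
)

abstract = "Large Language Models (LLMs), such as ChatGPT and GPT-4, ..."
prompt = PROMPT_TEMPLATE.format(input=abstract)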

Example Input

Abstract: Large Language Models (LLMs), such as ChatGPT and GPT-4, have revolutionized natural language processing research and demonstrated potential in Artificial General Intelligence (AGI). However, the expensive training and deployment of LLMs present challenges to transparent and open academic research. To address these issues, this project open-sources the Chinese LLaMA and Alpaca…

Expected Output

For the example above, the expected output would be:

["ChatGPT", "GPT-4", "LLaMA", "Alpaca"]

Implementation

Code Example

python
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Send the extraction prompt with the example abstract appended after "Abstract:".
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Your task is to extract model names from machine learning paper abstracts. Your response is an array of the model names in the format [\"model_name\"]. If you don't find model names in the abstract or you are not sure, return [\"NA\"]\n\nAbstract: Large Language Models (LLMs), such as ChatGPT and GPT-4, have revolutionized natural language processing research and demonstrated potential in Artificial General Intelligence (AGI). However, the expensive training and deployment of LLMs present challenges to transparent and open academic research. To address these issues, this project open-sources the Chinese LLaMA and Alpaca…"
        }
    ],
    temperature=1,
    max_tokens=250,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0
)
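
The model's reply is returned on the first choice of the response object; a minimal way to read it, assuming the request above succeeded:

python
# The generated text lives on the first (and only) choice.
raw_output = response.choices[0].message.content
print(raw_output)  # e.g. ["ChatGPT", "GPT-4", "LLaMA", "Alpaca"]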

API Parameters

Parameter          Value   Description
model              gpt-4   The language model used for the request
temperature        1       Controls randomness in generation
max_tokens         250     Maximum number of tokens in the response
top_p              1       Nucleus sampling parameter
frequency_penalty  0       No penalty applied to frequently repeated tokens
presence_penalty   0       No penalty applied to tokens that have already appeared

Key Features

1. Structured Output Format

  • Returns results in a consistent array format
  • Handles cases where no models are found with ["NA"]
  • Maintains consistent formatting for easy parsing

2. Robust Error Handling

  • Returns ["NA"] when uncertain or no models found
  • Reduces the chance of malformed or invalid outputs (a client-side parsing sketch follows this list)
  • Ensures consistent response structure

3. Clear Task Definition

  • Explicit instructions for the extraction task
  • Specific output format requirements
  • Clear handling of edge cases
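
Features 1 and 2 can also be reinforced on the client side by parsing the reply defensively before using it. A minimal, illustrative sketch (the helper name parse_model_names is an assumption, not part of any API):

python
import json

def parse_model_names(raw_output: str) -> list[str]:
    """Parse the model's reply into a list of names, falling back to ["NA"]."""
    try:
        names = json.loads(raw_output)
    except json.JSONDecodeError:
        # The reply was not a valid JSON-style array; treat it as "no models found".
        return ["NA"]
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        return ["NA"]
    return names

# parse_model_names('["ChatGPT", "GPT-4"]') -> ["ChatGPT", "GPT-4"]
# parse_model_names('no models mentioned')  -> ["NA"]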

Use Cases

  • Research Literature Review: Automating model identification in papers
  • Model Comparison Studies: Collecting model mentions across research
  • Academic Database Building: Creating structured datasets of model references
  • Research Trend Analysis: Tracking model popularity over time

Best Practices

  1. Clear Formatting: Use consistent array notation
  2. Error Handling: Always handle cases where no models are found
  3. Validation: Verify extracted names against known model databases
  4. Iterative Refinement: Adjust prompts based on output quality
  5. Batch Processing: Process multiple abstracts in one run for efficiency (see the sketch after this list)
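
As a rough illustration of practices 3 and 5, the self-contained sketch below loops over a batch of abstracts and flags any extracted name that does not appear in a small, user-maintained set of known models. The extract_models helper and the KNOWN_MODELS set are assumptions for illustration, not a reference implementation:

python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical reference set used to validate extracted names (practice 3).
KNOWN_MODELS = {"ChatGPT", "GPT-4", "LLaMA", "Alpaca", "BERT", "T5"}

PROMPT = (
    "Your task is to extract model names from machine learning paper abstracts. "
    'Your response is an array of the model names in the format ["model_name"]. '
    "If you don't find model names in the abstract or you are not sure, return [\"NA\"]\n\n"
    "Abstract: {input}"
)

def extract_models(abstract: str) -> list[str]:
    """Run the extraction prompt on one abstract and parse the reply."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PROMPT.format(input=abstract)}],
        temperature=1,
        max_tokens=250,
    )
    try:
        return json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        return ["NA"]

# Practice 5: process several abstracts in one loop.
abstracts = ["...abstract 1...", "...abstract 2..."]
for abstract in abstracts:
    for name in extract_models(abstract):
        if name != "NA" and name not in KNOWN_MODELS:
            print(f"Name to verify manually: {name}")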

Expected Challenges

  • Model Name Variations: The same model may be referred to in different ways (a simple normalization sketch follows this list)
  • Abbreviations: Models mentioned with acronyms
  • Version Numbers: Different versions of the same model
  • Ambiguous References: Unclear model mentions
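
One lightweight way to cope with name variations, abbreviations, and version suffixes is to canonicalize extracted names against an alias map before counting or comparing them. The aliases below are illustrative assumptions only:

python
# Hypothetical alias map collapsing common variants onto one canonical name.
ALIASES = {
    "gpt4": "GPT-4",
    "gpt-4": "GPT-4",
    "chat-gpt": "ChatGPT",
    "llama-7b": "LLaMA",
}

def canonicalize(name: str) -> str:
    """Map a raw extracted name onto its canonical form when a known alias matches."""
    return ALIASES.get(name.strip().lower(), name)

# canonicalize("GPT4") -> "GPT-4"; canonicalize("Llama-7B") -> "LLaMA"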

Reference

Source: Prompt Engineering Guide (16 March 2023)