Prompt for data

May 16, 2026

Extracting quantitative information with GenAI tools requires carefully structured prompts that make efficient use of the underlying large language models (LLMs).

The Shift from Creative to Analytical

For the past two years, the conversation around Generative AI has been dominated by its creative prowess: generating poetry, drafting marketing emails, and summarizing meeting transcripts. However, the most significant business value of Large Language Models (LLMs) lies in their ability to act as sophisticated data parsers.

When we ask an LLM to “analyze this report,” we often get a conversational summary. To move beyond GenAI as a creative assistant, we must learn to use it as a data engine. This requires a fundamental shift in how we structure our requests.

The “Prompt for Data” Framework

To transform an LLM into a reliable quantitative analyst, your prompts need to transition from natural language conversation to structured instruction. Here are three pillars for success:

1. Define the Schema (The “Output Constraint”)
LLMs are statistically inclined to “chat.” To stop this, you must explicitly define the output structure. Do not just ask for information; provide a template.

Weak Prompt: “What are the key trends in this data?”
Strong Prompt: “Extract the quarterly revenue figures from this text. Output your findings strictly in JSON format with keys for ‘Quarter’, ‘Revenue_USD’, and ‘Growth_Percentage’. Do not include preamble or conversational text.”
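A minimal sketch of how such an output constraint might be enforced in code, assuming the model's reply arrives as a plain string. The helper names (`build_extraction_prompt`, `parse_extraction`), the expected value types, and the sample reply are hypothetical; only the three keys come from the strong prompt above.

```python
import json

# Keys taken from the strong prompt; the expected Python types are an assumption.
SCHEMA = {
    "Quarter": str,
    "Revenue_USD": (int, float),
    "Growth_Percentage": (int, float),
}


def build_extraction_prompt(text: str) -> str:
    """Assemble a prompt that states the output schema explicitly."""
    keys = ", ".join(f"'{k}'" for k in SCHEMA)
    return (
        "Extract the quarterly revenue figures from the text below. "
        f"Output strictly a JSON array of objects with keys {keys}. "
        "Do not include preamble or conversational text.\n\n"
        f"Text:\n{text}"
    )


def parse_extraction(reply: str) -> list[dict]:
    """Parse the model reply and reject anything that breaks the schema."""
    rows = json.loads(reply)  # raises ValueError on chatty, non-JSON replies
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array")
    for row in rows:
        if not isinstance(row, dict) or set(row) != set(SCHEMA):
            raise ValueError(f"unexpected structure: {row!r}")
        for key, expected in SCHEMA.items():
            value = row[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                raise ValueError(f"wrong type for {key}: {value!r}")
    return rows


# Hypothetical reply, hard-coded here in place of a real model call.
sample_reply = '[{"Quarter": "Q1 2025", "Revenue_USD": 1200000, "Growth_Percentage": 4.5}]'
rows = parse_extraction(sample_reply)
```

Validating the reply in code, rather than trusting the instruction alone, means a conversational answer fails loudly instead of slipping into downstream analysis.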

2. Implement Few-Shot Prompting
Models generally follow a required format more reliably when given examples. If you need a specific kind of extraction (e.g., sentiment scores or particular dates), include 2–3 example input-output pairs in the prompt itself. This anchors the model’s “reasoning” to the format you require.
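As an illustration, a few-shot prompt for date extraction could be assembled along these lines. The example pairs, the `date` key, and the function name `build_few_shot_prompt` are all hypothetical, chosen only to show the input-output pairing.

```python
# Hypothetical input-output pairs; in practice these would be hand-checked samples.
EXAMPLES = [
    ("The invoice was issued on 3 March 2024.", '{"date": "2024-03-03"}'),
    ("Payment is due by 15 April 2024.", '{"date": "2024-04-15"}'),
]


def build_few_shot_prompt(examples: list[tuple[str, str]], new_input: str) -> str:
    """Place worked examples before the new input so the model copies their format."""
    parts = ["Extract the date from each input as JSON with the key 'date' (ISO 8601)."]
    for source, expected in examples:
        parts.append(f"Input: {source}\nOutput: {expected}")
    # End on an open "Output:" so the model completes the pattern.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt(EXAMPLES, "The contract was signed on 9 May 2025.")
print(prompt)
```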

3. Use Chain-of-Thought for Verification
When extracting complex quantitative data, ask the model to “show its work.” Instructing the model to quote the specific sentence or data point behind each value enables human-in-the-loop verification, which makes hallucinated figures much easier to catch.
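Part of that verification can be automated before a human looks at the results. The sketch below assumes the model was asked to return, for each value, the exact sentence it relied on; the field names (`value`, `evidence`) and the sample data are hypothetical.

```python
import re


def verify_evidence(source_text: str, items: list[dict]) -> list[dict]:
    """Mark each item as verified only if its quote is in the source and contains the value."""
    checked = []
    for item in items:
        quote = item["evidence"]
        quote_found = quote in source_text
        # Collect every number in the quote, ignoring thousands separators.
        numbers = {
            float(n.replace(",", ""))
            for n in re.findall(r"\d[\d,]*(?:\.\d+)?", quote)
        }
        value_found = float(item["value"]) in numbers
        checked.append({**item, "verified": quote_found and value_found})
    return checked


source = "Revenue reached 4.2 million USD in Q1. Headcount grew to 310 employees."

# Hypothetical model output: the second quote does not appear in the source.
items = [
    {"field": "revenue_musd", "value": 4.2, "evidence": "Revenue reached 4.2 million USD in Q1."},
    {"field": "headcount", "value": 350, "evidence": "Headcount grew to 350 employees."},
]

results = verify_evidence(source, items)
needs_review = [r["field"] for r in results if not r["verified"]]
```

An exact substring match is deliberately strict: a paraphrased quote is sent to a reviewer rather than accepted.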

The Reliability Paradox

As noted in recent research, LLMs are non-deterministic: the same prompt can yield different results from one run to the next. However, by grounding the model with tools like Retrieval-Augmented Generation (RAG) and by prompting for structured outputs (such as CSV, JSON, or YAML), we can reduce this variability.
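Structured output also makes run-to-run variation measurable. The following is an illustrative sketch, not a method taken from the research mentioned above: it assumes the same prompt was sent several times and that each reply is a JSON object. The function name `consensus` and the sample replies are hypothetical.

```python
import json
from collections import Counter


def consensus(replies: list[str]) -> tuple[dict, float]:
    """Return the most frequent parsed reply and the share of runs that agree with it."""
    # Re-serialising with sorted keys makes key order and whitespace irrelevant.
    canonical = [json.dumps(json.loads(r), sort_keys=True) for r in replies]
    winner, count = Counter(canonical).most_common(1)[0]
    return json.loads(winner), count / len(canonical)


# Hypothetical replies from three runs of the same prompt.
replies = [
    '{"Quarter": "Q1", "Revenue_USD": 1200000}',
    '{"Revenue_USD": 1200000, "Quarter": "Q1"}',
    '{"Quarter": "Q1", "Revenue_USD": 1250000}',
]

value, agreement = consensus(replies)
```

A low agreement share is a signal to route the item to manual review rather than accept the majority answer.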

The future of GenAI adoption isn’t just better chat interfaces—it’s the development of “Data Pipelines” where LLMs act as the intelligent nodes that transform unstructured noise into structured, actionable intelligence.
