You Don't Have a Prompting Problem. You Have a Context Problem.
The advice is everywhere and it sounds like skill: learn to prompt. Use the right phrasing, add the role-play preamble, append "think step by step," and — when that stops working — pay for the bigger model. The implicit promise is that a mediocre answer is a wording failure. Say it better and the machine will finally understand you.
I want to argue something less flattering to the prompt-crafters and more useful to you. For the questions that actually matter in your work and life, the answer was never bottlenecked on phrasing. It was bottlenecked on context — on what the model knows about your situation at the moment you ask. A perfect prompt aimed at an empty context window produces a confident, generic, slightly-wrong answer. A blunt, ungrammatical prompt aimed at a rich context produces something that sounds like it actually knows your project. The lever everyone polishes is the smaller one.
This is a deep-dive on why context dominates phrasing, how to tell the two problems apart, and how to build a system that feeds the model the right context every time instead of retyping your life into a chat box. By the end you'll have a diagnostic, a framework for what counts as context, and a habit that quietly makes every model you use better.
TL;DR - A language model knows everything general and nothing specific to you. Prompting tunes how it draws on the general; context is the only channel for the specific. - For generic questions ("explain recursion"), prompting is the whole game. For your questions ("does this fit our roadmap?"), context is — and that's where the value is. - Steelman accepted: phrasing genuinely matters, and a bad prompt can sink a good context. But past a low bar, returns on better wording fall off fast while returns on better context keep climbing. - Diagnose with one test: would a smart stranger with your exact prompt and no other information give a good answer? If no, it's a context problem — and a fancier prompt won't fix it. - The durable fix isn't a prompt library. It's a retrievable knowledge base (your notes) you can feed the model on demand.
The consensus is half-right — start there
Let me steelman the prompt-engineering camp, because the weak version of my argument would just be wrong.
Phrasing does matter, and anyone who's watched a vague request produce mush knows it. "Write something about marketing" and "Write a 200-word cold email to a CFO, lead with a cost number, no exclamation points" are not the same instruction, and the second gets a far better result. Specifying the format, the audience, the length, and the constraints is real skill, and it's the difference between usable and useless on plenty of tasks. Good prompting habits — being specific, giving examples, stating the format you want — earn their keep, and I'd never tell someone to skip them. (If you want the tactical version, [seven prompting habits that get better answers](7-prompting-habits-that-get-better-answers.md) is exactly that.)
So concede the strong form of the consensus case:
- A genuinely bad prompt — ambiguous, no constraints, no format — will tank a good model. Phrasing has a floor you have to clear.
- For self-contained, general tasks, prompting is most of what you control. "Explain the Pomodoro technique" needs no context about you; it just needs to be asked clearly.
- The model can't read your mind, so some of the gap really is on you to close with words.
All true. My quarrel isn't with "phrasing matters." It's with the sneakier belief riding underneath it: that better phrasing is the main lever, and a disappointing answer means you haven't found the right words yet. For the questions worth asking, that's usually false — and chasing the wrong lever keeps you busy while the real one sits untouched.
What a model actually has — and what it's missing
Here's the move the prompt-craft framing skips. A large language model is trained on a vast slice of the public internet, so it holds an enormous amount of general knowledge — how recursion works, what a CFO cares about, the shape of a good cold email. What it has none of, at the start of any conversation, is anything specific to you: your project, your last decision, your customer, your constraints, the thing you wrote yesterday. (If that surprises you, it's worth understanding [how large language models work](how-large-language-models-work.md) — there's no memory of you in there, only patterns from training.)
Those are two different knowledge stores, and prompting and context map cleanly onto them:
| What it draws on | Tuned by | Ceiling | |
|---|---|---|---|
| Prompting | The model's general knowledge | Phrasing, format, examples, role | The model already knows it — you're aiming the cannon |
| Context | The model's specific knowledge of your case | What you put in the conversation | Empty by default — you're loading the cannon |
Read across that table and the asymmetry jumps out. Prompting aims knowledge the model already has. Context is the only way to give it knowledge it doesn't. No amount of clever phrasing can extract a fact about your situation that you never told it — the fact isn't in there to extract. This is why the same prompt that nails a textbook question flops on a question about your actual work: one needs only general knowledge, the other needs specifics the empty window doesn't contain.
And the channel for those specifics — the conversation, the pasted documents, the system instructions — is finite. It's called the [context window](what-is-a-context-window.md), and it's the real estate where your specifics live. Prompting decides how well the model uses what's in that window. Filling it with the right things decides whether the answer can be right at all.
The diagnostic: whose problem is it?
Before you reach for a better prompt, run one test. It separates the two failure modes in about five seconds.
The Stranger Test Imagine handing your exact prompt to a brilliant, well-read stranger who knows nothing about your specific situation. Could they give you a genuinely useful answer? - Yes → it's a prompting problem (or already fine). The information needed is general; sharpen the wording. - No — they'd have to ask you a dozen questions first → it's a context problem. Every one of those questions is a piece of context the model is also missing, and a fancier prompt won't conjure it.
Worked example. You ask: "What should I prioritize next quarter?" and get a generic listicle about "aligning with company goals." Frustrating. The prompt-craft instinct says rephrase it — add a role, add "be specific." But run the Stranger Test: could a brilliant stranger answer that for you? Obviously not — they don't know your goals, your team's capacity, what shipped last quarter, or what's on fire. Neither does the model. The fix was never the wording. The fix is pasting in your roadmap, last quarter's retro, and your current constraints — and then asking. Same question, now answerable.
| Symptom | Looks like | Usually is |
|---|---|---|
| Answer is generic / could apply to anyone | Weak prompt | Missing context about your specifics |
| Answer is well-structured but factually wrong about your case | Model "hallucinating" | Missing context — it filled the gap with a guess |
| Answer ignores a constraint you care about | Model not listening | Constraint was never in the window |
| Answer is rambling or wrong format | — | Genuine prompting problem; fix the wording |
Notice that three of the four "it's the model's fault" symptoms are context problems wearing a disguise. The model didn't fail to understand a good question; it answered a question you hadn't fully given it, and filled the blanks with plausible average-case filler. That filler is what people call hallucination — and a lot of it is just an empty context window doing its best.
Why "buy the bigger model" is the same mistake, scaled
When better prompts stop helping, the next reflex is to upgrade: a newer model, a longer context window, a higher tier. Sometimes that's right. Often it's the prompting mistake wearing a bigger price tag — you're improving how well the cannon aims while leaving it unloaded.
A bigger model has more and sharper general knowledge. It does not have your specifics any more than the small one did, because those were never a function of model size — they're a function of what's in the window. Upgrading turns a B+ generic answer into an A generic answer. If what you needed was an answer grounded in your actual project, you paid more for a more articulate version of the same miss.
The honest decision rule:
Upgrade the model when the task needs more general horsepower — harder reasoning, more obscure knowledge, longer coherent output. The bottleneck is the model's capability. Don't upgrade — fix context when the task needs your specifics and the model keeps guessing them. A bigger model guesses more fluently. It still guesses.
The caveat I owe you: the two interact. A longer context window genuinely lets you supply more context, so "bigger" can be the enabler of "better context." Fine. But that only proves the point — the value came from the context you loaded into the bigger window, not from the size itself. Size without the right contents is an empty mansion.
The fix isn't a prompt library. It's a retrievable second brain.
If context is the lever, the obvious tactic is "paste in more stuff." True, but it doesn't scale by hand — you can't retype your project history, your decisions, and your notes into every conversation, and you won't. The real fix is upstream: maintain a body of your own knowledge that's organized to be retrieved and fed to a model on demand. That's not a prompting skill. It's a note-taking skill.
This is why people who keep good notes quietly get better AI answers than people with better prompts. They're not more clever at phrasing — they have something to put in the window. Their decisions, their research, their distilled thinking already exist in a form they can pull up and hand over. (Building that store is its own discipline: [how to build a second brain](../ai-notetaking/how-to-build-a-second-brain.md) is the long version.)
The context checklist — before you blame the prompt - [ ] Did I include the specific facts of my case (project, constraints, what I've tried)? - [ ] Did I paste the actual document/data, not a paraphrase of it? - [ ] Did I state what "good" looks like for me, not in general? - [ ] Did I give an example of the output I want, drawn from my real situation? - [ ] Could I retrieve this context in ten seconds, or did I have to reconstruct it from memory?
That last item is the tell. If supplying context is slow and painful, you'll skip it under deadline and fall back to under-specified prompts — and then blame the model. A searchable, linked set of notes turns "reconstruct my situation from memory" into "pull it up and paste it," which is the difference between a habit you keep and one you abandon. And because the context you supply is now external and checkable, it's also far easier to [verify the answer](how-to-verify-an-ai-answer.md) against what you actually gave it.
Common mistakes
- Treating every weak answer as a wording failure. Most disappointing answers to real questions are starved of context, not poorly phrased. Run the Stranger Test before you rephrase.
- Polishing prompts past the point of return. Clear the phrasing bar — specific, formatted, with an example — then stop tuning words and start supplying facts. The returns moved.
- Mistaking fluent guesses for knowledge. A confident, well-structured answer about your specific case, given an empty window, is a guess in a nice suit. Fluency is not grounding.
- Buying capability to fix a context gap. A bigger model with the same empty window gives you a more articulate miss. Match the upgrade to the actual bottleneck.
- Hoarding prompts instead of building knowledge. A library of clever prompts ages badly and still can't supply your specifics. A retrievable knowledge base does the thing prompts can't.
- Letting context be expensive to retrieve. If feeding the model your situation takes five minutes of digging, you'll skip it. The fix is upstream organization, not more willpower at the keyboard.
Summary + next step
The skill being sold is phrasing; the skill that pays is context. A language model arrives knowing everything in general and nothing about you, and prompting only ever tunes the first half. For the questions worth asking — the ones rooted in your project, your decisions, your actual constraints — the binding limit is what you put in the window, and the words around it are a rounding error past a low bar. So when an answer disappoints, don't reach first for better phrasing or a bigger model. Run the Stranger Test, and if the honest answer is "they'd have to ask me a dozen things," go supply those twelve things. The most reliable upgrade to your AI output isn't a prompt and isn't a subscription — it's having your own knowledge in a form you can hand over in ten seconds.
Next step: make context cheap to retrieve. Pair this with [how to build a second brain](../ai-notetaking/how-to-build-a-second-brain.md), and in JustJot.ai keep your decisions and research in linked, searchable notes — so the next time a model needs to know your situation, it's a paste away, not a rewrite.