classify_sentiment from yesterday classifies the tone of a raw email. For abstract triage, you want to classify the relevance of the summary — not the raw text. What changes in the pipeline?
Run Agent A to summarise the abstract first, then feed the summary as input to Agent B that classifies its relevance. The output of A becomes the input of B — the same chain pattern from the automation track, except both steps are AI calls.
Exactly. Two function calls, one pipeline:
```python
def summarize_then_classify(text: str) -> str:
    summary = summarize_text(text)
    return classify_sentiment(summary)
```

Why classify the summary rather than the original abstract? The abstract is longer — wouldn't the classifier have more signal?
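Since summarize_text and classify_sentiment are not defined in this excerpt, here is a runnable sketch of the chain using toy stand-ins: a hypothetical one-sentence "summariser" and a keyword "classifier" instead of real model calls. The shape of the pipeline is the point, not the stubs.

```python
def summarize_text(text: str) -> str:
    # Stand-in for Agent A: a real version would call an LLM with a
    # summarisation system prompt. Here we just keep the first sentence.
    return text.split(". ")[0] + "."

def classify_sentiment(summary: str) -> str:
    # Stand-in for Agent B: a real version would call an LLM with the
    # inclusion criterion in its system prompt. Here, a keyword check.
    keywords = ("improves", "increases", "reduces")
    return "positive" if any(k in summary.lower() for k in keywords) else "negative"

def summarize_then_classify(text: str) -> str:
    summary = summarize_text(text)        # Agent A narrows the input
    return classify_sentiment(summary)    # Agent B classifies the claim

abstract = ("Mindfulness training improves attention in adults. "
            "We recruited 120 participants with grant funding.")
print(summarize_then_classify(abstract))
```

Note how the classifier never sees the recruitment and funding details: Agent A has already stripped them out.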
Longer input can confuse the classifier with irrelevant details — methodology, dataset size, funding acknowledgements. The summary distils the core claim. Classifying the summary is classifying the claim. That's what your inclusion criterion actually tests.
So the chain isn't just convenience — the summary is a better input for the classifier than the raw abstract. The pipeline improves accuracy by design.
That's the architectural insight. Each step in a chain should narrow the input to the information the next step needs. Summarise first, then classify what was summarised. The model in step two sees a cleaner signal.
Two hundred abstracts, two agents, one pipeline. The entire first-pass triage is [summarize_then_classify(a) for a in abstracts] — I'd never get that weekend back, and now I won't need to.
The batch call is straightforward:
```python
results = [summarize_then_classify(a) for a in abstracts]
positives = [a for a, r in zip(abstracts, results) if r == "positive"]
```

And the output of that list is your codebook-classified corpus, ready for the second pass: you read only the "positive" ones.
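To see what the zip filter recovers, here is a toy run with made-up abstracts and hypothetical pipeline results (no model calls). zip keeps each abstract aligned with its classification, so filtering on the result hands back the original texts for the second-pass read.

```python
# Hypothetical outputs, as if the pipeline had already run.
abstracts = ["A improves X ...", "B surveys Y ...", "C reduces Z ..."]
results = ["positive", "negative", "positive"]

# Pair each abstract with its result, keep only the positives.
positives = [a for a, r in zip(abstracts, results) if r == "positive"]
print(positives)
```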
```python
def summarize_then_classify(text: str) -> str:
    summary = summarize_text(text)        # Agent A
    return classify_sentiment(summary)    # Agent B
```

The classifier receives a distilled claim, not a full abstract. Shorter, focused input → fewer irrelevant tokens → more accurate classification. Each step narrows the signal for the next.
| Goal | Pattern |
|---|---|
| Get a summary | Agent A with system_prompt |
| Classify the summary | Agent B receives A's output |
| Final result | Return Agent B's classification |
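The table's pattern can be sketched with a hypothetical Agent wrapper. The system_prompt strings and the toy behaviours below are assumptions for illustration: a real agent would send the system prompt and input to an LLM rather than run a local function.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    """Toy agent: a real one would call an LLM with this system prompt."""
    system_prompt: str
    behaviour: Callable[[str], str]  # stand-in for the model call

    def run(self, text: str) -> str:
        return self.behaviour(text)

# Agent A with a summarisation system_prompt (toy: keep the first sentence).
agent_a = Agent("Summarise the abstract in one sentence.",
                lambda t: t.split(". ")[0] + ".")

# Agent B receives A's output (toy: keyword match on the criterion).
agent_b = Agent("Answer 'positive' if the claim matches the criterion.",
                lambda s: "positive" if "improves" in s else "negative")

def pipeline(abstract: str) -> str:
    # Final result: return Agent B's classification of Agent A's summary.
    return agent_b.run(agent_a.run(abstract))

print(pipeline("Training improves recall. Methods: n=80, grant-funded."))
```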
Classify the summary, not the raw text — the distilled claim is the signal you want to classify.