classify_urgency processes one text at a time. Your survey dataset has 200 responses. What's the most Pythonic way to apply it to all of them?
A list comprehension — [classify_urgency(t) for t in texts]. I already have the function. The list comprehension runs it on every item and returns a list of labels in the same order.
That's it. The entire batch function is three lines: the comprehension, a print, and a return. The AI function you built in Week 2 doesn't change — only the iteration wrapper changes:
def batch_classify(texts: list) -> list:
    labels = [classify_urgency(t) for t in texts]
    print(f"Classified {len(labels)} responses")
    return labels

What if one of the items in the list is an empty string? Would classify_urgency fail or return something unexpected?
The model would still return a label — but classifying an empty string gives you a meaningless result. Filter before batching: texts = [t for t in texts if t.strip()]. One guard, one line, before the classification loop:
def batch_classify(texts: list) -> list:
    texts = [t for t in texts if t.strip()]
    labels = [classify_urgency(t) for t in texts]
    print(f"Classified {len(labels)} responses")
    return labels

I can pass my entire 200-response list in one call and get back a list of urgency labels — one per response, in order. Then zip them back with the responses to build a DataFrame.
zip(texts, labels) gives you paired tuples — that's the standard pattern for building a DataFrame from two aligned lists: pd.DataFrame(zip(texts, labels), columns=['text', 'urgency']) if you're using pandas downstream. One caveat: because batch_classify drops empty strings, zip the labels against the filtered list, not the raw responses, or the pairs will misalign.
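A minimal sketch of that zip-to-DataFrame step, using a stand-in classifier (the real classify_urgency calls a model API; this placeholder just keyword-matches so the example is runnable):

```python
import pandas as pd

def classify_urgency(text: str) -> str:
    # Stand-in for the real API-backed classifier from Week 2.
    return "high" if "urgent" in text.lower() else "low"

texts = ["urgent: server down", "thanks for the update"]
labels = [classify_urgency(t) for t in texts]

# zip pairs the two aligned lists into (text, label) rows
df = pd.DataFrame(zip(texts, labels), columns=["text", "urgency"])
print(df)
```

Because zip pairs by position, both lists must already be the same length and order — which is why any filtering has to happen before classification, not between classification and zipping.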
My entire qualitative coding process just became batch_classify(survey_responses). Three weeks ago I was doing this by hand.
You still need to validate a sample of the labels. The automation handles the triage pass — your judgment handles the borderline cases and the final methodology write-up.
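One lightweight way to run that validation pass is to pull a random sample of (text, label) pairs for manual review — a hypothetical sketch using the standard library, with hard-coded example data:

```python
import random

# Example batch output: aligned texts and their model-assigned labels.
texts = ["ok", "urgent!!", "fine", "help now", "meh", "asap please"]
labels = ["low", "high", "low", "high", "low", "high"]

random.seed(0)  # fixed seed so the example sample is reproducible
sample = random.sample(list(zip(texts, labels)), k=3)
for text, label in sample:
    print(f"{label:>5}  {text}")  # eyeball each pair; flag disagreements
```

For a 200-response dataset, a sample of 20 or so gives a quick read on label quality before you trust the full batch.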
def batch_classify(texts: list) -> list:
    texts = [t for t in texts if t.strip()]  # filter empty
    return [classify_urgency(t) for t in texts]

Empty strings passed to an agent produce unpredictable output — the model has nothing to classify and may return an error or an arbitrary label. Filter before the API call, not after.
Each classify_urgency(t) makes one API call. A 200-item batch makes 200 calls. For large datasets, add rate-limiting or use a delay between calls if the API returns 429 errors.
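A simple way to add that delay is a fixed pause between calls — a sketch with a stand-in classifier (the real one makes an API call) and a hypothetical delay parameter:

```python
import time

def classify_urgency(text: str) -> str:
    # Stand-in for the real API-backed classifier.
    return "high" if "urgent" in text.lower() else "low"

def batch_classify(texts: list, delay: float = 0.5) -> list:
    """Classify each text, pausing between calls to stay under rate limits."""
    texts = [t for t in texts if t.strip()]  # filter empty
    labels = []
    for t in texts:
        labels.append(classify_urgency(t))
        time.sleep(delay)  # throttle: at most one call per `delay` seconds
    return labels

result = batch_classify(["urgent: outage", "", "all good"], delay=0.01)
print(result)
```

A fixed delay is the simplest option; if the API still returns 429s, the usual next step is exponential backoff (retrying with a growing wait after each 429).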
The output list is the same length as the filtered input list, which may be shorter than the original if empty strings were dropped. Zip the labels with the filtered texts, not the raw input, so the pairs stay aligned: list(zip(texts, labels)), where texts is the filtered list.