parse_survey_csv from yesterday splits on commas manually. That breaks on any field with a comma inside — a free-text response like "Economics, Policy" becomes two values instead of one. What's the production-safe alternative?
The real CSV parser handles quoted fields, escape characters, and edge cases I haven't thought of. There must be a standard library module for this.
csv.DictReader is exactly that. It handles quoting, escaping, and column names automatically. The catch: DictReader expects a file object. io.StringIO wraps your string into a fake file object so DictReader can read it without touching the filesystem:
import csv, io
csv_text = 'major,satisfaction\n"Economics, Policy",4.2\nBio,3.8'
reader = csv.DictReader(io.StringIO(csv_text))
for row in reader:
    print(dict(row))  # {'major': 'Economics, Policy', 'satisfaction': '4.2'}

io.StringIO wraps a string into a file-like object? So all I'm doing is making the string look like a file to the CSV reader?
Exactly. StringIO is an in-memory file. csv.DictReader can't tell the difference — it reads the same interface either way. This is the standard pattern for parsing CSV content that arrived as a string (from an API, a form, a clipboard):
def load_responses_from_csv(csv_text: str) -> list:
    """Parse CSV text using csv.DictReader — handles quoted fields correctly."""
    import csv, io
    reader = csv.DictReader(io.StringIO(csv_text))
    responses = []
    for row in reader:
        r = dict(row)
        if "satisfaction" in r:
            r["satisfaction"] = float(r["satisfaction"])
        responses.append(r)
    print(f"Loaded {len(responses)} responses")
    return responses

I'm replacing parse_survey_csv with this — same interface, handles commas in fields, converts the satisfaction column to float at ingestion.
Your manual split function was a good learning exercise. DictReader is what you deploy.
parse_survey_csv taught me what DictReader saves me from. I understand the problem it solves, not just the API.
float() at ingestion means downstream functions don't need to cast. If a row's satisfaction value is empty or non-numeric, float() raises ValueError; if the column is missing from a short row, DictReader fills it with None and float(None) raises TypeError. Wrap the conversion in try/except when rows can be incomplete — or pre-validate with is_valid_response upstream.
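A minimal sketch of that defensive conversion — `to_float` is a hypothetical helper name introduced here for illustration, not part of the original code:

```python
import csv, io

def to_float(value):
    """Return float(value), or None for missing, empty, or non-numeric values."""
    try:
        return float(value)
    except (TypeError, ValueError):  # None (missing column) or unparseable text
        return None

# Second row has an empty satisfaction field, third has non-numeric text
csv_text = 'major,satisfaction\n"Economics, Policy",4.2\nBio,\nChem,n/a'
rows = [dict(r) for r in csv.DictReader(io.StringIO(csv_text))]
scores = [to_float(r.get("satisfaction")) for r in rows]
print(scores)  # [4.2, None, None]
```

Returning None keeps the row instead of crashing the whole ingest; whether to drop or flag those rows is a policy decision for the caller.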
import csv, io
reader = csv.DictReader(io.StringIO(csv_text))
for row in reader:
    # row is a dict keyed by column name (an OrderedDict before Python 3.8)
    d = dict(row)  # copy to a plain dict

| Issue | Manual split | DictReader |
|---|---|---|
| Comma inside a field | Breaks | Handled |
| Quoted strings | Breaks | Handled |
| Column names | Manual zip | Automatic |
Wraps a string into a file-like object. Pattern: io.StringIO(text) → pass to any function that expects a file.
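The same pattern works in reverse: a sketch (not from the original exchange) using csv.DictWriter with a StringIO buffer to produce CSV text entirely in memory:

```python
import csv, io

rows = [{"major": "Economics, Policy", "satisfaction": 4.2},
        {"major": "Bio", "satisfaction": 3.8}]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["major", "satisfaction"])
writer.writeheader()
writer.writerows(rows)

csv_text = buf.getvalue()
print(csv_text)  # "Economics, Policy" comes back quoted automatically
```

DictWriter applies quoting for you (QUOTE_MINIMAL by default), so round-tripping through DictReader recovers the comma-containing field intact.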
You have a Qualtrics CSV export as a string and need to parse it correctly — free-text responses sometimes contain commas, which broke the manual split parser. Write `load_responses_from_csv(csv_text)` using `csv.DictReader` and `io.StringIO` to parse the CSV and convert the 'satisfaction' field to float where present.