read_range from Day 17 read structured rows from a Sheet. Today you read unstructured long-form content from a Doc: proposals, contracts, project briefs. The content changes from structured rows to free-form text, but the API call keeps the same shape.
Pass the document ID, get back a dict with the title and body content. Then extract the text I need.
GOOGLEDOCS_GET_DOCUMENT_BY_ID returns a dict with title and a nested body structure. The full body is JSON-encoded document nodes — paragraphs, runs, styles. To get plain text quickly, extract from body.content:
result = toolset.execute_action(
    Action.GOOGLEDOCS_GET_DOCUMENT_BY_ID,
    {"document_id": doc_id}
)
title = result.get("title", "")
body = result.get("body", {})
The body is a nested structure, not plain text? That seems complicated for just reading a proposal.
Google Docs encodes every paragraph as a structured object — useful for formatting but verbose for plain text. For the capstone use case, the full response dict is what you return. The doc_to_post function in Week 4 slices the first 300 characters of text from the body — a simple extraction that works for most proposal docs.
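To make that extraction concrete, here is a minimal sketch. It assumes the dict you get back mirrors the Google Docs document resource, where body.content is a list of elements and each paragraph holds textRun pieces. The helper name extract_text and the sample document are illustrative, not part of the course code, and the real response may be wrapped differently.

```python
def extract_text(doc: dict) -> str:
    """Join the text of every paragraph run found in a Docs-style dict."""
    parts = []
    for element in doc.get("body", {}).get("content", []):
        paragraph = element.get("paragraph")
        if not paragraph:
            continue  # skip non-paragraph elements such as section breaks
        for run in paragraph.get("elements", []):
            text_run = run.get("textRun")
            if text_run:
                parts.append(text_run.get("content", ""))
    return "".join(parts)


# Hand-built sample in the assumed shape; this is not real API output.
sample_doc = {
    "title": "Proposal: Acme Website",
    "body": {
        "content": [
            {"sectionBreak": {}},
            {"paragraph": {"elements": [
                {"textRun": {"content": "Scope of work\n"}},
            ]}},
            {"paragraph": {"elements": [
                {"textRun": {"content": "Design and build "}},
                {"textRun": {"content": "a five-page site.\n"}},
            ]}},
        ]
    },
}

text = extract_text(sample_doc)
preview = text[:300]  # same idea as slicing the first 300 characters
print(preview)
```

Using .get with defaults at every level means a doc with no body, or an element with no paragraph, yields an empty string instead of raising.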
So read_doc returns the full dict and downstream functions extract what they need. The function stays simple.
Single responsibility. read_doc fetches. The caller decides what to extract. That is the right shape:
def read_doc(doc_id: str) -> dict:
    """Fetch a Google Doc by ID and return the full response dict."""
    result = toolset.execute_action(
        Action.GOOGLEDOCS_GET_DOCUMENT_BY_ID,
        {"document_id": doc_id}
    )
    title = result.get("title", "no title")
    print(f"Doc: {title}")
    return result
My proposal library is now accessible from code. I can extract scope sections, client names, payment terms — whatever I need.
One note: the document ID is the string between /d/ and /edit in the Docs URL. https://docs.google.com/document/d/ABC123/edit — the ID is ABC123.
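If you would rather paste the whole URL than copy the ID by hand, a small helper can pull it out. This is a sketch: the name extract_doc_id is hypothetical, and the pattern assumes IDs contain only letters, digits, hyphens, and underscores.

```python
import re
from typing import Optional

# Assumption: the ID is the path segment right after /document/d/.
_DOC_ID_PATTERN = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_doc_id(url: str) -> Optional[str]:
    """Return the document ID from a Docs URL, or None if none is found."""
    match = _DOC_ID_PATTERN.search(url)
    return match.group(1) if match else None


print(extract_doc_id("https://docs.google.com/document/d/ABC123/edit"))  # ABC123
```

Returning None for a URL that does not match lets the caller decide whether to raise an error or ask for a different link.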
GOOGLEDOCS_GET_DOCUMENT_BY_ID returns the full document dict, including title and body:
result = toolset.execute_action(
    action=Action.GOOGLEDOCS_GET_DOCUMENT_BY_ID,
    params={"document_id": doc_id}
)
title = result.get("title", "")
body = result.get("body", {})
Sheets return rows as lists of lists, already structured. Docs return a nested body of paragraph nodes, so the text stays unstructured until you extract it. For proposal or contract docs, the text inside the body is what you summarise, search, or excerpt.
Read before create: use read_doc to check if a proposal doc for a client already exists before creating a new one with create_doc. Avoid duplicate docs cluttering Drive.
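One way to sketch the read-before-create pattern is to keep a small mapping from client name to document ID and only create when the client is missing. Everything here is an assumption for illustration: the helper get_or_create_doc, the known_docs mapping, and a create_doc that takes a title and returns the new ID. The fake functions stand in for the real read_doc and create_doc so the example runs without any credentials.

```python
from typing import Callable, Dict


def get_or_create_doc(
    client: str,
    known_docs: Dict[str, str],
    read_doc: Callable[[str], dict],
    create_doc: Callable[[str], str],
) -> dict:
    """Return the client's proposal doc, creating it only if none is recorded."""
    doc_id = known_docs.get(client)
    if doc_id is None:
        # No doc on record for this client: create one and remember its ID.
        doc_id = create_doc(f"Proposal: {client}")
        known_docs[client] = doc_id
    return read_doc(doc_id)


# In-memory stand-ins for the real functions (hypothetical behaviour).
_fake_drive = {"ABC123": {"title": "Proposal: Acme", "body": {"content": []}}}
created_ids = []


def fake_read_doc(doc_id: str) -> dict:
    return _fake_drive[doc_id]


def fake_create_doc(title: str) -> str:
    doc_id = f"NEW{len(created_ids) + 1}"
    _fake_drive[doc_id] = {"title": title, "body": {"content": []}}
    created_ids.append(doc_id)
    return doc_id


known = {"Acme": "ABC123"}
existing = get_or_create_doc("Acme", known, fake_read_doc, fake_create_doc)
fresh = get_or_create_doc("Globex", known, fake_read_doc, fake_create_doc)
print(existing["title"], "|", fresh["title"], "|", created_ids)
```

Passing the reader and creator in as arguments keeps the check itself free of API calls, so you can test it with fakes before pointing it at Drive.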