Explore real-world examples of LangStruct in action across different domains. Each example includes complete code, explanations, and best practices for production use.
→ Browse complete runnable examples on GitHub
Financial Documents
Extract metrics, dates, and insights from earnings reports, SEC filings, and financial statements. View Example
Medical Records
Process clinical notes, lab reports, and medical documents. View Example
Legal Contracts
Analyze contracts, agreements, and legal documents for key terms and risks. View Example
Scientific Papers
Extract methodology, results, and citations from scientific literature. View Example
A simple entity extraction example, perfect for getting started with LangStruct:
```python
from pydantic import BaseModel, Field
from langstruct import LangStruct

class PersonSchema(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(description="Age in years")
    occupation: str = Field(description="Job title")

extractor = LangStruct(schema=PersonSchema)
result = extractor.extract("Dr. Sarah Johnson, 34, is a data scientist at Google")
print(result.entities)  # {'name': 'Dr. Sarah Johnson', 'age': 34, 'occupation': 'data scientist'}
```
Extract structured product data from descriptions:
```python
from typing import List

from pydantic import BaseModel, Field
from langstruct import LangStruct

class ProductSchema(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(description="Price in USD")
    features: List[str] = Field(description="Key features")
    brand: str = Field(description="Brand name")

extractor = LangStruct(schema=ProductSchema)
text = """MacBook Pro 16" - $2,399
Features: M2 Pro chip, 16GB RAM, 512GB SSD, Retina display
Brand: Apple"""
result = extractor.extract(text)
```
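The extracted fields are available on result.entities, just as in the quick-start example above; the output shown below is illustrative:

```python
# Inspect the extracted fields (output shown is illustrative)
print(result.entities)
# e.g. {'name': 'MacBook Pro 16"', 'price': 2399.0,
#       'features': ['M2 Pro chip', '16GB RAM', '512GB SSD', 'Retina display'],
#       'brand': 'Apple'}
```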
Financial Documents
Quarterly earnings, SEC filings, balance sheets. View Example
Market Research
Consumer surveys, market analysis, competitor data
Sales Data
CRM records, sales reports, customer feedback
Medical Records
Patient records, diagnostic reports, treatment plans. View Example
Scientific Papers
Medical literature, clinical trial results, case studies. View Example
Lab Reports
Test results, pathology reports, imaging studies
Legal Contracts
Service agreements, NDAs, employment contracts. View Example
Regulatory Filings
SEC documents, compliance reports, legal notices
Case Law
Court decisions, legal precedents, case summaries
For the use cases mentioned above that don’t have dedicated pages, here are quick implementation examples:
```python
class MarketResearchSchema(BaseModel):
    survey_topic: str = Field(description="Main research topic")
    respondents: int = Field(description="Number of survey respondents")
    key_findings: List[str] = Field(description="Primary research findings")
    demographics: List[str] = Field(description="Respondent demographics")
    recommendations: List[str] = Field(description="Business recommendations")

market_extractor = LangStruct(schema=MarketResearchSchema)
```
```python
class SalesDataSchema(BaseModel):
    customer_name: str = Field(description="Customer or company name")
    deal_value: float = Field(description="Deal value in USD")
    products: List[str] = Field(description="Products or services sold")
    sales_rep: str = Field(description="Sales representative name")
    close_date: str = Field(description="Deal close date")
    pipeline_stage: str = Field(description="Current sales stage")

sales_extractor = LangStruct(schema=SalesDataSchema)
```
```python
from typing import Dict, List

class LabReportSchema(BaseModel):
    patient_id: str = Field(description="Patient identifier")
    test_type: str = Field(description="Type of laboratory test")
    results: List[Dict[str, str]] = Field(description="Test results with values and units")
    reference_ranges: List[str] = Field(description="Normal reference ranges")
    abnormal_flags: List[str] = Field(description="Abnormal or critical values")
    ordering_physician: str = Field(description="Physician who ordered tests")

lab_extractor = LangStruct(schema=LabReportSchema)
```
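These quick extractors are used exactly like the quick-start examples above. The snippet below is a minimal sketch using the sales_extractor defined earlier and made-up sample text:

```python
# Minimal sketch: apply one of the quick extractors above to sample text
# (the CRM note below is made-up illustrative data)
crm_note = """Closed a $48,000 deal with Acme Corp for the Analytics Suite and onboarding package.
Rep: Jordan Lee. Close date: 2024-03-15. Stage: Closed Won."""

sales_result = sales_extractor.extract(crm_note)
print(sales_result.entities)    # structured fields defined by SalesDataSchema
print(sales_result.confidence)  # extraction confidence score (see error handling below)
```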
```python
from langstruct import LangStruct

class DocumentProcessor:
    def __init__(self, schema):
        self.extractor = LangStruct(schema=schema)

    def process_batch(self, documents):
        # Process multiple documents efficiently using built-in batch processing
        results = self.extractor.extract(documents)
        return results

# Process many documents efficiently
processor = DocumentProcessor(YourSchema)
results = processor.process_batch(document_list)
```
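Assuming extract() on a list returns one result per input document, as the batch example above implies, each result can then be inspected like a single extraction; this is a sketch:

```python
# Sketch: inspect batch results, assuming extract() returns one result per input document
for doc, result in zip(document_list, results):
    print(f"Fields extracted: {len(result.entities)}, confidence: {result.confidence:.2f}")
```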
```python
def robust_extraction(text, extractor):
    try:
        result = extractor.extract(text)

        # Validate extraction quality
        if result.confidence < 0.8:
            print(f"Low confidence: {result.confidence}")
            # Handle low confidence extractions

        return result

    except Exception as e:
        print(f"Extraction error: {e}")
        # Handle extraction errors
```
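One way to act on the low-confidence branch above is to queue those documents for manual review instead of discarding them. The sketch below is illustrative; review_queue and the 0.8 threshold are not part of LangStruct:

```python
# Sketch: route low-confidence extractions to a manual review queue
# (review_queue and the 0.8 threshold are illustrative, not part of LangStruct)
review_queue = []

def extract_with_review(text, extractor, threshold=0.8):
    result = extractor.extract(text)
    if result.confidence < threshold:
        review_queue.append({
            "text": text,
            "entities": result.entities,
            "confidence": result.confidence,
        })
    return result
```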
```python
# Track basic extraction metrics
def track_extraction_performance(text, extractor):
    result = extractor.extract(text)

    print(f"Extraction confidence: {result.confidence:.2f}")
    print(f"Fields extracted: {len(result.entities)}")

    # Check if source tracking is available
    if hasattr(result, 'sources') and result.sources:
        print(f"Source locations tracked: {len(result.sources)}")

    return result

# Usage
result = track_extraction_performance(document_text, extractor)
```
Want to contribute an example? We welcome contributions that demonstrate LangStruct in new domains or production settings.
Ready to dive into specific examples?
Financial Documents
Medical Records
Legal Contracts
All Examples