Building an AI-Powered Query System on Salesforce Objects
Building an AI-Powered Query System on Salesforce Objects
While understanding Salesforce objects is foundational, the real power emerges when you combine object knowledge with LLMs to create intelligent querying systems. This is the core of what Divami is building.
The Problem: From Manual Dashboards to Dynamic Intelligence
Traditional Salesforce dashboards require manual configuration. Want to answer a new business question? You must:
- Navigate to Reports or Dashboards
- Manually select objects and fields
- Define filters and aggregations
- Create visualizations
This workflow is static and slow. Modern executives need dynamic, ad-hoc analysis - the ability to ask natural language questions and get instant answers.
The Solution: LLM-Powered Salesforce Querying
Divami's approach bridges this gap by leveraging three key technical capabilities:
1. Object Metadata Extraction - Understanding what data exists and how it's structured
2. LLM-Assisted Query Generation - Converting natural language questions into SOQL (Salesforce Object Query Language)
3. Dynamic Visualization - Rendering results as charts and dashboards without pre-configuration
Real-World Example: From Questions to Insights
Consider this workflow powered by helper functions that interact with Salesforce:
def get_salesforce_tables():
"""Get list of all queryable Salesforce tables"""
tables = [obj['name'] for obj in sf.describe()['sobjects'] if obj['queryable']]
return tables
def get_table_schema(table_name: str):
"""Get schema details for a specific Salesforce table"""
table = getattr(sf, table_name)
fields = table.describe()['fields']
schema = []
for f in fields:
field_info = dict(name=f['name'], type=f['type'], label=f['label'])
if f['type'] == 'picklist':
field_info['values'] = [v['value'] for v in f['picklistValues']]
if f['type'] == 'reference':
field_info['references'] = f['referenceTo']
schema.append(field_info)
return schema
def run_salesforce_query(soql_query: str):
"""Execute a SOQL query and return results"""
result = sf.query(soql_query)
return result['records']
These three functions become the interface between natural language and Salesforce data. An LLM can:
- Call
get_salesforce_tables()to understand what objects are available - Call
get_table_schema()to understand field types and relationships - Generate appropriate SOQL and execute via
run_salesforce_query()
Case Study: Analyzing Sales Pipeline Data
Let's walk through a real example of how this system works by asking progressively complex business questions.
Question 1: "How many opportunities do we have by stage?"
Initial Query (with error):
SELECT Stage__c, COUNT(Id) cnt FROM Custom_Opportunity__c GROUP BY Stage__c
The LLM initially tried the wrong field name. After receiving a malformed request error, it self-corrected by referencing the schema to find Opportunity_Stage__c instead:
SELECT Opportunity_Stage__c, COUNT(Id) cnt FROM Custom_Opportunity__c WHERE Opportunity_Stage__c != null GROUP BY Opportunity_Stage__c
Results:
- Nurture: 66 opportunities (52%) - most leads still in early stage
- SQL: 18 opportunities (14%) - qualified sales leads
- Proposal: 14 opportunities (11%) - actively pitching
- Solution/PoC: 12 opportunities (10%) - proof of concept phase
- Won: 8 opportunities (6%) - closed deals
- Discovery: 3 opportunities (2%)
- Contract: 2 opportunities (2%)
- Lost: 2 opportunities (2%)
- Negotiation: 1 opportunity (1%)
This reveals the sales funnel shape - most opportunities are still in nurture, with a significant drop-off at proposal stage.
Question 2: "What's the total value of opportunities by owner and their performance?"
Query:
SELECT Owner.Name owner_name, Owner.UserRole.Name role_name, SUM(Revenue__c) total_revenue, COUNT(Id) opp_cnt FROM Custom_Opportunity__c GROUP BY Owner.Name, Owner.UserRole.Name ORDER BY SUM(Revenue__c) DESC
Key Findings:
| Owner | Opportunities | Total Value | Avg Value per Opp |
|---|---|---|---|
| John | 9 | $554.8M | $61.6M |
| Alice | 48 | $50.4M | $1.0M |
| Bob | 4 | $44.1M | $11.0M |
| Charlie | 14 | $17.9M | $1.3M |
| Dave | 24 | $9.8M | $408K |
Business Insight: John manages fewer deals but significantly higher-value ones, suggesting he focuses on enterprise accounts.
Question 3: "Which opportunities are closing this month?"
Query:
SELECT Name, Company__r.Name, Revenue__c, Opportunity_Stage__c, Target_Close_Date__c FROM Custom_Opportunity__c WHERE Target_Close_Date__c >= 2025-12-01 AND Target_Close_Date__c <= 2025-12-31
Results (December 2025):
- Acme Corp - $26M (Proposal, closes Dec 22)
- BetaTech - $900K (Proposal, closes Dec 23)
- Gamma Inc - $4M (Proposal, closes Dec 9)
- Delta LLC - $800K (Won, closed Dec 8)
- Epsilon Co - $82.5K (Proposal, closes Dec 5)
Total value closing this month: $3.17M
Additional Insights from Related Objects
Lead Conversion Performance:
- Total leads: 352
- Converted leads: 6
- Conversion rate: 1.7%
Top lead sources for conversions: N/A (50%), Research (33%), Partner (17%)
Account-Opportunity Relationships:
- Placeholder account: 57 opportunities ($55.6M)
- Top contacts per account: Tata Communications (54), Blue Yonder (11)
- Industries represented: Technology (29%), Healthcare (13%), Computer Software (11%)
Why This Matters for Enterprise Intelligence
The key insight here is that objects are not just database tables - they're the foundation for intelligent systems. When an LLM understands:
- What objects exist (Account, Opportunity, Lead, Contact, etc.)
- How they're structured (fields, types, relationships)
- How to query them (SOQL syntax)
...it becomes possible to answer complex business questions in natural language without manual dashboard configuration.
Challenges Encountered and Solutions
Challenge 1: Field Name Ambiguity
- Problem: Multiple possible field names for similar concepts (e.g.,
Stage__cvsOpportunity_Stage__c) - Solution: Always call
get_table_schema()first to validate field names before query generation
Challenge 2: Query Failures and Self-Correction
- Problem: LLM generates syntactically valid but semantically incorrect queries
- Solution: Implement error handling that feeds back to the LLM with schema information, allowing self-correction
Challenge 3: Complex Relationship Traversal
- Problem: Opportunity → Company (Account) relationships require understanding lookup fields
- Solution:
get_table_schema()returns relationship metadata, enabling the LLM to generate proper query paths using__rnotation
Challenge 4: Data Quality
- Problem: Many fields unpopulated (e.g., 349/350 accounts have no geographic data)
- Solution: Filter with
!= nullin queries and handle missing data gracefully
The Future: From Query Generation to Predictive Intelligence
This foundation opens doors to more sophisticated capabilities:
- Predictive scoring: "Which opportunities are most likely to close?"
- Anomaly detection: "Which accounts have unusual purchasing patterns?"
- Automated recommendations: "Who should follow up with these leads?"
All of these become possible once you've mastered the fundamentals of querying Salesforce objects programmatically.
Key Takeaway
Salesforce objects are more than just database tables - they're the semantic layer that enables intelligent systems. Understanding their structure, relationships, and metadata is the first step toward building AI-powered business intelligence tools that can answer any question executives ask, in real-time, without manual configuration.