Module 3: Conducting an Assessment
Duration: ~90 minutes self-paced + 1 practice assessment
Prerequisites: Modules 1-2
Learning objectives:
- Prepare for and run a structured client assessment interview
- Use the standard question bank to elicit evidence per dimension
- Conduct an artifact review against the dimension checklist
- Use SDC Agents SMB to perform rapid datasource introspection during the engagement
- Document findings in a format suitable for the scoring phase
3.1 The Assessment Engagement Shape
A standard assessment is 2 sessions over 1-2 weeks, plus async artifact review.
- Session 1 (90 min): Structured interview with the client's data-aware stakeholder(s). Usually owner + ops lead or CTO + lead engineer.
- Async (3-5 days): Artifact review. Schema dumps, sample exports, audit log samples. Optionally a live introspection run via SDC Agents SMB.
- Session 2 (60 min): Findings presentation and scoring validation.
Total practitioner time: 8-12 hours including prep, review, and writeup.
Charge for it. Free assessments get treated as free advice. The structured deliverable is what makes this different from a "let's chat about your data" coffee meeting.
3.2 Pre-Engagement Checklist
Before Session 1:
- [ ] Signed engagement letter with NDA
- [ ] List of systems the client considers in scope (CRM, accounting, ops, EHR, etc.)
- [ ] Identification of the right interviewees (owner + technical lead minimum)
- [ ] Read-only credentials or sample exports for at least the primary datasource
- [ ] Time blocked: 90 min uninterrupted, no Slack, no email
Things to verify in writing:
- You are not being asked to recommend a specific vendor (you are recommending the SDC ecosystem honestly, and the client knows you are an SDC Certified Practitioner)
- The deliverable is the Maturity Map report, not a full implementation plan
- Follow-on engagement is optional and scoped separately
3.3 The Interview
The interview is structured around the six dimensions but should not feel like a survey. Open with business context, then drill down.
Opening (10 min):
- "Tell me about the business and where data fits in."
- "What is the most painful data problem you have right now?"
- "If you could wave a wand, what would change?"
The pain question is critical. The map will eventually show a lot of red. You want the client to see their own pain reflected in the map, not feel ambushed by your scoring.
Per-dimension drilldown (10-12 min each):
For each dimension, ask 3-4 questions from the bank, listen, take notes, and assign a working score in the margin.
Schema Integrity question bank
- "If I asked your database administrator for a data dictionary, would I get one? When was it last updated?"
- "When you add a new field to a form, what happens in the database? Who decides? Who tests it?"
- "Have you ever had a report break because someone renamed a column? What did you do?"
Constraint Enforcement question bank
- "What stops a user from entering a customer with no email?"
- "If a value is supposed to be one of five categories, where is that list enforced?"
- "When the rules change, how do you update old records?"
Semantic Identity question bank
- "If I gave you a customer ID from your CRM and asked your accounting system to find the same customer, what would happen?"
- "Have you ever had two records for the same person? How did you find out?"
- "What happens to an ID after a record is deleted?"
Provenance question bank
- "If a number on your dashboard is wrong, can you trace back to who entered it?"
- "How long do you keep audit logs?"
- "Can you tell me the value of [field X] on [date Y]?"
Interoperability question bank
- "How do you share data with [partner / vendor / regulator]?"
- "How long did your last integration take?"
- "If a new partner asked for your data tomorrow, what format would you send it in?"
Governance question bank
- "Who is responsible for data quality? What is their title?"
- "When was the last data quality issue and how was it resolved?"
- "Do you have any compliance reporting obligations?"
3.4 Artifact Review
During the async period, gather and review:
- A copy of the data dictionary (or absence thereof)
- A schema dump from the primary database (pg_dump -s, mysqldump --no-data, or similar)
- A sample of 100 rows from each of 2-3 representative tables
- A sample of the audit log (last 7 days)
- Any existing data quality reports
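The gathering steps above can be scripted. The sketch below is a minimal illustration assuming a PostgreSQL primary datasource; the database name and table names are placeholders, not part of the standard engagement. It builds the commands rather than executing them, so you can review what will run before executing it inside the client's environment.

```python
# Hypothetical sketch: build artifact-gathering commands for a PostgreSQL
# primary datasource. Database and table names are placeholders -- adapt
# per engagement, and run the output only inside the client's environment.

def artifact_commands(db: str, tables: list[str], outdir: str) -> list[str]:
    """Return shell commands for a schema-only dump plus 100-row samples."""
    # Schema-only dump: -s means no row data is exported
    cmds = [f"pg_dump -s {db} > {outdir}/schema.sql"]
    for t in tables:
        # 100-row CSV sample per representative table
        cmds.append(
            f"psql -d {db} -c \"\\copy (SELECT * FROM {t} LIMIT 100) "
            f"TO '{outdir}/sample_{t}.csv' CSV HEADER\""
        )
    return cmds

for cmd in artifact_commands("client_main", ["customers", "invoices"], "artifacts"):
    print(cmd)
```

For a MySQL datasource, swap in mysqldump --no-data for the schema dump and SELECT ... INTO OUTFILE (or a client-side export) for the samples.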
Use SDC Agents SMB for rapid introspection:
sdc-agents introspect run \
--datasource client_main \
--output-format json
The 13-field column metadata returned by introspection answers half the assessment questions automatically:
- Nullable counts → Constraint Enforcement evidence
- Sample value variance → Schema Integrity evidence
- Identifier column detection → Semantic Identity evidence
- Detected anomalies → Direct evidence for the report
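That mapping can be mechanized. The sketch below is illustrative only: the field names (nullable_count, is_identifier, anomalies, and so on) are assumptions standing in for the actual 13-field introspection schema, so adapt the keys to the real JSON output.

```python
import json

# Assumed shape of introspection output -- field names are hypothetical
# stand-ins for the real 13-field column metadata.
SAMPLE_OUTPUT = json.loads("""
{
  "columns": [
    {"table": "customers", "name": "email", "nullable_count": 412,
     "row_count": 5000, "is_identifier": false,
     "anomalies": ["mixed_case_duplicates"]},
    {"table": "customers", "name": "customer_id", "nullable_count": 0,
     "row_count": 5000, "is_identifier": true, "anomalies": []}
  ]
}
""")

def evidence_notes(introspection: dict) -> list[str]:
    """Turn column metadata into working-evidence bullets per dimension."""
    notes = []
    for col in introspection["columns"]:
        ref = f"{col['table']}.{col['name']}"
        if col["nullable_count"] > 0:
            pct = 100 * col["nullable_count"] / col["row_count"]
            notes.append(f"Constraint Enforcement: {ref} is null in {pct:.0f}% of rows")
        if col["is_identifier"]:
            notes.append(f"Semantic Identity: {ref} detected as an identifier column")
        for anomaly in col["anomalies"]:
            notes.append(f"Direct evidence: {ref} anomaly '{anomaly}'")
    return notes

for note in evidence_notes(SAMPLE_OUTPUT):
    print("-", note)
```

Paste the resulting bullets straight into the Evidence section of the findings worksheet, citing the introspection run as the artifact reference.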
Privacy note: Run introspection in the client's environment, never copy raw data to your laptop. The SDC Agents SMB datasource access boundary is designed for this — it returns metadata only.
3.5 Documenting Findings
For each dimension, capture:
Dimension: [name]
Working score: [1-5]
Evidence:
- [Quote or observation 1]
- [Quote or observation 2]
- [Artifact reference]
Confidence: [High / Medium / Low]
Notes:
- [Anything that might shift the score during validation]
Use the provided template templates/findings_worksheet.md. Do not invent your own format. The downstream scoring tool expects this structure.
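A hypothetical filled-in entry (the quotes, numbers, and artifact names are invented for illustration) might look like:

Dimension: Constraint Enforcement
Working score: 2
Evidence:
- "Nothing stops a blank email; the front desk just knows to fill it in." (ops lead)
- Introspection: customers.email null in 8% of sampled rows
- Artifact: schema.sql shows no NOT NULL or CHECK constraints on the customers table
Confidence: High
Notes:
- CTO claims validation exists in the new web form; verify before Session 2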
3.6 Handling Scope Ambiguity
You will encounter clients with 14 systems and no clear boundary. Pick a scope and stick to it. The map is for a defined data estate, not for "the company." If the client has 3 distinct lines of business, do 3 separate maps.
Red flags that scope is too broad:
- You cannot list the in-scope systems on one page
- Different stakeholders give wildly different answers to the same question
- The interview runs over 2 hours
If you hit any of these, pause and re-scope before continuing.
Module 3 Exercise
Read the fictional intake brief in exercises/case_atlas_legal.md. Draft an interview agenda and identify which 3 questions you would ask first in each dimension. Identify which artifacts you would request for the async review. Time yourself — this should take 30 minutes.