Module 3: Conducting an Assessment

Duration: ~90 minutes self-paced + 1 practice assessment
Prerequisites: Modules 1-2
Learning objectives:
- Prepare for and run a structured client assessment interview
- Use the standard question bank to elicit evidence per dimension
- Conduct an artifact review against the dimension checklist
- Use SDC Agents SMB to perform rapid datasource introspection during the engagement
- Document findings in a format suitable for the scoring phase

3.1 The Assessment Engagement Shape

A standard assessment is 2 sessions over 1-2 weeks, plus async artifact review.

Session 1 (90 min): Structured interview with the client's data-aware stakeholder(s). Usually owner + ops lead or CTO + lead engineer.
Async (3-5 days): Artifact review. Schema dumps, sample exports, audit log samples. Optionally a live introspection run via SDC Agents SMB.
Session 2 (60 min): Findings presentation and scoring validation.

Total practitioner time: 8-12 hours including prep, review, and writeup.

Charge for it. Free assessments get treated as free advice. The structured deliverable is what makes this different from a "let's chat about your data" coffee meeting.

3.2 Pre-Engagement Checklist

Before Session 1:

[ ] Signed engagement letter with NDA
[ ] List of systems the client considers in scope (CRM, accounting, ops, EHR, etc.)
[ ] Identification of the right interviewees (owner + technical lead minimum)
[ ] Read-only credentials or sample exports for at least the primary datasource
[ ] Time blocked: 90 min uninterrupted, no Slack, no email

Things to verify in writing:
- You are not being asked to recommend a specific vendor (you are recommending the SDC ecosystem honestly, and the client knows you are an SDC Certified Practitioner)
- The deliverable is the Maturity Map report, not a full implementation plan
- Follow-on engagement is optional and scoped separately

3.3 The Interview

The interview is structured around the six dimensions but should not feel like a survey. Open with business context, then drill down.

Opening (10 min):
- "Tell me about the business and where data fits in."
- "What is the most painful data problem you have right now?"
- "If you could wave a wand, what would change?"

The pain question is critical. The map will eventually show a lot of red. You want the client to see their own pain reflected in the map, not feel ambushed by your scoring.

Per-dimension drilldown (10-12 min each):

For each dimension, ask 3-4 questions from the bank, listen, take notes, and assign a working score in the margin.
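As a quick check that this pacing fits the 90-minute session, the arithmetic can be sketched as below. The dimension names are taken from the question banks in this module; the function name `session_budget` is hypothetical and only illustrates the calculation.

```python
# Session 1 time budget: one opening block plus six per-dimension drilldowns.
DIMENSIONS = [
    "Schema Integrity",
    "Constraint Enforcement",
    "Semantic Identity",
    "Provenance",
    "Interoperability",
    "Governance",
]

SESSION_MIN = 90            # Session 1 length (section 3.1)
OPENING_MIN = 10            # opening block (section 3.3)
DRILLDOWN_RANGE = (10, 12)  # minutes per dimension, low and high end


def session_budget(per_dimension_min: int) -> dict:
    """Return planned minutes and remaining buffer for one pacing choice."""
    planned = OPENING_MIN + per_dimension_min * len(DIMENSIONS)
    return {"planned": planned, "buffer": SESSION_MIN - planned}


if __name__ == "__main__":
    for minutes in DRILLDOWN_RANGE:
        budget = session_budget(minutes)
        print(
            f"{minutes} min/dimension: {budget['planned']} min planned, "
            f"{budget['buffer']} min buffer"
        )
```

At 12 minutes per dimension the buffer is 8 minutes, so a single drilldown that overruns will use up the wrap-up time.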

Schema Integrity question bank

"If I asked your database administrator for a data dictionary, would I get one? When was it last updated?"
"When you add a new field to a form, what happens in the database? Who decides? Who tests it?"
"Have you ever had a report break because someone renamed a column? What did you do?"

Constraint Enforcement question bank

"What stops a user from entering a customer with no email?"
"If a value is supposed to be one of five categories, where is that list enforced?"
"When the rules change, how do you update old records?"

Semantic Identity question bank

"If I gave you a customer ID from your CRM and asked your accounting system to find the same customer, what would happen?"
"Have you ever had two records for the same person? How did you find out?"
"What happens to an ID after a record is deleted?"

Provenance question bank

"If a number on your dashboard is wrong, can you trace back to who entered it?"
"How long do you keep audit logs?"
"Can you tell me the value of [field X] on [date Y]?"

Interoperability question bank

"How do you share data with [partner / vendor / regulator]?"
"How long did your last integration take?"
"If a new partner asked for your data tomorrow, what format would you send it in?"

Governance question bank

"Who is responsible for data quality? What is their title?"
"When was the last data quality issue and how was it resolved?"
"Do you have any compliance reporting obligations?"

3.4 Artifact Review

During the async period, gather and review:

A copy of the data dictionary (or absence thereof)
A schema dump from the primary database (pg_dump -s, mysqldump --no-data, or similar)
A sample of 100 rows from each of 2-3 representative tables
A sample of the audit log (last 7 days)
Any existing data quality reports

Use SDC Agents SMB for rapid introspection:

sdc-agents introspect run \
  --datasource client_main \
  --output-format json

The 13-field column metadata returned by introspection answers half the assessment questions automatically:
- Nullable counts → Constraint Enforcement evidence
- Sample value variance → Schema Integrity evidence
- Identifier column detection → Semantic Identity evidence
- Detected anomalies → Direct evidence for the report

Privacy note: Run introspection in the client's environment and never copy raw data to your laptop. The SDC Agents SMB datasource access boundary is designed for this: it returns metadata only.
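One way to turn the introspection metadata into per-dimension evidence is sketched below. The JSON shape is an assumption made for illustration: the field names `table`, `column`, `null_count`, `is_identifier`, and `anomalies` are hypothetical and are not taken from the SDC Agents SMB output schema, so adjust them to whatever the tool actually returns. The sample rows are made up, and the sample-value-variance mapping is left out for brevity.

```python
import json
from collections import defaultdict

# HYPOTHETICAL input shape: these field names are assumptions for illustration,
# not the documented SDC Agents SMB output schema. The rows are invented.
SAMPLE_JSON = """
[
  {"table": "customers", "column": "email", "null_count": 42,
   "is_identifier": false, "anomalies": []},
  {"table": "customers", "column": "customer_id", "null_count": 0,
   "is_identifier": true, "anomalies": []},
  {"table": "orders", "column": "status", "null_count": 0,
   "is_identifier": false, "anomalies": ["mixed casing in values"]}
]
"""


def triage(columns):
    """Group column-level observations under the dimension they inform."""
    evidence = defaultdict(list)
    for col in columns:
        ref = f"{col['table']}.{col['column']}"
        if col.get("null_count", 0) > 0:
            evidence["Constraint Enforcement"].append(
                f"{ref}: {col['null_count']} nulls"
            )
        if col.get("is_identifier"):
            evidence["Semantic Identity"].append(f"{ref}: detected as identifier")
        for anomaly in col.get("anomalies", []):
            evidence["Report"].append(f"{ref}: {anomaly}")
    return dict(evidence)


if __name__ == "__main__":
    for dimension, items in triage(json.loads(SAMPLE_JSON)).items():
        print(dimension)
        for item in items:
            print(f"  - {item}")
```

The sketch reads metadata only, which is consistent with the privacy note: no row-level values are needed to produce this evidence list.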

3.5 Documenting Findings

For each dimension, capture:

Dimension: [name]
Working score: [1-5]
Evidence:
  - [Quote or observation 1]
  - [Quote or observation 2]
  - [Artifact reference]
Confidence: [High / Medium / Low]
Notes:
  - [Anything that might shift the score during validation]

Use the provided template, templates/findings_worksheet.md. Do not invent your own format. The downstream scoring tool expects this structure.
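If you take notes in a script or notebook before transferring them to the worksheet, a small record type can catch out-of-range scores early. This is a minimal sketch: the `Finding` class is hypothetical, it is not the downstream scoring tool's parser, and the rule that at least one evidence item is required is an assumption rather than something the template states.

```python
from dataclasses import dataclass, field

CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass
class Finding:
    """One per-dimension finding, mirroring the worksheet fields above."""
    dimension: str
    working_score: int
    evidence: list
    confidence: str
    notes: list = field(default_factory=list)

    def __post_init__(self):
        if not 1 <= self.working_score <= 5:
            raise ValueError("working score must be between 1 and 5")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
        # Assumption: a score with no supporting evidence is not acceptable.
        if not self.evidence:
            raise ValueError("record at least one piece of evidence")

    def render(self) -> str:
        """Render the finding in the worksheet's field order."""
        lines = [
            f"Dimension: {self.dimension}",
            f"Working score: {self.working_score}",
            "Evidence:",
            *[f"  - {item}" for item in self.evidence],
            f"Confidence: {self.confidence}",
            "Notes:",
            *[f"  - {note}" for note in self.notes],
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    example = Finding(
        dimension="Provenance",
        working_score=2,
        evidence=["Owner could not say who entered a disputed figure"],
        confidence="Medium",
        notes=["Confirm audit log retention with the technical lead"],
    )
    print(example.render())
```

Copy the rendered text into templates/findings_worksheet.md rather than treating this output as a replacement for the template.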

3.6 Handling Scope Ambiguity

You will encounter clients with 14 systems and no clear boundary. Pick a scope and stick to it. The map is for a defined data estate, not for "the company." If the client has 3 distinct lines of business, do 3 separate maps.

Red flags that scope is too broad:
- You cannot list the in-scope systems on one page
- Different stakeholders give wildly different answers to the same question
- The interview runs over 2 hours

If you hit any of these, pause and re-scope before continuing.

Module 3 Exercise

Read the fictional intake brief in exercises/case_atlas_legal.md. Draft an interview agenda and identify which 3 questions you would ask first in each dimension. Identify which artifacts you would request for the async review. Time yourself — this should take 30 minutes.