Module 3: Conducting an Assessment

Duration: ~90 minutes self-paced + 1 practice assessment
Prerequisites: Modules 1-2
Learning objectives:

  • Prepare for and run a structured client assessment interview
  • Use the standard question bank to elicit evidence per dimension
  • Conduct an artifact review against the dimension checklist
  • Use SDC Agents SMB to perform rapid datasource introspection during the engagement
  • Document findings in a format suitable for the scoring phase


3.1 The Assessment Engagement Shape

A standard assessment is 2 sessions over 1-2 weeks, plus async artifact review.

  • Session 1 (90 min): Structured interview with the client's data-aware stakeholder(s). Usually owner + ops lead or CTO + lead engineer.
  • Async (3-5 days): Artifact review. Schema dumps, sample exports, audit log samples. Optionally a live introspection run via SDC Agents SMB.
  • Session 2 (60 min): Findings presentation and scoring validation.

Total practitioner time: 8-12 hours including prep, review, and writeup.

Charge for it. Free assessments get treated as free advice. The structured deliverable is what makes this different from a "let's chat about your data" coffee meeting.


3.2 Pre-Engagement Checklist

Before Session 1:

  • [ ] Signed engagement letter with NDA
  • [ ] List of systems the client considers in scope (CRM, accounting, ops, EHR, etc.)
  • [ ] Identification of the right interviewees (owner + technical lead minimum)
  • [ ] Read-only credentials or sample exports for at least the primary datasource
  • [ ] Time blocked: 90 min uninterrupted, no Slack, no email

Things to verify in writing:

  • You are not being asked to recommend a specific vendor (you are recommending the SDC ecosystem honestly, and the client knows you are an SDC Certified Practitioner)
  • The deliverable is the Maturity Map report, not a full implementation plan
  • Follow-on engagement is optional and scoped separately


3.3 The Interview

The interview is structured around the six dimensions but does not feel like a survey. Open with business context, then drill down.

Opening (10 min):

  • "Tell me about the business and where data fits in."
  • "What is the most painful data problem you have right now?"
  • "If you could wave a wand, what would change?"

The pain question is critical. The map will eventually show a lot of red. You want the client to see their own pain reflected in the map, not feel ambushed by your scoring.

Per-dimension drilldown (10-12 min each):

For each dimension, ask 3-4 questions from the bank, listen, take notes, and assign a working score in the margin.
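The timings above can be checked against the 90-minute session with a quick budget calculation. This is a minimal sketch using only the figures from this module (10-minute opening, 10-12 minutes for each of the six dimensions); the function name and the idea of "slack" for wrap-up are illustrative, not part of the official method.

```python
# Interview time budget for Session 1, using the figures from this module.
SESSION_MIN = 90
OPENING_MIN = 10
DRILLDOWN_RANGE_MIN = (10, 12)  # minutes per dimension

DIMENSIONS = [
    "Schema Integrity",
    "Constraint Enforcement",
    "Semantic Identity",
    "Provenance",
    "Interoperability",
    "Governance",
]


def interview_budget(session=SESSION_MIN, opening=OPENING_MIN,
                     per_dimension=DRILLDOWN_RANGE_MIN, dimensions=DIMENSIONS):
    """Return (planned_min, planned_max, slack_min, slack_max) in minutes."""
    low, high = per_dimension
    planned_min = opening + low * len(dimensions)
    planned_max = opening + high * len(dimensions)
    # Slack is whatever remains for overruns and wrap-up.
    return planned_min, planned_max, session - planned_max, session - planned_min


planned_min, planned_max, slack_min, slack_max = interview_budget()
print(f"Planned: {planned_min}-{planned_max} min, slack: {slack_min}-{slack_max} min")
```

The plan uses 70-82 minutes, which leaves 8-20 minutes of slack. If one dimension runs long, decide during the session which other dimension gets shortened.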

Schema Integrity question bank

  • "If I asked your database administrator for a data dictionary, would I get one? When was it last updated?"
  • "When you add a new field to a form, what happens in the database? Who decides? Who tests it?"
  • "Have you ever had a report break because someone renamed a column? What did you do?"

Constraint Enforcement question bank

  • "What stops a user from entering a customer with no email?"
  • "If a value is supposed to be one of five categories, where is that list enforced?"
  • "When the rules change, how do you update old records?"

Semantic Identity question bank

  • "If I gave you a customer ID from your CRM and asked your accounting system to find the same customer, what would happen?"
  • "Have you ever had two records for the same person? How did you find out?"
  • "What happens to an ID after a record is deleted?"

Provenance question bank

  • "If a number on your dashboard is wrong, can you trace back to who entered it?"
  • "How long do you keep audit logs?"
  • "Can you tell me the value of [field X] on [date Y]?"

Interoperability question bank

  • "How do you share data with [partner / vendor / regulator]?"
  • "How long did your last integration take?"
  • "If a new partner asked for your data tomorrow, what format would you send it in?"

Governance question bank

  • "Who is responsible for data quality? What is their title?"
  • "When was the last data quality issue and how was it resolved?"
  • "Do you have any compliance reporting obligations?"

3.4 Artifact Review

During the async period, gather and review:

  • A copy of the data dictionary (or absence thereof)
  • A schema dump from the primary database (pg_dump -s, mysqldump --no-data, or similar)
  • A sample of 100 rows from each of 2-3 representative tables
  • A sample of the audit log (last 7 days)
  • Any existing data quality reports
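When a client supplies a 100-row sample as a CSV export, a quick profile of blank and distinct values per column gives you concrete evidence for the Constraint Enforcement questions. The following is a minimal sketch using only the Python standard library; the function name, the column names, and the sample rows are all made up for illustration. Run anything like this inside the client's environment.

```python
import csv
import io


def profile_sample(csv_text):
    """Return {column: {"rows", "blank", "distinct"}} for a CSV sample export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    stats = {name: {"rows": 0, "blank": 0, "values": set()} for name in columns}
    for row in reader:
        for name in columns:
            # Missing trailing fields come back as None; treat them as blank.
            value = (row.get(name) or "").strip()
            stats[name]["rows"] += 1
            if value == "":
                stats[name]["blank"] += 1
            else:
                stats[name]["values"].add(value)
    return {
        name: {"rows": s["rows"], "blank": s["blank"], "distinct": len(s["values"])}
        for name, s in stats.items()
    }


# Fabricated sample rows, for illustration only.
SAMPLE = """customer_id,email,status
1001,ana@example.com,active
1002,,Active
1003,li@example.com,ACTIVE
1004,,inactive
"""

for column, summary in profile_sample(SAMPLE).items():
    print(column, summary)
```

In this fabricated sample, two of four emails are blank and `status` has four distinct spellings for what are probably two categories. Both are the kind of observation that belongs in the evidence list for Constraint Enforcement.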

Use SDC Agents SMB for rapid introspection:

sdc-agents introspect run \
  --datasource client_main \
  --output-format json

The 13-field column metadata returned by introspection answers half the assessment questions automatically:

  • Nullable counts → Constraint Enforcement evidence
  • Sample value variance → Schema Integrity evidence
  • Identifier column detection → Semantic Identity evidence
  • Detected anomalies → Direct evidence for the report
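The mapping from introspection metadata to dimensions can be sketched as a small post-processing step over the JSON output. The JSON shape below is an assumption: the keys `columns`, `table`, `name`, `null_count`, `is_identifier`, and `anomalies` are invented for illustration and are not the documented SDC Agents SMB output, so check them against a real run before relying on this. Sample value variance is left out because this module does not describe how it is represented.

```python
import json

# HYPOTHETICAL output shape. Every field name here is invented for illustration.
EXAMPLE_OUTPUT = """
{
  "columns": [
    {"table": "customers", "name": "id", "null_count": 0,
     "is_identifier": true, "anomalies": []},
    {"table": "customers", "name": "email", "null_count": 37,
     "is_identifier": false, "anomalies": ["mixed_case_duplicates"]}
  ]
}
"""


def evidence_by_dimension(raw_json):
    """Group column metadata into evidence lines keyed by dimension."""
    data = json.loads(raw_json)
    evidence = {"Constraint Enforcement": [], "Semantic Identity": [], "Report": []}
    for col in data["columns"]:
        label = f'{col["table"]}.{col["name"]}'
        if col.get("null_count", 0) > 0:
            evidence["Constraint Enforcement"].append(
                f"{label}: {col['null_count']} nulls")
        if col.get("is_identifier"):
            evidence["Semantic Identity"].append(f"{label}: identifier column")
        for anomaly in col.get("anomalies", []):
            evidence["Report"].append(f"{label}: {anomaly}")
    return evidence


for dimension, lines in evidence_by_dimension(EXAMPLE_OUTPUT).items():
    print(dimension, lines)
```

Each resulting line is an artifact reference you can use as-is in the evidence list for that dimension.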

Privacy note: Run introspection in the client's environment; never copy raw data to your laptop. The SDC Agents SMB datasource access boundary is designed for this: it returns metadata only.


3.5 Documenting Findings

For each dimension, capture:

Dimension: [name]
Working score: [1-5]
Evidence:
  - [Quote or observation 1]
  - [Quote or observation 2]
  - [Artifact reference]
Confidence: [High / Medium / Low]
Notes:
  - [Anything that might shift the score during validation]

Use the provided template templates/findings_worksheet.md. Do not invent your own format. The downstream scoring tool expects this structure.
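If you keep notes electronically during the interview, a small record type can hold them and print them in the layout shown above. This is a sketch only: the `Finding` class and its `render` method are hypothetical helpers, and templates/findings_worksheet.md remains the authority on the exact format. The example values are made up.

```python
from dataclasses import dataclass, field

CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass
class Finding:
    dimension: str
    working_score: int
    evidence: list
    confidence: str
    notes: list = field(default_factory=list)

    def __post_init__(self):
        # Reject values the worksheet does not allow.
        if not 1 <= self.working_score <= 5:
            raise ValueError("working_score must be between 1 and 5")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")

    def render(self):
        """Return the finding as text in the worksheet layout."""
        lines = [
            f"Dimension: {self.dimension}",
            f"Working score: {self.working_score}",
            "Evidence:",
            *[f"  - {item}" for item in self.evidence],
            f"Confidence: {self.confidence}",
            "Notes:",
            *[f"  - {item}" for item in self.notes],
        ]
        return "\n".join(lines)


# Fabricated example, for illustration only.
example = Finding(
    dimension="Provenance",
    working_score=2,
    evidence=['"We keep logs for about a week"', "Audit log sample (last 7 days)"],
    confidence="Medium",
    notes=["Confirm log retention with the technical lead"],
)
print(example.render())
```

Validating the score and confidence at entry time catches typos before the worksheet reaches the scoring phase.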


3.6 Handling Scope Ambiguity

You will encounter clients with 14 systems and no clear boundary. Pick a scope and stick to it. The map is for a defined data estate, not for "the company." If the client has 3 distinct lines of business, do 3 separate maps.

Red flags that scope is too broad:

  • You cannot list the in-scope systems on one page
  • Different stakeholders give wildly different answers to the same question
  • The interview runs over 2 hours

If you hit any of these, pause and re-scope before continuing.


Module 3 Exercise

Read the fictional intake brief in exercises/case_atlas_legal.md. Draft an interview agenda and identify which 3 questions you would ask first in each dimension. Identify which artifacts you would request for the async review. Time yourself — this should take 30 minutes.