Module 6 Lab: Sample Data
Synthetic data for the Module 6 hands-on installation lab. All identifiers, names, phone numbers, and email addresses are fictional. Any resemblance to real persons or businesses is coincidental.
Files
- sample_csv/clients.csv: 20 client records simulating an Atlas Legal-style intake. Deliberately includes data quality problems so the introspection run produces interesting anomaly flags.
- sample_csv/matters.csv: 20 matter records linked to clients via client_id.
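Because matters.csv refers to clients.csv through client_id, a quick pre-flight check can confirm that every matter points at a known client before you run the lab. The following is a minimal sketch, not part of the lab tooling: the helper name orphan_matters, the matter_id column name, and the inline rows are illustrative assumptions standing in for the real files.

```python
import csv
import io


def orphan_matters(clients_rows, matters_rows, key="client_id"):
    """Return matter rows whose client_id has no matching client row."""
    known = {row[key] for row in clients_rows}
    return [row for row in matters_rows if row[key] not in known]


# Tiny inline stand-ins for the two files (illustrative only; the real
# files have 20 rows each and more columns than shown here).
clients_csv = "client_id,client_name\n1001,Maria Gonzalez\n1003,James O'Brien\n"
matters_csv = "matter_id,client_id\nM-0001,1001\nM-0002,1003\nM-0003,9999\n"

clients = list(csv.DictReader(io.StringIO(clients_csv)))
matters = list(csv.DictReader(io.StringIO(matters_csv)))

orphans = orphan_matters(clients, matters)
print([m["matter_id"] for m in orphans])
```

To run the same check against the lab files, replace the io.StringIO wrappers with open("sample_csv/clients.csv", newline="") and the matching call for matters.csv, adjusting column names to whatever the files actually use.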
Intentional data quality issues
The lab data is engineered to demonstrate the SDC Agents SMB anomaly detection. When you run introspection on clients.csv and matters.csv you should see flags including:
- near_duplicate_identifier on client_name: Maria Gonzalez appears as Maria Gonzalez, Maria Gonzales, and Maria E. Gonzalez (rows 1001, 1002, 1010). James O'Brien appears as James O'Brien, J. O'Brien, and James O Brien (rows 1003, 1008, 1018). Chen Wei appears as Chen Wei (twice) and Wei Chen, with email and phone matching across all three (rows 1005, 1006, 1014). Sandra Williams appears as both Sandra Williams and Sandra Williams-Hayes with the same email (rows 1015, 1016). Acme Holdings, Riverdale Cafe, and Patel Family Trust each appear in two slightly different forms.
- format_drift on phone: formats include (503) 555-0142, 503-555-0142, 5035550199, 503.555.0156, and (503)555-0156. Five different formats coexist.
- format_drift on case_number: formats include IM-2023-0001, SB2023-004, SB-2023-011, and IM2023-012. Two competing conventions.
- unparseable_dates on intake_date: row 1005 has 00/00/0000.
- mixed_types / null_count on phone: row 1007 has an empty phone field.
- outlier_count on billable_hours in matters.csv: matter M-0018 has 9999 billable hours, simulating a fat-finger error.
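To build intuition for why these values get flagged, the sketch below reproduces three of the checks by hand using only values quoted in the list above. It is an illustration under stated assumptions, not how SDC Agents SMB implements its detection: the function names are hypothetical, the shape-collapsing approach is one common way to spot format drift, and the MM/DD/YYYY date format is inferred from the 00/00/0000 example.

```python
import re
from collections import Counter
from datetime import datetime
from difflib import SequenceMatcher


def shape(value: str) -> str:
    """Collapse a value to its format shape: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def format_shapes(values):
    """Count values per shape; more than one shape suggests format drift."""
    return Counter(shape(v) for v in values if v)


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1], computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_parseable_date(text: str, fmt: str = "%m/%d/%Y") -> bool:
    """True if text parses under fmt (MM/DD/YYYY is an assumption)."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# Phone values quoted in the format_drift bullet above.
phones = ["(503) 555-0142", "503-555-0142", "5035550199",
          "503.555.0156", "(503)555-0156"]
phone_shapes = format_shapes(phones)
print(len(phone_shapes), "distinct phone shapes")

# Two spellings from the near_duplicate_identifier bullet.
print(name_similarity("Maria Gonzalez", "Maria Gonzales"))

# The intake_date value from row 1005.
print(is_parseable_date("00/00/0000"))
```

Plain string similarity catches spelling variants such as Gonzalez and Gonzales, but it would not pair Chen Wei with Wei Chen; that case is better explained by the matching email and phone noted above.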
These flags are direct evidence for a Maturity Map dimension scoring exercise. Trainees who run the lab on this data should be able to identify which dimension each flag supports without referring back to Module 2.
Lab procedure
- Install SDCforSMB on your laptop following Module 6 section 6.2
- Complete the onboarding wizard (use the Axius SDC training SDCStudio wallet credentials provided in the certification portal)
- Add a CSV datasource pointing at this sample_csv/ directory
- Run introspection
- Capture screenshots:
  - Wizard completion screen
  - Introspection result table showing the anomaly flags above
  - Assembly review screen showing reuse vs mint counts
  - Deployed application health endpoint after Generate Application
- Submit screenshots via the certification portal
What to look for in the assembly review
The first run on this data will propose components for: a Person (covering both clients and attorneys), a PhoneNumber, an EmailAddress, an IdentifierCode (covering case_number variations), a MoneyAmount (for total_billed_usd), and a DurationHours (for billable_hours). Of these, Person, PhoneNumber, and EmailAddress are very likely to match existing components in the Axius SDC reference catalog and be marked as reuse (free). IdentifierCode and the domain-specific types may require minting (billable). The expected wallet impact on the training wallet is $0 — the training wallet covers all minting for certified practitioners in good standing.