Module 2: SDCStudio — Your First Data Model

Duration: ~60 minutes self-paced (includes hands-on lab time)
Prerequisites: Module 1

Learning objectives:
- Create an SDCStudio account and fund your wallet
- Navigate the SDCStudio interface (projects, data sources, models, components)
- Configure settings and upload domain ontologies
- Upload a CSV file and observe the two-stage AI processing pipeline
- Review an AI-generated data model and understand what was created
- Publish a model and generate outputs in multiple formats

2.1 Getting Started with SDCStudio

SDCStudio is a cloud-based platform — there is nothing to install. Open your browser and navigate to sdcstudio.axius-sdc.com.

Creating your account and funding your wallet

Click Sign Up and create your account with an email address
Navigate to Wallet and fund it with the $10 minimum
Once your wallet is funded, the Practitioner Curriculum link appears in your user menu

The $10 is your own wallet funding — it stays yours and is used for component minting on your engagements. It is not a program fee.

2.2 Navigating the Interface

SDCStudio is a React single-page application. The main navigation provides access to:

Dashboard — overview of your projects and recent activity
Projects — create and manage projects (each project is a container for related models)
Data Sources — view uploaded files and processing status
Data Models — browse and edit your data models
Components — manage reusable data components
Settings — configure profile, ontologies, and preferences

Key interface features:
- Real-time updates: The interface refreshes automatically as AI processes your data
- Status indicators: Color-coded badges show processing progress (UPLOADING → PARSING → PARSED → AGENT_PROCESSING → COMPLETED)
- Contextual actions: Buttons and menus appear based on what you are viewing
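If you keep notes or a log while watching an upload, it can help to treat the status badges as an ordered sequence. The sketch below is only an illustration of that ordering: the status names come from the interface, while the helper functions `progress` and `is_finished` are hypothetical and not part of any SDCStudio API.

```python
# Status names as shown on the SDCStudio badges, in the order they appear.
STATUS_ORDER = ["UPLOADING", "PARSING", "PARSED", "AGENT_PROCESSING", "COMPLETED"]


def progress(status):
    """Return how far along the pipeline a status is, as a fraction of 1.0."""
    return (STATUS_ORDER.index(status) + 1) / len(STATUS_ORDER)


def is_finished(status):
    """True only for the final status in the sequence."""
    return status == STATUS_ORDER[-1]


if __name__ == "__main__":
    for name in STATUS_ORDER:
        print(f"{name:<17} {progress(name):.0%} finished={is_finished(name)}")
```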

2.3 Configure Your Settings (Do This First)

Before creating models, configure your profile and upload any domain-specific ontologies.

Upload domain ontologies (optional but recommended)

Standard ontologies — FHIR, NIEM, SNOMED CT, LOINC, schema.org — are already built into SDCStudio. You only need to upload your organization's custom or local domain ontologies.

Click Settings → Ontologies tab
Prepare your custom ontology files in Turtle (.ttl) format
Click Upload Ontology, select your file, add metadata (name, description, namespace URI)
Save

Your custom ontologies are now available for AI processing and semantic enrichment during model creation.
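For readers who have not written Turtle before, here is a minimal sketch of what a small custom ontology file might contain, held in a Python string so the namespace can be checked before upload. Everything in it is an assumption made for illustration: the `ex:` namespace, the `Client` class, and the `clientEmail` property are invented, and the `declared_prefixes` helper is a simple regex check, not a full Turtle parser.

```python
import re

# Hypothetical ontology content; replace the ex: namespace and terms with your own.
ONTOLOGY_TTL = """\
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <https://example.org/ontology/clients#> .

ex:Client a owl:Class ;
    rdfs:label "Client" ;
    rdfs:comment "A person or organization that engages the firm." .

ex:clientEmail a owl:DatatypeProperty ;
    rdfs:domain ex:Client ;
    rdfs:label "client email" .
"""


def declared_prefixes(ttl):
    """Map each @prefix name to its namespace URI (regex-based, not a real parser)."""
    return dict(re.findall(r"@prefix\s+(\w*):\s*<([^>]*)>\s*\.", ttl))


if __name__ == "__main__":
    for prefix, uri in declared_prefixes(ONTOLOGY_TTL).items():
        print(f"{prefix}: {uri}")
```

The URI bound to your own prefix (here `ex:`) is the value you would enter as the namespace URI in the upload metadata.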

Why this matters: The AI uses your uploaded ontologies to make better suggestions during processing. Better ontologies produce better models. This is the minimum knowledge modeling principle in practice — you provide the domain expertise, the AI applies the structural patterns.

2.4 Create Your First Project

Navigate to Projects
Click Create New Project
Fill in: name (e.g., "Customer Analytics"), description, domain
Click Create Project

A project is a container for related data models, components, and data sources. Think of it as a workspace for a specific engagement or use case.

2.5 Upload Data and Watch AI Processing

This is where SDCStudio demonstrates its core value. Upload a data file and watch the two-stage AI pipeline transform it into a structured, constraint-bound data model.

Upload

Open your project
Navigate to Data Sources tab
Click Upload Data
Choose your file — CSV is recommended for your first attempt (5-10 columns, clean headers)
Click Upload
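Before uploading, you may want to check that your file fits the "5-10 columns, clean headers" guidance. The module does not define "clean", so the sketch below assumes one reasonable reading: headers start with a letter, contain only letters, digits, and underscores, and are not duplicated. The `check_headers` function is a hypothetical helper, not an SDCStudio feature.

```python
import csv
import io
import re

# Assumed definition of a "clean" header: letter first, then letters, digits, underscores.
CLEAN_HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def check_headers(csv_text):
    """Return a list of human-readable problems found in the CSV header row."""
    headers = next(csv.reader(io.StringIO(csv_text)), [])
    problems = []
    if not 5 <= len(headers) <= 10:
        problems.append(f"{len(headers)} columns; 5-10 suggested for a first upload")
    seen = set()
    for header in headers:
        if not CLEAN_HEADER.match(header):
            problems.append(f"header {header!r} has spaces, symbols, or a leading digit")
        if header.lower() in seen:
            problems.append(f"header {header!r} is duplicated")
        seen.add(header.lower())
    return problems


if __name__ == "__main__":
    sample = "client_id,name,email,phone,signup_date\n1,Ann,ann@example.org,555-123-4567,2024-01-05\n"
    print(check_headers(sample) or "headers look clean")
```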

Stage 1: Structural Parsing (30 seconds to 2 minutes)

Status: UPLOADING → PARSING → PARSED

The platform detects file format, identifies columns/fields, and infers basic data types (XdString, XdCount, XdTemporal, etc.). Structure is mapped for the AI analysis stage.
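To build intuition for what "infers basic data types" can mean, here is a deliberately simplified sketch of column-level type inference. It is not SDCStudio's algorithm, which the module does not describe. The rules used here (whole numbers map to XdCount, ISO dates map to XdTemporal, everything else maps to XdString) are assumptions chosen for illustration.

```python
from datetime import date


def _is_iso_date(value):
    """True if the value parses as an ISO calendar date such as 2024-01-05."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Guess an SDC4 type name for a column from its string values (illustrative rules)."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "XdString"  # nothing to go on, so fall back to text
    if all(v.isascii() and v.isdigit() for v in non_empty):
        return "XdCount"  # assumption: non-negative whole numbers are counts
    if all(_is_iso_date(v) for v in non_empty):
        return "XdTemporal"
    return "XdString"


if __name__ == "__main__":
    columns = {
        "open_matters": ["1", "2", "30"],
        "signup_date": ["2024-01-05", "2023-12-31"],
        "name": ["Ann", "Bob"],
    }
    for name, values in columns.items():
        print(name, "->", infer_type(values))
```

A real inference step has to deal with cases this sketch ignores, such as numeric identifiers that should stay text (postal codes, account numbers). These are the kinds of assignments worth reviewing once the model is generated.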

Stage 2: AI Enhancement (1-5 minutes)

Status: AGENT_PROCESSING → COMPLETED

The AI performs:
- Semantic analysis: understands what each field represents
- Pattern recognition: identifies email patterns, phone formats, date formats, etc.
- Ontology matching: uses your uploaded ontologies (and built-in standards) for concept suggestions
- Validation rules: recommends appropriate constraints (regex patterns, ranges, enumerations)
- Relationship detection: finds logical groupings and connections between fields
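The pattern recognition and validation rule steps can be pictured as "test candidate patterns against a column, then propose the one that fits". The sketch below shows that idea with plain regular expressions. It is an assumption-laden illustration, not the platform's method: the three patterns, the 90% threshold, and the `suggest_pattern` function are all hypothetical.

```python
import re

# Hypothetical candidate patterns; a real system would use many more.
CANDIDATE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
    "iso_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def suggest_pattern(values, threshold=0.9):
    """Return the first pattern name matching at least `threshold` of non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for name, pattern in CANDIDATE_PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            return name
    return None


if __name__ == "__main__":
    print(suggest_pattern(["ann@example.org", "bob@example.com"]))
    print(suggest_pattern(["555-123-4567", "(555) 123-4567"]))
    print(suggest_pattern(["Ann", "Bob"]))
```

A suggested pattern like this would become a regex constraint on the component, which you can then accept, loosen, or replace during review.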

The interface updates automatically as processing progresses.

2.6 Review Your Generated Data Model

Once status shows COMPLETED:

Navigate to Data Models tab in your project
Click on your generated model (named after your uploaded file)
Explore what the AI created:
Data Model: the top-level structure
Clusters: logical groupings of related fields
Components: individual data elements with types, validation rules, and semantic links

The AI has created SDC4-compliant components, each with:
- An appropriate data type from the SDC4 type hierarchy
- Validation rules (pattern matching, range constraints, required fields)
- Semantic enrichment (descriptions and labels informed by your ontologies)
- Logical groupings in clusters

You can refine any component by clicking on it and modifying properties — data type, validation rules, labels, descriptions, required/optional status. The AI's work is a starting point, not a final product. Domain expertise is yours.

2.7 Publish and Generate Outputs

Publish your model

In your Data Model view, click Publish
Review the model summary
Confirm publication
Status changes to PUBLISHED

Publishing makes your model available for output generation and locks the current version. You can always create a new version later.

Generate outputs

Once published, you can generate outputs in any of 8 formats:

Format	What it provides
XSD Schema	XML Schema Definition for structural validation
XML Instance	Example XML document conforming to the schema
JSON Schema	JSON Schema for validating JSON instances of the model
JSON-LD	Linked data representation for semantic web integration
HTML Documentation	Human-readable documentation of the model
RDF Triples	Semantic web graph data
SHACL	RDF constraint shapes for graph-level validation
GQL	Graph database query statements

To generate: click the Generate dropdown → select output type → configure options → click Generate → download.

This is the moment where the SDC4 specification becomes concrete. One model, authored once, produces 8 interoperable output formats. The data carries its own constraints, identity, and semantic context in every format.

2.8 What You Just Did

In about an hour, you:

Created an SDCStudio account and funded your wallet
Uploaded a data file (CSV, Markdown, or JSON)
Watched the AI build a constraint-bound, semantically enriched data model
Reviewed the generated components, clusters, and validation rules
Published the model
Generated outputs in multiple interoperable formats

Every output you generated carries structural constraints (XSD 1.1), semantic identity (CUID2 identifiers), and vocabulary bindings — the same properties that make SDC data self-describing across system boundaries. This is the foundation for everything you will learn in the remaining modules.

Module 2 Exercise

Using the sample data from lab/sample_csv/clients.csv (the Atlas Legal case study data):

Create a new project in SDCStudio called "Atlas Legal Lab"
Upload clients.csv
Observe the two-stage processing pipeline
Review the generated model — how many components were created? What types were assigned?
Compare the AI's type assignments to what you would have chosen manually. Where does the AI get it right? Where would you override?
Publish the model and generate the XSD Schema output
Open the XSD and identify the constraint rules the AI embedded
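If you prefer to inspect the XSD programmatically rather than by eye, the sketch below lists the constraining facets found inside each `xs:restriction` using only the Python standard library. The embedded `SAMPLE_XSD` is a hand-written, hypothetical fragment, not actual SDCStudio output, so the structure of your generated schema will differ. The `list_facets` helper is likewise an illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical fragment for demonstration; swap in the text of your downloaded XSD.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="EmailType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^@]+@[^@]+"/>
      <xs:maxLength value="254"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="StatusType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="active"/>
      <xs:enumeration value="closed"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def list_facets(xsd_text):
    """Return (base type, facet name, facet value) for every child of an xs:restriction."""
    root = ET.fromstring(xsd_text)
    found = []
    for restriction in root.iter(f"{XS}restriction"):
        base = restriction.get("base")
        for facet in restriction:
            facet_name = facet.tag.split("}")[-1]  # strip the namespace part
            found.append((base, facet_name, facet.get("value")))
    return found


if __name__ == "__main__":
    for base, facet_name, value in list_facets(SAMPLE_XSD):
        print(f"{base}: {facet_name} = {value}")
```

To try it on your own file, read the downloaded XSD into a string and pass it to `list_facets`. Children of a restriction that are not simple facets will appear with a value of `None`.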

This exercise takes approximately 20 minutes. No quiz — the hands-on experience is the learning.