SDCStudio Overview — From CSV to Published Schema

Reading time: ~15 minutes | Level: Beginner Prerequisites: Tutorial 1: What is SDC?

What You'll Learn

What SDCStudio does and how it fits into the SDC ecosystem
How to navigate the dashboard and main interface
The complete end-to-end workflow from file upload to generated outputs
Key concepts: Projects, Models, Clusters, Components
Where to go next for deeper learning

What SDCStudio Does

SDCStudio is an AI-powered web application that helps you create SDC4-compliant data models. You give it a data file (like a CSV), and it:

Parses the file structure — columns, types, patterns
Analyzes the data using AI agents — semantic meaning, validation rules, relationships
Generates a complete data model — SDC4 components with appropriate types and constraints
Publishes the model as an immutable, versioned schema
Produces output files — XSD, XML, JSON, JSON-LD, HTML, RDF, SHACL, GQL

You review and refine the AI's work at each step. The goal is to go from raw data to a published, multi-format schema with embedded governance and semantics.

Dashboard overview with quick-action cards and recent activity

When you sign in, you see the React SPA interface with these main areas:

Sidebar navigation showing all main sections

Main Navigation: - Dashboard — Overview of your projects and recent activity - Projects — Create and manage project containers - Data Sources — View uploaded files and their processing status - Data Models — Browse and edit your data models - Components — Manage individual data components - Settings — Configure profile, upload ontologies, set preferences

Key Interface Features: - Real-time updates — Status badges refresh automatically as AI processes your data - Color-coded status — Blue for in-progress, yellow for processing, green for complete, red for errors - Contextual actions — Buttons appear based on what you are viewing and its current state

The End-to-End Workflow

Here is the complete flow from a CSV file to published outputs. Each step links to a dedicated tutorial for details.

Step 1: Configure Settings (Do This First)

Before uploading any data, configure your Settings:

Click the Settings icon in the top navigation
Go to the Ontologies tab
Upload your organization's custom ontologies in Turtle (.ttl) format

Standard ontologies (FHIR, SNOMED, NIEM, schema.org) are already built into SDCStudio. You only need to upload your own domain-specific vocabularies. Uploading ontologies before processing data significantly improves AI suggestions.

Deeper dive: Tutorial 8: Semantic Linking

Step 2: Create a Project

Projects organize related data models, components, and data sources.

Navigate to Projects
Click Create New Project
Enter a name (e.g., "Customer Analytics"), description, and domain
Click Create

Deeper dive: Tutorial 10: Working with Models and Projects

Step 3: Upload a Data File

File upload page with drag-and-drop zone

Open your project
Go to the Data Sources tab
Click Upload Data
Select a CSV or Markdown file
Click Upload

Your file appears in the list with a blue UPLOADING status badge.

Deeper dive: Tutorial 3: Uploading Data and AI Processing

Step 4: Watch AI Processing

Data sources list showing uploaded files with status indicators

SDCStudio processes your file in two stages:

Stage 1 — Structural Parsing (30 seconds to 2 minutes):

UPLOADED → PARSING → PARSED

The system reads the file, identifies columns, and infers basic data types. No AI/LLM calls happen here — this is fast, deterministic parsing.

Stage 2 — AI Enhancement (1 to 5 minutes):

PARSED → AGENT_PROCESSING → COMPLETED

AI agents analyze each column for semantic meaning, detect patterns (email, phone, date formats), suggest validation rules, and create SDC4 components with appropriate types.

The key benefit of the two-stage design: if the AI stage fails, you can retry without re-parsing the file.

Deeper dive: Tutorial 3: Uploading Data and AI Processing

Step 5: Review the Generated Model

Model detail page showing components, clusters, and metadata

Once the status shows COMPLETED:

Navigate to the Data Models tab in your project
Click your model (named after your uploaded file)
Explore the structure:
Data Model — top-level container
Cluster — logical grouping of all fields
Components — individual data elements

For a CSV with columns customer_id, name, email, signup_date, status, the AI creates:

Data Model: customers
└── Cluster: customers_cluster
    ├── customer_id  (XdCount — non-negative integer)
    ├── name         (XdString — text)
    ├── email        (XdString — with email pattern validation)
    ├── signup_date  (XdTemporal — ISO 8601 date)
    └── status       (XdString — enumeration: active/inactive)

Each component includes a label, description, data type, and validation rules that the AI inferred from your data.

Deeper dive: Tutorial 4: Understanding Components and Data Types

Step 6: Edit a Component

You can refine any component the AI created:

Navigate to Components or click a component in the model view
Click the Edit button (only visible for unpublished components)
The stepped edit form opens with 5 steps:
Basic Info — label, description, language
Type Config — type-specific settings (regex, range, enumerations)
Advanced Settings — sequence number, UI type, validation flags
Semantic Linking — RDF predicate-object pairs
Review & Save — summary with Save and Save & Publish buttons

In edit mode, you can navigate freely between steps — click any step in the stepper to jump to it.

Deeper dive: Tutorial 6: Editing Components

Step 7: Publish the Model

Publishing is a one-time, irreversible operation that locks the model and enables output generation.

Publication order matters: 1. Publish all child components first (Save & Publish on each) 2. Publish the data model 3. Generate the package (also one-time only)

After publication, the model and its components cannot be edited. If you need changes, create a new version.

Deeper dive: Tutorial 9: Publishing and Generating Outputs

Step 8: Download Outputs

After generating the package, download individual files or the complete ZIP:

File	Purpose
`dm-{id}.xsd`	XML Schema for validation
`dm-{id}.xml`	XML instance with sample data
`dm-{id}-instance.json`	JSON instance data
`dm-{id}.jsonld`	JSON-LD semantic schema
`dm-{id}.html`	Human-readable documentation
`dm-{id}.rdf`	Extracted RDF triples
`dm-{id}.ttl`	SHACL constraint shapes
`dm-{id}.gql`	GQL CREATE statements
`{title}.zip`	Complete package

Deeper dive: Tutorial 9: Publishing and Generating Outputs

Key Concepts

Projects

Projects are top-level containers that organize related work. A project can hold multiple data models, each with its own components and clusters. Use projects to group models by domain, department, or initiative.

Data Models

A data model is the top-level SDC4 structure. It has a name, description, and status (Draft or Published). A model contains a Data cluster that groups all components, plus optional sections for Subject, Provider, Participation, Workflow, Attestation, Audit, and Links.

Clusters

Clusters group related components into logical units. A "Customer" cluster might contain name, email, and phone components. Clusters are flat — all components are direct children of the cluster.

Components

Components are individual data elements — the building blocks of your model. Each component has: - A type (XdString, XdCount, XdTemporal, etc.) - A label (human-readable name) - A description (purpose and usage) - Validation rules (patterns, ranges, enumerations) - Semantic links (RDF predicate-object pairs to ontology concepts)

Components are the individual building blocks of a data model. They represent individual fields in your data.

Summary

The SDCStudio workflow is:

Configure Settings and upload ontologies
Create a project
Upload a data file
Watch the two-stage AI pipeline create a model
Review and refine components
Publish (irreversible) — children first, then model
Generate the package (one-time)
Download outputs in multiple formats

At each step, you control the process. The AI gives you a strong starting point; you refine it to match your exact requirements.

Next Tutorial

Uploading Data and AI Processing — Learn the details of file upload, supported formats, and what happens during each stage of AI processing.