Module 6: SDCforSMB Deployment

Duration: ~120 minutes self-paced + 1 hands-on lab
Prerequisites: Modules 1-5, basic Linux/Docker familiarity

Learning objectives:
  • Install and configure SDCforSMB on a Linux host or laptop
  • Complete the onboarding wizard end-to-end
  • Connect each supported datasource type
  • Run an introspection and interpret the results
  • Walk through an assembly review and approval
  • Deploy a generated application against the lightweight or enterprise stack
  • Establish ongoing monitoring and maintenance habits


6.1 System Requirements

Minimum (lightweight stack):
  • Linux host (Ubuntu 22.04+ or equivalent), macOS, or Windows with WSL2
  • 4 vCPU, 8 GB RAM, 50 GB disk
  • Docker 24+ and docker compose v2
  • Outbound network access for SDCStudio wallet operations and Ollama model pulls

Recommended (enterprise stack):
  • 8 vCPU, 16 GB RAM, 200 GB disk
  • Persistent volume backups configured
  • Reverse proxy with TLS (nginx, Caddy, Traefik)

Not supported: Air-gapped environments. Use Sovereign deployment for those.


6.2 Installation

# Clone the deployment repo
git clone https://github.com/SemanticDataCharter/SDCforSMB.git
cd SDCforSMB

# Copy the example env
cp .env.example .env
# Edit .env: set SECRET_KEY, ALLOWED_HOSTS, SDCSTUDIO_API_KEY

# Bring up the stack
docker compose up -d

# Verify
curl http://localhost:9000/health

The stack listens on port 9000 by default. The first launch takes 2-5 minutes as Ollama pulls the default model (gemma4:e4b or whichever you configured).
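Because the first launch can take several minutes, a single `curl` may fail even on a healthy install. A minimal polling sketch (assuming only that `/health` returns HTTP 200 once the stack is ready, as the verification step above implies):

```python
import time
import urllib.error
import urllib.request

def wait_for_health(url: str, timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # stack still starting; retry after a short pause
        time.sleep(interval)
    return False

# First launch may take 2-5 minutes while Ollama pulls the model:
# wait_for_health("http://localhost:9000/health", timeout=600)
```

Useful in provisioning scripts that need to block until the stack is actually serving before proceeding to the wizard.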


6.3 Onboarding Wizard

Browse to http://localhost:9000. The wizard runs in 5 steps:

  1. SDCStudio connection: Paste the URL and API key from the client's SDCStudio wallet. Click "Test Connection." A green check means the wallet is reachable and has a non-zero balance.

  2. Stack choice: Lightweight or Enterprise. For most SMB clients, choose Lightweight. Enterprise is for clients with 5+ datasources or multi-user requirements who will not migrate to SDCStudio SaaS.

  3. Ollama configuration: URL (default http://localhost:11434) and model (default ollama_chat/gemma4:e4b). The model is used for assembly suggestions and natural-language datasource descriptions. Click "Test Connection."

  4. First datasource: Pick a type and provide credentials. The wizard supports CSV, SQL (PostgreSQL/MySQL/SQLite), JSON, MongoDB, Notion, Google Sheets, and Airtable. Each type has its own credential form.

  5. Notifications (optional): Slack webhook, Telegram bot token, or SMTP for completion alerts. Skip for laptop installs.

On submission the wizard writes sdc-agents.yaml, creates the database records, and redirects to the dashboard.
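This module does not document the exact schema of sdc-agents.yaml. Purely as an illustration of how the wizard's five steps might map onto that file, a sketch (all key names are assumptions, not authoritative):

```yaml
# Illustrative only -- the actual sdc-agents.yaml keys may differ
sdcstudio:
  url: https://studio.example.com
  api_key: ${SDCSTUDIO_API_KEY}    # resolved from .env, never committed
stack: lightweight
ollama:
  url: http://localhost:11434
  model: ollama_chat/gemma4:e4b
datasources:
  - name: sales-csv
    type: csv
    path: /data/sales/
notifications: {}                  # skipped for laptop installs
```

Inspecting the generated file after onboarding is a quick way to confirm what the wizard actually recorded.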


6.4 Connecting Each Datasource Type

CSV: Path to a directory of CSV files. SDC Agents SMB will discover and introspect each file.

SQL (PostgreSQL/MySQL): Connection string. Use a read-only role. The introspection tools never issue writes.

SQLite: File path. Often the simplest demo target.

JSON: Path to a JSON file or directory. Nested structures are flattened during introspection with dotted-path column names.
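The flattening behavior can be sketched as follows. The exact separator and the handling of arrays are not specified in this module, so treat the list-index notation below as an assumption:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into dotted-path column names."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path))
    elif isinstance(obj, list):
        # Array handling is an assumption; the introspector may differ.
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{i}]"))
    else:
        flat[prefix] = obj
    return flat

record = {"customer": {"name": "Acme", "address": {"city": "Oslo"}}, "total": 99.5}
print(flatten(record))
# {'customer.name': 'Acme', 'customer.address.city': 'Oslo', 'total': 99.5}
```

The dotted paths become the column names that later appear in the introspection result table.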

MongoDB: Connection URI + database name. Collections are introspected as logical tables.

Notion: Integration token + workspace. Databases are introspected; pages are not.

Google Sheets: OAuth via the wizard, or service account JSON. Each sheet is treated as a separate datasource.

Airtable: API key + base ID. Tables are introspected with their field metadata.

Privacy guarantee: All credentials are stored encrypted in the SDCforSMB SQLite database. They are redacted from audit logs. Introspection results contain metadata only — no row-level data is cached unless the client explicitly opts in via the assembly review.


6.5 Running an Introspection

From the datasource detail page:

  1. Click "Run Introspection"
  2. Wait 30 seconds to 5 minutes, depending on datasource size
  3. Review the result table, which displays the 13-field column metadata for each column: name, type, nullable, sample_values, distinct_count, null_count, min, max, mean, detected_unit, detected_format, anomaly_flags, suggested_label

Anomaly flags are the most useful column for assessment work. They surface issues such as:
  • mixed_types — values in the column have inconsistent types
  • unparseable_dates — a date-typed column contains values that cannot be parsed
  • near_duplicate_identifier — the column looks like an identifier but contains duplicates
  • format_drift — formats vary within the column
  • outlier_count — statistical outliers detected
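The product's actual detection logic is not documented here, but as a rough illustration of how a flag like mixed_types could be derived, one can compare inferred value types across a column:

```python
def detect_mixed_types(values):
    """Toy check: flag a column whose non-null values parse to more than one basic type."""
    def classify(v):
        if v is None or v == "":
            return None          # nulls don't count toward the type mix
        try:
            float(v)
            return "number"
        except (TypeError, ValueError):
            return "string"
    kinds = {classify(v) for v in values} - {None}
    return ["mixed_types"] if len(kinds) > 1 else []

print(detect_mixed_types(["12", "15.5", "n/a", "9"]))   # ['mixed_types']
print(detect_mixed_types(["12", "15.5", "9"]))          # []
```

The "n/a" sentinel is exactly the kind of value that makes this flag fire in real client spreadsheets, which is why it doubles as Maturity Map evidence.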

Each flag is direct evidence for a Maturity Map dimension. Screenshot the table for the report.


6.6 Assembly Review Workflow

After introspection, click "Discover Components." The assembly toolset consults the catalog to find matching existing components (free reuse) and proposes new ones (billable mint).

The assembly review screen shows:
  • Reuse count (free) — components matched by signature to existing catalog entries
  • Mint count (billable) — new components that will deduct from the wallet
  • Estimated cost — wallet impact in tokens
  • Wallet balance — current balance for confirmation
  • Component list — each entry expandable to show the proposed name, signature, and source columns
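A toy sketch of how those review-screen numbers roll up. Per-component token pricing comes from SDCStudio and is not documented in this module, so the flat tokens_per_mint rate and the field names below are placeholders:

```python
def assembly_summary(components, wallet_balance, tokens_per_mint=1):
    """Summarize reuse vs mint counts and estimated wallet impact (illustrative)."""
    reuse = sum(1 for c in components if c["matched_in_catalog"])
    mint = len(components) - reuse
    cost = mint * tokens_per_mint      # placeholder flat rate, not real pricing
    return {
        "reuse_count": reuse,          # free
        "mint_count": mint,            # billable
        "estimated_cost": cost,
        "wallet_balance": wallet_balance,
        "sufficient_funds": wallet_balance >= cost,
    }

components = [
    {"name": "customer_id", "matched_in_catalog": True},
    {"name": "order_total", "matched_in_catalog": False},
    {"name": "order_date",  "matched_in_catalog": False},
]
print(assembly_summary(components, wallet_balance=10))
```

The point to convey to clients: only the mint count costs anything, so reuse-heavy assemblies are nearly free.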

Review carefully. Reject and re-introspect if the AI mislabeled something. Approve when satisfied. Approval triggers the catalog to mint the new components and bind them to the assembly manifest.

Best practice: Run the first assembly in front of the client during Session 2 of the engagement. Watching their data become identified components is the moment the framework becomes real to them.


6.7 Generated Application Deployment

Once an assembly is approved, the next step is application generation. This happens through SDCStudio's AppGen API: SDCforSMB orchestrates the request, but the generation itself runs in SDCStudio (cloud or Sovereign). The generated application bundle is then downloaded to the client's local environment.

How the pipeline works:

  1. SDCforSMB sends the approved assembly manifest to SDCStudio via the AppGen API
  2. SDCStudio generates the complete application bundle (Django app with database, API, validation, audit logging)
  3. The bundle is downloaded to the client's SDCforSMB host
  4. The practitioner deploys the bundle locally using Docker

Note: If the SDCforSMB console does not yet have a UI button for this step, the practitioner can trigger app generation directly through SDCStudio's web interface or via the SDC Agents CLI (sdc-agents assembly generate). The generated bundle is the same regardless of which interface triggers it.
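The AppGen API's request schema is not documented in this module; as an assumption-laden sketch, the orchestration step in pipeline stage 1 amounts to posting the approved manifest as JSON (field names like assembly_manifest and bundle_format below are illustrative, not the real API):

```python
import json

def build_appgen_request(manifest, callback_url=None):
    """Assemble a hypothetical AppGen request body from an approved manifest.

    Field names are illustrative; consult the SDCStudio AppGen API
    reference for the real schema.
    """
    body = {
        "assembly_manifest": manifest,
        "bundle_format": "docker",     # assumed option for local Docker deploys
    }
    if callback_url:
        body["callback_url"] = callback_url  # e.g. for a completion webhook
    return json.dumps(body)

manifest = {"assembly_id": "demo-001", "components": ["customer_id", "order_total"]}
print(build_appgen_request(manifest))
```

Whichever interface sends it (console button, SDCStudio web UI, or the CLI), the manifest is the single input that determines the generated bundle.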

What the generated bundle includes:

  • Validators (XSD 1.1 + Schematron + SHACL)
  • REST and JSON-LD endpoints
  • Audit log integration
  • Optional UI scaffold
  • Context graph outputs (RDF/OWL/SHACL) with governance components structurally present

For lightweight stack clients, the bundle deploys to the same SDCforSMB host. For enterprise clients, it deploys to a separate Docker host or Kubernetes cluster.

Walk the client through the first deployment. Demonstrate that the validators reject bad data and accept good data. This is the closing moment of the implementation engagement.
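The real validators are the generated XSD 1.1/Schematron/SHACL artifacts, not Python. Purely to illustrate the accept/reject demonstration you run for the client, a toy stand-in with two invented rules:

```python
def validate_record(record):
    """Toy stand-in for the generated validators: collect rule violations.

    The two rules here are invented for illustration; the generated
    application enforces whatever the assembly manifest defines.
    """
    errors = []
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    total = record.get("order_total")
    if not isinstance(total, (int, float)) or total < 0:
        errors.append("order_total must be a non-negative number")
    return errors

good = {"customer_id": "C-100", "order_total": 49.9}
bad = {"customer_id": "", "order_total": "lots"}
print(validate_record(good))  # []
print(validate_record(bad))   # two violations
```

In the live demo, the same contrast is shown by posting a good and a bad record to the generated REST endpoint and letting the client watch one succeed and one be rejected.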


6.8 Ongoing Maintenance

Tell the client to expect:
  • Weekly: Glance at the dashboard for drift alerts and audit anomalies (2 minutes)
  • Monthly: Review new annotations and resolve any open ones (15 minutes)
  • Quarterly: Re-run introspection on changed datasources (30 minutes)
  • Annually: Re-run the Maturity Map assessment with you (paid engagement)

Set a calendar reminder before you leave the engagement. A client who does not look at the system for six months will never look at it again.


Module 6 Lab

Install SDCforSMB on your own laptop. Connect a CSV datasource (use the provided sample in lab/sample_csv/). Run introspection. Approve a small assembly with the demo SDCStudio wallet credentials. Generate an application via SDCStudio (either through the SDCforSMB console if the AppGen button is available, or directly through the SDCStudio web interface). Deploy the generated bundle. Submit screenshots of:

  1. The wizard completion screen
  2. The introspection result table
  3. The assembly review screen showing reuse vs mint
  4. The generated application's health endpoint running on your laptop

Submit via the certification portal for review.