Data Examples

Inspect the file shapes behind each workflow stage.

These CSV and JSON examples are synthetic, but they preserve the column names, table boundaries, and export shapes used across the AXP workflow. The value-lifecycle diagram links directly into the sections below, so a collaborator can click a node and land on the matching file structure without needing live survey access.

How to read these examples

Datatype labels are explicit.

Each example calls out whether the shape is a CSV source file, an in-memory R data.frame, a MariaDB table export, or a wide distribution CSV.

The values are mock.

No participant data was copied into this repo. These files exist only to explain shape, naming, and the public/private split that collaborators see in the workflow and OSF bundles.

Use the viewer for wide files.

Each section links to a dedicated full-screen viewer so the raw layout stays readable when an inline preview is too small.

1. Definition and runtime

The questionnaire starts as a stacked multi-language table. After loading, the runtime keeps both the full multilingual table and a selected-language view, plus metadata such as instrument version and definition hash. DB-backed submissions also snapshot the normalized questionnaire definition by definition_hash, so later analysis is not dependent on a mutable Google Sheet tab.
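
A minimal R sketch of that load step, assuming the column names suggested by the mock files (language, item_id) and the digest package for hashing; the production loader is not published here:

```r
library(dplyr)
library(digest)

# Stacked multi-language definition: one row per questionnaire item per language.
definition_all <- read.csv("mock-data/questionnaire_definition.csv",
                           stringsAsFactors = FALSE)

# The runtime keeps the full multilingual table and a selected-language view side by side.
selected_language <- "en"                                       # assumed language code
definition_selected <- filter(definition_all, language == selected_language)

# Snapshot key for DB-backed submissions: hash a normalized (sorted) form of the
# definition so later analysis does not depend on a mutable Google Sheet tab.
definition_hash <- digest(arrange(definition_all, item_id, language), algo = "sha256")
```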

mock-data/questionnaire_definition.csv

Mock excerpt of the Google Sheet / CSV source definition. The English rows carry the item structure, the ASC factor mapping via scale_id, and the randomize_question_order switch on q0.

Datatype: CSV source file. Shape: one row per questionnaire item per language.

mock-data/questionnaire_definition_table.csv

The durable questionnaire snapshot row written before a DB-backed submission is accepted.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per normalized definition_hash.

mock-data/runtime_state.json

Mock runtime metadata showing the loaded questionnaire shapes and the currently selected language view.

Datatype: JSON example of runtime state. Shape: metadata object with row and column counts plus instrument identifiers.

2. Submission assembly and ASC scoring in memory

At submit time the app assembles a common long table, splits it into numeric and text/context rows, and computes canonical ASC factor scores from the numeric rows.
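
A minimal sketch of that split in R, assuming a type column whose values distinguish numeric rows from text/context rows; the app's actual helpers may differ:

```r
# Shared long table assembled at submit time: one row per answered item.
responses_df <- read.csv("mock-data/responses_df.csv", stringsAsFactors = FALSE)

# Numeric rows feed ASC scoring and the later response_numeric write.
response_numeric_memory <- subset(responses_df, type == "numeric")

# Text/context rows (q0, q1, free text such as q_context) are kept separately
# and only persist according to the text-storage switches described below.
response_text_memory <- subset(responses_df, type != "numeric")
```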

mock-data/responses_df.csv

The shared in-memory response table before numeric and text/context rows are split apart.

Datatype: R data.frame represented as CSV. Shape: one row per answered item with item_id, type, and raw value.

mock-data/response_numeric_memory.csv

The numeric subset that gets scored and then written to response_numeric.

Datatype: R data.frame represented as CSV. Shape: one row per numeric answer with item_id, value, and created_at.

mock-data/response_text_memory.csv

The text/context subset. q0 and q1 drive peer comparison filters. Free-text fields such as q_context are collected in memory but only persist when full text storage is enabled.

Datatype: R data.frame represented as CSV. Shape: one row per text or context field with field_id and text.

mock-data/asc_factor_scores.csv

The canonical ASC factor score rows. A scorable submission must produce all 11 canonical factors before the app accepts the DB write.

Datatype: R data.frame represented as CSV. Shape: one row per canonical ASC scale_id with score_value.
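
A sketch of that acceptance gate; the real check also validates scale_id against the canonical list, which is defined in the app rather than in this guide:

```r
asc_factor_scores <- read.csv("mock-data/asc_factor_scores.csv",
                              stringsAsFactors = FALSE)

# The DB write is only accepted when all 11 canonical factors are present exactly once.
stopifnot(nrow(asc_factor_scores) == 11,
          !anyDuplicated(asc_factor_scores$scale_id))
```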

3. Durable MariaDB tables

Once the write succeeds, the session is represented as normalized MariaDB tables. Numeric answers, text/context rows, and scores are one-fact-per-key tables: each submission can only have one row for a given item_id, field_id, or scale_id. Retry paths update the existing fact instead of appending duplicates. The submission-level row also carries broad timing and operational quality metadata used for later review.
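
The retry behavior can be pictured as an upsert. A sketch using DBI against an assumed UNIQUE key on (submission_id, item_id), not the app's actual write path:

```r
library(DBI)

# Sketch only: write or update one numeric fact, assuming a MariaDB connection
# `con` and a UNIQUE key on (submission_id, item_id) in response_numeric.
upsert_numeric_fact <- function(con, submission_id, item_id, value) {
  dbExecute(con, "
    INSERT INTO response_numeric (submission_id, item_id, value, created_at)
    VALUES (?, ?, ?, NOW())
    ON DUPLICATE KEY UPDATE value = VALUES(value)
  ", params = list(submission_id, item_id, value))
}
```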

mock-data/submission_table.csv

Submission-level metadata written once per survey session, including the duration measured from display of the context/consent step through submit, the exact item order shown to the participant, and high-level data-quality review metadata. The exact review criteria are intentionally not documented in this public guide.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per submission_id.

mock-data/response_numeric_table.csv

The durable numeric answer rows keyed by submission and item id. The live schema enforces one row per submission_id + item_id.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per numeric answer.

mock-data/response_text_table.csv

The durable text/context rows. Public export never includes these raw text values. STORE_TEXT_RESPONSES controls full text storage, while STORE_CONTEXT_TEXT separately controls whether q0/q1 context is kept for peer comparability when full text storage is off.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per text or context field.
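
A sketch of how those two switches might gate persistence, assuming they arrive as environment variables, that q0/q1 are identified by field_id, and illustrative default values; the actual persistence code is not published here:

```r
store_text    <- tolower(Sys.getenv("STORE_TEXT_RESPONSES", "false")) == "true"
store_context <- tolower(Sys.getenv("STORE_CONTEXT_TEXT", "true")) == "true"

response_text_memory <- read.csv("mock-data/response_text_memory.csv",
                                 stringsAsFactors = FALSE)
context_fields <- c("q0", "q1")

rows_to_persist <- if (store_text) {
  response_text_memory                                         # keep every text/context row
} else if (store_context) {
  subset(response_text_memory, field_id %in% context_fields)   # keep q0/q1 for peer comparability
} else {
  response_text_memory[0, ]                                    # keep nothing
}
```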

mock-data/score_table.csv

The durable ASC factor score rows written after scoring succeeds. The schema enforces one row per submission_id + scale_id, and exports require all 11 canonical factors for every submission.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per canonical ASC factor per submission.

4. Peer comparison selection and feedback render

The peer plot does not draw raw text rows directly. It first builds one cache row per submission from the durable tables, resolves the participant’s selected canonical comparison pair, and then queries all complete matching ASC factor score rows for that bucket, rendering them as spoke-local violins with quartile summaries and an overlay of the current submission.

To adjust the new radial violin rendering directly in the browser, open the public plot lab.

  • The raw sources for the comparison context are the submission row, the q0/q1 context rows in response_text, and ASC factor completeness from score.
  • peer_plot_submission_cache stores both the raw labels and the canonical induction/dose tokens used for comparison.
  • The controls only expose canonical buckets back to the participant, even if raw free text, Other-style context, translated labels, or uncommon cache rows were stored.
  • The selected canonical pair then drives a filtered join from peer_plot_submission_cache to score, excluding the current submission and incomplete peer rows (see the sketch after this list).
  • All complete peers in the selected canonical bucket are used for plotting; the feedback view does not apply a second plot-only curation layer.
  • The legend, reading guide, and share or download image actions reuse this same selected pair and peer count. They do not create extra stored tables or export files.
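
A sketch of that filtered join through DBI; the connection `con`, the cache column names (canonical_induction, canonical_dose, is_complete), and the example filter values are assumptions based on the mock files, not the live schema:

```r
library(DBI)

selected_induction    <- "psilocybin"   # example canonical bucket
selected_dose         <- "medium"
current_submission_id <- "sub_001"      # hypothetical id of the current submission

# Assumes `con` is an open MariaDB connection.
peer_scores <- dbGetQuery(con, "
  SELECT s.submission_id AS peer_id, s.scale_id, s.score_value
  FROM peer_plot_submission_cache AS c
  JOIN score AS s ON s.submission_id = c.submission_id
  WHERE c.canonical_induction = ?
    AND c.canonical_dose = ?
    AND c.is_complete = 1
    AND c.submission_id <> ?
", params = list(selected_induction, selected_dose, current_submission_id))
```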

mock-data/peer_plot_cache_build.json

This build recipe shows how one peer_plot_submission_cache row is assembled from submission, the stored q0/q1 context rows in response_text, and the ASC factor row count in score.

Datatype: JSON cache-build example. Shape: one object showing source tables, canonicalization, and the written cache row.
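
Restated as R, a cache-row build along these lines is plausible; canonicalize_induction() and canonicalize_dose() are hypothetical helpers, and the mapping of q0 to induction and q1 to dose is an assumption, not documented behavior:

```r
library(DBI)

submission_id <- "sub_001"   # hypothetical id; `con` is an open MariaDB connection

# Stored q0/q1 context rows for one submission.
q_context <- dbGetQuery(con, "
  SELECT field_id, text FROM response_text
  WHERE submission_id = ? AND field_id IN ('q0', 'q1')
", params = list(submission_id))

# ASC factor completeness comes from the row count in score.
n_factors <- dbGetQuery(con, "
  SELECT COUNT(*) AS n FROM score WHERE submission_id = ?
", params = list(submission_id))$n

raw_induction <- q_context$text[q_context$field_id == "q0"]
raw_dose      <- q_context$text[q_context$field_id == "q1"]

cache_row <- data.frame(
  submission_id       = submission_id,
  raw_induction       = raw_induction,
  raw_dose            = raw_dose,
  canonical_induction = canonicalize_induction(raw_induction),  # hypothetical helper
  canonical_dose      = canonicalize_dose(raw_dose),            # hypothetical helper
  is_complete         = n_factors == 11
)
```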

mock-data/peer_plot_selection_state.json

This selection-state example makes the UI path explicit: the current submission provides the default pair, the controls show only canonical buckets, and the selected pair becomes the query filter for peer rows.

Datatype: JSON UI-state example. Shape: one object showing current defaults, canonical control catalog, and the selected query filters.

mock-data/peer_plot_submission_cache.csv

The one-row-per-submission cache that keeps raw and canonical context plus completeness flags. In this mock file, one row uses translated raw labels that canonicalize to the selected bucket, one row is incomplete, and one row belongs to a different bucket.

Datatype: MariaDB table excerpt shown as CSV. Shape: one row per submission with raw plus canonical induction/dose values.

mock-data/peer_plot_scores_query.csv

The score rows returned for one canonical bucket such as psilocybin + medium. Each row becomes one plotted peer value on one ASC axis, even when the stored raw labels were translated or phrased differently before canonicalization.

Datatype: Query result represented as CSV. Shape: one row per peer_id and scale_id.

mock-data/peer_plot_render_payload.json

This mock render summary makes the current feedback UI explicit. It records the selected canonical pair, names which cache rows were excluded, and shows how the matching peer score rows become the violin layer, the quartile marks, the purple self profile, and the legend and share metadata shown around the plot.

Datatype: JSON render-state example. Shape: one object describing filters and rendered layers.

5. Distribution outputs

The distribution stage widens the normalized rows into publication-ready CSV files. Public output anonymizes identity, drops free text, and omits responses excluded during data-quality review. Private output keeps the raw submission id, timestamp, any stored text columns, and restricted operational metadata. Before writing files, both export paths now fail on duplicate fact rows, non-canonical score ids, or incomplete 11-factor ASC coverage.
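
A sketch of those pre-write checks and the public widening, using dplyr/tidyr against the mock score table; the anonymization shown is illustrative only, and the real export script may differ:

```r
library(dplyr)
library(tidyr)

scores <- read.csv("mock-data/score_table.csv", stringsAsFactors = FALSE)

# Fail fast on duplicate fact rows and incomplete 11-factor ASC coverage;
# the real checks also reject non-canonical scale ids.
stopifnot(!anyDuplicated(scores[, c("submission_id", "scale_id")]))
stopifnot(all(count(scores, submission_id)$n == 11))

# Public wide bundle: one row per anonymized session, one column per ASC factor.
public_export <- scores %>%
  mutate(anon_id = match(submission_id, unique(submission_id))) %>%  # illustrative anonymization
  select(anon_id, scale_id, score_value) %>%
  pivot_wider(names_from = scale_id, values_from = score_value)
```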

mock-data/public_export.csv and mock-data/public_factor_scores.csv

The public wide bundle plus the public factor-only long export. These are the publishable shapes.

Datatype: Wide and long CSV export files. Shape: one wide row per anonymized session plus one long row per public factor score.

mock-data/private_full_export.csv

The private wide bundle. It retains raw submission identity, full timestamps, and text columns alongside numeric answers and ASC factor scores.

Datatype: Wide CSV export file. Shape: one row per raw submission with numeric, text, and score columns.