MariaDB and OSF stack

Understand where AXP data lands and how sheet choices shape it.

This page explains the current production stack for collaborators: Google Sheets defines the questionnaire and comparison registry, MariaDB stores submitted rows, and OSF receives append-only public and private export bundles.

Current production contract

MariaDB is the live source of submitted data.

The survey writes normalized submission, response, score, and peer-cache rows to the server database. Collaborators normally do not need direct database access.

OSF receives generated snapshots.

Exports run daily, then upload to matching run folders in public and private OSF projects. Routine uploads are append-only and refuse overwrites.

Legacy archive rows stay out of routine snapshots.

The imported legacy instrument axp_legacy_11asc22 is excluded from regular public and private snapshots because it is already archived separately.

Google Sheets controls meaning, not storage location.

Sheet tabs define questionnaire versions and q0/q1 choices. The separate comparison_tokens tab decides which canonical conditions can be shown or exported publicly.

Where data lands

Google Sheets Questionnaire tabs plus comparison_tokens
Running survey Loads the current sheet, collects answers, computes ASC scores
MariaDB Durable normalized rows and peer-cache context
Exports Public redacted bundle and private full bundle
OSF Append-only run folders in public/private projects
Place Access What it contains What it does not contain
MariaDB on the survey server Server maintainers only Raw normalized survey data, q0/q1 context, optional stored free text, scores, definition snapshots, peer-cache rows, and quality metadata. It is not a collaborator download surface and is not publicly reachable.
Public OSF project Public download De-identified public CSVs, codebooks, README, changelog, and export manifest. Public condition fields use safe canonical tokens only. No raw submission ids, full timestamps, raw free-text fields, IP/user-agent metadata, or unknown/Other free-input context.
Private OSF project Invite-only collaborator access Full restricted CSVs with raw submission ids, full timestamps, stored text/context columns, operational metadata, codebooks, README, changelog, and export manifest. It is not a public publication target. Do not redistribute it as if it were the public bundle.
Legacy OSF/archive Project-specific archive access The imported legacy 2022 peer-data block that uses instrument_id=axp_legacy_11asc22. Routine daily snapshots do not re-upload this 10k-row archive block.

MariaDB storage model

MariaDB is the durable record of survey submissions. It stores normalized tables rather than one giant spreadsheet row. Export scripts join those tables later when building OSF files.

submission

One row per submitted session: ids, timestamps, questionnaire version, language, duration, quality flags, and item-order metadata.

questionnaire_definition

One snapshot per definition_hash, so later analysis can recover which loaded sheet definition produced a submission.

response_numeric

One row per numeric answer, keyed by submission_id and item_id. These rows feed ASC scoring.

response_text

One row per text/context answer. q0/q1 context is kept for comparability; free text remains private when stored.

score

One row per ASC factor score. Complete scorable submissions must have all 11 canonical ASC factors.

peer_plot_submission_cache

One row per submission with raw and canonical induction/dose context plus score-completeness metadata for feedback queries.

OSF upload workflow

  1. 1
    Daily export at 03:00 Europe/Berlin.

    The server runs the public and private export scripts from MariaDB. Both export paths exclude axp_legacy_11asc22 from routine snapshots.

  2. 2
    Daily upload at 03:15 Europe/Berlin.

    The upload job creates one run folder such as export_YYYYMMDDTHHMMSSZ in both OSF targets. The timestamp is UTC.

  3. 3
    Current files only.

    Routine uploads send the latest files, the current dated files, codebooks, README, changelog, and manifest. Historical dated exports are not re-uploaded every day.

  4. 4
    Append-only guard.

    If a file already exists in the target run folder, upload fails instead of replacing it. This prevents silent stale-file retention or accidental overwrite.

What each OSF bundle contains

Bundle file Visibility Shape Condition fields
axp_public_latest.csv and axp_public_YYYYMMDD.csv Public Wide CSV: one anonymized row per included session, with item_* numeric answers and scale_* ASC scores. induction_method and dose_label are public-safe canonical tokens or blank.
axp_factor_scores_public_latest.csv and dated copy Public Long CSV: one row per anonymized peer id and ASC factor. Includes the same public-safe induction_method and dose_label tokens.
axp_private_full_latest.csv and axp_private_full_YYYYMMDD.csv Private Wide CSV: one raw submission row with numeric answers, stored text/context columns, operational metadata, and scores. Raw context appears as private text_* columns such as text_q0, text_q1, and any stored free text.
axp_factor_scores_full_latest.csv and dated copy Private Long CSV: one row per raw submission id and ASC factor. Used when collaborators need restricted factor-level analysis with raw ids.
codebook*.csv, README*.md, CHANGELOG*.md Public or private target Human-readable support files generated with the bundle. Explain column meanings and export history for that target.
export_manifest_public.json and export_manifest_private_full.json Public or private target Audit JSON with generated time, git commit, row counts, omitted legacy count, file sizes, and SHA-256 hashes. Confirms which filters and files were used for that run.

How Google Sheets entries affect data

Version tabs become submission metadata.

The loaded tab name becomes instrument_version. The normalized sheet creates a definition_hash stored with each submission.

English structural rows carry scoring meaning.

Stable item_id, scale_id, type, required, active, and order fields define how answers are collected and scored.

q0 and q1 become context rows.

q0 is the induction field. q1 is the second context field, whose label and options depend on the selected induction.

comparison_tokens controls canonical output.

Only enabled stable tokens should appear in feedback controls and public condition columns. Raw labels and aliases map into these tokens.

A questionnaire option and a comparison bucket are not the same thing. Adding a q0/q1 option lets the survey collect that answer. Adding an enabled row in comparison_tokens makes that answer eligible for canonical comparison and public-safe export.

Dose, q1, and scoped comparison rows

In the code and exports, dose_label means the canonical token for the second context field. It is not always a pharmacological dose.

Substances can use global dose buckets.

Classic dose labels such as low, medium, and high can be unscoped when they apply across several induction methods.

Non-substance contexts need scoped rows.

Floatation durations, Flicker Light durations, breathwork methods, event labels, and institute names should use kind=dose rows scoped by induction_token.

Aliases keep old labels usable.

Translated labels, old spellings, and old duration ranges belong in aliases so historical submissions still canonicalize to the intended token.

Other and free input stay private.

Other-style answers can be retained as raw private context, but they are blanked from public condition columns unless the team creates a specific stable token.

Example decisions

  • Psilocybin / Medium can map to induction psilocybin and global dose medium.
  • Floatation / 61 bis 90 Minuten should map to induction floatation and a scoped duration token such as minutes_61_to_90.
  • Flicker Light / 20 minutes should map to a scoped Flicker Light duration token, not a global pharmacological dose token.
  • Other: handwritten text remains raw private context and should not become a public comparison option by default.

Collaborator checklist before changing q0/q1

  1. 1
    Decide whether the change is collection-only or comparison-ready.

    If it only needs to collect raw context, the questionnaire tab may be enough. If it should show in feedback or public exports, update comparison_tokens.

  2. 2
    Use stable tokens.

    Tokens should survive later wording changes. Labels can change; tokens should not.

  3. 3
    Scope q1 rows when the meaning depends on q0.

    Set induction_token on dose rows for durations, methods, events, or other induction-specific q1 values.

  4. 4
    Put old labels in aliases.

    Aliases are the bridge between older submitted raw labels and the current canonical token.

  5. 5
    Check exports after the next run.

    Public OSF should show public-safe tokens or blanks. Private OSF should retain the raw context needed for restricted review.