
Goal

Use VOICEVOX Editor .vvproj data directly in Laravel so you can regenerate both talk and song output.
VOICEVOX is Japanese text-to-speech and singing-synthesis software. This page focuses on interoperability with the official VOICEVOX Editor project file.

Top-level structure

.vvproj is UTF-8 JSON. One file can include both talk and song data.
{
  "appVersion": "0.25.2",
  "talk": {
    "audioKeys": [],
    "audioItems": {}
  },
  "song": {
    "tpqn": 480,
    "tempos": [],
    "timeSignatures": [],
    "tracks": {},
    "trackOrder": []
  }
}

| Key | Description |
| --- | --- |
| appVersion | VOICEVOX Editor version used to save the project |
| talk | Talk data (audioKeys + audioItems) |
| song | Song data (tempo, meter, tracks) |
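
As a minimal sketch of reading this top-level structure in plain PHP, the helper below decodes a project and reports which sections actually carry data. The function name `loadVvproj` and the returned array shape are hypothetical, chosen for illustration only.

```php
/**
 * Decode a .vvproj JSON string and report which sections carry data.
 * Hypothetical helper, not part of any VOICEVOX library.
 *
 * @return array{project: array, hasTalk: bool, hasSong: bool}
 */
function loadVvproj(string $json): array
{
    // JSON_THROW_ON_ERROR turns malformed input into a JsonException.
    $project = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($project) || !isset($project['appVersion'])) {
        throw new InvalidArgumentException('Not a .vvproj document: appVersion is missing.');
    }

    return [
        'project' => $project,
        // A section counts as present only when it lists at least one ID.
        'hasTalk' => !empty($project['talk']['audioKeys']),
        'hasSong' => !empty($project['song']['trackOrder']),
    ];
}
```

Checking `audioKeys` and `trackOrder` rather than the mere presence of `talk` and `song` avoids treating an empty section as renderable.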

talk section

talk.audioKeys defines order. talk.audioItems is a record keyed by the same IDs.
{
  "audioKeys": ["audio-item-uuid"],
  "audioItems": {
    "audio-item-uuid": {
      "text": "ずんだもんなのだ",
      "voice": {
        "engineId": "engine-uuid",
        "speakerId": "speaker-uuid",
        "styleId": 3
      },
      "query": {
        "accentPhrases": [],
        "speedScale": 1,
        "pitchScale": 0
      },
      "presetKey": "preset-uuid"
    }
  }
}

TalkAudioItem

| Key | Description |
| --- | --- |
| text | Input text |
| voice.engineId | Engine ID (matches /engine_manifest) |
| voice.speakerId | Speaker UUID |
| voice.styleId | Style ID passed directly to /synthesis?speaker={styleId} |
| query | AudioQuery-equivalent payload |
| presetKey | Editor preset ID. For Laravel-side presets, see Presets |

If query is already stored in .vvproj, you can skip /audio_query and synthesize directly.
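
The direct path can be sketched without any HTTP client: the helper below only assembles the URL and JSON body for a `/synthesis` call from one talk item. `buildSynthesisRequest` and the default `$baseUrl` are assumptions for illustration. The engine may also expect fields that the editor does not store in `query`, and this sketch does not fill them in.

```php
/**
 * Build the URL and JSON body for a direct /synthesis call from one talk item.
 * Hypothetical helper; $baseUrl is an assumed local engine address.
 *
 * @return array{url: string, body: string}
 */
function buildSynthesisRequest(array $item, string $baseUrl = 'http://127.0.0.1:50021'): array
{
    if (!isset($item['query'], $item['voice']['styleId'])) {
        throw new InvalidArgumentException('Item needs both query and voice.styleId.');
    }

    return [
        // The style ID goes into the speaker query parameter.
        'url' => rtrim($baseUrl, '/') . '/synthesis?'
            . http_build_query(['speaker' => $item['voice']['styleId']]),
        // The stored query is sent unchanged as the JSON request body.
        'body' => json_encode($item['query'], JSON_THROW_ON_ERROR | JSON_UNESCAPED_UNICODE),
    ];
}
```

The returned pair can be handed to whichever HTTP client the application already uses.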

accentPhrases

| Key | Description |
| --- | --- |
| moras | Mora array (consonant fields may be omitted) |
| accent | Accent position (1-based) |
| pauseMora | Optional pause mora |
| isInterrogative | Question sentence flag |
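
To illustrate the 1-based `accent` field, here is a small sketch that rebuilds the reading of one accent phrase and checks that `accent` points at an existing mora. It assumes each mora carries a `text` key, which this page does not list; the function name `describeAccentPhrase` is hypothetical.

```php
/**
 * Summarize one accent phrase: its reading and whether accent is in range.
 * Assumes each mora has a `text` key (an assumption for this sketch).
 *
 * @return array{reading: string, accentValid: bool}
 */
function describeAccentPhrase(array $phrase): array
{
    $moras = $phrase['moras'] ?? [];
    $accent = $phrase['accent'] ?? 0;

    // Concatenate the per-mora text to recover the phrase reading.
    $reading = implode('', array_map(
        fn (array $mora): string => $mora['text'] ?? '',
        $moras,
    ));

    return [
        'reading' => $reading,
        // accent is 1-based, so it must lie between 1 and the mora count.
        'accentValid' => $accent >= 1 && $accent <= count($moras),
    ];
}
```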

song section

{
  "tpqn": 480,
  "tempos": [{ "position": 0, "bpm": 120 }],
  "timeSignatures": [{ "measureNumber": 1, "beats": 4, "beatType": 4 }],
  "tracks": {
    "track-uuid": {
      "name": "Untitled Track",
      "singer": { "engineId": "engine-uuid", "styleId": 3003 },
      "notes": []
    }
  },
  "trackOrder": ["track-uuid"]
}

| Key | Description |
| --- | --- |
| tpqn | Ticks per quarter note. Default is 480 |
| tempos | Tempo map (position in ticks) |
| timeSignatures | Time-signature map |
| tracks | Record keyed by track ID |
| trackOrder | Playback/display order. Must contain exactly the keys of tracks |

Track

| Key | Description |
| --- | --- |
| singer.styleId | Final style ID for /frame_synthesis?speaker={styleId} |
| notes | Note array |
| keyRangeAdjustment | Semitone key adjustment |
| volumeRangeAdjustment | Volume adjustment |
| pitchEditData / volumeEditData | Frame-level edit data |
| phonemeTimingEditData | Phoneme timing overrides by note ID |
| solo / mute | Track-selection flags |
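
One way the `solo` and `mute` flags might be applied when choosing which tracks to render is sketched below. It assumes the common DAW convention that any soloed track silences all non-soloed ones; the editor's actual behavior may differ. `audibleTrackIds` is a hypothetical name.

```php
/**
 * Return the track IDs, in trackOrder, that should be rendered.
 * Assumption: solo overrides mute, as in most DAWs; the editor may differ.
 *
 * @return list<string>
 */
function audibleTrackIds(array $song): array
{
    $order = $song['trackOrder'] ?? [];
    $tracks = $song['tracks'] ?? [];

    // If any track is soloed, only soloed tracks are audible.
    $solo = array_filter($order, fn (string $id): bool => !empty($tracks[$id]['solo']));
    if ($solo !== []) {
        return array_values($solo);
    }

    // Otherwise every track that is not muted is audible.
    return array_values(array_filter(
        $order,
        fn (string $id): bool => empty($tracks[$id]['mute']),
    ));
}
```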

Note

| Key | Description |
| --- | --- |
| id | Unique note ID |
| position | Start tick |
| duration | Length in ticks |
| noteNumber | MIDI note number |
| lyric | Lyric |
lyricLyric

Tick/seconds/frame conversion

For a single tempo:
seconds = ticks / tpqn * 60 / bpm
frames = round(seconds * frameRate)
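
A worked instance of the two formulas, as a sketch: the default `$frameRate` of 93.75 is an assumed value and should be checked against the engine you target, and `ticksToFrames` is a hypothetical name.

```php
/**
 * Convert ticks to seconds and frames for a single, constant tempo.
 * $frameRate defaults to an assumed value; verify it for your engine.
 *
 * @return array{seconds: float, frames: int}
 */
function ticksToFrames(int $ticks, int $tpqn, float $bpm, float $frameRate = 93.75): array
{
    // seconds = ticks / tpqn * 60 / bpm
    $seconds = $ticks / $tpqn * 60 / $bpm;

    return [
        'seconds' => $seconds,
        // frames = round(seconds * frameRate)
        'frames' => (int) round($seconds * $frameRate),
    ];
}

// One quarter note (480 ticks at tpqn 480) at 120 BPM.
$result = ticksToFrames(480, 480, 120);
```

Here `$result['seconds']` is 0.5, and 0.5 × 93.75 = 46.875 rounds to 47 frames.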
With tempo changes, sum each segment in tempos order:
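/**
 * Convert an absolute tick position to seconds across tempo changes.
 * $tempos must be sorted by position, with each entry holding position and bpm.
 */
function ticksToSeconds(int $targetTick, int $tpqn, array $tempos): float
{
    $seconds = 0.0;
    $currentTick = 0;

    foreach ($tempos as $index => $tempo) {
        // A segment ends at the next tempo change or at the target, whichever comes first.
        $nextTick = $tempos[$index + 1]['position'] ?? $targetTick;
        $segmentEnd = min($targetTick, $nextTick);

        // Stop once the target tick has been reached.
        if ($segmentEnd <= $currentTick) {
            break;
        }

        $bpm = $tempo['bpm'];
        $seconds += (($segmentEnd - $currentTick) / $tpqn) * (60 / $bpm);
        $currentTick = $segmentEnd;
    }

    return $seconds;
}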
For Note::len() helper usage (tick-to-frame workflow), see Score and Note Deep Dive.

Song synthesis flow

Notes for direct JSON editing

  • Keep tracks keys and trackOrder perfectly aligned
  • Keep talk.audioKeys aligned with talk.audioItems
  • Validate position >= 0, duration >= 1, noteNumber in 0..127
  • Prefer tempos[0].position = 0 and timeSignatures[0].measureNumber = 1
  • Preserve unknown keys when possible for forward compatibility
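
The alignment and range checks above can be sketched as a plain-PHP validator that returns a list of problems instead of throwing. `validateVvproj` and its message strings are hypothetical, and the sketch covers only the checks in this list.

```php
/**
 * Collect human-readable problems found in a decoded .vvproj array.
 * Hypothetical helper covering only the checks listed above.
 *
 * @return list<string>
 */
function validateVvproj(array $project): array
{
    $errors = [];

    // Compare two ID lists as sets, ignoring order.
    $sameIds = function (array $a, array $b): bool {
        $a = array_map('strval', $a);
        $b = array_map('strval', $b);
        sort($a, SORT_STRING);
        sort($b, SORT_STRING);

        return $a === $b;
    };

    $talk = $project['talk'] ?? [];
    if (!$sameIds($talk['audioKeys'] ?? [], array_keys($talk['audioItems'] ?? []))) {
        $errors[] = 'talk.audioKeys does not match talk.audioItems keys';
    }

    $song = $project['song'] ?? [];
    if (!$sameIds($song['trackOrder'] ?? [], array_keys($song['tracks'] ?? []))) {
        $errors[] = 'song.trackOrder does not match song.tracks keys';
    }

    foreach ($song['tracks'] ?? [] as $trackId => $track) {
        foreach ($track['notes'] ?? [] as $note) {
            // position >= 0, duration >= 1, noteNumber in 0..127
            $ok = ($note['position'] ?? -1) >= 0
                && ($note['duration'] ?? 0) >= 1
                && ($note['noteNumber'] ?? -1) >= 0
                && ($note['noteNumber'] ?? 128) <= 127;
            if (!$ok) {
                $errors[] = "track {$trackId}: invalid note " . ($note['id'] ?? '(no id)');
            }
        }
    }

    return $errors;
}
```

Returning messages rather than throwing lets a caller report every problem in a hand-edited file at once.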

Laravel code example

Minimal example to load .vvproj and render both talk and song outputs:
use Illuminate\Support\Facades\Storage;
use Revolution\Voicevox\Client\TalkAudioQuery;
use Revolution\Voicevox\Song\Note;
use Revolution\Voicevox\Song\Score;
use Revolution\Voicevox\Voicevox;

$project = json_decode(
    Storage::disk('local')->get('voicevox/sample.vvproj'),
    true,
    flags: JSON_THROW_ON_ERROR,
);

// Talk: synthesize from stored query
foreach ($project['talk']['audioKeys'] as $audioKey) {
    $item = $project['talk']['audioItems'][$audioKey];

    Voicevox::talk($item['text'], id: $item['voice']['styleId'])
        ->tap(fn (TalkAudioQuery $talk) => $talk->audioQuery = array_replace($talk->audioQuery, $item['query']))
        ->generate(id: $item['voice']['styleId'])
        ->storeAs('vvproj/talk', "{$audioKey}.wav");
}

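// Song: convert first-track note durations into frame lengths.
// Only tempos[0] is used and note positions are ignored, so tempo changes
// and gaps between notes are not reflected in this minimal example.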
$trackId = $project['song']['trackOrder'][0];
$track = $project['song']['tracks'][$trackId];
$bpm = $project['song']['tempos'][0]['bpm'] ?? 120;

$score = Score::make([
    Note::make(length: 15, lyric: '', key: null),
    ...collect($track['notes'])->map(
        fn (array $note) => Note::make(
            length: Note::len(ticks: $note['duration'], bpm: $bpm),
            lyric: $note['lyric'] ?? 'ら',
            key: $note['noteNumber'],
            id: $note['id'] ?? null,
        ),
    )->all(),
    Note::make(length: 2, lyric: '', key: null),
]);

Voicevox::song($score, teacher: 6000)
    ->generate(id: $track['singer']['styleId'])
    ->storeAs('vvproj/song', "{$trackId}.wav");
Last modified on May 28, 2026