Speech-to-Text (Streaming)

Speech Recognition (Streaming) (Beta)

Audio streaming transcribes speech in real time. A WebSocket connection is required.

Endpoint

Request details

wss://translate.rozetta-api.io/api/v1/translate/stt-streaming

Header

Header

Description

accessKey, nonce, signature

See "Authentication".

Command

Command type

Description

SET_LANGUAGE

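Language of the speech. The currently supported language codes are "en" (English), "ja" (Japanese), "zh-CN" (Simplified Chinese), "zh-HK" (Hong Kong Chinese), and "zh-TW" (Traditional Chinese).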

SET_SAMPLING_RATE

Sampling rate of the audio stream. The recommended sampling rate is 16000 Hz.

END_STREAM

Marks the end of the audio stream. No value is required.

END_SESSION

Ends the speech recognition session. No value is required. When the session ends, the WebSocket connection is also closed.
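Judging from the JavaScript sample on this page, each command is sent as a JSON text frame with a `command` field and, for the two commands that take an input, a `value` field. The following is a minimal sketch of a helper that builds those frames. `buildCommand` and the two sets are hypothetical names used for illustration, not part of the API.

```javascript
// Commands from the table above, split by whether they take an input value.
const COMMANDS_WITH_VALUE = new Set(['SET_LANGUAGE', 'SET_SAMPLING_RATE']);
const COMMANDS_WITHOUT_VALUE = new Set(['END_STREAM', 'END_SESSION']);

// Hypothetical helper: serializes a command into the { command, value }
// JSON shape used by the sample code on this page.
const buildCommand = (command, value) => {
  if (COMMANDS_WITH_VALUE.has(command)) {
    if (value === undefined || value === null) {
      throw new Error(`${command} requires a value`);
    }
    return JSON.stringify({ command, value });
  }
  if (COMMANDS_WITHOUT_VALUE.has(command)) {
    // END_STREAM and END_SESSION take no input value.
    return JSON.stringify({ command });
  }
  throw new Error(`Unknown command: ${command}`);
};

console.log(buildCommand('SET_LANGUAGE', 'ja'));
console.log(buildCommand('SET_SAMPLING_RATE', 16000));
console.log(buildCommand('END_STREAM'));
```

Each returned string could be passed directly to the WebSocket's `send` method. Rejecting unknown commands locally is a design choice of this sketch, made so that typos surface before anything is sent.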

Audio stream

The audio stream is sent over the WebSocket as binary data. The audio must be single-channel (mono), 16 bits per sample, in WAV format.
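The format requirements above can be checked locally before any audio is streamed. The sketch below reads the `fmt ` chunk of a standard RIFF/WAVE header and reports whether the file is single-channel with 16-bit samples (this reads the "16" in the requirement as bits per sample, which is an assumption). `readWavFormat`, `checkStreamable`, and `makeHeader` are hypothetical helper names, and the demo uses a synthetic header rather than a real recording.

```javascript
// Hypothetical helper: reads the "fmt " chunk of a RIFF/WAVE buffer.
const readWavFormat = (buffer) => {
  if (buffer.length < 12 ||
      buffer.toString('ascii', 0, 4) !== 'RIFF' ||
      buffer.toString('ascii', 8, 12) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }
  let offset = 12;
  // Walk the chunk list until the "fmt " chunk is found.
  while (offset + 8 <= buffer.length) {
    const chunkId = buffer.toString('ascii', offset, offset + 4);
    const chunkSize = buffer.readUInt32LE(offset + 4);
    if (chunkId === 'fmt ') {
      if (offset + 24 > buffer.length) throw new Error('Truncated fmt chunk');
      return {
        channels: buffer.readUInt16LE(offset + 10),
        sampleRate: buffer.readUInt32LE(offset + 12),
        bitsPerSample: buffer.readUInt16LE(offset + 22),
      };
    }
    // Chunks are padded to an even number of bytes.
    offset += 8 + chunkSize + (chunkSize % 2);
  }
  throw new Error('fmt chunk not found');
};

// Hypothetical helper: lists mismatches against the documented requirements.
const checkStreamable = (buffer) => {
  const format = readWavFormat(buffer);
  const problems = [];
  if (format.channels !== 1) {
    problems.push(`expected 1 channel, got ${format.channels}`);
  }
  if (format.bitsPerSample !== 16) {
    problems.push(`expected 16 bits per sample, got ${format.bitsPerSample}`);
  }
  return { format, problems };
};

// Demo only: builds a 44-byte PCM WAV header with an empty data chunk.
const makeHeader = (channels, sampleRate, bitsPerSample) => {
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36, 4);
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16); // size of the fmt chunk body
  header.writeUInt16LE(1, 20); // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE((sampleRate * channels * bitsPerSample) / 8, 28);
  header.writeUInt16LE((channels * bitsPerSample) / 8, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(0, 40);
  return header;
};

console.log(checkStreamable(makeHeader(1, 16000, 16)));
console.log(checkStreamable(makeHeader(2, 44100, 8)).problems);
```

The `sampleRate` read from the header could also be used as the value for `SET_SAMPLING_RATE`, so that the declared rate cannot drift from the file's actual rate.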

Response

Response type

Description

LANGUAGE_READY

The speech language has been set.

SAMPLING_RATE_READY

The sampling rate of the audio stream has been set.

RECOGNITION_RESULT

Transcription result for the audio stream. Recognition results may be sent multiple times.

RECOGNITION_ERROR

Speech recognition error message. Error messages may be sent multiple times.

api/v1/translate/stt-streaming

Sends an audio stream and retrieves the results in real time.

JavaScript

const fs = require('fs');
const WebSocket = require('ws');

// Local helper module from the sample project; assumed to export
// generateSignature(path, secretKey, nonce).
const { generateSignature } = require('./utils/auth-utils');

const fsPromise = fs.promises;

const apiPath = '/api/v1/translate/stt-streaming';
const apiEndpoint = `wss://translate.rozetta-api.io${apiPath}`;
const authConfig = {
  accessKey: 'ACCESS_KEY',
  secretKey: 'SECRET_KEY',
  nonce: Date.now().toString(),
  contractId: 'CONTRACT_ID',
};
const speechData = {
  language: 'ja',
  samplingRate: 16000,
  audioFile: 'speech.wav',
  audioBuffer: null,
};

/**
* Command type sent from the client.
*/
const commandType = {
  setLanguage: 'SET_LANGUAGE',
  setSamplingRate: 'SET_SAMPLING_RATE',
  endStream: 'END_STREAM',
  endSession: 'END_SESSION',
};

/**
* Response types received from API endpoint.
*/
const responseType = {
  languageReady: 'LANGUAGE_READY',
  samplingRateReady: 'SAMPLING_RATE_READY',
  recognitionResult: 'RECOGNITION_RESULT',
  recognitionError: 'RECOGNITION_ERROR',
};

const getAuth = (url) => {
  // A fresh nonce is generated for every connection attempt.
  const nonce = Date.now().toString();
  return {
    accessKey: authConfig.accessKey,
    nonce,
    signature: generateSignature(url, authConfig.secretKey, nonce),
    remoteurl: url,
    contractId: authConfig.contractId,
  };
};

const handleSessionMessage = (connection, message) => {
  const messageJSON = JSON.parse(message);
  switch (messageJSON.type) {
    case responseType.languageReady:
      // The language is set. Set the sampling rate.
      console.log('Language is set. Set sampling rate.');
      connection.send(JSON.stringify({
        command: commandType.setSamplingRate,
        value: speechData.samplingRate,
      }));
      break;
    case responseType.samplingRateReady:
      // The sampling rate is set. Send the audio data, then mark the end of the stream.
      console.log('Sampling rate is set. Send audio data stream.');
      connection.send(speechData.audioBuffer);
      connection.send(JSON.stringify({
        command: commandType.endStream,
      }));
      break;
    case responseType.recognitionResult:
      console.log('Recognized transcript:');
      console.log(messageJSON.value);
      break;
    case responseType.recognitionError:
      console.error('Recognition error:');
      console.error(messageJSON.value);
      // In case of error, we close the connection immediately.
      connection.send(JSON.stringify({
        command: commandType.endSession,
      }));
      break;
    default:
      console.log('Unexpected response type:');
      console.log(messageJSON.type);
  }
};

const main = async () => {
  speechData.audioBuffer = await fsPromise.readFile(speechData.audioFile);
  const auth = getAuth(apiPath);
  console.log(apiPath);
  console.log(auth);
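  // Buffer-based Base64 encoding; btoa is only a global in Node.js 16 and later.
  const auth64 = Buffer.from(JSON.stringify(auth)).toString('base64');
  const url = `${apiEndpoint}?auth=${auth64}`;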
  console.log(url);
  const connection = new WebSocket(url);
  connection.on('open', () => {
    console.log('Connected to streaming STT API.');
    // Once connected, set the speech language.
    connection.send(JSON.stringify({
      command: commandType.setLanguage,
      value: speechData.language,
    }));
  });
  connection.on('message', (message) => {
    handleSessionMessage(connection, message);
  });
  connection.on('error', (error) => {
    console.error(error.message);
    connection.close();
  });
  connection.on('close', () => {
    console.log('Connection closed.');
  });
};

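// Report failures such as a missing audio file instead of leaving an unhandled rejection.
main().catch((error) => {
  console.error(error.message);
  process.exitCode = 1;
});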
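The sample above sends the whole file in a single binary frame. For long or live audio, a client would more likely send the stream in small pieces as they become available. The sketch below splits a buffer into fixed-size chunks and sends them in order, followed by `END_STREAM`. The helper names and the 3200-byte chunk size are assumptions, and this page does not state any frame-size limit or pacing requirement, so treat it as an illustration rather than the recommended approach.

```javascript
// Hypothetical helper: yields consecutive slices of at most chunkSize bytes.
function* chunkAudio(buffer, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    yield buffer.subarray(offset, offset + chunkSize);
  }
}

// Hypothetical helper: sends every chunk as a binary frame, then END_STREAM.
// Returns the number of binary frames sent.
const sendInChunks = (connection, buffer, chunkSize) => {
  let frames = 0;
  for (const chunk of chunkAudio(buffer, chunkSize)) {
    connection.send(chunk);
    frames += 1;
  }
  connection.send(JSON.stringify({ command: 'END_STREAM' }));
  return frames;
};

// Demo with a stand-in connection object that only records what was sent.
const sent = [];
const fakeConnection = { send: (data) => sent.push(data) };

// 100 ms of 16 kHz, 16-bit mono audio is 16000 * 2 * 0.1 = 3200 bytes.
const frames = sendInChunks(fakeConnection, Buffer.alloc(8000), 3200);
console.log(`binary frames: ${frames}, total messages: ${sent.length}`);
```

In a real client, `fakeConnection` would be the open `ws` connection, and `sendInChunks` would be called from the `SAMPLING_RATE_READY` branch in place of the single `connection.send(speechData.audioBuffer)` call.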

For details on authentication, see "Authentication".

Complete sample code for each language can be found here.