Cantai API

Integrate professional AI singing into your applications

Transform Your Applications

Singing Games

Create karaoke games, rhythm games, or music education apps with real-time vocal synthesis

Music Production

Build music streaming apps, vocal synthesis services, or voice customization platforms

Education

Build interactive music theory tools, ear training apps, or vocal harmony visualizers

Content Creation

Generate singing vocals for podcasts, videos, or interactive media projects

Universal Input Support

Work with your preferred music formats

Standard Notation

MIDI Files
ABC Notation Files
Lilypond Files
MusicXML Files

Developer Friendly

JSON Score Format
YAML Configuration
CANTAI Format Files
Plain Text Lyrics

Advanced Features

Batch Processing
Voice Cloning
Style Transfer

Voice Output Formats

WAV Audio Files
OPUS Format
OGG Format
MP3 Format

Quick Start

song.yaml

song:
  notes: ["C4", "E4", "G4"]
  lyrics: ["Let's", "play", "now"]
  tempo: 120
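
Before sending a score, it can help to check it locally. The sketch below is a hypothetical pre-flight check, not part of the Cantai API: the validateScore name and the rule that each note needs exactly one lyric syllable are assumptions inferred from the song.yaml example above.

```javascript
// Hypothetical pre-flight check. The one-syllable-per-note rule is an
// assumption inferred from the song.yaml example, not a documented constraint.
function validateScore(song) {
  const errors = [];
  const notePattern = /^[A-G][#b]?-?\d+$/; // scientific pitch notation, e.g. "C4", "F#3"

  const hasNotes = Array.isArray(song.notes) && song.notes.length > 0;
  if (!hasNotes) {
    errors.push('notes must be a non-empty array');
  } else {
    song.notes.forEach((note, i) => {
      if (typeof note !== 'string' || !notePattern.test(note)) {
        errors.push(`notes[${i}] is not a valid pitch: ${note}`);
      }
    });
  }
  if (!Array.isArray(song.lyrics) || (hasNotes && song.lyrics.length !== song.notes.length)) {
    errors.push('lyrics must be an array with one entry per note');
  }
  if (!(Number.isFinite(song.tempo) && song.tempo > 0)) {
    errors.push('tempo must be a positive number');
  }
  return errors; // empty array means the score passed every check
}

const song = { notes: ['C4', 'E4', 'G4'], lyrics: ["Let's", 'play', 'now'], tempo: 120 };
console.log(validateScore(song)); // prints []
```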

fetch.js

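// Same score as song.yaml, expressed as a JSON object
const songData = {
  song: { notes: ['C4', 'E4', 'G4'], lyrics: ["Let's", 'play', 'now'], tempo: 120 }
};

const response = await fetch('https://cantai.app/v1/synthesize', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json' // body is JSON, so declare it
  },
  body: JSON.stringify(songData)
});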

curl

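# --data-binary keeps the line breaks YAML depends on (-d would strip them).
# The YAML Content-Type is an assumption; send JSON if the endpoint rejects it.
curl -X POST https://cantai.app/v1/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/yaml" \
  --data-binary @song.yaml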

High Performance

Process multiple voices simultaneously with our multi-threaded rendering engine for real-time synthesis and batch processing

Voice Types

High Soprano (Operatic)
Mezzo-Soprano
Tenor & Bass
Classical, Opera & Gospel Choir
Children's Choir
Custom Models (On Request)

Cloud Storage

Automatic file handling with secure cloud storage integration and version control for all your voice synthesis projects

Core Endpoints

POST /api/upload

Upload and normalize music files in any supported notation format

Supported Formats:

MIDI (.mid)
MusicXML (.xml, .mxl)
ABC Notation (.abc)
JSON-based notation (.json)
Gibber (.gibber)

{
  "id": "123abc",
  "detectedFormat": "midi",
  "status": "uploaded"
}
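
As a rough sketch, an upload call might look like the following. Only the /api/upload path, the extension list, and the response shape come from this page. The base URL, the Bearer header (borrowed from Quick Start), the multipart field name "file", and the function names are assumptions for illustration. It needs Node 18+ or a browser for the global FormData and Blob.

```javascript
// Extensions taken from the Supported Formats list above.
const SUPPORTED_EXTENSIONS = ['.mid', '.xml', '.mxl', '.abc', '.json', '.gibber'];

function isSupportedFile(filename) {
  const dot = filename.lastIndexOf('.');
  return dot !== -1 && SUPPORTED_EXTENSIONS.includes(filename.slice(dot).toLowerCase());
}

// ASSUMPTIONS: base URL, Bearer auth, and the multipart field name "file"
// are illustrative guesses; only the path and response shape are documented.
async function uploadScore(bytes, filename, apiKey, fetchImpl = fetch) {
  if (!isSupportedFile(filename)) {
    throw new Error(`Unsupported notation format: ${filename}`);
  }
  const form = new FormData();
  form.append('file', new Blob([bytes]), filename);

  const response = await fetchImpl('https://cantai.app/api/upload', {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form, // the runtime sets the multipart Content-Type and boundary
  });
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  const result = await response.json(); // { id, detectedFormat, status }
  return result.id; // pass this id to /api/render
}
```

The optional fetchImpl argument lets you substitute a stub for the network call when testing.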

POST /api/render

Convert notation into high-quality AI vocals with customizable parameters

Parameters:

id - Upload ID returned by POST /api/upload
outputFormat - WAV, MP3, or Opus
voiceProfile - Voice type selection
reverb - Enable/disable reverb processing
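
One way to assemble these parameters into a request body is sketched below. The parameter names and the three output formats come from the list above. The JSON body shape, lowercase format strings, a boolean reverb flag, and the "mezzo-soprano" profile identifier are assumptions, since this page does not specify them.

```javascript
// Formats from the outputFormat parameter description above.
const OUTPUT_FORMATS = ['wav', 'mp3', 'opus'];

// ASSUMPTIONS: JSON body, lowercase format strings, and a boolean reverb
// flag are hypothetical; the page only names the four parameters.
function buildRenderRequest({ id, outputFormat = 'wav', voiceProfile, reverb = false }) {
  if (!id) {
    throw new Error('id is required (returned by /api/upload)');
  }
  const format = String(outputFormat).toLowerCase();
  if (!OUTPUT_FORMATS.includes(format)) {
    throw new Error(`outputFormat must be one of: ${OUTPUT_FORMATS.join(', ')}`);
  }
  return { id, outputFormat: format, voiceProfile, reverb: Boolean(reverb) };
}

const body = buildRenderRequest({
  id: '123abc',
  outputFormat: 'MP3',
  voiceProfile: 'mezzo-soprano', // hypothetical profile identifier
  reverb: true,
});
console.log(JSON.stringify(body));
// prints {"id":"123abc","outputFormat":"mp3","voiceProfile":"mezzo-soprano","reverb":true}
```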

POST /api/transcribe

Convert audio recordings into musical notation

Features:

WAV/MP3/Opus input support
MIDI/MusicXML output formats
Automatic note detection
Rhythm analysis

GET /api/status/{id}

Check processing status of rendering or transcription tasks

Status Values:

queued - In processing queue
processing - Currently being processed
completed - Ready for download
failed - Error occurred
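
Because rendering and transcription run asynchronously, a client typically polls this endpoint until it sees a terminal status. The following is a minimal polling sketch. The path and the four status values come from this page. The base URL, the Bearer header, a JSON response with a "status" field, and the interval and attempt limits are assumptions.

```javascript
// ASSUMPTIONS: base URL, Bearer auth, and a JSON response carrying a
// "status" field are illustrative; only the path and status values are documented.
async function waitForTask(id, apiKey, { intervalMs = 2000, maxAttempts = 30, fetchImpl = fetch } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(`https://cantai.app/api/status/${encodeURIComponent(id)}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }

    const { status } = await response.json();
    if (status === 'failed') {
      throw new Error(`Task ${id} failed`);
    }
    if (status === 'completed') {
      return status; // ready for GET /api/download/{id}
    }

    // "queued" or "processing": wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Task ${id} did not finish after ${maxAttempts} checks`);
}
```

Capping the number of attempts keeps a stuck task from polling forever. Tune intervalMs to whatever rate limits the service ends up publishing.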

GET /api/download/{id}

Download processed audio or notation files

Returns:

Audio files (WAV/MP3/Opus)
Notation files (MIDI/MusicXML)

Simple, Usage-Based Pricing

Pay-As-You-Go

$0.167/min

AWS GPU costs ~$0.10/min
No Monthly Minimum
50%+ Profit Margin
Basic Support

Includes free trial

High-Volume Bulk

$0.083/min

Batch Processing
GPU Costs ≤$0.05/min
Priority Support
Volume Discounts

Enterprise License

$10,000/month

~50,000 Minutes Included
$0.20/min Additional Usage
SLA & Dedicated Support
Custom Integration

All plans include standard voice models, API access, and basic audio formats

Prices shown in USD. Additional fees may apply for high-priority processing or specialized voice models.
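
To compare plans for a given monthly volume, the arithmetic can be sketched as below. Rates are copied from the plans above. The sketch assumes the Enterprise allowance is exactly 50,000 minutes (the page says "~50,000") and ignores volume discounts and any additional fees, so treat the results as rough estimates. The plan keys and function name are hypothetical.

```javascript
// Rates copied from the pricing plans above (USD).
const PLANS = {
  payAsYouGo: { perMinute: 0.167 },
  bulk: { perMinute: 0.083 },
  // ASSUMPTION: "~50,000 Minutes Included" is treated as exactly 50,000.
  enterprise: { monthlyFee: 10000, includedMinutes: 50000, overagePerMinute: 0.20 },
};

function estimateMonthlyCost(planName, minutes) {
  const plan = PLANS[planName];
  if (!plan) {
    throw new Error(`Unknown plan: ${planName}`);
  }
  let cost;
  if (plan.monthlyFee !== undefined) {
    // Flat fee plus per-minute overage beyond the included allowance
    const overageMinutes = Math.max(0, minutes - plan.includedMinutes);
    cost = plan.monthlyFee + overageMinutes * plan.overagePerMinute;
  } else {
    cost = minutes * plan.perMinute;
  }
  return Math.round(cost * 100) / 100; // round to whole cents
}

console.log(estimateMonthlyCost('payAsYouGo', 1000)); // prints 167
console.log(estimateMonthlyCost('bulk', 1000));       // prints 83
console.log(estimateMonthlyCost('enterprise', 60000)); // prints 12000
```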

Need Help?

Contact Support

⚠️ API details and supported formats may change without notice before our Spring 2025 launch. Check back for updates.