Cantai API
Integrate professional AI singing into your applications
Transform Your Applications
Singing Games
Create karaoke games, rhythm games, or music education apps with real-time vocal synthesis
Music Production
Build music streaming apps, vocal synthesis services, or voice customization platforms
Education
Build interactive music theory tools, ear training apps, or vocal harmony visualizers
Content Creation
Generate singing vocals for podcasts, videos, or interactive media projects
Universal Input Support
Work with your preferred music formats
Standard Notation
- MIDI Files
- ABC Notation Files
- LilyPond Files
- MusicXML Files
Developer Friendly
- JSON Score Format
- YAML Configuration
- CANTAI Format Files
- Plain Text Lyrics
Advanced Features
- Batch Processing
- Voice Cloning
- Style Transfer
Voice Output Formats
- WAV Audio Files
- OPUS Format
- OGG Format
- MP3 Format
Quick Start
song:
  notes: ["C4", "E4", "G4"]
  lyrics: ["Let's", "play", "now"]
  tempo: 120
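// The same score as the YAML above, expressed as a JavaScript object
const songData = {
  song: { notes: ['C4', 'E4', 'G4'], lyrics: ["Let's", 'play', 'now'], tempo: 120 }
};

const response = await fetch('https://cantai.app/v1/synthesize', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify(songData)
});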
curl -X POST https://cantai.app/v1/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d @song.json
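Before sending a score, a client may want to catch obvious mistakes locally. The sketch below is a hypothetical helper, not part of the Cantai API. It assumes one lyric syllable per note (as in the three-note sample above) and scientific pitch names such as "C4" or "F#3"; the actual validation rules of the service are not documented here.

```javascript
// Hypothetical client-side check; the real API may accept a wider range of input.
// Assumed note syntax: letter A-G, optional # or b, then an octave number.
const NOTE_PATTERN = /^[A-G](#|b)?-?\d+$/;

function validateSong(song) {
  const errors = [];
  const notes = Array.isArray(song.notes) ? song.notes : null;
  const lyrics = Array.isArray(song.lyrics) ? song.lyrics : null;

  if (!notes || notes.length === 0) errors.push("notes must be a non-empty array");
  if (!lyrics) errors.push("lyrics must be an array");
  // Assumption: each note carries exactly one lyric syllable.
  if (notes && lyrics && notes.length !== lyrics.length) {
    errors.push(`expected one lyric per note, got ${notes.length} notes and ${lyrics.length} lyrics`);
  }
  for (const note of notes ?? []) {
    if (typeof note !== "string" || !NOTE_PATTERN.test(note)) {
      errors.push(`unrecognized note name: ${note}`);
    }
  }
  if (!Number.isFinite(song.tempo) || song.tempo <= 0) {
    errors.push("tempo must be a positive number");
  }
  return errors; // an empty array means no problems were found
}
```

Calling `validateSong(songData.song)` before the request lets an app report problems without spending an API call.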
High Performance
Process multiple voices simultaneously with our multi-threaded rendering engine for real-time synthesis and batch processing
Voice Types
- High Soprano (Operatic)
- Mezzo-Soprano
- Tenor & Bass
- Classical, Opera & Gospel Choir
- Children's Choir
- Custom Models (On Request)
Cloud Storage
Automatic file handling with secure cloud storage integration and version control for all your voice synthesis projects
Core Endpoints
/api/upload
Upload and normalize music files in any supported notation format
Supported Formats:
- MIDI (.mid)
- MusicXML (.xml, .mxl)
- ABC Notation (.abc)
- JSON-based notation (.json)
- Gibber (.gibber)
{
"id": "123abc",
"detectedFormat": "midi",
"status": "uploaded"
}
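A rough sketch of how a client might call this endpoint follows. Several details are assumptions rather than documented behavior: the base URL `https://cantai.app`, a multipart upload with a field named `file`, Bearer authentication as in the Quick Start, and every format label other than `"midi"` (the only one shown in the sample response). `detectFormat` and `uploadScore` are hypothetical helper names.

```javascript
const BASE_URL = "https://cantai.app"; // assumed host, taken from the Quick Start

// Extensions from the supported-formats list; labels other than "midi" are guesses.
const EXTENSION_TO_FORMAT = new Map([
  [".mid", "midi"],
  [".xml", "musicxml"],
  [".mxl", "musicxml"],
  [".abc", "abc"],
  [".json", "json"],
  [".gibber", "gibber"],
]);

// Returns the assumed format label for a filename, or null if unsupported.
function detectFormat(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return null;
  return EXTENSION_TO_FORMAT.get(filename.slice(dot).toLowerCase()) ?? null;
}

// Uploads a score; assumes multipart/form-data with a "file" field.
async function uploadScore(blob, filename, apiKey, fetchImpl = fetch) {
  if (detectFormat(filename) === null) {
    throw new Error(`Unsupported file extension: ${filename}`);
  }
  const form = new FormData();
  form.append("file", blob, filename);
  const response = await fetchImpl(`${BASE_URL}/api/upload`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  return response.json(); // expected shape: { id, detectedFormat, status }
}
```

Checking the extension locally is only a convenience; the `detectedFormat` field in the response remains the authoritative answer.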
/api/render
Convert notation into high-quality AI vocals with customizable parameters
Parameters:
- id: Upload ID from the previous step
- outputFormat: WAV, MP3, or Opus
- voiceProfile: Voice type selection
- reverb: Enable/disable reverb processing
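The four parameters above could be assembled as in the following sketch. The parameter names come from the list; everything else is an assumption: that the request is a JSON `POST`, that format values are lowercase strings, that `reverb` is a boolean, and that the host is `https://cantai.app`. `buildRenderRequest` and `requestRender` are hypothetical names, and valid `voiceProfile` identifiers are not listed in this document, so the value is passed through unchecked.

```javascript
const BASE_URL = "https://cantai.app"; // assumed host, taken from the Quick Start
const OUTPUT_FORMATS = ["wav", "mp3", "opus"]; // assumed lowercase spellings

// Builds and validates the request body from the documented parameter names.
function buildRenderRequest({ id, outputFormat = "wav", voiceProfile, reverb = false }) {
  if (typeof id !== "string" || id.length === 0) {
    throw new TypeError("id must be the non-empty upload ID");
  }
  const format = String(outputFormat).toLowerCase();
  if (!OUTPUT_FORMATS.includes(format)) {
    throw new RangeError(`outputFormat must be one of: ${OUTPUT_FORMATS.join(", ")}`);
  }
  const body = { id, outputFormat: format, reverb: Boolean(reverb) };
  if (voiceProfile !== undefined) body.voiceProfile = voiceProfile;
  return body;
}

// Sends the render request; POST with a JSON body is an assumption.
async function requestRender(params, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`${BASE_URL}/api/render`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildRenderRequest(params)),
  });
  if (!response.ok) {
    throw new Error(`Render request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```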
/api/transcribe
Convert audio recordings into musical notation
Features:
- WAV/MP3/Opus input support
- MIDI/MusicXML output formats
- Automatic note detection
- Rhythm analysis
/api/status/{id}
Check processing status of rendering or transcription tasks
Status Values:
- queued: In processing queue
- processing: Currently being processed
- completed: Ready for download
- failed: Error occurred
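Since rendering and transcription run asynchronously, a client would typically poll this endpoint until a terminal status appears. The sketch below shows one way to do that. It assumes the response is JSON with a `status` field holding one of the four values above, that the host is `https://cantai.app`, and that Bearer authentication applies; the polling interval and attempt limit are arbitrary illustrative defaults, and `waitForTask` is a hypothetical name.

```javascript
const BASE_URL = "https://cantai.app"; // assumed host, taken from the Quick Start
const TERMINAL_STATUSES = new Set(["completed", "failed"]);

// True once a task will no longer change state.
function isTerminalStatus(status) {
  return TERMINAL_STATUSES.has(status);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls /api/status/{id} until the task completes, fails, or the attempt limit is hit.
async function waitForTask(id, { apiKey, fetchImpl = fetch, intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(`${BASE_URL}/api/status/${encodeURIComponent(id)}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }
    const { status } = await response.json(); // assumed response shape
    if (isTerminalStatus(status)) {
      if (status === "failed") throw new Error(`Task ${id} failed`);
      return status;
    }
    await sleep(intervalMs); // still queued or processing
  }
  throw new Error(`Task ${id} did not finish after ${maxAttempts} checks`);
}
```

Passing `fetchImpl` as an option keeps the loop testable without network access; production code can omit it and use the global `fetch`.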
/api/download/{id}
Download processed audio or notation files
Returns:
- Audio files (WAV/MP3/Opus)
- Notation files (MIDI/MusicXML)
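Because the same endpoint may return either audio or notation, a client needs some way to decide how to name the saved file. One possible approach, sketched here, is to inspect the `Content-Type` response header. Whether the service actually sets these media types is an assumption, as are the host `https://cantai.app` and Bearer authentication; `extensionForContentType` and `downloadResult` are hypothetical helpers.

```javascript
const BASE_URL = "https://cantai.app"; // assumed host, taken from the Quick Start

// Common media types for the listed outputs; the API's real headers may differ.
const CONTENT_TYPE_TO_EXTENSION = new Map([
  ["audio/wav", ".wav"],
  ["audio/x-wav", ".wav"],
  ["audio/mpeg", ".mp3"],
  ["audio/opus", ".opus"],
  ["audio/ogg", ".ogg"],
  ["audio/midi", ".mid"],
  ["application/vnd.recordare.musicxml+xml", ".xml"],
  ["application/vnd.recordare.musicxml", ".mxl"],
]);

// Strips parameters such as "; charset=..." and falls back to ".bin".
function extensionForContentType(contentType) {
  const mime = String(contentType ?? "").split(";")[0].trim().toLowerCase();
  return CONTENT_TYPE_TO_EXTENSION.get(mime) ?? ".bin";
}

// Fetches the finished file and suggests a filename based on its media type.
async function downloadResult(id, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`${BASE_URL}/api/download/${encodeURIComponent(id)}`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  const extension = extensionForContentType(response.headers.get("content-type"));
  return { filename: `${id}${extension}`, bytes };
}
```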
Simple, Usage-Based Pricing
Pay-As-You-Go
- AWS GPU costs ~$0.10/min
- No Monthly Minimum
- 50%+ Profit Margin
- Basic Support
High-Volume Bulk
- Batch Processing
- GPU Costs ≤$0.05/min
- Priority Support
- Volume Discounts
Enterprise License
- ~50,000 Minutes Included
- $0.20/min Additional Usage
- SLA & Dedicated Support
- Custom Integration
All plans include standard voice models, API access, and basic audio formats
Prices shown in USD. Additional fees may apply for high-priority processing or specialized voice models.
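As a worked example of the Enterprise License figures above, the following sketch estimates the charge for usage beyond the included allowance. It treats the allowance as exactly 50,000 minutes even though the listing gives it as approximate, and it covers only the $0.20/min overage; the base license fee is not stated in this document, so it is left out. `enterpriseOverageUsd` is a hypothetical helper name.

```javascript
// Figures from the Enterprise License tier; the allowance is listed as approximate.
const INCLUDED_MINUTES = 50000;
const OVERAGE_CENTS_PER_MINUTE = 20; // $0.20/min, kept in cents to limit float drift

// Returns the overage charge in USD, excluding the (unstated) base license fee.
function enterpriseOverageUsd(minutesUsed) {
  if (!Number.isFinite(minutesUsed) || minutesUsed < 0) {
    throw new RangeError("minutesUsed must be a non-negative number");
  }
  const extraMinutes = Math.max(0, minutesUsed - INCLUDED_MINUTES);
  return (extraMinutes * OVERAGE_CENTS_PER_MINUTE) / 100;
}
```

For instance, 62,500 minutes in a billing period is 12,500 minutes over the allowance, which at $0.20/min comes to $2,500 of additional usage.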
⚠️ API details and supported formats may change without notice before our Spring 2025 launch date. Check back for updates.