Bark：基于转换器的文本到音频模型

Bark是由Suno创建的一个基于转换器的文本到音频模型。Bark可以生成高度逼真的多语言语音以及其他音频，包括音乐、背景噪音和简单的音效。该模型还可以产生非语言交流，如大笑、叹息和哭泣。为了支持研究社区，我们正在提供对预先训练的模型检查点的访问，以便进行推理。

1、安装

pip install git+https://github.com/suno-ai/bark.git

git clone https://github.com/suno-ai/bark
cd bark && pip install .

2、支持的语言

3、基本用法

from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
from IPython.display import Audio

# download and load all models
preload_models()

# generate audio from text
text_prompt = """
Hello, my name is Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.
"""
audio_array = generate_audio(text_prompt)

# save audio to disk
write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)

# play text in notebook
Audio(audio_array, rate=SAMPLE_RATE)

如果转换其它语言，比如日文

text_prompt = """
    추석은 내가 가장 좋아하는 명절이다. 나는 며칠 동안 휴식을 취하고 친구 및 가족과 시간을 보낼 수 있습니다.
"""
audio_array = generate_audio(text_prompt)

音乐：

text_prompt = """
    ♪ In the jungle, the mighty jungle, the lion barks tonight ♪
"""
audio_array = generate_audio(text_prompt)

语言预置（只列出中文的）：

text_prompt = """
    I have a silky smooth voice, and today I will tell you about 
    the exercise regimen of the common sloth.
"""
audio_array = generate_audio(text_prompt, history_prompt="v2/en_speaker_1")

Bark：基于转换器的文本到音频模型-LMLPHP

一些非语音的声音：

[laughter]
[laughs]
[sighs]
[music]
[gasps]
[clears throat]
— or ... for hesitations
♪ for song lyrics
CAPITALIZATION for emphasis of a word
[MAN] and [WOMAN] to bias Bark toward male and female speakers, respectively

caridle

Bark：基于转换器的文本到音频模型