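This tutorial walks you through implementing text-to-speech and speech-to-text with Spring Boot and exposing both features as HTTP endpoints.

Step 1: Create the Spring Boot project

Use Spring Initializr (https://start.spring.io/) to create a new Spring Boot project. When adding dependencies, select "Spring Web" and "Thymeleaf".

Step 2: Configure dependencies

In the project's pom.xml, add the Baidu AIP SDK, which covers both speech synthesis and speech recognition: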

<dependencies>
    <!-- Spring Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Thymeleaf for HTML templates -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>

    <!-- Baidu AIP SDK; confirm on Maven Central that this version is published -->
    <dependency>
        <groupId>com.baidu.aip</groupId>
        <artifactId>java-sdk</artifactId>
        <version>4.20.0</version>
    </dependency>
</dependencies>

Step 3: Configure Baidu AIP

In the src/main/resources directory, create (or edit) the application.properties file and add the credentials of your Baidu speech application:

baidu.app.id=YOUR_APP_ID
baidu.api.key=YOUR_API_KEY
baidu.secret.key=YOUR_SECRET_KEY
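
Leaving one of the YOUR_... placeholders in place is an easy mistake that only shows up later as an authentication error from Baidu. The following is a minimal sketch of a fail-fast check, using only the JDK. The class name BaiduConfigCheck, the method unconfiguredKeys, and the sample app ID are hypothetical; only the three property keys come from the snippet above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class BaiduConfigCheck {

    // Property keys taken from the application.properties snippet above.
    static final List<String> REQUIRED_KEYS =
            List.of("baidu.app.id", "baidu.api.key", "baidu.secret.key");

    /** Returns the keys that are absent, blank, or still set to a YOUR_... placeholder. */
    public static List<String> unconfiguredKeys(Properties props) {
        List<String> problems = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.isBlank() || value.trim().startsWith("YOUR_")) {
                problems.add(key);
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Inline sample with a made-up app ID; a real check would load the file from the classpath.
        props.load(new StringReader(
                "baidu.app.id=12345678\n"
              + "baidu.api.key=YOUR_API_KEY\n"
              + "baidu.secret.key=\n"));
        System.out.println("Unconfigured keys: " + unconfiguredKeys(props));
    }
}
```

A check like this could run once at startup so that a misconfigured deployment fails immediately rather than on the first request.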

Step 4: Create the controller class

In the src/main/java/com/example/demo directory, create a class named BaiduAipController.java:

package com.example.demo;

import com.baidu.aip.speech.AipSpeech;
import com.baidu.aip.speech.TtsResponse;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.HashMap;

@RestController
@RequestMapping("/baidu-aip")
public class BaiduAipController {

    @Value("${baidu.app.id}")
    private String appId;

    @Value("${baidu.api.key}")
    private String apiKey;

    @Value("${baidu.secret.key}")
    private String secretKey;

    private AipSpeech initAipSpeechClient() {
        return new AipSpeech(appId, apiKey, secretKey);
    }

    @PostMapping(value = "/text-to-speech", consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
    public byte[] textToSpeech(@RequestBody TextToSpeechRequest request) {
        AipSpeech client = initAipSpeechClient();

        HashMap<String, Object> options = new HashMap<>();
        options.put("spd", request.getSpeed());
        options.put("pit", request.getPitch());
        options.put("vol", request.getVolume());
        options.put("per", request.getPersonality());

        TtsResponse response = client.synthesis(request.getText(), "zh", 1, options);

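        // TtsResponse has no success flag: getData() holds the audio on success,
        // and getResult() holds the error JSON on failure.
        byte[] audio = response.getData();
        if (audio != null) {
            return audio;
        } else {
            throw new RuntimeException("Failed to convert text to speech. Error: " + response.getResult());
        }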
    }

    @PostMapping(value = "/speech-to-text", consumes = MediaType.APPLICATION_OCTET_STREAM_VALUE)
    public String speechToText(@RequestBody byte[] audioData) {
        AipSpeech client = initAipSpeechClient();

        HashMap<String, Object> options = new HashMap<>();
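        options.put("dev_pid", 1536); // Mandarin (with simple English recognition)

        org.json.JSONObject result = client.asr(audioData, "wav", 16000, options);

        // A successful response also carries "err_msg": "success.", so check err_no instead.
        if (result.optInt("err_no", -1) != 0) {
            throw new RuntimeException("Failed to convert speech to text. Error: " + result);
        } else {
            return result.toString();
        }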
    }
}
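
The speech-to-text endpoint above always tells Baidu that the upload is a 16000 Hz WAV file, so a recording made with different settings is likely to be rejected or misrecognized. Below is a rough sketch of a pre-flight check that reads the WAV header before calling asr. It assumes a canonical 44-byte PCM header, and the mono and 16-bit checks are assumptions about what the service expects; only the 16000 Hz rate comes from the code above. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavHeaderCheck {

    /** Returns null if the header looks acceptable, otherwise a short description of the problem. */
    public static String describeProblem(byte[] audio) {
        if (audio == null || audio.length < 44) {
            return "shorter than a 44-byte WAV header";
        }
        String riff = new String(audio, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(audio, 8, 4, StandardCharsets.US_ASCII);
        if (!riff.equals("RIFF") || !wave.equals("WAVE")) {
            return "not a RIFF/WAVE file";
        }
        // WAV header fields are little-endian; offsets assume the canonical layout.
        ByteBuffer header = ByteBuffer.wrap(audio).order(ByteOrder.LITTLE_ENDIAN);
        int channels = header.getShort(22);
        int sampleRate = header.getInt(24);
        int bitsPerSample = header.getShort(34);
        if (sampleRate != 16000) return "sample rate is " + sampleRate + ", expected 16000";
        if (channels != 1) return "channel count is " + channels + ", expected 1";
        if (bitsPerSample != 16) return "bit depth is " + bitsPerSample + ", expected 16";
        return null;
    }

    /** Builds a canonical 44-byte PCM header with no sample data, for demonstration only. */
    public static byte[] header(int sampleRate, int channels, int bitsPerSample) {
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII)).putInt(36);
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII)).putInt(16);
        b.putShort((short) 1).putShort((short) channels).putInt(sampleRate);
        b.putInt(sampleRate * channels * bitsPerSample / 8);
        b.putShort((short) (channels * bitsPerSample / 8)).putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII)).putInt(0);
        return b.array();
    }

    public static void main(String[] args) {
        System.out.println(describeProblem(header(16000, 1, 16)));
        System.out.println(describeProblem(header(44100, 2, 16)));
    }
}
```

In the controller, a non-null result from describeProblem(audioData) could be turned into a 400 response before any request is sent to Baidu. Files with extra chunks before the data chunk would need a more tolerant parser than this one.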

Step 5: Create the request model class

In the src/main/java/com/example/demo directory, create a class named TextToSpeechRequest.java:

public class TextToSpeechRequest {

    private String text;
    private String speed;
    private String pitch;
    private String volume;
    private String personality;

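    // Getters and setters: the controller calls the getters, and JSON binding uses the setters.
    public String getText() { return text; }
    public void setText(String text) { this.text = text; }
    public String getSpeed() { return speed; }
    public void setSpeed(String speed) { this.speed = speed; }
    public String getPitch() { return pitch; }
    public void setPitch(String pitch) { this.pitch = pitch; }
    public String getVolume() { return volume; }
    public void setVolume(String volume) { this.volume = volume; }
    public String getPersonality() { return personality; }
    public void setPersonality(String personality) { this.personality = personality; }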
}
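
Because every field of the request model is a plain String, nothing stops a client from sending an empty text or an out-of-range speed. The sketch below shows one possible server-side check, using only the JDK. The ranges mirror the min/max attributes of the form in step 6 (1 to 15 for speed, pitch, and volume; 0 or 1 for personality); the class name TtsOptionValidator and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TtsOptionValidator {

    /** Adds an error message if value is not an integer within [min, max]. */
    static void checkRange(String name, String value, int min, int max, List<String> errors) {
        try {
            int parsed = Integer.parseInt(value.trim());
            if (parsed < min || parsed > max) {
                errors.add(name + " must be between " + min + " and " + max + ", got " + parsed);
            }
        } catch (NumberFormatException | NullPointerException e) {
            // Covers both non-numeric input and a missing (null) field.
            errors.add(name + " must be an integer, got " + value);
        }
    }

    /** Returns a list of validation errors; an empty list means the request looks usable. */
    public static List<String> validate(String text, String speed, String pitch,
                                        String volume, String personality) {
        List<String> errors = new ArrayList<>();
        if (text == null || text.isBlank()) {
            errors.add("text must not be empty");
        }
        // Ranges mirror the min/max attributes of the form in step 6.
        checkRange("speed", speed, 1, 15, errors);
        checkRange("pitch", pitch, 1, 15, errors);
        checkRange("volume", volume, 1, 15, errors);
        checkRange("personality", personality, 0, 1, errors);
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("hello", "5", "5", "5", "0"));
        System.out.println(validate("", "16", "5", "abc", "0"));
    }
}
```

The controller could call validate with the request's getter values and reject the request when the returned list is not empty.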

Step 6: Create the front-end page

In the src/main/resources/templates directory, create an HTML file named index.html:

<!DOCTYPE html>
<html lang="en" xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Baidu AIP Demo</title>
</head>
<body>

<h2>Text to Speech</h2>
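<form id="tts-form">
    <label for="text">Text:</label>
    <input type="text" id="text" name="text" required>
    <br>
    <label for="speed">Speed (1-15):</label>
    <input type="number" id="speed" name="speed" min="1" max="15" value="5">
    <br>
    <label for="pitch">Pitch (1-15):</label>
    <input type="number" id="pitch" name="pitch" min="1" max="15" value="5">
    <br>
    <label for="volume">Volume (1-15):</label>
    <input type="number" id="volume" name="volume" min="1" max="15" value="5">
    <br>
    <label for="personality">Personality (0 for female, 1 for male):</label>
    <input type="number" id="personality" name="personality" min="0" max="1" value="0">
    <br>
    <button type="submit">Convert to Speech</button>
</form>
<audio id="tts-audio" controls></audio>
<script>
    // An HTML form cannot submit application/json, so send the JSON body with fetch instead.
    document.getElementById('tts-form').addEventListener('submit', async (event) => {
        event.preventDefault();
        const body = Object.fromEntries(new FormData(event.target).entries());
        const response = await fetch('/baidu-aip/text-to-speech', {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify(body)
        });
        if (!response.ok) { alert('Text-to-speech failed: ' + response.status); return; }
        // Baidu TTS returns MP3 by default; label the bytes so the audio element can play them.
        const audio = new Blob([await response.arrayBuffer()], {type: 'audio/mpeg'});
        document.getElementById('tts-audio').src = URL.createObjectURL(audio);
    });
</script>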

<hr>

<h2>Speech to Text</h2>
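<form id="stt-form">
    <label for="audio">Upload Audio File:</label>
    <input type="file" id="audio" name="audio" accept=".wav" required>
    <br>
    <button type="submit">Convert to Text</button>
</form>
<pre id="stt-result"></pre>
<script>
    // Send the raw file bytes as application/octet-stream, which is what the endpoint consumes.
    document.getElementById('stt-form').addEventListener('submit', async (event) => {
        event.preventDefault();
        const response = await fetch('/baidu-aip/speech-to-text', {
            method: 'POST',
            headers: {'Content-Type': 'application/octet-stream'},
            body: document.getElementById('audio').files[0]
        });
        document.getElementById('stt-result').textContent = await response.text();
    });
</script>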

</body>
</html>

Step 7: Run the application

From the project's root directory, run the following command to start the application:

mvn spring-boot:run

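Step 8: Access the application

Open http://localhost:8080/ in a browser (8080 is Spring Boot's default port) to load the page from step 6.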
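
The two endpoints can also be exercised without a browser. As a rough sketch, the JDK's built-in java.net.http client can build the requests like this. The base URL http://localhost:8080 assumes Spring Boot's default port, the JSON field names follow TextToSpeechRequest, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class BaiduAipClientSketch {

    /** Builds the POST request for /baidu-aip/text-to-speech; jsonBody must already be valid JSON. */
    public static HttpRequest textToSpeechRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/baidu-aip/text-to-speech"))
                .header("Content-Type", "application/json")
                .header("Accept", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    /** Builds the POST request for /baidu-aip/speech-to-text with the raw WAV bytes as the body. */
    public static HttpRequest speechToTextRequest(String baseUrl, byte[] wavBytes) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/baidu-aip/speech-to-text"))
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofByteArray(wavBytes))
                .build();
    }

    public static void main(String[] args) {
        String json = "{\"text\":\"hello\",\"speed\":\"5\",\"pitch\":\"5\","
                + "\"volume\":\"5\",\"personality\":\"0\"}";
        HttpRequest tts = textToSpeechRequest("http://localhost:8080", json);
        System.out.println(tts.method() + " " + tts.uri());
        // To actually send it once the application is running:
        // byte[] audio = java.net.http.HttpClient.newHttpClient()
        //         .send(tts, java.net.http.HttpResponse.BodyHandlers.ofByteArray()).body();
    }
}
```

Building the request separately from sending it keeps the sketch runnable even when the application is not started; the commented-out lines show where the actual call would go.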
