민팽로그
Task 02 - Google Cloud: Speech-to-Text API (4)
Streaming speech recognition
Argh!!!!!
Anyway, I ran the last example: the streaming-recognition code.
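(To build the example, the google-cloud-speech client library has to be on the classpath. A minimal Maven sketch using Google's libraries BOM; the version number here is just illustrative, so check for the current one:)

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.32.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>
```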
import com.google.api.gax.rpc.ClientStream;
import com.google.api.gax.rpc.ResponseObserver;
import com.google.api.gax.rpc.StreamController;
import com.google.cloud.speech.v1.*;
import com.google.protobuf.ByteString;
import java.util.ArrayList;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.DataLine.Info;
import javax.sound.sampled.TargetDataLine;

public class StreamingTest {

  // Streaming recognition from the microphone
  public static void streamingMicRecognize() throws Exception {
    ResponseObserver<StreamingRecognizeResponse> responseObserver = null;
    try (SpeechClient client = SpeechClient.create()) {
      responseObserver =
          new ResponseObserver<StreamingRecognizeResponse>() {
            ArrayList<StreamingRecognizeResponse> responses = new ArrayList<>();

            public void onStart(StreamController controller) {}

            public void onResponse(StreamingRecognizeResponse response) {
              responses.add(response);
            }

            public void onComplete() {
              for (StreamingRecognizeResponse response : responses) {
                if (response.getResultsList().isEmpty()) {
                  continue; // skip responses that carry no result
                }
                StreamingRecognitionResult result = response.getResultsList().get(0);
                SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
                System.out.printf("Transcript : %s\n", alternative.getTranscript());
              }
            }

            public void onError(Throwable t) {
              System.out.println(t);
            }
          };

      ClientStream<StreamingRecognizeRequest> clientStream =
          client.streamingRecognizeCallable().splitCall(responseObserver);

      RecognitionConfig recognitionConfig =
          RecognitionConfig.newBuilder()
              .setEncoding(RecognitionConfig.AudioEncoding.LINEAR16)
              .setLanguageCode("ko-KR")
              .setSampleRateHertz(16000)
              .build();
      StreamingRecognitionConfig streamingRecognitionConfig =
          StreamingRecognitionConfig.newBuilder().setConfig(recognitionConfig).build();

      // The first request in a streaming call has to be a config-only request
      StreamingRecognizeRequest request =
          StreamingRecognizeRequest.newBuilder()
              .setStreamingConfig(streamingRecognitionConfig)
              .build();
      clientStream.send(request);

      // SampleRate: 16000 Hz, SampleSizeInBits: 16, Channels: 1, Signed: true, bigEndian: false
      AudioFormat audioFormat = new AudioFormat(16000, 16, 1, true, false);
      // Set the system information to read from the microphone audio stream
      DataLine.Info targetInfo = new Info(TargetDataLine.class, audioFormat);
      if (!AudioSystem.isLineSupported(targetInfo)) {
        System.out.println("Microphone not supported");
        System.exit(0);
      }

      // The target data line captures the audio stream the microphone produces
      TargetDataLine targetDataLine = (TargetDataLine) AudioSystem.getLine(targetInfo);
      targetDataLine.open(audioFormat);
      targetDataLine.start();
      System.out.println("Start speaking");
      long startTime = System.currentTimeMillis();
      AudioInputStream audio = new AudioInputStream(targetDataLine);

      while (true) {
        long estimatedTime = System.currentTimeMillis() - startTime;
        if (estimatedTime > 60000) { // stop after 60 seconds
          System.out.println("Stop speaking.");
          targetDataLine.stop();
          targetDataLine.close();
          break;
        }
        byte[] data = new byte[6400];
        int bytesRead = audio.read(data); // only forward the bytes actually captured
        if (bytesRead <= 0) {
          continue;
        }
        request =
            StreamingRecognizeRequest.newBuilder()
                .setAudioContent(ByteString.copyFrom(data, 0, bytesRead))
                .build();
        clientStream.send(request);
      }
      // Signal end-of-input while the observer and client are still valid
      responseObserver.onComplete();
    } catch (Exception e) {
      System.out.println(e);
    }
  }

  public static void main(String[] args) {
    try {
      streamingMicRecognize();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
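By the way, the 6400-byte buffer isn't arbitrary: at the configured format (16 kHz, 16-bit, mono) it holds exactly 200 ms of audio per streaming request. A quick back-of-the-envelope check:

```java
public class ChunkSizeCheck {
    public static void main(String[] args) {
        int sampleRateHz = 16000;  // matches setSampleRateHertz(16000)
        int bytesPerSample = 2;    // LINEAR16 = 16-bit PCM
        int channels = 1;          // mono
        int chunkMillis = 200;     // duration covered by each streaming request
        // bytes per chunk = samples/sec * bytes/sample * channels * seconds
        int bytes = sampleRateHz * bytesPerSample * channels * chunkMillis / 1000;
        System.out.println(bytes); // prints 6400, the buffer size used in the loop
    }
}
```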
- Result -

Uh...... is my ㄷ pronunciation really that bad...?
I said "왕 달고" and "왕 길다", but it heard "발고" and "길가"... what even is that...
Anyway, once "Start speaking" is printed you get 60 seconds to talk, and when you're done the whole transcript comes pouring out as text in the console.
I originally mentioned my friends' names, but I swapped the content out for anonymity(?)!
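A side note: with the config above, `onComplete` only delivers everything after the stream ends. The streaming config also has an interim-results flag, so if you'd rather watch partial hypotheses appear while you're still talking, the change is one line in the builder (a sketch; the rest of the pipeline stays as above, and `onResponse` would then receive non-final results too):

```java
StreamingRecognitionConfig streamingRecognitionConfig =
    StreamingRecognitionConfig.newBuilder()
        .setConfig(recognitionConfig)
        .setInterimResults(true) // deliver partial hypotheses as they form
        .build();
```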