민팽로그
Task 02 - Google Cloud: Speech-to-Text API (4)
Streaming speech recognition
Argh!!!!!
Anyway, I ran the last example: the streaming-recognition code.
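(To build the example, the google-cloud-speech client library has to be on the classpath. A minimal Maven sketch using Google's libraries BOM; the version number here is just illustrative, so check for the current one:)

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.32.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>
```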
import com.google.api.gax.rpc.ClientStream;
import com.google.api.gax.rpc.ResponseObserver;
import com.google.api.gax.rpc.StreamController;
import com.google.cloud.speech.v1.*;
import com.google.protobuf.ByteString;
import java.util.ArrayList;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.DataLine.Info;
import javax.sound.sampled.TargetDataLine;

public class StreamingTest {

  // Streaming recognition from the microphone
  public static void streamingMicRecognize() throws Exception {
    ResponseObserver<StreamingRecognizeResponse> responseObserver = null;
    try (SpeechClient client = SpeechClient.create()) {
      responseObserver =
          new ResponseObserver<StreamingRecognizeResponse>() {
            ArrayList<StreamingRecognizeResponse> responses = new ArrayList<>();

            public void onStart(StreamController controller) {}

            public void onResponse(StreamingRecognizeResponse response) {
              responses.add(response);
            }

            public void onComplete() {
              for (StreamingRecognizeResponse response : responses) {
                if (response.getResultsList().isEmpty()) {
                  continue; // skip responses that carry no result
                }
                StreamingRecognitionResult result = response.getResultsList().get(0);
                SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
                System.out.printf("Transcript : %s\n", alternative.getTranscript());
              }
            }

            public void onError(Throwable t) {
              System.out.println(t);
            }
          };

      ClientStream<StreamingRecognizeRequest> clientStream =
          client.streamingRecognizeCallable().splitCall(responseObserver);

      RecognitionConfig recognitionConfig =
          RecognitionConfig.newBuilder()
              .setEncoding(RecognitionConfig.AudioEncoding.LINEAR16)
              .setLanguageCode("ko-KR")
              .setSampleRateHertz(16000)
              .build();
      StreamingRecognitionConfig streamingRecognitionConfig =
          StreamingRecognitionConfig.newBuilder().setConfig(recognitionConfig).build();

      // The first request in a streaming call has to be a config-only request
      StreamingRecognizeRequest request =
          StreamingRecognizeRequest.newBuilder()
              .setStreamingConfig(streamingRecognitionConfig)
              .build();
      clientStream.send(request);

      // SampleRate: 16000 Hz, SampleSizeInBits: 16, Channels: 1, Signed: true, bigEndian: false
      AudioFormat audioFormat = new AudioFormat(16000, 16, 1, true, false);
      // Set the system information to read from the microphone audio stream
      DataLine.Info targetInfo = new Info(TargetDataLine.class, audioFormat);
      if (!AudioSystem.isLineSupported(targetInfo)) {
        System.out.println("Microphone not supported");
        System.exit(0);
      }

      // The target data line captures the audio stream the microphone produces
      TargetDataLine targetDataLine = (TargetDataLine) AudioSystem.getLine(targetInfo);
      targetDataLine.open(audioFormat);
      targetDataLine.start();
      System.out.println("Start speaking");
      long startTime = System.currentTimeMillis();
      AudioInputStream audio = new AudioInputStream(targetDataLine);

      while (true) {
        long estimatedTime = System.currentTimeMillis() - startTime;
        if (estimatedTime > 60000) { // stop after 60 seconds
          System.out.println("Stop speaking.");
          targetDataLine.stop();
          targetDataLine.close();
          break;
        }
        byte[] data = new byte[6400];
        int bytesRead = audio.read(data); // only forward the bytes actually captured
        if (bytesRead <= 0) {
          continue;
        }
        request =
            StreamingRecognizeRequest.newBuilder()
                .setAudioContent(ByteString.copyFrom(data, 0, bytesRead))
                .build();
        clientStream.send(request);
      }
      // Signal end-of-input while the observer and client are still valid
      responseObserver.onComplete();
    } catch (Exception e) {
      System.out.println(e);
    }
  }

  public static void main(String[] args) {
    try {
      streamingMicRecognize();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
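By the way, the 6400-byte buffer isn't arbitrary: at the configured format (16 kHz, 16-bit, mono) it holds exactly 200 ms of audio per streaming request. A quick back-of-the-envelope check:

```java
public class ChunkSizeCheck {
    public static void main(String[] args) {
        int sampleRateHz = 16000;  // matches setSampleRateHertz(16000)
        int bytesPerSample = 2;    // LINEAR16 = 16-bit PCM
        int channels = 1;          // mono
        int chunkMillis = 200;     // duration covered by each streaming request
        // bytes per chunk = samples/sec * bytes/sample * channels * seconds
        int bytes = sampleRateHz * bytesPerSample * channels * chunkMillis / 1000;
        System.out.println(bytes); // prints 6400, the buffer size used in the loop
    }
}
```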
- Result -

Uh...... is my ㄷ pronunciation really that bad...?
I said "왕 달고" and "왕 길다", but it heard "발고" and "길가"... what even is that...
Anyway, once "Start speaking" is printed you get 60 seconds to talk, and when you're done the whole transcript comes pouring out as text in the console.
I originally mentioned my friends' names, but I swapped the content out for anonymity(?)!
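A side note: with the config above, `onComplete` only delivers everything after the stream ends. The streaming config also has an interim-results flag, so if you'd rather watch partial hypotheses appear while you're still talking, the change is one line in the builder (a sketch; the rest of the pipeline stays as above, and `onResponse` would then receive non-final results too):

```java
StreamingRecognitionConfig streamingRecognitionConfig =
    StreamingRecognitionConfig.newBuilder()
        .setConfig(recognitionConfig)
        .setInterimResults(true) // deliver partial hypotheses as they form
        .build();
```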