[stt] Speech Recognition API Guide (#aiq #stt #YouTube #SubtitleExtraction)

2022. 4. 28. 10:07

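In this post, I will use the stream feature of AIQ to extract text from an audio file of 'Lee Da-ji's Backstory of Crown Prince Sado' (a 14-minute wav file).

We will run STT (speech-to-text) through the API for AIQ, a technology from the conversational AI company Skelter Labs.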

https://aiq.skelterlabs.com/


* This post is based on the Notion page below.

https://www.notion.so/API-32daceb508ab430bbf27e15a0387c971


<Process>

1. Get an API key: https://www.notion.so/API-32daceb508ab430bbf27e15a0387c971


2. Download the STT code: https://github.com/SkelterLabsInc/python-speech

3. Convert the YouTube video you want into a wav file. This guide uses a file prepared in advance, "idaji_sadoseja_backstory.wav":

https://drive.google.com/file/d/1nNksZAS0MWB3z3V_5RHeFWrlIgLfBnNt/view?usp=sharing
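Before sending the file, it can help to check its basic properties. The following is a minimal sketch using Python's standard `wave` module. The helper name `describe_wav` is hypothetical and is not part of the python-speech repository. This post does not state which sample rate or channel count the API expects, so compare the output against the API guide.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM wav file (hypothetical helper)."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the frame count divided by frames per second.
            "duration_seconds": frames / rate,
        }


# Example call (the file name comes from this post; adjust the path as needed):
# print(describe_wav("idaji_sadoseja_backstory.wav"))
```

For the 14-minute file used here, `duration_seconds` should come out at roughly 840.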


4. Set up a virtual environment with conda

$ conda create -n aiq python=3.7

$ conda activate aiq

5. Install the required packages

https://github.com/SkelterLabsInc/python-speech/tree/master/stt


$ pip install -U -r ./requirements.txt

6. Edit grpc_stream.py so that it can recognize the wav file. Set the following values:

- AIQ portal address,

- API key,

- wav file name,

- wav file path

↓ Code changes
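The four settings above could also be collected on the command line instead of being edited into the script. The sketch below assumes this approach and is not the actual contents of grpc_stream.py. Only `--api-key` appears in this post; `--address`, `--wav-dir`, `--wav-name`, and the helper names are hypothetical.

```python
import argparse
import os


def build_parser():
    """Build a parser for the four settings (flag names other than --api-key are hypothetical)."""
    parser = argparse.ArgumentParser(description="Settings for a streaming STT run")
    parser.add_argument("--api-key", required=True, help="API key issued in step 1")
    parser.add_argument("--address", required=True, help="AIQ portal address")
    parser.add_argument("--wav-dir", default=".", help="Folder that contains the wav file")
    parser.add_argument(
        "--wav-name",
        default="idaji_sadoseja_backstory.wav",
        help="Name of the wav file to recognize",
    )
    return parser


def wav_path(args):
    """Join the folder and the file name into a single path."""
    return os.path.join(args.wav_dir, args.wav_name)
```

Keeping the API key out of the source file this way also makes it less likely that the key ends up in a public repository.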

7. Edit grpc_utils.py and utils.py so that the text is saved to a file

- Change: added code that collects the printed words in a list and saves them to a txt file

- Modified code on GitHub: https://github.com/donghwan2/stt_aiq
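The idea behind this change can be sketched as follows. This is an illustration only, not the code in the repository above. The function name `save_transcript` and the sample words are hypothetical placeholders, not real API output.

```python
def save_transcript(words, path):
    """Write the collected words to a UTF-8 text file, separated by spaces."""
    text = " ".join(word.strip() for word in words if word.strip())
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write(text + "\n")
    return text


# Collect each word at the point where it is printed, then save once at the end.
recognized_words = []
for word in ["hello", "world"]:  # placeholder words, not real recognition results
    print(word)
    recognized_words.append(word)

# Example call once recognition has finished:
# save_transcript(recognized_words, "transcript.txt")
```

Opening the file with `encoding="utf-8"` matters here because the recognized text is Korean.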

8. Add the wav file to the stt folder

File location: https://drive.google.com/file/d/1nNksZAS0MWB3z3V_5RHeFWrlIgLfBnNt/view?usp=sharing


9. Run the code

$ ./grpc_stream.py --api-key=<your API key>

(Screenshot of the code running)
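Conceptually, a streaming client reads the audio in small pieces and sends them one at a time rather than uploading the whole file at once. The sketch below covers only the chunking part, with no network calls. It is not taken from grpc_stream.py; the helper name `iter_audio_chunks` and the chunk size of 1600 frames are assumptions chosen for illustration.

```python
import wave


def iter_audio_chunks(path, frames_per_chunk=1600):
    """Yield raw PCM byte chunks from a wav file (hypothetical helper)."""
    with wave.open(path, "rb") as wav_file:
        while True:
            chunk = wav_file.readframes(frames_per_chunk)
            if not chunk:
                # readframes returns empty bytes at the end of the file.
                break
            yield chunk


# Example call (each chunk would be handed to the streaming request in turn):
# for chunk in iter_audio_chunks("idaji_sadoseja_backstory.wav"):
#     ...
```

Using a generator means the 14-minute file never has to be held in memory in full.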

datart
