AI Audio Generators and ‘Her’

In 2021, SBS’s New Year special program 〈Fight of the Century: AI vs Human〉 covered a composing competition between EVOM, South Korea’s first AI songwriter, and human songwriter Do-il Kim.

Image from Google Deepmind’s V2A Technology Introduction Page

Three years have passed, and AI audio generators have advanced rapidly. Stability AI and ElevenLabs announced that their technologies can generate music up to three minutes long and sound effects such as “People cheering in the baseball stadium” from text prompts. Google DeepMind also recently revealed V2A (Video-to-Audio), a technology that can ‘understand’ the videos users upload and generate soundtracks that match their atmosphere.

AI audio generators have brought a new wave to music by enabling people to compose through text. However, once the ‘human voice’ is involved, they raise a different set of issues that we should not overlook. For example, the ‘AI cover’ of the song 〈밤양갱(Chestnut Paste Jelly)〉, made with a model trained on IU’s voice, respects neither IU’s neighboring rights nor her personality rights because of the way it is produced.

A still image from the movie 〈Her〉

Furthermore, the issue extends to cases of ‘imitation’. OpenAI’s latest multimodal model, GPT-4o, received high praise for its ability to recognize voice, text, and video, yet it was criticized for violating the publicity rights of Scarlett Johansson, who played the AI assistant in the movie 〈Her〉, by ‘impersonating’ her voice in its voice module, ‘Sky’.

Overall, the technology and use of AI audio generators will not remain a question of methodology alone. Their future will be shaped by the discourse surrounding ethics, regulations, and limitations.


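1. Stable Audio 2.0 introduction page

2. ElevenLabs sound effects introduction page

3. Google DeepMind V2A technology introduction page

4. [AI Talk-terview] Asking composer Professor Chang-wook Ahn about “AI composer EVOM”

5. A flood of ‘AI cover songs’… a ‘dilemma’ between technological competition and copyright

6. 〈밤양갱〉 sung by ‘AI IU’… even with 440,000 views, the ‘real IU’ gets 0 won

7. Scarlett Johansson: "OpenAI copied my voice after I declined to appear in GPT-4o"

8. Sam Altman loses his ‘safety pin’ with Sutskever’s departure… caught in a bind over the Scarlett Johansson ‘voice imitation’ controversy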


