Created: 2024-08-11
Updated: 2024-07-31
Installing espeak-ng on macOS with port
Installing espeak-ng with port
Using espeak-ng with docker
sszuev/ubuntu-jammy-openjdk-17-espeak-ng
A docker image named sszuev/ubuntu-jammy-openjdk-17-espeak-ng is publicly available, so I will pull it and try it out.
docker pull sszuev/ubuntu-jammy-openjdk-17-espeak-ng
Using default tag: latest
latest: Pulling from sszuev/ubuntu-jammy-openjdk-17-espeak-ng
Digest: sha256:4ecfd2fe2689ea9392665594b5225137718bc6f8f2153d2c5ffb76fd0095089e
Status: Image is up to date for sszuev/ubuntu-jammy-openjdk-17-espeak-ng:latest
docker.io/sszuev/ubuntu-jammy-openjdk-17-espeak-ng:latest
What's next:
View a summary of image vulnerabilities and recommendations → docker scout quickview sszuev/ubuntu-jammy-openjdk-17-espeak-ng
docker run -it sszuev/ubuntu-jammy-openjdk-17-espeak-ng espeak-ng -w test.wav 'this is a pen'
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
A warning is shown, but I will try running it anyway.
/usr/bin/espeak-ng: /usr/bin/espeak-ng: cannot execute binary file
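The warning means the image was built for linux/amd64 while Docker is running on an arm64 host, so the container can only run under emulation. Below is a minimal sketch for checking this before running anything. The helper names docker_arch and check_platform are made up for illustration, and the mapping only covers the two architectures that appear in this log.

```shell
#!/bin/sh
# Map `uname -m` output to the architecture names Docker uses.
# Only the architectures seen in this article are handled; others pass through.
docker_arch() {
  case "$1" in
    x86_64 | amd64) echo amd64 ;;
    aarch64 | arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Compare the host architecture with the one recorded in a local image.
# Requires docker and an already pulled image.
check_platform() {
  image=$1
  host=$(docker_arch "$(uname -m)")
  img=$(docker image inspect --format '{{.Architecture}}' "$image") || return 2
  if [ "$host" = "$img" ]; then
    echo "match: $host"
  else
    echo "mismatch: host=$host image=$img (emulation needed)"
    return 1
  fi
}
```

For example, check_platform sszuev/ubuntu-jammy-openjdk-17-espeak-ng would be expected to report a mismatch on an Apple Silicon Mac.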
docker run --platform linux/x86_64 -it sszuev/ubuntu-jammy-openjdk-17-espeak-ng espeak-ng -w test.wav 'this is a pen'
/usr/bin/espeak-ng: /usr/bin/espeak-ng: cannot execute binary file
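Since adding --platform does not change the result, emulation alone may not be the cause. The doubled path in the message is the form bash prints when it is handed a binary as if it were a script (for example, bash /usr/bin/espeak-ng). One possibility, not verified against this image, is that the image defines a shell as its ENTRYPOINT and the arguments are passed to that shell. The sketch below inspects the entrypoint and then overrides it. The function names and the DOCKER variable (which lets the command be swapped out for a dry run) are assumptions for illustration.

```shell
#!/bin/sh
# DOCKER can be set to "echo" to print the command instead of running it.

# Show the ENTRYPOINT and CMD recorded in the image configuration.
show_entrypoint() {
  ${DOCKER:-docker} image inspect \
    --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' "$1"
}

# Run espeak-ng as the container's entrypoint, bypassing the image's own.
# $1: image name, remaining args: passed to espeak-ng
run_espeak() {
  image=$1
  shift
  ${DOCKER:-docker} run --rm --platform linux/amd64 \
    --entrypoint espeak-ng "$image" "$@"
}
```

A harmless first probe would be run_espeak sszuev/ubuntu-jammy-openjdk-17-espeak-ng --version, which only prints version information if the binary can be executed.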
Next, I will enter the container and run it from there.
root@09f830188c86:/# espeak-ng 'This is a pen'
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
error: No such file or directory
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
error: No such file or directory
root@09f830188c86:/#
It appears to be trying to play sound through the host's sound card and failing.
lukeum/espeak-ng
A docker image named lukeum/espeak-ng is also publicly available, so I will pull it and try it out.
docker pull lukeum/espeak-ng
Using default tag: latest
latest: Pulling from lukeum/espeak-ng
Digest: sha256:b78ccbe9da2f1cdc66dabf34e0643681e75976f477640fe0ec0513fa5e81b8e9
Status: Image is up to date for lukeum/espeak-ng:latest
docker.io/lukeum/espeak-ng:latest
What's next:
View a summary of image vulnerabilities and recommendations → docker scout quickview lukeum/espeak-ng
The same warning is shown, but I will try running it anyway.
docker run -it lukeum/espeak-ng espeak-ng -w test.wav 'this is a pen'
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
docker run --platform linux/x86_64 -it lukeum/espeak-ng espeak-ng -w test.wav 'this is a pen'
It finished without any error, but the all-important test.wav is nowhere to be found.
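The file is most likely not missing but written inside the container: -w test.wav resolves against the container's working directory, and nothing from the container filesystem appears on the host unless a directory is shared. A minimal sketch that bind-mounts the current directory follows. The mount point /work, the function name, and the DOCKER dry-run variable are arbitrary choices for illustration.

```shell
#!/bin/sh
# DOCKER can be set to "echo" to print the command instead of running it.

# Synthesize text to a WAV file in the current host directory.
# $1: image name, $2: output file name, remaining args: text to speak
speak_to_host_file() {
  image=$1
  out=$2
  shift 2
  # -v shares the host's current directory; -w makes it the working
  # directory so that a relative output path lands on the host.
  ${DOCKER:-docker} run --rm --platform linux/amd64 \
    -v "$PWD:/work" -w /work \
    "$image" espeak-ng -w "$out" "$@"
}
```

With this, speak_to_host_file lukeum/espeak-ng test.wav 'this is a pen' should leave test.wav in the directory the command was run from.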
I will enter the container and run it from there.
root@09f830188c86:/# espeak-ng 'This is a pen'
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
error: No such file or directory
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
error: No such file or directory
root@09f830188c86:/#
As before, it appears to be trying to play sound through the host's sound card and failing.
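A container normally has no sound device, so playback through ALSA is expected to fail there. One way around it, sketched below under the assumption that the image runs espeak-ng as in the docker run commands above, is to have espeak-ng write the audio to standard output with --stdout (listed in its help text) and redirect that to a file on the host. The -t option is left out on purpose because a pseudo-terminal can alter binary output. The function name and the DOCKER variable are illustrative.

```shell
#!/bin/sh
# DOCKER can be set to "echo" to print the command instead of running it.

# Synthesize text in a container and capture the WAV data on the host.
# $1: image name, $2: host output file, remaining args: text to speak
speak_via_stdout() {
  image=$1
  out=$2
  shift 2
  ${DOCKER:-docker} run --rm --platform linux/amd64 \
    "$image" espeak-ng --stdout "$@" > "$out"
}
```

Compared with sharing a directory, this keeps the container free of any host mounts, at the cost of handling only one output stream per run.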
Installing espeak-ng on macOS with port
Having no other choice, I will install it with port. It pulls in a lot of dependencies, which is a little depressing.
sudo port install espeak-ng
Password:
---> Computing dependencies for espeak-ng
The following dependencies will be installed:
Xft2
at-spi2-atk
at-spi2-core
atk
avahi
bash
brotli
cairo
coreutils
db48
dbus
dbus-glib
dbus-python312
fftw-3
fftw-3-single
flac
fontconfig
freetype
fribidi
gdbm
gdk-pixbuf2
gettext
gettext-tools-libs
glib2
gmp
gobject-introspection
graphite2
gtk3
harfbuzz
hicolor-icon-theme
iso-codes
lame
lerc
libdaemon
libdeflate
libelf
libepoxy
libevent
libgcc
libgcc14
libjpeg-turbo
libogg
libopus
libpixman
libpng
libsndfile
libtextstyle
libtool
libvorbis
libxkbcommon
libxkbcommon-x11
libxml2
lz4
m4
mesa
mpg123
orc
ossp-uuid
pango
pcaudiolib
pcre2
perl5.34
pulseaudio
py312-cairo
py312-gdbm
py312-gobject3
py312-mako
py312-markdown
py312-markupsafe
py312-packaging
py312-setuptools
python312
readline
shared-mime-info
sonic
soxr
speexdsp
tiff
xkbcomp
xkeyboard-config
xorg-libX11
xorg-libXau
xorg-libXcomposite
xorg-libXcursor
xorg-libXdamage
xorg-libXdmcp
xorg-libXext
xorg-libXfixes
xorg-libXi
xorg-libXinerama
xorg-libXrandr
xorg-libXtst
xorg-libice
xorg-libsm
xorg-libxcb
xorg-libxkbfile
xorg-xcb-proto
xorg-xcb-util
xorg-xorgproto
xrender
zstd
Continue? [Y/n]: y
The installation finished without problems, so I will try it right away.
espeak-ng 'This is a pen'
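The first launch was sluggish, presumably because some initialization was being performed, but from the second launch onward it started smoothly.
Next, I will write the output to a file in WAV format.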
espeak-ng 'This is a pen' -w test.wav
This also produced the file without problems.
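To confirm that the output really is a WAV file rather than just checking that it exists, the header can be inspected: a WAV file is a RIFF container, so it begins with the ASCII bytes RIFF and carries WAVE at bytes 9 to 12. A small sketch with a hypothetical helper name:

```shell
#!/bin/sh
# Return success if the file starts with a RIFF/WAVE header.
is_wav() {
  # Bytes 1-4 must be "RIFF"; bytes 9-12 must be "WAVE".
  [ "$(head -c 4 "$1")" = "RIFF" ] &&
    [ "$(tail -c +9 "$1" | head -c 4)" = "WAVE" ]
}
```

For example, is_wav test.wav && echo "looks like a WAV file". On macOS, afplay test.wav plays the file.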
espeak-ng --help
eSpeak NG text-to-speech: 1.51.1 Data at: /opt/local/share/espeak-ng-data
espeak-ng [options] ["<words>"]
-f <text file> Text file to speak
--stdin Read text input from stdin at once till to the end of a stream.
If neither -f nor --stdin are provided, then <words> from arguments are spoken,
or text is spoken from stdin, read separately one line by line at a time.
-a <integer>
Amplitude, 0 to 200, default is 100
-d <device>
Use the specified device to speak the audio on. If not specified, the
default audio device is used.
-g <integer>
Word gap. Pause between words, units of 10mS at the default speed
-k <integer>
Indicate capital letters with: 1=sound, 2=the word "capitals",
higher values indicate a pitch increase (try -k20).
-l <integer>
Line length. If not zero (which is the default), consider
lines less than this length as end-of-clause
-p <integer>
Pitch adjustment, 0 to 99, default is 50
-s <integer>
Speed in approximate words per minute. The default is 175
-v <voice name>
Use voice file of this name from espeak-ng-data/voices
-w <wave file name>
Write speech to this WAV file, rather than speaking it directly
-b Input text encoding, 1=UTF8, 2=8 bit, 4=16 bit
-m Interpret SSML markup, and ignore other < > tags
-q Quiet, don't produce any speech (may be useful with -x)
-x Write phoneme mnemonics to stdout
-X Write phonemes mnemonics and translation trace to stdout
-z No final sentence pause at the end of the text
--compile=<voice name>
Compile pronunciation rules and dictionary from the current
directory. <voice name> specifies the language
--compile-debug=<voice name>
Compile pronunciation rules and dictionary from the current
directory, including line numbers for use with -X.
<voice name> specifies the language
--compile-mbrola=<voice name>
Compile an MBROLA voice
--compile-intonations
Compile the intonation data
--compile-phonemes=<phsource-dir>
Compile the phoneme data using <phsource-dir> or the default phsource directory
--ipa Write phonemes to stdout using International Phonetic Alphabet
--path="<path>"
Specifies the directory containing the espeak-ng-data directory
--pho Write mbrola phoneme data (.pho) to stdout or to the file in --phonout
--phonout="<filename>"
Write phoneme output from -x -X --ipa and --pho to this file
--punct="<characters>"
Speak the names of punctuation characters during speaking. If
=<characters> is omitted, all punctuation is spoken.
--sep=<character>
Separate phonemes (from -x --ipa) with <character>.
Default is space, z means ZWJN character.
--split=<minutes>
Starts a new WAV file every <minutes>. Used with -w
--stdout Write speech output to stdout
--tie=<character>
Use a tie character within multi-letter phoneme names.
Default is U+361, z means ZWJ character.
--version Shows version number and date, and location of espeak-ng-data
--voices=<language>
List the available voices for the specified language.
If <language> is omitted, then list all voices.
--load Load voice from a file in current directory by name.
-h, --help Show this help.
espeak-ng --voices
Pty Language Age/Gender VoiceName File Other Languages
5 af --/M Afrikaans gmw/af
5 am --/M Amharic sem/am
5 an --/M Aragonese roa/an
5 ar --/M Arabic sem/ar
5 as --/M Assamese inc/as
5 az --/M Azerbaijani trk/az
5 ba --/M Bashkir trk/ba
5 be --/M Belarusian zle/be
5 bg --/M Bulgarian zls/bg
5 bn --/M Bengali inc/bn
5 bpy --/M Bishnupriya_Manipuri inc/bpy
5 bs --/M Bosnian zls/bs
5 ca --/M Catalan roa/ca
5 chr-US-Qaaa-x-west --/M Cherokee_ iro/chr
5 cmn --/M Chinese_(Mandarin,_latin_as_English) sit/cmn (zh-cmn 5)(zh 5)
5 cmn-latn-pinyin --/M Chinese_(Mandarin,_latin_as_Pinyin) sit/cmn-Latn-pinyin (zh-cmn 5)(zh 5)
5 cs --/M Czech zlw/cs
5 cv --/M Chuvash trk/cv
5 cy --/M Welsh cel/cy
5 da --/M Danish gmq/da
5 de --/M German gmw/de
5 el --/M Greek grk/el
5 en-029 --/M English_(Caribbean) gmw/en-029 (en 10)
2 en-gb --/M English_(Great_Britain) gmw/en (en 2)
5 en-gb-scotland --/M English_(Scotland) gmw/en-GB-scotland (en 4)
5 en-gb-x-gbclan --/M English_(Lancaster) gmw/en-GB-x-gbclan (en-gb 3)(en 5)
5 en-gb-x-gbcwmd --/M English_(West_Midlands) gmw/en-GB-x-gbcwmd (en-gb 9)(en 9)
5 en-gb-x-rp --/M English_(Received_Pronunciation) gmw/en-GB-x-rp (en-gb 4)(en 5)
2 en-us --/M English_(America) gmw/en-US (en 3)
5 en-us-nyc --/M English_(America,_New_York_City) gmw/en-US-nyc
5 eo --/M Esperanto art/eo
5 es --/M Spanish_(Spain) roa/es
5 es-419 --/M Spanish_(Latin_America) roa/es-419 (es-mx 6)(es 6)
5 et --/M Estonian urj/et
5 eu --/M Basque eu
5 fa --/M Persian ira/fa
5 fa-latn --/M Persian_(Pinglish) ira/fa-Latn
5 fi --/M Finnish urj/fi
5 fr-be --/M French_(Belgium) roa/fr-BE (fr 8)
5 fr-ch --/M French_(Switzerland) roa/fr-CH (fr 8)
5 fr-fr --/M French_(France) roa/fr (fr 5)
5 ga --/M Gaelic_(Irish) cel/ga
5 gd --/M Gaelic_(Scottish) cel/gd
5 gn --/M Guarani sai/gn
5 grc --/M Greek_(Ancient) grk/grc
5 gu --/M Gujarati inc/gu
5 hak --/M Hakka_Chinese sit/hak
5 haw --/M Hawaiian map/haw
5 he --/M Hebrew sem/he
5 hi --/M Hindi inc/hi
5 hr --/M Croatian zls/hr (hbs 5)
5 ht --/M Haitian_Creole roa/ht
5 hu --/M Hungarian urj/hu
5 hy --/M Armenian_(East_Armenia) ine/hy (hy-arevela 5)
5 hyw --/M Armenian_(West_Armenia) ine/hyw (hy-arevmda 5)(hy 8)
5 ia --/M Interlingua art/ia
5 id --/M Indonesian poz/id
5 io --/M Ido art/io
5 is --/M Icelandic gmq/is
5 it --/M Italian roa/it
5 ja --/M Japanese jpx/ja
5 jbo --/M Lojban art/jbo
5 ka --/M Georgian ccs/ka
5 kk --/M Kazakh trk/kk
5 kl --/M Greenlandic esx/kl
5 kn --/M Kannada dra/kn
5 ko --/M Korean ko
5 kok --/M Konkani inc/kok
5 ku --/M Kurdish ira/ku
5 ky --/M Kyrgyz trk/ky
5 la --/M Latin itc/la
5 lb --/M Luxembourgish gmw/lb
5 lfn --/M Lingua_Franca_Nova art/lfn
5 lt --/M Lithuanian bat/lt
5 ltg --/M Latgalian bat/ltg
5 lv --/M Latvian bat/lv
5 mi --/M Māori poz/mi
5 mk --/M Macedonian zls/mk
5 ml --/M Malayalam dra/ml
5 mr --/M Marathi inc/mr
5 ms --/M Malay poz/ms
5 mt --/M Maltese sem/mt
5 my --/M Myanmar_(Burmese) sit/my
5 nb --/M Norwegian_Bokmål gmq/nb (no 5)
5 nci --/M Nahuatl_(Classical) azc/nci
5 ne --/M Nepali inc/ne
5 nl --/M Dutch gmw/nl
5 nog --/M Nogai trk/nog
5 om --/M Oromo cus/om
5 or --/M Oriya inc/or
5 pa --/M Punjabi inc/pa
5 pap --/M Papiamento roa/pap
5 piqd --/M Klingon art/piqd
5 pl --/M Polish zlw/pl
5 pt --/M Portuguese_(Portugal) roa/pt (pt-pt 5)
5 pt-br --/M Portuguese_(Brazil) roa/pt-BR (pt 6)
5 py --/M Pyash art/py
5 qdb --/M Lang_Belta art/qdb
5 qu --/M Quechua qu
5 quc --/M K'iche' myn/quc
5 qya --/M Quenya art/qya
5 ro --/M Romanian roa/ro
5 ru --/M Russian zle/ru
2 ru-lv --/M Russian_(Latvia) zle/ru-LV
5 sd --/M Sindhi inc/sd
5 shn --/M Shan_(Tai_Yai) tai/shn
5 si --/M Sinhala inc/si
5 sjn --/M Sindarin art/sjn
5 sk --/M Slovak zlw/sk
5 sl --/M Slovenian zls/sl
5 smj --/M Lule_Saami urj/smj
5 sq --/M Albanian ine/sq
5 sr --/M Serbian zls/sr
5 sv --/M Swedish gmq/sv
5 sw --/M Swahili bnt/sw
5 ta --/M Tamil dra/ta
5 te --/M Telugu dra/te
5 th --/M Thai tai/th
5 tk --/M Turkmen trk/tk
5 tn --/M Setswana bnt/tn
5 tr --/M Turkish trk/tr
5 tt --/M Tatar trk/tt
5 ug --/M Uyghur trk/ug
5 uk --/M Ukrainian zle/uk
5 ur --/M Urdu inc/ur
5 uz --/M Uzbek trk/uz
5 vi --/M Vietnamese_(Northern) aav/vi
5 vi-vn-x-central --/M Vietnamese_(Central) aav/vi-VN-x-central
5 vi-vn-x-south --/M Vietnamese_(Southern) aav/vi-VN-x-south
5 yue --/M Chinese_(Cantonese) sit/yue (zh-yue 5)(zh 8)
5 yue --/M Chinese_(Cantonese,_latin_as_Jyutping) sit/yue-Latn-jyutping (zh-yue 5)(zh 8)
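The listing is whitespace-separated, so the Language column can be pulled out with awk when only the codes are needed, for example to loop over voices in a script. A minimal sketch (the function name is made up):

```shell
#!/bin/sh
# Read `espeak-ng --voices` output on stdin and print the Language
# column (second field), skipping the header line.
voice_codes() {
  awk 'NR > 1 { print $2 }'
}
```

A typical use would be espeak-ng --voices=en | voice_codes, which relies on the --voices=<language> form described in the help text.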
Specifying the code on the far right (the File column) should make it speak with that voice, but only the intonation changes; the voice itself stays the same.
espeak-ng -v gmw/en-US-nyc 'this is a pen'
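This is probably because the entries in the list are language definitions (pronunciation and intonation rules) that all share the same synthesizer voice. eSpeak NG also has voice variants that can be appended to a voice name with +, such as en-us+f3, and these change the character of the voice. The sketch below tries a few of them. Which variants are available depends on the installed espeak-ng-data, and the helper names and the ESPEAK dry-run variable are assumptions for illustration.

```shell
#!/bin/sh
# ESPEAK can be set to "echo" to print the arguments instead of speaking.

# Build a voice name with a variant suffix, e.g. "en-us" "f3" -> "en-us+f3".
voice_with_variant() {
  printf '%s+%s\n' "$1" "$2"
}

# Speak the same text with a few variants of one language.
# $1: language code, remaining args: text to speak
try_variants() {
  lang=$1
  shift
  for variant in m3 f3 whisper; do
    ${ESPEAK:-espeak-ng} -v "$(voice_with_variant "$lang" "$variant")" "$@"
  done
}
```

For example, try_variants en-us 'this is a pen' speaks the sentence three times, once per variant.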
With -x, it prints the phoneme mnemonics to stdout in addition to speaking.
espeak-ng -x Ubuntu
u:b'u:ntu:
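According to the help text, -q suppresses speech and --ipa writes the phonemes in the International Phonetic Alphabet, so either output can be obtained without any audio. A small sketch; the function names and the ESPEAK variable are illustrative.

```shell
#!/bin/sh
# ESPEAK can be set to "echo" to print the arguments instead of running.

# Print espeak-ng phoneme mnemonics for the given text without speaking.
phonemes() {
  ${ESPEAK:-espeak-ng} -q -x "$@"
}

# Print the IPA transcription for the given text without speaking.
ipa() {
  ${ESPEAK:-espeak-ng} -q --ipa "$@"
}
```

For example, phonemes Ubuntu prints the mnemonics shown above silently, and ipa Ubuntu prints the corresponding IPA form.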
That's all for today.