Configuring Whisper on Ubuntu 20.04.6 (2024-01-27, updated 2024-01-31 15:48)

1. First, you need an NVIDIA graphics card. I used a second-hand GTX 1080 bought on Pinduoduo (PDD) for ¥800, which is quite possibly a former mining card!

2. Install the latest NVIDIA driver and CUDA correctly. This step is optional; a quick PyTorch check to confirm the GPU is visible is shown right after the command summary below.

3. Configure Whisper. The commands I ran are summarized below, followed by the full terminal session.

rootroot@rootroot-X99-Turbo:~$
rootroot@rootroot-X99-Turbo:~$ python -m pip install --upgrade pip
(conda is optional; you can skip this download)
rootroot@rootroot-X99-Turbo:~$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
rootroot@rootroot-X99-Turbo:~$ ffmpeg
rootroot@rootroot-X99-Turbo:~$ pip install -U openai-whisper
rootroot@rootroot-X99-Turbo:~$ pip install tiktoken
rootroot@rootroot-X99-Turbo:~$ pip install setuptools-rust
rootroot@rootroot-X99-Turbo:~$ whisper audio.mp3 --model medium --language Chinese
rootroot@rootroot-X99-Turbo:~$ whisper chi.mp4 --model medium --language Chinese
rootroot@rootroot-X99-Turbo:~$ sudo apt-get install ffmpeg
rootroot@rootroot-X99-Turbo:~$ time(whisper chs.mp4 --model medium --language Chinese)
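
A minimal sketch to verify step 2, assuming the torch package pulled in by openai-whisper is the one in use (the file name check_cuda.py is just an example):

import torch

# True means Whisper can run on the GPU; False means it will silently fall back to CPU.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # On this machine this should print something like "NVIDIA GeForce GTX 1080".
    print("Device:", torch.cuda.get_device_name(0))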

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ python -m pip install --upgrade pip

Collecting pip

Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)

|████████████████████████████████| 2.1 MB 690 kB/s

Installing collected packages: pip

Successfully installed pip-23.3.2

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ sudo mkdir /opt/tools

rootroot@rootroot-X99-Turbo:~$ cd /opt/tools/

rootroot@rootroot-X99-Turbo:/opt/tools$

rootroot@rootroot-X99-Turbo:/opt/tools$ ll

total 8

drwxr-xr-x 2 root root 4096 1月 26 12:21 ./

drwxr-xr-x 4 root root 4096 1月 26 12:21 ../

rootroot@rootroot-X99-Turbo:/opt/tools$

rootroot@rootroot-X99-Turbo:/opt/tools$ cd ~

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

--2024-01-26 12:22:28-- https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8203, ...

Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 141613749 (135M) [application/octet-stream]

Saving to: 'Miniconda3-latest-Linux-x86_64.sh'

Miniconda3-latest-Linux-x86_64.sh 100%[=============================================================================================>] 135.05M 2.82MB/s in 51s

2024-01-26 12:23:20 (2.65 MB/s) - 'Miniconda3-latest-Linux-x86_64.sh' saved [141613749/141613749]

rootroot@rootroot-X99-Turbo:~$ ffmpeg

ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers

built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

libavutil 56. 31.100 / 56. 31.100

libavcodec 58. 54.100 / 58. 54.100

libavformat 58. 29.100 / 58. 29.100

libavdevice 58. 8.100 / 58. 8.100

libavfilter 7. 57.100 / 7. 57.100

libavresample 4. 0. 0 / 4. 0. 0

libswscale 5. 5.100 / 5. 5.100

libswresample 3. 5.100 / 3. 5.100

libpostproc 55. 5.100 / 55. 5.100

Hyper fast Audio and Video encoder

usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Use -h to get full help or, even better, run 'man ffmpeg'

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ pip install -U openai-whisper

Defaulting to user installation because normal site-packages is not writeable

Requirement already satisfied: openai-whisper in ./.local/lib/python3.8/site-packages (20231117)

Requirement already satisfied: triton<3,>=2.0.0 in ./.local/lib/python3.8/site-packages (from openai-whisper) (2.2.0)

Requirement already satisfied: numba in ./.local/lib/python3.8/site-packages (from openai-whisper) (0.58.1)

Requirement already satisfied: numpy in ./.local/lib/python3.8/site-packages (from openai-whisper) (1.24.4)

Requirement already satisfied: torch in ./.local/lib/python3.8/site-packages (from openai-whisper) (2.1.2)

Requirement already satisfied: tqdm in ./.local/lib/python3.8/site-packages (from openai-whisper) (4.66.1)

Requirement already satisfied: more-itertools in ./.local/lib/python3.8/site-packages (from openai-whisper) (10.2.0)

Requirement already satisfied: tiktoken in ./.local/lib/python3.8/site-packages (from openai-whisper) (0.5.2)

Requirement already satisfied: filelock in ./.local/lib/python3.8/site-packages (from triton<3,>=2.0.0->openai-whisper) (3.13.1)

Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in ./.local/lib/python3.8/site-packages (from numba->openai-whisper) (0.41.1)

Requirement already satisfied: importlib-metadata in ./.local/lib/python3.8/site-packages (from numba->openai-whisper) (7.0.1)

Requirement already satisfied: regex>=2022.1.18 in ./.local/lib/python3.8/site-packages (from tiktoken->openai-whisper) (2023.12.25)

Requirement already satisfied: requests>=2.26.0 in ./.local/lib/python3.8/site-packages (from tiktoken->openai-whisper) (2.31.0)

Requirement already satisfied: typing-extensions in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (4.9.0)

Requirement already satisfied: sympy in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (1.12)

Requirement already satisfied: networkx in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (3.1)

Requirement already satisfied: jinja2 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (3.1.3)

Requirement already satisfied: fsspec in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (2023.12.2)

Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)

Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)

Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)

Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (8.9.2.26)

Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.3.1)

Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (11.0.2.54)

Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (10.3.2.106)

Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (11.4.5.107)

Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.0.106)

Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (2.18.1)

Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in ./.local/lib/python3.8/site-packages (from torch->openai-whisper) (12.1.105)

Collecting triton<3,>=2.0.0 (from openai-whisper)

Downloading triton-2.1.0-0-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)

Requirement already satisfied: nvidia-nvjitlink-cu12 in ./.local/lib/python3.8/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch->openai-whisper) (12.3.101)

Requirement already satisfied: charset-normalizer<4,>=2 in ./.local/lib/python3.8/site-packages (from requests>=2.26.0->tiktoken->openai-whisper) (3.3.2)

Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (2.8)

Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (1.25.8)

Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken->openai-whisper) (2019.11.28)

Requirement already satisfied: zipp>=0.5 in ./.local/lib/python3.8/site-packages (from importlib-metadata->numba->openai-whisper) (3.17.0)

Requirement already satisfied: MarkupSafe>=2.0 in ./.local/lib/python3.8/site-packages (from jinja2->torch->openai-whisper) (2.1.3)

Requirement already satisfied: mpmath>=0.19 in ./.local/lib/python3.8/site-packages (from sympy->torch->openai-whisper) (1.3.0)

Downloading triton-2.1.0-0-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.2/89.2 MB 25.9 MB/s eta 0:00:00

Installing collected packages: triton

Attempting uninstall: triton

Found existing installation: triton 2.2.0

Uninstalling triton-2.2.0:

Successfully uninstalled triton-2.2.0

Successfully installed triton-2.1.0

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ pip install tiktoken

Defaulting to user installation because normal site-packages is not writeable

Requirement already satisfied: tiktoken in ./.local/lib/python3.8/site-packages (0.5.2)

Requirement already satisfied: regex>=2022.1.18 in ./.local/lib/python3.8/site-packages (from tiktoken) (2023.12.25)

Requirement already satisfied: requests>=2.26.0 in ./.local/lib/python3.8/site-packages (from tiktoken) (2.31.0)

Requirement already satisfied: charset-normalizer<4,>=2 in ./.local/lib/python3.8/site-packages (from requests>=2.26.0->tiktoken) (3.3.2)

Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (2.8)

Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (1.25.8)

Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken) (2019.11.28)

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ pip install setuptools-rust

Defaulting to user installation because normal site-packages is not writeable

Requirement already satisfied: setuptools-rust in ./.local/lib/python3.8/site-packages (1.8.1)

Requirement already satisfied: setuptools>=62.4 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (69.0.3)

Requirement already satisfied: semantic-version<3,>=2.8.2 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (2.10.0)

Requirement already satisfied: tomli>=1.2.1 in ./.local/lib/python3.8/site-packages (from setuptools-rust) (2.0.1)

rootroot@rootroot-X99-Turbo:~$ sudo apt update && sudo apt install ffmpeg

Get:1 file:/var/cuda-repo-ubuntu2004-12-0-local InRelease [1,575 B]

Get:2 file:/var/cuda-repo-ubuntu2004-12-3-local InRelease [1,572 B]

Get:1 file:/var/cuda-repo-ubuntu2004-12-0-local InRelease [1,575 B]

Get:2 file:/var/cuda-repo-ubuntu2004-12-3-local InRelease [1,572 B]

Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal InRelease

Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-updates InRelease

Hit:5 http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-backports InRelease

Hit:6 http://security.ubuntu.com/ubuntu focal-security InRelease

Hit:7 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu focal InRelease

Reading package lists... Done

Building dependency tree

Reading state information... Done

30 packages can be upgraded. Run 'apt list --upgradable' to see them.

Reading package lists... Done

Building dependency tree

Reading state information... Done

ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).

0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ whisper audio.mp3 --model medium --language Chinese

100%|█████████████████████████████████████| 1.42G/1.42G [03:24<00:00, 7.48MiB/s]

Traceback (most recent call last):

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 58, in load_audio

out = run(cmd, capture_output=True, check=True).stdout

File "/usr/lib/python3.8/subprocess.py", line 516, in run

raise CalledProcessError(retcode, process.args,

subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'audio.mp3', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 478, in cli

result = transcribe(model, audio_path, temperature=temperature, **args)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 122, in transcribe

mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 140, in log_mel_spectrogram

audio = load_audio(audio)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 60, in load_audio

raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers

built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

libavutil 56. 31.100 / 56. 31.100

libavcodec 58. 54.100 / 58. 54.100

libavformat 58. 29.100 / 58. 29.100

libavdevice 58. 8.100 / 58. 8.100

libavfilter 7. 57.100 / 7. 57.100

libavresample 4. 0. 0 / 4. 0. 0

libswscale 5. 5.100 / 5. 5.100

libswresample 3. 5.100 / 3. 5.100

libpostproc 55. 5.100 / 55. 5.100

audio.mp3: No such file or directory

Skipping audio.mp3 due to RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers

built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

libavutil 56. 31.100 / 56. 31.100

libavcodec 58. 54.100 / 58. 54.100

libavformat 58. 29.100 / 58. 29.100

libavdevice 58. 8.100 / 58. 8.100

libavfilter 7. 57.100 / 7. 57.100

libavresample 4. 0. 0 / 4. 0. 0

libswscale 5. 5.100 / 5. 5.100

libswresample 3. 5.100 / 3. 5.100

libpostproc 55. 5.100 / 55. 5.100

audio.mp3: No such file or directory

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ whisper chi.mp4 --model medium --language Chinese

Traceback (most recent call last):

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 58, in load_audio

out = run(cmd, capture_output=True, check=True).stdout

File "/usr/lib/python3.8/subprocess.py", line 516, in run

raise CalledProcessError(retcode, process.args,

subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'chi.mp4', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 478, in cli

result = transcribe(model, audio_path, temperature=temperature, **args)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/transcribe.py", line 122, in transcribe

mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 140, in log_mel_spectrogram

audio = load_audio(audio)

File "/home/rootroot/.local/lib/python3.8/site-packages/whisper/audio.py", line 60, in load_audio

raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers

built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

libavutil 56. 31.100 / 56. 31.100

libavcodec 58. 54.100 / 58. 54.100

libavformat 58. 29.100 / 58. 29.100

libavdevice 58. 8.100 / 58. 8.100

libavfilter 7. 57.100 / 7. 57.100

libavresample 4. 0. 0 / 4. 0. 0

libswscale 5. 5.100 / 5. 5.100

libswresample 3. 5.100 / 3. 5.100

libpostproc 55. 5.100 / 55. 5.100

chi.mp4: No such file or directory

Skipping chi.mp4 due to RuntimeError: Failed to load audio: ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers

built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

libavutil 56. 31.100 / 56. 31.100

libavcodec 58. 54.100 / 58. 54.100

libavformat 58. 29.100 / 58. 29.100

libavdevice 58. 8.100 / 58. 8.100

libavfilter 7. 57.100 / 7. 57.100

libavresample 4. 0. 0 / 4. 0. 0

libswscale 5. 5.100 / 5. 5.100

libswresample 3. 5.100 / 3. 5.100

libpostproc 55. 5.100 / 55. 5.100

chi.mp4: No such file or directory

(Both runs above failed simply because audio.mp3 and chi.mp4 do not exist in the home directory; ffmpeg itself is fine, as the reinstall attempt and the file listing below confirm.)

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ sudo apt-get install ffmpeg

Reading package lists... Done

Building dependency tree

Reading state information... Done

ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).

0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ ll *.mp4

-rwx------ 1 rootroot rootroot 3465644 1月 12 01:28 chs.mp4*

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ whisper chs.mp4 --model medium --language Chinese

[00:00.000 --> 00:01.400] 前段時間有個巨石鴻吼

[00:01.400 --> 00:03.000] 某某是男人最好的衣妹

[00:03.000 --> 00:04.800] 這裡的某某可以替換為減肥

[00:04.800 --> 00:07.800] 長髮 西裝 考研 術唱 永潔無間等等等等

[00:07.800 --> 00:09.200] 我聽到最新的一個說法是

[00:09.200 --> 00:12.000] 微分碎蓋加口罩加半框眼鏡加春風衣

[00:12.000 --> 00:13.400] 等於男人最好的衣妹

[00:13.400 --> 00:14.400] 大概也就前幾年

[00:14.400 --> 00:17.400] 春風衣還和格子襯衫並列為程序員穿搭精華

[00:17.400 --> 00:20.000] 紫紅色春風衣還被譽為廣場舞大媽標配

[00:20.000 --> 00:21.600] 路透牌還是我爹這個年紀的人

[00:21.600 --> 00:22.800] 才會願意買的牌子

[00:22.800 --> 00:24.400] 不知道風向為啥變得這麼快

[00:24.400 --> 00:26.800] 為啥這東西突然變成男生逆襲神器

[00:26.800 --> 00:27.800] 時尚潮流單品

[00:27.800 --> 00:29.400] 後來我翻了一下小紅書就懂了

[00:29.400 --> 00:30.400] 時尚這個時期

[00:30.400 --> 00:31.600] 重點不在於衣服

[00:31.600 --> 00:32.200] 在於人

[00:32.200 --> 00:34.600] 先在小紅書上面和春風衣相關的筆記

[00:34.600 --> 00:36.200] 照片裡的男生都是這樣的

[00:36.200 --> 00:37.000] 這樣的

[00:37.000 --> 00:38.000] 還有這樣的

[00:38.000 --> 00:39.400] 你們哪裡是看穿搭的

[00:39.400 --> 00:40.600] 你們明明是看臉

[00:40.600 --> 00:41.800] 就這個造型 這個年齡

[00:41.800 --> 00:44.000] 你換上老頭衫也能穿出氛圍感好嗎

[00:44.000 --> 00:46.600] 我又想起了當年郭德綱老師穿季凡西的殘劇

[00:46.600 --> 00:48.600] 這個世界對我們這些長得不好看的人

[00:48.600 --> 00:49.600] 還真是苛刻的

[00:49.600 --> 00:52.000] 所以說我總結了一下春風衣傳達的要領

[00:52.200 --> 00:54.400] 大概就是一張白鏡且人畜無憾的臉

[00:54.400 --> 00:55.200] 充足的髮量

[00:55.200 --> 00:56.200] 纖細的體型

[00:56.200 --> 00:58.200] 當然身上的春風衣還得是駱駝的

[00:58.200 --> 00:59.400] 去年在戶外用品界

[00:59.400 --> 01:00.200] 最頂流的

[01:00.200 --> 01:01.200] 既不是鳥橡樹

[01:01.200 --> 01:02.800] 也不是有校服之稱的北面

[01:02.800 --> 01:04.200] 或者老臺頂流哥倫比亞

[01:04.200 --> 01:05.000] 而是駱駝

[01:05.000 --> 01:07.200] 雙11 駱駝在天貓戶外服飾品類

[01:07.200 --> 01:09.000] 拿下銷售額和銷量雙料冠軍

[01:09.000 --> 01:10.200] 銷量達到百萬幾

[01:10.200 --> 01:10.800] 再抖音

[01:10.800 --> 01:13.400] 駱駝銷售同比增幅高達296%

[01:13.400 --> 01:16.200] 旗下主打的三合一高性價比春風衣成為爆品

[01:22.600 --> 01:23.200] 至於線下

[01:23.200 --> 01:24.400] 還是網友總覺得好

[01:24.400 --> 01:26.800] 如今在南方街頭的駱駝比沙漠裡的都多

[01:30.000 --> 01:31.200] 至於駱駝為啥這麼火

[01:31.200 --> 01:32.000] 便宜啊

[01:32.000 --> 01:33.600] 拿賣得最好的丁珍同款

[01:33.600 --> 01:35.600] 幻影黑三合一春風衣舉個例子

[01:35.600 --> 01:36.000] 線下買

[01:36.000 --> 01:37.600] 標牌價格2198

[01:37.600 --> 01:39.200] 但是跑到網上看一下

[01:39.200 --> 01:40.800] 標價就變成了699

[01:40.800 --> 01:41.400] 至於折扣

[01:41.400 --> 01:42.400] 日常也都是有的

[01:42.400 --> 01:43.600] 400出頭就能買到

[01:43.600 --> 01:45.200] 甚至有時候能递到300價

[01:45.200 --> 01:46.200] 要是你還嫌貴

[01:46.200 --> 01:48.400] 駱駝還有200塊出頭的單層春風衣

[01:48.400 --> 01:49.200] 就這個價格

[01:49.200 --> 01:51.800] 哥上海恐怕還不夠兩次City Walk的報名費

[01:51.800 --> 01:52.600] 看來這個價格

[01:52.600 --> 01:54.800] 再對比一下北面1000塊錢起步

[01:54.800 --> 01:56.000] 你就能理解為啥北面

[01:56.000 --> 01:58.200] 這麼快就被大學生踢出了校服序列了

[01:58.200 --> 02:00.400] 我不知道現在大學生每個月生活費多少

[02:00.400 --> 02:02.200] 反正按照我上學時候的生活費

[02:02.200 --> 02:03.200] 一個月不吃不喝

[02:03.200 --> 02:05.000] 也就買得起倆袖子加一個帽子

[02:05.000 --> 02:06.400] 難怪當年全是假北面

[02:06.400 --> 02:07.400] 現在都是真駱駝

[02:07.400 --> 02:08.800] 至少人家是正品啊

[02:08.800 --> 02:10.000] 我翻了一下社交媒體

[02:10.000 --> 02:11.200] 發現對駱駝的吐槽

[02:11.200 --> 02:12.000] 和買了駱駝的

[02:12.000 --> 02:13.400] 基本上是1比1的比例

[02:13.400 --> 02:15.000] 吐槽最多的就是衣服會掉色

[02:15.000 --> 02:15.800] 還會串色

[02:15.800 --> 02:17.000] 比如圖層洗個幾次

[02:17.000 --> 02:18.200] 穿個兩天就掉光了

[02:18.200 --> 02:19.600] 比如不同倉庫發的貨

[02:19.600 --> 02:20.600] 質量參差不齊

[02:20.600 --> 02:21.600] 買衣服還得看戶口

[02:21.600 --> 02:22.400] 聽出聲

[02:22.400 --> 02:23.600] 至於什麼做工比較差

[02:23.600 --> 02:24.800] 內膽多 走線操

[02:24.800 --> 02:26.400] 不防水之類的就更多了

[02:26.400 --> 02:27.400] 但是這些吐槽

[02:27.400 --> 02:29.200] 並不意味著會影響駱駝的銷量

[02:29.200 --> 02:30.800] 甚至還會有不少自來水表示

[02:30.800 --> 02:32.600] 就這價格要啥子行車啊

[02:32.600 --> 02:34.000] 所謂性價比性價比

[02:34.000 --> 02:35.200] 脫離價位談性能

[02:35.200 --> 02:37.000] 這就不符合消費者的需求嘛

[02:37.000 --> 02:38.400] 無數次價格戰告訴我們

[02:38.400 --> 02:39.400] 只要肯降價

[02:39.400 --> 02:41.000] 就沒有賣不出去的產品

[02:41.000 --> 02:42.400] 一件衝鋒衣1000多

[02:42.400 --> 02:43.600] 你覺得平平無奇

[02:43.600 --> 02:45.000] 500多你覺得差點意思

[02:45.000 --> 02:46.400] 200塊你就秒下單了

[02:46.400 --> 02:47.000] 到99

[02:47.000 --> 02:48.400] 恐怕就要拼點手速了

[02:48.400 --> 02:49.600] 像衝鋒衣這個品類

[02:49.600 --> 02:50.800] 本來價格跨度就大

[02:50.800 --> 02:52.800] 北面最便宜的GORTEX衝鋒衣

[02:52.800 --> 02:53.800] 價格3000起步

[02:53.800 --> 02:55.200] 大概是同品牌最便宜

[02:55.200 --> 02:56.200] 衝鋒衣的三倍價格

[02:56.200 --> 02:57.200] 至於十足那樣

[02:57.200 --> 02:59.000] 搭載了GORTEX的硬殼起步價

[02:59.000 --> 03:00.000] 就要到4500

[03:00.000 --> 03:01.200] 而且同樣是GORTEX

[03:01.200 --> 03:02.800] 內部也有不同的系列和檔次

[03:02.800 --> 03:03.600] 做成衣服

[03:03.600 --> 03:05.600] 中間的差價恐怕就夠買兩件駱駝了

[03:05.600 --> 03:06.600] 至於智能控溫

[03:06.600 --> 03:07.400] 防水拉鍊

[03:07.400 --> 03:08.000] 全壓膠

[03:08.000 --> 03:09.800] 更加不可能出現在駱駝這裡了

[03:09.800 --> 03:11.800] 至少不會是300 400的駱駝身上會有的

[03:11.800 --> 03:12.800] 有的價外的衣服

[03:12.800 --> 03:14.200] 買的就是一個放棄幻想

[03:14.200 --> 03:15.800] 吃到肚子裡的科技魚很活

[03:15.800 --> 03:17.000] 是能給你省錢的

[03:17.000 --> 03:18.400] 穿在身上的科技魚很活

[03:18.400 --> 03:20.000] 裝裝件件都是要加錢的

[03:20.000 --> 03:21.600] 所以正如羅曼羅蘭所說

[03:21.600 --> 03:23.200] 這世界上只有一種英雄主義

[03:23.200 --> 03:24.800] 就是在認清了駱駝的本質以後

[03:24.800 --> 03:26.000] 依然選擇買駱駝

[03:26.000 --> 03:27.000] 關於駱駝的火爆

[03:27.000 --> 03:28.200] 我有一些小小的看法

[03:28.200 --> 03:29.000] 駱駝這個東西

[03:29.000 --> 03:30.400] 它其實就是個潮牌

[03:30.400 --> 03:32.000] 看看它的營銷方式就知道了

[03:32.000 --> 03:33.000] 現在打開小黃書

[03:33.000 --> 03:35.000] 日常可以看到駱駝穿搭是這樣的

[03:35.000 --> 03:36.600] 加一點氛圍感是這樣的

[03:36.600 --> 03:37.400] 對比一下

[03:37.400 --> 03:39.000] 其他品牌的風格是這樣的

[03:39.000 --> 03:39.800] 這樣的

[03:39.800 --> 03:41.200] 其實對比一下就知道了

[03:41.200 --> 03:42.600] 其他品牌突出一個時程

[03:42.600 --> 03:44.200] 能防風就一定要講防風

[03:44.200 --> 03:46.000] 能扛動就一定要講扛動

[03:46.000 --> 03:47.400] 但駱駝在營銷的時候

[03:47.400 --> 03:49.200] 主打的就是一個城市戶外風

[03:49.200 --> 03:50.400] 雖然造型是春風衣

[03:50.400 --> 03:52.200] 但場景往往是在城市裡

[03:52.200 --> 03:54.200] 哪怕在野外也要突出一個風和日麗

[03:54.200 --> 03:55.000] 陽光美媚

[03:55.000 --> 03:56.400] 至少不會在明顯的嚴寒

[03:56.400 --> 03:58.000] 高海拔或是惡劣氣候下

[03:58.200 --> 04:00.200] 如果用一個詞形容駱駝的營銷風格

[04:00.200 --> 04:01.000] 那就是清洗

[04:01.000 --> 04:03.000] 或者說他很理解自己的消費者是誰

[04:03.000 --> 04:04.000] 需要什麼產品

[04:04.000 --> 04:05.200] 從使用場景來說

[04:05.200 --> 04:06.600] 駱駝的消費者買春風衣

[04:06.600 --> 04:08.800] 不是真的有什麼大風大雨要去應對

[04:08.800 --> 04:11.000] 春風衣的作用是下雨沒帶傘的時候

[04:11.000 --> 04:12.000] 臨時頂個幾分鐘

[04:12.000 --> 04:13.600] 讓你能圖書館跑回宿舍

[04:13.600 --> 04:15.000] 或者是冬天騎電動車

[04:15.000 --> 04:16.200] 被風吹得不行的時候

[04:16.200 --> 04:17.200] 稍微扛一下風

[04:17.200 --> 04:18.400] 不至於體感太冷

[04:18.400 --> 04:19.800] 當然他們也會出門

[04:19.800 --> 04:21.800] 但大部分時候也都是去別的城市

[04:21.800 --> 04:24.000] 或者在城市周邊搞搞簡單的徒步

[04:24.000 --> 04:26.000] 這種情況下穿個駱駝已經夠了

[04:26.000 --> 04:27.200] 從購買動機來說

[04:27.200 --> 04:29.200] 駱駝就更沒有必要上那些應回科技了

[04:29.200 --> 04:31.000] 消費者買駱駝買的是個什麼呢

[04:31.000 --> 04:32.200] 不是春風衣的功能性

[04:32.200 --> 04:33.400] 而是春風衣的造型

[04:33.400 --> 04:34.400] 寬鬆的版型

[04:34.400 --> 04:36.400] 能精準遮住微微隆起的小肚子

[04:36.400 --> 04:37.400] 棱角分明的質感

[04:37.400 --> 04:39.400] 能隱藏一切不完美的身體線條

[04:39.400 --> 04:41.400] 顯瘦的副作用就是顯年輕

[04:41.400 --> 04:42.600] 再配上一條牛仔褲

[04:42.600 --> 04:43.800] 配上一雙大黃靴

[04:43.800 --> 04:45.200] 大學生的氣質就出來了

[04:45.200 --> 04:46.200] 要是自拍的時候

[04:46.200 --> 04:47.800] 再配上大學宿舍洗素臺

[04:47.800 --> 04:49.200] 那永遠擦不乾淨的鏡子

[04:49.200 --> 04:50.600] 瞬間青春無敵了

[04:50.800 --> 04:51.800] 說的更直白一點

[04:51.800 --> 04:53.200] 人家買的是個簡靈神器

[04:53.200 --> 04:53.800] 所以說

[04:53.800 --> 04:56.000] 吐槽穿駱駝都是假戶外愛好者的人

[04:56.000 --> 04:57.600] 其實並沒有理解駱駝的定位

[04:57.600 --> 04:59.800] 駱駝其實是給了想要入門山系穿搭

[04:59.800 --> 05:01.800] 想要追逐流行的人一個最平價

[05:01.800 --> 05:03.000] 決策成本最低的選擇

[05:03.000 --> 05:04.800] 至於那些真正的硬核戶外愛好者

[05:04.800 --> 05:05.800] 駱駝既沒有能力

[05:05.800 --> 05:07.200] 也沒有打算觸打他們

[05:07.200 --> 05:08.000] 反過來說

[05:08.000 --> 05:09.600] 那些自駕穿越邊疆國道

[05:09.600 --> 05:11.800] 或者去奧爾卑斯山區登山探險的人

[05:11.800 --> 05:13.600] 也不太可能在戶外服飾上省錢

[05:13.600 --> 05:15.000] 畢竟光是交通住宿

[05:15.400 --> 05:16.400] 成本就不低了

[05:16.400 --> 05:17.200] 對他們來說

[05:17.200 --> 05:19.000] 戶外裝備很多時候是保命用的

[05:19.000 --> 05:21.000] 也就不存在跟風奧造型的必要了

[05:21.000 --> 05:22.200] 最後我再說個題外話

[05:22.200 --> 05:24.200] 年輕人追捧駱駝一個隱藏的原因

[05:24.200 --> 05:25.800] 其實是羽絨服越來越貴了

[05:25.800 --> 05:26.600] 有媒體統計

[05:26.600 --> 05:30.000] 現在國產羽絨服的平均售價已經高達881元

[05:30.000 --> 05:32.000] 波斯登均價最高接近2000元

[05:32.000 --> 05:32.800] 而且過去幾年

[05:32.800 --> 05:34.800] 國產羽絨服品牌都在轉向高端化

[05:34.800 --> 05:37.000] 羽絨服市場分為8000元以上的奢侈級

[05:37.000 --> 05:38.400] 2000元以下的大眾級

[05:38.400 --> 05:39.800] 而在中間的高端級

[05:39.800 --> 05:41.200] 國產品牌一直沒有存在感

[05:41.200 --> 05:42.200] 所以過去幾年

[05:42.200 --> 05:43.600] 波斯登天工人這些品牌

[05:43.600 --> 05:45.200] 都把2000元到8000元這個市場

[05:45.200 --> 05:46.600] 當成未來的發展趨勢

[05:46.600 --> 05:48.000] 東新證券研報顯示

[05:48.000 --> 05:49.600] 從2018到2021年

[05:49.600 --> 05:52.200] 波斯登均價4年漲幅達到60%以上

[05:52.200 --> 05:53.200] 過去5個菜年

[05:53.200 --> 05:55.000] 這個品牌的營銷開支從20多億

[05:55.000 --> 05:56.000] 漲到了60多億

[05:56.000 --> 05:57.200] 羽絨服價格往上走

[05:57.200 --> 05:59.200] 年輕消費者就開始拋棄羽絨服

[05:59.200 --> 06:00.400] 購買平價衝鋒衣

[06:00.400 --> 06:02.200] 裡面再穿個普通價外的瑤麗絨

[06:02.200 --> 06:03.400] 或者羽絨小夾克

[06:03.400 --> 06:05.200] 也不比大幾千的羽絨服差多少

[06:05.200 --> 06:05.800] 說到底

[06:05.800 --> 06:07.000] 現在消費社會發達了

[06:07.000 --> 06:08.000] 沒有什麼需求是

[06:08.000 --> 06:09.600] 一定要某種特定的解決方案

[06:09.600 --> 06:11.600] 特定價位的商品才能實現的

[06:11.600 --> 06:12.200] 要保暖

[06:12.200 --> 06:13.200] 羽絨服固然很好

[06:13.200 --> 06:15.200] 但衝鋒衣加一些內搭也很暖和

[06:15.200 --> 06:16.000] 要時尚

[06:16.000 --> 06:18.000] 大幾千塊錢的設計師品牌非常不錯

[06:18.000 --> 06:19.400] 但350的拼多多服飾

[06:19.400 --> 06:20.600] 搭得好也能出彩

[06:20.600 --> 06:21.600] 要去野外徒步

[06:21.600 --> 06:23.000] 花五六千買鳥也可以

[06:23.000 --> 06:25.200] 但迪卡農也足以應付大多數狀況

[06:25.200 --> 06:25.800] 所以說

[06:25.800 --> 06:27.600] 花高價買衝鋒衣當然也OK

[06:27.600 --> 06:28.600] 三四百買件駱駝

[06:28.600 --> 06:29.800] 也是可以接受的選擇

[06:29.800 --> 06:32.000] 何況駱駝也多多少少有一些功能性

[06:32.000 --> 06:33.800] 畢竟它再怎麼樣還是個衝鋒衣

[06:33.800 --> 06:34.800] 理解了這個事情

[06:34.800 --> 06:36.800] 就很容易分辨什麼是智商稅的

[06:36.800 --> 06:38.800] 那些向你灌輸非某個品牌不用

[06:38.800 --> 06:39.800] 告訴你某個需求

[06:39.800 --> 06:41.400] 只有某個產品才能滿足

[06:41.400 --> 06:42.200] 某個品牌

[06:42.200 --> 06:44.400] 就是某個品牌絕對的比試鏈頂端

[06:44.400 --> 06:46.800] 這類銀銷的智商稅含量必然是很高的

[06:46.800 --> 06:48.800] 它的目的是剝奪你選擇的權利

[06:48.800 --> 06:51.200] 讓你主動放棄比價和尋找平梯的想法

[06:51.200 --> 06:53.000] 從而避免與其他品牌競爭

[06:53.000 --> 06:54.200] 而沒有競爭的市場

[06:54.200 --> 06:56.200] 才是智商稅含量最高的市場

[06:56.200 --> 06:57.400] 消費商業洞穴

[06:57.400 --> 06:58.400] 禁在IC實驗室

[06:58.400 --> 06:59.000] 我是館長

[06:59.000 --> 07:00.000] 我們下期再見

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$

rootroot@rootroot-X99-Turbo:~$ time(whisper chs.mp4 --model medium --language Chinese)

https://www.toutiao.com/article/7189209812264075835/?app=news_article\&timestamp=1706203570\&use_new_style=1\&req_id=20240126012609901ACEF7F5666533AA21\&group_id=7189209812264075835\&tt_from=mobile_qq\&utm_source=mobile_qq\&utm_medium=toutiao_android\&utm_campaign=client_share\&share_token=5e0cda89-00c5-40fe-afa0-c3c88dd056c4\&source=m_redirect

Whisper, a speech recognition model claimed to have reached human-level accuracy: is it really that impressive?

The language argument of the transcribe function currently supports 99 languages, listed below (a short Python example follows the list):

"en": "english","zh": "chinese",

"de": "german","es": "spanish",

"ru": "russian","ko": "korean",

"fr": "french","ja": "japanese",

"pt": "portuguese","tr": "turkish",

"pl": "polish","ca": "catalan",

"nl": "dutch","ar": "arabic",

"sv": "swedish","it": "italian",

"id": "indonesian","hi": "hindi",

"fi": "finnish","vi": "vietnamese",

"he": "hebrew","uk": "ukrainian",

"el": "greek","ms": "malay",

"cs": "czech","ro": "romanian",

"da": "danish","hu": "hungarian",

"ta": "tamil","no": "norwegian",

"th": "thai","ur": "urdu",

"hr": "croatian","bg": "bulgarian",

"lt": "lithuanian","la": "latin",

"mi": "maori","ml": "malayalam",

"cy": "welsh","sk": "slovak",

"te": "telugu","fa": "persian",

"lv": "latvian","bn": "bengali",

"sr": "serbian","az": "azerbaijani",

"sl": "slovenian","kn": "kannada",

"et": "estonian","mk": "macedonian",

"br": "breton","eu": "basque",

"is": "icelandic","hy": "armenian",

"ne": "nepali","mn": "mongolian",

"bs": "bosnian","kk": "kazakh",

"sq": "albanian","sw": "swahili",

"gl": "galician","mr": "marathi",

"pa": "punjabi","si": "sinhala",

"km": "khmer","sn": "shona",

"yo": "yoruba","so": "somali",

"af": "afrikaans","oc": "occitan",

"ka": "georgian","be": "belarusian",

"tg": "tajik","sd": "sindhi",

"gu": "gujarati","am": "amharic",

"yi": "yiddish","lo": "lao",

"uz": "uzbek","fo": "faroese",

"ht": "haitian creole","ps": "pashto",

"tk": "turkmen","nn": "nynorsk",

"mt": "maltese","sa": "sanskrit",

"lb": "luxembourgish","my": "myanmar",

"bo": "tibetan","tl": "tagalog",

"mg": "malagasy","as": "assamese",

"tt": "tatar","haw": "hawaiian",

"ln": "lingala","ha": "hausa",

"ba": "bashkir","jw": "javanese","su": "sundanese",

The official README also provides another way to call it from Python:

import whisper

model = whisper.load_model("base")

# load audio and pad/trim it to fit 30 seconds
audio = whisper.load_audio("audio.mp3")
audio = whisper.pad_or_trim(audio)

# make log-Mel spectrogram and move to the same device as the model
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

# decode the audio
options = whisper.DecodingOptions(language='Chinese')
result = whisper.decode(model, mel, options)

# print the recognized text
print(result.text)
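
Note that in this lower-level API, pad_or_trim and decode handle only a single 30-second window, so just the first 30 seconds of audio.mp3 are recognized; for long files, use model.transcribe (or the whisper CLI as above), which chunks the audio internally.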

References:

https://www.toutiao.com/article/7229151806801248807/?app=news_article\&timestamp=1706203733\&use_new_style=1\&req_id=20240126012853D9D3D4539BEF1333DBCC\&group_id=7229151806801248807\&tt_from=mobile_qq\&utm_source=mobile_qq\&utm_medium=toutiao_android\&utm_campaign=client_share\&share_token=085ce76c-b23a-4609-b2d0-d18c8d7ab8f8\&source=m_redirect

Whisper.cpp in practice: C++ real-time speech-to-text (subtitles / speech recognition)

[WINDOWS; the large model needs 10 GB]

https://blog.csdn.net/hhy321/article/details/134897967?spm=1001.2101.3001.6650.2\&utm_medium=distribute.wap_relevant.none-task-blog-2\~default\~CTRLIST\~Rate-2-134897967-blog-130001848.237^v3^wap_relevant_t0_download\&depth_1-utm_source=distribute.wap_relevant.none-task-blog-2\~default\~CTRLIST\~Rate-2-134897967-blog-130001848.237^v3^wap_relevant_t0_download\&share_token=845e69c5-c625-4834-8faa-08f1f29f55b2

[Xiaomu Learns Python] Implementing speech recognition in Python (Whisper)

https://blog.csdn.net/xkukeer/article/details/130227944?share_token=f48bfb40-9399-4375-894e-3ecf96d1c51d

An introduction to OpenAI's Whisper speech recognition

Step 3: choose a model.

The official docs list 5 model sizes, 4 of which also come in English-only variants, but in my testing the English-only variant can still handle Chinese (I only verified base; I did not test the others, but they should behave the same). The snippet below lists the available checkpoints.

Although Chinese is supported, the results are not ideal: the Chinese word error rate (WER) is still fairly high, placing Chinese roughly in the middle of all supported languages.
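
A quick way to see which model checkpoints the installed package knows about (a sketch; the exact list depends on the openai-whisper version installed above):

import whisper

# Prints names such as tiny, base, small, medium, large plus their ".en" English-only variants.
print(whisper.available_models())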

Step 4: concrete usage.

There are several ways:

1. Command-line mode

whisper audio.flac audio.mp3 audio.wav --model medium

For non-English languages, add the --language parameter, e.g. for Japanese:

whisper japanese.wav --language Japanese

Quite a few languages are supported.

[WINDOWS]

https://blog.csdn.net/liaoqingjian/article/details/132474687?share_token=e6ad6f74-2fab-45c5-bdb5-40b48fe2cd79

Deploying the Whisper speech recognition project

https://www.toutiao.com/article/7327918175801164325/?app=news_article\&timestamp=1706203446\&use_new_style=1\&req_id=202401260124058D2D3B0452AC9B3435B3\&group_id=7327918175801164325\&tt_from=mobile_qq\&utm_source=mobile_qq\&utm_medium=toutiao_android\&utm_campaign=client_share\&share_token=ad4cdc74-1590-4a7b-b020-14f9186f9ef2\&source=m_redirect

Practical optimization of Whisper for Chinese speech recognition and transcription (Python 3.10)

[WINDOWS]

https://www.toutiao.com/article/7276749520275456572/?app=news_article\&timestamp=1706203504\&use_new_style=1\&req_id=2024012601250342BCD0F3D434AA335380\&group_id=7276749520275456572\&tt_from=mobile_qq\&utm_source=mobile_qq\&utm_medium=toutiao_android\&utm_campaign=client_share\&share_token=5bc13cbe-db1d-4883-bff4-b01f258dd1c2\&source=m_redirect

Whisper speech-to-text software: real-time automatic speech recognition and extracting text from audio and video
