Lucas Sifoni

Hello, Bumblebee. Hello, Whisper!

elixir audioskop bumblebee


I’ve had the pleasure of working with Bumblebee today.

Bumblebee provides pre-trained Neural Network models on top of Axon. It includes integration with 🤗 Models, allowing anyone to download and perform Machine Learning tasks with few lines of code.

As amazing as that sounds to someone who has not tried it yet (so, yesterday-me), it is exactly the experience I got from it. I needed to transcribe audio, replacing a third-party Python solution that :

This is an Elixir module allowing me to transcribe audio using OpenAI’s state-of-the-art Whisper model. It runs alongside my main application, is not started in tests, and can be configured away if the application administrator no longer needs to transcribe audio.

defmodule Transcriber do
  @moduledoc """
  Nx serving for openai/whisper-small, allowing to transcribe audio.
  API : Transcriber.transcribe(file_path)
  """
  use GenServer

  def transcribe(file) do
    GenServer.call(__MODULE__, {:transcribe, file}, :infinity)
  end

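  @impl true
  def init(_init_arg) do
    # Return right away and defer the slow model loading to handle_continue/2,
    # so start_link/1 does not block the supervisor during boot.
    {:ok, nil, {:continue, :load_models}}
  end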

  def start_link(_) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  @impl true
  def handle_continue(:load_models, _state) do
    # All four artifacts come from the same Hugging Face repository.
    repo = {:hf, "openai/whisper-small"}

    {:ok, whisper} = Bumblebee.load_model(repo)
    {:ok, featurizer} = Bumblebee.load_featurizer(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
    {:ok, generation_config} = Bumblebee.load_generation_config(repo)

    # Build the speech-to-text serving: audio is split into 30-second chunks
    # and each transcribed segment carries its timestamps.
    serving =
      Bumblebee.Audio.speech_to_text_whisper(whisper, featurizer, tokenizer, generation_config,
        defn_options: [compiler: EXLA],
        chunk_num_seconds: 30,
        timestamps: :segments
      )

    # The serving becomes the GenServer state used by handle_call/3.
    {:noreply, serving}
  end

  @impl true
  def handle_call({:transcribe, file}, _from, state) do
    {:reply, Nx.Serving.run(state, {:file, file}), state}
  end
end
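
For the "configured away" part, here is a minimal sketch of how the process could be kept out of the supervision tree. It is not the code from my application: the `MyApp.TranscriptionConfig` module, the `:my_app` application name and the `:transcription_enabled` key are hypothetical names chosen for illustration. The only thing it relies on is that `use GenServer` gives `Transcriber` a default `child_spec/1`, so the bare module name is a valid child.

```elixir
defmodule MyApp.TranscriptionConfig do
  @moduledoc """
  Hypothetical helper deciding whether the Transcriber process is started.
  """

  # Returns the child specs to append to the application's supervision tree.
  # When transcription is disabled, no child is returned, so the Whisper
  # model is never downloaded nor loaded.
  def children(enabled?) when is_boolean(enabled?) do
    if enabled?, do: [Transcriber], else: []
  end
end

# Possible usage in the application's start/2 callback (names are assumptions):
#
#   enabled? = Application.get_env(:my_app, :transcription_enabled, false)
#   children = [MyAppWeb.Endpoint] ++ MyApp.TranscriptionConfig.children(enabled?)
#   Supervisor.start_link(children, strategy: :one_for_one)
```

With something like `config :my_app, transcription_enabled: false` in `config/test.exs`, the test environment never starts the serving, and an administrator can flip the same flag in production.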

What a joy! I’m immensely thankful for the work done by the teams behind Bumblebee, Nx and the wider machine-learning Elixir ecosystem, which lets us tap into that kind of resource with elegance and coherence while integrating properly with the rest of our solutions.

