9 AI & Audio Dash apps for Voice Computing Research

Companies like Bose Corporation and US government agencies rely on Dash Enterprise for their NLP and Voice Computing research.

In this post, we showcase some of our favorite Dash apps (with Python code!) that combine AI and audio.

Every decade we see a paradigm shift in the way we interact with computers. First it was the Internet. Then came smartphone apps. Now, according to Chatbots Magazine, it’s intelligent assistants (IAs).

Interest in voice computing has grown exponentially over the past few years.

Today’s IAs are programmed with machine learning to predict users’ needs. They can understand and carry out increasingly complex tasks like booking flights or making hotel reservations.

Household brands and their products, like Amazon’s Echo and Google’s Home, have made IA technology near-ubiquitous.

At Plotly, we’ve felt this trend. Dash developers and Dash Enterprise customers across the world are creating Dash apps for their voice computing research and IA product development.

Here are some of our favorite voice computing Dash apps:

1. Word2Vec Dash app

Check out this app:

In teaching a computer to understand us, we use natural language processing (NLP). NLP allows us to parameterize human speech while preserving the semantic meaning of a string of words. This is crucial in AI applications like personal assistants and predictive text.

In this Dash Word2Vec app, you can build, visualize, explore, and share the results of an NLP model.

This Word2Vec Dash app serves as an “explainable AI” interface for the complex mathematics that go into word embeddings and dimension reduction for NLP.

The Python model behind this Dash app was trained on a dataset from Google News, while the dimensionality reduction relies on datasets from Twitter or Wikipedia. The Python model encodes words from the Google News dataset as N-dimensional vectors – a technique known as “word embedding”.

Finally, the UMAP and t-SNE algorithms in this Dash app reduce the high-dimensional word embedding vectors to two or three dimensions that can be easily visualized.
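To make that last step concrete, here is a minimal, self-contained sketch of reducing word embeddings to two dimensions. It uses a plain NumPy principal-component projection as a simple stand-in for UMAP/t-SNE, and the toy 5-dimensional "embeddings" are hypothetical values, not output from the app's trained model:

```python
import numpy as np

# Toy 5-dimensional "word embeddings" (hypothetical values, for
# illustration; real Word2Vec models use hundreds of dimensions).
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1, 0.3, 0.2]),
    "queen": np.array([0.9, 0.1, 0.8, 0.3, 0.2]),
    "man":   np.array([0.1, 0.9, 0.1, 0.2, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9, 0.2, 0.1]),
    "paris": np.array([0.2, 0.1, 0.1, 0.9, 0.8]),
    "tokyo": np.array([0.2, 0.1, 0.2, 0.8, 0.9]),
}

def reduce_to_2d(vectors: np.ndarray) -> np.ndarray:
    """Project N-dimensional vectors onto their top two principal
    components (a PCA stand-in for the app's UMAP/t-SNE step)."""
    centered = vectors - vectors.mean(axis=0)
    # SVD of the centered data gives the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

words = list(embeddings)
points_2d = reduce_to_2d(np.stack([embeddings[w] for w in words]))
for word, (x, y) in zip(words, points_2d):
    print(f"{word:>6}: ({x:+.2f}, {y:+.2f})")
```

The resulting 2-D points are what a scatter plot in the app would display, with semantically related words landing near each other.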

Model runs can be saved, archived, and shared with Dash Enterprise’s Snapshot Engine.

2. Word Embeddings Arithmetic

Check out this Dash app:

To learn more about this app, please refer to our detailed feature post, Understanding Word Embedding Arithmetic: Why there’s no single answer to “King − Man + Woman = ?”
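The arithmetic itself is easy to sketch. The 2-D vectors below are hand-designed for illustration (real embeddings are learned, and libraries like gensim run this lookup over hundreds of dimensions); the query words are excluded from the candidates, as word-embedding libraries typically do:

```python
import numpy as np

# Hand-built 2-D toy vectors: axis 0 ~ "royalty", axis 1 ~ "gender".
# Real embeddings are learned from data, not designed like this.
vectors = {
    "king":   np.array([1.0,  1.0]),
    "queen":  np.array([1.0, -1.0]),
    "prince": np.array([1.0,  0.9]),
    "man":    np.array([0.0,  1.0]),
    "woman":  np.array([0.0, -1.0]),
}

def analogy(a: str, b: str, c: str) -> str:
    """Return the word closest (by cosine similarity) to a - b + c,
    excluding the query words themselves."""
    target = vectors[a] - vectors[b] + vectors[c]
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(vectors[w], target))

print(analogy("king", "man", "woman"))  # → queen
```

As the feature post explains, the interesting part is that with real, learned embeddings there are several defensible ways to run this query, and they don't always agree.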

This Dash app was styled using Dash Enterprise Design Kit and deployed through the Dash Enterprise App Manager.

3. Dash for AI Speech Recognition

Check out this Dash app:

Check out this Dash app’s Python code:

Using this Dash app, we drag a slider to transcribe audio from a popular NPR podcast (“Planet Money”). This podcast episode is about the price of milk, so you may notice transcription errors such as “melt” instead of “milk” or “derby” instead of “dairy.” The transcription is done in real time with Python bindings to Carnegie Mellon’s Sphinx Speech Recognition software.

Sphinx, the underlying technology, is a group of speech recognition systems developed at Carnegie Mellon University. It’s one of the few open-source speech transcription toolkits: big-brand voice assistants like Alexa, Siri, and Google Assistant are all closed source.

This app includes a Plotly WebGL graph to visualize the audio clip waveform. Users can zoom in at any point along the waveform to inspect where Sphinx misunderstood a word. 
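Transcription mistakes like "melt" for "milk" are usually quantified with word error rate (WER). This standard-library sketch (not code from the app) computes WER as a word-level edit distance:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: minimum word-level edit distance
    (substitutions + insertions + deletions) divided by the
    number of words in the reference transcript."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

reference = "the price of milk at the dairy"
hypothesis = "the price of melt at the derby"
print(word_error_rate(reference, hypothesis))  # 2 errors / 7 words ≈ 0.286
```

Zooming into the waveform at the mistranscribed words is exactly the kind of inspection this metric can't give you on its own, which is where the interactive graph earns its keep.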

4. Audio Explorer Dash app

Check out this Dash app:

Check out this Dash app’s Python code:

The Audio Explorer Dash app allows users to upload a sound or music file and cluster audio into different sonic components. This open source Dash app was developed by Polish scientist Łukasz Tracewski, a Dash community member who develops tools for conservation science.

First, this Dash app computes a set of features for each audio file through signal processing methods. It uses simple methods, such as taking the mean/median/quartiles of the frequency and pitch, as well as more complex algorithms, such as Linear Prediction Coefficients, Line Spectral Frequencies, and Mel-Frequency Cepstral Coefficients.

Then, this Dash app projects all of those extracted features into 2D space using a reduction algorithm. The clusters can be inspected visually through the Dash user interface. The “Profile” tab shows a spectrogram of the sound you selected; this is used by audio engineers to classify animal sounds like bird calls.
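As a self-contained illustration of the feature-extraction step, the sketch below computes a few cheap spectral features for synthetic sine-wave "clips" (a stand-in for the app's MFCC/LPC pipeline and for real field recordings):

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz (assumed for this synthetic example)

def make_tone(freq_hz: float, seconds: float = 0.5) -> np.ndarray:
    """Generate a pure sine tone as a fake 'audio clip'."""
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t)

def simple_features(clip: np.ndarray) -> np.ndarray:
    """Three cheap features per clip: dominant frequency, spectral
    centroid, and RMS energy. MFCCs and LPC coefficients are the
    heavier-weight equivalents the app uses."""
    spectrum = np.abs(np.fft.rfft(clip))
    freqs = np.fft.rfftfreq(len(clip), d=1 / SAMPLE_RATE)
    dominant = float(freqs[np.argmax(spectrum)])
    centroid = float((freqs * spectrum).sum() / spectrum.sum())
    rms = float(np.sqrt((clip ** 2).mean()))
    return np.array([dominant, centroid, rms])

# Two "bird calls" at different pitches separate cleanly in feature space.
low, high = make_tone(400.0), make_tone(1800.0)
features = np.stack([simple_features(low), simple_features(high)])
print(features[:, 0])  # dominant frequencies ≈ [400, 1800]
```

Each clip becomes one row of a feature matrix; the app's dimensionality-reduction step then projects those rows into the 2D scatter plot users explore.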

Tracewski told Plotly that Audio Explorer helps conservation scientists analyze audio data for “remote monitoring” of bird species. Scientists look at several years’ worth of audio data over different sites around the world in order to analyze how bird populations change over time.

“In essence, it should help people answer questions like, ‘are your conservation efforts working?’,” said Tracewski.

One of the issues that this Dash app targets is bird species population count. Conservationists record years of audio and then count the species in these recordings. The problem with this is that we don’t yet have every bird call labelled for every species. This labelling takes a long time and is very error-prone.

“This is exactly where the Dash audio explorer comes in,” said Tracewski. “It allows us to easily and quickly label a lot of data, such that we can later use machine learning to train the program to recognize the birds of interest.”

5. Named Entity Recognition (NER) Dash app with spaCy

Check out this Dash app:

This Dash app demonstrates a UI for human-assisted classification of “named entities.” For example, if the Python-based AI behind this Dash app erroneously classifies “Miley Cyrus” as a place instead of a person, the Dash app gives a human a UI to make the correction. It’s a nice demonstration of a feedback loop in which humans, AI researchers, and the AI itself continuously improve the accuracy of an AI model.

Plotly’s Chief of Product Chris Parmer created this Named Entity Recognition (NER) Dash app using the excellent spaCy Python library and its pre-trained models. Instead of using spaCy’s “displaCy” visualizer, Parmer created a custom UI with Dash that allows correcting erroneous terms.

Should any word classification be incorrect, we can correct it manually in the Dash app for model retraining purposes:
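One hypothetical way to model that correction step, shown here as a plain data-structure sketch rather than the app's actual code: model predictions are (text, label) pairs, and human overrides from the UI are overlaid before the pairs are fed back for retraining. The labels follow spaCy's conventions (PERSON, ORG, and GPE for geopolitical entity):

```python
# Entity predictions as (text, label) pairs, as a spaCy NER
# pipeline might produce them.
predictions = [
    ("Miley Cyrus", "GPE"),  # wrong: classified as a place
    ("Nashville", "GPE"),    # correct
    ("Plotly", "ORG"),       # correct
]

# Corrections captured from the app's UI, keyed by entity text.
human_corrections = {"Miley Cyrus": "PERSON"}

def apply_corrections(ents, corrections):
    """Overlay human-supplied labels on model predictions; the
    corrected pairs can later serve as new training examples."""
    return [(text, corrections.get(text, label)) for text, label in ents]

corrected = apply_corrections(predictions, human_corrections)
print(corrected)
```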

6. ML Model Training Dash app

Check out this Dash app:

Consider this ML problem: A data scientist has been tasked to create a model that automatically assigns ratings to product reviews. To do this, they need to try out different ML models and explain the nuance of each to their boss.

This Dash app lets you interactively design your own ML model by choosing the desired text encoding (bag-of-words vs bigrams, word frequency vs TF-IDF) and ML parameters (learning rate, tree depth, and number of iterations/estimators).
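The difference between the two encodings can be sketched in a few lines of standard-library Python on a toy review corpus (illustrative only; a real pipeline would use a library such as scikit-learn):

```python
import math
from collections import Counter

reviews = [
    "great product great price",
    "terrible product broke fast",
    "great value fast shipping",
]

def term_frequencies(doc: str) -> Counter:
    """Raw word counts: the bag-of-words encoding."""
    return Counter(doc.split())

def tfidf(doc: str, corpus: list) -> dict:
    """TF-IDF: term frequency scaled down for words that appear in
    many documents, so 'terrible' matters more than 'product'."""
    tf = term_frequencies(doc)
    n_docs = len(corpus)
    scores = {}
    for word, count in tf.items():
        doc_freq = sum(1 for d in corpus if word in d.split())
        scores[word] = count * math.log(n_docs / doc_freq)
    return scores

scores = tfidf(reviews[1], reviews)
# Rare, distinctive words outrank common ones in the TF-IDF ranking.
print(sorted(scores, key=scores.get, reverse=True))
```

That weighting is why TF-IDF often gives a rating-prediction model a better signal than raw counts, which is exactly the kind of trade-off this app lets you demonstrate interactively.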

This Dash app runs the model as a background task. When it has finished running, it automatically generates an interactive report displaying common metrics and graphs (ROC Curve, PR Curve, etc).

Running background jobs that create, archive, and send reports is a common use case for Dash Enterprise’s Snapshot Engine.

Custom Dash UIs like this one help data scientists quickly explore ML model parameters and interactively communicate their findings to business stakeholders.

7. Dash Optical Character Recognition (OCR)

Check out this Dash app:

Check out this Dash app’s Python code:

This Dash Optical Character Recognition (OCR) app showcases Tesseract, an OCR engine that uses a neural network to recognize and “read” handwritten text.

Similar to how “3. Dash for AI Speech Recognition” uses Sphinx to transcribe audio clips to text, this Dash app uses Tesseract to transcribe handwriting to text.

Artificial neural networks (ANNs) are a vast subject on their own. At a very general level, ANNs are computing systems that “learn” to perform tasks by considering examples, generally without being programmed with task-specific rules. For example, in image recognition, an ANN might learn to identify images that contain cats by analyzing example images that have been manually labeled as “cat” or “no cat.”
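The smallest possible version of "learning from labeled examples" is a single artificial neuron. The sketch below trains a perceptron on toy labeled 2-D points, a stand-in for the "cat"/"no cat" images (Tesseract's networks are vastly larger, but the principle is the same):

```python
import random

random.seed(0)

# Toy labeled examples: 2-D points, labeled 1 if x + y > 1, else 0.
# (A stand-in for images manually labeled "cat" / "no cat".)
data = [((x / 10, y / 10), 1 if x + y > 10 else 0)
        for x in range(11) for y in range(11)]

# A single artificial neuron: weighted sum plus a threshold.
w = [random.uniform(-1, 1), random.uniform(-1, 1)]
b = 0.0
lr = 0.1  # learning rate

def predict(point):
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0

# Perceptron rule: nudge weights toward each misclassified example.
for _ in range(200):
    for point, label in data:
        error = label - predict(point)
        w[0] += lr * error * point[0]
        w[1] += lr * error * point[1]
        b += lr * error

accuracy = sum(predict(p) == y for p, y in data) / len(data)
print(f"training accuracy: {accuracy:.2f}")
```

No rule about the boundary was ever programmed in; the neuron recovered it purely from the labeled examples, which is the general ANN idea described above.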

As you can see, Dash is ideal for creating custom research and business tools that apply neural networks to uploaded data.

8. Dash NLP Report Template

Check out this Dash app:

Check out this Dash app’s Python code:

Have we mentioned how ubiquitous and important modern natural language processing (NLP) tools are?

Extracting information from unstructured text remains a difficult, yet important challenge in the era of big data. Figuring this out is paying huge dividends for some businesses – whether that’s capturing prevailing moods about a particular topic or product (sentiment analysis), identifying key topics from texts (summarization/classification), or answering context-dependent questions (like Alexa or Google Assistant).
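The simplest form of sentiment analysis is lexicon-based scoring, sketched below with a tiny hand-made word list. Real systems use trained models, but the core idea of aggregating word-level polarity is the same:

```python
# Minimal lexicon-based sentiment scoring (a toy stand-in for the
# model-based sentiment analysis real NLP pipelines perform).
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"terrible", "hate", "awful", "bad", "broken"}

def sentiment(text: str) -> int:
    """+1 per positive word, -1 per negative word; the sign of the
    total gives the overall polarity."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return (sum(w in POSITIVE for w in words)
            - sum(w in NEGATIVE for w in words))

print(sentiment("I love this, it works great"))  # positive (> 0)
print(sentiment("Terrible, arrived broken"))     # negative (< 0)
```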

Want to learn more about the Python methods behind this Dash app? Please refer to our detailed feature post, NLP visualizations for clear, immediate insights into text data and outputs.

9. Text to Speech Dash app (with NVIDIA text2speech)

Check out this Dash app’s Python code:

This Dash app is the inverse of “3. Dash for AI Speech Recognition.” Instead of converting a sound file to text, it uses NVIDIA’s WaveGlow model to convert text to a realistic human voice.

Create your own AI & Audio Dash apps

Dash is open-source, free, and available for Python, R, and Julia. You can download Dash for Python right now with pip install dash. See the Dash Python docs for help creating your first AI/ML Dash app.

Once you’ve created a custom AI/ML tool with Dash, you’ll want an ML Ops platform to deploy it. Take a look at Dash Enterprise for secure, horizontally scalable hosting and deployment of Dash apps and Jupyter notebooks.