This repository contains a Flask API for the FiggleSpeak project, which aims to provide real-time feedback on users' pronunciation and articulation.
The API is deployed on Google Cloud Run, and the front end interfaces with it there.
- Clone the repository.
- Create a .env file and add a Hugging Face token to it; see sample.env for the expected format.
- Within the directory of the cloned repository, build the image:
$ docker build -t <tag_name> .
- Next, run the application in a container using the docker run command:
$ docker run -e PORT=<port_number> -p <port_number>:<port_number> <tag_name>
- The server may now be accessed at localhost:<port_number> or 127.0.0.1:<port_number>.
For additional help, you may refer to the Docker documentation on containerising an application.
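Once the container is running, a quick request to the root route confirms that the server is reachable. The following is a minimal sketch using only the Python standard library. The helper names base_url and check_root are hypothetical and not part of this repository, and the port should be whatever value was passed to docker run.

```python
import urllib.request


def base_url(port: int, host: str = "localhost") -> str:
    """Build the local address for the port that was passed to `docker run`."""
    return f"http://{host}:{port}"


def check_root(url: str, timeout: float = 5.0) -> bool:
    """Send GET / and report whether the server answered with HTTP 200.

    Hypothetical helper. urllib raises urllib.error.URLError if the server
    cannot be reached, and urllib.error.HTTPError on 4xx/5xx responses.
    """
    with urllib.request.urlopen(url + "/", timeout=timeout) as response:
        return response.status == 200
```

For example, with the container published on port 8080, check_root(base_url(8080)) should return True when the server is up.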
- Before deploying to Cloud Run, we first have to push the Docker image to a supported container registry. Here, we'll use Artifact Registry.
- Create a Google Cloud project and enable Artifact Registry.
- Install the Google Cloud CLI by following the official installation instructions.
- Create a remote repository in Artifact Registry.
- Following that, run
$ gcloud auth configure-docker
to authenticate Docker with your Google account.
- Next, run the following commands (these are also found in deploy.sh, which will require a few modifications):
docker build -t figglespeak-api .
docker tag figglespeak-api gcr.io/<project_id>/figglespeak-api
docker push gcr.io/<project_id>/figglespeak-api
- To confirm that the image was pushed successfully, check the Artifact Registry console at https://console.cloud.google.com/artifacts
- Then go to Cloud Run and create a new service.
- Fill in the form accordingly, but ensure that the memory allocated to the instance is at least 4 GiB.
- The server may now be accessed at the URL shown for the service in the Cloud Run console.
For additional help, you may refer to the Google Cloud Documentation on deploying to Cloud Run or the Google Cloud Documentation on pushing and pulling images from Artifact Registry.
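The build, tag, and push commands in the steps above differ between projects only in the project ID, so they can be generated rather than edited by hand. The sketch below assumes the same gcr.io image path and figglespeak-api image name used above. The function names push_commands and render are hypothetical; this is not the contents of deploy.sh.

```python
import shlex
from typing import List


def push_commands(project_id: str, image: str = "figglespeak-api") -> List[List[str]]:
    """Return the build, tag, and push commands for one project (hypothetical helper)."""
    # Remote image path, following the gcr.io/<project_id>/<image> pattern above.
    remote = f"gcr.io/{project_id}/{image}"
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "tag", image, remote],
        ["docker", "push", remote],
    ]


def render(commands: List[List[str]]) -> str:
    """Join the commands into shell lines so they can be reviewed before running."""
    return "\n".join(shlex.join(command) for command in commands)
```

Printing render(push_commands("<project_id>")) gives a dry run of the three commands. Each command list could also be passed to subprocess.run(command, check=True) to execute it.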
GET '/'
RESPONSE
"Hello, world!"

POST '/evaluate_user'
REQUEST
{
"audio": "# audio clip to be evaluated, in any audio file format",
"sentence": "# sentence against which the audio clip is to be compared"
}
RESPONSE
[
[
An array containing, for each word, an array over the letters of that word, indicating whether each portion of the word was pronounced properly.
],
[
An array containing, for each word in the sentence, pronunciation tips for a phoneme in that word.
]
]
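A client will usually want to line the two response arrays up with the words of the sentence. The sketch below shows one way to do that. It rests on several assumptions the description above does not confirm: that the letter indicators are truthy or falsy values with one per letter, that there is one tips entry per word, and that words are separated by whitespace. The function name summarise is hypothetical.

```python
from typing import Any, Dict, List


def summarise(sentence: str, response: List[Any]) -> List[Dict[str, Any]]:
    """Pair each word with its letter indicators and its pronunciation tip.

    Assumes response == [letter_flags_per_word, tips_per_word], with truthy
    flags meaning "pronounced properly" (an assumption, not a documented format).
    """
    letter_flags, tips = response
    summary = []
    for word, flags, tip in zip(sentence.split(), letter_flags, tips):
        summary.append({
            "word": word,
            # The word counts as correct only if every letter was flagged as correct.
            "correct": all(flags),
            # Letters whose indicator was falsy, in their original order.
            "mispronounced": "".join(ch for ch, ok in zip(word, flags) if not ok),
            "tip": tip,
        })
    return summary
```

With this shape, the front end can highlight the letters listed under "mispronounced" and show the matching tip next to each word.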
- g2p - Grapheme to Phoneme