README.md: 5 additions & 5 deletions
@@ -17,26 +17,26 @@ This is a stripped down version of the [**Open Speech Recording**](https://githu
```
python -m flask run
```
-3. Then open the link provided in the terminal in a web browser to run the application. Make sure to **run the application in a private or incognito window**, which avoids any caching issues. Also, we've found that the app works best when **using Chrome**. Once the app opens, you'll need to give access to your microphone, and then you can click ```Record```. Once you finish recording all of the words, a popup will appear and ask if you'd like to download the data. Simply click ```OK``` and the files will be downloaded into the folder from which you are running the Flask app (which should be the open-speech-recording folder). You can then close the browser window and kill the app with ```Ctrl+C```.
+3. Then open the link provided in the terminal in a web browser to run the application. Make sure to **run the application in a private or incognito window**, which avoids any caching issues. Also, we've found that the app works best when **using Chrome**. Once the app opens, you'll need to give access to your microphone, and then you can click ```Record```. Once you finish recording all of the words, a popup will appear and ask if you'd like to download the data. Simply click ```OK``` and the files will be downloaded into the folder from which you are running the Flask app (which should be the open-speech-recording folder).
-Note: if you want to change the words or counts recorded, make sure to kill and restart the app and open the link in a new incognito window!
+Note: if you want to change the words or counts, make sure to kill and restart the app and open the link in a new incognito window to avoid any caching issues at the server or browser level!
### You can use the scripts to manipulate the data as follows:
-1. You now have a large collection of ```.ogg``` files in your directory. We now need to convert them to ```.wav``` files using ```ffmpeg```:
+1. You can convert the ```.ogg``` files to ```.wav``` files using ```ffmpeg```:
-2. Then we need to trim them using Pete's tool. Note: background noise files do not need to be trimmed and do not need to be recorded by the tool; you simply need to convert them to ```.wav``` files. In fact, they are expected to be longer. See the [original dataset](https://storage.googleapis.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz) for examples.
+2. You can then trim the ```.wav``` files with Pete's tool.
-3. Finally, we need to create the directory structure, as the training script expects the data to be organized into folders: one folder for each of the ```wanted_words```, filled with ```.wav``` files of that word (as 1-second clips), and an additional folder called ```_background_noise_``` containing longer files of background noise. We will then compress this into a zip file so it can be easily uploaded to Colab.
+3. Finally, we can create the directory structure expected by the TensorFlow training script by running another of Pete's scripts, and then compress it into a zip file so it can be easily uploaded to Colab.
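
For orientation, steps 1 and 3 above can be sketched in Python roughly as follows. This is only an illustrative sketch under stated assumptions, not Pete's scripts: the function names, the separate `noise_dir` argument, and the `<word>_*.wav` file-naming convention are all hypothetical, and trimming (step 2) is left to Pete's tool. The only external call is the basic `ffmpeg -i input.ogg output.wav` invocation.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_command(ogg_path: Path) -> list[str]:
    """Build the ffmpeg call that converts one .ogg file to a .wav beside it."""
    return ["ffmpeg", "-i", str(ogg_path), str(ogg_path.with_suffix(".wav"))]


def convert_all(folder: Path) -> None:
    """Convert every .ogg file in `folder` (requires ffmpeg on the PATH)."""
    for ogg_path in sorted(folder.glob("*.ogg")):
        subprocess.run(ffmpeg_command(ogg_path), check=True)


def build_dataset_zip(wav_dir: Path, noise_dir: Path, out_dir: Path,
                      wanted_words: list[str]) -> Path:
    """Create one folder per wanted word plus _background_noise_, then zip it."""
    for word in wanted_words:
        word_dir = out_dir / word
        word_dir.mkdir(parents=True, exist_ok=True)
        # Assumption (hypothetical): trimmed clips are named "<word>_<anything>.wav".
        for wav in sorted(wav_dir.glob(f"{word}_*.wav")):
            shutil.copy2(wav, word_dir / wav.name)

    # Longer, untrimmed background recordings go into their own folder.
    noise_out = out_dir / "_background_noise_"
    noise_out.mkdir(parents=True, exist_ok=True)
    for wav in sorted(noise_dir.glob("*.wav")):
        shutil.copy2(wav, noise_out / wav.name)

    # Writes <out_dir>.zip next to out_dir; member paths are relative to out_dir.
    return Path(shutil.make_archive(str(out_dir), "zip", root_dir=out_dir))
```

A call such as `build_dataset_zip(Path("trimmed"), Path("noise"), Path("dataset"), ["yes", "no"])` would produce `dataset.zip` containing `yes/`, `no/`, and `_background_noise_/`; the folder names passed in here are placeholders.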