The EMIME Mandarin/English Bilingual Database

README which describes the main features of the Mandarin/English portion of the EMIME Bilingual Database.

----------------------------------------------------------------------

The EMIME Bilingual Mandarin/English Database
         Version 1.1 RELEASE February 2011

            Mirjam Wester
            mwester@inf.ed.ac.uk
   Centre for Speech Technology Research,
       University of Edinburgh

----------------------------------------------------------------------

This Database - The EMIME Bilingual Mandarin/English Database - is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/. A copy of this license is available in this directory: odbl-10.txt.

This file - README_mandarin_1.1.txt - describes the EMIME Bilingual database. It is a Mandarin/English bilingual database recorded at the University of Edinburgh in 2010 in the context of the EMIME project (www.emime.org). It includes the recordings of seven female and seven male speakers of Mandarin.

For further documentation and when referencing this database in publications, please refer to: M. Wester & H. Liang "The EMIME Mandarin Bilingual Database", Technical Report EDI-INF-RR-xxxx, University of Edinburgh, February 2011. (In this directory: EMIME_MANDARIN_DATABASE_ACCENTS.pdf)

----------------------------------------------------------------------

0. Notes before you start
1. Directory structure
2. Wave files
   2.1 Microphones
   2.2 Sampling rate
   2.3 Segmentation
   2.4 Naming convention
3. Prompts
   3.1 English
   3.2 Mandarin
4. Test sets

**********************
0. Notes before you start
**********************

For further information contact Mirjam Wester, mwester@inf.ed.ac.uk

**********************
1. Directory structure
**********************
Example of part of the directory structure:

UEDIN_bilingual_data_2010/
   downsampled_22kHz/
      Mandarin_talkers/
         Female/
            MF1/
               Eng/
                  MF1_ENG_0001_0.wav
                  MF1_ENG_0001_1.wav
                  MF1_ENG_0002_0.wav
               Man/
                  MF1_MAN_0001_0.wav
                  MF1_MAN_0001_1.wav
                  MF1_MAN_0002_0.wav
            MF2/
            |
            |
            MF7/
         Male/
            MM1/
            MM2/
            |
            |
            MM7/
   Prompts/
      English/
      Mandarin/

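
The layout above can be turned into paths programmatically. The following is a minimal sketch, assuming the directory names shown in the example tree; the helper names speaker_dir and list_wavs are hypothetical and not part of the database.

```python
from pathlib import Path


def speaker_dir(root: Path, speaker: str, language: str) -> Path:
    """Return the directory holding one speaker's recordings in one language.

    speaker is a label such as "MF1" or "MM3"; language is "Eng" or "Man".
    The second letter of the speaker label selects the Female/Male folder.
    """
    gender = {"F": "Female", "M": "Male"}[speaker[1]]
    return root / "downsampled_22kHz" / "Mandarin_talkers" / gender / speaker / language


def list_wavs(root: Path, speaker: str, language: str) -> list[Path]:
    """Return the sorted .wav files found for one speaker and language."""
    return sorted(speaker_dir(root, speaker, language).glob("*.wav"))


# The root folder name is taken from the example tree; adjust it to your copy.
print(speaker_dir(Path("UEDIN_bilingual_data_2010"), "MF1", "Eng"))
```
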
**********************
2. Wave files
**********************

2.1. Microphones
----------------
Two different microphones were used: a close-talking DPA 4035 mounted on the subject's headphones, and a Sennheiser MKH 800 p48 placed about 10 cm from the subject and set to an omnidirectional pattern. The speech was sampled at 96 kHz with 24-bit depth and stored directly to a computer. These recordings were subsequently downsampled to 22 kHz, 16 bit, using Pro Tools.

DPA 4035 is referred to as microphone 0
Sennheiser MKH 800 is referred to as microphone 1

2.2. Sampling rate
------------------
The original 96 kHz files are not included in this release but can be requested from mwester@inf.ed.ac.uk. The 22 kHz files can be found in the directory "downsampled_22kHz".
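
To confirm the format of a downloaded file, its header can be inspected. This is a small sketch using Python's standard wave module, assuming the downsampled files are ordinary PCM WAV files; the function name describe_wav is hypothetical.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the header of a PCM WAV file and report its basic format."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        return {
            "sample_rate_hz": rate,
            "bit_depth": w.getsampwidth() * 8,  # sample width is given in bytes
            "channels": w.getnchannels(),
            "duration_s": w.getnframes() / rate,
        }


# Example call (the path depends on where the database was unpacked):
# describe_wav("MF1_ENG_0001_0.wav")
```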

2.3. Segmentation
-----------------
All recordings were obtained with subjects reading the prompts from paper. They were subsequently hand-segmented, so leading and trailing silence, mouth smacks, etc. have been removed.

2.4. Naming convention
----------------------
Filename examples:
MM1_ENG_0001_0.wav

MM1 = Mandarin male 1
ENG = English
0001 = sentence 1
0 = microphone 0 (close-talking)

MF5_ENG_0033_1.wav

MF5 = Mandarin female 5
ENG = English
0033 = sentence 33
1 = microphone 1 (Sennheiser)
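
The naming convention can be decoded mechanically. Below is a minimal sketch; the regular expression is inferred from the examples in this README (seven speakers per gender, four-digit sentence numbers, microphone 0 or 1), and the names Recording and parse_name are hypothetical.

```python
import re
from typing import NamedTuple

# Pattern inferred from the examples above, e.g. MM1_ENG_0001_0.wav
_NAME_RE = re.compile(
    r"^M(?P<gender>[MF])(?P<speaker>[1-7])_"
    r"(?P<lang>ENG|MAN)_(?P<sentence>\d{4})_(?P<mic>[01])\.wav$"
)


class Recording(NamedTuple):
    gender: str      # "male" or "female"
    speaker: int     # 1 to 7 within each gender
    language: str    # "English" or "Mandarin"
    sentence: int    # prompt number
    microphone: int  # 0 = DPA 4035 (close-talking), 1 = Sennheiser MKH 800


def parse_name(filename: str) -> Recording:
    """Split a file name into the fields of the naming convention."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return Recording(
        gender="male" if m["gender"] == "M" else "female",
        speaker=int(m["speaker"]),
        language="English" if m["lang"] == "ENG" else "Mandarin",
        sentence=int(m["sentence"]),
        microphone=int(m["mic"]),
    )


print(parse_name("MF5_ENG_0033_1.wav"))
```

File names that do not follow the convention raise a ValueError rather than being silently skipped, which makes stray files easy to spot.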

**********************
3. Prompts
**********************
There is a prompt set for English and for Mandarin.
Each set contains:
* 25 Europarl sentences (for Mandarin these have been translated from English)
* 100 English news sentences / 124 Mandarin news sentences
* 20 semantically unpredictable sentences (SUS).
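
As a quick check of the set sizes, the counts listed above can be summed per language. This is only an illustrative sketch; the dictionary and function names are hypothetical.

```python
# Prompt counts per set, as listed above.
PROMPTS = {
    "English": {"Europarl": 25, "news": 100, "SUS": 20},
    "Mandarin": {"Europarl": 25, "news": 124, "SUS": 20},
}


def total_prompts(language: str) -> int:
    """Return the number of prompts in one language's set."""
    return sum(PROMPTS[language].values())


for language in PROMPTS:
    print(language, total_prompts(language))
# prints:
# English 145
# Mandarin 169
```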

3.1. English
------------
The news sentences for English were taken from the Wall Street Journal 1 (WSJ1) corpus and comprise 40 enrolment sentences and 60 test set sentences. The enrolment sentences are from "wsj1/doc/lng_model/adapt.txt/pl_adapt"; the 60 test set sentences are from "wsj1/doc/lng_modl/rec_txt/pl_dt_20.nvp".

3.2. Mandarin
-------------
The Mandarin news sentences were selected from the Speecon corpus.

**********************
4. Test sets
**********************

4.1. English
------------
The list of sentences in the English test set can be found in Prompts/English/EnglishTestingData. This is the test set that was defined for: M. Wester & R. Karhila "Speaker Similarity Evaluation of Foreign-Accented Speech Synthesis Using HMM-Based Speaker Adaptation", in Proc. of ICASSP, 2011.

4.2. Mandarin
-------------
As the Mandarin news sentences are rather long, a dedicated test set has been designed. It contains 22 of the 25 Europarl sentences and 7 longer sentences which have been segmented into 18 parts, so in total the test set contains 40 sentences.

The test set prompts can be found in Prompts/Mandarin/MandarinTestingData. The segmented files can be found in UEDIN_mandarin_bi_data_2010/downsampled_22kHz/Mandarin_test_set_segmentations/. Only the _0 (DPA 4035 microphone) recordings have been segmented.

----------------------------------------------------------------------
