Subject: speech segmentation

Title: Phoneme Segmentation Based on Wavelet Spectra Analysis
Authors: Ziółko, B.
Manandhar, S.
Wilson, R. C.
Ziółko, M.
Subjects: speech recognition
speech segmentation
discrete wavelet transform
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Link: https://bibliotekanauki.pl/articles/177480.pdf
Description: A phoneme segmentation method based on the analysis of discrete wavelet transform spectra is described. The localization of phoneme boundaries is particularly useful in speech recognition. It enables the use of more accurate acoustic models, since the lengths of phonemes provide more information for parametrization. Our method relies on the values of power envelopes and their first derivatives for six frequency subbands. Specific scenarios that are typical of phoneme boundaries are searched for. Discrete times with such events are noted and graded using a distribution-like event function, which represents the change of the energy distribution in the frequency domain. The exact definition of this method is given in the paper. The final decision on the localization of boundaries is made by analysing the event function. Boundaries are, therefore, extracted using information from all subbands. The method was developed on a small set of hand-segmented Polish words and tested on a separate large corpus containing 16 425 utterances. A recall and precision measure specifically designed to assess the quality of speech segmentation was adapted using fuzzy sets. With this measure, an F-score of 72.49% was obtained.
Content provider: Biblioteka Nauki

Article
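
The abstract above relies on power envelopes and their first derivatives in six wavelet subbands. The following is a minimal sketch of that idea, not the authors' method: it assumes a plain Haar wavelet, uses the six detail bands as the "subbands", and replaces the paper's event function with a simple sum of normalized power changes. All function names and the `rel_threshold` parameter are hypothetical.

```python
import math

def subband_power_envelopes(signal, levels=6):
    """Mean power per frame for each Haar detail band (assumed stand-in for the six subbands)."""
    frame_len = 2 ** levels                      # one frame = one coarsest-level coefficient
    n_frames = len(signal) // frame_len
    approx = list(signal[: n_frames * frame_len])
    envelopes = []
    for level in range(1, levels + 1):
        half = len(approx) // 2
        detail = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        per_frame = frame_len >> level           # coefficients of this band inside one frame
        envelopes.append([
            sum(c * c for c in detail[t * per_frame:(t + 1) * per_frame]) / per_frame
            for t in range(n_frames)
        ])
    return envelopes

def event_function(envelopes, eps=1e-12):
    """Simplified event function: summed absolute first difference of each normalized envelope."""
    n_frames = len(envelopes[0])
    event = [0.0] * n_frames
    for env in envelopes:
        scale = max(env) + eps                   # eps avoids division by zero in silent bands
        for t in range(1, n_frames):
            event[t] += abs(env[t] - env[t - 1]) / scale
    return event

def pick_boundaries(event, rel_threshold=0.5):
    """Frames where the event function has a local maximum above a relative threshold."""
    peak = max(event)
    if peak == 0:
        return []
    boundaries = []
    for t in range(1, len(event)):
        nxt = event[t + 1] if t + 1 < len(event) else 0.0
        if event[t] >= rel_threshold * peak and event[t] >= event[t - 1] and event[t] > nxt:
            boundaries.append(t)
    return boundaries

# Synthetic test signal: a slow sine followed by a fast alternating pattern.
demo = [math.sin(2 * math.pi * n / 32) for n in range(512)]
demo += [1.0 if n % 2 == 0 else -1.0 for n in range(512)]
demo_boundaries = pick_boundaries(event_function(subband_power_envelopes(demo)))
print([frame * 64 for frame in demo_boundaries])  # boundary positions in samples
```

Because every subband contributes to the event function, a boundary is reported only where the energy distribution across bands changes, which mirrors the abstract's statement that boundaries are extracted using information from all subbands.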

Title: Rule Based Speech Signal Segmentation
Authors: Greibus, M.
Telksnys, L.
Subjects: rule base
speech analysis
speech endpoint detection
speech segmentation
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Link: https://bibliotekanauki.pl/articles/308535.pdf
Description: This paper addresses the problem of automated speech signal segmentation. Segmentation algorithms based on an energy threshold showed good results only in noise-free environments. At higher noise levels, automatic threshold calculation becomes a complicated task. Rule-based postprocessing of segments can give more stable results. Off-line, on-line and extrema types of rules are reviewed. An extrema-type segmentation algorithm is proposed. This algorithm is enhanced by a rule base to extract higher-energy segments from noise. It can work well with energy-like features. Experiments were conducted to compare threshold-based and rule-based segmentation in different noise types. It was also tested whether multifeature segmentation can improve segmentation results. The extrema rule-based segmentation showed a smaller error ratio across different noise types and levels. The proposed algorithm does not require substantial computational resources, so it can run on devices with limited computing power.
Content provider: Biblioteka Nauki

Article
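
The abstract above contrasts energy-threshold segmentation with rule-based postprocessing of segments. A minimal sketch of the baseline is shown here. The two rules (merge short gaps, drop short segments) are hypothetical illustrations chosen for this example; they are not the extrema-type rules proposed in the paper, and all names and default values are assumptions.

```python
def short_time_energy(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def threshold_segments(energy, threshold):
    """Runs of frames whose energy exceeds the threshold, as (start, end_exclusive) pairs."""
    segments, start = [], None
    for t, e in enumerate(energy):
        if e > threshold and start is None:
            start = t
        elif e <= threshold and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(energy)))
    return segments

def apply_rules(segments, min_len=2, max_gap=1):
    """Hypothetical rule base: merge segments split by short gaps, then drop short ones."""
    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
    return [s for s in merged if s[1] - s[0] >= min_len]

frame_energy = [0, 0, 5, 6, 0, 7, 8, 0, 0, 0, 9, 0, 0]
raw = threshold_segments(frame_energy, threshold=1)
print(raw)               # segments found by the threshold alone
print(apply_rules(raw))  # segments after the illustrative rules
```

The fixed threshold is the fragile part in noise, which is the motivation the abstract gives for postprocessing the raw segments instead of tuning the threshold further.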

Title: Segmentacja mowy polskiej z wykorzystaniem transformacji falkowej
Speech segmentation in Polish language by wavelet transformation
Authors: Tarasiuk, M.
Gosiewski, Z.
Subjects: word segmentation
wavelet transformation
MATLAB
Polish speech
speech segmentation
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Link: https://bibliotekanauki.pl/articles/386344.pdf
Description: This article presents the concept of a method for segmenting words spoken in Polish, using the wavelet transform as the segmentation tool. An algorithm is proposed and the results of the research are presented. Using the developed method, spoken words were divided into segments and the correctness of the division was verified. This study provides a base platform for further work towards an automatic speech recognition system. The research and calculations were carried out in MATLAB.
Content provider: Biblioteka Nauki

Article

Title: Speech Rhythm in English and Italian: an Experimental Study on Early Sequential Bilingualism
Authors: Verbeni, Vincenzo
Subjects: speech rhythm
language acquisition
sequential bilingualism
immersion education
interval-based metrics
speech segmentation
Publisher: Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Link: https://bibliotekanauki.pl/articles/57112439.pdf
Description: The study investigates the dynamics of speech rhythm in early sequential bilingual children who have access to Italian-English immersion programs. The research focused on the Italian and English semi-spontaneous narrative productions of 9 students, aged between 6;7 and 10;11 and distributed across three different classes (Year 1, Year 3, Year 5). Their speech was recorded and subjected to an interval-based analysis via computation of %V/ΔC, PVI and Varco metrics. The resulting metrics underwent within-group and between-group one-way ANOVAs in order to identify valuable cross-linguistic variations among children of the same age and statistically significant differences between different age groups (Y1, Y3, Y5). The results appear to support a stress-centered interpretation of speech rhythm: according to this view, all languages could be arranged on a stress-timed continuum in which “syllable-timing” is marked by sparser occurrences of (regular) prominence due to the relative absence of vocalic elision and consonantal complexity. Indeed, the comparative analysis of the normalized vocalic indexes of Y1, Y3 and Y5 students revealed a statistically relevant increase in vocalic variation phenomena both in Italian and in English. Moreover, Y1 and Y3 consonantal scores were comparatively higher in the Italian sample: it will be discussed how unpredictable stress-timed patterns can arise as a function of proficiency, speech rate and age-related disfluencies.
Content provider: Biblioteka Nauki

Article
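
The interval-based metrics named in the abstract above (%V, ΔC, PVI, Varco) have compact standard definitions. The sketch below computes them from lists of vocalic and consonantal interval durations. It assumes durations in milliseconds and the population standard deviation; individual studies may use the sample standard deviation instead, and the function names are hypothetical.

```python
import statistics

def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals, in percent."""
    return 100.0 * sum(vocalic) / (sum(vocalic) + sum(consonantal))

def delta(intervals):
    """Delta metric (e.g. delta-C): standard deviation of interval durations."""
    return statistics.pstdev(intervals)

def varco(intervals):
    """Varco: standard deviation normalized by the mean duration, in percent."""
    return 100.0 * statistics.pstdev(intervals) / statistics.mean(intervals)

def raw_pvi(intervals):
    """rPVI: mean absolute difference between successive intervals."""
    diffs = [abs(a - b) for a, b in zip(intervals, intervals[1:])]
    return sum(diffs) / len(diffs)

def normalized_pvi(intervals):
    """nPVI: like rPVI, but each difference is divided by the mean of the pair."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(intervals, intervals[1:])]
    return 100.0 * sum(terms) / len(terms)

vowels_ms = [100, 100, 100]      # made-up durations for illustration only
consonants_ms = [50, 150, 100]
print(percent_v(vowels_ms, consonants_ms))
print(delta(consonants_ms), varco(consonants_ms))
print(raw_pvi(consonants_ms), normalized_pvi(vowels_ms))
```

Varco and nPVI divide by a local or global mean duration, which is why they are usually described as rate-normalized counterparts of the delta and raw PVI measures.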

Title: Automatic speech signal segmentation based on the innovation adaptive filter
Authors: Makowski, R.
Hossa, R.
Subjects: automatic speech segmentation
inter-phoneme boundaries
Schur adaptive filtering
detection threshold determination
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Link: https://bibliotekanauki.pl/articles/330096.pdf
Description: Speech segmentation is an essential stage in designing automatic speech recognition systems, and several algorithms have been proposed in the literature. It is a difficult problem, as speech is immensely variable. The aim of the authors’ studies was to design an algorithm that could be employed at the stage of automatic speech recognition. This would make it possible to avoid some problems related to speech signal parametrization. Posing the problem in such a way requires the algorithm to be capable of working in real time. The only such algorithm was proposed by Tyagi et al. (2006), and it is a modified version of Brandt’s algorithm. The article presents a new algorithm for unsupervised automatic speech signal segmentation. It performs segmentation without access to information about the phonetic content of the utterances, relying exclusively on second-order statistics of a speech signal. The starting point for the proposed method is the time-varying Schur coefficients of an innovation adaptive filter. The Schur algorithm is known to be fast, precise, stable and capable of rapidly tracking changes in second-order signal statistics. A transition from one phoneme to another in the speech signal always indicates a change in signal statistics caused by vocal tract changes. In order to allow for the properties of human hearing, detection of inter-phoneme boundaries is performed based on statistics defined on the mel spectrum determined from the reflection coefficients. The paper presents the structure of the algorithm, defines its properties, lists parameter values, describes detection efficiency results, and compares them with those for another algorithm. The obtained segmentation results are satisfactory.
Content provider: Biblioteka Nauki

Article
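
The abstract above builds on reflection (Schur) coefficients that track second-order signal statistics. As a rough illustration only, the sketch below computes reflection coefficients for a single frame with the block-wise Levinson-Durbin recursion. This is a swapped-in technique, not the adaptive Schur filter of the paper, and it omits the mel-spectrum step; the function names and the sign convention (prediction-error filter with leading coefficient 1) are assumptions.

```python
def autocorrelation(frame, order):
    """Biased autocorrelation sums r[0..order] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(order + 1)
    ]

def reflection_coefficients(r):
    """Levinson-Durbin recursion on r[0..p]; returns the reflection coefficients k_1..k_p."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order        # prediction-error filter coefficients
    error = r[0]                     # prediction error power
    ks = []
    for m in range(1, order + 1):
        if error <= 0:               # degenerate frame (e.g. silence): pad with zeros
            ks.extend([0.0] * (order - m + 1))
            break
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / error
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        error *= 1.0 - k * k
        ks.append(k)
    return ks

def coefficient_change(prev_ks, ks):
    """Simple change score between consecutive frames (hypothetical boundary cue)."""
    return sum(abs(p - q) for p, q in zip(prev_ks, ks))

# Autocorrelation of an ideal first-order autoregressive process with pole 0.5.
demo_ks = reflection_coefficients([1.0, 0.5, 0.25])
print(demo_ks)
```

A large `coefficient_change` between neighbouring frames would suggest a change in the signal statistics, which is the kind of cue the abstract associates with inter-phoneme boundaries.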

Title: Phonetic Segmentation using a Wavelet-based Speech Cepstral Features and Sparse Representation Classifier
Authors: Al-Hassani, Ihsan
Al-Dakkak, Oumayma
Assami, Abdlnaser
Subjects: Arabic speech corpus
ASR
F1-score
phonetic segmentation
sparse representation classifier
TTS
wavelet packet
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Link: https://bibliotekanauki.pl/articles/2058484.pdf
Description: Speech segmentation is the process of dividing a speech signal into distinct acoustic blocks that could be words, syllables or phonemes. Phonetic segmentation is about finding the exact boundaries of the different phonemes that compose a specific speech signal. This problem is crucial for many applications, e.g. automatic speech recognition (ASR). In this paper we propose a new model-based, text-independent phonetic segmentation method based on wavelet packet speech parametrization features and using the sparse representation classifier (SRC). Experiments were performed on two datasets: the first is an English one derived from the TIMIT corpus, while the second is an Arabic one derived from the Arabic speech corpus. Results showed that the proposed wavelet packet decomposition features outperform the MFCC features in the speech segmentation task, in terms of both F1-score and R-measure on both datasets. Results also indicate that the SRC gives a higher hit rate than the well-known k-Nearest Neighbors (k-NN) classifier on the TIMIT dataset.
Content provider: Biblioteka Nauki

Article
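
The abstract above reports hit rate, F1-score and R-measure for detected phoneme boundaries. The sketch below shows how such scores are commonly computed from reference and detected boundary times. The 20 ms tolerance, the greedy one-to-one matching and the function names are assumptions for illustration; the paper's exact scoring protocol may differ.

```python
import math

def count_hits(reference, detected, tolerance=0.02):
    """Greedy one-to-one matching: each reference boundary takes its nearest unused detection."""
    hits, used = 0, set()
    for ref in reference:
        best = None
        for j, det in enumerate(detected):
            if j in used or abs(det - ref) > tolerance:
                continue
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = j
        if best is not None:
            used.add(best)
            hits += 1
    return hits

def segmentation_scores(reference, detected, tolerance=0.02):
    """Hit rate (recall), precision, F1, over-segmentation (%) and R-value."""
    hits = count_hits(reference, detected, tolerance)
    recall = hits / len(reference)
    precision = hits / len(detected)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    over_seg = (len(detected) / len(reference) - 1) * 100
    hit_rate = recall * 100
    # R-value: distance from the ideal point (100% hit rate, 0% over-segmentation).
    r1 = math.sqrt((100 - hit_rate) ** 2 + over_seg ** 2)
    r2 = (-over_seg + hit_rate - 100) / math.sqrt(2)
    r_value = 1 - (abs(r1) + abs(r2)) / 200
    return {"recall": recall, "precision": precision, "f1": f1,
            "over_segmentation": over_seg, "r_value": r_value}

ref_times = [0.10, 0.25, 0.40, 0.60]          # made-up boundary times in seconds
det_times = [0.11, 0.26, 0.45, 0.59, 0.80]
scores = segmentation_scores(ref_times, det_times)
print(scores)
```

Unlike F1, the R-value penalizes over-segmentation explicitly, so a detector cannot raise its score simply by emitting many extra boundaries to increase the hit rate.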
