Zhaomy：/* GFbank */

2014-03-07T04:58:55Z


Cslt: Created new page with content "==Resoruce Building== * Current text resource has been re-arranged and listed == AM development == === Sparse DNN === * Optimal Brain Damage(OBD). # GA-based block..."

2014-03-07T02:51:26Z


New page

==Resource Building==
* The current text resources have been rearranged and listed

== AM development ==

=== Sparse DNN ===

* Optimal Brain Damage (OBD).

# GA-based block sparsity
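
As a rough illustration of OBD-style pruning, the sketch below ranks weights by the saliency 0.5 * h_kk * w_k^2 and zeroes the least salient ones. The function name <code>obd_prune</code>, the toy weights, the diagonal Hessian values and the pruning fraction are all hypothetical placeholders, not the settings used in these experiments.

```python
def obd_prune(weights, hessian_diag, prune_fraction):
    """Zero out the weights with the lowest OBD saliency.

    Saliency of weight k is 0.5 * h_kk * w_k**2, where h_kk is the
    diagonal second derivative of the loss with respect to that weight.
    """
    if len(weights) != len(hessian_diag):
        raise ValueError("weights and hessian_diag must have equal length")
    saliency = [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]
    n_prune = int(len(weights) * prune_fraction)
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda k: saliency[k])
    pruned = list(weights)
    for k in order[:n_prune]:
        pruned[k] = 0.0
    return pruned, saliency


# Toy values (hypothetical, for illustration only).
w = [0.5, -0.1, 2.0, 0.05]
h = [1.0, 4.0, 0.1, 2.0]
pruned, sal = obd_prune(w, h, prune_fraction=0.5)
```

Note that a small weight is not necessarily pruned first: the ranking depends on the curvature term as well as the magnitude.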

=== Efficient DNN training ===

# Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Possibly overfitting?

===Multi GPU training===
* Error encountered

===GMM - DNN co-training===
* Error encountered

=== Multilanguage training===

# Pure Chinese training reached 4.9%
# Chinese + English training degraded the result to 7.9%
# The English phone set should distinguish between beginning phones and ending phones
# Should set up a multilingual network structure that shares the low layers but separates the languages at the high layers
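
The proposed structure (shared low layers, language-specific high layers) could look roughly like the following sketch. It is only a forward pass in plain Python; the class name <code>MultilingualDNN</code>, the layer sizes, the language keys and the random initialization are assumptions made for illustration and do not describe the actual training setup.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-z)) for z in v]


def softmax(v):
    m = max(v)
    e = [math.exp(z - m) for z in v]
    s = sum(e)
    return [z / s for z in e]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultilingualDNN:
    """Shared low layers with one output layer per language."""

    def __init__(self, n_in, hidden_sizes, n_targets_per_lang, seed=0):
        rng = random.Random(seed)
        self.shared = []
        prev = n_in
        for size in hidden_sizes:
            self.shared.append(init_layer(prev, size, rng))
            prev = size
        # Language-specific top layers, keyed by language name.
        self.heads = {lang: init_layer(prev, n_out, rng)
                      for lang, n_out in n_targets_per_lang.items()}

    def forward(self, features, lang):
        h = features
        for weights, bias in self.shared:   # shared by all languages
            h = sigmoid(dense(h, weights, bias))
        weights, bias = self.heads[lang]    # selected by language
        return softmax(dense(h, weights, bias))


# Hypothetical sizes for illustration only.
net = MultilingualDNN(n_in=4, hidden_sizes=[8, 8],
                      n_targets_per_lang={"zh": 5, "en": 3})
zh_post = net.forward([0.1, -0.2, 0.3, 0.0], "zh")
en_post = net.forward([0.1, -0.2, 0.3, 0.0], "en")
```

In training, frames from either language would update the shared layers, while each output layer would only see frames of its own language.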

===Noise training===

* Train on the WSJ database by corrupting the data with various noise types
:* White noise + car noise training partially completed
:* Mixture training produces better performance for both car and white noise
:* Unknown noise testing is in progress
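
One common way to corrupt clean data is to scale a noise signal so that the mixture reaches a chosen signal-to-noise ratio. The sketch below shows that idea with synthetic signals. The helper names, the sine-wave stand-in for speech and the SNR levels (25 dB and 5 dB) are assumptions for illustration; the actual corruption procedure used on WSJ is not described here.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must have non-zero power")
    # Required noise power is speech_power / 10^(snr_db / 10).
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    speech_power = sum(s * s for s in speech)
    noise_power = sum(r * r for r in residual)
    return 10.0 * math.log10(speech_power / noise_power)


# Synthetic stand-ins for a clean utterance and a white-noise recording.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
white = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy_25db = mix_at_snr(speech, white, snr_db=25)
noisy_5db = mix_at_snr(speech, white, snr_db=5)
```

For mixture training, each utterance would be paired with a noise type (for example white or car noise) and an SNR drawn from a chosen set before feature extraction.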

===AMR compression re-training===
* WeChat uses AMR compression, which requires adapting our AM
* Tested the AMR and non-AMR models:

<pre>
model \ test-wav    WAV      AMR
WAV                 4.31     26.09
AMR                 13.80    6.77
</pre>

* Prepare to do adaptation

===GFbank===
* Finished the first round of gfbank training and testing
* The same GMM model (MFCC features) was used to obtain the alignment
* Trained fbank and gfbank models based on the MFCC alignment
* Clean training and noisy test

<pre>
      clean    25db    5db
gf    4.22     5.60    73.03
fb    4.31     5.87    84.12
</pre>

===Engine optimization===

* Investigating LOUDS FST.

==Word to Vector==

* Tested a training toolkit from Stanford University that can incorporate global information into word2vector training
:* Attempted a C++ implementation (instead of Python) for data pre-processing. It failed, so we just use Python.

* Basic wordvector plus global sense
:* A 1 MB corpus takes 5 mins, vocab size 16698
:* A 10 MB corpus takes about 82 mins, vocab size 56287

* Improved wordvector with multi sense
:* Almost impossible with the toolkit
:* Could pre-train the vectors and then do clustering

* WordVector-based keyword extraction
:* Prepared 500+ articles in 7 categories
:* Found a problem in keyword identification. Fix it by using the article vector space

* Investigating the Senna toolkit from NEC, with the aim of implementing POS tagging based on word vectors.

==LM development==

===NN LM===

* Character-based NNLM (6700 chars, 7-gram): training on 500M data is done.
:* Performance lower than word-based NNLM
:* Prepare to run boundary-involved char NNLM

* WordVector-based word and char NNLM training done
:* The Google wordvector-based NNLM is worse than the randomly initialized NNLM

===3T Sogou LM===

* Improved training
:* 3T LM + Tencent 80K LM: performance is worse than the original 80K LM
:* Need to check whether this is caused by the mismatched vocabulary
:* 3T LM + QA LM: using online1 as the EM target, performance is worse than the QA LM
:* Probably due to an incorrect EM target
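
For reference, the interpolation weight between two LMs is commonly tuned by EM on a held-out target text, which is why the choice of target matters. The sketch below assumes that "EM target" has this meaning and shows the two-model case. The function names and the per-token probabilities are made up for illustration and are not taken from the 3T, 80K or QA models.

```python
import math


def em_interpolation_weight(probs_a, probs_b, iterations=20, init=0.5):
    """Estimate lambda in p = lambda * p_a + (1 - lambda) * p_b by EM.

    probs_a and probs_b hold the probability each LM assigns to every
    token of the held-out target text.
    """
    if len(probs_a) != len(probs_b) or not probs_a:
        raise ValueError("need two equal-length, non-empty probability lists")
    lam = init
    for _ in range(iterations):
        # E-step: posterior that each token was generated by model A.
        posteriors = [lam * pa / (lam * pa + (1.0 - lam) * pb)
                      for pa, pb in zip(probs_a, probs_b)]
        # M-step: the new weight is the average posterior.
        lam = sum(posteriors) / len(posteriors)
    return lam


def log_likelihood(lam, probs_a, probs_b):
    """Log-likelihood of the target text under the interpolated model."""
    return sum(math.log(lam * pa + (1.0 - lam) * pb)
               for pa, pb in zip(probs_a, probs_b))


# Hypothetical per-token probabilities on a small target text.
p_a = [0.20, 0.10, 0.30, 0.05]
p_b = [0.10, 0.10, 0.10, 0.20]
lam = em_interpolation_weight(p_a, p_b)
```

If the target text does not resemble the test condition, the weight that maximizes its likelihood can still hurt recognition accuracy, which is consistent with the suspicion noted above.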

==QA Matching==

* Working on edit FST for fuzzy matching
* TF/IDF score matching completed
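
A minimal sketch of TF/IDF score matching is given below: each stored question and the query are turned into TF-IDF vectors and compared by cosine similarity. The function names, the smoothed IDF formula and the toy questions are assumptions for illustration, not the implementation used in the QA system.

```python
import math
from collections import Counter


def build_idf(documents):
    """Smoothed inverse document frequency for each term."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the collection are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, questions):
    """Return (index, score) of the stored question closest to the query."""
    idf = build_idf(questions)
    q_vec = tfidf_vector(query, idf)
    scores = [cosine(q_vec, tfidf_vector(doc, idf)) for doc in questions]
    best = max(range(len(questions)), key=lambda i: scores[i])
    return best, scores[best]


# Hypothetical, pre-tokenized questions.
questions = [
    ["how", "to", "reset", "password"],
    ["how", "to", "open", "account"],
    ["what", "is", "the", "exchange", "rate"],
]
index, score = best_match(["reset", "my", "password"], questions)
```

Exact term overlap is required here, so misspelled or misrecognized words score zero; that gap is what fuzzy matching with an edit FST would address.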

==Embedded development==

* The CLG embedded decoder is almost done. The online compiler is in progress.
* English scoring is under way

==Speech QA==

* N-best with entity LM was analyzed
* Entity-class LM comparison
:* re-segmentation & re-train
:* SRILM class-based LM ???
:* Subgraph integration from Zhiyong

* WER summary is done
* Prepare to compose a paper

@@ Line 56: / Line 56: @@
       clean     25db    5db
 gf   4.22      5.60    73.03
-fb             5.87    84.12
+fb   4.31      5.87    84.12
 </pre>
 ===Engine optimization===
