<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://www.cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://www.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Dongxu_Zhang_14-11-03</id>
		<title>Dongxu Zhang 14-11-03 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://www.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Dongxu_Zhang_14-11-03"/>
		<link rel="alternate" type="text/html" href="http://www.cslt.org/mediawiki/index.php?title=Dongxu_Zhang_14-11-03&amp;action=history"/>
		<updated>2026-04-04T01:02:18Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://www.cslt.org/mediawiki/index.php?title=Dongxu_Zhang_14-11-03&amp;diff=12238&amp;oldid=prev</id>
		<title>Zhangdx: Created page with "=== Accomplished this week === * Create 100k,200k,150576 vocabulary. And use 150576 to build baiduhi, baiduzhidao language model(still running, preprocess).  * Use 1..."</title>
		<link rel="alternate" type="text/html" href="http://www.cslt.org/mediawiki/index.php?title=Dongxu_Zhang_14-11-03&amp;diff=12238&amp;oldid=prev"/>
				<updated>2014-11-02T16:52:38Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with "=== Accomplished this week === * Create 100k,200k,150576 vocabulary. And use 150576 to build baiduhi, baiduzhidao language model(still running, preprocess).  * Use 1..."&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;=== Accomplished this week ===&lt;br /&gt;
* Created 100k, 200k, and 150576-word vocabularies, and used the 150576-word vocabulary to build language models on baiduhi and baiduzhidao (still running; currently preprocessing).&lt;br /&gt;
* Used a 166k vocabulary to train LMs on baiduhi and baiduzhidao separately (still running; currently pruning).&lt;br /&gt;
* Extracted sentences that contain English and numbers from the weibo corpus.&lt;br /&gt;
* Running BPTT using rwthlm. Results are still not normal: high PPL but low WER. Within rwthlm itself, however, the LSTM does appear to be better than standard BPTT.&lt;br /&gt;
* Found a tool called Shenlan that can parse Sogou cell vocabularies. Using its code together with a crawler, we can update our vocabulary with new words.&lt;br /&gt;
&lt;br /&gt;
=== Planned for next week ===&lt;br /&gt;
* Continue building LMs and comparing vocabularies.&lt;br /&gt;
* Continue working on rwthlm.&lt;/div&gt;</summary>
		<author><name>Zhangdx</name></author>	</entry>

	</feed>