pp comments

Results: 9 comments by pp

Error when training on ScanNet dataset

> @bityigoss did you solve it? @florinhegedus I increased the batch size, but I still ran into the problem.

Sorry, I didn't find a solution.

about post normalization

@vince62s Thanks for your reply, here is my code for post-normalization #2199

about post normalization

@vince62s Thank you. I tested on the IWSLT14-DE-EN dataset, and my training config is as follows (skipping the corpus and vocab options): if I set the **normalize_after** option to true, the BLEU...

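@mayang1 The model was trained on generated (synthetic) data, about 9 million samples in total. The image data is too large to upload conveniently.

- The generation tool is mainly based on the repo code referenced in ReadMe.txt; you can add extra effects as needed, such as shadows and borders.
- The corpus was extracted from open Chinese corpora such as Sogou News and wiki. Extraction basically splits sentences at punctuation, with sample lengths of [6,24] characters.
- I analyzed the distribution of some characters, sampled and selected from the samples, removed some low-frequency characters, and ended up with the character list in config/chn.txt.
- Several open-source fonts were also used when generating the image samples.

If you have good suggestions or datasets, feel free to share. Thanks.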
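The corpus steps above (split at punctuation, keep samples of 6 to 24 characters, drop low-frequency characters) could be sketched roughly as follows. This is only an illustration, not the script actually used: the punctuation set, the `min_count` threshold, and all function names are assumptions.

```python
import re
from collections import Counter

# Assumed punctuation set (Chinese and ASCII); the original set is not stated.
PUNCT = r"[，。！？；：、,.!?;:\s]+"


def split_by_punctuation(text, min_len=6, max_len=24):
    """Split text at punctuation and keep pieces with length in [min_len, max_len]."""
    pieces = re.split(PUNCT, text)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def build_charset(samples, min_count=2):
    """Collect characters that occur at least min_count times (hypothetical threshold)."""
    counts = Counter(ch for s in samples for ch in s)
    return {ch for ch, n in counts.items() if n >= min_count}


def filter_samples(samples, charset):
    """Keep only samples made entirely of characters in charset."""
    return [s for s in samples if all(ch in charset for ch in s)]


if __name__ == "__main__":
    text = "今天天气非常好我们去公园。短句。今天天气不错我们去爬山！"
    samples = split_by_punctuation(text)
    charset = build_charset(samples, min_count=1)
    print(filter_samples(samples, charset))
```

In practice the resulting character set would play the role of a list like config/chn.txt, and the filtered samples would be passed to the image generator.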

dataset

To convert data to LMDB format, you can use this script: [utils/create_lmdb_dataset.py](https://github.com/bityigoss/mtl-text-recognition/blob/master/utils/create_lmdb_dataset.py)

dataset

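@longnanxi What accuracy does the training log show? Is prediction run on the test set, and is it still poor there? Also consider whether the training set is too small and the model is overfitting. To get the code working, first download the pretrained model and run the prediction part to understand the prediction process; then build your own training samples and iterate, comparing against the pretrained model as a baseline.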

Hello, I ran into a problem when using infer

See whether https://github.com/pytorch/pytorch/issues/23393 helps.

Additional support of MariaDB

@matrixssy Any progress? Thank you.

Everyone, I have implemented multi-token prediction of InfiniAttention and meta.

If we still tie the weights of lm_head in multi-token prediction, how do the heads output different token predictions? @win10ogod
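To make the question concrete, here is a toy sketch of one way a tied lm_head could still give different predictions: each future-token offset gets its own transform of the hidden state, and only the final projection is shared. This is an assumption for illustration, not the implementation under discussion; `HEAD_TRANSFORMS`, `LM_HEAD`, and the tiny dimensions are all hypothetical.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Shared (tied) output projection: vocab_size x hidden_size = 3 x 2.
LM_HEAD = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

# Hypothetical per-offset transforms; only these differ between heads.
HEAD_TRANSFORMS = [
    [[1.0, 0.0], [0.0, 1.0]],  # offset +1: identity
    [[0.0, 1.0], [1.0, 0.0]],  # offset +2: swaps the two components
]


def predict_tokens(hidden):
    """Return one predicted token id per offset, all using the same LM_HEAD."""
    preds = []
    for transform in HEAD_TRANSFORMS:
        h_k = matvec(transform, hidden)   # offset-specific hidden state
        logits = matvec(LM_HEAD, h_k)     # shared projection to the vocabulary
        preds.append(argmax(logits))
    return preds


if __name__ == "__main__":
    print(predict_tokens([2.0, 0.5]))
```

If instead every head fed the identical hidden state into the same tied lm_head, the logits would necessarily be identical, which is the concern raised in the question.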