Shohei Taniguchi
@ishohei220
Researcher at the University of Tokyo @Matsuo_Lab. Deep generative models, stochastic optimization.
Joined March 2015
Posts
  • Pinned
    Our NeurIPS paper is published on arXiv. In this paper, we propose a new optimizer ADOPT, which converges better than Adam in both theory and practice. You can use ADOPT by just replacing one line in your code. arxiv.org/abs/2411.02853
  • user avatar
    We have uploaded our NeurIPS paper to arXiv. The paper shows that a minor modification to Adam guarantees convergence at all times, regardless of hyperparameters. The proposed method, ADOPT, can be used right away by changing one line of code, so please give it a try.
    Our NeurIPS paper is published on arXiv. In this paper, we propose a new optimizer ADOPT, which converges better than Adam in both theory and practice. You can use ADOPT by just replacing one line in your code. arxiv.org/abs/2411.02853
  • user avatar
    slideshare.net/ShoheiTaniguch… I have published the lecture slides here.
    I will be the lecturer for the 5th session of this course. I will talk about the relationship between reinforcement learning and probabilistic inference (Bayesian statistics), and about reinforcement learning methods for POMDPs (e.g., world models). If you are interested, please apply! The deadline is Sunday at 23:59.
  • user avatar
    My first-author paper was accepted to NeurIPS. It analyzes the convergence of Adam and proposes an improved algorithm based on that analysis. The proposed method, ADOPT, outperforms Adam across a wide range of tasks, and we have confirmed that it also suppresses loss spikes in LLM training. I plan to upload it to arXiv soon.
    Two papers from our lab have been accepted to NeurIPS 2024. weblab.t.u-tokyo.ac.jp/%e5%bd%93%e7%a…
  • user avatar
    I have released a reproduction of DeepMind's GQN in PyTorch and Pixyz, which I worked on alongside my bachelor's thesis. The author, Eslami, gave me all the hyperparameters directly, so the implementation is faithful to the paper. Take a look if you are interested. (Note that four GPUs are required.) github.com/iShohei220/tor… github.com/masa-su/pixyzo…
  • user avatar
    The work from my internship at OMRON SINIC X was accepted to ICRA 2020. It estimates the pose of an object held in a robot hand by updating a particle filter with the geometric constraints that arise when the grasped object contacts the environment. If you are interested in Bayes × robot control, please give it a read.
  • user avatar
    I got accepted for the PFN internship!
  • user avatar
    Our lab has released an open-source LLM. It is currently the largest open-source model that supports Japanese, so please try it out and play around with it.
    We have released "Weblab-10B", a 10-billion-parameter large language model supporting both Japanese and English, as open source. It performs at the highest level among open-source Japanese large language models (based on JGLUE evaluation results as of August 16, 2023). bit.ly/47zPTJi
    Readers added context
    Note that this press release uses the label "open source" under a definition that differs from the commonly used "Open Source" definition of the Open Source Initiative. opensource.org/faq/#commercial As of August 19, 2023, Weblab-10B is published on Hugging Face under the CC BY-NC 4.0 license. huggingface.co/matsuo-lab/web… huggingface.co/matsuo-lab/web… The CC BY-NC 4.0 license prohibits commercial use and does not appear on the list of OSI Approved Licenses. opensource.org/licenses/
  • user avatar
    My first-author paper was accepted to NeurIPS 2022. The work brings encoder-based inference, as used in VAEs, into MCMC. We also proposed a new deep generative model built on it, the Langevin autoencoder.
    Our paper “Langevin Autoencoders for Learning Deep Latent Variable Models” has been accepted at NeurIPS 2022 🎉 We proposed a novel framework of deep generative models named the Langevin autoencoder (LAE). Brief summary in the thread below. arxiv.org/abs/2209.07036
  • user avatar
    I received an encouragement award.
    The presentation by Shohei Taniguchi, a first-year master's student in our lab, was selected for the Student Incentive Award at the 2019 Annual Conference of the Japanese Society for Artificial Intelligence. weblab.t.u-tokyo.ac.jp/%e5%bd%93%e7%a…
  • user avatar
    I will be the lecturer for the 5th session of this course. I will talk about the relationship between reinforcement learning and probabilistic inference (Bayesian statistics), and about reinforcement learning methods for POMDPs (e.g., world models). If you are interested, please apply! The deadline is Sunday at 23:59.
    [Students only: short course, round 1] Applications are now open for the reinforcement learning course! It runs for six sessions starting 8/11 and covers everything from the basics of reinforcement learning to sim2real, imitation learning, Control as Inference, and world models. Students who understand the basics of deep learning are encouraged to apply! (Deadline: July 26, 23:59) deeplearning.jp/reinforcement_…
  • user avatar
    **Update on the ADOPT optimizer** To address several reports that ADOPT sometimes gets unstable, a minor modification has been made to the algorithm. We observe that this modification greatly improves stability in many cases.
    Our NeurIPS paper is published on arXiv. In this paper, we propose a new optimizer ADOPT, which converges better than Adam in both theory and practice. You can use ADOPT by just replacing one line in your code. arxiv.org/abs/2411.02853
  • user avatar
    Our paper “Langevin Autoencoders for Learning Deep Latent Variable Models” has been accepted at NeurIPS 2022 🎉 We proposed a novel framework of deep generative models named the Langevin autoencoder (LAE). Brief summary in the thread below. arxiv.org/abs/2209.07036
  • user avatar
    Furuta, a second-year master's student (M2) in our lab, has had first-author papers accepted at ICLR, ICML, and NeurIPS all within the past year. Seriously impressive.
    Our paper classifying existing deep reinforcement learning algorithms has been accepted to NeurIPS 2021! It is the result of a collaboration among Matsuo Lab, @Tdash_Koz, and @shaneguML. arxiv.org/abs/2103.17258