Bad error message when using closed dictionary: SIGSEGV - A fatal error has been detected by the Java Runtime Environment #214

@azagniotov

TL;DR: The JVM crashes with a SIGSEGV (segmentation fault) when I tokenize text:

# Problematic frame:
# J 1404 c1 java.nio.DirectIntBufferU.get(I)I java.base@17.0.8 (34 bytes) @ 0x000000010de8ba86 [0x000000010de8b660+0x0000000000000426]

More Details

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00000001160fc686, pid=47517, tid=10243
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.8+7 (17.0.8+7) (build 17.0.8+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.8+7 (17.0.8+7, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
# Problematic frame:
# J 1398 c1 java.nio.DirectIntBufferU.get(I)I java.base@17.0.8 (34 bytes) @ 0x00000001160fc686 [0x00000001160fc260+0x0000000000000426]
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/azagniotov/git/solr-morphological-analyzer-sudachi/hs_err_pid47517.log
Compiled method (c1)     689 1398   !   3       java.nio.DirectIntBufferU::get (34 bytes)
 total in heap  [0x00000001160fc010,0x00000001160fcef8] = 3816
 relocation     [0x00000001160fc170,0x00000001160fc248] = 216
 main code      [0x00000001160fc260,0x00000001160fca40] = 2016
 stub code      [0x00000001160fca40,0x00000001160fcab0] = 112
 oops           [0x00000001160fcab0,0x00000001160fcab8] = 8
 metadata       [0x00000001160fcab8,0x00000001160fcae8] = 48
 scopes data    [0x00000001160fcae8,0x00000001160fcc40] = 344
 scopes pcs     [0x00000001160fcc40,0x00000001160fcda0] = 352
 dependencies   [0x00000001160fcda0,0x00000001160fcda8] = 8
 handler table  [0x00000001160fcda8,0x00000001160fced8] = 304
 nul chk table  [0x00000001160fced8,0x00000001160fcef8] = 32
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#

I am trying to use one of the prebuilt dictionaries, http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/sudachi-dictionary-20230711-core.zip, together with Sudachi 0.7.3, which I pull from Maven Central in a Gradle-based project.

This is how I initialize the Sudachi tokenizer, using the canonical file path to the system_core.dic that I downloaded. I do not have a custom sudachi.json, so I assume the default one packaged with Sudachi is used at runtime:

    public static Tokenizer fromSystemDict(final String fileCanonicalPath) throws IOException {
        final Config config = Config.defaultConfig().systemDictionary(Paths.get(fileCanonicalPath));
        try (final Dictionary dictionary = new DictionaryFactory().create(config)) {
            return dictionary.create();
        }
    }

The crash happens as soon as I call:

sudachiTokenizer.tokenize(Tokenizer.SplitMode.A, "京都。東京.東京都。京都")

This happens on both JDK 11 and JDK 17, and I reproduced it on two different macOS laptops.

The hs_err_pid47801.log file is attached:
hs_err_pid47801.log
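
The issue title points at a closed dictionary as the cause: fromSystemDict returns the tokenizer from inside a try-with-resources block, so the dictionary is closed before the tokenizer is ever used. The sketch below illustrates that lifetime problem with JDK-only stand-ins. FakeDictionary, FakeTokenizer and TokenizerHandle are hypothetical names invented for illustration and are not Sudachi classes. The stand-in throws an exception where the real library crashed with SIGSEGV, presumably because the tokenizer reads from a buffer the dictionary has already released. One possible way around it, assuming the tokenizer needs the dictionary for its whole lifetime, is a holder that owns both and closes the dictionary only when the caller is done.

```java
import java.util.List;

public class DictionaryLifetimeSketch {

    /** Hypothetical stand-in for a dictionary that owns a closable resource. */
    static final class FakeDictionary implements AutoCloseable {
        private boolean open = true;

        FakeTokenizer create() {
            return new FakeTokenizer(this);
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Hypothetical stand-in for a tokenizer that reads from its dictionary on every call. */
    static final class FakeTokenizer {
        private final FakeDictionary dictionary;

        FakeTokenizer(FakeDictionary dictionary) {
            this.dictionary = dictionary;
        }

        List<String> tokenize(String text) {
            // The stand-in fails loudly; the report shows a JVM crash at this point instead.
            if (!dictionary.isOpen()) {
                throw new IllegalStateException("dictionary is closed");
            }
            // Toy splitting on sentence punctuation, only to return something visible.
            return List.of(text.split("[。.]"));
        }
    }

    /** Same shape as fromSystemDict in the report: the dictionary is closed on return. */
    static FakeTokenizer closedOnReturn() {
        try (FakeDictionary dictionary = new FakeDictionary()) {
            return dictionary.create();
        }
    }

    /** Holder that keeps the dictionary open for as long as the tokenizer is in use. */
    static final class TokenizerHandle implements AutoCloseable {
        private final FakeDictionary dictionary = new FakeDictionary();
        private final FakeTokenizer tokenizer = dictionary.create();

        List<String> tokenize(String text) {
            return tokenizer.tokenize(text);
        }

        @Override
        public void close() {
            dictionary.close();
        }
    }

    public static void main(String[] args) {
        try {
            closedOnReturn().tokenize("京都。東京");
        } catch (IllegalStateException e) {
            System.out.println("closed-dictionary pattern failed: " + e.getMessage());
        }
        try (TokenizerHandle handle = new TokenizerHandle()) {
            System.out.println(handle.tokenize("京都。東京"));
        }
    }
}
```

With the real library, the equivalent change would be to keep the Dictionary returned by DictionaryFactory.create(config) open for as long as the Tokenizer is in use, and close it only on shutdown.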
