I have a Windows DLL called some.dll with the following function:

void some_func(TCHAR* input_string)
{
...
}

some_func expects a pointer to a UTF-16 encoded string.

Running this Python code:

from ctypes import *

some_string = "disco duck"
param_to_some_func = c_wchar_p(some_string.encode('utf-16'))  #  here exception!

some_dll = WinDLL("some.dll")
some_dll.some_func(param_to_some_func)

fails with the exception "TypeError: unicode string or integer address expected instead of bytes instance".

The documentation for ctypes and ctypes.wintypes is very thin, and I have not found a way to convert a python string to a Windows wide char and pass it to a function.

1 Answer

According to [Python 3.Docs]: Built-in Types - Text Sequence Type - str (emphasis is mine):

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.

On Windows, ctypes passes them to C code as wchar_t strings, which are UTF-16 encoded (wchar_t is 16 bits wide there).
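One way to check this is to look at the memory that ctypes hands to C code. The sketch below is not part of the original answer. It copies a str into a wchar_t buffer with create_unicode_buffer and compares the raw bytes against an explicit encoding. The codec is picked from sizeof(c_wchar), so the snippet also runs where wchar_t is 4 bytes wide (typical on Linux).

```python
import ctypes as cts
import sys

text = "disco duck"

# Copy the str into a NUL-terminated wchar_t array, as C code would see it.
buf = cts.create_unicode_buffer(text)
raw = bytes(buf)

# wchar_t is 2 bytes on Windows (UTF-16) and usually 4 bytes elsewhere (UTF-32).
wchar_size = cts.sizeof(cts.c_wchar)
codec = {2: "utf-16", 4: "utf-32"}[wchar_size]
codec += "-le" if sys.byteorder == "little" else "-be"

# Encoding by hand (no BOM, plus the terminator) should give the same bytes.
expected = (text + "\0").encode(codec)
print(wchar_size, raw == expected)
```

Note the explicit byte order in the codec name: the plain "utf-16" codec prepends a byte order mark, which is not part of a wchar_t string.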

So, the correspondence between CTypes and Python types (also visible by comparing the ctypes documentation of the two Python versions) is:

╔═══════════════╦══════════════╦══════════════╗
║    CTypes     ║   Python 3   ║   Python 2   ║
╠═══════════════╬══════════════╬══════════════╣
║   c_char_p    ║    bytes     ║     str      ║
║   c_wchar_p   ║     str      ║   unicode    ║
╚═══════════════╩══════════════╩══════════════╝

Example:

  • Python 3:

    >>> import ctypes as cts
    >>> import sys
    >>>
    >>> sys.version
    '3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)]'
    >>>
    >>> text_ascii = b"Dummy"
    >>> text_unicode = "Dummy"
    >>>
    >>> cts.c_char_p(text_ascii)
    c_char_p(2563882450144)
    >>>
    >>> cts.c_wchar_p(text_ascii)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unicode string or integer address expected instead of bytes instance
    >>>
    >>> cts.c_char_p(text_unicode)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: bytes or integer address expected instead of str instance
    >>>
    >>> cts.c_wchar_p(text_unicode)
    c_wchar_p(2563878400656)
    
  • Python 2 (note that str <=> unicode conversions are performed automatically):

    >>> import ctypes as cts
    >>> import sys
    >>>
    >>> sys.version
    '2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)]'
    >>>
    >>> text_ascii = "Dummy"
    >>> text_unicode = u"Dummy"
    >>>
    >>> cts.c_char_p(text_ascii)
    c_char_p('Dummy')
    >>>
    >>> cts.c_wchar_p(text_ascii)
    c_wchar_p(u'Dummy')
    >>>
    >>> cts.c_char_p(text_unicode)
    c_char_p('Dummy')
    >>>
    >>> cts.c_wchar_p(text_unicode)
    c_wchar_p(u'Dummy')
    

Back to your situation:

>>> import ctypes as cts
>>>
>>> some_string = "disco duck"
>>>
>>> enc_utf16 = some_string.encode("utf16")
>>> enc_utf16
b'\xff\xfed\x00i\x00s\x00c\x00o\x00 \x00d\x00u\x00c\x00k\x00'
>>>
>>> type(some_string), type(enc_utf16)
(<class 'str'>, <class 'bytes'>)
>>>
>>> cts.c_wchar_p(some_string)  # This is the right way
c_wchar_p(2508534214928)
>>>
>>> cts.c_wchar_p(enc_utf16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode string or integer address expected instead of bytes instance
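For the call itself, a minimal sketch (not part of the original answer) could declare the function's prototype and then pass the str directly. It assumes some.dll was built with _UNICODE defined, so that TCHAR* is wchar_t*. The helper name declare_some_func is hypothetical.

```python
import ctypes as cts


def declare_some_func(dll):
    """Describe `void some_func(wchar_t *input_string)` to ctypes."""
    func = dll.some_func
    func.argtypes = (cts.c_wchar_p,)  # a str argument is converted automatically
    func.restype = None               # the C function returns void
    return func


def main():
    # Windows only: WinDLL does not exist on other platforms.
    some_dll = cts.WinDLL("some.dll")  # file name taken from the question
    some_func = declare_some_func(some_dll)
    some_func("disco duck")            # no .encode() needed


# main() is left uncalled here because it needs the real some.dll.
```

With argtypes set, passing bytes by mistake is rejected by ctypes before the call reaches the DLL.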

As a side note, TCHAR is a typedef that depends on whether _UNICODE is defined: it is wchar_t when it is, and char when it is not. Check [MS.Learn]: Generic-Text Mappings in tchar.h for more details. So, depending on the compilation flags of the C code, the Python code might also need adjusting (c_char_p and bytes instead of c_wchar_p and str).
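To illustrate that adjustment, here is a small sketch with a hypothetical helper, tchar_param, that picks the ctypes type from a flag describing how the DLL was compiled. The flag and the "mbcs" default (Python's codec for the Windows ANSI code page, available on Windows only) are assumptions of this sketch; the build settings have to be known in advance, since the DLL does not report them.

```python
import ctypes as cts


def tchar_param(text, unicode_build=True, ansi_encoding="mbcs"):
    """Wrap `text` for a TCHAR* parameter (hypothetical helper)."""
    if unicode_build:
        # _UNICODE defined: TCHAR is wchar_t, so pass the str as-is.
        return cts.c_wchar_p(text)
    # _UNICODE not defined: TCHAR is char, so encode to bytes first.
    # The "mbcs" default is an assumption and exists only on Windows.
    return cts.c_char_p(text.encode(ansi_encoding))


wide = tchar_param("disco duck")
narrow = tchar_param("disco duck", unicode_build=False, ansi_encoding="ascii")
print(type(wide).__name__, type(narrow).__name__)
```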
