unicode parts of env are lost when running createProcess #152

@joeyh

In a non-Unicode locale, such as LANG=C, running createProcess with an env that contains a non-ASCII Unicode character, such as '¡', results in that character being stripped from the value seen by the child process; only the ASCII characters remain.

A program demonstrating the bug is:

import System.Process
import System.Environment

main :: IO ()
main = do
    v <- lookupEnv "FOO"
    case v of
        -- Child: FOO is set, so report the value that was received.
        Just foo -> print $ "FOO is set to: " ++ foo
        -- Parent: FOO is unset, so re-run this executable with FOO in its env.
        Nothing -> do
            let e = [("FOO", "¡foo!")]
            print $ "running child process with environment " ++ show e
            self <- getExecutablePath
            let p = (proc self []) { env = Just e }
            (_, _, _, pid) <- createProcess p
            _ <- waitForProcess pid
            return ()

This program execs itself, so it needs to be compiled rather than run in GHCi. On Linux, built with process-1.6.5.0 and ghc-8.6.5, it behaves like this:

# LANG=en_US.utf8 ./foo
"running child process with environment [(\"FOO\",\"\\161foo!\")]"
"FOO is set to: \56514\56481foo!"
# LANG=C ./foo
"running child process with environment [(\"FOO\",\"\\161foo!\")]"
"FOO is set to: foo!"

Using strace -vf foo shows that the non-ASCII characters do not make it to exec:

[pid 4335] execve("/home/joey/foo", ["/home/joey/foo"], ["FOO=foo!"] <unfinished ...>

This was discovered in a program that sets GIT_INDEX_FILE to a path when running git; if the path happens to contain Unicode characters, this results in a corrupted path being passed to git.
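
For reading the en_US.utf8 output above, it may help to see where \56514\56481 could come from. The sketch below assumes that GHC represents each undecodable byte b as the code point 0xDC00 + b (a surrogate escape); that is an assumption about GHC's round-trip encoding, not something stated in this report. The helper names utf8Small and escapeByte are made up for illustration. Under that assumption, the two code points are the UTF-8 bytes of '¡' (0xC2 0xA1), which would suggest that in the UTF-8 locale the bytes reached the child intact but were not decoded back to a single character, whereas under LANG=C they never reached exec at all.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | UTF-8 encoding of a code point below U+0800 (enough for '¡', U+00A1).
utf8Small :: Char -> [Word8]
utf8Small c
  | n < 0x80  = [fromIntegral n]
  | n < 0x800 = [ fromIntegral (0xC0 .|. (n `shiftR` 6))
                , fromIntegral (0x80 .|. (n .&. 0x3F)) ]
  | otherwise = error "utf8Small: only code points below U+0800 are handled"
  where n = ord c

-- | Surrogate-escape one byte: ASCII maps to itself, any other byte b
-- maps to U+DC00 + b. Assumption: this mirrors GHC's round-trip escape.
escapeByte :: Word8 -> Char
escapeByte b
  | b < 0x80  = chr (fromIntegral b)
  | otherwise = chr (0xDC00 + fromIntegral b)

main :: IO ()
main = do
  let bytes = utf8Small '\161'          -- '\161' is '¡'
  print bytes                           -- prints [194,161]
  print (map (ord . escapeByte) bytes)  -- prints [56514,56481]
```

The second list matches the escaped code points in the "FOO is set to" line of the en_US.utf8 run.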
