Skip to content

Add compiler support for raw pointers#724

Closed
ghost wants to merge 6 commits intotrunkfrom
unknown repository
Closed

Add compiler support for raw pointers#724
ghost wants to merge 6 commits intotrunkfrom
unknown repository

Conversation

@ghost
Copy link
Copy Markdown

@ghost ghost commented Jul 28, 2016

This PR takes part of #652, completes it and gets it ready for review. We have several use cases at Jane Street where this feature is a pre-requisite and we'd like to have it for 4.04.

Synopsis

This patch adds primitives for reading and writing 1, 2, 4 and 8 bytes integers from/to memory outside of the OCaml heap. Memory locations are naturally represented as native integers, which are the same size as pointers in OCaml.

The new primitives are for advanced users only and are not yet exposed in the stdlib. We have some ideas to type-check compiler primitives and we'll be working on this for the next version of the compiler, which will address concerns expressed in the original PR.

Motivation

As mentioned in the original PR, this allows to manipulate more C data structures directly from OCaml.

At Jane Street we have at least two use cases for this:

1. C bindings

We have some code binding C function that tries to manipulate C pointers in OCaml. While this allow to write less C code, it is often hard to reason about correctness regarding the GC. This feature would improve this as the GC doesn't look at these pointer at all.

2. Lighter interface to C memory

In several places, we'd like a lighter interface than bigarrays to C memory. One reason is that we have some code where the bigarray indirection has a non-negligible cost. With this patch, we would first assert that the pointer is 2-aligned, then store it in an integer. When using the primitives we would convert it back to a nativeint and the compiler will be able to get rid of the temporary boxing.

Another reason is that we often know exactly the scope of such a bigarray and we'd prefer to release the memory explicitly rather than rely on the GC to do it, especially if the memory buffer is short lived.

We thought about just using regular OCaml string allocated outside the heap. However this is unsafe if the external memory can be freed.

Details

Several %-primitives are added: %load{8,16,32,64} for loading integers and %store{8,16,32,64} for storing them. These primitives take a pointer represented as a nativeint. The types used for the integers to read/write are the same as for the string/bigstring primitives: char, int, int32 and int64.

For the byte-code stubs, given that they are very similar to the string primitives, I moved the code for reading/writing from str.c to inline functions in byterun/get_set.h and used them in both str.c and the new ptr.c file.

I also copied the tests for the string primitives in testsuite/tests/prim-bigstring/string_access.ml as testsuite/tests/prim-bigstring/ptr_access.ml and adapted them.

Notes

The original patch also added primitives for aligned reads. I wasn't totally sure what to do for the byte-code versions, and in order to keep this PR simple I didn't included them. I leave this part for future work. The current primitives already allow to do quite a lot.

I also renamed the new constructors for Lambda.primitive to be more consistent with the current string/bigstring primitives.

@bobot I tried to start my work from the branch you sent me, but there was too much refactoring. In order to keep this patch small it was easier for me to start from the original PR. I let you submit your refactoring in another PR.

@mshinwell mshinwell added this to the 4.04 milestone Jul 28, 2016
@mshinwell
Copy link
Copy Markdown
Contributor

One other point: I don't yet know of a compelling argument as to why a raw pointer API should be exposed within the compiler distribution. It seems a good candidate for a separate library for expert users only.

@alainfrisch
Copy link
Copy Markdown
Contributor

I don't yet know of a compelling argument as to why a raw pointer API should be exposed within the compiler distribution.

It's weird to have compiler/runtime support for primitives which are not exposed in the compiler distribution. Adding them to stdlib to Obj is a good way to document the types of new primitives and to ensure that external higher-level libraries will properly break at compile-time if the types of these primitives change in the future. It's not because the feature is for expert users only that they should not be properly documented.

What's the argument against moving these external definitions from the new test into Obj?

@mshinwell
Copy link
Copy Markdown
Contributor

@alainfrisch As Jeremie said, we have an embryonic plan (which we may be able to work on quite soon) to do some kind of type checking for all (or close to all) primitives, rather than just these new ones. If that fails to work, we could consider adding externals for them, but I kind of think we have to make it work given the existing situation we find ourselves in with the string and bigstring primitives. It also seems a pretty robust solution---you can check everyone's code, as opposed to being in the situation where some people use the primitives and some people use the "API" and only the latter uses are checked.

I don't think we should be providing substandard APIs, even in Obj, since there is a high risk they become the de facto standard.

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

@alainfrisch I agree that they should be documented, but it doesn't have to be in the stdlib.

For breaking properly at compile-time, we'll be working on adding a warning in the compiler to type-check %-primitives. Given that users often redefine %-primitives themselves, this will have more impact than having a default declaration in the stdlib.

Basically we wanted to keep this patch simple. And it's not clear either that this kind of operations should be exposed in the stdlib.

@alainfrisch
Copy link
Copy Markdown
Contributor

What's the benefit of type-checking %-primitives rather than exposing them in the stdlib? Do you plan to support features which cannot be expressed in the OCaml type system?

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

The benefit is that it will catch errors when %-primitives are defined/redefined outside of the stdlib, which is not uncommon

@alainfrisch
Copy link
Copy Markdown
Contributor

Why would people prefer to redefine these primitives if they are properly exposed?

@chambart
Copy link
Copy Markdown
Contributor

By the way if we want to expose every primitive, we might consider adding a new module for that (CamlInternalPrimitives ?)

@mshinwell
Copy link
Copy Markdown
Contributor

There's always the chance that someone will do it. Also, please see what I said in the first comment above: there is already a problem that we need to fix in this regard.

@chambart
Copy link
Copy Markdown
Contributor

@mshinwell what kind of type would you like to add to primitives ? As they are usually redefined to circumvent the type system, I'm not sure what can effectively be asserted on them.

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 28, 2016

Adding more primitives without types will encourage people to declare their own (possibly wrong) types for them. A hope that there might one day be a type checker for primitives doesn't seem like a compelling reason to make the problem worse in the short term.

So, assuming that it's possible to express the types of the new primitives in OCaml, I agree with @alainfrisch that it would be better to expose suitable types for these new primitives.

@mshinwell
Copy link
Copy Markdown
Contributor

@yallop You seem to have missed in your analysis that there are other benefits to adding these new primitives, without any API; such benefits may outweigh the disadvantages.

I had another idea just now that could help with the existing problem as well: what about adding a compiler flag such that, if the flag is not specified (the default), any use of a %-primitive is rejected? (Then do what @chambart suggests and expose a single interface containing all the primitives.)

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

@chambart the same type as what you would write in an ml/mli file. This would benefit many primitives. And when you want to expose "%identify" with a magic type, you'd have to disable the warning

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 28, 2016

@mshinwell: if so, perhaps it would be helpful to say what they are.

@mshinwell
Copy link
Copy Markdown
Contributor

@yallop Some of them are enumerated in @diml 's description at the top of this pull request...

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 28, 2016

I read it, and didn't see any disadvantage-outweighing benefits. What have I missed?

@mshinwell
Copy link
Copy Markdown
Contributor

I tend to think the advantages outweigh the disadvantages given the other things I pointed out above.

However I am interested in pursuing a little this idea of having a flag to disallow % primitives, and then just having a normal interface. Are there any opinions on that?

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

For the AppVeyor failure, I tried on a Windows box and the tests are passing. I'm not sure why this is failing in AppVeyor

@bobot
Copy link
Copy Markdown
Contributor

bobot commented Jul 28, 2016

However I am interested in pursuing a little this idea of having a flag to disallow % primitives, and then just having a normal interface. Are there any opinions on that?

I think it is a very good idea.

The original patch also added primitives for aligned reads. I wasn't totally sure what to do for the byte-code versions, and in order to keep this PR simple I didn't included them.

You can do like in the original PR. If the user says it is aligned, we can still do as if it is unaligned. Less efficient but still true. Finally I think the use of memcpy for unaligned access was indeed correct in the original PR.

Edit: replace wrong by correct...

@alainfrisch
Copy link
Copy Markdown
Contributor

what I said in the first comment above: there is already a problem that we need to fix in this regard.

Is the problem the "the existing situation we find ourselves in with the string and bigstring primitives"? If so, I think the best way to address this problem is to expose these primitives in the stdlib and tell people to use them (and also ask them to forgive us for not providing the functions in the first place, thus forcing them to use an even more fragile style).

We have a type system which works well. We should we complexify the compiler when this is just about exposing primitives in the stdlib?

I cannot see how exposing proper OCaml functions/types is worse than asking users to use directly the primitives, which is more fragile and then gives us incentives to reinvent the wheel.

@alainfrisch
Copy link
Copy Markdown
Contributor

Some of them are enumerated in @diml 's description at the top of this pull request...

I think everyone agrees that there are advantages in providing the primitives. I don't see any argument against exposing them properly (which is not difficult).

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

You can do like in the original PR. If the user says it is aligned, we can still do as if it is unaligned. Less efficient but still true

I was thinking that if you accidentally use an unaligned pointer with an aligned primitive, you'd get different behavior between byte-code and native-code which might makes things hard to debug. Thinking a bit more about this, we could just check that the pointer is aligned in the byte code stubs for aligned access and raise if it is not. We do something similar for string access: we always check for out of bounds accesses in byte-code.
In any case, I'm leaving this for another PR, I think this one has enough changes.

@mshinwell
Copy link
Copy Markdown
Contributor

mshinwell commented Jul 28, 2016

@alainfrisch What is your opinion on a flag to disallow % primitives?
I imagine we could do that for 4.04, default off, but announce that for 4.05 the default would be on. (And for 4.05 provide a full interface to all primitives.)

@avsm
Copy link
Copy Markdown
Member

avsm commented Jul 28, 2016

@diml:

For the AppVeyor failure, I tried on a Windows box and the tests are passing. I'm not sure why this is failing in AppVeyor

Looks like there is some issue with the malloc_stubs.o -- a symlink related issue perhaps?

make[7]: *** [/cygdrive/c/projects/ocaml/testsuite/makefiles/Makefile.several:102: run-file] Error 2
 => failed
 ... testing 'ptr_access.ml': ocamlcFatal error: exception Arg.Bad("don't know what to do with malloc_stubs.o")
make[7]: *** [/cygdrive/c/projects/ocaml/testsuite/makefiles/Makefile.several:102: run-file] Error 2
 => failed
 ... testing 'string_access.ml': ocamlcFatal error: exception Arg.Bad("don't know what to do with malloc_stubs.o")
make[7]: *** [/cygdrive/c/projects/ocaml/testsuite/makefiles/Makefile.several:102: run-file] Error 2
 => failed
make[2]: Entering directory '/cygdrive/c/projects/ocaml/testsuite'

@bobot
Copy link
Copy Markdown
Contributor

bobot commented Jul 28, 2016

I was thinking that if you accidentally use an unaligned pointer with an aligned primitive, you'd get different behavior between byte-code and native-code which might makes things hard to debug.

It is an undefined behavior so anything can happen, it would be the same thing if one access not in a valid range. I think it is not a problem to have different behavior in this case.

In any case, I'm leaving this for another PR, I think this one has enough changes.

Could you still take it into account in the nomenclature name (add unaligned).

@mshinwell
Copy link
Copy Markdown
Contributor

I don't see why it's necessary to change the names. To me, "unaligned" implies the load is expected to be unaligned. (Looking at it slightly differently, there is no constraint on the pointer argument, so there probably isn't a need for an extra name.)

@ghost
Copy link
Copy Markdown
Author

ghost commented Jul 28, 2016

Yeah, that was my reasoning as well. We don't have unaligned in the name of other primitives for memory access.

@avsm not sure, they should be no symlink involved here. It's a bit annoying that we can't see the command line

@alainfrisch
Copy link
Copy Markdown
Contributor

However I am interested in pursuing a little this idea of having a flag to disallow % primitives, and then just having a normal interface. Are there any opinions on that?

I'm not sure that distinguishing %-primitives from normal primitives is worthwhile.

What about adding an "use of unsafe feature" warning, which would be triggered by any reference to a module or funciton marked with @@ocaml.unsafe attribute (same plumbing as for deprecated)? Of course, all "external" declarations would imply this attribute.

@bobot
Copy link
Copy Markdown
Contributor

bobot commented Jul 28, 2016

To me, "unaligned" implies the load is expected to be unaligned

Since usually in C, all pointers are aligned to their underlying type, it seems strange but the type of the argument is perhaps sufficient.

I tried to understand why we wrote all this code instead of using memcpy. I though that was because memcpy work only on aligned pointer, but that makes no sense:

The memcpy function copies n characters from the object pointed to by s2 into the
object pointed to by s1. If copying takes place between objects that overlap, the behavior
is undefined.

It is just that gcc could optimize memcpy if he knows how the pointers are aligned. So at the end, get_set.h can be simplified into:

static inline value mem_get64_unaligned(unsigned char* ptr){
  uint64_t res;
  memcpy(ptr,&res,sizeof int64_t);
  return caml_copy_int64(res);
}

and for the aligned case, just the prototype change:

static inline value mem_get64_aligned(uint64_t* ptr){

@ghost
Copy link
Copy Markdown
Author

ghost commented Mar 31, 2017

@DemiMarie last time we discussed this at a developper meeting @xavierleroy asked that we write benchmarks to compare the primitives to C stubs.

I've been looking at this again for a project I'm working on. I wrote some benchmarks using various ways to access external memory: https://github.com/diml/memory-access-bench. On the example I'm using, using C stubs is about 3 to 4 times slower than the other methods (using bigarrays, raw pointers or just bytes). The other methods give approximately the same results.

Numbers:

  Name                                                Time/Run   mWd/Run   Percentage
 --------------------------------------------------- ---------- --------- ------------
  [mem_bench.ml:With_bigarray]                          6.83ns                 15.29%
  [mem_bench.ml:With_bytes]                             5.51ns                 12.33%
  [mem_bench.ml:With_raw_pointers]                      8.03ns                 17.99%
  [mem_bench.ml:With_2aligned_raw_pointers_as_int]      6.71ns                 15.02%
  [mem_bench.ml:With_raw_pointers_c]                   23.04ns                 51.59%
  [mem_bench.ml:With_2aligned_raw_pointers_c]          44.66ns    12.00w      100.00%

@damiendoligez damiendoligez modified the milestones: long-term, 4.07-or-later Sep 28, 2017
@damiendoligez damiendoligez removed this from the consider-for-4.07 milestone Jun 4, 2018
@alainfrisch
Copy link
Copy Markdown
Contributor

@xavierleroy Considering the benchmark results above, are you still opposed to this PR?

@alainfrisch
Copy link
Copy Markdown
Contributor

My opinion on this PR:

  • It is a nice addition, that will allow faster FFI in some cases.
  • The primitives should be exposed in a new module of the Stdlib, with an appropriate disclaimer.
  • Generalize the "deprecated" alert #1804 will give a proper way to mark this module has being unsafe.
  • The proposal goes against the desire to restrict to fully portable features in the stdlib (the primitive won't be supported by Javascript backends), but I think it's ok.

@xavierleroy
Copy link
Copy Markdown
Contributor

Yes, I am still very very much opposed to exposing these primitives in the Stdlib, and very much (but not very very much) opposed to implementing those primitives in the compiler without exposing them. The C implementation looks fast enough to me. So, whomever wants PEEK and POKE can have them via their own C implementation, but they should not pollute the core system.

@alainfrisch
Copy link
Copy Markdown
Contributor

Let's assume that "very much opposed" from @xavierleroy is equivalent to a veto. So I'm closing the PR (but anyone can reopen if they feel like doing it!).

@ghost
Copy link
Copy Markdown
Author

ghost commented Oct 29, 2018

I'd like to re-open it, having the primitives implemented but not exposed is all I ask for.

This PR is simply about offering a lightweight alternative to bigarrays. Everything that can be done with these new primitives can be done with plain bigarrays as well, so I wouldn't say it is polluting the system. However, with the new primitives the user has better control over the management of the external memory and this is important in many use cases. It seems to me that using C primitives is not a good solution: it's not only slower, it also makes the assembly code harder to read and follow. Having good assembly code is important when using tools such as gdb or perf.

@ghost ghost reopened this Oct 29, 2018
@alainfrisch
Copy link
Copy Markdown
Contributor

FWIW, I'm still opposed (but not "very much opposed") to exposing primitives without proper documentation as part of the stdlib. The slight gain of making it even more difficult for end users to use them is, in my opinion, largely compounded by (1) the chances of having segfault because a change in the API of these primitives between two versions of OCaml goes unnoticed and breaks third-party libraries built on top of them; (2) the extra burden of having to document the primitives somewhere (whereas a mention in e.g. the Obj module is "self-explanatory"). Adding explicitly undocumented features with the intention to have third-party libraries use them seems very dangerous to me.

In other words, I don't buy the "safety through obscurity" argument.

@ghost
Copy link
Copy Markdown
Author

ghost commented Oct 29, 2018

These primitives are very simple, so it's unlikely their API will ever change. Additionally, there is one precedent with %obj_as_pointer and so far it has been working well.

@alainfrisch
Copy link
Copy Markdown
Contributor

These primitives are very simple, so it's unlikely their API will ever change.

So even more, if the API is stable and ready to be used by third-party libraries, it should be documented, and doing so by exposing OCaml types in a relatively hidden and scary place of the stdlib (e.g. Obj) is better than not documenting it at all, IMO.

@ghost
Copy link
Copy Markdown
Author

ghost commented Oct 29, 2018

Perhaps, but if adding these to the stdlib is controversial, then I'd much rather have the primitives undocumented that not at all. If they are present and undocumented, then I have the choice to use them at my own risk. If they are not present at all, I cannot do anything.

@alainfrisch
Copy link
Copy Markdown
Contributor

if adding these to the stdlib is controversial, then I'd much rather have the primitives undocumented that not at all

Agreed (hence my "not very much opposed").

@alainfrisch
Copy link
Copy Markdown
Contributor

alainfrisch commented Oct 29, 2018

Anyway, we need a maintainer ready to champion this controversial PR. @mshinwell ?

@ghost
Copy link
Copy Markdown
Author

ghost commented Dec 6, 2018

I thought a bit more about this PR, in the end there are two cases where I hope it would help:

  • memory management: for applications that allocate a lot of bigarrays, the gc doesn't always behave as we would like and we would prefer to treat external memory blocks the same as other external resources such as file descriptors: i.e. allocate and release them manually. This is especially true when bigarrays are used to represent memory not coming from the RAM where allocation and freeing is not done via malloc/free.

  • reducing indirections: we often have an additional layer of abstraction on top of bigarrays, meaning that memory accesses go through two indirections rather than just one. Having raw pointers would allow to remove one indirection

At this point, the actual benefit we would get by using raw pointers rather than bigarrays in such applications in still speculative. I'll do more research in order to quantify all this better. In the meantime, I'm closing this PR so that it is out of the way.

@ghost ghost closed this Dec 6, 2018
@bluddy
Copy link
Copy Markdown
Contributor

bluddy commented Dec 6, 2018

I'm not really sure why this PR was shot down.

Yes, we love OCaml being a high-level language, but at the same time, we want to squeeze as much performance out of it as possible, and this PR allows to improve performance at the underlying levels without even contradicting the use of functional paradigms at the higher levels! Between Rust and Haskell getting better every day, the performance competition is fierce, and this is a change that allows us to improve without requiring massive type theory or engineering effort, so why are we rejecting it? 99% of users will not see any effect of this change other than their code becoming faster. It's like we're deliberately trying to hamper our own efforts, and it just doesn't make sense to me.

Would it help if we placed this module in a place that is already barely touched by the average user, such as Obj.Ptr?

@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 6, 2018

Designing a language is not just growing a list of things that can be done in the language, but also shaping things that cannot be done, or are hard to do or discouraged, for everyone to be able to reason about any OCaml program. "Memory safety" is one of the golden guarantees provided by OCaml programs (even though the C interface can defeat it), and it is defined by the absence of certain undesirable behaviors -- forbidding certain thing. This PR was shot down because it introduces behavior, in plain OCaml programs, that go against the reasoning principles offered by memory safety.

It is possible to improve performance without endangering memory safety, although it may require more work on the language design front. Doing more design work for stronger guarantees is a usual tradeoff for typed functional languages (algebraic datatypes instead of unions, GADTs instead of unsafe casts, modules instead of code duplication, sensible concurrent memory models instead of arbitrary racy accesses, etc.), so it's not particularly surprising that this approach also won the argument for this particular PR.

@mshinwell
Copy link
Copy Markdown
Contributor

@gasche I don't think that the approach of only providing absolutely memory-safe operations has actually won the argument per se. I'm also uncomfortable with the use of the phrase "shot down", which in this case, strikes me as unnecessarily strong language. The GPR has been voluntarily withdrawn for further experimentation and thereafter more rigorous justification as to why these kinds of operations are being sought after. Some strong opinions have been expressed thus far, which are welcomed and noted, but we haven't actually (for example) all sat down and had what would probably be a fairly large-scale discussion. There are some difficult issues, not just relating directly to the provision of fast memory access operations, but problems with existing abstractions dealing with out-of-heap data (for example the long-running saga of manual deallocation of Bigarrays). I think it is fair to say that this GPR did not adequately consider the wider context.

My general thoughts thus far are that some kind of stratified approach, with a spectrum of safety vs. efficiency guarantees, is perhaps necessary to achieve the level of performance (or in the Bigarray case, manual control) that is being sought after without excessive type system complexity. However that can be discussed as and when this issue arises again.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.