Skip to content

Conversation

@MeirShpilraien
Copy link

@MeirShpilraien MeirShpilraien commented Dec 26, 2021

Redis Function Libraries

This PR implements Redis Functions Libraries as describe on: #9906.

Libraries purpose is to provide a better code sharing between functions by allowing to create multiple functions in a single command. Functions that were created together can safely share code between each other without worrying about compatibility issues and versioning.

Creating a new library is done using 'FUNCTION LOAD' command (full API is described below)

This PR introduces a new struct called libraryInfo, libraryInfo holds information about a library:

  • name - name of the library
  • engine - engine used to create the library
  • code - library code
  • description - library description
  • functions - the functions exposed by the library

When Redis gets the FUNCTION LOAD command it creates a new empty libraryInfo. Redis passes the CODE to the relevant engine alongside the empty libraryInfo. As a result, the engine will create one or more functions by calling 'libraryCreateFunction'. The new funcion will be added to the newly created libraryInfo. So far Everything is happening locally on the libraryInfo so it is easy to abort the operation (in case of an error) by simply freeing the libraryInfo. After the library info is fully constructed we start the joining phase by which we will join the new library to the other libraries currently exist on Redis. The joining phase make sure there is no function collision and add the library to the librariesCtx (renamed from functionCtx). LibrariesCtx is used all around the code in the exact same way as functionCtx was used (with respect to RDB loading, replicatio, ...). The only difference is that apart from function dictionary (maps function name to functionInfo object), the librariesCtx contains also a libraries dictionary that maps library name to libraryInfo object.

New API

FUNCTION LOAD

FUNCTION LOAD <ENGINE> <LIBRARY NAME> [REPLACE] [DESCRIPTION <DESCRIPTION>] <CODE>
Create a new library with the given parameters:

  • ENGINE - REPLACE Engine name to use to create the library.
  • LIBRARY NAME - The new library name.
  • REPLACE - If the library already exists, replace it.
  • DESCRIPTION - Library description.
  • CODE - Library code.

Return "OK" on success, or error on the following cases:

  • Library name already taken and REPLACE was not used
  • Name collision with another existing library (even if replace was uses)
  • Library registration failed by the engine (usually compilation error)

Changed API

FUNCTION LIST

FUNCTION LIST [LIBRARYNAME <LIBRARY NAME PATTERN>] [WITHCODE]
Command was modified to also allow getting libraries code (so FUNCTION INFO command is no longer needed and removed). In addition the command gets an option argument, LIBRARYNAME allows you to only get libraries that match the given LIBRARYNAME pattern. By default, it returns all libraries.

INFO MEMORY

Added number of libraries to INFO MEMORY

Commands flags

DENYOOM flag was set on FUNCTION LOAD and FUNCTION RESTORE. We consider those commands as commands that add new data to the dateset (functions are data) and so we want to disallows to run those commands on OOM.

Removed API

Lua engine changes

When the Lua engine gets the code given on FUNCTION LOAD command, it immediately runs it, we call this run the loading run. Loading run is not a usual script run, it is not possible to invoke any Redis command from within the load run. Instead there is a new API provided by library object. The new API's:

  • redis.log - behave the same as redis.log
  • redis.register_function - register a new function to the library

The loading run purpose is to register functions using the new redis.register_function API. Any attempt to use any other API will result in an error. In addition, the load run is has a time limit of 500ms, error is raise on timeout and the entire operation is aborted.

redis.register_function

redis.register_function(<function_name>, <callback>, [<description>])
This new API allows users to register a new function that will be linked to the newly created library. This API can only be called during the load run (see definition above). Any attempt to use it outside of the load run will result in an error. The parameters pass to the API are:

  • function_name - Function name (must be a Lua string)
  • callback - Lua function object that will be called when the function is invokes using fcall/fcall_ro
  • description - Function description, optional (must be a Lua string).

Example

The following example creates a library called lib with 2 functions, f1 and f1, returns 1 and 2 respectively:

local function f1(keys, args)
    return 1
end

local function f2(keys, args)
    return 2
end

redis.register_function('f1', f1)
redis.register_function('f2', f2)

Notice: Unlike eval, functions inside a library get the KEYS and ARGV as arguments to the functions and not as global.

Technical Details

On the load run we only want the user to be able to call a white list on API's. This way, in the future, if new API's will be added, the new API's will not be available to the load run unless specifically added to this white list. We put the while list on the library object and make sure the library object is only available to the load run by using lua_setfenv API. This API allows us to set the globals of a function (and all the function it creates). Before starting the load run we create a new fresh Lua table (call it g) that only contains the library API (we make sure to set global protection on this table just like the general global protection already exists today), then we use lua_setfenv to set g as the global table of the load run. After the load run finished we update g metatable and set __index and __newindex functions to be _G (Lua default globals), we also pop out the library object as we do not need it anymore. This way, any function that was created on the load run (and will be invoke using fcall) will see the default globals as it expected to see them and will not have the library API anymore.

An important outcome of this new approach is that now we can achieve a distinct global table for each library (it is not yet like that but it is very easy to achieve it now). In the future we can decide to remove global protection because global on different libraries will not collide or we can chose to give different API to different libraries base on some configuration or input.

Notice that this technique was meant to prevent errors and was not meant to prevent malicious user from exploit it. For example, the load run can still save the library object on some local variable and then using in fcall context. To prevent such a malicious use, the C code also make sure it is running in the right context and if not raise an error.

  • Implement FUNCTION LIST pattern matching
  • Count libraries memory overhead
  • Functions flags, decide if we want it and which flags we want - delayed to followup PR and will be discussed on Redis Functions Flags #10025

Libraries purpose is to provide a better code sharing between functions by allowing
to create multiple functions in a single command. Functions that were created together
can safely share code between each other without worrying about compatibility issues
and versioning.

Creating a new library is done using 'FUNCTION LOAD' command (full API is described below)

This PR introduces a new struct called libraryInfo, libraryInfo holds information
about a library:
* name - name of the library
* engine - engine used to create the library
* code - library code
* description - library description
* functions - the functions exposed by the library

When Redis gets the `FUNCTION LOAD` command it creates a new empty libraryInfo.
Redis passes the `CODE` to the relevant engine alongside the empty libraryInfo.
As a result, the engine will create one or more functions by calling 'libraryCreateFunction'.
The new funcion will be added to the newly created libraryInfo. So far Everything is
happening locally on the libraryInfo so it is easy to abort the operation (in case
of an error) by simply freeing the libraryInfo. After the library info is fully
constructed we start the joining phase by which we will join the new library to
the other libraries currently exist on Redis. The joining phase make sure
there is no function collision and add the library to the librariesCtx (renamed
from functionCtx). LibrariesCtx is used all around the code in the exact same way
as functionCtx was used (with respect to RDB loading, replicatio, ...). The
only difference is that apart from function dictionary (maps function name to
functionInfo object), the librariesCtx contains also a libraries dictionary that
maps library name to libraryInfo object.

`FUNCTION LOAD <ENGINE> <LIBRARY NAME> [REPLACE] [DESCRIPTION <DESCRIPTION>] <CODE>`
Create a new library with the given parameters:
* ENGINE - REPLACE Engine name to use to create the library.
* LIBRARY NAME - The new library name.
* REPLACE - If the library already exists, replace it.
* DESCRIPTION - Library description.
* CODE - Library code.

Return "OK" on success, or error on the following cases:
* Library name already taken and REPLACE was not used
* Name collision with another existing library (even if replace was uses)
* Library registration failed by the engine (usually compilation error)

`FUNCTION LIST [LIBRARYNAME <LIBRARY NAME PATTERN>] [WITHCODE]`
Command was modified to also allow getting libraries code (so `FUNCTION INFO` command
is no longer needed and removed). In addition the command gets an option argument,
`LIBRARYNAME` allows you to only get libraries that match the given `LIBRARYNAME`
pattern. By default, it returns all libraries.

* FUNCTION CREATE
* FUNCTION INFO

When the Lua engine gets the code given on `FUNCTION LOAD` command, it immediately runs
it, we call this run the loading run. Loading run is not a usual script run, it is not
possible to invoke any Redis command from within the load run. The only Redis API provided
to the load run is:
* redis.log
* redis.sha1hex
* redis.error_reply
* redis.status_reply
* redis.register_function

The loading run purpose is to register functions using the new `redis.register_function` API.
Any attempt to use any other API will result in an error.

`redis.register_function(<function_name>, <callback>, [<description>])`
This new API allows users to register a new function that will be linked to the newly created library.
This API can only be called during the load run (see definition above). Any attempt to use it
outside of the load run will result in an error. The parameters pass to the API are:
* function_name - Function name (must be a Lua string)
* callback - Lua function object that will be called when the function is invokes using fcall/fcall_ro
* description - Function description, optional (must be a Lua string).

The following example creates a library called `lib` with 2 functions, `f1` and `f1`, returns 1 and 2 respectively:
```
local function f1(keys, args)
    return 1
end

local function f2(keys, args)
    return 2
end

redis.register_function('f1', f1)
redis.register_function('f2', f2)
```

Notice: Unlike `eval`, functions inside a library get the KEYS and ARGV as arguments to the functions
and not as globals.
@oranagra oranagra linked an issue Dec 26, 2021 that may be closed by this pull request
@oranagra oranagra added 7.0-must-have state:major-decision Requires core team consensus labels Dec 26, 2021
@oranagra
Copy link
Member

@MeirShpilraien i've set this to resolve the issue that describes it, assuming that all the ideas specified there are handled here.
please skim over the description and discussion there, and list here (maybe in todo section of the top comment) if there's something not handled yet.

@MeirShpilraien
Copy link
Author

@oranagra function flags was missing, added it to todo list.

meir added 2 commits December 26, 2021 23:01
1. Added WITHCODE option that allows to get library code
2. Added LIBRARY name that allows to get only libraries that
   matches the given library name pattern
1. Added libraries memory overhead to total functions overhead.
2. Added number of libraries to 'info memory'
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One concern i have about naming, is the use of the term "library" without a context.
in the past i argued the same about "engine".
the term "function" is a lost cause (in the context of redis, a "function" is a "script function".
but maybe "library" should be called "functions library" (or "functionsLib")?
same as we called "engine" a "script engine"

Meir Shpilraien (Spielrein) and others added 4 commits December 28, 2021 09:35
Co-authored-by: Oran Agra <oran@redislabs.com>
1.  Nullify iterators so in case we have other (later) goto done, they won't break.
2.  On librariesCtx join, used tmp list to keep old libraries instead on tmp libraraiesCtx.
3.  On libraryJoin, if replace is used, do not search for collids libraries because we
    there is no such.
4.  Avoid nullify dictionary entry variable.
5.  Added a comment about functions name collision to FUNCTION HELP
6.  Better naming (li -> old_li)
7.  Documented 'luaEngineLoadHook'
8.  Added argument description to function-list.json
9.  Avoid using setDeferredArrayLen if possible
10. Change error message to be generic to scripts (when running Lua api outside of script invocation).
11. Used global if statement to decide which argument to use to invoke 'lua_pcall'
@MeirShpilraien
Copy link
Author

MeirShpilraien commented Dec 28, 2021

@oranagra fixed the comments.

One concern i have about naming, is the use of the term "library" without a context.
in the past i argued the same about "engine".
the term "function" is a lost cause (in the context of redis, a "function" is a "script function".
but maybe "library" should be called "functions library" (or "functionsLib")?
same as we called "engine" a "script engine"

I think its ok to call it just library, but I also thought its ok to just say engine so it make sense we disagree :)
that said, I do not mind renaming. How about:

  • libraryInfo -> functionLibInfo
  • librariesCtx -> functionLibsCtx

Let me know if you are OK with it.

@oranagra
Copy link
Member

@MeirShpilraien i'm not sure if just renaming these two is enough. maybe there are other functions (procedures) that should be renamed, comments that should be updated, variable names or command arguments, etc.

i suppose that in the context of a FUNCTION command, the term "library" is ok.
and same goes for anything in functions.c.
if you wanna consider this change, please grep the diff in this pr for the word "library". skip functions.c and all the FUNCTION json files, and see what you're left with.

Meir Shpilraien (Spielrein) and others added 4 commits December 28, 2021 18:18
Co-authored-by: Oran Agra <oran@redislabs.com>
* libraryInfo -> functionLibInfo
* librariesCtx -> functionsLibCtx
* lib_ctx* -> functions_lib_ctx*
* All functions api: librariesCtx* -> functionsLib*
* curr_lib_ctx -> curr_functions_lib_ctx
* libraries -> functions_libraries
@MeirShpilraien
Copy link
Author

@oranagra let me know what you think on the last PR (renaming)

@oranagra
Copy link
Member

LGTM (only looked at the commit comment)
i assume you used this method to spot them:

if you wanna consider this change, please grep the diff in this pr for the word "library". skip functions.c and all the FUNCTION json files, and see what you're left with.

@MeirShpilraien
Copy link
Author

@oranagra I actually look everywhere where any function API is used and then find out what need to be renamed. Will do another round to make sure I did not miss anything.

@oranagra
Copy link
Member

@redis/core-team this is ready for merge. please read the top comment and approve

@oranagra oranagra added approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus and removed state:major-decision Requires core team consensus labels Dec 29, 2021
MeirShpilraien pushed a commit to MeirShpilraien/redis-doc that referenced this pull request Dec 30, 2021
The PR add Redis Programability documentation:
* Documentation to all FUNCTION commands (notice that FUNCTION INFO is not documented because its about to be dropped):
  * FUNCTION LOAD (current named FUNCTION CREATE but will be renamed after [this](redis/redis#10004) will be merged)
  * FUNCTION DELETE
  * FUNCTION DUMP
  * FUNCTION FLUSH
  * FUNCTION KILL
  * FUNCTION LIST
  * FUNCTION RESTORE
  * FUNCTION FLUSH
* Introduction to Redis Functions page
* Rearrangement of `EVAL` command page
  * All the information on `EVAL` command page was moved and split between 2 pages:
    * Introduction to Eval scripts
    * Lua API for Redis Scripts
* New Redis programibality page

The PR also require changes on redis-io that was made [here](redis/redis-io#261)
Copy link
Collaborator

@yossigo yossigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with one issue.

@MeirShpilraien Perhaps we should completely eliminate redis from the namespace when handling the load run? I think it might help prevent some confusion if we just expose a single function creation function.

@MeirShpilraien
Copy link
Author

@yossigo there are API's on redis object that we do allow on FUNCTION LOAD context, for example redis.log. Do you think we should remove it?

@oranagra
Copy link
Member

we'll soon probably wanna add redis.version() so the lib can register different functions / flags depending on the version.

@yossigo
Copy link
Collaborator

yossigo commented Dec 30, 2021

@MeirShpilraien I think we should opt-in on what we provide and not opt-out, and using a different name might make better distinction about these two run modes.

@MeirShpilraien
Copy link
Author

@yossigo I see your point and I agree, so I guess we can introduce a new API, library, that will be available only on the load run and we can put there everything that is available to the load run (library.register_function, library.log, ...). WDYT?

Copy link
Member

@itamarhaber itamarhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't CRed, concept approved

@oranagra oranagra requested a review from madolson January 3, 2022 08:14
The load run with only have 'library' API that contains only:
* register_function
* log
@MeirShpilraien
Copy link
Author

@oranagra @yossigo , added library API and update top comment, let me know what you think.

@MeirShpilraien
Copy link
Author

MeirShpilraien commented Jan 3, 2022

@yossigo @oranagra @itamarhaber @madolson
Please notice that I added another commit (the last commit) that add DENYOOM flag to FUNCTION LOAD and FUNCTION RESTORE. Because we consider functions as data and those commands potentially will add new functions (data), I believe it make sense. Let me know what you think.

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the top level comment and read some of the code. I agree with all the decisions made.

Meir Shpilraien (Spielrein) and others added 2 commits January 6, 2022 10:36
Co-authored-by: Oran Agra <oran@redislabs.com>
1. rename 'library->redis'
2. better documentation for 'luaRegisterGlobalProtectionFunction'
@MeirShpilraien
Copy link
Author

@oranagra rename library -> redis, updated top comment.

Co-authored-by: Oran Agra <oran@redislabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Redis Functions - Libraries

5 participants