Add efficient webhook support to your Lemmy instance. Especially useful for bots and AutoModerators.
- Lemmy Webhooks
Make the Docker image part of your docker-compose stack by adding this to your compose file:
services:
  # ...
  redis:
    image: redis
    ports: # you don't need to bind ports if you don't want to
      - 6379:6379
  webhooks:
    image: ghcr.io/rikudousage/lemmy-webhook:latest
    environment:
      - LEMMY_HOST=postgres # the hostname of the postgres database
      - REDIS_HOST=redis # the hostname of the redis server, you can use the above redis container if you define it as part of this stack
      - LEMMY_PASSWORD=superSecr3t # the password to the postgres database
      - API_REGISTRATION_ENABLED=1 # whether to allow users to register themselves via the api
      - CORS_ALLOW_ORIGIN=^.*$$ # a regex for cors (you need to escape $ with another $)
      - LARGE_PAYLOAD_SIZE=1024 # payloads larger than this size (in bytes) will be stored in a temporary table instead of fed directly to the consumer, default is 4096. If set to 0, all payloads will be stored.
    ports:
      - 8080:80 # you can skip this, if you don't use the management api
    volumes:
      - ./volumes/database:/opt/database # bind a directory where the SQLite database will be created

Afterwards, run docker-compose up -d and you're done!
The LARGE_PAYLOAD_SIZE setting is important to avoid "payload string too long" errors in Postgres. By default, Postgres allows 8000 bytes in the payload. You can set this to 0 to send every payload into the table first.
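The size check described above can be pictured roughly as follows. This is only an illustrative Python sketch, not the project's actual code: the function name `should_store_in_table` is hypothetical, and it assumes the size is measured on the UTF-8 encoded payload.

```python
def should_store_in_table(payload: str, large_payload_size: int = 4096) -> bool:
    """Return True if the payload should go through the temporary table.

    Hypothetical helper that mirrors the documented LARGE_PAYLOAD_SIZE rules.
    """
    if large_payload_size == 0:
        # 0 means every payload is stored in the table first.
        return True
    # Assumption: the size is counted in bytes of the UTF-8 encoding.
    return len(payload.encode("utf-8")) > large_payload_size


small = '{"id": 1}'
large = "x" * 5000

print(should_store_in_table(small))     # False: well under the 4096 default
print(should_store_in_table(large))     # True: 5000 bytes exceeds 4096
print(should_store_in_table(small, 0))  # True: 0 stores everything
```

Keeping the threshold well below the Postgres limit leaves headroom for multi-byte characters, which is why the compose example uses 1024.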
You can either use the API or insert webhooks directly into the database. You can read more about the API in a separate readme.
The table is quite simple and consists of these fields:
- url - the URL of the webhook
- method - can be GET, POST, PATCH, DELETE, PUT (taken from the RequestMethod enum)
- body_expression (optional) - an expression that will be converted to JSON and sent as a body of the request, more on expressions below
- filter_expression (optional) - an expression that must evaluate to true if this webhook is to run, more on expressions below
- object_type - the type of object this webhook is interested in, currently:
  - post
  - comment
  - instance
  - private_message (only INSERT operation)
  - person
  - registration_application
  - private_message_report
  - local_user
  - community_follower - a subscription by a user to a community
- operation (optional) - the kind of operation this webhook is interested in, can be INSERT, UPDATE, DELETE (taken from the DatabaseOperation enum)
- headers (optional) - a JSON object with keys as header names and values as header values
- enhanced_filter (optional) - an expression that must evaluate to true if this webhook is to run, more on expressions below
- enabled - whether the webhook is enabled or not
Expressions allow better interaction with the webhooks, for example filtering and setting the request body.
The basic syntax is very similar to JavaScript.
In every expression you have access to the data variable which contains the fields of the object the webhook was triggered for.
This is an example data object:
{
  "timestamp": {
    "date": "2024-01-05 23:15:09.811926",
    "timezone_type": 1,
    "timezone": "+00:00"
  },
  "operation": "INSERT",
  "schema": "public",
  "table": "comment",
  "data": {
    "id": 4763628,
    "creatorId": 2,
    "postId": 4435272,
    "content": "teeest",
    "removed": false,
    "deleted": false,
    "apId": "http://changeme.invalid/52570b072a832e6a986330de",
    "local": true,
    "distinguished": false,
    "path": "0.123.456"
  },
  "previous": null
}

Note that the timestamp property is in fact a PHP DateTimeImmutable object, including its methods and properties; the above is just its JSON representation.
So for example, if you only want to trigger a webhook for comments by local users, you would use this as your filter expression:
data.data.local
The timestamp, operation, schema and table properties have the same structure for every type of object, but the data property varies
based on what you're being notified about. Here's a list of all table values currently possible and link to the DTO that will be passed as the
data property:
- post - PostData
- comment - CommentData
- instance - InstanceData
- private_message - PrivateMessageData
- person - PersonData
- registration_application - RegistrationApplicationData
- private_message_report - PrivateMessageReportData
- local_user - LocalUserData
- community_follower - CommunitySubscriptionData
If the operation is an UPDATE, you'll also get access to the previous property which contains the data from the previous version of the object.
If the operation is not an UPDATE, the previous property is null.
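On the receiving side, the previous property makes it easy to see what an UPDATE actually changed. The following is a minimal Python sketch under two assumptions: the webhook was configured to send the whole data object as its body, and the sample event values are made up for illustration. The helper name `changed_fields` is hypothetical.

```python
import json


def changed_fields(event: dict) -> dict:
    """Map each changed field to an (old, new) tuple; empty unless UPDATE."""
    previous = event.get("previous")
    if event.get("operation") != "UPDATE" or previous is None:
        return {}
    current = event["data"]
    return {
        key: (previous.get(key), value)
        for key, value in current.items()
        if previous.get(key) != value
    }


# Illustrative payload only; real events carry more fields.
event = json.loads("""
{
  "operation": "UPDATE",
  "schema": "public",
  "table": "comment",
  "data": {"id": 4763628, "content": "test", "path": "0.123.456"},
  "previous": {"id": 4763628, "content": "teeest", "path": "0.123.456"}
}
""")

print(changed_fields(event))  # {'content': ('teeest', 'test')}
```

For INSERT and DELETE events the function returns an empty dictionary, matching the rule that previous is null for those operations.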
There are two kinds of expressions, simple and enhanced. Enhanced expressions have access to additional functions
for interacting with the database, while simple expressions are limited to accessing only the data variable and
a few simple functions.
Simple expressions have access to these functions:
- lowercase(text) - returns the string converted to lowercase
- transliterate(text) - returns the string transliterated to standard latin characters:
  - example: transliterate("Hélľö, hów ärě ýöů?") -> Hello, how are you?
  - example: transliterate("𝐻𝐞𝒍𝓁𝓸 𝔱𝕙𝖊𝗋𝚎!") -> Hello there!
- merge(arrayOrDictionary1, arrayOrDictionary2, ..., arrayOrDictionaryN) - recursively merges an arbitrary number of arrays or dictionaries
- comment_parent_id(commentOrPath) - returns the comment's parent id as an integer or null if it's a top level comment, can accept either the whole comment data object, or just the path
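To give a feel for what two of these helpers do, here is a rough Python approximation. It is not the project's implementation: `comment_parent_id` below assumes the Lemmy path convention where the last segment is the comment's own id and the one before it is the parent, and `transliterate_approx` only approximates transliteration with Unicode normalization, so its results may differ from the real function for other scripts.

```python
import unicodedata
from typing import Optional, Union


def comment_parent_id(comment_or_path: Union[dict, str]) -> Optional[int]:
    """Return the parent comment id, or None for a top-level comment."""
    if isinstance(comment_or_path, dict):
        path = comment_or_path["path"]
    else:
        path = comment_or_path
    parts = path.split(".")
    # Assumption: "0.<id>" is top level, deeper paths end in "<parent>.<id>".
    if len(parts) < 3:
        return None
    return int(parts[-2])


def transliterate_approx(text: str) -> str:
    """Approximate transliteration: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(comment_parent_id("0.123.456"))        # 123
print(comment_parent_id({"path": "0.456"}))  # None
print(transliterate_approx("Hélľö, hów ärě ýöů?"))
```

Accepting either a dictionary or a bare path string mirrors the documented behaviour that the function takes the whole comment data object or just the path.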
Note: Previous versions contained the function string_contains. The function still exists for backwards compatibility, but shouldn't be used in new expressions; instead use the built-in contains operator like this: "some string" contains "another string", e.g. data.data.content contains '@my_bot@my_instance'
Enhanced expressions, in addition to the above, have access to these functions:
- community(communityId) - returns the CommunityData DTO for the community with the given ID (or null if no such community exists)
- instance(instanceId) - returns the InstanceData DTO for the instance with the given ID (or null if no such instance exists)
- post(postId) - returns the PostData DTO for the post with the given ID (or null if no such post exists)
- person(personId) - returns the PersonData DTO for the person with the given ID (or null if no such person exists)
- comment(commentId) - returns the CommentData DTO for the comment with the given ID (or null if no such comment exists)
- local_user(userId) - returns the LocalUserData DTO for the local user with the given ID (or null if no such user exists)
- private_message(privateMessageId) - returns the PrivateMessageData DTO for the private message with the given ID (or null if no such private message exists)
- private_message_with_content(privateMessageId) - returns the PrivateMessageWithContentData DTO for the private message with the given ID. Be careful with this as it can contain sensitive data. If you use scopes, this function requires a different scope (private_message_content) than the private_message function or webhook
- global_ban(personId) - returns a ModBanData DTO for the given user or null if no ban exists
Note that in all the cases above, null will also be returned if you don't have permission to access the given object type.
Simple expressions can be used everywhere, but enhanced expressions cannot be used in the filter_expression field.
That's because filter_expression runs synchronously on the main thread and could potentially block further processing if it took too long.
If you need to filter on more complex expressions, you can use the enhanced_filter field. You can also use both fields:
the event is first filtered by filter_expression on the main thread and then by enhanced_filter in the worker thread.
The filter expressions use the Symfony ExpressionLanguage, read more on the syntax in the official documentation.
data.data.local
!data.data.local
data.data.creatorId === 2
lowercase(data.data.content) contains "@chatgpt@lemmings.world" (I use that one for my ChatGPT bot)
data.data.content !== data.previous.content
Lemmy first creates the comment with placeholder values, for example path is always 0 for INSERT. You can use this expression to only trigger when the final path has been resolved:
data.data.path !== data.previous.path
Examples of body expressions:
data
{title: data.data.name, hasUrl: data.data.url !== null}
{id: data.data.id, banReason: global_ban(data.data.id)?.reason}
{
title: data.data.name,
community: community(data.data.communityId).name,
instance: instance(community(data.data.communityId).instanceId).domain
}
{commentId: data.data.id, mentionedBot: "ChatGPT@lemmings.world"} (I use that one for my ChatGPT bot)
An example of an enhanced filter:
instance(community(post(data.data.postId).communityId).instanceId).domain === 'my.instance.org'
The webhooks work by first filtering based on your operation and type criteria, meaning if a new post is created,
all webhooks that are created with post as the value of object_type and INSERT as operation (or without any operation
specified) will be fetched.
Afterwards, each of these webhooks has its filter_expression checked; if it evaluates to true, the webhook is triggered in a worker.
The worker then checks the result of the enhanced_filter expression and continues only if it evaluates to true.
An HTTP request is then constructed with an optional body (from body_expression) and headers.
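The steps above can be sketched in Python as follows. This is a simplified model rather than the service's real code: expressions are represented by plain callables instead of ExpressionLanguage strings, both filters run in the same loop instead of on a main thread and a worker, and the sketch assumes that an omitted optional filter lets the event through. The names `Webhook`, `passes` and `build_requests` are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

# Stand-in for a compiled expression: a callable that receives the event.
Expression = Callable[[dict], Any]


@dataclass
class Webhook:
    url: str
    method: str
    object_type: str
    operation: Optional[str] = None  # None means "any operation"
    filter_expression: Optional[Expression] = None
    enhanced_filter: Optional[Expression] = None
    body_expression: Optional[Expression] = None
    headers: dict = field(default_factory=dict)
    enabled: bool = True


def passes(expression: Optional[Expression], event: dict) -> bool:
    # Assumption: a missing optional filter never blocks the webhook.
    return expression is None or bool(expression(event))


def build_requests(webhooks: list, event: dict) -> list:
    requests = []
    for hook in webhooks:
        # 1. Match on object type and operation.
        if not hook.enabled or hook.object_type != event["table"]:
            continue
        if hook.operation not in (None, event["operation"]):
            continue
        # 2. Cheap filter (main thread in the real service).
        if not passes(hook.filter_expression, event):
            continue
        # 3. Enhanced filter (worker in the real service).
        if not passes(hook.enhanced_filter, event):
            continue
        # 4. Describe the HTTP request that would be sent.
        body = hook.body_expression(event) if hook.body_expression else None
        requests.append({"url": hook.url, "method": hook.method,
                         "headers": hook.headers, "body": body})
    return requests


event = {"operation": "INSERT", "table": "comment",
         "data": {"id": 1, "local": True}, "previous": None}
hooks = [
    Webhook("https://example.com/webhook", "POST", "comment", "INSERT",
            filter_expression=lambda e: e["data"]["local"],
            body_expression=lambda e: e["data"]),
    Webhook("https://example.com/posts", "POST", "post"),
]
print(build_requests(hooks, event))
```

Only the first webhook produces a request here, because the second one is registered for the post object type.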
So, this is a full SQL insert for getting only new local comments using a POST request:
INSERT INTO webhooks (url, method, body_expression, filter_expression, object_type, operation, headers, enhanced_filter)
VALUES ('https://example.com/webhook', 'POST', 'data.data', 'data.data.local', 'comment', 'INSERT', null, null);

You can use RabbitMQ as a queue instead of Redis. This is the example from the top of this README modified to include RabbitMQ:
services:
  # ...
  redis: # Redis is still needed because it's used for cache
    image: redis
    ports:
      - 6379:6379
  rabbitmq:
    image: rabbitmq:4-management
    ports: # you don't need to bind ports if you don't want to
      - 5672:5672
      - 15672:15672
  webhooks:
    image: ghcr.io/rikudousage/lemmy-webhook:latest
    environment:
      - LEMMY_HOST=postgres
      - REDIS_HOST=redis
      - LEMMY_PASSWORD=superSecr3t
      - API_REGISTRATION_ENABLED=1
      - CORS_ALLOW_ORIGIN=^.*$$
      - LARGE_PAYLOAD_SIZE=1024
      - MESSENGER_TRANSPORT_DSN=amqp://guest:guest@rabbitmq:5672/%2f/lemmy_webhook_queue # read more at https://symfony.com/doc/current/messenger.html#amqp-transport
    ports:
      - 8080:80
    volumes:
      - ./volumes/database:/opt/database

If you're developing an app that relies on these webhooks, you can create a config file with all your webhooks that can easily be imported into the graphical UI.
The config file is a YAML file which must have a top-level field webhooks, an array of objects with the following properties:
- uniqueMachineName - required - a unique name for the webhook. If you import a webhook with the same unique name, it will overwrite the previous webhook instead of adding it
- url - required - the target webhook URL
- method - required - the HTTP method, must be one of the values in RequestMethod
- objectType - required - must be a valid object type
- operation - required - the database operation to react to, must be one of the values in DatabaseOperation
- bodyExpression
- filterExpression
- enhancedFilterExpression
- headers
- enabled - whether the webhook is enabled or not
For example:
webhooks:
  - uniqueMachineName: demo.some_webhook
    url: https://example.com
    method: POST
    objectType: comment
    operation: INSERT
    bodyExpression: '{title: data.data.name, hasUrl: data.data.url !== null}'
    filterExpression: data.data.local
    enhancedFilterExpression: "instance(community(post(data.data.postId).communityId).instanceId).domain === 'my.instance.org'"
    headers:
      X-Custom-Header: Some-value
      Accept: application/json
    enabled: true
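Before importing such a file, it can help to check it against the rules listed above. The sketch below is a hypothetical pre-flight check in Python and is not part of the project: it assumes the YAML has already been parsed into a dictionary (for instance with a YAML library of your choice), and the name `validate_config` and the error messages are invented for this example. The allowed values are the ones listed earlier in this README.

```python
REQUIRED_FIELDS = ("uniqueMachineName", "url", "method", "objectType", "operation")
METHODS = {"GET", "POST", "PATCH", "DELETE", "PUT"}
OPERATIONS = {"INSERT", "UPDATE", "DELETE"}
OBJECT_TYPES = {
    "post", "comment", "instance", "private_message", "person",
    "registration_application", "private_message_report",
    "local_user", "community_follower",
}


def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; empty means it looks valid."""
    webhooks = config.get("webhooks") if isinstance(config, dict) else None
    if not isinstance(webhooks, list):
        return ["top-level 'webhooks' must be a list"]
    errors = []
    for index, hook in enumerate(webhooks):
        if not isinstance(hook, dict):
            errors.append(f"webhooks[{index}]: must be a mapping")
            continue
        for name in REQUIRED_FIELDS:
            if name not in hook:
                errors.append(f"webhooks[{index}]: missing required field '{name}'")
        checks = (("method", METHODS), ("objectType", OBJECT_TYPES),
                  ("operation", OPERATIONS))
        for name, allowed in checks:
            if name in hook and hook[name] not in allowed:
                errors.append(f"webhooks[{index}]: invalid {name} '{hook[name]}'")
    return errors


# The parsed form of the example config above (optional fields omitted).
config = {
    "webhooks": [{
        "uniqueMachineName": "demo.some_webhook",
        "url": "https://example.com",
        "method": "POST",
        "objectType": "comment",
        "operation": "INSERT",
        "enabled": True,
    }]
}
print(validate_config(config))  # []
```

The function collects all problems instead of stopping at the first one, so a single run reports everything that needs fixing.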