Add ManticoresearchAdapter #103
Hi. I'm a member of the Manticore team. Please let me know if we can help with this.
@sanikolaev your help is really welcome here. Maybe we can do this step by step. First, it would be nice if you could help with how to map a
The abstraction supports the following kinds of fields for a single representation
I currently use the following mapping, which I think should be correct. Datetimes / timestamps seem to be represented in Manticoresearch as a number, so our converter turns "2023-..." into a numeric timestamp, as we already do for Apache Solr. So a very basic mapping should look like this; I hope at least that is correct:
But now the more difficult part: every field can be multiple, and I'm not yet sure how I can map something other than a
The
While reading the documentation about text / string, I'm not sure whether a field that contains text would maybe be better as
All kinds of fields can be
The problem with the multiple fields is what currently makes the implementation crash, as I'm not sure how this can be handled with the Manticore search engine or Sphinx:
From the previous discussion, some JSON field might support this, but I'm not sure how to define those types correctly, as it fails in combination with indexed:
As an example, our test has a
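The datetime conversion mentioned above (ISO 8601 string to numeric timestamp) can be sketched in plain PHP. This is only a minimal illustration: the function name `dateTimeToTimestamp` is hypothetical and not taken from the adapter or from manticoresearch-php.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of any library): converts an ISO 8601
 * datetime string into a Unix timestamp, which is the numeric form the
 * comment above says is used for timestamp fields.
 */
function dateTimeToTimestamp(string $value): int
{
    // DateTimeImmutable honours the offset in the string (e.g. +01:00)
    // and throws an exception if the string cannot be parsed.
    return (new DateTimeImmutable($value))->getTimestamp();
}
```

For the `created` value used later in this thread, `dateTimeToTimestamp('2022-01-24T12:00:00+01:00')` yields `1643022000`, i.e. 11:00:00 UTC on that day.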
This is only possible using the
mysql> drop table if exists t; create table t(string_array json, float_array json, bool_array json); insert into t values(0, '["abc", "def"]', '[1.23, 2.34]', '[true, false]'),(0, '["ghi", "jkl"]', '[3.45, 4.56]', '[true, true]'); select *, any(x = 'abc' for x in string_array), any(x > 3.0 and x < 4.0 for x in float_array), all(x = 1 for x in bool_array) from t;
--------------
drop table if exists t
--------------
Query OK, 0 rows affected (0.01 sec)
--------------
create table t(string_array json, float_array json, bool_array json)
--------------
Query OK, 0 rows affected (0.01 sec)
--------------
insert into t values(0, '["abc", "def"]', '[1.23, 2.34]', '[true, false]'),(0, '["ghi", "jkl"]', '[3.45, 4.56]', '[true, true]')
--------------
Query OK, 2 rows affected (0.00 sec)
--------------
select *, any(x = 'abc' for x in string_array), any(x > 3.0 and x < 4.0 for x in float_array), all(x = 1 for x in bool_array) from t
--------------
+---------------------+---------------+---------------------+--------------+--------------------------------------+-----------------------------------------------+--------------------------------+
| id                  | string_array  | float_array         | bool_array   | any(x = 'abc' for x in string_array) | any(x > 3.0 and x < 4.0 for x in float_array) | all(x = 1 for x in bool_array) |
+---------------------+---------------+---------------------+--------------+--------------------------------------+-----------------------------------------------+--------------------------------+
| 1515343812221005444 | ["abc","def"] | [1.230000,2.340000] | [true,false] |                                    1 |                                             0 |                              0 |
| 1515343812221005445 | ["ghi","jkl"] | [3.450000,4.560000] | [true,true]  |                                    0 |                                             1 |                              1 |
+---------------------+---------------+---------------------+--------------+--------------------------------------+-----------------------------------------------+--------------------------------+
2 rows in set (0.00 sec)
BTW
@sanikolaev thx for the response, what about
I tried to skip the attribute and indexed part for the JSON fields, but I still run into another error. These are the Manticore field definitions; I had to use
{
"title": {
"type": "text",
"options": [
"indexed"
]
},
"header_image_media": {
"type": "integer",
"options": []
},
"header_video_media": {
"type": "string",
"options": []
},
"article": {
"type": "text",
"options": [
"indexed"
]
},
"blocks_text_title": {
"type": "json",
"options": []
},
"blocks_text_description": {
"type": "json",
"options": []
},
"blocks_text_media": {
"type": "multi",
"options": []
},
"blocks_embed_title": {
"type": "json",
"options": []
},
"blocks_embed_media": {
"type": "json",
"options": []
},
"footer_title": {
"type": "text",
"options": [
"indexed"
]
},
"created": {
"type": "timestamp",
"options": []
},
"commentsCount": {
"type": "integer",
"options": []
},
"rating": {
"type": "float",
"options": []
},
"comments_email": {
"type": "json",
"options": []
},
"comments_text": {
"type": "json",
"options": []
},
"tags": {
"type": "json",
"options": []
},
"categoryIds": {
"type": "multi",
"options": []
},
"_source": {
"type": "string",
"options": []
}
}
This is the document:
{
"title": "New Blog",
"header_image_media": 1,
"article": "<article><h2>New Subtitle<\/h2><p>A html field with some content<\/p><\/article>",
"blocks_text_title": "[\"Titel\",\"Titel 2\",\"Titel 4\"]",
"blocks_text_description": "[\"<p>Description<\\\/p>\",\"<p>Description 4<\\\/p>\"]",
"blocks_text_media": [
3,
4,
3,
4
],
"blocks_embed_title": "[\"Video\"]",
"blocks_embed_media": "[\"https:\\\/\\\/www.youtube.com\\\/watch?v=iYM2zFP3Zn0\"]",
"footer_title": "New Footer",
"created": "2022-01-24T12:00:00+01:00",
"commentsCount": 2,
"rating": 3.5,
"comments_email": "[\"admin.nonesearchablefield@localhost\",\"example.nonesearchablefield@localhost\"]",
"comments_text": "[\"Awesome blog!\",\"Like this blog!\"]",
"tags": "[\"Tech\",\"UI\"]",
"categoryIds": [
1,
2
],
"_source": "{\"unrelated\":\"Unrelated\"}"
}
It is indexed via the PHP client this way:
$searchIndex = $this->client->index('test_complex');
$searchIndex->addDocument($aboveDocument, '23b30f01-d8fd-4dca-b36a-4710e360a965');
But when I try to load that document via:
$searchIndex = $this->client->index('test_complex');
$searchIndex->getDocumentById('23b30f01-d8fd-4dca-b36a-4710e360a965');
It errors with:
Not sure why this is happening.
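The document above stores the flattened multi-value text fields as JSON-encoded strings and the integer lists as plain arrays. A minimal sketch of that preparation step follows. It assumes a hypothetical helper `prepareDocument` and a hand-written map from field name to type; neither is part of manticoresearch-php, and the sketch does not talk to a server.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: shapes raw values the way the document above is
 * shaped. Arrays for "json" fields become JSON strings, arrays for
 * "multi" fields become lists of integers, everything else is kept.
 *
 * @param array<string, mixed>  $document   field name => raw value
 * @param array<string, string> $fieldTypes field name => type ("json", "multi", ...)
 *
 * @return array<string, mixed>
 */
function prepareDocument(array $document, array $fieldTypes): array
{
    $prepared = [];
    foreach ($document as $name => $value) {
        $type = $fieldTypes[$name] ?? null;
        if ($type === 'json' && is_array($value)) {
            // e.g. ["Tech", "UI"] becomes the string '["Tech","UI"]'
            $prepared[$name] = json_encode($value, JSON_THROW_ON_ERROR);
        } elseif ($type === 'multi' && is_array($value)) {
            // Re-index the list and cast every entry to an integer.
            $prepared[$name] = array_values(array_map('intval', $value));
        } else {
            $prepared[$name] = $value;
        }
    }

    return $prepared;
}
```

Called with `['tags' => ['Tech', 'UI'], 'categoryIds' => [1, 2]]` and the types `['tags' => 'json', 'categoryIds' => 'multi']`, it returns `tags` as the string `["Tech","UI"]` and leaves `categoryIds` as `[1, 2]`.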
I see. This is right. Manticore doesn't natively support nested objects and the period sign is used for json, e.g.:
Manticore doesn't support string IDs. The ID requirements can be found here: https://manual.manticoresearch.com/Creating_a_table/Data_types#Document-ID.
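Since string IDs are not supported, one possible workaround is to derive a numeric ID from the UUID and keep the UUID itself in a separate field. The sketch below is an assumption, not something proposed in this thread: the helper name `uuidToDocumentId` is hypothetical, it assumes the linked requirements call for a positive, non-zero 64-bit integer, it needs 64-bit PHP, and hashing down to 60 bits means collisions are unlikely but not impossible.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: maps a UUID string to a positive, non-zero
 * integer that fits into a signed 64-bit value.
 */
function uuidToDocumentId(string $uuid): int
{
    // 15 hex digits are 60 bits, so the result always fits in a
    // positive signed 64-bit integer on 64-bit PHP builds.
    $hex = substr(hash('sha256', strtolower($uuid)), 0, 15);
    $id = (int) hexdec($hex);

    // Avoid 0, which this sketch assumes is not a valid document ID.
    return $id === 0 ? 1 : $id;
}
```

The same UUID always produces the same ID regardless of letter case, so a lookup by UUID can recompute the numeric ID instead of storing a mapping.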
From the document above, we have text that is searchable but is represented by an array of texts, as we flattened the whole blocks objects. As you suggested, I now use text fields for this array ( The
[
'type' => 'text',
'index' => true,
'fields' => [
'raw' => ['type' => 'keyword'],
],
]
So a field
The equivalent of Elasticsearch's in Manticore is
I'm not sure if I understood you correctly
{
"uuid": "23b30f01-d8fd-4dca-b36a-4710e360a965",
"tags": ["UI", "UX"]
}
For searchable I think we could use
{
"uuid": "23b30f01-d8fd-4dca-b36a-4710e360a965",
"tags": "UI UX"
}
But then how could we still get filterability to work, so that we can fetch the documents tagged with those tags? I think that would still require a
{
"uuid": "23b30f01-d8fd-4dca-b36a-4710e360a965",
"tags": "UI UX",
"tags_raw": ["UI", "UX"]
}
PS: we are using https://github.com/manticoresoftware/manticoresearch-php here, so we do not actually run any create table ... statements ourselves.
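The two-field idea from the last example (a space-joined text value for searching plus the original list for filtering) could be produced by a small helper like the one below. The name `splitSearchableAndFilterable` and the `_raw` suffix handling are illustrative assumptions; whether the raw list should then be stored as a JSON or multi-value field is left open, as in the discussion.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turns one multi-value string field into a
 * searchable text value and a raw list kept for filtering.
 *
 * @param string[] $values
 *
 * @return array<string, mixed>
 */
function splitSearchableAndFilterable(string $name, array $values): array
{
    $values = array_values($values);

    return [
        // Space-joined text, e.g. "UI UX", for full-text search.
        $name => implode(' ', $values),
        // Untouched list, e.g. ["UI", "UX"], for filtering.
        $name . '_raw' => $values,
    ];
}
```

`splitSearchableAndFilterable('tags', ['UI', 'UX'])` returns `['tags' => 'UI UX', 'tags_raw' => ['UI', 'UX']]`, matching the last document shape shown above.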
Closing this as I'm too inexperienced with Manticore to finish it. If somebody wants to give it a try, feel free to reopen a merge request. Every adapter should pass the tests that are run against it; the only thing not all adapters currently support is multi-index search, so it would be fine to skip that for Manticore as well. The main focus should be on the different SearchTests, which I was not able to get working for the different expected cases.


Manticoresearch is a Sphinx fork that provides a PHP client at https://github.com/manticoresoftware/manticoresearch-php. As requested by some people on Reddit, we are trying to support it as well.
TODO
testFindMultipleIndexesSkipped TODO issueExternal