Skip to content

Full text search LINQ support#315

Closed
rwasef1830 wants to merge 2 commits intonpgsql:devfrom
rwasef1830:dev_full_text_search_support
Closed

Full text search LINQ support#315
rwasef1830 wants to merge 2 commits intonpgsql:devfrom
rwasef1830:dev_full_text_search_support

Conversation

@rwasef1830
Copy link
Contributor

Hello,
This resolves #1 and implements full text LINQ query support with all methods and operators. It also adds an explicit casting function (EF.Functions.Cast) for edge cases (when needed). This brings it in feature parity with the EF6 provider.

Thanks @roji for the hints along the way. Hoping for a review and merge 👍

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch 2 times, most recently from 10e748a to efdc703 Compare February 28, 2018 09:35
@rwasef1830
Copy link
Contributor Author

Regressions found. Investigating...

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch from efdc703 to f946701 Compare February 28, 2018 11:31
@rwasef1830
Copy link
Contributor Author

rwasef1830 commented Feb 28, 2018

Was forced to drop array_to_tsvector as there are some internal assumptions in EF Core's expression processing that depend on array expressions not getting server evaluated and it is too fragile to try to distinguish between cases where the expressions can be supported server-side or not. So I decided to drop its support. array_to_tsvector is seldom used anyway, it is a utility function to join lexemes into a single tsvector.

Ready for review @roji

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch from f946701 to e99aa6b Compare March 7, 2018 14:22
@rwasef1830
Copy link
Contributor Author

Rebased on latest dev and squashed full text support with dropping array_to_tsvector

@roji
Copy link
Member

roji commented Mar 7, 2018

Great, so should I review?

@rwasef1830
Copy link
Contributor Author

Yup!

@roji
Copy link
Member

roji commented Mar 7, 2018

OK, will make this the next thing on my plate after I finish my current task, promise.

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch from e99aa6b to 186b86b Compare March 7, 2018 16:05
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for really high-quality work @rwasef1830, I wish all my pull requests looked like this...

Take a look at my comments, all in all it everything is OK, apart possibly from the expression factoring I discuss below.

/// Casts the specified <paramref name="target" /> to the PostgreSQL type equivalent
/// to <typeparamref name="TTarget" />.
/// </summary>
public static TTarget Cast<TTarget>(this DbFunctions _, object target) => throw new NotSupportedException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must be missing something, but why is this needed? There's already NpgsqlConvertTranslator which translates Convert.ToInt32(xxx) to CAST(xxx AS integer).?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a useful utility for emitting a cast on the PostgreSQL side that may not have a conversion operator or method defined on the C# side and thus may not be expressible in LINQ directly.

I have a similar method in the EF6 provider, so this is for feature parity. It is not strictly needed for full text search but may be useful for others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the usefulness of this, but it seems like something that belongs in EF Core rather than Npgsql - it has really nothing to do with this specific provider. EFCore.Relational has some "universal" relational translators such as for Like (as well as some others), and it would make sense to have a Cast translator in there as well.

One more note... in this method you're providing the cast target as a CLR type (TTarget), which is great. But in some cases I found myself needing to specify a store datatype as a string: when you specify as a CLR type, you're going through the provider's type mappings, which in effect limits you to the default store datatype for that CLR type. For example, there's no way to translate to PostgreSQL type jsonb, since its CLR type is string, but the default datatype for that is actually text.

I recently had this issue when fixing #319, where I needed to cast to citext, and added my own ExplicitStoreTypeCastExpression, which is the same as an ExplicitCastExpression but which is provided a store datatype as a string instead of a CLR type, effectively bypassing the type mappings. This also probably belongs in EF Core rather than the provider.

If you end up doing a PR to EF Core, you may want to add the string-based cast option as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roji Seems related to dotnet/efcore#4978 which started off being about Unicode, but in general is about providing type mappings for parts of a query where it is not possible/easy to infer the mapping from some related property. /cc @divega

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ajcvickers. I do think it also makes sense to provide some means for users to force a cast sometimes, as a last resort or workaround. As I said above, I ended up needing a manual explicit cast to a store type (not a CLR type - to bypass the type mapping system), you can check this code out.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divega See dotnet/efcore#11089 where the answer was to force an "invalid" cast into the model using (string)(object)myInt. This works because it is never executed, but the cast call seems cleaner. But my main point was that something like this could allow facets to be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that there are also store types which simply don't exist in C#. For example, CLR string maps to both text and json in PostgreSQL, with text being the default. But in some scenarios I may need to specify a cast to JSON specifically (since PostgreSQL doesn't do implicit casting in several scenarios).

Copy link

@divega divega Mar 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks, I think I get it now. EF.Functions.Cast<X>(y) is useful if the three conditions below are true:

  1. There is no valid conversion
  2. You don't want to write (X)(object)y
  3. The X CLR type provides enough information (e.g. it is not about explicitly casting to store type or specifying a facet)

I think it is interesting, and we could have something like this in EF Core, but perhaps it is more interesting to consider other overloads or variations that can set various facets or the store type. I think it is unlikely that adding this now would prevent us from adding the other variations we want later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwasef1830, I guess you want to open an issue for this feature in the EF Core repo, and remove this commit from the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, got caught up and busy at the moment, will get it done next week hopefully.

public override bool IsEvaluatableMethodCall(MethodCallExpression methodCallExpression) =>
(!methodCallExpression.Method.IsGenericMethod
|| methodCallExpression.Method.GetGenericMethodDefinition() != _cast)
&& methodCallExpression.Method != _tsQueryParse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit my knowledge of the query pipeline and specifically this part is very weak, but can you please explain why it's necessary to include your methods in this EvaluatableExpressionFilter? All other SQL translation of methods happen just fine without this, so what's special about full text search?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because EF Core tries to compile the LINQ query and does a first pass optimization using ParameterExtractingExpressionVisitor eliminating expressions that do not reference any columns and evaluating them on the client side and sending the results directly as a literal.

Without this my unit tests break and it introduces a subtle behavior anomaly where if you do something like this:

context.Query<SomeEntity>(x => 1 + 1);

The 1+1 expression will be evaluated client-side and not translated to SQL.

But if you do this:

context.Query<SomeEntity>(x => x + 1);

It will be properly be translated to SQL.

Entity Framework Core people were aware of this issue from before and had used a workaround for this (FunctionEvaluationDisablingVisitor) to avoid re-linq doing this optimization when not desired, but later on a native way was introduced in re-linq (EvaluatableExpressionFilter) so they removed that visitor and used the new way instead.

Note this affects your Regex implementation as well, you may want to add Regex methods to the filter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining this. The thing still causing me trouble, is why this isn't happening in EF Core, the SQL Server provider or anywhere else I can see.

EF Core's base EvaluatableExpressionFilter only lists a handful of non-deterministic methods such as DateTime.Now, while RelationalEvaluatableExpressionFilter adds user-defined database functions to that. The SQL Server and Sqlite providers don't have a EvaluatableExpressionFilter at all.

In other words, the entire rich array of methods that EF Core knows how to translate (including SQL Server's FREETEXT method (full-text search) is absent from EvaluatableExpressionFilter. Any idea why that is? We can also ask the EF Core team about this.

Regarding the Regex implementation, isn't it actually an advantage that an expression gets evaluated locally when possible? I mean, if the user writes an expression with regex that doesn't reference any database columns (this would be weird), why should that get sent to the database for evaluation?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smitpatel, @anpete there are some good query design questions here that I don't feel prepared to answer accurately... Can you weigh in?

In general I do agree with @roji that we prefer evaluating things that are uncorrelated to the server on the client.

Also, not sure you care but this is probably not a safe way to compare reflection objects across all .NET implementations.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this isn't happening in EF Core, the SQL Server provider or anywhere else I can see.

Because, I think, these new functions are unusual in that they may be used in a way where they are not correlated to any range variable in the query (at least that's what it looks like from looking at the tests). In the more normal case, the function call expression contains one or more range variable expressions, and so would never be considered for partial eval.

We may want to allow providers to simply add entries to the base EvaluatableExpressionFilter in order to avoid a service replacement here..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anpete I don't think the these new functions are very special - it's true that the tests currently seem to exercise mostly local data, but the functions obviously make sense mostly with database data (after all we're talking about full-text search)...

Regardless, definitely a +1 for allowing adding entries in EvaluatableExpresisonFilter (like the member/method translators?) - and so also moving it out of Internal.

using System.Diagnostics.CodeAnalysis;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.FullTextSearch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think part of the appeal of the extension methods is to make them discoverable by not using a special namespace, i.e. just have them light up. In other words, let's change this to namespace Microsoft.EntityFrameworkCore (this is also where SqlServer puts its extensions methods). This doesn't need to have an impact on where the source file is located.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I will change this.

using System.Diagnostics.CodeAnalysis;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.FullTextSearch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, change namespace to Microsoft.EntityFrameworkCore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

{
public class NpgsqlFullTextSearchMethodTranslator : IMethodCallTranslator
{
private static readonly MethodInfo _tsQueryParse = typeof(NpgsqlTsQuery).GetMethod(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer omitting the private as it's the default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some parts of the codebase omit it and others keep it, I saw in the parts that I changed that the private modifier was being used, so I used it explicitly to keep in line.

If you prefer to remove it, I will remove it from the code in this pull request, but the entire codebase needs to be combed for this issue.

/// return a new weighted tsvector.
/// http://www.postgresql.org/docs/current/static/textsearch-features.html#TEXTSEARCH-MANIPULATE-TSVECTOR
/// </summary>
public static NpgsqlTsVector SetWeight(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add params to lexemes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shied away from that because they are not supposed to be open-ended, they are supposed be an array of exactly 4 elements. There's no way to express that in C#, I even considered making them 4 explicit parameters but I thought of leaving it as an array... someone might get an idea to read its value form another column in a database somewhere :-) I preferred not to complicate that person's life :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

/// return a new weighted tsvector.
/// http://www.postgresql.org/docs/current/static/textsearch-features.html#TEXTSEARCH-MANIPULATE-TSVECTOR
/// </summary>
public static NpgsqlTsVector SetWeight(this NpgsqlTsVector vector, char weight, string[] lexemes) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous reply.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

? $"{Left} {Operator} {Right}"
: $"{Operator} {Left}";

public static FullTextSearchExpression TsQueryAnd([NotNull] Expression left, [NotNull] Expression right)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this actually have much value... If/when you you refactor the expression as I suggested, creating a "contains" expression would be a matter of instantiating a FullTextSearchBinaryExpression and passing an enum to the constructor (NpgsqlFullTextOperation.Contains or something similar).

query.Write(builder, true);
builder.Replace("'", "''");
builder[0] = '\'';
builder.Append("'::tsquery");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The literal representations as usually of the form HSTORE 'blablabla' rather than 'blablabla'::hstore, it's probably a good idea to align to that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I didn't know this syntax. I will update it to match.

builder.Append(vector);
builder.Replace("'", "''");
builder[0] = '\'';
builder.Append("'::tsvector");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change it.

@rwasef1830
Copy link
Contributor Author

@roji see my replies.

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch from 186b86b to dac2675 Compare March 9, 2018 10:58
@rwasef1830
Copy link
Contributor Author

rwasef1830 commented Mar 9, 2018

@roji Rebased and pushed with fixes as reviewed.

Note: I wasn't able to get the BinaryExpression / UnaryExpression idea to work because the built-in classes are sealed tight with internal constructors and argument validation that prevents using these expressions on arbitrary types (must have declared operators corresponding to the requested expression type).

methodCallExpression.Arguments[0],
methodCallExpression.Arguments[1]);

case nameof(NpgsqlFullTextSearchLinqExtensions.ToNegative):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

public override bool IsEvaluatableMethodCall(MethodCallExpression methodCallExpression) =>
(!methodCallExpression.Method.IsGenericMethod
|| methodCallExpression.Method.GetGenericMethodDefinition() != _cast)
&& methodCallExpression.Method != _tsQueryParse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining this. The thing still causing me trouble, is why this isn't happening in EF Core, the SQL Server provider or anywhere else I can see.

EF Core's base EvaluatableExpressionFilter only lists a handful of non-deterministic methods such as DateTime.Now, while RelationalEvaluatableExpressionFilter adds user-defined database functions to that. The SQL Server and Sqlite providers don't have a EvaluatableExpressionFilter at all.

In other words, the entire rich array of methods that EF Core knows how to translate (including SQL Server's FREETEXT method (full-text search) is absent from EvaluatableExpressionFilter. Any idea why that is? We can also ask the EF Core team about this.

Regarding the Regex implementation, isn't it actually an advantage that an expression gets evaluated locally when possible? I mean, if the user writes an expression with regex that doesn't reference any database columns (this would be weird), why should that get sent to the database for evaluation?

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround. Apart from two tiny indentation/whitespace nitpicks below, the only thing pending from my end is fully understand the function of EvaluatableExpressionFilter and to separate out the explicit cast operation as a PR to EF Core (see more detail about these below).

return expression;
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this extra line

@roji
Copy link
Member

roji commented Mar 10, 2018

To make it easy to locate, apart from the two whitespace/line nitpicks there's still https://github.com/npgsql/Npgsql.EntityFrameworkCore.PostgreSQL/pull/315/files#r173125096 and #315 (comment).

@roji
Copy link
Member

roji commented Mar 10, 2018

One last comment - if you're up to it you can also do some documentation on this. I think at the very least we should update http://www.npgsql.org/efcore/mapping-and-translation.html#operation-translation-to-sql (not necessary a line for each and every operation), or even a separate full-text section that points to the PostgreSQL docs, gives a basic example, etc. A note in the release notes would also be great.

If you don't have time (or don't like writing docs) I can do it...

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch 2 times, most recently from 87147e5 to bcdb685 Compare March 17, 2018 16:10
@rwasef1830
Copy link
Contributor Author

rwasef1830 commented Mar 17, 2018

@roji Removed the cast method and all its relevant code, fixed the 2 nits and force pushed. I rebased on latest dev but I am worried about having messed up BuiltInDataTypesNpgsqlTest because there was a pretty huge merge conflict there. I would appreciate double checking that those tests are correct (they pass except for one that fails due to different ansi encoding on my machine it seems).

About documentation, sure I will update the release notes with a line. Don't know what to edit to make the changes though...

dev...rwasef1830:patch-1

Not sure where that other website page is so I can edit it.

@roji
Copy link
Member

roji commented Mar 17, 2018

Thanks @rwasef1830, will take a look ASAP. Yeah, I redid a lot of BuiltInDataTypesNpgsqlTest to sync it with EF Core, will take a look at your changes and make sure everything is OK.

Regarding the docs, it's a bit confusing but the EF Core docs are simply in the doc directory of this repo. The website is generated out of the doc repo which includes the doc directories from npgsql, EF Core and EF6 - this allows us to have a single site consolidating docs from all three projects. Long story short, just make your changes in the doc directory, I'll take of the rest after it's merged.

@rwasef1830 rwasef1830 changed the title Explicit casting and full text search LINQ support Full text search LINQ support Mar 18, 2018
@rwasef1830
Copy link
Contributor Author

@roji Added some documentation.

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch 2 times, most recently from dcf62b4 to d5aeee7 Compare March 18, 2018 13:06
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwasef1830, sorry for the slow turnaround and thanks for the fixes and docs. I really want to get this merged in soon as preview2 is around the corner, take a look at my comments.

If you have any trouble rebasing or anything else, I can correct myself as part of the merge.

using NpgsqlTypes;
using Xunit;

namespace Microsoft.EntityFrameworkCore.Query
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to Npgsql.EntityFrameworkCore.PostgreSQL.Query


namespace Microsoft.EntityFrameworkCore.Query
{
public class FullTextSearchDbFunctionsNpgsqlTest : IClassFixture<NorthwindQueryNpgsqlFixture<NoopModelCustomizer>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite doesn't seem to actually test any query, in the sense that a WHERE clause with a search operation gets generated (that's the point after all...), could you add a few basic ones please?

using Npgsql.EntityFrameworkCore.PostgreSQL.Utilities;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.Storage.Internal.Mapping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.Mapping

using Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.Storage.Internal.Mapping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.Mapping

using Npgsql.EntityFrameworkCore.PostgreSQL.Utilities;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.Storage.Internal.Mapping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.Mapping

using Microsoft.EntityFrameworkCore.Metadata;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.Query.Internal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If kept (see comment below), change to Npgsql.EntityFrameworkCore.PostgreSQL.Query.Internal


namespace Microsoft.EntityFrameworkCore.Query.Internal
{
public class NpgsqlEvaluatableExpressionFilter : RelationalEvaluatableExpressionFilter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not clear as to why we need to make the full-text methods non-evaluatable. Specifically, I'm looking for an answer to #315 (comment), given that I don't see SQL Server's full text (or almost any other method) handled this way.

Is this because trying to call the full-text search methods on a local vector+query will throw a NotSupportedException? If so, I'm not sure that it's a good reason to force server-evaluation for the following reasons:

  • I don't think that it makes much sense to allow users to perform database operations on totally local data, effectively using the database for "computation". In fact, allowing the operation to throw would probably keep users from doing the wrong thing.
  • This doesn't seem to be standard practice in EF Core, i.e. in SQL Server/Sqlite. The only methods I see there are non-deterministic methods (e.g. DateTime.Now, random-related methods) and I'm pretty sure that's there so that the result doesn't get cached locally (and you get the same thing across invocations).
  • RelationalEvaluatableExpressionFilter is internal in EF Core so we should avoid a dependency on it unless there's a good reason.

Copy link
Contributor Author

@rwasef1830 rwasef1830 Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because trying to call the full-text search methods on a local vector+query will throw a NotSupportedException

Yes.

It's because of the query compiler. EF Core before executing or visiting the expression tree attempts to do a first pass compile/optimize with a different set of visitors. It tries to eliminate all calls in LINQ expressions and convert them to constant expressions (evaluated client side) when they don't reference any column, but in case of full text search functions, there is simply no implementation to run client side, so the behavior is incorrect as the expressions would throw and I must force it to evaluate them server-side.

RelationalEvaluatableExpressionFilter comes from EvaluatableExpressionFilter which is a public API from Relinq (not EF Core), so this API will never go away, if EF ever make RelationalEvaluatableExpressionFilter internal or private, we can easily update and wrap/proxy the original object without any redesign (unless they rebase the entire EF core framework on something other than Relinq but that is highly unlikely). So I think my use of it is the 100% intended use case by Relinq folks and it should be safe to keep.

Sql Server folks have made a choice to let exceptions be thrown if the expression doesn't reference any column (like Freetext) because they guess nobody will use it that way (as 'compute'), but in my opinion that is a bug, or perhaps there is no way in Sql Server to call such functions without referencing columns. Their handling of DateTime.Now is the bare minimum implementation and I think more methods should be added to the filter (anything that doesn't have a client-side implementation).

Allowing users to use the relational store for 'compute' may not be an orthodox use case but there might be some fringe uses for it. I don't see why we should arbitrarily create an incorrect behavior/implementation hole that was not there in EF6 (which can handle such queries just fine) just because the Sql server people decided to do that (or maybe didn't test that case).

Let me know what you think :-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See dotnet/efcore#2667 @roji when they started using this API from relinq.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajcvickers, @divega could we get a bit of guidance here?

To summarize the issue, @rwasef1830 is finalizing the full-text support, and proposes to add NpgsqlEvaluatableExpressionFilter in order to ensure that the relevant functions always execute on the server. This is because these functions, like SQL Server's FREETEXT, have no client implementation and would always throw. The main (only?) use case would be to allow running these database functions on client-local data.

My personal opinion is that the usage scenario isn't important enough, but please see @rwasef1830's comment above (and mine). It would be interesting to hear your thoughts on this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roji we discussed this with @ajcvickers and @smitpatel (@anpete was not part of the discussion and he may also have something relevant to add).

In summary, we believe that the behavior @rwasef1830 is proposing is reasonable. We currently are not doing this with the SQL Server provider but in the future we could stop throwing the exception and start issuing new queries.

It will result in N+1 queries being sent to the server, but as long as there are appropriate client evaluation warnings are being reported this is not different from other cases in which other constructs (e.g. subqueries) result into N+1.

There might be a problem with async queries: ideally each individual function evaluation should be evaluated asynchronously not block. I actually haven't looked at the proposed implementation so I am not sure if this is already handled. If it isn't, this is something that I believe @anpete knows how to handle the async composition and he may be able to provide some pointers.

I have created dotnet/efcore#11466 to track the idea of doing this more often.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks for the guidance. I'll merge this PR as-is, and we'll change the implementation when #11466 is implemented (possibly with a better way to add methods as @anpete suggests).

@rwasef1830 rwasef1830 force-pushed the dev_full_text_search_support branch from d5aeee7 to 73ca134 Compare March 27, 2018 22:16
@rwasef1830
Copy link
Contributor Author

rwasef1830 commented Mar 27, 2018

@roji Rebased and force pushed, added a couple of tests in FullTextSearchDbFunctionsNpgsqlTest and updated namespace names and commented on the evaluatable expression filter (probably got hidden in the "outdated" conversations in github due to the force push).

Btw, the project is depending on nuget package versions that are only available on aspnetcore myget repo (not nuget).

@roji
Copy link
Member

roji commented Mar 27, 2018

OK, great @rwasef1830 and thanks for all your patience. Let's wait and see what the EF Core folks say regarding the evaluatable expression filter question. Other than that I'll merge and amend myself if I run into any more nits.

using Microsoft.EntityFrameworkCore.Query.Expressions.Internal;
using NpgsqlTypes;

namespace Microsoft.EntityFrameworkCore.Query.ExpressionTranslators.Internal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namespace! (may be other files too)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this PR was written before The Great Namespace Change, will fix, thanks.

@roji
Copy link
Member

roji commented Mar 29, 2018

Btw, the project is depending on nuget package versions that are only available on aspnetcore myget repo (not nuget).

If you're talking about EF Core packages then that's normal - we're tracking 2.1.0-preview2 which is still on myget only. Unfortunately the upcoming enum support which I'm about to merge as well (see branch enum) currently depends on Npgsql 4 so that's going to probably be another reference to myget.

@roji
Copy link
Member

roji commented Mar 29, 2018

Merged to dev via 05cedfd (and f35bed9), thanks for all this work @rwasef1830! It's great to have PostgreSQL full-text search support!

One question: how would you feel about an overload of Matches() that accepts a string, an implicitly does to_tsquery to convert it to a query? I'm asking because looking at [this test code](https://github.com/npgsql/Npgsql.EntityFrameworkCore.PostgreSQL/blob/05cedfdde0e70e0ee80b21488a24eead7d467e25/test/EFCore.PG.FunctionalTests/Query/FullTextSearchDbFunctionsNpgsqlTest.cs#L696, things seem needlessly verbose with EF.Functions everywhere. There could be other functions where such overloads could make sense.

Also, can you please open an issue to track support for array_to_tsvector, just so we have it in the backlog somewhere?

@roji
Copy link
Member

roji commented Mar 29, 2018

One last thought is that it may be worthwhile to write a separate page describing how to do full-text search, with complete examples (including setting up the index). As it is there's only mention that certain functions are map, but users will likely be a bit bewildered...

Any chance you'd be up for writing that up? If not I'll look into it before release.

@rwasef1830
Copy link
Contributor Author

rwasef1830 commented Mar 29, 2018

@roji

One question: how would you feel about an overload of Matches() that accepts a string, an implicitly does to_tsquery to convert it to a query? I'm asking because looking at [this test code] things seem needlessly verbose with EF.Functions everywhere. There could be other functions where such overloads could make sense.

I considered this, but decided not to do it because there are native overloads for @@ that accept strings but the semantics are somewhat surprising:

The @@ operator also supports text input, allowing explicit conversion of a text string to tsvector or tsquery to be skipped in simple cases. The variants available are:

tsvector @@ tsquery
tsquery  @@ tsvector
text @@ tsquery
text @@ text

The first two of these we saw already. The form text @@ tsquery is equivalent to to_tsvector(x) @@ y. The form text @@ text is equivalent to to_tsvector(x) @@ plainto_tsquery(y).

I didn't implement the overload that takes text in place of tsvector because the Matches C# method stub is implemented as an extension on NpgsqlTsVector (I would have to make an extension method for string which I didn't want to do because of extension method pollution).

I did implement an overload which takes string for Matches for the tsquery parameter but I send it as-is to postgresql so that it can convert it using plain_totsquery. Doing an implicit to_tsquery would introduce a semantic difference between us and the native behavior.

@roji
Copy link
Member

roji commented Mar 29, 2018

OK, I actually missed the string overload that you already implemented. Everything looks good then.

@rwasef1830
Copy link
Contributor Author

@roji

One last thought is that it may be worthwhile to write a separate page describing how to do full-text search, with complete examples (including setting up the index). As it is there's only mention that certain functions are map, but users will likely be a bit bewildered...

I actually have a couple of extension methods for migrations (for creating the update trigger and the index), but not sure where I can put them. Do we have a samples project ?

@roji
Copy link
Member

roji commented Mar 29, 2018

@rwasef1830 take a look at the builder extension methods (e.g. NpgsqlEntityTypeBuilderExtensions, NpgsqlIndexBuilderExtensions). An example is ForNpgsqlUseXminAsConcurrencyToken.

Some builder extension methods to simplify the setup sound like a great idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Full-text search support for EF Core

6 participants