Skip to content

Adding queries for collecting QDS data for SQL DB#9150

Closed
masree wants to merge 47 commits intoinfluxdata:masterfrom
masree:qds_db
Closed

Adding queries for collecting QDS data for SQL DB#9150
masree wants to merge 47 commits intoinfluxdata:masterfrom
masree:qds_db

Conversation

@masree
Copy link
Copy Markdown
Contributor

@masree masree commented Apr 19, 2021

Required for all PRs:

  • Updated associated README.md.
  • Wrote appropriate unit tests.

resolves #

Summary of PR:

Igor started the PR to collect data from query store runtime and wait statistics for SQL DB and SQL MI. Continuing his PR to completion as he does not own this feature anymore. There are a few review comments that needed to be addressed from old PR. Steps to address them:

  1. Split the PR into 2 - SQL DB and SQL MI
  2. Address the review comments related to SQL DB
  3. Get it reviewed from community
  4. Merge SQL DB PR
  5. Create a new PR for SQL MI as this needs more work (we need to iterate the individual DBs in a managed instance)
  6. Repeat 2-4

IgorKuchmienko and others added 10 commits November 23, 2020 23:49
…tore wait statistics, for Azure SQL DB and MI

Add ability to store in a hashtable a custom value associated with the query result, and use it when the query is run next time.
For query store queries, we store the time when the query was run last, and in the next run we only return the data between the last run time and the current time, rounded down to a multiple of the reporting interval.
Copy link
Copy Markdown
Contributor

@telegraf-tiger telegraf-tiger bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤝 ✅ CLA has been signed. Thank you!

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Apr 19, 2021
@masree masree force-pushed the qds_db branch 5 times, most recently from 570ada4 to b9af066 Compare April 27, 2021 04:57
@andrewzayats
Copy link
Copy Markdown

@denzilribeiro @dimitri-furman could you please review?
End-to-end tests were done using telegraf standalone configuration and Azure SQL Insights solution.

Comment thread plugins/inputs/sqlserver/README.md Outdated
Comment thread plugins/inputs/sqlserver/README.md Outdated
Comment thread plugins/inputs/sqlserver/README.md Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go Outdated
Comment thread plugins/inputs/sqlserver/azuresqlqueries.go
Comment thread plugins/inputs/sqlserver/sqlserver.go
Copy link
Copy Markdown
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @masree, thanks for the PR. I have only a minor comment but I'm, to be honest, not the right person to review the SQL statements.
However, can you please comment on Stevens question regarding a "feature-enable list" instead of using booleans to switch on/off features? I think it is much more extensible...

@srebhan srebhan self-assigned this Jun 30, 2021
andrewzayats and others added 2 commits July 1, 2021 21:53
@denzilribeiro
Copy link
Copy Markdown
Contributor

However, can you please comment on Stevens question regarding a "feature-enable list" instead of using booleans to switch on/off features? I think it is much more extensible...

@ssoroka , @srebhan -- This is no different than turning queries on/off right aka feature-enable list or include_queries list is the same ? Unless the feature-enable groups a set of queries into a "feature" that you turn on/off. The difference with query store is that it is generally different in terms of granularity of collection and the query is a bit more heavy duty than say the other light weight queries and isn't really a live monitoring metric but lets you analyze queries and perf that contributed to resource usage in time.

The crux of the problem is we wanted to disable by default and have someone explicitly enable, so we wanted a disable by default, explicitly opt in behavior.

Take for example someone who had telegraf prior to this PR, doesn't change their config files to exclude this explicitly, they get this collector turned on by default and will have to explicitly add to exclude_query list -- if that is fine then we can just use the exclude_list . That isn't just for query store related collector only but should agree upon an approach for any of the more "advanced" collectors or bit more heavy weight collectors. If exclude_query is the way to go that is fine just we would introduce new things for existing customers that don't change their config file but update telegraf. Query store being a bit of a different type of collector thought of using explicit flag but I get the point that a flag for a feature in time can explode the conf section.

We do understand that the same could be achieved with this ( and just use exclude list) but it won't be disable by default, rather onus is on customers.
exclude_query = ["SQLServerAvailabilityReplicaStates", "SQLServerDatabaseReplicaStates","AzureSQLDBQueryStoreRuntimeStatistics","AzureSQLDBQueryStoreWaitStatistics"]

@srebhan
Copy link
Copy Markdown
Member

srebhan commented Jul 12, 2021

@denzilribeiro the difference is in the extensibility of both schemes. Imagine adding another "thing to collect".

With the boolean approach you will need to extend the config (sampleConfig, README, struct) with another option that is say disabled by default.

With the feature list approach, you just implement the feature and add the "feature enable string" to the sampleconfig and README documentation. No need to change the actual configuration. Furthermore, if you have more than a handful of those feature enable/disable toggles, a include list is much more readable than a bulky list of true/false options.
You can still keep the backward compatibility by carefully crafting the feature_include/feature_exclude lists.

So yes, both approaches are equivalent feature-wise, but the latter is much more extensible/readable in the long-run.

@denzilribeiro
Copy link
Copy Markdown
Contributor

You can still keep the backward compatibility by carefully crafting the feature_include/feature_exclude lists.
So yes, both approaches are equivalent feature-wise, but the latter is much more extensible/readable in the long-run.

Right now feature_include/feature_exclude is the same as exclude_query /include_query which is already implemented. So if the decision is still to revert to that, then it is fine

At some point look at this a bit more holistically. The issue I have with include/exclude you don't have a notion of "default collectors" for those that don't specify the include/exclude lists and so that moves the onus to customers.

Say we have queries a,b,c,d in existing version and now we introduce a query e that we want enabled by default, and a query f that we want disabled by default
a. Customers that have an include list in existing conf file have to add query e
b. Customers with an exclude list in their conf file will have to add query f to exclude it
c. Customers that do not have an include/exclude list will get both e and f.

Sure we can change the default conf to reflect it, but a customer that already has a conf and doesn't update it won't get it, maybe I am missing something :)

@srebhan
Copy link
Copy Markdown
Member

srebhan commented Jul 12, 2021

Right now feature_include/feature_exclude is the same as exclude_query /include_query which is already implemented. So if the decision is still to revert to that, then it is fine

I'm fine with using the existing lists.

At some point look at this a bit more holistically. The issue I have with include/exclude you don't have a notion of "default collectors" for those that don't specify the include/exclude lists and so that moves the onus to customers.

I'm not sure I understand your comment correctly. You can still define a default include or exclude set in your plugin and put that into the sample config. If the user chooses to modify it, he/she will start with the default and adapt to the needs.

Say we have queries a,b,c,d in existing version and now we introduce a query e that we want enabled by default, and a query f that we want disabled by default

Then I would specify:

# query_include = ["a", "b", "c", "d", "e"]
# query_exclude = []

in your sample config and set these options when setting up the plugin in init() (lowercase i):

func init() {
	inputs.Add("sqlserver", func() telegraf.Input {
		return &SQLServer{
			Servers: []string{defaultServer},
			IncludeQuery: []string{"a", "b", "c", "d", "e"},
		}
	})
}

a. Customers that have an include list in existing conf file have to add query e

Right, but that is the semantics behind an include list. You explicitly whitelist a set of features. So you don't want to get new stuff automagically. :-)

b. Customers with an exclude list in their conf file will have to add query f to exclude it

No because you did not include it in the first place, so you don't need to exclude it. Note you are free to craft the include list such that only ever backward-compatible features are enabled. The exclude list will then superseed the include list.

c. Customers that do not have an include/exclude list will get both e and f.

See above. Just do not specify e and f in your default include list.

Sure we can change the default conf to reflect it, but a customer that already has a conf and doesn't update it won't get it, maybe I am missing something :)

Right, but again, if somebody sets an include list, he/she will get exactly the stuff he/she specified. That's the intention, no surprises. ;-)

@denzilribeiro
Copy link
Copy Markdown
Contributor

Will change to include/exclude lists for now and revisit any other changes later.

Just to clarify concern wasn't ever of someone using include lists as they get exactly what they want, no more, no less. Concern was folks that weren't using include lists.. or using exclude lists in getting a potentially heavier collector since the default for all collectors as premise is by default all collectors enabled.

My experience has been in my limited experience with telegraf that folks don't necesarily update their conf files often yet update telegraf major versions but then I maybe wrong

@srebhan
Copy link
Copy Markdown
Member

srebhan commented Jul 12, 2021

[...] Concern was folks that weren't using include lists.. or using exclude lists in getting a potentially heavier collector since the default for all collectors as premise is by default all collectors enabled.

Well almost. Premise is be backward-compatible. That is, if you add a new feature, you now have to explicitly define the include list (instead of "use all") to exclude the new feature. If you had feature a to d in the first version of telegraf and default was "include all" and you now add feature e you have to define include=["a", "b", "c", "d"] as the new default. This way, people not using the include-list will still get what they got before. Everybody else will get what they defined.

My experience has been in my limited experience with telegraf that folks don't necesarily update their conf files often yet update telegraf major versions but then I maybe wrong

No you are right and that's exactly the point. Please craft the include list in a way that you get the same results with the old and new version of telegraf.

@andrewzayats
Copy link
Copy Markdown

@srebhan, the flag was removed. Are we good to merge this pull request?

Copy link
Copy Markdown
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewzayats, @masree I have some more minor comments.

The most important one is to change the documentation (README.md and sampleConfig) such that the include_query list is specified explicitly to make sure the user can uncomment the option without a change in behavior.

Comment thread plugins/inputs/sqlserver/README.md
Comment thread plugins/inputs/sqlserver/README.md
Comment on lines +41 to +42
HasCachedData bool
DataCache *QueryDataCache
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. do we really need those exported?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean under exported? If you need more details about the purpose of DataCache you can get them from the following comment #9150 (comment)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that those struct members are visible and accessible outside of the plugin if starting with an uppercase letter (see https://golang.org/doc/effective_go#names). To prevent (accidental) messing with those members, you should make them start with a lowercase-letter.

Comment thread plugins/inputs/sqlserver/sqlserver.go
rows, err := pool.Query(query.Script)
if query.HasCachedData {
cachedData, _ := query.DataCache.Get(server)
rows, err = pool.Query(query.Script, sql.Named("p1", cachedData))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssoroka can this be marked resolved?


## A list of queries to include. If not specified, all the above listed queries are used.
# include_query = []
# include_query = ["AzureSQLDBResourceStats", "AzureSQLDBResourceGovernance", "AzureSQLDBWaitStats", "AzureSQLDBDatabaseIO", "AzureSQLDBServerProperties", "AzureSQLDBOsWaitstats", "AzureSQLDBMemoryClerks", "AzureSQLDBPerformanceCounters", "AzureSQLDBRequests", "AzureSQLDBSchedulers"]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @andrewzayats, you changed this in the documentation, but not in the code. With the current code, your new query will automagically appear with old configs. Please initialize IncludeQuery as described here.

@reimda reimda removed the request for review from ssoroka September 24, 2021 20:27
@MyaLongmire MyaLongmire added the waiting for response waiting for response from contributor label Sep 28, 2021
@srebhan
Copy link
Copy Markdown
Member

srebhan commented Oct 28, 2021

@masree any update here?

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Oct 28, 2021
@srebhan srebhan added the waiting for response waiting for response from contributor label Oct 28, 2021
@dimitri-furman
Copy link
Copy Markdown

@srebhan unfortunately we will not be able to continue work on this PR in foreseeable future. My impression is that it is close to meeting the quality bar for merging once your comment is addressed. Perhaps this can be finished by other contributors at some point in the future. Apologies, and thanks for understanding.

CC: @masree

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Nov 3, 2021
@sjwang90
Copy link
Copy Markdown
Contributor

sjwang90 commented Nov 4, 2021

Closing this issue. If anyone is interested in picking this up please feel free to re-open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/sqlserver feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants