Add oplog metricset to mongodb module#7604
Conversation
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
| used := event["size"].(common.MapStr)["used"].(int64) | ||
| assert.True(t, used > 0) | ||
|
|
||
| first_ts := event["first"].(common.MapStr)["ts"].(int64) |
There was a problem hiding this comment.
don't use underscores in Go names; var first_ts should be firstTs
|
|
||
| // get first and last items in the oplog | ||
| oplog_iter := collection.Find(nil).Sort("$natural").Iter() | ||
| oplog_reverse_iter := collection.Find(nil).Sort("-$natural").Iter() |
There was a problem hiding this comment.
don't use underscores in Go names; var oplog_reverse_iter should be oplogReverseIter
| used := int64(oplogStatus["size"].(float64)) | ||
|
|
||
| // get first and last items in the oplog | ||
| oplog_iter := collection.Find(nil).Sort("$natural").Iter() |
There was a problem hiding this comment.
don't use underscores in Go names; var oplog_iter should be oplogIter
| return false | ||
| } | ||
|
|
||
| func New(base mb.BaseMetricSet) (mb.MetricSet, error) { |
There was a problem hiding this comment.
exported function New should have comment or be unexported
| mb.DefaultMetricSet()) | ||
| } | ||
|
|
||
| type MetricSet struct { |
There was a problem hiding this comment.
exported type MetricSet should have comment or be unexported
| "gopkg.in/mgo.v2/bson" | ||
| ) | ||
|
|
||
| const oplog_col = "oplog.rs" |
There was a problem hiding this comment.
don't use underscores in Go names; const oplog_col should be oplogCol
|
jenkins test this |
|
Could you add a changelog entry? |
jsoriano
left a comment
There was a problem hiding this comment.
Thanks for working on this! :) It looks quite good, I have added some comments, the only serious thing is the failing test.
| var debugf = logp.MakeDebug("mongodb.oplog") | ||
|
|
||
| func init() { | ||
| logp.Info("initializing oplog") |
There was a problem hiding this comment.
We don't use to log initializations.
metricbeat/docs/fields.asciidoc
Outdated
|
|
||
| -- | ||
|
|
||
| *`mongodb.oplog.last.ts`*:: |
There was a problem hiding this comment.
I wouldn't abbreviate timestamps field names, what about first.time or first.timestamp?
| } | ||
|
|
||
| firstTs := int64(first.(bson.M)["ts"].(bson.MongoTimestamp)) | ||
| lastTs := int64(last.(bson.M)["ts"].(bson.MongoTimestamp)) |
There was a problem hiding this comment.
Add checks for type conversions here
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, either express or implied. See the License for the | ||
| // specific language governing permissions and limitations | ||
| // under the License. |
There was a problem hiding this comment.
Add this selector so this is executed only when running integration tests:
// +build integration
I think this is the reason why CI builds are failing.
| // New creates a new instance of the MetricSet | ||
| // Part of new is also setting up the configuration by processing additional | ||
| // configuration entries if needed. | ||
| func New(base mb.BaseMetricSet) (mb.MetricSet, error) { |
There was a problem hiding this comment.
Oh, add also here the experimental warning, something like:
cfgwarn.Experimental("The mongodb oplog metricset is experimental.")
|
@jsoriano Thanks for your comments. I've fixed these issues and pushed them to my branch. |
|
@a3dho3yn unfortunately I cannot see the fixes, could you check that you pushed to the branch used for this PR? |
|
@jsoriano We just finished our work but I have a problem with the integration tests. But tests are failing in the CI due to |
|
I have been trying and the tests fail if they are run just after starting the container and it passes if the container was already started beforehand (or in a previous execution), so this is something that can be probably solved by improving the healthcheck, that currently only checks if the port is open. On the other hand, I have also seen that tests only fail in the |
|
Optimes should be calculated from primray's view. That's why we used the
strong mode.
When there is one node and we initiate the replica set, this member should
become primary and there should be no problem with forcing commands to run
against primary.
May be initiating takes time and we should wait for it. I will check this.
Thanks for your insights:)
…On Tue, Aug 14, 2018 at 7:53 PM Jaime Soriano Pastor < ***@***.***> wrote:
I have been trying and the tests fails if the test is run just after start
the container and it passes if the container was already started
beforehand, so this is something that can be probably solved by improving
the healthcheck, that currently only checks if the port is open.
On the other hand, I have also seen that tests only fail in the replstatus
metricset, the only one setting the session mode to strong with mongoSession.SetMode(mgo.Strong,
true). I wonder if it this is really needed. If this line is removed,
tests pass too.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#7604 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AFk_xLFKaNQkjdHhLmwSLdOXr9RHBhakks5uQutagaJpZM4VQKYb>
.
|
|
jenkins, test this please |
jsoriano
left a comment
There was a problem hiding this comment.
It is looking good, I see it being merged soon 🙂
Only some small comments left.
| @@ -0,0 +1,30 @@ | |||
| { | |||
| func init() { | ||
| mb.Registry.MustAddMetricSet("mongodb", "replstatus", New, | ||
| mb.WithHostParser(mongodb.ParseURL), | ||
| mb.DefaultMetricSet()) |
There was a problem hiding this comment.
If this requires replicaset to work maybe it'd be better to make it non-default.
| myState, ok := status["myState"].(int) | ||
| t.Logf("Mongodb state is %d", myState) | ||
| if ok && myState == 1 { | ||
| time.Sleep(5 * time.Second) // hack, wait more for replica set to become stable |
There was a problem hiding this comment.
Is there any way to detect this stability? 🙂
We can leave it by now with the sleep in any case and revisit it later.
| if ok && myState == 1 { | ||
| time.Sleep(5 * time.Second) // hack, wait more for replica set to become stable | ||
| break | ||
| } |
There was a problem hiding this comment.
Could we add a sleep after every retry?
There was a problem hiding this comment.
Yes we can, but I see no reason to do this.
As you mentioned before, we should wait for some condition instead of sleeping. First I thought we should wait for a primary node, and I expected this condition to be sufficient for running the test. But then -in action- I figured out it needs something more than a node in the primary state. As I didn't find any condition to wait for, I used this sleep expresion.
If you're worried about blowing up CPU with this loop, I should note that state changes very fast (like 5 3 2 2 1) and it doesn't seem to be an issue.
There was a problem hiding this comment.
Ok, let's leave it like this by now.
|
jenkins, test this |
Add metrics about replication health. Co-authored-by: Hossein Taleghani <a3dho3yn@users.noreply.github.com> Co-authored-by: Bahar Taghavi <bahareh.t@fanap.plus> (cherry picked from commit ca8f56b)
Oplog size and window are two important metrics which show replication health.
With this metric set, we can have this information about replication:
{"mongodb": { "oplog": { "size": { "allocated": 2605587456, "used": 2616684138 }, "first": { "ts": 6515806468564845000 }, "last": { "ts": 6578335797915681000 }, "window": 62529329350836220 } }}