Skip to content

docker metrics (read metrics from cgroups for specified container)#8886

Closed
monsterzz wants to merge 1 commit intomoby:masterfrom
monsterzz:8842-docker-metrics
Closed

docker metrics (read metrics from cgroups for specified container)#8886
monsterzz wants to merge 1 commit intomoby:masterfrom
monsterzz:8842-docker-metrics

Conversation

@monsterzz
Copy link
Contributor

Fixes #8842.

This is first and very simple implementation of docker metrics command. A lot of work needed to push it to production.

Roadmap:

  • make find of cgroup paths more reliable (not just try to search predefined locations, but get exact value from procfs)
  • specify set of metrics (from cpuacct, memory, blkio, ... controllers)
  • store metrics in struct instead of map[string]string (typed data)
  • -f FORMAT support
  • documentation

PS. PR created for further discussion and this branch will be constantly updated.

/cc @shykes @tobegit3hub @thaJeztah

Signed-off-by: Gleb M Borisov <borisov.gleb@gmail.com>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use json.Unmarshal and a struct or map for this, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's copy-paste from CmdInspect. I've marked both functions with TODO (will
extract common things and make their code less ugly).

Thanks for the tip, just a first week with Go :)
On Fri 31 Oct 2014 at 10:22 pm Erik Hollensbe notifications@github.com
wrote:

In api/client/commands.go:

@@ -812,6 +812,46 @@ func (cli *DockerCli) CmdInspect(args ...string) error {
return nil
}

+func (cli *DockerCli) CmdMetrics(args ...string) error {

  • cmd := cli.Subcmd("metrics", "CONTAINER", "Return runtime information on a container")
  • if err := cmd.Parse(args); err != nil {
  •   return nil
    
  • }
  • if cmd.NArg() < 1 {
  •   cmd.Usage()
    
  •   return nil
    
  • }
  • indented := new(bytes.Buffer)

use json.Unmarshal and a struct or map for this, please.


Reply to this email directly or view it on GitHub
https://github.com/docker/docker/pull/8886/files#r19688050.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. thanks. just let us know when it's ready for review again.

@SvenDowideit
Copy link
Contributor

needs documentation in cli.md, a man page (in docs/man) and in runmetrics.md

@jeremyeder
Copy link

Hi @crosbymichael and @monsterzz this is along the lines of what we're doing with nsinit at the moment. Nice to see it potentially in docker itself. This patch currently does not work on RHEL-like systems, and I think @monsterzz eluded to the cgroup path config needing some work, which is fine. I didn't look into it deeply but hope that can be resolved.

    "Metrics": {
        "CpuUsage": "Error: cgroup subsystem 'cpuacct' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryLimit": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryMaxUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'"
    },

I would expect a monitoring system will want a global "watch" on all containers, existing and new. Do you agree ? Basically docker would continuously pump all it's stats to a websocket at a certain interval, and the monitoring system would consume at it's own pace.

You could have i.e. " docker metrics --all" for the cli.

Agreed the stat calculation interval should be tunable; if I could suggest you default to 10s rather than 1s. 1s stat gathering on busy, dense nodes is a costly burden we should reserve for field-debug/troubleshooting. Even 10s might be too aggressive depending on the business.

The other thing is the patch itself decides on arbitrary names that closely resemble their cgroup counterparts. Why not get rid of all ambiguity and use the precise names used by the kernel ?

@vishh
Copy link
Contributor

vishh commented Nov 13, 2014

I wonder if it might be better to fork a separate process from the docker
daemon that handles stats acquisition and processing. This new process can
share code with docker daemon and can remain an internal component. This
might help scale docker daemon in the long run. We can continue to have a
single API and have the docker daemon be the source of metrics. WDYT
@crosbymichael?

On Thu, Nov 13, 2014 at 12:53 PM, Jeremy Eder notifications@github.com
wrote:

Hi @crosbymichael https://github.com/crosbymichael and @monsterzz
https://github.com/monsterzz this is along the lines of what we're
doing with nsinit at the moment. Nice to see it potentially in docker
itself. This patch currently does not work on RHEL-like systems, and I
think @monsterzz https://github.com/monsterzz eluded to the cgroup path
config needing some work, which is fine. I didn't look into it deeply but
hope that can be resolved.

"Metrics": {
    "CpuUsage": "Error: cgroup subsystem 'cpuacct' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryLimit": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryMaxUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'"
},

I would expect a monitoring system will want a global "watch" on all
containers, existing and new. Do you agree ? Basically docker would
continuously pump all it's stats to a websocket at a certain interval, and
the monitoring system would consume at it's own pace.

You could have i.e. " docker metrics --all" for the cli.

Agreed the stat calculation interval should be tunable; if I could suggest
you default to 10s rather than 1s. 1s stat gathering on busy, dense nodes
is a costly burden we should reserve for field-debug/troubleshooting. Even
10s might be too aggressive depending on the business.

The other thing is the patch itself decides on arbitrary names that
closely resemble their cgroup counterparts. Why not get rid of all
ambiguity and use the precise names used by the kernel ?


Reply to this email directly or view it on GitHub
#8886 (comment).

@crosbymichael
Copy link
Contributor

@vishh do you not think the Go scheduler could handle this? As long as this is async I think go should be able to handle the load within one process. We can always break this out into another process later if we find that this is not the case and we are having performance issues.

@tobegit3hub
Copy link

We really need this. Can anyone help to review and merge it 😃

@monsterzz
Copy link
Contributor Author

Unfortunately I have no time to implement push approach at this time. I
think I will have more free time to work on this after Christmas.
On Wed 17 Dec 2014 at 1:02 pm tobe notifications@github.com wrote:

We really need this. Can anyone help to review and merge it [image:
😃]


Reply to this email directly or view it on GitHub
#8886 (comment).

@crosbymichael
Copy link
Contributor

@monsterzz no problem. I'll take care of finishing this feature.

Closing this in favor of #9984

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Get metrics from docker daemon

10 participants