Kubernetes networking: my frontend cannot reach the backend

I have the following docker-compose file that works fine:

version: '3'
services:
  myfrontend:
    image: myregistry.azurecr.io/im1:latest
    container_name: myfrontend
    ports:
      - 80:80
      - 443:443

  mybackend:
    image: myregistry.azurecr.io/im2:latest
    container_name: mybackend
    expose:
      - 8080

The backend only exposes port 8080 on the internal network. The frontend runs a customized nginx image with the following configuration (and it works, because Docker's embedded DNS resolves the container name to an IP):

server {
    listen 80 default_server;

    location / {
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;

        resolver 127.0.0.11 ipv6=off;

        set $springboot "http://mybackend:8080";
        proxy_pass $springboot;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

I migrated the above configuration to Kubernetes and I get a 502 Bad Gateway error from nginx, I think because it cannot resolve the backend address.

Here's the Kubernetes config; can you take a look and tell me what I'm doing wrong? 😦

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mybackend
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: mybackend
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: mybackend
        image: myregistry.azurecr.io/sgr-mybackend:latest
        ports:
        - containerPort: 8080
          name: mybackend
        resources:
          requests:
            cpu: 250m
          limits:
            cpu: 500m
---
apiVersion: v1
kind: Service
metadata:
  name: mybackend
spec:
  ports:
  - port: 8080
  selector:
    app: mybackend
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: myfrontend
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: myfrontend
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: myfrontend
        image: myregistry.azurecr.io/myfrontend:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: myfrontend
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: myfrontend

Solution:

You need to set your resolver to:

kube-dns.kube-system.svc.cluster.local

that is, the kube-dns name/address in your cluster, because nothing listening on localhost will resolve mybackend to its IP address: 127.0.0.11 is Docker's embedded DNS, which does not exist inside a Kubernetes pod. Note that because proxy_pass uses a variable, nginx consults the resolver directive for it at request time; if you instead write proxy_pass http://mybackend:8080; directly, nginx resolves the name once at startup through the pod's /etc/resolv.conf (which Kubernetes already points at the cluster DNS), and you can drop the resolver setting entirely.
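A minimal sketch of the adjusted location block, assuming the frontend and backend Services live in the same namespace so the bare name mybackend still resolves (the valid=10s re-resolution interval is an illustrative choice, not from the original post):

```nginx
location / {
    # cluster DNS instead of Docker's 127.0.0.11; cache answers for 10s
    resolver kube-dns.kube-system.svc.cluster.local valid=10s;

    set $springboot "http://mybackend:8080";
    proxy_pass $springboot;
}
```

On clusters that run CoreDNS, the Service is generally still named kube-dns in the kube-system namespace, so the same address usually works.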

List all the images in a Docker registry

The command to list all the images in a registry is:

curl http://<IP/Hostname>:<Port>/v2/_catalog | python -mjson.tool

See the versions for a specific image (example: aerospike):

curl http://<IP/Hostname>:<Port>/v2/aerospike/tags/list | python -mjson.tool
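Both endpoints return small JSON documents; here is a sketch of the response shapes and how to pick fields out of them (the repository names are made up):

```python
import json

# Shape of the GET /v2/_catalog response
catalog = json.loads('{"repositories": ["aerospike", "myapp"]}')

# Shape of the GET /v2/<name>/tags/list response
tags = json.loads('{"name": "aerospike", "tags": ["3.14.1", "latest"]}')

print(catalog["repositories"])  # ['aerospike', 'myapp']
print(tags["tags"])             # ['3.14.1', 'latest']
```

On large registries the catalog is paginated; `GET /v2/_catalog?n=100` caps the page size.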

How to access kubernetes dashboard from outside cluster

Edit kubernetes-dashboard service.

$ kubectl -n kube-system edit service kubernetes-dashboard

You should see a YAML representation of the service. Change type: ClusterIP to type: NodePort and save the file. If it's already changed, go to the next step.

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
...
  name: kubernetes-dashboard
  namespace: kube-system
  resourceVersion: "343478"
  selfLink: /api/v1/namespaces/kube-system/services/kubernetes-dashboard-head
  uid: 8e48f478-993d-11e7-87e0-901b0e532516
spec:
  clusterIP: 10.100.124.90
  externalTrafficPolicy: Cluster
  ports:
  - port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
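A non-interactive alternative to editing the Service is kubectl patch; the JSON merge patch it sends is tiny. A sketch of the payload (the kubectl invocation below is an equivalent, not part of the original steps):

```python
import json

# JSON merge patch equivalent to editing `type: ClusterIP` -> `type: NodePort`
patch = {"spec": {"type": "NodePort"}}
payload = json.dumps(patch)
print(payload)  # {"spec": {"type": "NodePort"}}
```

Usage: kubectl -n kube-system patch service kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'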

Next we need to check the port on which the Dashboard was exposed.

$ kubectl -n kube-system get service kubernetes-dashboard
NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.100.124.90   <nodes>       443:31707/TCP   21h
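In the PORT(S) column, 443:31707/TCP means service port 443 mapped to NodePort 31707; pulling the two apart (illustrative only):

```python
# PORT(S) value as printed by `kubectl get service`
port_spec = "443:31707/TCP"

service_port, node_part = port_spec.split(":")
node_port = int(node_part.split("/")[0])

print(service_port, node_port)  # 443 31707
```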

Dashboard has been exposed on port 31707 (HTTPS). You can now access it from your browser at:
https://master-ip:31707

master-ip can be found by executing kubectl cluster-info. Usually it is either 127.0.0.1 or the IP of your machine, assuming your cluster runs directly on the machine on which you execute these commands.

In case you are trying to expose the Dashboard using NodePort on a multi-node cluster, you have to find the IP of the node on which the Dashboard is running. Instead of accessing https://master-ip:nodePort, you should access https://node-ip:nodePort.

If the dashboard is still not accessible, execute the following:

sudo iptables -P FORWARD ACCEPT

Prometheus pod consuming a lot of memory

I can use --storage.local.memory-chunks to limit memory usage with Prometheus 1.x.

Prometheus 2.0 uses the OS page cache for data. It will only use as much memory as it needs to operate. The good news is that memory use is far more efficient than in 1.x. The amount of extra memory needed to collect more data is minimal.

There is some extra memory needed if you run a large number of queries, or queries that require a large amount of data.
You will want to monitor both the memory use of the Prometheus process (process_resident_memory_bytes) and how much page cache the node has left (node_exporter's node_memory_Cached).
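For example, the two metrics mentioned above can be watched with queries like these (PromQL; the job label is an assumption about your scrape config):

```
# Resident memory of the Prometheus server process
process_resident_memory_bytes{job="prometheus"}

# Page cache still available on the node (node_exporter, pre-1.0 metric name)
node_memory_Cached
```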


Prometheus giving error for alert rules ConfigMap in Kubernetes

When creating a ConfigMap for Prometheus alert rules, it gives an error as follows:
"rule manager" msg="loading groups failed" err="yaml: unmarshal errors:\n line 3: field rules not found in type rulefmt.RuleGroups"

The correct format for the rules to be added is as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-alert-rules
data:
  alert.rules: |-
    groups:
    - name: example
      rules:
      - alert: Lots_Of_Billing_Jobs_In_Queue
        expr: sum (container_memory_working_set_bytes{id="/",kubernetes_io_hostname=~"(.*)"}) / sum (machine_memory_bytes{kubernetes_io_hostname=~"(.*)"}) * 100 > 40
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: container memory high

Kubernetes pod does not get deleted

So I have a pod that died; the ReplicaSet and Deployment are gone, and even when I delete it, it's still there.

The solution is to forcefully delete it using:

kubectl delete pod <pod> --force --grace-period=0

Configure Prometheus with Kubernetes

Quick start

To quickly start all the things just do this:

kubectl apply \
  --filename https://raw.githubusercontent.com/giantswarm/kubernetes-prometheus/master/manifests-all.yaml

This will create the namespace monitoring and bring up all components in there.

To shut down all components again you can just delete that namespace:

kubectl delete namespace monitoring

Default Dashboards

If you want to re-import the default dashboards from this setup run this job:

kubectl apply --filename ./manifests/grafana/grafana-import-dashboards-job.yaml

In case the job already exists from an earlier run, delete it first:

kubectl --namespace monitoring delete job grafana-import-dashboards

More Dashboards

See grafana.net for some example dashboards and plugins.

  • Configure Prometheus data source for Grafana.
    Grafana UI / Data Sources / Add data source

    • Name: prometheus
    • Type: Prometheus
    • Url: http://prometheus:9090
    • Add
  • Import Prometheus Stats:
    Grafana UI / Dashboards / Import

    • Grafana.net Dashboard: https://grafana.net/dashboards/2
    • Load
    • Prometheus: prometheus
    • Save & Open
  • Import Kubernetes cluster monitoring:
    Grafana UI / Dashboards / Import

    • Grafana.net Dashboard: https://grafana.net/dashboards/162
    • Load
    • Prometheus: prometheus
    • Save & Open

To add a new graph: Grafana UI -> Dashboards -> New.

Select Graph -> click on "Panel title" -> Edit. Then, in the Metrics section, add the following in the query box: sum by (status) (irate(kubelet_docker_operations[5m]))

Select source as Prometheus.

You will see graph lines appearing. This is how you can add graphs in Grafana.

Jenkins pipeline: No signature of method: java.util.Collections$UnmodifiableMap.$()

I'm using the Kubernetes Jenkins plugin to create Jenkins slaves on demand. The slaves' job is to deploy and provision my apps on the Kubernetes cluster.

I created a pipeline project and wrote a very simple Jenkinsfile:

podTemplate(label: 'jenkins-pipeline', containers: [
containerTemplate(name: 'jnlp', image: 'lachlanevenson/jnlp-slave:3.10-1-alpine', args: '${computer.jnlpmac} ${computer.name}', workingDir: '/home/jenkins', resourceRequestCpu: '200m', resourceLimitCpu: '300m', resourceRequestMemory: '256Mi', resourceLimitMemory: '512Mi'),
containerTemplate(name: 'helm', image: 'lachlanevenson/k8s-helm:v2.6.0', command: 'cat', ttyEnabled: true),
containerTemplate(name: 'kubectl', image: 'lachlanevenson/k8s-kubectl:v1.4.8', command: 'cat', ttyEnabled: true),
containerTemplate(name: 'curl', image: 'appropriate/curl:latest', command: 'cat', ttyEnabled: true)
],
volumes: [
    hostPathVolume(mountPath: '/var/run/docker.sock', hostPath: '/var/run/docker.sock'),
]) {

node ('jenkins-pipeline') {

def pwd = pwd()
def chart_dir = "${pwd}/chart"

checkout([$class: 'SubversionSCM', additionalCredentials: [], excludedCommitMessages: '', excludedRegions: '', excludedRevprop: '', excludedUsers: '', filterChangelog: false, ignoreDirPropChanges: false, includedRegions: '', locations: [[credentialsId: '4041436e-e9dc-4060-95d5-b28be47b1a14', depthOption: 'infinity', ignoreExternalsOption: true, local: '.', remote: 'https://svn.project.com/repo/trunk/RnD/dev/server/src/my-app']], workspaceUpdater: [$class: 'CheckoutUpdater']])

stage ('deploy canary to k8s') {
  container('helm') {
    def version = params.${VERSION}
    def environment = params.${ENVIRONMENT}
    // Deploy using Helm chart

    sh "helm upgrade --install ${version} ${chart_dir} --set imageTag=${version},replicas=1,environment=${environment} --namespace=dev"  

      }
    }
  }
}

The Jenkins slave spins up on Kubernetes but the job fails with this stack trace:

[Pipeline] stage
[Pipeline] { (deploy canary to k8s)
[Pipeline] container
[Pipeline] {
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] End of Pipeline
hudson.remoting.ProxyException: groovy.lang.MissingMethodException: No signature of method: java.util.Collections$UnmodifiableMap.$() is applicable for argument types: (org.jenkinsci.plugins.workflow.cps.CpsClosure2) values: [org.jenkinsci.plugins.workflow.cps.CpsClosure2@7d7d26fa]
Possible solutions: is(java.lang.Object), any(), get(java.lang.Object), any(groovy.lang.Closure), max(groovy.lang.Closure), min(groovy.lang.Closure)
    at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:58)
    at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:49)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:18)
    at WorkflowScript.run(WorkflowScript:20)
    at ___cps.transform___(Native Method)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.GeneratedMethodAccessor512.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ClosureBlock.eval(ClosureBlock.java:46)
    at com.cloudbees.groovy.cps.Next.step(Next.java:74)
    at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:154)
    at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:165)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:328)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:80)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:240)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:228)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE

I understand that the error comes from a type mismatch, but I'm having a hard time understanding which part of the Jenkinsfile it comes from and what I should do about it.

Can anyone please help me?

Solution:

The problem is these two lines:

def version = params.${VERSION}
def environment = params.${ENVIRONMENT}

Groovy parses params.${...} as a call to a method named $ on the params map, with a closure as its argument; that is exactly the No signature of method: java.util.Collections$UnmodifiableMap.$() error in the stack trace. They should be:

def version = params."${VERSION}"
def environment = params."${ENVIRONMENT}"

How to install Prometheus with ingress enabled on AWS with Route 53?

For example, my Route 53 hosted zone is myzone.com. I created a Kubernetes cluster with kops, with the full cluster name earth.myzone.com.

I tried to install Prometheus this way:

helm install prometheus \
  --set alertmanager.ingress.enabled=true \
  --set alertmanager.ingress.hosts=[alertmanager.earth.myzone.com] \
  --set pushgateway.ingress.enabled=true \
  --set pushgateway.ingress.hosts=[pushgateway.earth.myzone.com] \
  --set server.ingress.enabled=true \
  --set server.ingress.hosts=[server.earth.myzone.com]

Got error:

zsh: no matches found: alertmanager.ingress.hosts=[alertmanager.earth.myzone.com]

Or should I name the subdomains directly under myzone.com?

helm install prometheus \
  --set alertmanager.ingress.enabled=true \
  --set alertmanager.ingress.hosts=[alertmanager.myzone.com] \
  --set pushgateway.ingress.enabled=true \
  --set pushgateway.ingress.hosts=[pushgateway.myzone.com] \
  --set server.ingress.enabled=true \
  --set server.ingress.hosts=[server.myzone.com]

Also the same error.

If I deploy an application with Deployment and Service manifests plus an ELB, I need to create a DNS record first, e.g. with aws route53 change-resource-record-sets .... The URL then looks like:

app.earth.myzone.com

But if I only want to deploy Prometheus, how do I do this?


Edit

I ran it again using @fiunchinho's method, and it completed successfully:

$ helm install prometheus \
>   --set alertmanager.ingress.enabled=true \
>   --set "alertmanager.ingress.hosts={alertmanager.earth.myzone.com}" \
>   --set pushgateway.ingress.enabled=true \
>   --set "pushgateway.ingress.hosts={pushgateway.earth.myzone.com}" \
>   --set server.ingress.enabled=true \
>   --set "server.ingress.hosts={server.earth.myzone.com}"
NAME:   auxilliary-pronghorn
E0129 01:41:06.224401   15782 portforward.go:303] error copying from remote stream to local connection: readfrom tcp4 127.0.0.1:42993->127.0.0.1:55840: write tcp4 127.0.0.1:42993->127.0.0.1:55840: write: broken pipe
LAST DEPLOYED: Mon Jan 29 01:41:05 2018
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME                                                TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)   AGE
auxilliary-pronghorn-prometheus-alertmanager        ClusterIP  100.68.246.60   <none>       80/TCP    1s
auxilliary-pronghorn-prometheus-kube-state-metrics  ClusterIP  None            <none>       80/TCP    1s
auxilliary-pronghorn-prometheus-node-exporter       ClusterIP  None            <none>       9100/TCP  1s
auxilliary-pronghorn-prometheus-pushgateway         ClusterIP  100.69.211.226  <none>       9091/TCP  1s
auxilliary-pronghorn-prometheus-server              ClusterIP  100.71.5.220    <none>       80/TCP    1s

==> v1beta1/DaemonSet
NAME                                           DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
auxilliary-pronghorn-prometheus-node-exporter  2        2        0      2           0          <none>         0s

==> v1beta1/Deployment
NAME                                                DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
auxilliary-pronghorn-prometheus-alertmanager        1        1        1           0          0s
auxilliary-pronghorn-prometheus-kube-state-metrics  1        1        1           0          0s
auxilliary-pronghorn-prometheus-pushgateway         1        1        1           0          0s
auxilliary-pronghorn-prometheus-server              1        1        1           0          0s

==> v1beta1/Ingress
NAME                                          HOSTS                                   ADDRESS  PORTS  AGE
auxilliary-pronghorn-prometheus-alertmanager  alertmanager.earth.myzone.com  80       0s
auxilliary-pronghorn-prometheus-pushgateway   pushgateway.earth.myzone.com   80       0s
auxilliary-pronghorn-prometheus-server        server.earth.myzone.com        80       0s

==> v1/Pod(related)
NAME                                                             READY  STATUS             RESTARTS  AGE
auxilliary-pronghorn-prometheus-node-exporter-kjp25              0/1    ContainerCreating  0         0s
auxilliary-pronghorn-prometheus-node-exporter-r2sfn              0/1    ContainerCreating  0         0s
auxilliary-pronghorn-prometheus-alertmanager-684bb4bf8d-lq5z9    0/2    Pending            0         0s
auxilliary-pronghorn-prometheus-kube-state-metrics-69478d6lwdpq  0/1    ContainerCreating  0         0s
auxilliary-pronghorn-prometheus-pushgateway-6f97d7bc4d-jvj2c     0/1    ContainerCreating  0         0s
auxilliary-pronghorn-prometheus-server-65974d66bc-876rt          0/2    Pending            0         0s

==> v1/ConfigMap
NAME                                          DATA  AGE
auxilliary-pronghorn-prometheus-alertmanager  1     1s
auxilliary-pronghorn-prometheus-server        3     1s

==> v1/PersistentVolumeClaim
NAME                                          STATUS   VOLUME  CAPACITY  ACCESS MODES  STORAGECLASS  AGE
auxilliary-pronghorn-prometheus-alertmanager  Pending  gp2     1s
auxilliary-pronghorn-prometheus-server        Pending  gp2     1s

NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
auxilliary-pronghorn-prometheus-server.default.svc.cluster.local

From outside the cluster, the server URL(s) are:
http://server.earth.myzone.com


The Prometheus alertmanager can be accessed via port 80 on the following DNS name from within your cluster:
auxilliary-pronghorn-prometheus-alertmanager.default.svc.cluster.local

From outside the cluster, the alertmanager URL(s) are:
http://alertmanager.earth.myzone.com


The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
auxilliary-pronghorn-prometheus-pushgateway.default.svc.cluster.local

From outside the cluster, the pushgateway URL(s) are:
http://pushgateway.earth.myzone.com

For more information on running Prometheus, visit:
https://prometheus.io/

(I changed my real domain to a fake one here)

But when I tried to access the three services, none of them could be reached. I don't know why. How can I debug this or find the reason?

Solution:

This error

zsh: no matches found:
alertmanager.ingress.hosts=[alertmanager.earth.myzone.com]

is your shell, zsh, complaining because it interprets the square brackets as a glob pattern and finds no files matching it. Use quotes to avoid that. Furthermore, Helm expects curly braces for lists.

helm install prometheus \
  --set alertmanager.ingress.enabled=true \
  --set "alertmanager.ingress.hosts={alertmanager.earth.myzone.com}" \
  --set pushgateway.ingress.enabled=true \
  --set "pushgateway.ingress.hosts={pushgateway.earth.myzone.com}" \
  --set server.ingress.enabled=true \
  --set "server.ingress.hosts={server.earth.myzone.com}"

Can't access Prometheus from a public IP on AWS

I used kops to install a k8s cluster on AWS.

Then installed Prometheus with Helm:

$ helm install stable/prometheus \
    --set server.persistentVolume.enabled=false \
    --set alertmanager.persistentVolume.enabled=false

Then followed the chart's notes to set up a port-forward:

Get the Prometheus server URL by running these commands in the same shell:
  export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
  kubectl --namespace default port-forward $POD_NAME 9090

My EC2 instance's public IP on AWS is 12.29.43.14 (not the real one). When I tried to access it from a browser:

http://12.29.43.14:9090

I can't access the page. Why?


Another issue: after installing the prometheus chart, the alertmanager pod didn't run:

ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4       1/2       CrashLoopBackOff   1          9s
ungaged-woodpecker-prometheus-kube-state-metrics-5fd97698cktsj5   1/1       Running            0          9s
ungaged-woodpecker-prometheus-node-exporter-45jtn                 1/1       Running            0          9s
ungaged-woodpecker-prometheus-node-exporter-ztj9w                 1/1       Running            0          9s
ungaged-woodpecker-prometheus-pushgateway-57b67c7575-c868b        0/1       Running            0          9s
ungaged-woodpecker-prometheus-server-7f858db57-w5h2j              1/2       Running            0          9s

Check pod details:

$ kubectl describe po ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Name:           ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Namespace:      default
Node:           ip-100.200.0.1.ap-northeast-1.compute.internal/100.200.0.1
Start Time:     Fri, 26 Jan 2018 02:45:10 +0000
Labels:         app=prometheus
                component=alertmanager
                pod-template-hash=2959465499
                release=ungaged-woodpecker
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff","uid":"ec...
                kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container prometheus-alertmanager; cpu request for container prometheus-alertmanager-configmap-reload
Status:         Running
IP:             100.96.6.91
Created By:     ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Controlled By:  ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Containers:
  prometheus-alertmanager:
    Container ID:  docker://e9fe9d7bd4f78354f2c072d426fa935d955e0d6748c4ab67ebdb84b51b32d720
    Image:         prom/alertmanager:v0.9.1
    Image ID:      docker-pullable://prom/alertmanager@sha256:ed926b227327eecfa61a9703702c9b16fc7fe95b69e22baa656d93cfbe098320
    Port:          9093/TCP
    Args:
      --config.file=/etc/config/alertmanager.yml
      --storage.path=/data
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 26 Jan 2018 02:45:26 +0000
      Finished:     Fri, 26 Jan 2018 02:45:26 +0000
    Ready:          False
    Restart Count:  2
    Requests:
      cpu:        100m
    Readiness:    http-get http://:9093/%23/status delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
  prometheus-alertmanager-configmap-reload:
    Container ID:  docker://9320a0f157aeee7c3947027667aa6a2e00728d7156520c19daec7f59c1bf6534
    Image:         jimmidyson/configmap-reload:v0.1
    Image ID:      docker-pullable://jimmidyson/configmap-reload@sha256:2d40c2eaa6f435b2511d0cfc5f6c0a681eeb2eaa455a5d5ac25f88ce5139986e
    Port:          <none>
    Args:
      --volume-dir=/etc/config
      --webhook-url=http://localhost:9093/-/reload
    State:          Running
      Started:      Fri, 26 Jan 2018 02:45:11 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /etc/config from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ungaged-woodpecker-prometheus-alertmanager
    Optional:  false
  storage-volume:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-wppzm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wppzm
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From                                                      Message
  ----     ------                 ----               ----                                                      -------
  Normal   Scheduled              34s                default-scheduler                                         Successfully assigned ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 to ip-100.200.0.1.ap-northeast-1.compute.internal
  Normal   SuccessfulMountVolume  34s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  MountVolume.SetUp succeeded for volume "storage-volume"
  Normal   SuccessfulMountVolume  34s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  MountVolume.SetUp succeeded for volume "config-volume"
  Normal   SuccessfulMountVolume  34s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  MountVolume.SetUp succeeded for volume "default-token-wppzm"
  Normal   Pulled                 33s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Container image "jimmidyson/configmap-reload:v0.1" already present on machine
  Normal   Created                33s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Created container
  Normal   Started                33s                kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Started container
  Normal   Pulled                 18s (x3 over 34s)  kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Container image "prom/alertmanager:v0.9.1" already present on machine
  Normal   Created                18s (x3 over 34s)  kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Created container
  Normal   Started                18s (x3 over 33s)  kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Started container
  Warning  BackOff                2s (x4 over 32s)   kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Back-off restarting failed container
  Warning  FailedSync             2s (x4 over 32s)   kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal  Error syncing pod

Not sure why it FailedSync.

Solution:

When you do a kubectl port-forward with that command, it makes the port available on your localhost. So run the command and then hit http://localhost:9090.
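To confirm the forward is actually listening locally, a quick TCP probe helps (an illustrative helper, not part of the original answer):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# While `kubectl port-forward $POD_NAME 9090` is running:
# is_port_open("127.0.0.1", 9090)  -> should be True
```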

You won't be able to hit the Prometheus ports directly from the public IP, outside the cluster. In the longer run you may want to expose Prometheus at a nice domain name via Ingress (which the chart supports); that's how I'd do it. To use the chart's ingress support you will need to install an ingress controller in your cluster (like the nginx ingress controller, for example), and then enable ingress by setting --set server.ingress.enabled=true and --set server.ingress.hosts[0]=prometheus.yourdomain.com. Ingress is a fairly large topic in itself, so I'll just refer you to the official docs for that one:

https://kubernetes.io/docs/concepts/services-networking/ingress/

And here’s the nginx ingress controller:

https://github.com/kubernetes/ingress-nginx

As far as the pod that is showing FailedSync, take a look at the logs using kubectl logs ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 to see if there’s any additional information there.