implemented fetching rules from a remote server for conditional gathering #583

Merged
6 changes: 5 additions & 1 deletion Makefile
@@ -93,9 +93,13 @@ build-debug: ## Compiles the insights operator in debug mode
## Container
## --------------------------------------

.PHONY: build-container
build-container: ## Compiles the insights operator and its container image
$(CONTAINER_RUNTIME) build -t insights-operator -f ./Dockerfile .

.PHONY: build-debug-container
build-debug-container: ## Compiles the insights operator and its container image for debug
-	$(CONTAINER_RUNTIME) build -t insights-operator -f ./Dockerfile.debug .
+	$(CONTAINER_RUNTIME) build -t insights-operator -f ./debug.Dockerfile .
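
For example, assuming the Makefile lets you override `CONTAINER_RUNTIME` from the environment (a sketch, not a verified invocation):

```bash
# Build the regular and debug images; podman is an assumption here,
# docker works the same way if that is your CONTAINER_RUNTIME.
CONTAINER_RUNTIME=podman make build-container
CONTAINER_RUNTIME=podman make build-debug-container
```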

## --------------------------------------
## Tools
61 changes: 32 additions & 29 deletions README.md
@@ -1,7 +1,7 @@
# Insights Operator

This cluster operator gathers anonymized system configuration and reports it to Red Hat Insights. It is a part of the
standard OpenShift distribution. The data collected allows for debugging in the event of cluster failures or
unanticipated errors.

# Table of Contents
@@ -60,7 +60,7 @@

Unit tests can be started by the following command:

```shell script
make test
```

It is also possible to specify CLI options for Go test. For example, if you need to disable test results caching,
use the following command:

```shell script
VERBOSE=-count=1 make test
```
# Documentation


The document [docs/gathered-data](docs/gathered-data.md) contains the list of collected data and the API that is used
to collect it. This documentation is generated by the command below, by collecting the comment tags located above
each Gather method.

To start generating the document run:
```shell script
make docs
```

## Generate the certificate and key

Certificate and key are required to access Prometheus metrics (otherwise 404 Forbidden is returned). It is possible
to generate these two files from a Kubernetes config file. The certificate is stored in `users/admin/client-certificate-data`
and the key in `users/admin/client-key-data`. Please note that these values are Base64-encoded,
so they need to be decoded, for example with `base64 -d`.
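
A minimal sketch of extracting and decoding both files with `oc`, assuming the kubeconfig user entry is named `admin` (the entry name may differ in your config):

```bash
# --raw is needed so the certificate data is not redacted
oc config view --raw -o jsonpath='{.users[?(@.name=="admin")].user.client-certificate-data}' | base64 -d > k8s.crt
oc config view --raw -o jsonpath='{.users[?(@.name=="admin")].user.client-key-data}' | base64 -d > k8s.key
```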

There's a tool named `gen_cert_key.py` that can be used to automatically generate both files. It is stored in the
`tools` subdirectory.

```shell script
gen_cert_key.py kubeconfig.yaml
```

## Prometheus metrics provided by Insights Operator

It is possible to read Prometheus metrics provided by Insights Operator. An example of the metrics exposed by
Insights Operator can be found at [metrics.txt](docs/metrics.txt).

Depending on how or where the IO is running, you may have different ways to retrieve the metrics.
Here is a list of some options, so you can find the one that fits you:

### Running IO locally
@@ -140,22 +140,20 @@

curl --cert k8s.crt --key k8s.key -k 'https://prometheus-k8s.openshift-monitori

## Debugging Prometheus metrics without valid CA

-Get the token
-
-```shell script
-oc sa get-token prometheus-k8s -n openshift-monitoring
-```
-
-Change in `pkg/controller/operator.go` after creating `metricsGatherKubeConfig` (about line #86)
-
-```go
-metricsGatherKubeConfig.Insecure = true
-metricsGatherKubeConfig.BearerToken = "YOUR-TOKEN-HERE"
-# by default CAFile is /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
-metricsGatherKubeConfig.CAFile = ""
-metricsGatherKubeConfig.CAData = []byte{}
-```
+1. Forward the service:
+
+```bash
+sudo kubefwd svc -n openshift-monitoring -d openshift-monitoring.svc -l prometheus=k8s
+```
+
+2. Set the `INSECURE_PROMETHEUS_TOKEN` environment variable:
+
+```bash
+export INSECURE_PROMETHEUS_TOKEN=$(oc sa get-token prometheus-k8s -n openshift-monitoring)
+```
+
+3. Run the operator.
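
With the service forwarded and the token exported, a quick sanity check of the new flow might look like this (a sketch: the host comes from the kubefwd mapping above, and port `9091` is an assumption):

```bash
curl -k -H "Authorization: Bearer $INSECURE_PROMETHEUS_TOKEN" \
    'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query?query=up' | jq '.status'
```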

# Debugging

## Using the profiler
@@ -185,7 +183,7 @@

```shell script
go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
go tool pprof http://localhost:6060/debug/pprof/heap
```

These commands will create a compressed file that can be visualized using a variety of tools, one of them is
the `pprof` tool.

### Analyzing profiling data
@@ -213,7 +211,7 @@

It uses both the local git and GitHub's API to update the file so:

It can be used in two ways:

1. Providing no command line arguments, the script will update the current `CHANGELOG.md` with the latest changes
according to the local git state.

> 🚨 IMPORTANT: It will only work with changelogs created with this script

```shell script
go run cmd/changelog/main.go
```

2. Providing two command line arguments, `AFTER` and `UNTIL` dates, the script will generate a new `CHANGELOG.md`
within the provided time frame.

```shell script
go run cmd/changelog/main.go 2021-01-10 2021-01-20
```
* ClusterOperator objects
* All non-secret global config (hostnames and URLs anonymized)

The list of all collected data with descriptions, locations in the produced archive, links to the API, and some
examples is at [docs/gathered-data.md](docs/gathered-data.md)

The resulting data is packed in a `.tar.gz` archive with the folder structure indicated in the document. An example of
such an archive is at [docs/insights-archive-sample](docs/insights-archive-sample).

## Insights Operator Archive

### Sample IO archive

There is a sample IO archive maintained in this repo to use as a quick reference. It can be found
at [docs/insights-archive-sample](https://github.com/openshift/insights-operator/tree/master/docs/insights-archive-sample).

To keep it up to date, it is **required** to update the sample archive manually when developing a new data enhancement.
@@ -311,8 +309,13 @@

the `managedFields` field when it was removed from the IO archive to save space:
```shell script
./scripts/update_sample_archive.sh <Path of directory with the NEW extracted IO archive> '"managedFields":'
```

The path of the sample archive directory should be constant relative to the path of the script and therefore does not
have to be specified explicitly.

# Conditional Gathering

See [docs/conditional-gatherer/README.md](https://github.com/openshift/insights-operator/blob/master/docs/conditional-gatherer/README.md)


# Contributing

1 change: 1 addition & 0 deletions config/local.yaml
@@ -5,6 +5,7 @@ leaderElection:
interval: "5m"
storagePath: /tmp/insights-operator
endpoint: http://[::1]:8081
+conditionalGathererEndpoint: https://console.redhat.com/api/gathering/gathering_rules
impersonate: system:serviceaccount:openshift-insights:gather
gather:
- ALL
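
To preview the rules the operator will fetch, you can request the endpoint directly (a sketch: the endpoint may require authentication outside a cluster, and the `.rules` field name is an assumption):

```bash
curl -s https://console.redhat.com/api/gathering/gathering_rules | jq '.rules | length'
```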
1 change: 1 addition & 0 deletions config/pod.yaml
@@ -5,6 +5,7 @@ leaderElection:
interval: "2h"
storagePath: /var/lib/insights-operator
endpoint: https://cloud.redhat.com/api/ingress/v1/upload
+conditionalGathererEndpoint: https://console.redhat.com/api/gathering/gathering_rules
impersonate: system:serviceaccount:openshift-insights:gather
pull_report:
endpoint: https://cloud.redhat.com/api/insights-results-aggregator/v1/clusters/%s/report
4 changes: 2 additions & 2 deletions Dockerfile.debug → debug.Dockerfile
@@ -1,10 +1,10 @@
-FROM registry.ci.openshift.org/ocp/builder:rhel-8-golang-1.16-openshift-4.8 AS builder
+FROM registry.ci.openshift.org/ocp/builder:rhel-8-golang-1.17-openshift-4.11 AS builder
RUN go get github.com/go-delve/delve/cmd/dlv
WORKDIR /go/src/github.com/openshift/insights-operator
COPY . .
RUN make build-debug

-FROM registry.ci.openshift.org/ocp/4.8:base
+FROM registry.ci.openshift.org/ocp/4.11:base
COPY --from=builder /go/src/github.com/openshift/insights-operator/bin/insights-operator /usr/bin/
COPY --from=builder /usr/bin/dlv /usr/bin/
COPY config/pod.yaml /etc/insights-operator/server.yaml
123 changes: 123 additions & 0 deletions docs/conditional-gatherer/README.md
@@ -0,0 +1,123 @@
# Conditional Gatherer

The conditional gatherer is a special gatherer that uses a set of rules describing which gathering functions to activate.
More details can be found in `pkg/gatherers/conditional/conditional_gatherer.go`.
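
For illustration, a rule pairs a list of conditions with a map of parametrized gathering functions, along these lines (the exact field names are owned by the rules service, so treat this sketch as an approximation rather than a schema reference):

```json
{
  "version": "1.0",
  "rules": [
    {
      "conditions": [
        {
          "type": "alert_is_firing",
          "alert": { "name": "SamplesImagestreamImportFailing" }
        }
      ],
      "gathering_functions": {
        "image_streams_of_namespace": {
          "namespace": "openshift-cluster-samples-operator"
        }
      }
    }
  ]
}
```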

## Manual Testing

To test that the conditional gatherer provides some data, follow these steps:

1. Downscale CVO:
```bash
oc scale deployment -n openshift-cluster-version cluster-version-operator --replicas=0
```

2. Backup prometheus rules:
```bash
oc get prometheusrule -n openshift-cluster-samples-operator samples-operator-alerts -o json > prometheus-rules.back.json
```

3. Make the `SamplesImagestreamImportFailing` alert fire by setting its
`expr` value to `1 > bool 0` and `for` to `1s`:
```bash
echo '{
"apiVersion": "monitoring.coreos.com/v1",
"kind": "PrometheusRule",
"metadata": {
"name": "samples-operator-alerts",
"namespace": "openshift-cluster-samples-operator"
},
"spec": {
"groups": [
{
"name": "SamplesOperator",
"rules": [
{
"alert": "SamplesImagestreamImportFailing",
"annotations": {
"message": "Always firing"
},
"expr": "1 > bool 0",
"for": "1s",
"labels": {
"severity": "warning"
}
}
]
}
]
}
}' | oc apply -f -
```

4. Wait for the alert to fire:
```bash
export ALERT_MANAGER_HOST=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')
export INSECURE_PROMETHEUS_TOKEN=$(oc sa get-token prometheus-k8s -n openshift-monitoring)
curl -k -H "Authorization: Bearer $INSECURE_PROMETHEUS_TOKEN" https://$ALERT_MANAGER_HOST/api/v1/alerts | \
jq '.data[] | select(.labels.alertname == "SamplesImagestreamImportFailing")'
```

5. Make metrics work by forwarding the endpoint and setting the `INSECURE_PROMETHEUS_TOKEN` environment variable:
```bash
export INSECURE_PROMETHEUS_TOKEN=$(oc sa get-token prometheus-k8s -n openshift-monitoring)
```
```bash
# run this command in a separate terminal
sudo kubefwd svc -n openshift-monitoring -d openshift-monitoring.svc -l prometheus=k8s --kubeconfig $KUBECONFIG
```

6. Run the operator and wait for an archive containing the `conditional/` directory.
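A quick way to check for it, assuming the default local `storagePath` of `/tmp/insights-operator` from `config/local.yaml`:
```bash
# list conditional gathering entries in the newest archive
tar -tzf "$(ls -t /tmp/insights-operator/*.tar.gz | head -n 1)" | grep '^conditional/'
```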

7. Restore the backup:
```bash
oc apply -f prometheus-rules.back.json
```

8. Scale CVO back up:
```bash
oc scale deployment -n openshift-cluster-version cluster-version-operator --replicas=1
```

## Using Locally Started Service

1. Run the service following the instructions at
   [RedHatInsights/insights-operator-gathering-conditions-service](https://github.com/RedHatInsights/insights-operator-gathering-conditions-service)
2. Set `conditionalGathererEndpoint` in `config/local.yaml` to `http://localhost:8081/api/gathering/gathering_rules`
3. Enjoy your conditional rules from the local service
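
You can verify the local service responds before running the operator (a sketch; the `.rules` field name is an assumption):

```bash
curl -s http://localhost:8081/api/gathering/gathering_rules | jq '.rules | length'
```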

## Using a Mock Server

1. Start a mock server:
```bash
git clone https://github.com/RedHatInsights/insights-operator-gathering-conditions.git
cd insights-operator-gathering-conditions/
./build.sh
python3 -m http.server --directory build/
```

2. Set `conditionalGathererEndpoint` in `config/local.yaml` to `http://localhost:8000/rules.json`
3. Enjoy your conditional rules from the mock service
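
As with the local service, a quick check that the mock server is serving rules (same assumptions as above):

```bash
curl -s http://localhost:8000/rules.json | jq '.rules | length'
```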

## Using Stage Endpoint

0. Be connected to the Red Hat network or configure a proxy for the stage version of console.redhat.com
1. Set up the stage endpoint in `config/local.yaml`
2. Configure authentication through the support secret:
```bash
echo '{
"apiVersion": "v1",
"kind": "Secret",
"metadata": {
"namespace": "openshift-config",
"name": "support"
},
"type": "Opaque",
"data": {
"username": "'(echo $STAGE_USERNAME | base64 --wrap=0)'",
"password": "'(echo $STAGE_PASSWORD | base64 --wrap=0)'"
}
}' | oc apply -f -
```
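
To confirm the secret landed with the expected values, decode it back (a sketch):
```bash
oc get secret support -n openshift-config -o jsonpath='{.data.username}' | base64 -d
```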

3. Enjoy your conditional rules from the stage endpoint
22 changes: 12 additions & 10 deletions pkg/authorizer/clusterauthorizer/clusterauthorizer.go
@@ -6,23 +6,20 @@ import (
 	"net/url"
 	"strings"

-	"github.com/openshift/insights-operator/pkg/config"
 	"golang.org/x/net/http/httpproxy"
 	knet "k8s.io/apimachinery/pkg/util/net"
-)
-
-type Configurator interface {
-	Config() *config.Controller
-}
+
+	"github.com/openshift/insights-operator/pkg/config/configobserver"
+)

 type Authorizer struct {
-	configurator Configurator
+	configurator configobserver.Configurator
 	// exposed for tests
 	proxyFromEnvironment func(*http.Request) (*url.URL, error)
 }

 // New creates a new Authorizer, whose purpose is to auth requests for outgoing traffic.
-func New(configurator Configurator) *Authorizer {
+func New(configurator configobserver.Configurator) *Authorizer {
 	return &Authorizer{
 		configurator:         configurator,
 		proxyFromEnvironment: http.ProxyFromEnvironment,
@@ -32,18 +29,23 @@ func New(configurator Configurator) *Authorizer {
 // Authorize adds the necessary auth header to the request, depending on the config. (BasicAuth/Token)
 func (a *Authorizer) Authorize(req *http.Request) error {
 	cfg := a.configurator.Config()
+
+	if req.Header == nil {
+		req.Header = make(http.Header)
+	}
+
 	if len(cfg.Username) > 0 || len(cfg.Password) > 0 {
 		req.SetBasicAuth(cfg.Username, cfg.Password)
 		return nil
 	}
+
 	token, err := a.Token()
 	if err != nil {
 		return err
 	}
-	if req.Header == nil {
-		req.Header = make(http.Header)
-	}
+
 	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", token))
+
 	return nil
 }
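
The reordering above matters because `http.Header` is a map and `Header.Set` panics on a nil map; previously, `SetBasicAuth` could run before the nil check. A minimal standalone sketch (not operator code) of the safe pattern:

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// A request constructed by hand has a nil Header map; http.NewRequest
	// would initialize it, but callers are not guaranteed to use it.
	req := &http.Request{}

	// Initialize the header before any Set call, as Authorize now does;
	// without this, SetBasicAuth would panic on the nil map.
	if req.Header == nil {
		req.Header = make(http.Header)
	}
	req.SetBasicAuth("user", "pass")

	fmt.Println(req.Header.Get("Authorization"))
}
```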
