Provisioning a Datasource with a configuration file


The contents of datasource.yml are as follows:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    # Access mode - proxy (server in the UI) or direct (browser in the UI).
    access: proxy
    url: http://<Prometheus IP>:9090
    #url: http://prometheus:9091
    jsonData:
      httpMethod: POST
      exemplarTraceIdDestinations:
        # Field with internal link pointing to data source in Grafana.
        # datasourceUid can be any value, but it must be globally unique.
        # This is also the value referenced in the dashboards.
        - datasourceUid: PBFA97CFB590B2093
          name: traceID
        # Field with external link.
        - name: traceID
          url: 'http://localhost:3000/explore?orgId=1&left=%5B%22now-1h%22,%22now%22,%22Jaeger%22,%7B%22query%22:%22$${__value.raw}%22%7D%5D'
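The external link in the file above carries a URL-encoded JSON array in its `left` parameter. The following is a minimal Python sketch of how such a link could be assembled; the function name `explore_link` and the placeholder string are hypothetical, and it assumes that `$$` is how a literal `$` is escaped in the provisioning file so that `${__value.raw}` is not treated as an environment variable.

```python
import json
from urllib.parse import quote


def explore_link(base_url: str, datasource_name: str,
                 time_from: str = "now-1h", time_to: str = "now") -> str:
    """Build an Explore URL like the one used for the external traceID link."""
    # Hypothetical marker that survives URL encoding unchanged.
    placeholder = "TRACE_ID_PLACEHOLDER"
    left = json.dumps(
        [time_from, time_to, datasource_name, {"query": placeholder}],
        separators=(",", ":"),
    )
    # Keep "," and ":" readable; brackets, braces and quotes are percent-encoded.
    encoded = quote(left, safe=",:")
    # Assumption: "$$" escapes "$" in the provisioning file.
    encoded = encoded.replace(placeholder, "$${__value.raw}")
    return f"{base_url}/explore?orgId=1&left={encoded}"


print(explore_link("http://localhost:3000", "Jaeger"))
```

Decoding the `left` value the other way (for example with `urllib.parse.unquote`) is a quick way to check which datasource name and time range an existing link points at.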

Provisioning a Dashboard with a configuration file

In the JSON file downloaded from the official site, change the datasource under annotations from

  "annotations": {"list": [{..."datasource": "-- Grafana --",

to

  "annotations": {"list": [{..."datasource": {"type": "datasource","uid": "grafana"},

This can be done with sed:

sed -ri 's/"-- Grafana --",/{\n          "type": "datasource",\n          "uid": "grafana"\n        },/g' nodeExporter.json

Replace "${DS_PROMETHEUS}" with a reference to the uid configured in the Datasource, assumed here to be PBFA97CFB590B2093:

{"type": "prometheus","uid": "PBFA97CFB590B2093"},

sed -i 's#"\${DS_PROMETHEUS}",#{\n          "type": "prometheus",\n          "uid": "PBFA97CFB590B2093"\n        },#g' blackbox.json
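The sed substitutions depend on the exact spacing and trailing commas in the downloaded file. As an alternative (not the approach used above), the same rewrite can be sketched in Python by parsing the JSON and replacing every string-valued "datasource" field. The names `rewrite_datasources`, `rewrite_file` and `MAPPING` are hypothetical, and the uid is the assumed value from this article.

```python
import json

# Old string values mapped to the datasource objects that replace them.
MAPPING = {
    "-- Grafana --": {"type": "datasource", "uid": "grafana"},
    "${DS_PROMETHEUS}": {"type": "prometheus", "uid": "PBFA97CFB590B2093"},
}


def rewrite_datasources(node, mapping):
    """Return a copy of node with matching "datasource" strings replaced."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "datasource" and isinstance(value, str) and value in mapping:
                result[key] = dict(mapping[value])  # copy so entries stay independent
            else:
                result[key] = rewrite_datasources(value, mapping)
        return result
    if isinstance(node, list):
        return [rewrite_datasources(item, mapping) for item in node]
    return node


def rewrite_file(path, mapping=MAPPING):
    """Rewrite a dashboard JSON file in place."""
    with open(path, encoding="utf-8") as handle:
        dashboard = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(rewrite_datasources(dashboard, mapping), handle,
                  indent=2, ensure_ascii=False)
```

Calling `rewrite_file("blackbox.json")` would then update the file in one pass; fields that already hold a datasource object are left untouched.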


Official documentation: https://grafana.com/docs/grafana/latest/alerting/set-up/provision-alerting-resources/file-provisioning/



Create a YAML file in this directory. An example file follows:

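# config file version
apiVersion: 1
# List of rule groups to import or update
groups:
  # <int> organization ID, default = 1
  - orgId: 1
    # <string, required> name of the rule group
    name: my_rule_group
    # <string, required> name of the folder the rule group will be stored in
    folder: my_first_folder
    # <duration, required> interval at which the rules are evaluated
    interval: 60s
    # <list, required> list of rules that belong to the rule group
    rules:
      # <string, required> unique identifier for the rule
      - uid: my_id_1
        # <string, required> title of the rule as displayed in the UI
        title: my_first_rule
        # <string, required> which query should be used for the condition
        condition: A
        # <list, required> list of query objects executed on each evaluation
        #                  - should be obtained through the API
        data:
          - refId: A
            # datasourceUid: ID of the data source
            datasourceUid: 'PBFA97CFB590B2093'
            model:
              # conditions
              conditions:
                - evaluator:
                    params:
                      - 3
                    type: gt
                  operator:
                    type: and
                  query:
                    params:
                      - A
                  reducer:
                    type: last
                  type: query
              datasource:
                type: __expr__
                uid: '-100'
              expression: 1==0
              intervalMs: 1000
              maxDataPoints: 43200
              refId: A
              type: math
        # <string> UID of the dashboard the alert rule should be linked to
        dashboardUid: my_dashboard
        # <int> ID of the panel the alert rule should be linked to
        panelId: 123
        # <string> state of the alert rule when no data is returned
        #          possible values: "NoData", "Alerting", "OK", default = NoData
        noDataState: Alerting
        # <string> state of the alert rule when query execution fails
        #          possible values: "Error", "Alerting", "OK", default = Alerting
        # <duration, required> how long the condition must hold after the rule
        #                      triggers before a notification is sent
        for: 60s
        # <map<string, string>> descriptive information, arbitrary key: value data
        annotations:
          some_key: some_value
        # <map<string, string>> a map of strings that can be used to filter and route alerts
        labels:
          team: sre_team_1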

Alert channel: DingTalk



# config file version
apiVersion: 1
# List of contact points to import or update
contactPoints:
  # <int> organization ID, default = 1
  - orgId: 1
    # <string, required> name of the contact point
    name: dingding
    receivers:
      # <string, required> unique identifier for the receiver
      - uid: dingding
        type: dingding
        settings:
          # <string, required>
          url: https://oapi.dingtalk.com/robot/send?access_token=xxx
          # <string> options: link, actionCard
          # msgType: link
          msgType: actionCard
          # <string>
          message: |
            {{ template "default.message" . }}




# config file version
apiVersion: 1
# List of notification policies
policies:
  # <int> organization ID, default = 1
  - orgId: 1
    # <string> name of the contact point that should be used for this route
    receiver: dingding
    # <list> The labels by which incoming alerts are grouped together. For example,
    #        multiple alerts coming in for cluster=A and alertname=LatencyHigh would
    #        be batched into a single group.
    #
    #        To aggregate by all possible labels use the special value '...' as
    #        the sole label name, for example:
    #        group_by: ['...']
    #        This effectively disables aggregation entirely, passing through all
    #        alerts as-is. This is unlikely to be what you want, unless you have
    #        a very low alert volume or your upstream notification system performs
    #        its own grouping.
    group_by: ['...']
    # <list> a list of matchers that an alert has to fulfill to match the node
    matchers:
      - alertname = Watchdog
      - severity =~ "warning|critical"
    # <list> Times when the route should be muted. These must match the name of a
    #        mute time interval.
    #        Additionally, the root node cannot have any mute times.
    #        When a route is muted it will not send any notifications, but
    #        otherwise acts normally (including ending the route-matching process
    #        if the `continue` option is not set)
    mute_time_intervals:
      - abc
    # <duration> How long to initially wait to send a notification for a group
    #            of alerts. Allows to collect more initial alerts for the same group.
    #            (Usually ~0s to few minutes), default = 30s
    group_wait: 30s
    # <duration> How long to wait before sending a notification about new alerts that
    #            are added to a group of alerts for which an initial notification has
    #            already been sent. (Usually ~5m or more), default = 5m
    group_interval: 5m
    # <duration> How long to wait before sending a notification again if it has already
    #            been sent successfully for an alert. (Usually ~3h or more), default = 4h
    repeat_interval: 4h
    # <list> Zero or more child routes
    # routes:
    # ...


# config file version
apiVersion: 1
# List of templates that should be deleted
deleteTemplates:
  # <int> organization ID, default = 1
  - orgId: 1
    # <string, required> name of the template, must be unique
    name: my_first_template


# config file version
apiVersion: 1
# List of mute time intervals to import or update
muteTimes:
  # <int> organization ID, default = 1
  - orgId: 1
    # <string, required> name of the mute time interval, must be unique
    name: mti_1
    # <list> time intervals that should trigger the muting
    #        refer to https://prometheus.io/docs/alerting/latest/configuration/#time_interval-0
    time_intervals:
      - times:
          - start_time: '06:00'
            end_time: '23:59'
        weekdays: ['monday:wednesday', 'saturday', 'sunday']
        months: ['1:3', 'may:august', 'december']
        years: ['2020:2022', '2030']
        days_of_month: ['1:5', '-3:-1']
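The days_of_month entries use inclusive start:end ranges, and negative numbers count back from the end of the month. The Python sketch below only illustrates how an entry such as '-3:-1' could be read; it is not Grafana's or Alertmanager's implementation, and the function name `expand_days_of_month` is hypothetical.

```python
import calendar


def expand_days_of_month(spec: str, year: int, month: int) -> list[int]:
    """Expand an entry such as '1:5' or '-3:-1' into concrete days of a month."""
    days_in_month = calendar.monthrange(year, month)[1]
    start_text, _, end_text = spec.partition(":")
    start = int(start_text)
    # A single value such as '15' is treated as a one-day range.
    end = int(end_text) if end_text else start
    # Negative values count back from the end of the month (-1 is the last day).
    if start < 0:
        start = days_in_month + 1 + start
    if end < 0:
        end = days_in_month + 1 + end
    # Clamp to the days that actually exist in this month.
    start = max(start, 1)
    end = min(end, days_in_month)
    return list(range(start, end + 1))


print(expand_days_of_month("1:5", 2021, 2))
print(expand_days_of_month("-3:-1", 2021, 2))
```

For February 2021, which has 28 days, '-3:-1' covers the 26th through the 28th, so the last three days of the month are muted regardless of the month's length.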

