
[Nightingale Monitoring] Categraf, the All-in-One Collector

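Has anyone else run into the same frustration? When I build a monitoring stack on Prometheus, every component I want to monitor needs its own exporter. Ten components means ten exporters. Leaving aside the quality of those exporters (most are written by community members), the learning, deployment, and maintenance costs alone are a headache.

Is there a single component that can handle most metric collection?

Categraf is exactly that kind of collector.

Surprised? Bet you didn't see that coming.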

What is Categraf

Categraf is a monitoring collection agent, similar to Telegraf, Grafana-Agent, and Datadog-Agent. It aims to collect monitoring data for all common monitoring targets and follows an all-in-one design: besides metrics, it also aims to collect logs and traces.

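Compared with other collectors, Categraf has these advantages:

Supports the remote_write protocol, so data can be written to Prometheus, M3DB, VictoriaMetrics, and InfluxDB
Collects only numeric metric values, not strings, and keeps the label set stable
Uses an all-in-one design: a single agent handles all collection, and log and trace collection can be folded into the same agent in the future
Written in pure Go and statically compiled with few dependencies, so it is easy to distribute and install
Applies best practices wherever possible: data that does not need collecting is not collected, and issues that could cause high cardinality in the time-series database are handled on the collection side
For commonly used collectors, it provides not only collection but also ready-made dashboards and alert rules that users can import directly
It is intended to become an important part of the Flashcat SaaS product, with the Flashcat team iterating on it continuously. The project also hopes that more companies and developers will join in to build the most open and easiest-to-use collector in China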

Installation

Installation is simple. The binary installation method is described briefly below.

    # Download
    $ wget https://download.flashcat.cloud/categraf-v0.2.38-linux-amd64.tar.gz
    # Extract
    $ tar xf categraf-v0.2.38-linux-amd64.tar.gz
    # Enter the directory
    $ cd categraf-v0.2.38-linux-amd64/

Edit the configuration file conf/config.toml. The parts to change are:

    [[writers]]
    url = "http://127.0.0.1:17000/prometheus/v1/write"

    [heartbeat]
    enable = true

Then start Categraf.

    $ nohup ./categraf &> categraf.log &

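Configuration in detail

We did not specify a configuration location when starting Categraf above, so it reads the configuration files under the conf directory by default. The conf directory is structured as follows:

    config.toml          # main configuration
    logs.toml            # logs-agent configuration
    prometheus.toml      # prometheus agent configuration
    traces.yaml          # trace-agent configuration
    conf/input.*/*.toml  # plugin configuration files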

Main configuration: config.toml

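    [global]
    # Whether to print the configuration contents
    print_configs = false

    # Hostname, used as the unique identifier of this machine. An
    # agent_hostname=$hostname label is attached to time-series data automatically.
    # If hostname is empty, the machine's own hostname is used.
    # If hostname is not empty, the configured value is used as the hostname.
    # The configured string may contain variables. Two are currently supported,
    # $hostname and $ip, and they are replaced automatically when present:
    # $hostname becomes the machine's hostname, $ip becomes the machine's IP.
    # It is recommended to run with --test and check that the output is as expected.
    # In --test mode, the value configured here is shown as the agent_hostname=xxx label.
    hostname = ""

    # Whether to omit the hostname label. If true, the agent_hostname=$hostname
    # label is not attached to time-series data.
    omit_hostname = false

    # Timestamp unit for time-series data, ms or s. The default is ms because
    # the remote write protocol uses ms as its timestamp unit.
    precision = "ms"

    # Global collection interval: collect once every 15 seconds
    interval = 15

    # Configuration source. local and http are currently supported. With local,
    # the local configuration is read; with http, the source must be configured
    # in the [http] section.
    providers = ["local"]

    # Global extra labels, one per line. Labels written here are attached to
    # time-series data automatically.
    # [global.labels]
    # region = "shanghai"
    # env = "localhost"

    # Logging
    [log]
    # Log output defaults to standard output (stdout).
    # If a file is specified, logs are written to that file.
    file_name = "stdout"

    # Takes effect when logging to a file; limits the log file size
    max_size = 100
    # Number of days to keep logs
    max_age = 1
    # Number of backup log files
    max_backups = 1
    # Whether to format log timestamps in local time
    local_time = true
    # Whether to compress logs with gzip
    compress = false

    # Time-series data sent to the backend is first placed in Categraf's
    # in-memory queues, one queue per collection plugin.
    # chan_size is the maximum queue length.
    # batch is how many items are taken from the queue per send to the backend.
    [writer_opt]
    batch = 1000
    chan_size = 1000000

    # Backend configuration. In TOML, [[ ]] denotes an array, so multiple
    # writers can be configured.
    # Each writer can have its own url and its own basic auth credentials.
    [[writers]]
    url = "http://127.0.0.1:17000/prometheus/v1/write"

    # Auth user, empty by default
    basic_auth_user = ""

    # Auth password, empty by default
    basic_auth_pass = ""

    ## Request headers
    # headers = ["X-From", "categraf", "X-Xyz", "abc"]

    # Timeout settings, in ms
    timeout = 5000
    dial_timeout = 2500
    max_idle_conns_per_host = 100

    # If providers is set to http, it must be configured here
    [http]
    # Whether to enable
    enable = false
    # Address
    address = ":9100"
    print_access = false
    run_mode = "release"

    # ibex configuration: the ibex-server address, used for automated fault recovery
    [ibex]
    enable = false
    ## ibex refresh interval
    interval = "1000ms"
    ## ibex server addresses
    servers = ["127.0.0.1:20090"]
    ## Temporary directory for scripts
    meta_dir = "./meta"

    # Heartbeat reporting to n9e
    [heartbeat]
    enable = true

    # Reports metadata such as os version, cpu.util, mem.util
    url = "http://127.0.0.1:17000/v1/n9e/heartbeat"

    # Reporting interval, in seconds
    interval = 10

    # Auth user
    basic_auth_user = ""

    # Auth password
    basic_auth_pass = ""

    ## Request headers
    # headers = ["X-From", "categraf", "X-Xyz", "abc"]

    # Timeout settings, in ms
    timeout = 5000
    dial_timeout = 2500
    max_idle_conns_per_host = 100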
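The hostname comments in config.toml describe a small substitution rule: an empty value falls back to the machine's hostname, and $hostname and $ip are replaced inside a configured string. The Python sketch below only illustrates that rule as described in the comments. The function name expand_hostname is hypothetical and this is not Categraf's actual Go implementation.

```python
def expand_hostname(template: str, hostname: str, ip: str) -> str:
    """Return the value that would end up in the agent_hostname label.

    Illustrative only: mirrors the rule in the config comments.
    - empty template  -> the machine's own hostname
    - $hostname / $ip -> replaced with the given hostname / IP
    """
    if not template:
        return hostname
    return template.replace("$hostname", hostname).replace("$ip", ip)
```

For example, with a hypothetical machine named web01 at 10.0.0.8, the setting hostname = "$hostname-$ip" would yield web01-10.0.0.8 under this rule. Running Categraf with --test, as the comments recommend, is the way to confirm the real output.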

Log collection: logs.toml

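    [logs]
    # api_key takes effect in http mode and is used for authentication;
    # in other modes it is only a placeholder
    api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
    # Whether to enable the log-agent
    enable = false
    # Log receiving address; tcp, http, and kafka can be configured
    send_to = "127.0.0.1:17878"
    # Log sending protocol: http/tcp/kafka
    send_type = "http"
    # Topic in kafka mode
    topic = "flashcatcloud"
    # Whether to compress
    use_compress = false
    # Whether to use TLS
    send_with_tls = false
    # Wait time for batch sending
    batch_wait = 5
    # Where log offsets are recorded, used to resume from the last position
    run_path = "/opt/categraf/run"
    # Maximum number of open files
    open_files_limit = 100
    # How often directories are scanned for logs
    scan_period = 10
    # Buffer size for udp collection
    frame_size = 9000
    # Whether to collect pod stdout/stderr logs
    collect_container_all = true
    # Global processing rules. Multi-line merging is not supported here;
    # it must be configured under logs.items
    # [[logs.Processing_rules]]

    # Log collection items
    [[logs.items]]
    # Log type: file/journald/tcp/udp
    type = "file"
    # Log path. Wildcards are supported; with a wildcard, collection starts
    # from the latest position by default
    ## If type is file, a concrete path is required; if type is journald/tcp/udp, configure the port
    path = "/opt/tomcat/logs/*.txt"
    # Log label identifying the source module
    source = "tomcat"
    # Log label identifying the source service
    service = "my_service"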

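Log processing rules can be configured globally under logs.Processing_rules, or per item under logs.items.logs_processing_rules.

The rule types are:

exclude_at_match: do not send log lines that match.
include_at_match: send only log lines that match.
mask_sequences: process log content before it is sent, for example by replacing parts of it.
multi_line: merge multiple lines into one; not supported in the global configuration.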

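In the patterns below, each backslash is doubled because \ is an escape character inside a double-quoted TOML string.

(1) Do not send matching log lines

    type = "exclude_at_match"
    name = "exclude_xxx_users"
    pattern = "\\w+@flashcat.cloud"

Log lines that match @flashcat.cloud are not sent.

(2) Send only matching log lines

    type = "include_at_match"
    name = "include_demo"
    pattern = "^2022*"

Only log lines that start with 2022 are sent. (Strictly speaking, the regular expression ^2022* matches any line starting with 202; use ^2022 to require the full prefix.)

(3) Replace parts of the log content

    type = "mask_sequences"
    name = "mask_phone_number"
    replace_placeholder = "[186xxx]"
    pattern = "186\\d{8}"

Phone numbers starting with 186 are replaced with [186xxx].

(4) Merge multiple lines

    type = "multi_line"
    name = "new_line_with_date"
    pattern = "\\d{4}-\\d{2}-\\d{2}"

A multi-line rule does not need a leading ^; the code adds it automatically. Here a date marks the start of a log entry, and the lines of a multi-line entry are merged into one for collection.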
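To make the four rule types concrete, here is a minimal Python sketch of their described behaviour. It is an illustration under assumptions, not Categraf's code: the helper names apply_line_rules and merge_multi_line are hypothetical, Python's re module stands in for Go's regexp engine, and joining merged lines with a newline is an assumption.

```python
import re
from typing import List, Optional


def apply_line_rules(line: str, rules: List[dict]) -> Optional[str]:
    """Apply exclude/include/mask rules in order. None means 'do not send'."""
    for rule in rules:
        pattern = re.compile(rule["pattern"])
        kind = rule["type"]
        if kind == "exclude_at_match" and pattern.search(line):
            return None  # matching lines are dropped
        if kind == "include_at_match" and not pattern.search(line):
            return None  # only matching lines survive
        if kind == "mask_sequences":
            # A function replacement keeps the placeholder literal.
            line = pattern.sub(lambda _m: rule["replace_placeholder"], line)
    return line


def merge_multi_line(lines: List[str], pattern: str) -> List[str]:
    """Start a new entry at each line matching ^pattern; append the rest."""
    start = re.compile("^" + pattern)  # the leading ^ is added automatically
    entries: List[str] = []
    for line in lines:
        if start.search(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line  # assumed joiner, for illustration
    return entries
```

With the mask_phone_number rule from above, the line "call 18612345678 now" would come out as "call [186xxx] now", and a stack trace following a dated line would be folded into that dated entry.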

Metric collection: prometheus.toml

Categraf can already collect many metrics on its own. If you already have a complete Prometheus setup but want to use N9e, Categraf can also scrape Prometheus metrics.

    [prometheus]
    # Whether to start the prometheus agent
    enable=false
    # Your existing Prometheus configuration file,
    # or a new configuration file in Prometheus format
    scrape_config_file="/path/to/in_cluster_scrape.yaml"
    ## Log level: debug | warn | info | error
    log_level="info"
    # The settings below can be left at their defaults
    ## wal file storage path, default ./data-agent
    # wal_storage_path="/path/to/storage"
    ## wal reserve time duration, default value is 2 hour
    # wal_min_duration=2

For example, here is a Prometheus scrape configuration that automatically collects kube-state-metrics metrics:

    global:
      scrape_interval: 15s
      external_labels:
        scraper: ksm-test
        cluster: test
    scrape_configs:
      - job_name: "kube-state-metrics"
        metrics_path: "/metrics"
        kubernetes_sd_configs:
          - role: endpoints
            api_server: "https://172.31.0.1:443"
            tls_config:
              ca_file: /etc/kubernetes/pki/ca.crt
              cert_file: /etc/kubernetes/pki/apiserver-kubelet-client.crt
              key_file: /etc/kubernetes/pki/apiserver-kubelet-client.key
              insecure_skip_verify: true
        scheme: http
        relabel_configs:
          - source_labels:
              [
                __meta_kubernetes_namespace,
                __meta_kubernetes_service_name,
                __meta_kubernetes_endpoint_port_name,
              ]
            action: keep
            regex: kube-system;kube-state-metrics;http-metrics
    remote_write:
      - url: "http://172.31.62.213/prometheus/v1/write"

The file above can be loaded through the scrape_config_file setting in prometheus.toml.
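The relabel_configs entry in the scrape file uses the keep action. In Prometheus, the values of source_labels are joined with a separator (";" by default) and the result must match the regex in full, otherwise the target is dropped. The Python sketch below illustrates that selection logic. The function name keep_target is hypothetical, and Python's re module stands in for the RE2 engine Prometheus uses.

```python
import re
from typing import Dict, List


def keep_target(labels: Dict[str, str], source_labels: List[str],
                regex: str, separator: str = ";") -> bool:
    """Return True if a discovered target survives a 'keep' relabel rule."""
    # Missing labels contribute an empty string to the joined value.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # The regex is anchored at both ends, hence fullmatch.
    return re.fullmatch(regex, joined) is not None
```

With the three source labels above, an endpoint in namespace kube-system, service kube-state-metrics, port name http-metrics joins to exactly the configured regex and is kept; any other endpoint is discarded.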

Tracing: traces.yaml

The tracing configuration is just a thin wrapper around the OpenTelemetry Collector, so users can integrate it with a wide range of systems.

It is not covered further here.

Plugin configuration

Collection configuration

Suppose there is an nginx process on the server and we want to monitor it. Edit conf/input.procstat/procstat.toml as follows:

    # # collect interval
    interval = 15

    [[instances]]
    # # executable name (ie, pgrep <search_exec_substring>)
    search_exec_substring = "nginx"

    # # pattern as argument for pgrep (ie, pgrep -f <search_cmdline_substring>)
    # search_cmdline_substring = "n9e server"

    # # windows service name
    # search_win_service = ""

    metrics_name_prefix="nginx"

    # # search process with specific user, option with exec_substring or cmdline_substring
    # search_user = ""

    # # append some labels for series
    labels = { region="cloud", product="n9e" }

    # # interval = global.interval * interval_times
    # interval_times = 1

    # # mode to use when calculating CPU usage. can be one of 'solaris' or 'irix'
    # mode = "irix"

    # sum of threads/fd/io/cpu/mem, min of uptime/limit
    gather_total = true

    # will append pid as tag
    gather_per_pid = false

    # gather jvm metrics only when jstat is ready
    # gather_more_metrics = [
    #     "threads",
    #     "fd",
    #     "io",
    #     "uptime",
    #     "cpu",
    #     "mem",
    #     "limit",
    #     "jvm"
    # ]

We specified the process name and added the nginx prefix and labels to the metrics.

After the configuration is done, restart Categraf.

You can then see the metric data.
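The gather_total comment in procstat.toml says the totals are a sum for threads/fd/io/cpu/mem and a minimum for uptime/limit across all matched processes. The Python sketch below illustrates that aggregation for several nginx worker processes. The field names are simplified stand-ins and the function name gather_total is hypothetical; the real plugin's metric names and internals may differ.

```python
from typing import Dict, List

# Simplified field names, assumed for illustration only.
SUM_FIELDS = ("threads", "fd", "io", "cpu", "mem")
MIN_FIELDS = ("uptime", "limit")


def gather_total(per_pid: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-process samples: sum usage fields, take min of uptime/limit."""
    total: Dict[str, float] = {}
    if not per_pid:
        return total  # no matching process, nothing to report
    for field in SUM_FIELDS:
        total[field] = sum(sample[field] for sample in per_pid)
    for field in MIN_FIELDS:
        total[field] = min(sample[field] for sample in per_pid)
    return total
```

Taking the minimum uptime means a recently restarted worker shows up in the total, which is usually the more useful signal for alerting.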
