ECK Elastic¶
Imported from Confluence
Content may be outdated. Verify before following any procedures. Last updated: September 2024
We had an issue where the cluster was in a yellow state. To troubleshoot it, see Red Yellow Cluster Status.
Check cluster status:
elasticsearch@elasticsearch-eck-elasticsearch-es-default-0:~$ curl -X GET "localhost:9200/_cluster/health?filter_path=status,*_shards&pretty"
{
"status" : "yellow",
"active_primary_shards" : 406,
"active_shards" : 477,
"relocating_shards" : 0,
"initializing_shards" : 6,
"unassigned_shards" : 330,
"delayed_unassigned_shards" : 0
}
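When jq is not installed in the Elasticsearch container, individual fields can still be pulled out of the health response with the usual shell tools. A minimal sketch, using a trimmed-down copy of the response above as stand-in input (in the pod, the same filter can be applied to the live `curl -s "localhost:9200/_cluster/health"` output):

```shell
#!/bin/sh
# Stand-in for the _cluster/health response shown above (trimmed sample values).
health='{"cluster_name":"elasticsearch-eck-elasticsearch","status":"yellow","unassigned_shards":330}'

# Extract the status field without jq, which may not exist in the container.
status=$(printf '%s' "$health" | grep -o '"status":"[a-z]*"' | cut -d'"' -f4)
echo "cluster status: $status"
```

Anything other than `green` here means at least one shard copy is not allocated; `yellow` specifically means all primaries are assigned but some replicas are not.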
List unassigned shards:
elasticsearch@elasticsearch-eck-elasticsearch-es-default-2:~$ curl -XGET 'http://localhost:9200/_cluster/health'
{"cluster_name":"elasticsearch-eck-elasticsearch","status":"yellow","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":3,"active_primary_shards":406,"active_shards":406,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":407,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":49.938499384993854}elasticsearch@elasticsearch-eck-elasticsearch-es-default-2:~$ curl -XGET 'http://localhost:9200/_cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state'
index shard prirep state node unassigned.reason
fairbid-sdk-events-2999-2024.03.15 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2023.09.26 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2023.12.05 0 r UNASSIGNED NODE_LEFT
.ds-.kibana-event-log-8.9.0-2024.07.01-000018 0 r UNASSIGNED NODE_LEFT
.kibana_task_manager_8.9.0_001 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.02.14 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.03.20 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2023.10.02 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.02.04 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.07.15 0 r UNASSIGNED NODE_LEFT
.fleet-file-data-agent-000001 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.08.18 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2024.04.25 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2023.10.23 0 r UNASSIGNED NODE_LEFT
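Instead of scanning the full listing, the unassigned shards can be counted per reason with awk. A sketch using a few rows of the listing above as stand-in input; in the pod, the real `_cat/shards` output can be piped through the same awk filter (the header line and STARTED rows are skipped by the `$4 == "UNASSIGNED"` test, since the node column is empty for unassigned shards):

```shell
#!/bin/sh
# Count unassigned shards per unassigned.reason. Sample rows mirror the
# listing above; state is field 4 and reason field 5 on unassigned rows.
out=$(awk '$4 == "UNASSIGNED" { n[$5]++ } END { for (r in n) print r, n[r] }' <<'EOF'
fairbid-sdk-events-2999-2024.03.15 0 r UNASSIGNED NODE_LEFT
fairbid-sdk-events-2999-2023.09.26 0 r UNASSIGNED NODE_LEFT
.kibana_task_manager_8.9.0_001 0 r UNASSIGNED NODE_LEFT
EOF
)
echo "$out"
```

NODE_LEFT dominating the counts, as here, points at a node restart or eviction; the allocation explain API below shows why the replicas were not reassigned afterwards.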
Check the possible cause:
elasticsearch@elasticsearch-eck-elasticsearch-es-default-2:~$ curl -X GET "localhost:9200/_cluster/allocation/explain?filter_path=index,node_allocation_decisions.node_name,node_allocation_decisions.deciders.*&pretty" -H 'Content-Type: application/json' -d'
> {
> "index": "fairbid-sdk-events-2999-2024.03.15",
> "shard": 0,
> "primary": false
> }
> '
{
"index" : "fairbid-sdk-events-2999-2024.03.15",
"node_allocation_decisions" : [
{
"node_name" : "elasticsearch-eck-elasticsearch-es-default-0",
"deciders" : [
{
"decider" : "disk_threshold",
"decision" : "NO",
"explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], having less than the minimum required [44.2gb] free space, actual free: [38.4gb], actual used: [86.9%]"
}
]
},
{
"node_name" : "elasticsearch-eck-elasticsearch-es-default-2",
"deciders" : [
{
"decider" : "disk_threshold",
"decision" : "NO",
"explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], having less than the minimum required [44.2gb] free space, actual free: [29.3gb], actual used: [90%]"
}
]
},
{
"node_name" : "elasticsearch-eck-elasticsearch-es-default-1",
"deciders" : [
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "a copy of this shard is already allocated to this node [[fairbid-sdk-events-2999-2024.03.15][0], node[lxK3ut4IRpOdgwQ8kBpHBg], [P], s[STARTED], a[id=iRbaVLn1RWmOEVh2V6Qi3g], failed_attempts[0]]"
},
{
"decider" : "disk_threshold",
"decision" : "NO",
"explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], having less than the minimum required [44.2gb] free space, actual free: [26.7gb], actual used: [90.9%]"
},
{
"decider" : "awareness",
"decision" : "NO",
"explanation" : "there are [2] copies of this shard and [3] values for attribute [k8s_node_name] ([gke-gke-core-fairbid-nap-e2-standard--3c098cc9-wkb8, gke-gke-core-fairbid-nap-e2-standard--4755c446-c7p7, gke-gke-core-fairbid-nap-e2-standard--a76eb336-bwma] from nodes in the cluster and no forced awareness) so there may be at most [1] copies of this shard allocated to nodes with each value, but (including this copy) there would be [2] copies allocated to nodes with [node.attr.k8s_node_name: gke-gke-core-fairbid-nap-e2-standard--a76eb336-bwma]"
}
]
}
]
}
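The per-node disk figures are buried inside the long decider explanation strings above; grep can pull them out. A sketch run against one explanation line from the output above as stand-in input (in the pod, the whole allocation explain response can be piped through the same filter):

```shell
#!/bin/sh
# One decider explanation from the output above, used as stand-in input.
explanation='the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], having less than the minimum required [44.2gb] free space, actual free: [29.3gb], actual used: [90%]'

# Extract just the "actual free" / "actual used" figures from the string.
figures=$(printf '%s\n' "$explanation" | grep -o 'actual [a-z]*: \[[^]]*\]')
echo "$figures"
```

All three nodes being over the 85% low watermark explains the unassigned replicas: Elasticsearch refuses to place new shard copies on any node above that threshold, so only freeing or adding disk (or raising the watermarks) lets the replicas allocate.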
I increased the disk size for Elasticsearch to resolve the disk watermark issue; once the nodes dropped back below the 85% low watermark, the replicas could be allocated again.
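For reference, increasing the disk in ECK means raising the storage request in the Elasticsearch manifest's volumeClaimTemplates. A hedged sketch below: the resource name and nodeSet name are taken from the pod names above, while the claim name, access mode, and 500Gi value are assumptions to adapt to the actual manifest. Note that in-place resizing only works if the StorageClass has allowVolumeExpansion enabled; otherwise ECK has to recreate the nodeSet.

```yaml
# Sketch of the relevant part of the ECK Elasticsearch manifest; only the
# storage request changes. 500Gi is a placeholder, not the real size used.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-eck-elasticsearch
spec:
  nodeSets:
  - name: default
    count: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data   # default data volume claim name in ECK
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 500Gi         # increased from the previous size
```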