# X-Road: Operational Monitoring Daemon Architecture
Version: 1.4
Document ID: ARC-OPMOND
Date | Version | Description | Author |
---|---|---|---|
| | 0.5 | Initial version | |
23.01.2017 | 0.6 | Added license text, table of contents and version history | Sami Kallio |
02.02.2018 | 0.7 | Technology matrix moved to the ARC-TEC-file | Antti Luoma |
05.03.2018 | 0.8 | Added terms and abbreviations reference and moved terms to term doc | Tatu Repo |
18.02.2019 | 0.9 | New optional field: xRequestId (string) | Caro Hautamäki |
12.12.2019 | 1.0 | Update appendix A.2 with the updated fields | Ilkka Seppälä |
25.06.2020 | 1.1 | Update section 3.3 with the instructions how to enable JMX | Petteri Kivimäki |
01.06.2023 | 1.2 | Update references | Petteri Kivimäki |
02.10.2024 | 1.3 | Update schema file locations | Justas Samuolis |
05.12.2024 | 1.4 | Add endpoint level statistics gathering support | Eneli Reimets |
# Table of Contents
- License
- 1 Introduction
- 2 Component View
- 3 Protocols and Interfaces
- 4 Deployment View
- Appendix A Store Operational Monitoring Data Messages
# License
This document is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
# 1 Introduction
The X-Road monitoring solution is conceptually split into two parts: environmental and operational monitoring. Operational monitoring processes operational statistics of the security servers, such as which services or endpoints have been called, how many times, and how large the responses were.
This document describes the architecture of the X-Road operational monitoring daemon. It presents an overview of the components of the monitoring daemon and its interfaces.
This document is aimed at technical readers who want to acquire an overview of the inner workings of the monitoring daemon.
# 1.1 Overview
The main function of the monitoring daemon is to collect operational data of the X-Road security server(s) and make it available for external monitoring systems (e.g., Zabbix, Nagios) via corresponding interfaces.
The monitoring daemon also depends on the central server, which provides the global configuration.
# 1.2 Terms and Abbreviations
See X-Road terms and abbreviations documentation [TA-TERMS].
# 1.3 References
ARC-G -- X-Road Architecture. Document ID: ARC-G.
PR-GCONF -- X-Road: Protocol for Downloading Configuration. Document ID: PR-GCONF.
PR-MESS -- X-Road: Message Transport Protocol v4.0. Document ID: PR-MESS.
PR-OPMON -- X-Road: Operational Monitoring Protocol. Document ID: PR-OPMON.
PR-OPMONJMX -- X-Road: Operational Monitoring JMX Protocol. Document ID: PR-OPMONJMX.
PSQL -- PostgreSQL, https://www.postgresql.org/
ARC-TEC -- X-Road technologies. Document ID: ARC-TEC.
TA-TERMS -- X-Road Terms and Abbreviations. Document ID: TA-TERMS.
# 2 Component View
Figure 1 shows the main components and interfaces of the monitoring daemon. The components and the interfaces are described in detail in the following sections.
The technologies used in the operational monitoring daemon are listed in [ARC-TEC].
Figure 1. Operational monitoring daemon component diagram
# 2.1 Operational Monitoring Daemon Main
The operational monitoring daemon main is a standalone Java daemon application that implements the main functionality of the operational monitoring daemon.
# 2.1.1 Operational Monitoring Database
The operational monitoring database component collects operational monitoring data of the X-Road security server(s) via the store operational monitoring data interface. Operational data is stored in a PostgreSQL [PSQL] database. Additionally, operational health data statistics are updated and made available via JMXMP.
Outdated data records are deleted periodically from the database according to the monitoring daemon configuration.
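The periodic cleanup can be illustrated with a minimal sketch. It assumes that retention is expressed in days and that a record's age is judged by its `monitoringDataTs` value in seconds, as in the Appendix A.2 example. The function names and the `keep_days` parameter are hypothetical, and the real daemon performs the deletion in the database rather than on in-memory lists.

```python
from typing import Dict, List

SECONDS_PER_DAY = 24 * 60 * 60


def cleanup_cutoff(now_ts: int, keep_days: int) -> int:
    """Return the oldest timestamp (seconds) that is still retained."""
    if keep_days < 0:
        raise ValueError("keep_days must not be negative")
    return now_ts - keep_days * SECONDS_PER_DAY


def purge_outdated(records: List[Dict], now_ts: int, keep_days: int) -> List[Dict]:
    """Keep only the records that are not older than the retention period."""
    cutoff = cleanup_cutoff(now_ts, keep_days)
    return [r for r in records if r["monitoringDataTs"] >= cutoff]
```

In a database-backed implementation the same cutoff value would typically be passed to a single delete statement that runs on a schedule.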
# 2.1.2 Operational Monitoring Service
The operational monitoring service receives and processes operational monitoring requests via the operational monitoring query interface. The security server(s) use two requests: get operational monitoring data and get operational health data.
If the sender of the get operational monitoring data request is a regular client, only the operational monitoring data records associated with that client are returned. If the sender is the central monitoring client or the owner of the current security server (both described in the global configuration), it has access to all the records.
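The access rule above can be sketched as follows. This is an illustration under stated assumptions, not the daemon's implementation: the `ClientId` type, the function names and the record layout (`client` and `service` keys) are hypothetical, and "associated with that client" is interpreted here as the sender appearing as either the service client or the service provider of the record.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ClientId:
    """Hypothetical identifier of an X-Road member or subsystem."""
    instance: str
    member_class: str
    member_code: str
    subsystem_code: Optional[str] = None


def may_see_all_records(sender: ClientId,
                        central_monitoring_client: Optional[ClientId],
                        server_owner: ClientId) -> bool:
    """The central monitoring client and the server owner see every record."""
    return sender == central_monitoring_client or sender == server_owner


def visible_records(records: List[Dict],
                    sender: ClientId,
                    central_monitoring_client: Optional[ClientId],
                    server_owner: ClientId) -> List[Dict]:
    """Return the records that the request sender is allowed to read."""
    if may_see_all_records(sender, central_monitoring_client, server_owner):
        return list(records)
    # Assumption: a regular client is "associated" with a record when it is
    # either the client or the service provider of the mediated message.
    return [r for r in records if sender in (r["client"], r["service"])]
```

Passing `None` as `central_monitoring_client` models a global configuration in which no central monitoring client is defined.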
For performance reasons, the operational monitoring service limits the size of the get operational monitoring data response message. The maximum response size is configurable; however, all the records having the same timestamp as the last queried record are still included in the response. If some of the queried records still do not fit into the response, the timestamp of the first excluded record is returned in the response to indicate overflow.
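The overflow handling described above can be sketched as follows. The sketch assumes that the limit is counted in records and that the records are already sorted by `monitoringDataTs` in ascending order. The function name, the `max_records` parameter and the shape of the return value are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def limit_records(records: List[Dict],
                  max_records: int) -> Tuple[List[Dict], Optional[int]]:
    """Cut the result at max_records without splitting a timestamp group.

    Returns the included records and, on overflow, the timestamp of the
    first excluded record (None when everything fits).
    """
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    if len(records) <= max_records:
        return list(records), None

    cut = max_records
    last_ts = records[cut - 1]["monitoringDataTs"]
    # Records sharing the timestamp of the last included record are kept
    # even if that exceeds the configured maximum.
    while cut < len(records) and records[cut]["monitoringDataTs"] == last_ts:
        cut += 1

    if cut == len(records):
        return list(records), None
    return records[:cut], records[cut]["monitoringDataTs"]
```

Keeping whole timestamp groups together means that a follow-up query starting from the returned timestamp neither skips nor repeats records.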
# 2.2 Configuration Client
The configuration client is responsible for downloading remote global configuration files. The source location of the global configuration is taken from the anchor file that was manually copied to the configuration directory of the operational monitoring daemon (or uploaded from the security server user interface if the monitoring daemon is deployed together with the security server).
The component is a standalone Java daemon application.
# 3 Protocols and Interfaces
# 3.1 Store Operational Monitoring Data
This protocol is used by the X-Road security server to store its cached operational monitoring data. The protocol is a synchronous RPC-style protocol based on JSON over HTTP(S). If a secure connection is configured, the security server and the monitoring daemon each use their own internal self-signed TLS certificate, and both client-side and server-side certificate verification are performed.
The availability of this service to the security server is not critical to the operation of X-Road. If this service is unavailable, the security server continues to cache the operational data records in its memory buffer. If the buffer overflows, the oldest records are deleted.
Storing operational monitoring data is not time-critical, hence the records are cached asynchronously on the security server side.
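The buffering behaviour described above can be sketched with a bounded queue. This is a single-threaded illustration under stated assumptions: the class and method names are hypothetical, the `send` callable stands in for the actual store request, and an `OSError` is assumed to signal that the monitoring daemon is unavailable.

```python
from collections import deque
from typing import Callable, Dict, List


class OperationalDataBuffer:
    """In-memory buffer that drops the oldest records when it is full."""

    def __init__(self, max_records: int) -> None:
        # A deque with maxlen discards items from the left on overflow.
        self._records = deque(maxlen=max_records)

    def add(self, record: Dict) -> None:
        self._records.append(record)

    def pending(self) -> List[Dict]:
        return list(self._records)

    def flush(self, send: Callable[[List[Dict]], None]) -> bool:
        """Try to store the buffered records; keep them if sending fails."""
        batch = self.pending()
        if not batch:
            return True
        try:
            send(batch)
        except OSError:
            return False  # daemon unavailable, records stay buffered
        for _ in batch:
            self._records.popleft()
        return True
```

Because message mediation only appends to the buffer and sending happens separately, an unavailable monitoring daemon never blocks the mediated messages themselves.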
The JSON messages are described in Appendix A.
# 3.2 Operational Monitoring Query
The operational monitoring query interface is used by the security server to retrieve operational monitoring data. The asynchronous RPC-style X-Road operational monitoring protocol [PR-OPMON] (based on [PR-MESS]) is used. If a secure connection (HTTPS) is configured, the security server and the monitoring daemon each use their own internal self-signed TLS certificate, and both client-side and server-side certificate verification are performed.
Monitoring of the security servers is not the main functionality of the X-Road system, therefore the availability and responsiveness of this service are not paramount. Operational data records are held in the database and remain available for a configured number of days.
# 3.3 Operational Monitoring JMX
This interface is used by a local monitoring system (e.g. Zabbix) to gather local operational health data of the security server via JMXMP. The interface is described in more detail in [PR-OPMONJMX].
With the default configuration, JMX is disabled. JMX is enabled by adding the required configuration in /etc/xroad/services/local.properties file. The file is opened for editing and changes are made on the XROAD_OPMON_PARAMS variable value. After the XROAD_OPMON_PARAMS variable value has been updated, the xroad-opmonitor service must be restarted.
The example configuration below enables JMX, binds it to port 9011 on any available interface with SSL and password authentication enabled:
XROAD_OPMON_PARAMS=-Djava.rmi.server.hostname=0.0.0.0 -Dcom.sun.management.jmxremote.port=9011 -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=true
Monitoring of the security servers is not the main functionality of the X-Road system, therefore the availability and responsiveness of this service are not paramount.
# 3.4 Download Configuration
The operational monitoring daemon downloads the generated global configuration files from a configuration source.
The configuration download interface is a synchronous interface that is required by the operational monitoring daemon. It is provided by a configuration source such as a central server or a configuration proxy.
The interface is described in more detail in [ARC-G] and [PR-GCONF].
# 4 Deployment View
Figure 2 shows the deployment diagram.
Figure 2. Operational monitoring daemon deployment
# Appendix A Store Operational Monitoring Data Messages
# A.1 JSON-Schema for Store Operational Monitoring Data Request
The schema is located in the file src/op-monitor-daemon/core/src/main/resources/store_operational_data_request_schema.yaml of the X-Road source code.
# A.2 Example Store Operational Monitoring Data Request
The first record of the store request reflects a successfully mediated SOAP request, the second a successfully mediated REST request, and the third an unsuccessfully mediated request.
{
  "records": [
    {
      "monitoringDataTs": 1576133363,
      "securityServerInternalIp": "fd42:2642:2cb3:31ac:216:3eff:fedf:85c%eth0",
      "securityServerType": "Client",
      "requestInTs": 1576133360081,
      "requestOutTs": 1576133361160,
      "responseInTs": 1576133361818,
      "responseOutTs": 1576133361876,
      "clientXRoadInstance": "FI",
      "clientMemberClass": "COM",
      "clientMemberCode": "111",
      "clientSubsystemCode": "CLIENT",
      "serviceXRoadInstance": "FI",
      "serviceMemberClass": "COM",
      "serviceMemberCode": "111",
      "serviceSubsystemCode": "SERVICE",
      "serviceCode": "getRandom",
      "serviceVersion": "v1",
      "messageId": "1234",
      "messageUserId": "1234",
      "messageIssue": "1234",
      "messageProtocolVersion": "4.x",
      "clientSecurityServerAddress": "ss1",
      "serviceSecurityServerAddress": "ss1",
      "requestSize": 1226,
      "responseSize": 1539,
      "requestAttachmentCount": 0,
      "responseAttachmentCount": 0,
      "succeeded": true,
      "xRequestId": "d4490e7f-305e-44c3-b869-beaaeda694e7",
      "serviceType": "WSDL"
    },
    {
      "monitoringDataTs": 1733404603,
      "securityServerInternalIp": "fd42:2642:2cb3:31ac:216:3eff:fedf:85c%eth0",
      "securityServerType": "Client",
      "requestInTs": 1733404602876,
      "requestOutTs": 1733404602884,
      "responseInTs": 1733404602970,
      "responseOutTs": 1733404603005,
      "clientXRoadInstance": "FI",
      "clientMemberClass": "COM",
      "clientMemberCode": "111",
      "clientSubsystemCode": "CLIENT",
      "serviceXRoadInstance": "FI",
      "serviceMemberClass": "COM",
      "serviceMemberCode": "111",
      "serviceSubsystemCode": "SERVICE",
      "serviceCode": "pets",
      "restMethod": "GET",
      "restPath": "/cat",
      "messageId": "1234",
      "messageProtocolVersion": "1",
      "clientSecurityServerAddress": "ss1",
      "serviceSecurityServerAddress": "ss1",
      "requestSize": 214,
      "responseSize": 462,
      "requestAttachmentCount": 0,
      "responseAttachmentCount": 0,
      "succeeded": true,
      "statusCode": 200,
      "xRequestId": "1244d018-9300-4f1b-8c2b-9b7f2bc4e933",
      "serviceType": "REST"
    },
    {
      "monitoringDataTs": 1576134508,
      "securityServerInternalIp": "fd42:2642:2cb3:31ac:216:3eff:fedf:85c%eth0",
      "securityServerType": "Client",
      "requestInTs": 1576134507705,
      "requestOutTs": 1576134507840,
      "responseInTs": 1576134508040,
      "responseOutTs": 1576134508045,
      "clientXRoadInstance": "FI",
      "clientMemberClass": "COM",
      "clientMemberCode": "111",
      "serviceXRoadInstance": "FI",
      "serviceMemberClass": "COM",
      "serviceMemberCode": "111",
      "serviceCode": "getSecurityServerHealthData",
      "serviceVersion": "v1",
      "messageId": "1234",
      "messageProtocolVersion": "4.x",
      "clientSecurityServerAddress": "ss1",
      "serviceSecurityServerAddress": "ss1",
      "requestSize": 1767,
      "requestAttachmentCount": 0,
      "succeeded": false,
      "faultCode": "Server.ServerProxy.OpMonitor.InvalidClientIdentifier",
      "faultString": "Missing required subsystem code",
      "xRequestId": "2c51b181-47cd-4ff2-b5df-6463f968fd0c",
      "serviceType": "WSDL"
    }
  ]
}
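The four timestamp fields of a record can be combined into simple timing figures, as in the sketch below. The field names are taken from the example request. The interpretation of the differences and the names `durations_ms`, `total_ms` and `awaiting_response_ms` are assumptions made for illustration and are not defined by this document.

```python
from typing import Dict, Optional

TIMESTAMP_FIELDS = ("requestInTs", "requestOutTs", "responseInTs", "responseOutTs")


def durations_ms(record: Dict) -> Optional[Dict[str, int]]:
    """Derive timing figures (milliseconds) from one monitoring record.

    Returns None when any of the four timestamps is missing.
    """
    if any(field not in record for field in TIMESTAMP_FIELDS):
        return None
    return {
        # Time between receiving the request and sending out the response.
        "total_ms": record["responseOutTs"] - record["requestInTs"],
        # Time between forwarding the request and receiving the response.
        "awaiting_response_ms": record["responseInTs"] - record["requestOutTs"],
    }


# Timestamps of the first record in the example request.
first_record = {
    "requestInTs": 1576133360081,
    "requestOutTs": 1576133361160,
    "responseInTs": 1576133361818,
    "responseOutTs": 1576133361876,
}
print(durations_ms(first_record))
# {'total_ms': 1795, 'awaiting_response_ms': 658}
```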
# A.3 JSON-Schema for Store Operational Monitoring Data Response
The schema is located in the file src/op-monitor-daemon/core/src/main/resources/store_operational_data_response_schema.yaml of the X-Road source code.
# A.4 Example Store Operational Monitoring Data Responses
- Example of response indicating success.
{
  "status": "OK"
}
- Example of response indicating failure.
{
  "status": "Error",
  "errorMessage": "Internal error"
}
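A sender of store requests could interpret the two response shapes above roughly as follows. This is a minimal sketch: the `StoreError` exception and the `check_store_response` function are hypothetical names, and only the `status` and `errorMessage` fields shown in the examples are assumed to exist.

```python
import json


class StoreError(Exception):
    """Raised when the monitoring daemon reports a failed store request."""


def check_store_response(body: str) -> None:
    """Return normally on status "OK", raise StoreError otherwise."""
    response = json.loads(body)
    status = response.get("status")
    if status == "OK":
        return
    if status == "Error":
        raise StoreError(response.get("errorMessage", "unknown error"))
    raise StoreError(f"unexpected status: {status!r}")
```

After a `StoreError` the caller would typically keep the records in its buffer and retry later, since storing operational data is not time-critical.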