Monitoring Long-Running Processes

chalk exec can generate periodic “heartbeat” reports for as long as your application runs, which makes it well suited to monitoring long-running processes.

Setting Up Heartbeat Monitoring

Let’s first create a configuration file that enables heartbeat reporting:

# Create a heartbeat configuration file
$ cat << EOF > heartbeat-config.c4m
# Enable heartbeat and set interval to 10 seconds
exec.heartbeat: true
exec.heartbeat_rate: <<10 seconds>>

# Define what to include in heartbeat reports
report_template heartbeat_report {
  key._PROCESS_PID.use                        = true
  key._PROCESS_STATE.use                      = true
  key._PROCESS_CWD.use                        = true
  key._OP_TCP_SOCKET_INFO.use                 = true
  key._TIMESTAMP.use                          = true
  key._DATETIME.use                           = true
}

# Use this template for heartbeat operations
outconf.heartbeat.report_template: "heartbeat_report"
EOF

Now, let’s create a long-running program to monitor:

# Create a script that runs for a while
$ cat << EOF > long-running.sh
#!/bin/bash
echo "Starting long-running process..."
count=1
while [[ \$count -le 5 ]]; do
  echo "Iteration \$count"
  sleep 15
  ((count=count+1))
done
echo "Process complete."
EOF

$ chmod +x long-running.sh

Load our heartbeat configuration and run the program:

# Load the heartbeat configuration
$ chalk load heartbeat-config.c4m

# Run with exec and heartbeat enabled
$ chalk exec -- ./long-running.sh

You’ll see your script’s output interleaved with Chalk’s reports: first the initial exec report, then a heartbeat report at each interval:

Starting long-running process...
[
  {
    "_OPERATION": "exec",
    "_DATETIME": "2024-03-05T14:45:10.123-05:00",
    "_PROCESS_PID": 12346,
    "_PROCESS_STATE": "running",
    "_PROCESS_CWD": "/home/user/chalk-demo",
    ...etc...
  }
]
Iteration 1
[
  {
    "_OPERATION": "heartbeat",
    "_DATETIME": "2024-03-05T14:45:20.456-05:00",
    "_PROCESS_PID": 12346,
    "_PROCESS_STATE": "running",
    "_PROCESS_CWD": "/home/user/chalk-demo",
    ...etc...
  }
]
...etc...

Chalk keeps generating a heartbeat report every 10 seconds, as set by exec.heartbeat_rate in the configuration, giving you regular snapshots of your application’s state while it runs.
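As a rough sanity check, the example script sleeps 15 seconds in each of its 5 iterations, so it runs for about 75 seconds. At a 10-second heartbeat rate, that suggests roughly 7 heartbeat reports. The sketch below works through that arithmetic and defines a small helper, count_heartbeats (a hypothetical name, not part of Chalk), that counts heartbeat reports in a captured output file. It assumes the reports appear in the format shown in the sample output above.

```shell
# Values taken from the example script and heartbeat configuration.
iterations=5
sleep_seconds=15
heartbeat_rate=10

# Total runtime is dominated by the sleep calls.
runtime=$((iterations * sleep_seconds))
# Integer division gives the approximate number of heartbeat intervals.
expected_heartbeats=$((runtime / heartbeat_rate))

echo "Approximate runtime: ${runtime}s"
echo "Expected heartbeat reports: about ${expected_heartbeats}"

# Hypothetical helper: count heartbeat reports in a captured output file.
# Assumes each heartbeat report contains the text
#   "_OPERATION": "heartbeat"
# exactly as in the sample output.
count_heartbeats() {
  grep -c '"_OPERATION": "heartbeat"' "$1"
}
```

If you save the run’s output to a file (for example a hypothetical run-output.log), calling count_heartbeats run-output.log should print a number close to the estimate. The actual count may differ by one depending on timing.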

Customizing Exec Reports

You can customize what data gets collected and where it’s reported by configuring Chalk’s report templates and sinks. Let’s explore a few examples:

Customizing Output Location

Let’s create a configuration that sends exec reports to a specific file:

# Create a file output configuration
$ cat << EOF > file-output-config.c4m
# Define a sink for file output
sink_config exec_file_output {
  sink:     "file"
  enabled:  true
  filename: "./exec-reports.log"
}

# Subscribe our new sink to the report topic
subscribe("report", "exec_file_output")
EOF

# Load the configuration
$ chalk load file-output-config.c4m

# Run a command with the new configuration
$ chalk exec -- ls -la

Now check exec-reports.log to see the exec report that was written.
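One way to inspect the file is sketched below. The helper name show_recent_reports is hypothetical, and the default of one line assumes each report is written on its own line, which may not hold for your configuration. Pass a larger line count if the reports span several lines.

```shell
# Hypothetical helper: print the last few lines of a report file.
# Usage: show_recent_reports [file] [line_count]
show_recent_reports() {
  # Default to the file name used in the sink configuration above.
  report_file="${1:-./exec-reports.log}"
  line_count="${2:-1}"

  # -s is true only if the file exists and is not empty.
  if [ -s "$report_file" ]; then
    tail -n "$line_count" "$report_file"
  else
    echo "No report data found at $report_file" >&2
    return 1
  fi
}
```

Run show_recent_reports with no arguments from the directory where you ran chalk exec. A non-zero exit status means the file is missing or empty, which usually points to the sink not being subscribed or the configuration not being loaded.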