Linux Logging & Troubleshooting — journalctl, rsyslog, logrotate & Fixing Problems - Infrastructure, Cloud, Security & Automation

Murali Krishna: Posts: 14; Joined: Wed Jun 10, 2026 8:34 am

Post by Murali Krishna » Sat Jun 13, 2026 2:03 pm

Linux Logging & Troubleshooting — journalctl, rsyslog, logrotate & Fixing Problems
A clear, practical guide to reading logs and diagnosing service, boot and network issues, with copy-ready commands (AlmaLinux 9 / RHEL 9)

─────────────────────────────────────────

Logs are the system talking to you.
When something breaks, the answer is almost always written down somewhere. The skill of troubleshooting is mostly knowing where to look and how to filter the noise. This guide covers the two logging systems (journald and rsyslog), how to keep logs from filling your disk, and a calm, step-by-step way to diagnose service, boot and network problems.

─────────────────────────────────────────

1 journalctl — the systemd journal

On AlmaLinux 9, systemd collects logs into a central journal. journalctl is the tool to read and filter it — it's usually your first stop.

Everyday queries

journalctl -e                # jump to the newest entries
journalctl -f                # live tail (follow new logs as they arrive)
journalctl -u sshd           # only logs from one service
journalctl -u sshd -f        # live tail one service

Filter by time and priority

journalctl --since "1 hour ago"
journalctl --since "2026-06-13 09:00" --until "2026-06-13 10:00"
journalctl -p err -b          # priority err and worse, current boot only
journalctl -p warning..err    # a range of priority levels

Check journal size and trim it

journalctl --disk-usage              # how much space the journal uses
journalctl --vacuum-size=500M        # remove oldest archived journal files until they use under 500 MB
journalctl --vacuum-time=2weeks      # remove archived journal files older than 2 weeks

Tip: Priorities run from emerg (0) down to debug (7). "-p err" shows error and worse, which is the fastest way to spot what actually went wrong.
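
The priority scale in the tip above can be written out as a small lookup. This is only an illustrative sketch: prio_to_num and prio_included are helper names made up for this post, not part of journalctl. It shows why "-p err" also pulls in crit, alert and emerg.

```shell
# Map a syslog priority name to its numeric level (0 = most severe).
prio_to_num() {
    case "$1" in
        emerg)   echo 0 ;;
        alert)   echo 1 ;;
        crit)    echo 2 ;;
        err)     echo 3 ;;
        warning) echo 4 ;;
        notice)  echo 5 ;;
        info)    echo 6 ;;
        debug)   echo 7 ;;
        *)       echo "unknown priority: $1" >&2; return 1 ;;
    esac
}

# List every priority a single "-p LEVEL" includes:
# LEVEL itself plus everything more severe (numerically lower).
prio_included() {
    max=$(prio_to_num "$1") || return 1
    included=""
    for name in emerg alert crit err warning notice info debug; do
        if [ "$(prio_to_num "$name")" -le "$max" ]; then
            included="${included:+$included }$name"
        fi
    done
    echo "$included"
}
```

For example, "prio_included warning" lists the five levels from emerg through warning.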

─────────────────────────────────────────

2 rsyslog — the classic text logs in /var/log

Alongside the journal, rsyslog writes traditional plain-text log files into /var/log. Many tools and admins still rely on these, and they're easy to grep.

The files you'll use most

/var/log/messages — general system messages
/var/log/secure — logins, sudo, SSH authentication
/var/log/maillog — mail server activity
/var/log/cron — cron job activity

Reading and searching them

tail -f /var/log/messages          # watch live
grep "Failed password" /var/log/secure   # find failed SSH logins
less /var/log/messages             # scroll/search with / inside less
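
Building on the grep above, a rough sketch of counting failed SSH logins per source address. The function name failed_logins_by_ip is made up for this example, and it assumes the usual sshd wording ("Failed password for ... from ADDRESS port ...").

```shell
# Count "Failed password" lines per source address.
# Reads sshd log text on stdin and prints "COUNT ADDRESS", busiest first.
failed_logins_by_ip() {
    grep 'Failed password' |
    awk '{
             # the address is the word right after "from"
             for (i = 1; i < NF; i++)
                 if ($i == "from") { hits[$(i + 1)]++; break }
         }
         END { for (ip in hits) print hits[ip], ip }' |
    sort -k1,1nr -k2,2
}
```

Usage, as root because /var/log/secure is not world-readable: failed_logins_by_ip < /var/log/secure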

Send your own message into the log (handy in scripts)

logger "Backup script started"     # appears in the system log
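
In a script it helps to tag messages and log both the start and the outcome. A minimal sketch, assuming a wrapper called run_logged (a made-up name) and a LOGGER_CMD variable that exists only so the wrapper can be tried without touching syslog. The -t (tag) and -p (facility.level) options are standard logger options.

```shell
# LOGGER_CMD is a hypothetical override for dry runs; it defaults to the real logger.
LOGGER_CMD="${LOGGER_CMD:-logger}"

# run_logged TAG COMMAND [ARGS...]
# Logs a start line, runs the command, then logs success or failure.
run_logged() {
    tag="$1"; shift
    $LOGGER_CMD -t "$tag" -p user.info "started: $*"
    if "$@"; then
        $LOGGER_CMD -t "$tag" -p user.info "finished OK: $*"
    else
        rc=$?                       # exit status of the command that just failed
        $LOGGER_CMD -t "$tag" -p user.err "FAILED (exit $rc): $*"
        return "$rc"
    fi
}
```

Usage: run_logged backup /usr/local/bin/backup.sh (the script path is a placeholder), then read it back with journalctl -t backup.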

Tip: rsyslog config lives in /etc/rsyslog.conf and /etc/rsyslog.d/. That's where you'd add rules to forward logs to a central server — useful if you run a SIEM or log pipeline.

─────────────────────────────────────────

3 logrotate — stop logs eating your disk

Logs grow forever unless something trims them. logrotate rotates (renames), compresses and deletes old logs on a schedule, so /var doesn't fill up and crash the box.
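
To see how close a filesystem is to filling up, something like the following can go in a cron job or a quick check. It is a sketch only: the function names and the default threshold of 80% are arbitrary choices for this example.

```shell
# Print how full the filesystem holding PATH is, as a whole-number percentage.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_fs_usage PATH [LIMIT]
# Returns 1 with a warning when usage reaches LIMIT percent (default 80).
check_fs_usage() {
    path="$1"
    limit="${2:-80}"
    used=$(fs_usage_pct "$path")
    case "$used" in
        ''|*[!0-9]*) echo "could not read usage for $path" >&2; return 2 ;;
    esac
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $path is ${used}% full"
}
```

Usage: check_fs_usage /var 80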

Where it's configured

/etc/logrotate.conf            # global defaults
/etc/logrotate.d/              # per-application rules (one file each)

Example rule for a custom app log
Create /etc/logrotate.d/myapp:

/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}

This keeps 14 daily rotations, compressed (delaycompress leaves the newest rotated file uncompressed until the next run), skips logs that are empty, and won't error if a log is missing.

Test before trusting it

logrotate -d /etc/logrotate.d/myapp     # dry run - shows what WOULD happen
logrotate -f /etc/logrotate.d/myapp     # force a rotation now

Tip: "copytruncate" is safest for apps that hold the log file open and can't be told to reopen it. For apps that handle a signal, prefer "postrotate ... systemctl reload" instead — it avoids losing the few lines written during the copy.
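
For the reload approach mentioned in the tip, here is a sketch that writes such a rule to a file so it can be dry-run before it goes anywhere near /etc/logrotate.d/. The unit name myapp.service and the function name write_reload_rule are placeholders, and the rule assumes the app reopens its log file on "systemctl reload".

```shell
# write_reload_rule FILE
# Writes a logrotate rule that reloads the service after rotation
# instead of using copytruncate. "myapp.service" is a placeholder unit.
write_reload_rule() {
    cat > "$1" <<'EOF'
/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        /usr/bin/systemctl reload myapp.service > /dev/null 2>&1 || true
    endscript
}
EOF
}
```

Usage: write_reload_rule /tmp/myapp.rule && logrotate -d /tmp/myapp.rule. "sharedscripts" makes the postrotate script run once per rotation rather than once per matched log file.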

─────────────────────────────────────────

4 Boot Logs — what happened at startup

When a machine won't boot cleanly or a service fails early, the boot logs hold the story.

View this boot and previous boots

journalctl -b                  # logs from the current boot
journalctl -b -1               # the PREVIOUS boot (great after a crash)
journalctl --list-boots        # list all recorded boots

Kernel and hardware messages

dmesg | less                   # kernel ring buffer (hardware, drivers)
dmesg -T | grep -i error       # human-readable timestamps, lines mentioning "error"

Find what's slowing the boot down

systemd-analyze                # total boot time
systemd-analyze blame          # slowest services, worst first

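Tip: After an unexpected reboot, "journalctl -b -1 -p err" shows the errors from the boot that crashed, which is often the single most useful command for a post-mortem. It only works if the journal is persistent (stored under /var/log/journal); a journal kept in /run/log/journal is lost at reboot.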
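
Looking at a previous boot depends on the journal surviving the reboot, so it is worth checking before you need it. A small sketch, assuming journald runs with its default Storage=auto, where persistence depends on /var/log/journal existing. JOURNAL_DIR and the function name are made up for this example.

```shell
# Report whether journald has a persistent directory to write to.
# JOURNAL_DIR is overridable only so the check is easy to try out.
journal_is_persistent() {
    dir="${JOURNAL_DIR:-/var/log/journal}"
    if [ -d "$dir" ]; then
        echo "persistent: $dir exists"
    else
        echo "volatile: $dir is missing, earlier boots are not kept"
        return 1
    fi
}

# One common way to switch to persistent storage (as root); check your
# distribution's documentation before relying on it:
#   mkdir -p /var/log/journal
#   systemd-tmpfiles --create --prefix /var/log/journal
#   journalctl --flush
```

If Storage= is set explicitly in /etc/systemd/journald.conf, that setting decides instead of the directory check.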

─────────────────────────────────────────

5 Service Troubleshooting — a calm method

When a service won't start, resist the urge to randomly restart. Work the evidence.

Step 1 — What does systemd say?

systemctl status nginx         # running? failed? recent log lines
systemctl --failed             # list ALL failed units at once

Step 2 — Read its actual logs

journalctl -u nginx -e         # newest log lines for this service
journalctl -xeu nginx          # extra explanation + service logs

Step 3 — Validate config before restarting

nginx -t                       # most services have a config test
sshd -t                        # (varies per service)

Step 4 — Common culprits to check

Config typo — caught by the test in step 3
Port already in use — ss -tulpn | grep ':80 '   (the trailing space keeps :8080 from matching; run as root to see the owning process)
Permissions / SELinux — ausearch -m avc -ts recent
Missing dependency — shown in the status/journal output
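
For the "port already in use" check, a plain grep for :80 also matches :8080 and :8000. A sketch of an exact-port filter follows; listeners_on_port is a made-up name, and it relies on the local address being the only field in ss output that ends in ":PORT" for a listening socket.

```shell
# listeners_on_port PORT
# Filters "ss -tulpn" output (read from stdin) down to sockets bound to
# exactly PORT, by matching ":PORT" at the end of a field.
listeners_on_port() {
    awk -v port="$1" '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ (":" port "$")) { print; next }
        }'
}
```

Usage: ss -tulpn | listeners_on_port 80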

Tip: "journalctl -xeu servicename" is the single best troubleshooting command — it combines the service's logs with systemd's own explanation of why it gave up.
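
Steps 1 and 2 can be bundled into one function so the evidence lands in a single scrollback. This is a sketch, not a standard tool: triage_unit is a made-up name, and the 50-line limit is an arbitrary choice.

```shell
# triage_unit UNIT
# Collects systemd's view of a unit, its recent logs, and any other failed units.
triage_unit() {
    unit="$1"
    if [ -z "$unit" ]; then
        echo "usage: triage_unit UNIT" >&2
        return 2
    fi
    echo "== status: $unit =="
    systemctl --no-pager status "$unit"     # non-zero exit is normal for a failed unit
    echo "== last 50 log lines, with explanations =="
    journalctl --no-pager -x -u "$unit" -n 50
    echo "== all failed units =="
    systemctl --no-pager --failed
}
```

Usage: triage_unit nginx 2>&1 | less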

─────────────────────────────────────────

6 Network Troubleshooting — layer by layer

Diagnose from your own machine outward, ruling out one layer at a time.

Step 1 — Do I have an IP and is the link up?

ip addr                        # my addresses
ip link                        # is the interface UP?

Step 2 — Can I reach the gateway, then the internet?

ip route                       # find the default gateway
ping -c 4 192.168.1.1          # reach the gateway (use the address from "ip route")
ping -c 4 8.8.8.8              # reach the internet by IP

Step 3 — Is it a DNS problem?

ping -c 4 google.com           # works by IP but not by name = DNS
nslookup google.com
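
When names fail to resolve, it helps to know which resolvers the box is actually using and whether each one answers. A sketch with made-up function names; it assumes nslookup (from bind-utils) is installed and uses its documented behaviour of exiting non-zero when a query fails.

```shell
# Print the resolver addresses from a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# check_nameservers [NAME] [FILE]
# Queries each configured resolver directly, so one dead server stands out.
check_nameservers() {
    name="${1:-google.com}"
    file="${2:-/etc/resolv.conf}"
    for ns in $(list_nameservers "$file"); do
        if nslookup "$name" "$ns" > /dev/null 2>&1; then
            echo "$ns ok"
        else
            echo "$ns FAILED"
        fi
    done
}
```

Usage: check_nameservers google.com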

Step 4 — Ports, connections and the path

ss -tulpn                      # what's listening locally
ss -tn state established       # current active connections
traceroute google.com          # where along the path it breaks

Tip: If ping to 8.8.8.8 works but google.com fails, it's almost always DNS — check /etc/resolv.conf and your nmcli ipv4.dns setting. This one pattern explains a huge share of "the internet is down" reports.
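
The layer-by-layer order above can be sketched as one function that stops at the first layer that fails. net_diagnose and its return codes are made up for this example; 8.8.8.8 and google.com are the same test targets used in the steps, and getent is used for the name lookup because it goes through the system resolver configuration.

```shell
# net_diagnose
# Walks outward: default route -> gateway -> internet by IP -> DNS.
# Returns 0 if everything answers, otherwise 1-4 for the first failing layer.
net_diagnose() {
    gw=$(ip route | awk '$1 == "default" && $2 == "via" { print $3; exit }')
    if [ -z "$gw" ]; then
        echo "no default route: check ip addr / ip link"
        return 1
    fi
    if ! ping -c 2 -W 2 "$gw" > /dev/null 2>&1; then
        echo "gateway $gw unreachable: local network problem"
        return 2
    fi
    if ! ping -c 2 -W 2 8.8.8.8 > /dev/null 2>&1; then
        echo "gateway ok, internet unreachable: upstream or routing problem"
        return 3
    fi
    if ! getent hosts google.com > /dev/null 2>&1; then
        echo "IP connectivity ok, name lookup failing: DNS problem"
        return 4
    fi
    echo "gateway, internet and DNS all ok"
}
```

Some gateways and firewalls drop ICMP, so a failed ping at step 2 or 3 is a strong hint rather than proof.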

─────────────────────────────────────────

Quick Reference Cheat Sheet

Live journal — journalctl -f
Service logs — journalctl -u name -e
Errors this boot — journalctl -p err -b
By time — journalctl --since "1 hour ago"
Journal size — journalctl --disk-usage
Trim journal — journalctl --vacuum-time=2weeks
Auth log — grep "Failed password" /var/log/secure
Log a message — logger "text"
Test logrotate — logrotate -d /etc/logrotate.d/file
Previous boot — journalctl -b -1
Kernel log — dmesg -T
Boot time — systemd-analyze blame
Failed units — systemctl --failed
Best service debug — journalctl -xeu name
Listening ports — ss -tulpn
Network path — traceroute host

─────────────────────────────────────────

What's your go-to first command when a server acts up? Share your troubleshooting workflow below.
