Translated by AI
Week 3 on the Job: Cleaning Up Black-Box Cron Jobs and Automating Monitoring
Introduction
By the Monday of the third week, the "invisible anxiety" finally begins to ease a bit. Once you have finished the inventory (Week 1 article) and the Top 3 emergency measures (Week 2 article), a holistic view of the system takes shape in your mind. Only at this stage are you ready to perform "environment housekeeping."
I have worked with Linux for over 20 years, and I still remember the tension I felt when I first tackled "environment housekeeping" in my third week. The two targets are "cron jobs nobody knows about" and "daily visual checks." Getting these two things under control visibly reduces the operational load from the following week onwards.
This article outlines the "environment housekeeping and automation" tasks for the third week. The target audience consists of those who have grasped the overall picture in the first two weeks and have completed the Top 3 emergency measures from the second week.
What this article covers (and what it doesn't)
What it covers:
- How to organize uncommented cron jobs and move them to /etc/cron.d/
- Decision-making flows for deleting dead services and unnecessary users
- Auditing sudo privileges (especially NOPASSWD) and criteria for recommending deletion
- A minimal monitoring script that sends disk/memory/CPU/uptime information via email every morning
- How to create a network diagram (ASCII art)
What it does not cover:
- Implementing full-scale monitoring platforms (Zabbix, Prometheus, etc.)
- Details on migrating to systemd timers (will be covered in a separate article)
- Building log aggregation infrastructure
Command examples assume operation on Ubuntu 22.04 / RHEL 9 systems.
Organizing Black-Boxed Cron Jobs: Start by Reading the Uncommented Lines
As mentioned in the Week 2 article, almost every environment has a cron job whose author and creation date nobody knows. You listed these jobs during the Week 1 inventory, so Week 3 is the time to read and organize them.
Criteria for evaluating cron jobs
Evaluate each cron job based on the following perspectives:
- Execution Timing: Is it still appropriate, or was it set based on outdated requirements?
- Execution User: Is the privilege minimized? Is it truly necessary to run as root?
- Command Content: Can you identify the script path, the owner of the executable, and read the content?
- Last Execution Result: Are there recent errors, or are logs being preserved?
Here is the specific confirmation procedure using commands:
# Check the existence and owner of the command path for each cron job
# Example: If /etc/cron.d/backup contains "/usr/local/bin/backup.sh"
ls -la /usr/local/bin/backup.sh
stat /usr/local/bin/backup.sh
# Read the script content
sudo cat /usr/local/bin/backup.sh
# Recent execution history (RHEL family; rotated logs are named cron-YYYYMMDD)
sudo grep -i "backup.sh" /var/log/cron /var/log/cron-* 2>/dev/null | tail -30
# Recent execution history (Debian/Ubuntu family)
sudo grep -i "backup.sh" /var/log/syslog /var/log/syslog.* 2>/dev/null | tail -30
# Via journalctl (the cron daemon logs as CRON on Debian/Ubuntu and as CROND on RHEL)
sudo journalctl -t CRON -t CROND --since "7 days ago" | grep -i "backup.sh"
For jobs whose purpose you understand, move them to /etc/cron.d/ as described below. For jobs you don't understand, do not delete them, and do not disable them yet; simply record them as they are for now.
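The existence check above can be repeated for every job in a file. The following is a minimal sketch, assuming the /etc/cron.d format (five schedule fields, a user column, then the command); the function names `cron_command_path` and `audit_cron_file` are hypothetical helpers, and `@daily`-style entries are not handled.

```shell
#!/bin/sh
# Print the command path (first word of the command) from one
# /etc/cron.d-style line. Prints nothing for comments, blank lines,
# and VAR=value lines.
cron_command_path() {
  printf '%s\n' "$1" | awk '
    /^[ \t]*(#|$)/      { next }      # comment or blank line
    $1 ~ /^[A-Za-z_]+=/ { next }      # environment setting such as PATH=...
    NF >= 7             { print $7 }  # fields 1-5 schedule, 6 user, 7 command
  '
}

# Report, for every job in the given file, whether the command path exists.
audit_cron_file() {
  while IFS= read -r line || [ -n "$line" ]; do
    path=$(cron_command_path "$line")
    [ -n "$path" ] || continue
    if [ -e "$path" ]; then
      printf 'OK %s\n' "$path"
    else
      printf 'MISSING %s\n' "$path"
    fi
  done < "$1"
}

# Usage: audit_cron_file /etc/cron.d/backup
```

A MISSING line only means the path was not found from where you ran the check; record it as a finding rather than treating it as grounds for deletion.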
The "Leave an Intent" Rule for Uncommented Cron Jobs
In cases where you understand the meaning but there are no comments in the original file, adding a single-line comment above the job will save your future self six months down the line.
# Before
0 3 * * * root /usr/local/bin/mysqldump-all.sh
# After
# Daily full DB backup (output to /mnt/backup/daily) - Owner: infra team
# Last checked: 2026-04-22 Miyazaki
0 3 * * * root /usr/local/bin/mysqldump-all.sh
Just one line of comment marks it as a "job that is safe to touch." Spending half a day adding these to all jobs is well worth the effort.
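To find which jobs still lack a comment, something like the following sketch may help. It assumes a job counts as "commented" only when the line directly above it is a comment, which is a convention chosen for this example; `list_uncommented_jobs` is a hypothetical name.

```shell
#!/bin/sh
# List job lines that are not directly preceded by a comment line.
# Output format: file:line-number: job line
list_uncommented_jobs() {
  awk '
    FNR == 1            { prev_comment = 0 }        # reset at the start of each file
    /^[ \t]*$/          { prev_comment = 0; next }  # blank line breaks the link
    /^[ \t]*#/          { prev_comment = 1; next }  # remember the comment
    $1 ~ /^[A-Za-z_]+=/ { prev_comment = 0; next }  # VAR=value lines are not jobs
    {
      if (!prev_comment) print FILENAME ":" FNR ": " $0
      prev_comment = 0
    }
  ' "$@"
}

# Usage: list_uncommented_jobs /etc/cron.d/*
```

Because each job needs its own comment under this convention, two jobs that share one comment will still have the second one reported.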
Procedure for migrating to /etc/cron.d/
If production jobs are located in individual user crontabs, move them under /etc/cron.d/. This offers the following benefits:
- Can be managed as files (easy to manage via git)
- The execution user is explicitly stated
- Immune to accidents with the crontab command (e.g., deleting everything with crontab -r)
Steps:
# Back up the existing crontab
sudo crontab -u appuser -l > /tmp/appuser-crontab.bak
# Create a destination file under /etc/cron.d/
sudo vi /etc/cron.d/appuser-jobs
How to write the content (note that a username column is required):
# /etc/cron.d/appuser-jobs
# Jobs for appuser (migrated from crontab on 2026-04-22)
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=admin@example.com
# Daily log aggregation (formerly appuser crontab)
0 4 * * * appuser /usr/local/bin/aggregate-logs.sh
Set the permissions correctly (read-only, owned by root).
sudo chown root:root /etc/cron.d/appuser-jobs
sudo chmod 644 /etc/cron.d/appuser-jobs
After migration, run both in parallel for a few days before deleting the original user crontab. Before you start, check whether running the job twice would cause problems (log rotation is a typical example). If it would, disable the original job first and only then enable the new entry.
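The only structural difference between a user crontab and a file under /etc/cron.d/ is the extra user column, so the conversion can be drafted mechanically and then reviewed by hand. This is a rough sketch, not a complete converter; `crontab_to_crond` is a hypothetical helper, and lines it cannot parse are emitted as comments for manual review.

```shell
#!/bin/sh
# Read user-crontab text on stdin and print it in /etc/cron.d format,
# inserting the user name given as $1 between the schedule and the command.
crontab_to_crond() {
  awk -v user="$1" '
    /^[ \t]*(#|$)/                  { print; next }  # keep comments and blanks
    $1 ~ /^[A-Za-z_][A-Za-z0-9_]*=/ { print; next }  # keep VAR=value lines
    /^[ \t]*@/ {                                     # @daily, @reboot, ...
      cmd = $0
      sub(/^[ \t]*[^ \t]+[ \t]+/, "", cmd)
      print $1, user, cmd
      next
    }
    NF >= 6 {
      cmd = $0
      # Strip the five schedule fields; the command text stays untouched.
      for (i = 1; i <= 5; i++) sub(/^[ \t]*[^ \t]+[ \t]+/, "", cmd)
      print $1, $2, $3, $4, $5, user, cmd
      next
    }
    { print "# UNPARSED: " $0 }
  '
}

# Usage (review the draft before copying it into /etc/cron.d/):
#   crontab_to_crond appuser < /tmp/appuser-crontab.bak > /tmp/appuser-jobs.draft
```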
Decision-Making Flow for Deleting Dead Services and Unnecessary Users
You should have already created a list of "services that are running but not enabled" and "accounts of unknown purpose" during the inventory in Week 1. In Week 3, you will decide one by one whether these can be deleted.
Decision-making flow for service deletion
[ Service is running ]
↓
[ Are there any traces of activity in the logs in the last month? ]
├─ YES → Keep, interview the person in charge
└─ NO → Proceed to next
↓
[ Dependencies (is it required by other services?) ]
├─ YES → Keep
└─ NO → Proceed to next
↓
[ First, stop it and observe for 1 week ]
↓
[ If no issues, disable ]
↓
[ If no issues after another week, it can be deleted ]
Specific commands:
# Recent 1-month logs of the service
sudo journalctl -u suspicious.service --since "1 month ago" | tail -50
# Dependencies of other units (is anything requiring this service?)
sudo systemctl list-dependencies --reverse suspicious.service
# Stop (do not make persistent yet, just stop manually)
sudo systemctl stop suspicious.service
# After 1 week of observation, if no issues, disable automatic startup
sudo systemctl disable suspicious.service
# After another week with no issues, consider removing the package
# Debian/Ubuntu
sudo apt-get remove --purge <package-name>
# RHEL family
sudo dnf remove <package-name>
The key is to spend at least two weeks on the "stop → disable → delete" process. If you delete a service in a single day, you often discover the following month that a monthly batch job depended on it.
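The first two branches of the flow can be expressed as a small function, which keeps the decision consistent when you work through a long list of services. This is a sketch under assumptions: `service_next_step` and `service_evidence` are hypothetical names, and counting journal lines is only a rough proxy for "traces of activity."

```shell
#!/bin/sh
# Decide the next step from two numbers:
#   $1 = journal lines logged by the unit in the last month
#   $2 = number of units that depend on it
service_next_step() {
  if [ "$1" -gt 0 ]; then
    echo "keep: recent activity, interview the person in charge"
  elif [ "$2" -gt 0 ]; then
    echo "keep: required by other units"
  else
    echo "stop and observe for 1 week"
  fi
}

# Gather the two numbers for a unit (needs systemd; run with sudo).
service_evidence() {
  logs=$(journalctl -u "$1" --since "1 month ago" -q --no-pager | wc -l)
  # The first line of list-dependencies is the unit itself, so skip it.
  deps=$(systemctl list-dependencies --reverse --plain "$1" | tail -n +2 | grep -c .)
  service_next_step "$logs" "$deps"
}

# Usage: service_evidence suspicious.service
```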
Decision-making for deleting unnecessary users
Only accounts that satisfy all four conditions checked during the inventory ("no last login," "no SSH keys," "no crontab," and "no systemd User definition") are candidates for deletion. If even one of these conditions is not met, keep the account.
# Check whether the user "olduser" is still in use
# (TARGET_USER is used because the shell already sets USER to your own login name)
TARGET_USER=olduser
# Last login
lastlog -u "$TARGET_USER"
# SSH keys
sudo ls -la /home/"$TARGET_USER"/.ssh/authorized_keys 2>/dev/null
# Crontab
sudo crontab -u "$TARGET_USER" -l 2>/dev/null
# Is it specified in a systemd unit under User=
sudo grep -rlE "^User=$TARGET_USER\b" /etc/systemd/system /usr/lib/systemd/system 2>/dev/null
# Are there any active processes?
ps -u "$TARGET_USER" 2>/dev/null
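If lastlog reports "**Never logged in**", the SSH key, crontab, and systemd checks print nothing, and ps shows no processes (only its header line), first lock the account with usermod -L and observe for one week. If no problems occur, delete it with userdel. It is safer to archive the home directory before deleting it.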
# First, lock (disable password, not delete)
sudo usermod -L olduser
# After 1 week of observation, archive the home directory
sudo tar -czf /var/backups/olduser_home_$(date +%Y%m%d).tar.gz /home/olduser
# Delete the account
sudo userdel olduser
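Two of the checks can be made stricter and easier to script. The sketch below assumes hypothetical helper names (`has_ssh_keys`, `units_running_as`); it takes the home directory as an argument rather than assuming /home/<user>, and it matches the User= value exactly so that a name such as olduser-batch is not mistaken for olduser.

```shell
#!/bin/sh
# Succeeds if the given home directory holds a non-empty authorized_keys file.
has_ssh_keys() {
  [ -s "$1/.ssh/authorized_keys" ]
}

# Print the unit files under the given directories that run as exactly this user.
#   $1 = user name, remaining arguments = directories to search
units_running_as() {
  unit_user=$1
  shift
  grep -rlE "^User=${unit_user}[[:space:]]*\$" "$@" 2>/dev/null
}

# Usage (run with sudo so that other users' files are readable):
#   home=$(getent passwd olduser | cut -d: -f6)
#   has_ssh_keys "$home" && echo "SSH keys present: keep the account"
#   units_running_as olduser /etc/systemd/system /usr/lib/systemd/system
```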
Auditing sudo Privileges (NOPASSWD) and Criteria for Recommended Deletion
You should have read /etc/sudoers and the files under /etc/sudoers.d/ on Day 5 of the first week. In Week 3, you will audit them, paying particular attention to NOPASSWD entries.
The risk of leaving NOPASSWD
NOPASSWD is a setting that skips the password prompt when running sudo. It is often added for automation scripts, but once it is in place, people almost always forget who added it and why.
# Extract NOPASSWD entries
sudo grep -rE 'NOPASSWD' /etc/sudoers /etc/sudoers.d/ 2>/dev/null
Output example:
/etc/sudoers.d/deploy:deployuser ALL=(ALL) NOPASSWD: ALL
/etc/sudoers.d/backup:backupuser ALL=(root) NOPASSWD: /usr/local/bin/mysqldump-all.sh
Criteria for recommended deletion
Evaluate using the following checklist:
- Is the command scope set to ALL?
  - NOPASSWD: ALL is a high-priority deletion candidate.
  - Restrict it to specific commands instead (e.g., NOPASSWD: /usr/local/bin/mysqldump-all.sh).
- Is the target user a real person or an automated system account?
  - NOPASSWD for human users should generally be deleted.
  - Keep restricted NOPASSWD for system accounts (deploy, backup, etc.).
- Is the user currently active?
  - Check if it is used via last login, cron, or systemd units (steps from the previous section).
- Will business operations be affected if we start asking for passwords?
  - If so, restrict it to limited NOPASSWD.
  - If not, remove NOPASSWD.
Always use visudo for changes. Since sudo itself will stop working if there is a syntax error, be sure to pass visudo's syntax check.
# Also edit individual files under /etc/sudoers.d/ using visudo
sudo visudo -f /etc/sudoers.d/deploy
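The first item of the checklist (is the scope ALL?) can be applied to the grep output automatically. This is a minimal sketch; `classify_nopasswd` is a hypothetical name, it reads lines in the "file:entry" form that grep -r prints, and it does not filter out commented entries.

```shell
#!/bin/sh
# Label each NOPASSWD entry read from stdin:
#   HIGH   = NOPASSWD covers ALL commands
#   REVIEW = NOPASSWD restricted to specific commands
classify_nopasswd() {
  awk '
    /NOPASSWD:[ \t]*ALL[ \t]*(,|$)/ { print "HIGH " $0; next }
    /NOPASSWD:/                     { print "REVIEW " $0 }
  '
}

# Usage:
#   sudo grep -rE 'NOPASSWD' /etc/sudoers /etc/sudoers.d/ 2>/dev/null | classify_nopasswd
```

After editing, visudo's check mode (sudo visudo -cf /etc/sudoers.d/deploy) can confirm that the file still parses.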
Recording changes
sudo privileges are a common target for audits. Keep a record of changes in ~/ops-docs/sudo-changes.md.
# sudo privilege change history
## 2026-04-22
- /etc/sudoers.d/deploy: Changed `deployuser ALL=(ALL) NOPASSWD: ALL` to `deployuser ALL=(root) NOPASSWD: /usr/local/bin/deploy.sh`
- Reason: NOPASSWD for all commands is excessive; restricted to deployment script only
- Impact verification: Confirmed that the corresponding script executes normally
Minimal Monitoring Script for Every Morning: Email Notifications are Enough
Introducing a new monitoring platform is beyond the scope of Week 3. However, you can start "looking at the main metrics every morning" today.
The script itself
sudo vi /usr/local/bin/daily-health-report.sh
Content:
#!/bin/bash
# /usr/local/bin/daily-health-report.sh
# Send a server status report every morning via email
set -eu
HOSTNAME=$(hostname)
TO="admin@example.com"
SUBJECT="[$HOSTNAME] Daily Health Report $(date +%Y-%m-%d)"
# Assemble the body with a heredoc
BODY=$(cat <<EOF
Server: $HOSTNAME
Date: $(date -Iseconds)
--- Uptime ---
$(uptime)
--- Load Average (1m/5m/15m) ---
$(cut -d' ' -f1-3 /proc/loadavg)
--- Memory ---
$(free -h)
--- Disk Usage ---
$(df -hT | grep -vE 'tmpfs|devtmpfs')
--- Top 5 CPU processes ---
$(ps aux --sort=-%cpu | head -6)
--- Top 5 Memory processes ---
$(ps aux --sort=-%mem | head -6)
--- Failed systemd units ---
$(systemctl --failed --no-legend | grep . || echo "none")
--- Last 5 logins ---
$(last -n 5 -F)
EOF
)
echo "$BODY" | mail -s "$SUBJECT" "$TO"
Grant execution permissions and register it in cron.
sudo chmod +x /usr/local/bin/daily-health-report.sh
# Execute manually once to check if email arrives
sudo /usr/local/bin/daily-health-report.sh
Register in /etc/cron.d/daily-health:
# /etc/cron.d/daily-health
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
# Execute every morning at 7:00 AM
0 7 * * * root /usr/local/bin/daily-health-report.sh
If email delivery does not work
If the company's SMTP server is unavailable, change the configuration to use curl to send to a Slack or Teams Webhook instead of the mail command. I have used email here as a minimal configuration, but the destination can be anything. The essence is that "metrics reach you automatically every morning."
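As an illustration of the webhook route, the sketch below posts the report text to a Slack incoming webhook, which accepts a JSON body with a "text" field. The function names and the URL in the usage line are placeholders, and the escaping covers only backslashes, double quotes, and newlines, so it assumes the report contains no tabs or other control characters.

```shell
#!/bin/sh
# Escape a string for use inside a JSON string literal
# (backslash, double quote, and newline only).
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the JSON payload for the given message text.
build_payload() {
  printf '{"text": "%s"}' "$(json_escape "$1")"
}

# POST the message to the webhook. $1 = webhook URL, $2 = message text.
send_report_to_webhook() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(build_payload "$2")" "$1"
}

# Usage inside the report script, replacing the mail line
# (the URL is a placeholder for your own webhook):
#   send_report_to_webhook "https://hooks.slack.com/services/XXX/YYY/ZZZ" "$BODY"
```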
Why is this beneficial?
If you get into the habit of looking at this report every morning, you will start noticing "unusual values." Even without installing a monitoring platform, there are many anomalies that can be detected by just using your eyes and brain. I have personally noticed, via this morning email, that "the load was 1.5 yesterday but 8.0 this morning," which helped me catch a database abnormality early.
Network Diagram: ASCII Art that Conveys the Big Picture in 5 Minutes
What is most effective for handovers and troubleshooting is a "one-page A4 network diagram." Any tool can be used to draw it, but using ASCII art allows you to manage it in git and edit it with a text editor.
Template
                         [Internet]
                              |
                         [ Router ]
                         192.168.0.1
                              |
             +----------------+----------------+
             |                |                |
        [L2 Switch]      [L2 Switch]      [Firewall]
             |
    +--------+--------+                     [ DMZ ]
    |        |        |                        |
 [web01]  [db01]  [file01]              [reverse proxy]
 192.168  192.168  192.168                     |
  .1.10    .1.20    .1.30             (Public Web, etc.)
Legend:
[ ] = Host or device
- = L2 connection
| = L2 connection (vertical)
Notes:
- web01: Nginx + PHP-FPM / Business Web front
- db01: MySQL 8.0 / Business DB
- file01: Samba / Departmental file share
- reverse proxy: Nginx / Entry for public services
- Management VLAN: 192.168.100.0/24 (accessed from admin PC only)
This level of detail is sufficient. Conveying "what is connected to what" and "how to gain access" matters more than drawing every line exactly. The diagram passes if your successor can understand the overall picture in 5 minutes.
Where to store the file
Save it as ~/ops-docs/network-diagram.txt. Commit it to the same repository as your inventory report.
cd ~/ops-docs
git add network-diagram.txt
git commit -m "Week3: add ascii network diagram"
Since you commit every time you update the diagram, you can track "when the network configuration changed" using git log -p network-diagram.txt. This serves as a secondary benefit of document version control.
Summary
Let's look back at what we did in Week 3.
- Organize black-boxed cron jobs in order of "read → add comments → move to /etc/cron.d/."
- Spend at least two weeks on the "stop → disable → delete" process for dead services.
- Lock unnecessary users only if they meet all conditions: "no last login, no SSH keys, no crontab, no systemd User specified."
- For sudo NOPASSWD, prioritize deletion candidates starting from NOPASSWD: ALL, and restrict to specific commands.
- For minimal monitoring every morning, a single email notification script is enough; postpone the introduction of a monitoring platform.
- Write network diagrams as ASCII art managed by git, at a granularity that conveys the big picture in 5 minutes.
The essence of Week 3 is preparation for housekeeping and automation. Proceed with the feeling that you are implementing a minimal configuration that will make things easier for your future self six months from now, without aiming for perfection from the start.
I will quote here what my superior told me in my 20s: "If it fails, make sure you can restore it immediately." For cron migration, service deletion, and sudo privilege changes, confirm the restoration procedure before taking action. Two habits alone make your operations restorable: building in a parallel operation period and leaving records in commit logs.
Next time, we will move on to how to write monthly reports to report these achievements to management.
About the Author
Tomohiro Miyazaki / Representative of E-Net Mercury Co., Ltd.
Over 20 years of experience in Linux operation and education. Operates the education brand "LinuxMaster.JP" (IT seminars, email, blog) for IT engineers working in the field (including solo sysadmins in small and medium-sized enterprises).
If you are interested, please check out the free newsletter that delivers Linux know-how for practical work:
Please feel free to leave comments or questions regarding the content of the article.