Building a bulletproof family photo archive: Flickr, Synology, and Hetzner Storage Box

Or: how I stopped trusting a single cloud provider with fifteen years of irreplaceable memories and built something that would survive a small apocalypse.
The problem with "it just works"
For years, iCloud was the answer to the question nobody was asking out loud: "What happens to my photos if Apple decides I've violated some obscure policy, or if their servers have a bad Tuesday?" iCloud works beautifully, right up until it doesn't. And when it doesn't, the support ticket experience is roughly equivalent to shouting into a sock.
The real issue isn't reliability. iCloud is reliable. The issue is control. With iCloud, you have none. You can't sync your photos to an external backup without jumping through hoops. You can't run a script that pulls your originals at 3am and stores them somewhere you own. You are a tenant, and the landlord doesn't give you a key to the building.
So I went looking for an alternative. The requirements were simple, or so I thought:
- Store original files, not compressed versions
- Accessible via a proper API
- Stable enough to still exist in five years
- Supports automatic upload from iPhone
- Reasonably priced
That last point ruled out a lot of options fast.
Why Flickr?
This might raise an eyebrow. Flickr? The platform that peaked in 2008 and then spent a decade slowly dissolving? Yes, that Flickr. And it turns out the rumours of its death were greatly exaggerated.
Here is what changed the calculation:
SmugMug acquired Flickr in 2018. SmugMug is a profitable, independently owned company that has been selling photo hosting since 2002. They are not a VC-backed startup burning cash to reach an exit. They bought Flickr because they actually want to run it.
The Flickr Foundation was established in 2022 with a mandate for long-term digital preservation. Their stated goal is to ensure Flickr exists for at least another 100 years. Whether or not they achieve that is debatable, but it signals intent in a way that "move fast and break things" companies simply don't.
Flickr Pro gives you unlimited original storage for €7/month. Not compressed originals. Not "high quality." The actual original file, exactly as it came off your camera or phone, with full EXIF metadata intact.
The API is comprehensive and uses OAuth 1.0a. This is important. OAuth 1.0a tokens don't expire. You generate them once, store them, and they work forever without refresh loops or token rotation. For an unattended backup script running at 4am, this is a significant advantage over OAuth 2.0's expiring tokens.
Compared to the alternatives:
| Platform | Original files via API | Token stability | Price | Longevity confidence |
|---|---|---|---|---|
| Flickr Pro | ✅ Yes | ✅ Non-expiring | €7/mo | High |
| Google Photos | ❌ No (compressed) | Medium | Free* | Low |
| iCloud | ❌ No API | N/A | €2.99/mo | High |
| Immich (self-hosted) | ✅ Yes | ✅ Yes | Hardware cost | Depends on you |
*Google Photos is free until it isn't, and their API was gutted in 2019 to the point where you cannot retrieve original files. It is useless for backup purposes.
The decision was made. Flickr Pro it is.
The backup architecture
Having your photos in one place, even a good one, is not a backup. A backup is at least two additional copies in different locations. The target architecture:
```
iPhone
  │
  │ (auto-upload)
  ▼
Flickr Pro ──────────────────────── Primary source of truth
  │                         │
  │ Script 1 (nightly)      │ Script 2 (nightly, after Script 1)
  ▼                         ▼
Synology NAS             Hetzner Storage Box
(local network,          (geographically separate,
 Samba share,             Samba share)
 daily snapshots)
```
Three copies. Three different failure modes. For all three to be lost at once, something would have to mount a targeted attack on your personal photo collection, which, unless you are considerably more interesting than the average person, seems unlikely.
Why Synology + Samba
The Synology is already running at home. It has a Samba share. It does daily snapshots with 365-day retention. Adding a dedicated share for photos costs nothing and gains a fast, local, snapshot-protected copy that can be browsed directly from any device on the network.
Why Hetzner Storage Box
Hetzner is a German hosting provider with a strong reputation for reliability and competitive pricing. Their Storage Box product is a dedicated storage volume that supports multiple protocols including SMB/CIFS, SFTP, and WebDAV. It is not object storage, it is a proper filesystem-accessible volume, which means the same Samba mounting approach used for the Synology works here too, with minimal changes.
Storage Boxes are geographically separated from a home network by definition, and being hosted in Germany, they fall under strict European data protection regulations. At €3.81/month for 1TB, the price is competitive with object storage alternatives at any meaningful volume.
The practical consequence of using SMB for both destinations: both scripts are structurally identical. This is a feature, not a limitation.
Why two separate scripts
Both scripts mount a Samba share inside a Docker container, which requires the SYS_ADMIN capability. Despite the architectural similarity, keeping them separate is the right call: they target physically different locations via different network paths, have independent state databases, and fail independently.
More importantly: they are independent. If one fails, the other still runs. If the Synology is offline for maintenance, Flickr and Hetzner still sync. If the home network is unreachable, Hetzner still has yesterday's copy. Independence is the point.
The network layer: Nebula
The Synology is on the home network. The VPS running the scripts is not. Getting from one to the other requires either opening ports on the home router (bad) or using an overlay network (good).
The solution here is Nebula, a mesh VPN by Slack. Each node gets a certificate, traffic is encrypted, and machines can reach each other as if they were on the same LAN regardless of physical location. The Synology is at 10.100.100.4 on the Nebula network, the VPS at 10.100.100.1.
This means the Samba mount path is simply:
//10.100.100.4/flickr/username
No port forwarding. No exposed services. No VPN configuration on the router. It just works, this time genuinely.
The Flickr API and OAuth 1.0a
Here is where things get mildly interesting from a technical standpoint. Flickr uses OAuth 1.0a, which was designed in an era when people still used Internet Explorer without irony. It requires request signing with HMAC-SHA1, which means every API call must include a cryptographic signature constructed from your API credentials, a timestamp, a nonce, and every query parameter, sorted lexicographically, URL-encoded, and combined in a specific order.
It sounds worse than it is. Once you have the signing function working, it works forever. Here is the core of it:
```python
def make_oauth_signature(method: str, url: str, params: dict) -> str:
    sorted_params = "&".join(
        f"{oauth_encode(k)}={oauth_encode(v)}"
        for k, v in sorted(params.items())
    )
    base_string = "&".join([
        method.upper(),
        oauth_encode(url),
        oauth_encode(sorted_params),
    ])
    signing_key = f"{oauth_encode(API_SECRET)}&{oauth_encode(ACCESS_SECRET)}"
    hashed = hmac.new(
        signing_key.encode("utf-8"),
        base_string.encode("utf-8"),
        hashlib.sha1,
    )
    return base64.b64encode(hashed.digest()).decode("utf-8")
```
Obtaining the OAuth tokens requires a one-time interactive process. You run a small helper script, it gives you an authorization URL, you open it in a browser, approve access, paste the verification code back, and receive your access token and secret. Store these in your .env file and never think about them again. They don't expire.
Getting photo URLs
Flickr stores multiple sizes of each photo. To get the original, you call flickr.photos.getSizes and look for the size labelled "Original". For videos, you look for "Video Original" first, then fall back to "HD MP4", "Site MP4", and "Mobile MP4" in that order.
One thing worth noting: Flickr's API returns the media type in the media field of photo listings. A video will have media: "video" but its originalformat might say jpg or mp4 — the format field is unreliable for videos. Always use getSizes to determine the actual file type.
Script 1: Flickr to Orion (Synology NAS)
The Orion sync script runs in a Docker container for isolation. It mounts the Samba share, compares the Flickr contents with its local SQLite state database, and downloads anything new or changed.
The state database
The script maintains a SQLite database with two tables:
```sql
CREATE TABLE albums (
    flickr_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    local_path TEXT NOT NULL,
    last_updated INTEGER NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE photos (
    flickr_id TEXT NOT NULL,
    album_id TEXT NOT NULL,
    filename TEXT NOT NULL,
    last_updated INTEGER NOT NULL,
    samba_ok INTEGER NOT NULL DEFAULT 0,
    deleted INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (flickr_id, album_id)
);
```
This database is the script's memory. Between runs, it knows which photos exist, where they are stored, and whether they were successfully written to the Samba share.
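As an illustration of how that memory is used, the question "what still needs downloading" is a single query. This sketch builds a simplified in-memory copy of the photos table; the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photos (
    flickr_id TEXT NOT NULL,
    album_id TEXT NOT NULL,
    filename TEXT NOT NULL,
    last_updated INTEGER NOT NULL,
    samba_ok INTEGER NOT NULL DEFAULT 0,
    deleted INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (flickr_id, album_id)
);
""")
conn.executemany(
    "INSERT INTO photos VALUES (?,?,?,?,?,?)",
    [("1", "a", "1.jpg", 0, 1, 0),   # already written to the share
     ("2", "a", "2.jpg", 0, 0, 0),   # known but not yet downloaded
     ("3", "a", "3.mov", 0, 0, 1)],  # deleted on Flickr, skip
)
# Only rows that are neither confirmed on the share nor deleted remain.
pending = [r[0] for r in conn.execute(
    "SELECT flickr_id FROM photos WHERE samba_ok = 0 AND deleted = 0"
)]
print(pending)  # ['2']
```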
Real sync vs. wishful thinking
An important design decision: the script doesn't trust its own database blindly. Before skipping a file (because samba_ok = 1), it checks whether the file actually exists on disk. If someone manually deleted a file from the Samba share, the script will re-download it on the next run.
```python
if samba_ok and known:
    if not (album_path / known["filename"]).exists():
        samba_ok = 0
        log(f" Missing on Samba, re-downloading: {known['filename']}")
```
This is the difference between a sync tool and a script that just pretends to be one.
The Camera Roll problem
Flickr has a concept of "not in any set": photos that exist in the library but haven't been added to an album. These are typically auto-uploaded photos from the iPhone that haven't been organized yet. The script handles these as a virtual album called "Camera Roll", using the flickr.photos.getNotInSet API method.
Filename handling for videos
This tripped us up initially. The Flickr API returns originalformat: "jpg" for a video because that's what the photo record says. But when you call getSizes, you get back a URL ending in .mov. The script resolves the URL first, then constructs the filename from the actual URL extension, not the metadata. The database stores the resolved filename (12345678.mov, not 12345678.jpg), and all subsequent file existence checks use the database filename.
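A minimal sketch of that rule, deriving the filename from the resolved URL rather than from metadata (the helper name and example URL are made up):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Take the extension from the actual download URL; fall back to .jpg
# only if the URL path carries no extension at all.
def filename_from_url(photo_id: str, url: str) -> str:
    ext = PurePosixPath(urlparse(url).path).suffix or ".jpg"
    return f"{photo_id}{ext}"

print(filename_from_url("12345678", "https://example.com/video/12345678_abc.mov"))
# 12345678.mov
```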
Safe downloads
Files are downloaded to a .tmp extension and renamed only when the download completes successfully. On startup, any leftover .tmp files from a previous crash are cleaned up. Because samba_ok is only set to 1 after a successful rename, the database entries for those files were never marked as complete, so the normal sync loop re-downloads them automatically on the next run.
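The .tmp-then-rename pattern itself is a few lines. This sketch assumes a `fetch` callable that streams the response body into an open file object; the helper name is illustrative:

```python
from pathlib import Path

# Write to a temp name, rename on success, clean up on any failure.
def safe_download(fetch, dest: Path) -> None:
    tmp = dest.with_name(dest.name + ".tmp")
    try:
        with tmp.open("wb") as f:
            fetch(f)          # stream the body into the temp file
        tmp.rename(dest)      # rename is atomic within one filesystem
    except Exception:
        tmp.unlink(missing_ok=True)  # never leave a half-written file behind
        raise
```

Readers of the destination directory therefore only ever see complete files, and a crash mid-download leaves nothing but a .tmp to sweep up.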
The lock file
The script writes a lock file containing the current Unix timestamp when it starts, and removes it when it finishes. If the lock file exists when the script starts, it checks the age. If the lock is older than the configured maximum runtime, it assumes something went wrong and removes it. This prevents zombie processes from blocking future runs indefinitely.
A note on OAuth retries
An important implementation detail: OAuth signatures include a timestamp and a nonce. If you build the signature once before entering a retry loop and then wait 60 seconds between attempts, subsequent retries will carry an expired timestamp and Flickr will reject them. The fix is to rebuild the OAuth parameters (timestamp, nonce, and signature) fresh on every attempt inside the loop.
Additionally, two separate wait constants are used: RATE_LIMIT_WAIT for genuine HTTP 429 responses, and API_ERROR_WAIT for transient API failures. The latter uses exponential backoff starting at 5 seconds. This avoids the situation where a brief Flickr hiccup causes the script to wait 60 seconds per photo across an entire album.
Docker configuration
```dockerfile
FROM python:3.12-alpine

RUN apk add --no-cache cifs-utils tzdata && \
    pip install --no-cache-dir requests

WORKDIR /app
COPY sync.py .

CMD ["python", "-u", "sync.py"]
```
The cifs-utils package provides the mount.cifs command for Samba mounting. The SYS_ADMIN capability in docker-compose is required for mounting network filesystems inside a container.
```yaml
services:
  flickr-sync-orion:
    build: .
    env_file: .env
    environment:
      - TZ=${TZ}
    volumes:
      - ./state:/state
    cap_add:
      - SYS_ADMIN
    security_opt:
      - apparmor:unconfined
    restart: "no"
```
Script 2: Flickr to Hetzner Storage Box
The Hetzner sync script is architecturally identical to the Orion script. It mounts a Samba share, compares Flickr contents with its SQLite state database, and downloads anything new or changed. The only meaningful differences are in the .env configuration and one additional mount option.
Hetzner Storage Box and SMB
Hetzner Storage Boxes expose an SMB/CIFS share at your-account.your-storagebox.de. Unlike the Synology, Hetzner requires SMBv3 explicitly. Passing vers=2 or omitting the version altogether results in a failed mount. The solution is the SMB_EXTRA_OPTS environment variable, which appends arbitrary options to the CIFS mount command:
```python
SMB_EXTRA_OPTS = os.environ.get("SMB_EXTRA_OPTS", "")

# In the mount command:
"-o", f"username={SMB_USER},password={SMB_PASSWORD},uid=0,gid=0{(',' + SMB_EXTRA_OPTS) if SMB_EXTRA_OPTS else ''}",
```
Set SMB_EXTRA_OPTS=vers=3 in .env and the mount works. The Orion script also reads this variable but leaves it empty, so the Synology mount is unaffected.
Subaccount base directory
Hetzner supports subaccounts with their own credentials and a configurable base directory. If you create a subaccount with base directory /username, that subaccount's SMB share root is already the right place. In that case, SMB_SUBFOLDER can be left empty, and the script handles this gracefully:
```python
SMB_PATH = f"//{SMB_HOST}/{SMB_SHARE}" + (f"/{SMB_SUBFOLDER}" if SMB_SUBFOLDER else "")
```
This ensures no trailing slash is appended to the mount path when the subfolder is empty, which CIFS would reject as an invalid argument.
Docker configuration
Identical to the Orion script — same Dockerfile, same docker-compose.yml structure with SYS_ADMIN and apparmor:unconfined. Only the service name differs:
```yaml
services:
  flickr-sync-hetzner:
    build: .
    env_file: .env
    environment:
      - TZ=${TZ}
    volumes:
      - ./state:/state
    cap_add:
      - SYS_ADMIN
    security_opt:
      - apparmor:unconfined
    restart: "no"
```
Configuration
Both scripts are configured entirely through environment variables loaded from a .env file. This makes them portable and trivially adaptable for additional users.
Orion .env
```ini
# Flickr
FLICKR_API_KEY=your_api_key
FLICKR_API_SECRET=your_api_secret
FLICKR_ACCESS_TOKEN=your_access_token
FLICKR_ACCESS_SECRET=your_access_secret

# Samba
SMB_HOST=your_synology_ip
SMB_SHARE=flickr
SMB_USER=your_samba_user
SMB_PASSWORD=your_samba_password
SMB_SUBFOLDER=username

# Settings
MIN_FREE_SPACE_GB=10
MAX_RUNTIME_HOURS=10
TZ=Europe/Amsterdam
```
Hetzner .env
```ini
# Flickr
FLICKR_API_KEY=your_api_key
FLICKR_API_SECRET=your_api_secret
FLICKR_ACCESS_TOKEN=your_access_token
FLICKR_ACCESS_SECRET=your_access_secret

# Samba
SMB_HOST=your-account.your-storagebox.de
SMB_SHARE=your-account
SMB_USER=your-account
SMB_PASSWORD=your_password
SMB_SUBFOLDER=

# Settings
SMB_EXTRA_OPTS=vers=3
MIN_FREE_SPACE_GB=10
MAX_RUNTIME_HOURS=10
TZ=Europe/Amsterdam
```
Directory structure
```
/root/flickr-sync-username/
├── orion/
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── .env
│   ├── sync.py
│   └── state/
│       └── flickr_samba.db
└── hetzner/
    ├── Dockerfile
    ├── docker-compose.yml
    ├── .env
    ├── sync.py
    └── state/
        └── flickr_samba.db
```
Adding a second user is a matter of copying the directory, updating the .env files with new credentials, and adding a cron entry. The scripts themselves require no modification.
Scheduling
Both scripts run nightly via cron, with the Orion sync running first and the Hetzner sync starting only if the Orion sync exits successfully:
```
0 4 * * * cd /root/flickr-sync-username/orion && docker compose run --rm flickr-sync-orion >> /var/log/flickr-sync-username-orion.log 2>&1 && cd /root/flickr-sync-username/hetzner && docker compose run --rm flickr-sync-hetzner >> /var/log/flickr-sync-username-hetzner.log 2>&1
```
The && operator is doing quiet but important work here: if the Orion sync fails catastrophically, the Hetzner sync doesn't start. This avoids a situation where the Hetzner script cheerfully deletes files from the remote storage because the Orion script incorrectly reported them as deleted from Flickr.
Log rotation
```
/var/log/flickr-sync-username-orion.log
/var/log/flickr-sync-username-hetzner.log {
    daily
    rotate 21
    compress
    missingok
    notifempty
    copytruncate
}
```
Three weeks of logs. Enough to diagnose problems. Not enough to fill a disk.
Design decisions and trade-offs
Why not rclone?
Rclone was considered as an alternative for the Hetzner sync. It's a well-maintained tool with good SMB support and excellent built-in encryption via rclone crypt. The problem: rclone has no Flickr backend. It cannot read from Flickr directly. So you'd still need Python to download from Flickr, and rclone would only handle the upload, adding a dependency without simplifying the architecture.
For the encryption discussion: rclone crypt is genuinely excellent and was seriously considered. It encrypts file contents and file names, uses AES-256-CTR with per-file nonces, and recovery is straightforward (rclone copy remote: /local/path with the right passwords). Ultimately the decision was made to skip encryption, reasoning that the threat model doesn't justify the recovery complexity.
If you want encryption, use rclone crypt. It's the right tool for it. Configure it as a remote on top of your Hetzner share, install rclone natively on the VPS (not in Docker), and update the Hetzner script to call rclone copyto instead of mounting the share directly.
Why not Google Photos?
Google Photos has OAuth 2.0 (easier than Flickr's OAuth 1.0a) and a mobile app. It's also free up to 15GB. What it doesn't have is an API that returns original files. Since 2019, the Google Photos API returns compressed versions only, no EXIF, no original filenames, no originals. For a backup tool, this is a dealbreaker. You would be backing up Google's version of your photos, not yours.
Why SQLite over a simple file list?
A flat file of downloaded filenames would work for the simple case. But the state database tracks per-destination success independently. A file can be on Orion but not yet on Hetzner, or the reverse. This matters when one destination is temporarily unavailable; the script retries only what's missing, not everything.
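Because each destination keeps its own state file, "on Orion but not yet on Hetzner" becomes a cross-database query. A self-contained sketch, with two in-memory databases standing in for the two state files and the schema trimmed to the relevant columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")          # plays the Orion state DB
conn.execute("ATTACH DATABASE ':memory:' AS hetzner")  # plays the Hetzner one
schema = """CREATE TABLE {}.photos (
    flickr_id TEXT, album_id TEXT, filename TEXT,
    samba_ok INTEGER DEFAULT 0, deleted INTEGER DEFAULT 0,
    PRIMARY KEY (flickr_id, album_id))"""
conn.executescript(schema.format("main") + ";" + schema.format("hetzner"))

conn.executemany("INSERT INTO main.photos VALUES (?,?,?,?,?)",
                 [("1", "a", "1.jpg", 1, 0), ("2", "a", "2.jpg", 1, 0)])
conn.execute("INSERT INTO hetzner.photos VALUES ('1','a','1.jpg',1,0)")

# Everything confirmed on Orion with no confirmed Hetzner counterpart.
behind = conn.execute("""
    SELECT o.filename FROM main.photos o
    LEFT JOIN hetzner.photos h
      ON h.flickr_id = o.flickr_id AND h.album_id = o.album_id AND h.samba_ok = 1
    WHERE o.samba_ok = 1 AND o.deleted = 0 AND h.flickr_id IS NULL
""").fetchall()
print([r[0] for r in behind])  # ['2.jpg']
```

The production scripts never need this query, precisely because each one retries its own missing files independently; it just illustrates what the separate databases make possible.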
Why not write directly to both destinations simultaneously?
Because that would make the scripts dependent on each other at runtime. If one Samba mount is offline, should the other wait? Independence is more valuable than synchronicity here.
The encryption question
The most interesting rabbit hole in this project. Client-side encryption was evaluated thoroughly:
Python AES-256-GCM: you write it yourself, you test it yourself, and if you get any of the nonce handling wrong, you've created a system that looks secure but isn't. The cryptography library is excellent, but implementing encryption correctly requires careful thought about nonce reuse, authentication tag verification, and key derivation; all of which are easy to get subtly wrong.
rclone crypt: battle-tested, audited, used by hundreds of thousands of people. Recovery requires only rclone and two passwords. File names are also encrypted, so an attacker who obtains your Hetzner share can't even tell you have a folder called "Vacation 2019".
The final decision was no encryption, on the grounds that the threat model doesn't justify the recovery complexity. Hetzner is a European provider subject to GDPR, no more exposed than iCloud or Google Photos, two services where billions of people store photos without a second thought. But if your threat model differs, because you work in a sensitive field or simply have a healthy distrust of anyone who might compel a provider to hand over data, go with rclone crypt. The recovery story is solid.
The complete scripts
get_token.py — One-time OAuth token generation
Run this once on any machine with Python and internet access to obtain your Flickr OAuth tokens.
```python
#!/usr/bin/env python3
"""
One-time Flickr OAuth token generator.
Run this script once to obtain your access token and secret.
Store the results in your .env file.
"""
import hashlib
import hmac
import time
import urllib.parse
import base64

import requests

API_KEY = input("Enter your Flickr API Key: ").strip()
API_SECRET = input("Enter your Flickr API Secret: ").strip()

def oauth_encode(s):
    return urllib.parse.quote(str(s), safe="")

def sign(secret, token_secret, params, url):
    sorted_params = "&".join(
        f"{oauth_encode(k)}={oauth_encode(v)}"
        for k, v in sorted(params.items())
    )
    base = "&".join(["GET", oauth_encode(url), oauth_encode(sorted_params)])
    key = f"{oauth_encode(secret)}&{oauth_encode(token_secret)}"
    sig = hmac.new(key.encode(), base.encode(), hashlib.sha1)
    return base64.b64encode(sig.digest()).decode()

# Step 1: Request token
params = {
    "oauth_callback": "oob",
    "oauth_consumer_key": API_KEY,
    "oauth_nonce": hashlib.md5(str(time.time()).encode()).hexdigest(),
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": str(int(time.time())),
    "oauth_version": "1.0",
}
params["oauth_signature"] = sign(API_SECRET, "", params, "https://www.flickr.com/services/oauth/request_token")
r = requests.get("https://www.flickr.com/services/oauth/request_token", params=params)
parsed = dict(urllib.parse.parse_qsl(r.text))
request_token = parsed["oauth_token"]
request_token_secret = parsed["oauth_token_secret"]

print("\nOpen this URL in your browser:")
print(f"https://www.flickr.com/services/oauth/authorize?oauth_token={request_token}&perms=read")
verifier = input("\nEnter the verification code: ").strip()

# Step 2: Access token
params = {
    "oauth_consumer_key": API_KEY,
    "oauth_nonce": hashlib.md5(str(time.time()).encode()).hexdigest(),
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": str(int(time.time())),
    "oauth_token": request_token,
    "oauth_verifier": verifier,
    "oauth_version": "1.0",
}
params["oauth_signature"] = sign(API_SECRET, request_token_secret, params, "https://www.flickr.com/services/oauth/access_token")
r = requests.get("https://www.flickr.com/services/oauth/access_token", params=params)
parsed = dict(urllib.parse.parse_qsl(r.text))

print("\nAdd these to your .env file:")
print(f"FLICKR_ACCESS_TOKEN={parsed['oauth_token']}")
print(f"FLICKR_ACCESS_SECRET={parsed['oauth_token_secret']}")
```
orion/sync.py — Flickr to Orion (Synology NAS)
#!/usr/bin/env python3
"""
Flickr -> Samba sync script
Runs nightly via cron. Synchronises Flickr albums to a Samba share.
"""
import hashlib
import hmac
import os
import signal
import sqlite3
import sys
import time
import urllib.parse
import base64
import re
import shutil
import subprocess
from datetime import datetime
from pathlib import Path
import requests
# ── Configuration ─────────────────────────────────────────────────────────────
FLICKR_API_KEY = os.environ["FLICKR_API_KEY"]
FLICKR_API_SECRET = os.environ["FLICKR_API_SECRET"]
FLICKR_ACCESS_TOKEN = os.environ["FLICKR_ACCESS_TOKEN"]
FLICKR_ACCESS_SECRET = os.environ["FLICKR_ACCESS_SECRET"]
SMB_HOST = os.environ["SMB_HOST"]
SMB_SHARE = os.environ["SMB_SHARE"]
SMB_USER = os.environ["SMB_USER"]
SMB_PASSWORD = os.environ["SMB_PASSWORD"]
SMB_SUBFOLDER = os.environ["SMB_SUBFOLDER"]
SMB_EXTRA_OPTS = os.environ.get("SMB_EXTRA_OPTS", "")
MIN_FREE_SPACE_GB = int(os.environ.get("MIN_FREE_SPACE_GB", "10"))
MAX_RUNTIME_HOURS = int(os.environ.get("MAX_RUNTIME_HOURS", "10"))
# ── Paths ─────────────────────────────────────────────────────────────────────
STATE_DIR = Path(os.environ.get("STATE_DIR", "/state"))
DB_PATH = STATE_DIR / "flickr_samba.db"
LOCK_FILE = STATE_DIR / "flickr_samba.lock"
MOUNT_POINT = Path("/photos")
# ── Logging ───────────────────────────────────────────────────────────────────
def log(message: str) -> None:
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] {message}", flush=True)

def log_error(message: str) -> None:
    log(f"ERROR: {message}")

def log_section(title: str) -> None:
    log(f"{'─' * 60}")
    log(f" {title}")
    log(f"{'─' * 60}")
# ── Lock ──────────────────────────────────────────────────────────────────────
START_TIME = time.time()

def acquire_lock() -> None:
    if LOCK_FILE.exists():
        try:
            lock_time = float(LOCK_FILE.read_text().strip())
            age_hours = (time.time() - lock_time) / 3600
            if age_hours > MAX_RUNTIME_HOURS:
                log(f"Stale lock found ({age_hours:.1f}h old). Cleaning up.")
                cleanup_stale_lock()
            else:
                log_error(f"Script already running (lock {age_hours:.1f}h old). Exiting.")
                sys.exit(1)
        except (ValueError, OSError):
            LOCK_FILE.unlink(missing_ok=True)
    LOCK_FILE.write_text(str(time.time()))
    log("Lock acquired.")

def release_lock() -> None:
    LOCK_FILE.unlink(missing_ok=True)
    log("Lock released.")

def cleanup_stale_lock() -> None:
    if MOUNT_POINT.exists():
        for tmp_file in MOUNT_POINT.rglob("*.tmp"):
            try:
                tmp_file.unlink()
            except OSError:
                pass
    LOCK_FILE.unlink(missing_ok=True)
# ── Runtime ───────────────────────────────────────────────────────────────────
def check_runtime() -> None:
    if (time.time() - START_TIME) / 3600 > MAX_RUNTIME_HOURS:
        log_error(f"Maximum runtime of {MAX_RUNTIME_HOURS}h exceeded. Stopping.")
        cleanup_and_exit(1)

def handle_signal(signum, frame) -> None:
    log(f"Signal {signum} received. Shutting down cleanly.")
    cleanup_and_exit(0)

def cleanup_and_exit(exit_code: int) -> None:
    unmount_share()
    release_lock()
    log(f"Script exited with code {exit_code}.")
    sys.exit(exit_code)

signal.signal(signal.SIGTERM, handle_signal)
signal.signal(signal.SIGINT, handle_signal)
# ── Samba ─────────────────────────────────────────────────────────────────────
SMB_PATH = f"//{SMB_HOST}/{SMB_SHARE}" + (f"/{SMB_SUBFOLDER}" if SMB_SUBFOLDER else "")
MAX_MOUNT_RETRIES = 3
MOUNT_RETRY_WAIT = 30
def mount_share() -> None:
    log_section("Mounting Samba share")
    unmount_share()
    MOUNT_POINT.mkdir(parents=True, exist_ok=True)
    for attempt in range(1, MAX_MOUNT_RETRIES + 1):
        log(f"Mount attempt {attempt}/{MAX_MOUNT_RETRIES}: {SMB_PATH} -> {MOUNT_POINT}")
        result = subprocess.run(
            [
                "mount", "-t", "cifs",
                SMB_PATH,
                str(MOUNT_POINT),
                "-o", f"username={SMB_USER},password={SMB_PASSWORD},uid=0,gid=0{(',' + SMB_EXTRA_OPTS) if SMB_EXTRA_OPTS else ''}",
            ],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            log("Samba share mounted successfully.")
            return
        log_error(f"Mount failed: {result.stderr.strip()}")
        if attempt < MAX_MOUNT_RETRIES:
            time.sleep(MOUNT_RETRY_WAIT)
    log_error(f"Samba share unreachable after {MAX_MOUNT_RETRIES} attempts.")
    release_lock()
    sys.exit(1)

def unmount_share() -> None:
    if MOUNT_POINT.exists():
        subprocess.run(["umount", str(MOUNT_POINT)], capture_output=True)

def check_free_space(silent: bool = False) -> None:
    free_gb = shutil.disk_usage(MOUNT_POINT).free / (1024 ** 3)
    if free_gb < MIN_FREE_SPACE_GB:
        log_error(f"Insufficient disk space: {free_gb:.1f}GB free, minimum {MIN_FREE_SPACE_GB}GB.")
        cleanup_and_exit(1)
    if not silent:
        log(f"Disk space OK: {free_gb:.1f}GB free.")
# ── Database ──────────────────────────────────────────────────────────────────
def get_db() -> sqlite3.Connection:
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS albums (
            flickr_id TEXT PRIMARY KEY,
            title TEXT NOT NULL,
            local_path TEXT NOT NULL,
            last_updated INTEGER NOT NULL,
            deleted INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS photos (
            flickr_id TEXT NOT NULL,
            album_id TEXT NOT NULL,
            filename TEXT NOT NULL,
            last_updated INTEGER NOT NULL,
            samba_ok INTEGER NOT NULL DEFAULT 0,
            deleted INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (flickr_id, album_id)
        );
    """)
    conn.commit()
    return conn

def reset_interrupted_downloads() -> None:
    log("Checking for interrupted downloads from previous run...")
    count = 0
    for tmp_file in MOUNT_POINT.rglob("*.tmp"):
        try:
            tmp_file.unlink()
            count += 1
        except OSError:
            pass
    if count:
        log(f"Cleaned up {count} interrupted download(s).")
    else:
        log("No interrupted downloads found.")
# ── Flickr API ────────────────────────────────────────────────────────────────
FLICKR_API_URL = "https://www.flickr.com/services/rest/"
MAX_API_RETRIES = 5
RATE_LIMIT_WAIT = 60 # used only for HTTP 429 responses
API_ERROR_WAIT = 5 # base wait for transient API errors; doubles each retry
def oauth_encode(s: str) -> str:
    return urllib.parse.quote(str(s), safe="")

def make_oauth_signature(method: str, url: str, params: dict) -> str:
    sorted_params = "&".join(
        f"{oauth_encode(k)}={oauth_encode(v)}"
        for k, v in sorted(params.items())
    )
    base_string = "&".join([method.upper(), oauth_encode(url), oauth_encode(sorted_params)])
    signing_key = f"{oauth_encode(FLICKR_API_SECRET)}&{oauth_encode(FLICKR_ACCESS_SECRET)}"
    hashed = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1)
    return base64.b64encode(hashed.digest()).decode()

def base_oauth_params() -> dict:
    return {
        "oauth_consumer_key": FLICKR_API_KEY,
        "oauth_nonce": hashlib.md5(str(time.time()).encode()).hexdigest(),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_token": FLICKR_ACCESS_TOKEN,
        "oauth_version": "1.0",
    }
def flickr_call(method: str, extra_params: dict = {}) -> dict:
    rate_limit_wait = RATE_LIMIT_WAIT
    for attempt in range(1, MAX_API_RETRIES + 1):
        check_runtime()
        # Fresh OAuth params on every attempt (timestamp and nonce must not be reused)
        params = base_oauth_params()
        params.update({"method": method, "format": "json", "nojsoncallback": "1"})
        params.update(extra_params)
        params["oauth_signature"] = make_oauth_signature("GET", FLICKR_API_URL, params)
        try:
            r = requests.get(FLICKR_API_URL, params=params, timeout=30)
            if r.status_code == 429:
                log(f"Rate limit reached. Waiting {rate_limit_wait}s...")
                time.sleep(rate_limit_wait)
                rate_limit_wait *= 2
                continue
            data = r.json()
            if data.get("stat") == "fail":
                if data.get("code") in (98, 100):
                    log_error(f"Flickr OAuth error: {data.get('message')}. Check your tokens.")
                    cleanup_and_exit(1)
                if data.get("code") == 1:
                    return {}
                error_wait = API_ERROR_WAIT * (2 ** (attempt - 1))
                log_error(f"Flickr API error: {data.get('message')} (attempt {attempt}/{MAX_API_RETRIES})")
                time.sleep(error_wait)
                continue
            return data
        except requests.RequestException as e:
            error_wait = API_ERROR_WAIT * (2 ** (attempt - 1))
            log_error(f"Network error: {e} (attempt {attempt}/{MAX_API_RETRIES})")
            time.sleep(error_wait)
    log_error(f"Flickr API '{method}' failed after {MAX_API_RETRIES} attempts.")
    return {}
def sanitize_folder_name(name: str) -> str:
    s = re.sub(r'[<>:"/\\|?*\x00-\x1f]', '-', name)
    s = re.sub(r'-+', '-', s).strip(' -')
    return s or "Unnamed"
def get_all_albums(conn: sqlite3.Connection) -> list:
    log_section("Fetching albums from Flickr")
    data = flickr_call("flickr.photosets.getList", {"per_page": "500"})
    if not data:
        log_error("Could not fetch albums from Flickr.")
        return []
    flickr_albums = {
        a["id"]: {"flickr_id": a["id"], "title": a["title"]["_content"],
                  "last_updated": int(a.get("date_update", 0))}
        for a in data.get("photosets", {}).get("photoset", [])
    }
    log(f"{len(flickr_albums)} album(s) found on Flickr.")
    flickr_albums["__camera_roll__"] = {
        "flickr_id": "__camera_roll__", "title": "Camera Roll",
        "last_updated": int(time.time()),
    }
    known_albums = {
        r["flickr_id"]: r
        for r in conn.execute("SELECT * FROM albums WHERE deleted = 0").fetchall()
    }
    active_ids = []
    for flickr_id, album in flickr_albums.items():
        raw_title = album["title"]
        local_path = str(MOUNT_POINT / sanitize_folder_name(raw_title))
        used_paths = {
            r["local_path"] for r in conn.execute(
                "SELECT local_path FROM albums WHERE flickr_id != ? AND deleted = 0", (flickr_id,)
            ).fetchall()
        }
        suffix = 2
        base = local_path
        while local_path in used_paths:
            local_path = f"{base} ({suffix})"
            suffix += 1
        if flickr_id not in known_albums:
            Path(local_path).mkdir(parents=True, exist_ok=True)
            conn.execute(
                "INSERT INTO albums (flickr_id, title, local_path, last_updated) VALUES (?,?,?,?)",
                (flickr_id, raw_title, local_path, album["last_updated"])
            )
            conn.commit()
            log(f" New album: '{raw_title}' -> {local_path}")
        else:
            db = known_albums[flickr_id]
            if db["title"] != raw_title:
                if Path(db["local_path"]).exists():
                    Path(db["local_path"]).rename(local_path)
                conn.execute(
                    "UPDATE albums SET title=?, local_path=?, last_updated=? WHERE flickr_id=?",
                    (raw_title, local_path, album["last_updated"], flickr_id)
                )
                conn.commit()
                log(f" Album renamed: '{db['title']}' -> '{raw_title}'")
        active_ids.append(flickr_id)
    for flickr_id in known_albums:
        if flickr_id not in flickr_albums:
            conn.execute("UPDATE albums SET deleted=1 WHERE flickr_id=?", (flickr_id,))
            conn.commit()
            log(f" Album deleted on Flickr: '{known_albums[flickr_id]['title']}'")
    log(f"Albums processed. {len(active_ids)} active.")
    return active_ids
def get_photos_for_album(album_id: str) -> list:
photos, page = [], 1
while True:
check_runtime()
if album_id == "__camera_roll__":
data = flickr_call("flickr.photos.getNotInSet",
{"extras": "last_update,originalformat,media", "per_page": "500", "page": str(page)})
items = data.get("photos", {})
else:
data = flickr_call("flickr.photosets.getPhotos",
{"photoset_id": album_id, "extras": "last_update,originalformat,media",
"per_page": "500", "page": str(page)})
items = data.get("photoset", {})
if not items: break
photos.extend(items.get("photo", []))
if page >= int(items.get("pages", 1)): break
page += 1
return photos
def get_url(photo_id: str, media: str, orig_ext: str) -> tuple:
data = flickr_call("flickr.photos.getSizes", {"photo_id": photo_id})
sizes = data.get("sizes", {}).get("size", []) if data else []
if media == "video":
for label in ["Video Original", "HD MP4", "Site MP4", "Mobile MP4"]:
for s in sizes:
if s["label"] == label:
url = s.get("source") or s.get("url", "")
ext = "mov" if label == "Video Original" else "mp4"
if url:
log(f" Video quality: {label}")
return url, ext
return "", orig_ext
for s in sizes:
if s["label"] == "Original":
return s.get("source", ""), orig_ext
return (sizes[-1].get("source", ""), orig_ext) if sizes else ("", orig_ext)
# ── Download ──────────────────────────────────────────────────────────────────
MAX_DOWNLOAD_RETRIES = 5
DOWNLOAD_RETRY_WAIT = 30
def download_file(url: str, destination: Path) -> bool:
tmp = destination.with_suffix(destination.suffix + ".tmp")
wait = DOWNLOAD_RETRY_WAIT
for attempt in range(1, MAX_DOWNLOAD_RETRIES + 1):
check_runtime()
try:
destination.parent.mkdir(parents=True, exist_ok=True)
with requests.get(url, stream=True, timeout=60) as r:
r.raise_for_status()
with open(tmp, "wb") as f:
for chunk in r.iter_content(chunk_size=1024 * 1024):
f.write(chunk)
tmp.rename(destination)
return True
except Exception as e:
log_error(f"Download failed: {e} (attempt {attempt}/{MAX_DOWNLOAD_RETRIES})")
tmp.unlink(missing_ok=True)
if attempt < MAX_DOWNLOAD_RETRIES:
time.sleep(wait); wait *= 2
return False
# ── Main sync loop ────────────────────────────────────────────────────────────
def sync_album(album_id: str, conn: sqlite3.Connection) -> dict:
album = conn.execute("SELECT * FROM albums WHERE flickr_id=?", (album_id,)).fetchone()
if not album: return {}
album_path = Path(album["local_path"])
stats = {"new": 0, "updated": 0, "deleted": 0, "failed": 0}
log(f"Album: '{album['title']}'")
flickr_photos = get_photos_for_album(album_id)
if not flickr_photos:
log(" No photos or videos found."); return stats
flickr_photo_ids = {p["id"] for p in flickr_photos}
for photo in conn.execute("SELECT * FROM photos WHERE album_id=? AND deleted=0", (album_id,)).fetchall():
if photo["flickr_id"] not in flickr_photo_ids:
local = album_path / photo["filename"]
if local.exists(): local.unlink()
log(f" Deleted: {photo['filename']}")
conn.execute("UPDATE photos SET deleted=1 WHERE flickr_id=? AND album_id=?",
(photo["flickr_id"], album_id))
conn.commit()
stats["deleted"] += 1
for photo in flickr_photos:
check_runtime()
check_free_space(silent=True)
photo_id = photo["id"]
last_updated = int(photo.get("lastupdate", 0))
media = photo.get("media", "photo")
ext = photo.get("originalformat", "jpg")
known = conn.execute(
"SELECT * FROM photos WHERE flickr_id=? AND album_id=?", (photo_id, album_id)
).fetchone()
samba_ok = known["samba_ok"] if known else 0
is_new = known is None
changed = known and known["last_updated"] != last_updated
if samba_ok and known and not (album_path / known["filename"]).exists():
samba_ok = 0
log(f" Missing on Samba, re-downloading: {known['filename']}")
if samba_ok and not changed:
continue
url, ext = get_url(photo_id, media, ext)
filename = f"{photo_id}.{ext}"
if not url:
            log_error(f"  No URL for photo {photo_id}, skipping.")
stats["failed"] += 1; continue
        log(f"  {'New' if is_new else 'Updated'} ({media}): {filename}")
if download_file(url, album_path / filename):
samba_ok = 1
stats["new" if is_new else "updated"] += 1
else:
            log_error(f"  Download failed for {filename}, will retry next run.")
stats["failed"] += 1; samba_ok = 0
if is_new:
conn.execute(
"INSERT INTO photos (flickr_id, album_id, filename, last_updated, samba_ok) VALUES (?,?,?,?,?)",
(photo_id, album_id, filename, last_updated, samba_ok)
)
else:
conn.execute(
"UPDATE photos SET filename=?, last_updated=?, samba_ok=? WHERE flickr_id=? AND album_id=?",
(filename, last_updated, samba_ok, photo_id, album_id)
)
conn.commit()
return stats
def main() -> None:
log_section("Flickr -> Samba sync started")
log(f"Timezone: {os.environ.get('TZ', 'not set')}")
acquire_lock()
mount_share()
check_free_space()
conn = get_db()
reset_interrupted_downloads()
active_album_ids = get_all_albums(conn)
if not active_album_ids:
log_error("No albums found. Check your Flickr tokens in .env.")
cleanup_and_exit(1)
totals = {"new": 0, "updated": 0, "deleted": 0, "failed": 0}
for i, album_id in enumerate(active_album_ids, 1):
log(f"\n[{i}/{len(active_album_ids)}]")
for k, v in sync_album(album_id, conn).items():
totals[k] += v
elapsed = (time.time() - START_TIME) / 60
log_section("Summary")
log(f"Downloaded new : {totals['new']}")
log(f"Updated : {totals['updated']}")
log(f"Deleted : {totals['deleted']}")
log(f"Failed : {totals['failed']}")
log(f"Duration : {elapsed:.1f} minutes")
if totals["failed"]:
log("Failed items will be retried automatically on the next run.")
log("Flickr -> Samba sync complete.")
conn.close()
cleanup_and_exit(0)
if __name__ == "__main__":
main()
hetzner/sync.py — Flickr to Hetzner Storage Box
The Hetzner script is identical to the Orion script; only the environment differs: SMB_EXTRA_OPTS=vers=3 in the .env and an empty SMB_SUBFOLDER. For completeness:
#!/usr/bin/env python3
"""
Flickr -> Samba sync script
Runs nightly via cron. Synchronises Flickr albums to a Samba share.
"""
import hashlib
import hmac
import os
import signal
import sqlite3
import sys
import time
import urllib.parse
import base64
import re
import shutil
import subprocess
from datetime import datetime
from pathlib import Path
import requests
# ── Configuration ─────────────────────────────────────────────────────────────
FLICKR_API_KEY = os.environ["FLICKR_API_KEY"]
FLICKR_API_SECRET = os.environ["FLICKR_API_SECRET"]
FLICKR_ACCESS_TOKEN = os.environ["FLICKR_ACCESS_TOKEN"]
FLICKR_ACCESS_SECRET = os.environ["FLICKR_ACCESS_SECRET"]
SMB_HOST = os.environ["SMB_HOST"]
SMB_SHARE = os.environ["SMB_SHARE"]
SMB_USER = os.environ["SMB_USER"]
SMB_PASSWORD = os.environ["SMB_PASSWORD"]
SMB_SUBFOLDER = os.environ["SMB_SUBFOLDER"]
SMB_EXTRA_OPTS = os.environ.get("SMB_EXTRA_OPTS", "")
MIN_FREE_SPACE_GB = int(os.environ.get("MIN_FREE_SPACE_GB", "10"))
MAX_RUNTIME_HOURS = int(os.environ.get("MAX_RUNTIME_HOURS", "10"))
# ── Paths ─────────────────────────────────────────────────────────────────────
STATE_DIR = Path(os.environ.get("STATE_DIR", "/state"))
DB_PATH = STATE_DIR / "flickr_samba.db"
LOCK_FILE = STATE_DIR / "flickr_samba.lock"
MOUNT_POINT = Path("/photos")
# ── Logging ───────────────────────────────────────────────────────────────────
def log(message: str) -> None:
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"[{timestamp}] {message}", flush=True)
def log_error(message: str) -> None:
log(f"ERROR: {message}")
def log_section(title: str) -> None:
log(f"{'─' * 60}")
log(f" {title}")
log(f"{'─' * 60}")
# ── Lock ──────────────────────────────────────────────────────────────────────
START_TIME = time.time()
def acquire_lock() -> None:
if LOCK_FILE.exists():
try:
lock_time = float(LOCK_FILE.read_text().strip())
age_hours = (time.time() - lock_time) / 3600
if age_hours > MAX_RUNTIME_HOURS:
log(f"Stale lock found ({age_hours:.1f}h old). Cleaning up.")
cleanup_stale_lock()
else:
log_error(f"Script already running (lock {age_hours:.1f}h old). Exiting.")
sys.exit(1)
except (ValueError, OSError):
LOCK_FILE.unlink(missing_ok=True)
LOCK_FILE.write_text(str(time.time()))
log("Lock acquired.")
def release_lock() -> None:
LOCK_FILE.unlink(missing_ok=True)
log("Lock released.")
def cleanup_stale_lock() -> None:
if MOUNT_POINT.exists():
for tmp_file in MOUNT_POINT.rglob("*.tmp"):
try:
tmp_file.unlink()
except OSError:
pass
LOCK_FILE.unlink(missing_ok=True)
# ── Runtime ───────────────────────────────────────────────────────────────────
def check_runtime() -> None:
if (time.time() - START_TIME) / 3600 > MAX_RUNTIME_HOURS:
log_error(f"Maximum runtime of {MAX_RUNTIME_HOURS}h exceeded. Stopping.")
cleanup_and_exit(1)
def handle_signal(signum, frame) -> None:
log(f"Signal {signum} received. Shutting down cleanly.")
cleanup_and_exit(0)
def cleanup_and_exit(exit_code: int) -> None:
unmount_share()
release_lock()
log(f"Script exited with code {exit_code}.")
sys.exit(exit_code)
signal.signal(signal.SIGTERM, handle_signal)
signal.signal(signal.SIGINT, handle_signal)
# ── Samba ─────────────────────────────────────────────────────────────────────
SMB_PATH = f"//{SMB_HOST}/{SMB_SHARE}" + (f"/{SMB_SUBFOLDER}" if SMB_SUBFOLDER else "")
MAX_MOUNT_RETRIES = 3
MOUNT_RETRY_WAIT = 30
def mount_share() -> None:
log_section("Mounting Samba share")
unmount_share()
MOUNT_POINT.mkdir(parents=True, exist_ok=True)
for attempt in range(1, MAX_MOUNT_RETRIES + 1):
log(f"Mount attempt {attempt}/{MAX_MOUNT_RETRIES}: {SMB_PATH} -> {MOUNT_POINT}")
result = subprocess.run(
[
"mount", "-t", "cifs",
SMB_PATH,
str(MOUNT_POINT),
"-o", f"username={SMB_USER},password={SMB_PASSWORD},uid=0,gid=0{(',' + SMB_EXTRA_OPTS) if SMB_EXTRA_OPTS else ''}",
],
capture_output=True, text=True,
)
if result.returncode == 0:
log("Samba share mounted successfully.")
return
log_error(f"Mount failed: {result.stderr.strip()}")
if attempt < MAX_MOUNT_RETRIES:
time.sleep(MOUNT_RETRY_WAIT)
log_error(f"Samba share unreachable after {MAX_MOUNT_RETRIES} attempts.")
release_lock()
sys.exit(1)
def unmount_share() -> None:
if MOUNT_POINT.exists():
subprocess.run(["umount", str(MOUNT_POINT)], capture_output=True)
def check_free_space(silent: bool = False) -> None:
free_gb = shutil.disk_usage(MOUNT_POINT).free / (1024 ** 3)
if free_gb < MIN_FREE_SPACE_GB:
log_error(f"Insufficient disk space: {free_gb:.1f}GB free, minimum {MIN_FREE_SPACE_GB}GB.")
cleanup_and_exit(1)
if not silent:
log(f"Disk space OK: {free_gb:.1f}GB free.")
# ── Database ──────────────────────────────────────────────────────────────────
def get_db() -> sqlite3.Connection:
STATE_DIR.mkdir(parents=True, exist_ok=True)
conn = sqlite3.connect(DB_PATH)
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE IF NOT EXISTS albums (
flickr_id TEXT PRIMARY KEY,
title TEXT NOT NULL,
local_path TEXT NOT NULL,
last_updated INTEGER NOT NULL,
deleted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS photos (
flickr_id TEXT NOT NULL,
album_id TEXT NOT NULL,
filename TEXT NOT NULL,
last_updated INTEGER NOT NULL,
samba_ok INTEGER NOT NULL DEFAULT 0,
deleted INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (flickr_id, album_id)
);
""")
conn.commit()
return conn
def reset_interrupted_downloads() -> None:
log("Checking for interrupted downloads from previous run...")
count = 0
for tmp_file in MOUNT_POINT.rglob("*.tmp"):
try:
tmp_file.unlink()
count += 1
except OSError:
pass
if count:
log(f"Cleaned up {count} interrupted download(s).")
else:
log("No interrupted downloads found.")
# ── Flickr API ────────────────────────────────────────────────────────────────
FLICKR_API_URL = "https://www.flickr.com/services/rest/"
MAX_API_RETRIES = 5
RATE_LIMIT_WAIT = 60 # used only for HTTP 429 responses
API_ERROR_WAIT = 5 # base wait for transient API errors; doubles each retry
def oauth_encode(s: str) -> str:
return urllib.parse.quote(str(s), safe="")
def make_oauth_signature(method: str, url: str, params: dict) -> str:
sorted_params = "&".join(
f"{oauth_encode(k)}={oauth_encode(v)}"
for k, v in sorted(params.items())
)
base_string = "&".join([method.upper(), oauth_encode(url), oauth_encode(sorted_params)])
signing_key = f"{oauth_encode(FLICKR_API_SECRET)}&{oauth_encode(FLICKR_ACCESS_SECRET)}"
hashed = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1)
return base64.b64encode(hashed.digest()).decode()
def base_oauth_params() -> dict:
return {
"oauth_consumer_key": FLICKR_API_KEY,
"oauth_nonce": hashlib.md5(str(time.time()).encode()).hexdigest(),
"oauth_signature_method": "HMAC-SHA1",
"oauth_timestamp": str(int(time.time())),
"oauth_token": FLICKR_ACCESS_TOKEN,
"oauth_version": "1.0",
}
def flickr_call(method: str, extra_params: dict = {}) -> dict:
rate_limit_wait = RATE_LIMIT_WAIT
for attempt in range(1, MAX_API_RETRIES + 1):
check_runtime()
# Fresh OAuth params on every attempt (timestamp and nonce must not be reused)
params = base_oauth_params()
params.update({"method": method, "format": "json", "nojsoncallback": "1"})
params.update(extra_params)
params["oauth_signature"] = make_oauth_signature("GET", FLICKR_API_URL, params)
try:
r = requests.get(FLICKR_API_URL, params=params, timeout=30)
if r.status_code == 429:
log(f"Rate limit reached. Waiting {rate_limit_wait}s...")
time.sleep(rate_limit_wait); rate_limit_wait *= 2; continue
data = r.json()
if data.get("stat") == "fail":
if data.get("code") in (98, 100):
log_error(f"Flickr OAuth error: {data.get('message')}. Check your tokens.")
cleanup_and_exit(1)
if data.get("code") == 1:
return {}
error_wait = API_ERROR_WAIT * (2 ** (attempt - 1))
log_error(f"Flickr API error: {data.get('message')} (attempt {attempt}/{MAX_API_RETRIES})")
time.sleep(error_wait); continue
return data
except requests.RequestException as e:
error_wait = API_ERROR_WAIT * (2 ** (attempt - 1))
log_error(f"Network error: {e} (attempt {attempt}/{MAX_API_RETRIES})")
time.sleep(error_wait)
log_error(f"Flickr API '{method}' failed after {MAX_API_RETRIES} attempts.")
return {}
def sanitize_folder_name(name: str) -> str:
s = re.sub(r'[<>:"/\\|?*\x00-\x1f]', '-', name)
s = re.sub(r'-+', '-', s).strip(' -')
return s or "Unnamed"
def get_all_albums(conn: sqlite3.Connection) -> list:
log_section("Fetching albums from Flickr")
data = flickr_call("flickr.photosets.getList", {"per_page": "500"})
if not data:
log_error("Could not fetch albums from Flickr.")
return []
flickr_albums = {
a["id"]: {"flickr_id": a["id"], "title": a["title"]["_content"],
"last_updated": int(a.get("date_update", 0))}
for a in data.get("photosets", {}).get("photoset", [])
}
log(f"{len(flickr_albums)} album(s) found on Flickr.")
flickr_albums["__camera_roll__"] = {
"flickr_id": "__camera_roll__", "title": "Camera Roll",
"last_updated": int(time.time()),
}
known_albums = {
r["flickr_id"]: r
for r in conn.execute("SELECT * FROM albums WHERE deleted = 0").fetchall()
}
active_ids = []
for flickr_id, album in flickr_albums.items():
raw_title = album["title"]
local_path = str(MOUNT_POINT / sanitize_folder_name(raw_title))
used_paths = {
r["local_path"] for r in conn.execute(
"SELECT local_path FROM albums WHERE flickr_id != ? AND deleted = 0", (flickr_id,)
).fetchall()
}
suffix = 2
base = local_path
while local_path in used_paths:
local_path = f"{base} ({suffix})"; suffix += 1
if flickr_id not in known_albums:
Path(local_path).mkdir(parents=True, exist_ok=True)
conn.execute(
"INSERT INTO albums (flickr_id, title, local_path, last_updated) VALUES (?,?,?,?)",
(flickr_id, raw_title, local_path, album["last_updated"])
)
conn.commit()
log(f" New album: '{raw_title}' -> {local_path}")
else:
db = known_albums[flickr_id]
if db["title"] != raw_title:
if Path(db["local_path"]).exists():
Path(db["local_path"]).rename(local_path)
conn.execute(
"UPDATE albums SET title=?, local_path=?, last_updated=? WHERE flickr_id=?",
(raw_title, local_path, album["last_updated"], flickr_id)
)
conn.commit()
log(f" Album renamed: '{db['title']}' -> '{raw_title}'")
active_ids.append(flickr_id)
for flickr_id in known_albums:
if flickr_id not in flickr_albums:
conn.execute("UPDATE albums SET deleted=1 WHERE flickr_id=?", (flickr_id,))
conn.commit()
log(f" Album deleted on Flickr: '{known_albums[flickr_id]['title']}'")
log(f"Albums processed. {len(active_ids)} active.")
return active_ids
def get_photos_for_album(album_id: str) -> list:
photos, page = [], 1
while True:
check_runtime()
if album_id == "__camera_roll__":
data = flickr_call("flickr.photos.getNotInSet",
{"extras": "last_update,originalformat,media", "per_page": "500", "page": str(page)})
items = data.get("photos", {})
else:
data = flickr_call("flickr.photosets.getPhotos",
{"photoset_id": album_id, "extras": "last_update,originalformat,media",
"per_page": "500", "page": str(page)})
items = data.get("photoset", {})
if not items: break
photos.extend(items.get("photo", []))
if page >= int(items.get("pages", 1)): break
page += 1
return photos
def get_url(photo_id: str, media: str, orig_ext: str) -> tuple:
data = flickr_call("flickr.photos.getSizes", {"photo_id": photo_id})
sizes = data.get("sizes", {}).get("size", []) if data else []
if media == "video":
for label in ["Video Original", "HD MP4", "Site MP4", "Mobile MP4"]:
for s in sizes:
if s["label"] == label:
url = s.get("source") or s.get("url", "")
ext = "mov" if label == "Video Original" else "mp4"
if url:
log(f" Video quality: {label}")
return url, ext
return "", orig_ext
for s in sizes:
if s["label"] == "Original":
return s.get("source", ""), orig_ext
return (sizes[-1].get("source", ""), orig_ext) if sizes else ("", orig_ext)
# ── Download ──────────────────────────────────────────────────────────────────
MAX_DOWNLOAD_RETRIES = 5
DOWNLOAD_RETRY_WAIT = 30
def download_file(url: str, destination: Path) -> bool:
tmp = destination.with_suffix(destination.suffix + ".tmp")
wait = DOWNLOAD_RETRY_WAIT
for attempt in range(1, MAX_DOWNLOAD_RETRIES + 1):
check_runtime()
try:
destination.parent.mkdir(parents=True, exist_ok=True)
with requests.get(url, stream=True, timeout=60) as r:
r.raise_for_status()
with open(tmp, "wb") as f:
for chunk in r.iter_content(chunk_size=1024 * 1024):
f.write(chunk)
tmp.rename(destination)
return True
except Exception as e:
log_error(f"Download failed: {e} (attempt {attempt}/{MAX_DOWNLOAD_RETRIES})")
tmp.unlink(missing_ok=True)
if attempt < MAX_DOWNLOAD_RETRIES:
time.sleep(wait); wait *= 2
return False
# ── Main sync loop ────────────────────────────────────────────────────────────
def sync_album(album_id: str, conn: sqlite3.Connection) -> dict:
album = conn.execute("SELECT * FROM albums WHERE flickr_id=?", (album_id,)).fetchone()
if not album: return {}
album_path = Path(album["local_path"])
stats = {"new": 0, "updated": 0, "deleted": 0, "failed": 0}
log(f"Album: '{album['title']}'")
flickr_photos = get_photos_for_album(album_id)
if not flickr_photos:
log(" No photos or videos found."); return stats
flickr_photo_ids = {p["id"] for p in flickr_photos}
for photo in conn.execute("SELECT * FROM photos WHERE album_id=? AND deleted=0", (album_id,)).fetchall():
if photo["flickr_id"] not in flickr_photo_ids:
local = album_path / photo["filename"]
if local.exists(): local.unlink()
log(f" Deleted: {photo['filename']}")
conn.execute("UPDATE photos SET deleted=1 WHERE flickr_id=? AND album_id=?",
(photo["flickr_id"], album_id))
conn.commit()
stats["deleted"] += 1
for photo in flickr_photos:
check_runtime()
check_free_space(silent=True)
photo_id = photo["id"]
last_updated = int(photo.get("lastupdate", 0))
media = photo.get("media", "photo")
ext = photo.get("originalformat", "jpg")
known = conn.execute(
"SELECT * FROM photos WHERE flickr_id=? AND album_id=?", (photo_id, album_id)
).fetchone()
samba_ok = known["samba_ok"] if known else 0
is_new = known is None
changed = known and known["last_updated"] != last_updated
if samba_ok and known and not (album_path / known["filename"]).exists():
samba_ok = 0
log(f" Missing on Samba, re-downloading: {known['filename']}")
if samba_ok and not changed:
continue
url, ext = get_url(photo_id, media, ext)
filename = f"{photo_id}.{ext}"
if not url:
            log_error(f"  No URL for photo {photo_id}, skipping.")
stats["failed"] += 1; continue
        log(f"  {'New' if is_new else 'Updated'} ({media}): {filename}")
if download_file(url, album_path / filename):
samba_ok = 1
stats["new" if is_new else "updated"] += 1
else:
            log_error(f"  Download failed for {filename}, will retry next run.")
stats["failed"] += 1; samba_ok = 0
if is_new:
conn.execute(
"INSERT INTO photos (flickr_id, album_id, filename, last_updated, samba_ok) VALUES (?,?,?,?,?)",
(photo_id, album_id, filename, last_updated, samba_ok)
)
else:
conn.execute(
"UPDATE photos SET filename=?, last_updated=?, samba_ok=? WHERE flickr_id=? AND album_id=?",
(filename, last_updated, samba_ok, photo_id, album_id)
)
conn.commit()
return stats
def main() -> None:
log_section("Flickr -> Samba sync started")
log(f"Timezone: {os.environ.get('TZ', 'not set')}")
acquire_lock()
mount_share()
check_free_space()
conn = get_db()
reset_interrupted_downloads()
active_album_ids = get_all_albums(conn)
if not active_album_ids:
log_error("No albums found. Check your Flickr tokens in .env.")
cleanup_and_exit(1)
totals = {"new": 0, "updated": 0, "deleted": 0, "failed": 0}
for i, album_id in enumerate(active_album_ids, 1):
log(f"\n[{i}/{len(active_album_ids)}]")
for k, v in sync_album(album_id, conn).items():
totals[k] += v
elapsed = (time.time() - START_TIME) / 60
log_section("Summary")
log(f"Downloaded new : {totals['new']}")
log(f"Updated : {totals['updated']}")
log(f"Deleted : {totals['deleted']}")
log(f"Failed : {totals['failed']}")
log(f"Duration : {elapsed:.1f} minutes")
if totals["failed"]:
log("Failed items will be retried automatically on the next run.")
log("Flickr -> Samba sync complete.")
conn.close()
cleanup_and_exit(0)
if __name__ == "__main__":
main()
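The .env differences mentioned above are small. A sketch of the Hetzner block, with the variable names the script reads via os.environ (the host and username here are invented placeholders; yours will differ):

```ini
# Hetzner Storage Box connection — placeholder host/user, real variable names
SMB_HOST=u123456.your-storagebox.de
SMB_SHARE=backup
SMB_USER=u123456
SMB_PASSWORD=change-me
SMB_SUBFOLDER=
SMB_EXTRA_OPTS=vers=3
```

The vers=3 option forces SMB3, which the Storage Box requires; the empty SMB_SUBFOLDER means the share root is used directly.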
Lessons learned
On simplicity vs. robustness: the first version of this script was about 50 lines. It worked fine for a single run. It did not work fine when the network dropped halfway through a 200MB video download, or when the Samba share wasn't mounted yet, or when it ran at 4am while a previous instance was still running. Every feature in the final scripts exists because something broke without it.
On trusting your own database: a state database that says "file is backed up" but never checks whether the file actually exists is not a backup tool; it's an optimistic file registry. The physical existence check is not optional if you want a real sync.
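The check itself is cheap. A minimal sketch of the idea as a standalone audit query (the function name and SQL are illustrative, not the script's exact code — the script does this inline per photo via the samba_ok reset in sync_album):

```python
import sqlite3
from pathlib import Path

def verify_backed_up(conn: sqlite3.Connection) -> list:
    """Return filenames the database claims are synced (samba_ok = 1)
    but that are missing on disk and should be re-downloaded."""
    missing = []
    for local_path, filename in conn.execute(
        "SELECT a.local_path, p.filename FROM photos p "
        "JOIN albums a ON a.flickr_id = p.album_id "
        "WHERE p.samba_ok = 1 AND p.deleted = 0"
    ):
        # Trust the database only as far as the filesystem agrees with it.
        if not (Path(local_path) / filename).exists():
            missing.append(filename)
    return missing
```

Anything this returns gets its samba_ok flag cleared, and the next run downloads it again.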
On video support: Flickr's API is inconsistent about videos. The originalformat field for a video record might say jpg. The media field is reliable; the format field is not. Always call getSizes and take the extension from the size you actually download, never from originalformat.
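The resulting rule condenses to a few lines (a sketch; the size dictionaries in the usage example are hand-made to mimic getSizes output):

```python
def pick_video_download(sizes: list) -> tuple:
    """Prefer the original video, then fall back through Flickr's MP4
    renditions. Returns (url, extension); the extension is derived from
    the size label, never from the record's originalformat field."""
    for label in ("Video Original", "HD MP4", "Site MP4", "Mobile MP4"):
        for s in sizes:
            if s.get("label") == label and s.get("source"):
                return s["source"], ("mov" if label == "Video Original" else "mp4")
    return "", ""  # nothing downloadable; caller counts this as a failure

# Usage: with only an HD rendition available, you get an .mp4 ...
sizes = [{"label": "Thumbnail", "source": "https://example.com/t.jpg"},
         {"label": "HD MP4", "source": "https://example.com/v.mp4"}]
url, ext = pick_video_download(sizes)   # -> ("https://example.com/v.mp4", "mp4")
```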
On OAuth 1.0a: it looks terrifying, but it's about 30 lines of code and then you never think about it again. The non-expiring tokens are worth it. Much like a good marriage, the hard part is getting the signature right; after that, things just run quietly in the background at 4am without anyone noticing.
On Docker for Samba mounts: mounting a CIFS share inside a Docker container requires the SYS_ADMIN capability and an apparmor:unconfined security profile. That looks alarming on a security checklist, but it is the correct solution for this use case: the container has no network exposure and runs as a scheduled job, so the risk is minimal and the isolation benefits are real.
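For reference, the relevant stanza looks roughly like this in docker-compose form (service name, image name, and paths are placeholders; only the cap_add and security_opt lines are the point):

```yaml
# Sketch of a compose service that can run mount -t cifs internally.
services:
  flickr-sync:
    image: flickr-sync:latest   # placeholder image name
    cap_add:
      - SYS_ADMIN               # required for the in-container CIFS mount
    security_opt:
      - apparmor:unconfined     # default AppArmor profile blocks mount()
    env_file: .env
    volumes:
      - ./state:/state          # SQLite DB and lock file live here
```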
What this won't save you from
This setup protects you against hardware failure, accidental deletion, provider outages, and the slow heat death of any single cloud service. It does not protect you against:
- Deleting something on Flickr deliberately and then changing your mind later. The Synology snapshot schedule keeps daily snapshots for 365 days, plus the last snapshot of each year for two further years. That is a surprisingly long window to reconsider your decisions; longer than most people's regret cycles.
- Your Flickr account being compromised. If someone logs into your Flickr and deletes everything, the sync scripts will faithfully propagate those deletions to Orion and Hetzner. However, the Synology snapshots mean your local copy is protected for up to a year regardless of what happens upstream. Still: enable two-factor authentication on your Flickr account. There is no good reason not to.
- The sun expanding to engulf the Earth. At that point, the photos are probably the least of your concerns.
Three copies. Two destinations. One cron job. Sleep well.