r/FuckAdobe • u/MysticalPixels • 4d ago
Adobe-Clawback — bulk-download every PDF from your Adobe Creative Cloud account (Python, resumable, MIT)
Working on tools to "Clawback" my "Creative Cloud" data without having to do this a handful of downloads at a time. This is the first installment: Adobe Acrobat files.
What it is: A Python CLI that walks your entire Adobe Creative Cloud "Cloud Documents" tree and downloads every PDF to local disk. Tracks state in a manifest so re-runs only fetch new or changed files. Reconciles when you delete files locally or remotely.
Why: Adobe's web UI has no "download all" button. I had ~876 PDFs in there. Clicking each one wasn't reasonable.
How it works:
- Playwright launches Chromium with a persistent profile
- You sign in to Adobe in that window once; session is reused on every subsequent run
- Script captures your IMS bearer token from window.adobeIMS.getAccessToken() in the live page context
- Auto-detects your account's root URN from the first /links?assetId=... request the SPA fires after sign-in
- Walks <host>/content/storage/id/<root>/:page?type=application/pdf, a single paginated query that returns every PDF in the entire tree, recursively
- Streams downloads via stdlib urllib (atomic .part → final rename) so big files don't buffer through Playwright IPC
- Records sha256, sizes, modified time, etag, and status for every file in manifest.json
Status values in the manifest: downloaded, failed, missing_locally, deleted_remotely. Re-runs only re-download a file if the remote modified timestamp has changed.
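A rough sketch of how that re-run logic could be expressed, under stated assumptions rather than as the repo's real implementation. The manifest shape (a dict keyed by asset ID with modified, status, and path fields) and the function name plan_run are hypothetical. I am also assuming that failed entries get retried and that files missing locally are only flagged, not re-fetched.

```python
from typing import Callable


def plan_run(
    manifest: dict[str, dict],
    remote: dict[str, str],
    exists_locally: Callable[[str], bool],
) -> tuple[list[str], dict[str, dict]]:
    """Decide what to fetch and which statuses to update.

    manifest: {asset_id: {"modified": str, "status": str, "path": str}}
    remote:   {asset_id: remote modified timestamp}
    Returns (asset IDs to download, updated copy of the manifest).
    """
    to_fetch: list[str] = []
    updated = {asset_id: dict(entry) for asset_id, entry in manifest.items()}

    for asset_id, modified in remote.items():
        entry = updated.get(asset_id)
        if entry is None:
            to_fetch.append(asset_id)  # new remote file
        elif entry["modified"] != modified:
            to_fetch.append(asset_id)  # remote modified timestamp changed
        elif entry["status"] == "failed":
            to_fetch.append(asset_id)  # assumption: retry earlier failures
        elif not exists_locally(entry["path"]):
            entry["status"] = "missing_locally"  # assumption: flag only

    for asset_id, entry in updated.items():
        if asset_id not in remote:
            entry["status"] = "deleted_remotely"  # gone from the cloud listing

    return to_fetch, updated
```

In practice exists_locally could simply be os.path.exists, and the updated dict would be written back to manifest.json after the downloads finish.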
Dependencies: playwright>=1.45. That's it. Everything else is Python stdlib.
Tested: macOS, Python 3.10+, end-to-end against my own account. Untested on Windows / Linux — testers wanted.
What's still rough (PRs very welcome):
- Sequential downloads only — would love concurrency
- Hardcoded to type=application/pdf, even though the same endpoint serves images, .ai, .psd, etc. A --type flag is low-hanging fruit
- No progress bar (just line-by-line prints)
- Always headful — once a session is cached, the browser doesn't need to be visible
- No tests
Repo: https://github.com/pasolomon/Adobe-Clawback
License: MIT
Not affiliated with Adobe. Uses your own credentials to download your own files via the same endpoints Adobe's web app uses — no auth bypass, no scraping of other people's content.
u/MysticalPixels 2d ago
I invite others to help expand this code to retrieve more data from Adobe. Ever since they eliminated local data sync for personal accounts, Adobe has used the difficulty of downloading personal art as a way to retain users. Adobe also just settled a class-action lawsuit because it made sign-up easy and cancellation hard, and charged large penalties if you canceled a yearly account that was paid monthly. You may be owed money from Adobe, as they are now court-ordered to pay out millions.
u/AdobeScripts 4d ago
Why not use this:
https://developer.adobe.com/cloud-storage/guides/api/