We are currently in the midst of moving 5 years' worth of media to a CDN to help increase page-load speed, and I have hit some bumps along the way that I want to write about in case anyone else is running into the same problems.
I have used the WordPress cache plugin WP-SuperCache for something like 4 years now. When I got into the WordPress scene it had just replaced WP-Cache as the preferred caching plugin.
Since then, it seems W3 Total Cache has become the golden child of the WordPress caching scene.
I have tried to migrate to W3 Total Cache a total of 3 times; every 6-8 months or so. About 2 years ago I figured a plugin with that high of a rating and running on such high-visibility websites would make my life amazing, grow my hair back and triple my traffic just through sheer cachiness!
Every time I tried to move to W3TC I found the setup overly confusing, and it never worked easily “out of the box”; it was always an adventure with lots of steps. Regardless, I stuck it out and got everything configured appropriately, only to find actual page-load times to be the same or slower than they were with WP-SuperCache. This didn’t make sense to me since W3TC was also doing JS/CSS minification, but I didn’t need to be deploying more complex solutions for the sake of complex solutions, so I would always roll back.
Well it is 2011 and with the world ending next year, I figured I would move over to W3TC and employ the use of a CDN all at the same time.
2 hours later, I’m back on WP-SuperCache with the help of CDN Sync Tool to work around the terrible Amazon S3 syncing experience I had with W3TC.
One of the first things to attract me to W3TC was that it had CDN syncing built in. I fired it up and attempted to sync 5 years' worth of images to S3 automatically. The tool said it found 7459 images and began syncing.
After about 2 hours it stopped at exactly 2900 images and just kept Retrying/Processing/Retrying… never making progress. I let it do that for about 1 hr before I decided it was dead. Then I tried Pausing for 10mins (hoping whatever resource went stupid during the upload would be reset or cleared) and resuming; no luck.
I finally killed the upload and restarted it, hoping it would know to check whether resources already existed before trying to re-upload them, and it did. The upload continued to crawl along at 25 items every 10 seconds or so.
When it got to 2900, it kept going, reporting that every batch of 25 items or so was “Already uploaded”… I thought “hmm that’s odd” and just waited.
2hrs went by and the uploads finished, reporting that EVERYTHING had already been uploaded. No new uploads made… bah, stupid waste of time. I suppose that entire time it was hung at 2900 it was uploading in the background.
I then turned on CDN support on the site and started to investigate. I noticed right away that a handful of thumbnails and images were missing from random posts all over the site. I double checked the CDN and they simply weren’t on it; they had never been uploaded.
I re-ran the ENTIRE re-sync process 2 more times (completing in about 1hr each time), thinking that the tool just needed to fill in the blanks. Each time it reported that 7459 images were found.
The results didn’t change; the same images were always missing.
I posted a question about it to the plugin support forums on WordPress.org and the post was promptly closed with no explanation… awesome!
I finally decided to do a file-count of the /uploads directory to see how many images actually live there and got back a rough count of 29,000 files.
Jesus Christ… W3TC was a bit off, and had I rolled out with that, I would have been buried in broken image links from here until hell froze over. At that point I decided to go back to WP-SuperCache and try out the new CDN Sync Tool addon.
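For anyone wanting to run the same sanity check, here is a minimal sketch of a recursive file count in Python. The `wp-content/uploads` path and the `count_files` name are just assumptions for illustration; point it at wherever your uploads actually live.

```python
import os

def count_files(root):
    """Count every non-directory entry under root, including subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total

# Assumed path; os.walk yields nothing for a missing directory, so this prints 0 there.
print(count_files("wp-content/uploads"))
```

Comparing that number against whatever total your sync plugin reports is a quick way to see whether it is looking at the same set of files you are.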
NOTE: I did try adding the wp-content/uploads directory directly to the “custom paths” field in W3TC to see if it would just upload the raw images and skip whatever meta-data it was checking for the actual files to upload; that didn’t work.
CDN Sync Tool has been running for about an hour now and has synced thousands and thousands of unsynced images. I know this because when it hits a synced image it reports it as “Already uploaded” and quickly moves on.
I have no idea what heuristic W3TC uses for file syncing; even if it was just pulling what was in my Media Library and not the actual directory contents, that is still thousands of files, more than 7500.
Either way, I need a solution that will *literally* mirror directory contents for me and not pretend to be smarter than me by using some meta-assessment of what files I want up there. CDN Sync Tool seems to be doing that; I’ll keep you guys posted on how the transition goes.
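To make “literally mirror directory contents” concrete, here is a rough sketch of the check I would want a sync tool to do: walk the directory, then diff it against a listing of what is already on the CDN. This is not how either plugin works; `local_keys`, `missing_from_remote`, and the `remote_keys` listing are hypothetical names, and fetching that listing from S3 is left out entirely.

```python
import os

def local_keys(root):
    """Return the path of every file under root, relative to root, with '/' separators."""
    keys = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            keys.add(rel.replace(os.sep, "/"))
    return keys

def missing_from_remote(root, remote_keys):
    """Sorted list of files that exist locally but are absent from the remote listing."""
    return sorted(local_keys(root) - set(remote_keys))
```

Anything returned by `missing_from_remote` is a file that would show up as a broken image once the CDN is switched on, regardless of whether the Media Library knows about it.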
Update #1: CDN Sync Tool is far from perfect. After it finished its upload of a few thousand extra files I am still seeing broken links (namely for thumbnails). Upon double-checking, it looks like it forgot to upload some files as well, even though I ran it manually 3x to resync. Each successive run completed rather quickly because it declared everything “already up to date”.
Now I’m using the manual “Sync Folder” feature inside of CDN Sync Tool to force the remainder of the /uploads contents up to S3.
This has been a very frustrating experience overall. I can’t believe how incomplete/dysfunctional all these tools are so far. If I could have scp’ed over the whole directory I would have, but AWS uses its own custom API and I’m not aware of handy command line utils for this yet, because I haven’t gotten annoyed enough yet to look.
If this manual sync fails, I’ll go digging for something command-line based or just give up on the idea of a CDN for another 6 months until these plugins get more polished.
Update #2: It has been a few hours now and the CDN Sync Tool, using the “manual folder sync” feature in the right-hand side bar and telling it to manually sync “wp-content/uploads”, is still uploading.
It caught a few duplicates already, but the vast majority of what it is uploading is brand new files that NEITHER plugin found and uploaded on previous sync steps.
Every one of the missing files seems to be a scaled version of an original that WordPress created. For example:
Basically any image that was resized using the Insert Media functionality was skipped during the import and now I am having to manually force them from the server over to S3.
I’ll drop the CDN totally if I can’t get CDN Sync Tool to sync those automatically because almost every post I make has a few scaled versions of the images I am inserting.
Update #3: Alright the update finished and the site seems to have all its images.
Apparently both CDN Sync Tool and W3TC use WordPress’s media library to discover the images that should be uploaded and don’t just upload what is in the /uploads folder.
This seems like a near-sighted default behavior, because it automatically excludes all the scaled versions of images that WordPress *itself* creates when you post a story.
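If you want to see how many of your files fall into that scaled-version bucket, here is a small sketch that splits a list of file names into originals and scaled copies. It assumes WordPress’s usual naming, where a resized copy gets a `-WIDTHxHEIGHT` suffix before the extension; the `SCALED` pattern, `split_scaled`, and the `sunset` file names are made up for illustration.

```python
import re

# Assumption: scaled copies end in "-<width>x<height>" right before the extension.
SCALED = re.compile(r"^.+-\d+x\d+\.[^.]+$")

def split_scaled(filenames):
    """Split names into (originals, scaled) lists, preserving input order."""
    originals, scaled = [], []
    for name in filenames:
        (scaled if SCALED.match(name) else originals).append(name)
    return originals, scaled

# Hypothetical file names, not taken from my actual uploads folder.
originals, scaled = split_scaled(
    ["sunset.jpg", "sunset-150x150.jpg", "sunset-300x200.jpg"]
)
print(len(originals), "originals,", len(scaled), "scaled")
```

The pattern is a guess based on file names alone, so an original that happens to be named something like `banner-800x600.png` would be miscounted as a scaled copy.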
Update #4: It turns out CDN Sync Tool does not sync all files it finds in the /uploads folder; I have no idea what its heuristic is, but it results in me chasing broken image links all over the place for new or updated stories.
I don’t have time for this so the CDN rollout will have to wait for another day after this functionality is more mature.