🗄️

BuildDoc: Large-file vault — host the story, not the terabytes

The Cloud
Tags: builddoc, storage, media, infra, security

The Cloud has to solve cloud storage; it's pretty much a non-negotiable part of our story. So here we go!

The main thing is that I want to build something we can actually use, and that solves problems at a greater and wider capacity. If we're growing and making a difference, this is a garden we wish to create and tend.

Each stage is an opportunity to get real. Become present. See where you're holding. Something tells you not to go deeper. Something that is not you. Something that is not calm. Begin to see it, and it will become known to you. The path will become clear.

Large-file vault. Let people bring terabyte-scale, irreplaceable footage (Sony XAVC video plus its sidecar .xml metadata, family archives, legacy material) into The Cloud, securely, without us or them paying to host the full raw weight. The story: footage that matters too much to lose and is too heavy for anyone to want to store. We make it storable, re-openable, and usable in a new way.

Brand thread: this is a "no one wants to store this... so we made it effortless" moment. Insert the "no one:" meme format in user-facing storytelling. The emotional hook is legacy and memory, not gigabytes.

The problem, precisely

Sony cameras produce large footage files alongside .xml sidecars (clip metadata, timecode, lens and color data). A single shoot can be terabytes. This material is often someone's most important footage: a story, a memory, a legacy. Two truths collide:

  1. They want it safe, in The Cloud, re-openable and usable.

  2. Nobody, us or them, wants to pay to store raw terabytes indefinitely.

So the goal is to take on the value of the footage without taking on the full raw weight of it. Compressed in, stored light, re-opened and used in a new way. "Zip to view" should feel quick.

Directions to weigh (for builders to make real)

This is the part that needs codebase- and infra-level judgment. The doc frames the options; the builders pick and prove the architecture. There is no room for error here: this is mic-drop infra work.

  • Tiered / cold storage. Raw masters live in cheap cold object storage (Glacier-class), with lightweight proxies hot in The Cloud for instant viewing. Cost drops because hot storage only holds the small proxy, not the master. Retrieval of the master is slower and explicit.

  • Proxy + sidecar model. On upload, generate a compressed, web-playable proxy (the thing people actually watch and work with) and preserve the .xml sidecar and a manifest. The proxy is what The Cloud frames and re-creates around. The master is cold or external.

  • Bring-your-own-storage / pass-through. The user's footage stays in their own bucket or drive; The Cloud holds the proxy, the metadata, and a secure reference, and frames the experience without ever taking custody of the raw terabytes. We host the meaning, not the mass.

  • Content-addressed + dedup. Hash chunks so identical data is stored once, with integrity verification baked in. Helps at scale and guards against silent corruption of irreplaceable material.

  • "Zip to view." A package format that bundles proxy plus sidecar plus manifest so an archive can be stored compact and re-opened fast into a usable Cloud surface, rather than a dead .zip that has to be fully downloaded and unpacked.

These are not mutually exclusive. A likely shape is proxy-hot + master-cold-or-BYO + content-addressed integrity + a zip-to-view package. Builders decide.
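As one illustration of the content-addressed + dedup direction, here is a minimal sketch, not a proposed implementation. It assumes fixed-size chunks and SHA-256 keys; `ChunkStore`, `ingest`, `restore`, and the 4 MiB chunk size are hypothetical names and values, and the in-memory dict stands in for whatever object store Phase 0 selects.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size; to be tuned in Phase 0


class ChunkStore:
    """In-memory stand-in for an object store keyed by SHA-256 digest."""

    def __init__(self):
        self.chunks = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical chunks hash to the same key, so they are stored once.
        self.chunks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self.chunks[digest]
        # Re-hash on read: a mismatch means the stored bytes have changed.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"chunk {digest} failed integrity check")
        return data


def ingest(store: ChunkStore, payload: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split the payload into chunks and return the ordered list of digests."""
    return [
        store.put(payload[i:i + chunk_size])
        for i in range(0, len(payload), chunk_size)
    ]


def restore(store: ChunkStore, digests: list) -> bytes:
    """Reassemble the payload, verifying every chunk on the way out."""
    return b"".join(store.get(d) for d in digests)
```

The ordered digest list is what a manifest would record. Unique camera footage is unlikely to share many chunks with other footage, so in this sketch the dedup benefit comes mainly from repeated uploads of the same material, while the integrity check applies to every chunk.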

Security and trust (non-negotiable)

This is people's legacy. The handling has to be beyond reproach.

  • Encrypted at rest and in transit.

  • Integrity verification so a master can be proven intact years later (hashes, checksums).

  • Clear custody model: the user always knows where their master physically lives and that it is theirs.

  • No silent quality loss: the proxy is explicitly a proxy; the master is preserved or under the user's control, never quietly discarded.
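The integrity bullet above could look something like the following sketch. It assumes SHA-256 over the whole file and a simple name-to-hash manifest; `sha256_file`, `build_manifest`, and `verify` are hypothetical helpers, and the manifest layout is illustrative only.

```python
import hashlib
from pathlib import Path


def sha256_file(path, block_size=1024 * 1024):
    """Hash a file in blocks so terabyte-scale masters never load into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def build_manifest(paths):
    """Record size and hash at ingest time for each master and sidecar."""
    return {
        Path(p).name: {"sha256": sha256_file(p), "bytes": Path(p).stat().st_size}
        for p in paths
    }


def verify(manifest, directory):
    """Return the names of files that are missing or no longer match."""
    failures = []
    for name, entry in manifest.items():
        path = Path(directory) / name
        if not path.is_file() or sha256_file(path) != entry["sha256"]:
            failures.append(name)
    return failures
```

An empty list from `verify` means every recorded file still matches the hash taken at ingest. Keeping a copy of the manifest with the user as well as in The Cloud would let them run the same check independently of us.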

The reframe (why this is more than storage)

The point is not to be a cheaper Dropbox. It is to make heavy, precious footage usable in a new way once it is in The Cloud: framed on a page or profile, played in the floating media window, woven into a story or a memory canvas, shared with specificity via block-level visibility. Storage is the cost center; the re-creation and framing is the product. We carry the story, not the bytes.


Phases

Phase 0 is the builders'. The storage and integrity architecture must be specced from inside the infra, not from outside.

Phase 0 — Builders architect the model

With infra-level judgment, choose and prove the architecture across the directions above: hot/cold tiers, proxy generation pipeline, BYO-storage vs custody, content-addressing and integrity, and the zip-to-view package format. Define the cost model per stored hour of footage and the security posture end to end. Output: a technical architecture spec with no room for error.
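A cost model per stored hour can start as simple arithmetic. The sketch below is a starting shape only: `CostAssumptions` and `monthly_cost_per_hour` are hypothetical names, every number passed in is a placeholder to be replaced with real provider pricing and measured bitrates, and retrieval, request, and egress fees are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    master_gb_per_hour: float     # camera-original size per hour of footage
    proxy_gb_per_hour: float      # web-playable proxy size per hour of footage
    hot_usd_per_gb_month: float   # placeholder hot-tier storage price
    cold_usd_per_gb_month: float  # placeholder cold-tier storage price


def monthly_cost_per_hour(a: CostAssumptions, master_tier: str = "cold") -> float:
    """Storage cost per month for one hour of footage.

    The proxy is always hot. The master is "hot", "cold", or "byo"
    (bring-your-own-storage, which costs us nothing to store).
    """
    master_rate = {
        "hot": a.hot_usd_per_gb_month,
        "cold": a.cold_usd_per_gb_month,
        "byo": 0.0,
    }[master_tier]
    proxy_cost = a.proxy_gb_per_hour * a.hot_usd_per_gb_month
    master_cost = a.master_gb_per_hour * master_rate
    return proxy_cost + master_cost


# Placeholder inputs: a 100 Mbit/s master is 45 GB per hour; prices are invented.
example = CostAssumptions(
    master_gb_per_hour=45.0,
    proxy_gb_per_hour=2.0,
    hot_usd_per_gb_month=0.02,
    cold_usd_per_gb_month=0.001,
)
for tier in ("hot", "cold", "byo"):
    print(tier, round(monthly_cost_per_hour(example, tier), 3))
```

Even with invented prices, the shape shows why the master's tier dominates the bill: the proxy term is the same in all three cases.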

Phase 1 — Ingest + proxy pipeline

Upload large footage (Sony XAVC plus .xml sidecars). Generate a compressed web-playable proxy, preserve the sidecar and a manifest, verify integrity on the way in.

Done when: a multi-gigabyte clip uploads, a proxy is watchable in The Cloud quickly, and the .xml plus master are accounted for with verified integrity.
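For the proxy step, one possible starting point is an ffmpeg invocation built in code. This is a sketch under assumptions: it presumes ffmpeg with libx264 is available, and the 720p height, CRF 23, and `_proxy.mp4` naming are placeholders pending the proxy codec decision in the open questions. `proxy_command` is a hypothetical helper that only builds the argument list.

```python
from pathlib import Path


def proxy_command(master, out_dir, height=720, crf=23):
    """Build an ffmpeg argument list for an H.264/AAC web proxy."""
    master = Path(master)
    proxy = Path(out_dir) / f"{master.stem}_proxy.mp4"
    return [
        "ffmpeg", "-n",                    # never overwrite an existing output
        "-i", str(master),
        "-vf", f"scale=-2:{height}",       # keep aspect ratio, force even width
        "-c:v", "libx264", "-preset", "medium", "-crf", str(crf),
        "-pix_fmt", "yuv420p",             # 8-bit 4:2:0 for broad browser support
        "-c:a", "aac", "-b:a", "128k",
        "-movflags", "+faststart",         # index at the front so playback starts early
        str(proxy),
    ]
```

Running it would be a matter of passing the list to `subprocess.run(cmd, check=True)`. The master is only ever read, never modified, which keeps the no-silent-quality-loss promise intact at this step.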

Phase 2 — Light storage + master custody

Keep proxies hot; place masters in cold storage or under BYO-storage reference. The user always knows where the master lives. Cost stays low.

Done when: a terabyte archive is browsable and viewable in The Cloud while raw weight is not sitting in expensive hot storage.

Phase 3 — Zip-to-view package + re-open

A package format bundling proxy, sidecar, and manifest that stores compact and re-opens fast into a usable Cloud surface.

Done when: an archive can be packaged, stored small, and re-opened quickly into a working view without a full raw download.
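Whether zip-to-view ends up as a real container or an abstraction is an open question. As one data point, a plain ZIP already allows reading a single member without unpacking the rest, because its central directory records where each member sits. The sketch below assumes that approach; `build_package`, `open_manifest`, and the member names are hypothetical.

```python
import io
import json
import zipfile


def build_package(proxy_bytes: bytes, sidecar_xml: str, manifest: dict) -> bytes:
    """Bundle proxy, sidecar, and manifest into one archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Small text members compress well.
        zf.writestr("manifest.json", json.dumps(manifest),
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("sidecar.xml", sidecar_xml,
                    compress_type=zipfile.ZIP_DEFLATED)
        # The proxy is already compressed video: store it as-is so it can be
        # read straight out of the archive without an inflate step.
        zf.writestr("proxy.mp4", proxy_bytes, compress_type=zipfile.ZIP_STORED)
    return buf.getvalue()


def open_manifest(package_bytes: bytes) -> dict:
    """Re-open the package and read only the manifest."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return json.loads(zf.read("manifest.json"))
```

Here the whole package is held in memory for simplicity. A production reader would more likely fetch the central directory and the needed members by byte range from object storage, which is an assumption to validate in Phase 0.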

Phase 4 — Frame and re-create

Surface footage as a first-class Cloud object: on pages and profiles, in the floating media window, in memory canvases, shareable via block-level visibility.

Done when: stored footage is usable and shareable across Cloud surfaces, not just retrievable.

Open questions for builders

  • Proxy codec and resolution: what is the right web-playable proxy spec for Sony XAVC sources?

  • Custody: do we ever hold the master in hot storage, or is the master always cold-tier-ours or BYO-theirs? What does the cost model say?

  • Integrity over time: how do we let a user prove their master is intact years later?

  • Zip-to-view: is this a real container format or a manifest-plus-streaming abstraction? What makes re-open feel instant?

  • Pricing: who pays for cold retrieval of a master, and how is that surfaced honestly?

BuildDoc: Floating media window (persistent, plays anywhere)

BuildDoc: Block-level visibility & sharing (eye icon)



Correction & sharpening — the engine is not a compressor

A note added after discussion, because this is where loving the vision could cost the build. The headline is not "we built a better compressor."

The physics: general-purpose lossless compression typically buys only a few percent (roughly 2 to 5) on already-compressed video like Sony XAVC, because the camera's codec has already removed most of the redundancy. The only route to a dramatic size reduction is lossy re-encoding, which throws away original data. For irreplaceable masters, that violates the no-quality-loss promise. So compression is one tool in the kit, not the engine.
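The effect is easy to reproduce in miniature. The sketch below uses random bytes as a stand-in for already-compressed video (an assumption: real XAVC files carry some container overhead and are slightly more compressible than pure noise) and compares them with deliberately repetitive data. `deflate_ratio` is a hypothetical helper.

```python
import os
import zlib


def deflate_ratio(data: bytes) -> float:
    """Compressed size divided by original size (1.0 means no gain)."""
    return len(zlib.compress(data, 9)) / len(data)


# Stand-in for camera-compressed footage: high-entropy bytes.
high_entropy = os.urandom(1_000_000)
# Deliberately redundant data, the kind general-purpose compressors are good at.
redundant = b"timecode=00:00:00:00;" * 50_000

print(f"high-entropy ratio: {deflate_ratio(high_entropy):.3f}")
print(f"redundant ratio:    {deflate_ratio(redundant):.3f}")
```

The high-entropy case comes out at essentially no gain, while the redundant case shrinks dramatically. A lossless compressor can only remove redundancy, and the camera has already spent most of it.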

The real engine is proxy + tiered custody. We do not shrink the footage; we are smart about which copy is hot. A small, beautiful, web-playable proxy lives hot and instant and is what people watch, frame, and share every day. The full-weight master lives cold, or in the user's own storage, retrieved only on the rare occasion the true original is needed. The experience feels weightless and fully present while the expensive raw bytes sit somewhere cheap or somewhere that is not our bill. This is how serious media platforms actually win. None won on a compressor.

Winning against Drive / iCloud / hard drives — the right axis. We do not beat Google on cost per gigabyte; their scale is untouchable. We beat them on the thing they are all bad at: storage as a dead end. Drive is a folder. iCloud is a folder. A hard drive is a folder. Files go in and rot, disconnected from everything. The Cloud's edge is that a file lands in the substrate: framed on a page, played in the floating window, woven into a memory canvas, shared with block-level precision, connected to everything else. That is storage that is alive, which they cannot copy without rebuilding their whole product. Carry the story, not the bytes, and make the story usable.

YouTube as the video proxy delivery tier

A strong, specific tier in the kit: use YouTube to host and stream the video proxy, offloading the single most expensive thing (video playback and bandwidth) to Google's CDN at no hosting cost to us, within YouTube's terms and API quota.

How it slots in:

  • On publish, The Cloud pushes the proxy to the user's own YouTube (via OAuth, their account, their storage) as private / unlisted / public.

  • YouTube holds and streams that one proxy. There is no second proxy on Cloud infra. The Cloud holds the reference (embed, metadata, .xml sidecar, visibility) and renders it across pages, posts, profiles, the floating media window, and memory canvases.

  • It feels like the video lives in The Cloud and is fully theirs to place and share. That it rides YouTube's pipes is an implementation detail the user never has to think about.

The fences, so builders use it for the right job:

  • Video only. YouTube is not the large-file vault's storage layer. The master, the .xml sidecars, photos, documents, and the non-video long tail still need the Phase 0 cold / BYO custody answer. YouTube is the video proxy and delivery tier, nothing more.

  • Lossy by definition. YouTube re-encodes on upload. Perfect for the proxy people watch. Never the master. The no-quality-loss promise holds precisely because YouTube only ever holds the proxy.

  • Opportunistic, not guaranteed custody. It is Google's platform and rules: content-ID flags, age-gates, takedowns, API quota, shifting terms. Architecture must treat YouTube as one swappable delivery option with the master safe elsewhere, so nothing irreplaceable is ever at risk and playback can fail over to another proxy source.
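Treating YouTube as one swappable delivery option might be modelled as an ordered list of proxy sources with a health check on each. This is a minimal sketch: `ProxySource`, `pick_source`, and the example URLs are hypothetical, and the availability check is left as a callable because how to detect a takedown or quota failure is undecided.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ProxySource:
    name: str
    url: str
    is_available: Callable[[], bool]  # e.g. an embed or quota health check


def pick_source(sources: Sequence[ProxySource]) -> Optional[ProxySource]:
    """Return the first source that reports itself available, in priority order."""
    for source in sources:
        try:
            if source.is_available():
                return source
        except Exception:
            # A failing health check counts as unavailable; try the next source.
            continue
    return None


# Usage with placeholder sources: YouTube first, a fallback proxy second.
sources = [
    ProxySource("youtube", "https://example.invalid/embed/abc", lambda: False),
    ProxySource("fallback", "https://example.invalid/proxy/abc.mp4", lambda: True),
]
chosen = pick_source(sources)
print(chosen.name if chosen else "no proxy available")
```

Returning `None` rather than raising leaves room for the player to show an honest "proxy unavailable, master is safe" state instead of a broken embed.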

Dual-settings — manage and post YouTube from The Cloud

A feature worth having on its own, separate from the storage angle: manage YouTube videos and post to YouTube from inside The Cloud, one control surface.

  • A video's Cloud visibility maps to its YouTube visibility. Private in The Cloud means private on YouTube; shared to specific people maps to an unlisted link shared with those people; public is public.

  • Honest caveat for builders: a YouTube "unlisted" video can be viewed by anyone who has the link; it is not true per-person access control. So "shared with specific people" cannot be enforced by YouTube alone. Gate the embed behind Cloud auth and render it in our own player chrome; use unlisted as the transport, not the lock. Otherwise the security promise gets fuzzy.

  • OAuth is per-user: their Google account, their footage, their storage. We provide the control surface and the connective tissue. Fits Frictionless Connect and bring-your-own-storage.
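The dual-settings mapping and the auth gate can be sketched together. The YouTube side uses the three privacy values the YouTube Data API exposes (`private`, `unlisted`, `public`); everything on the Cloud side, including the visibility names, the record fields, and `can_view`, is a hypothetical illustration.

```python
# Cloud visibility -> YouTube privacyStatus. "unlisted" is transport only.
YOUTUBE_PRIVACY = {
    "private": "private",
    "specific_people": "unlisted",
    "public": "public",
}


def youtube_privacy_for(cloud_visibility: str) -> str:
    return YOUTUBE_PRIVACY[cloud_visibility]


def can_view(video: dict, viewer_id) -> bool:
    """Cloud-side gate checked before the embed is ever rendered."""
    if video["visibility"] == "public":
        return True
    if viewer_id is None:
        return False  # anonymous viewers only see public videos
    if viewer_id == video["owner_id"]:
        return True
    return (
        video["visibility"] == "specific_people"
        and viewer_id in video["allowed_viewer_ids"]
    )


video = {
    "visibility": "specific_people",
    "owner_id": "u1",
    "allowed_viewer_ids": {"u2"},
}
print(youtube_privacy_for(video["visibility"]), can_view(video, "u2"), can_view(video, "u3"))
```

In this sketch the per-person decision lives entirely in `can_view`; YouTube only ever sees one of its three coarse states.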

The spine — publish once, reference everywhere

The flow that ties it together: they hit publish in The Cloud. The Cloud generates or hands off the proxy, pushes it to their YouTube per the dual-settings mapping, and gets back a reference. From that moment the proxy is a first-class Cloud object, placeable and shareable across every surface, rendered in Cloud player chrome behind Cloud auth. Three layers, one experience: master safe and heavy somewhere cheap or BYO, proxy hosted free on YouTube, reference and framing alive in The Cloud.

Net for builders: lead with proxy + tiered custody as the engine. Compression and content-addressed dedup are tools in the kit, not the headline. YouTube is the video proxy delivery tier plus a standalone YouTube-management feature. The master and non-video files still live in the Phase 0 custody decision, which remains the keystone.

