Last updated: May 2026
This brief describes how documents and data files uploaded to the Graam Harmony platform are stored, encrypted, and access-controlled. It is intended for security and compliance teams evaluating Graam for institutional use.
For institutional / hedge-fund deployments, Graam Harmony provides:
WHERE clause. A caller cannot retrieve another customer's row regardless of how the document UUID was obtained.
The remainder of this document explains each control and how a customer can verify it.
When a user uploads to Graam Harmony, the bytes belong to one of three lanes:
| Artifact | Examples | Storage |
|---|---|---|
| Documents | KBRA pre-sale PDFs, prospectuses, term sheets | GCS (PDF bytes) + Postgres (metadata) |
| Data files | Loan tapes (CSV/Excel), performance workbooks, parquet derivatives | GCS (raw + parquet) + Postgres (metadata) |
Postgres holds metadata only — ownership column, upload timestamp, SHA-256 hash for de-duplication, soft-delete marker, and the GCS path. The customer-supplied bytes themselves live only in GCS.
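The SHA-256 de-duplication hash mentioned above can be illustrated with a short sketch. The function name `sha256_of_stream` and the chunk size are assumptions made for this example, not Graam Harmony's actual code; the point is only that identical bytes always produce the same digest, so a repeat upload can be detected from metadata alone.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a stream, reading it in chunks.

    Hypothetical helper: chunked reads keep memory flat for large loan tapes.
    """
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


# Identical bytes hash identically; different bytes do not.
first = sha256_of_stream(io.BytesIO(b"loan tape bytes"))
second = sha256_of_stream(io.BytesIO(b"loan tape bytes"))
different = sha256_of_stream(io.BytesIO(b"another file"))
```

Storing only the digest in Postgres is consistent with the statement that the database never holds the uploaded bytes themselves.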
Loan tapes carry the most sensitive content (loan-level borrower data) and receive the same controls as every other artifact — there is no separate "less protected" lane.
Graam Harmony supports two storage modes, selected by deployment configuration:
Shared bucket (default for non-institutional deployments). All user content lives in one operator-managed bucket, namespaced by object key (documents/<user_id>/<document_id>.<ext>). Cross-customer collisions are impossible because each customer's content is keyed under their UUID.
Per-tenant buckets (institutional / hedge-fund tier). Each customer's content lives in a dedicated GCS bucket — one bucket per user identity, named deterministically as <prefix><uuid_no_dashes> (e.g. graam-tenant-d9cf70a51d214ca58f7a98feecb8cbfa). System-ingested public data (EDGAR filings, etc.) stays in the shared system bucket and never mixes with customer content.
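The two naming schemes can be sketched as follows. The function names are hypothetical, the `graam-tenant-` prefix is taken from the example above, and the use of the dashed UUID form inside the shared-bucket object key is an assumption; only the `<prefix><uuid_no_dashes>` bucket form is stated in this brief.

```python
import uuid


def shared_object_key(user_id: uuid.UUID, document_id: uuid.UUID, ext: str) -> str:
    """Object key in the shared bucket: documents/<user_id>/<document_id>.<ext>."""
    return f"documents/{user_id}/{document_id}.{ext.lstrip('.')}"


def tenant_bucket_name(user_id: uuid.UUID, prefix: str = "graam-tenant-") -> str:
    """Deterministic per-tenant bucket name: <prefix><uuid_no_dashes>."""
    name = f"{prefix}{user_id.hex}"
    # GCS bucket names without dots are limited to 63 characters.
    if len(name) > 63:
        raise ValueError(f"bucket name too long: {name}")
    return name


user = uuid.UUID("d9cf70a5-1d21-4ca5-8f7a-98feecb8cbfa")
bucket = tenant_bucket_name(user)
```

Because the name is a pure function of the user identity, the same customer always resolves to the same bucket, and no lookup table is needed to route an upload.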
The institutional model is the posture asset managers expect: a single shared bucket has a single blast radius, while per-tenant buckets give each customer:
When the platform creates a customer bucket on first upload, it applies a fixed security policy that cannot be silently weakened:
allUsers or allAuthenticatedUsers, even by an operator with high IAM permissions. This is the bucket-level safety net against the failure mode where a misconfigured grant exposes data to the open internet.
us-central1; EU and other GCP regions are supported.
Customers who require specific retention-lock policies (e.g. 7-year hold for ABS) pre-create their bucket with the lock applied. The platform refuses to silently create a bucket without the required lock when configured to defer to operator provisioning.
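One way a customer might check the bucket-level controls described above is to inspect the bucket metadata returned by the GCS JSON API. The sketch below assumes that metadata has already been fetched as a dictionary. The function name `policy_violations` is hypothetical, and treating uniform bucket-level access as part of the policy is an inference from the verification commands listed later in this brief, not a confirmed detail.

```python
from typing import Any


def policy_violations(bucket: dict[str, Any], expected_location: str = "us-central1") -> list[str]:
    """Return human-readable problems found in a GCS bucket resource dict.

    Field names follow the GCS JSON API bucket resource
    (iamConfiguration.publicAccessPrevention, uniformBucketLevelAccess, location).
    """
    problems: list[str] = []
    iam = bucket.get("iamConfiguration", {})
    # "enforced" blocks grants to allUsers / allAuthenticatedUsers.
    if iam.get("publicAccessPrevention") != "enforced":
        problems.append("public access prevention is not enforced")
    if not iam.get("uniformBucketLevelAccess", {}).get("enabled", False):
        problems.append("uniform bucket-level access is disabled")
    # Compare case-insensitively, since the API may report the location in upper case.
    if bucket.get("location", "").lower() != expected_location.lower():
        problems.append(f"bucket is not in {expected_location}")
    return problems


compliant = {
    "location": "US-CENTRAL1",
    "iamConfiguration": {
        "publicAccessPrevention": "enforced",
        "uniformBucketLevelAccess": {"enabled": True},
    },
}
issues = policy_violations(compliant)
```

An empty list means the bucket matches the expected posture; any entry names the control that has drifted.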
All bytes are encrypted at rest by GCS with AES-256. For institutional deployments, Graam Harmony binds each per-tenant bucket to a Customer-Managed Encryption Key (CMEK) in Cloud KMS at bucket creation time. Three configurations are supported:
| Mode | Use case |
|---|---|
| Google-managed keys (GMEK) — default for non-institutional deployments. | AES-256 at rest with Google holding the keys. |
| Single shared customer key. | One Cloud KMS key for the deployment. Operationally simple — one key to rotate, one IAM grant. Appropriate for customer-dedicated Graam instances. |
| Per-tenant customer key. | A separate Cloud KMS key per customer. Each customer's bucket binds to their own key. Cryptographic isolation between customers. Revoking one customer's key has no effect on others. |
In the customer-key modes, the customer (not Graam) creates the KMS key in their own keyring. The customer grants the GCS service account roles/cloudkms.cryptoKeyEncrypterDecrypter on the project. The Graam application's service account does not make this grant on the customer's behalf — that would defeat the entire CMEK threat model. Revoking the grant cryptographically locks Graam (and Google) out of the data, without requiring any code change or deployment action.
The metadata database is hosted on Google Cloud SQL with at-rest encryption enabled. Postgres holds metadata only — file ownership, hashes, paths — never the customer-uploaded bytes.
TLS 1.2 or higher is required on every link: client ↔ API, API ↔ GCS, API ↔ Postgres. Plaintext HTTP is not accepted.
Every API request authenticates with a JSON Web Token (JWT) signed by a customer-controlled secret or asymmetric key. The API verifies:
exp (expiration) claim.
iss (issuer) and aud (audience) claims.
A malformed, expired, or wrong-signature token returns 401 Unauthorized. There is no silent fallback: an attacker forging a token gets rejected, not downgraded.
The verified user identity is read from the sub claim and becomes the binding for every downstream access decision in that request.
Customers operating their own JWT-issuing identity provider can plug in their public key (JWKS) and have Graam reject any token not minted by their IdP.
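The verification steps above can be sketched using only the Python standard library. This is a minimal illustration for the shared-secret (HS256) case; the asymmetric and JWKS cases would need a cryptography library and are not shown. All names here (`InvalidToken`, `sign_hs256`, `verify_hs256`) are hypothetical, and the sketch treats `aud` as a single string, whereas the JWT specification also allows a list.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


class InvalidToken(Exception):
    """Any failure here would map to a 401 Unauthorized response."""


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Mint a token (used here only to exercise the verifier)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, issuer: str, audience: str,
                 now: Optional[float] = None) -> str:
    """Check signature, exp, iss and aud; return the verified sub claim."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError) as exc:
        raise InvalidToken("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise InvalidToken("malformed token")
    # Pin the algorithm: never let the token choose how it is verified.
    if header.get("alg") != "HS256":
        raise InvalidToken("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise InvalidToken("bad signature")
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        raise InvalidToken("expired or missing exp")
    if claims.get("iss") != issuer or claims.get("aud") != audience:
        raise InvalidToken("wrong issuer or audience")
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise InvalidToken("missing sub")
    return sub
```

Every failure path raises the same exception type, which mirrors the "no silent fallback" behaviour: the caller either gets a verified `sub` or nothing at all.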
For every access of a customer artifact, the platform:
A spoofed ?user_id=victim URL parameter is irrelevant; the verified JWT is authoritative.
Application-layer authorization checks have a known failure mode: an internal code path can be written that forgets to perform the check. Graam Harmony guards against this risk by also enforcing ownership at the SQL layer.
The data-access methods that read documents and data files require the calling user's id and apply it as a WHERE clause filter. A row whose owner does not match the calling identity is invisible to the query — no row is returned, regardless of how the artifact UUID was obtained.
This means an internal code path that mishandles user identity gets back zero rows, not the wrong customer's data.
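The effect of the ownership filter can be demonstrated with an in-memory SQLite database standing in for Postgres. The table layout, column names, and the `get_document` function are assumptions made for this sketch; only the pattern (the caller's identity is a mandatory part of the WHERE clause) comes from the description above.

```python
import sqlite3
from typing import Optional


def get_document(conn: sqlite3.Connection, document_id: str,
                 calling_user_id: str) -> Optional[sqlite3.Row]:
    """Fetch a document row only if it belongs to the calling user.

    The owner is a required argument, so there is no way to call this
    method without supplying an identity to filter on.
    """
    return conn.execute(
        "SELECT id, owner_id, gcs_path FROM documents "
        "WHERE id = ? AND owner_id = ? AND deleted_at IS NULL",
        (document_id, calling_user_id),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE documents ("
    "id TEXT PRIMARY KEY, owner_id TEXT NOT NULL, "
    "gcs_path TEXT NOT NULL, deleted_at TEXT)"
)
conn.execute(
    "INSERT INTO documents VALUES (?, ?, ?, NULL)",
    ("doc-1", "alice", "documents/alice/doc-1.pdf"),
)
conn.commit()

own_row = get_document(conn, "doc-1", "alice")        # the owner sees the row
foreign_row = get_document(conn, "doc-1", "mallory")  # anyone else gets None
```

A caller who knows the document id but is not the owner receives exactly the same result as a caller asking for an id that does not exist.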
A small, audit-tagged set of system pipelines (post-upload parsing, batch enrichment) operate without a calling user. These call explicitly-named bypass methods that produce a unique audit signal — a security review can locate every bypass in one search and review the justifying comment at each site. These bypass methods are never reachable from an API request path.
When the customer runs an analysis cell, the in-platform agent may need to read content from a document the customer has explicitly attached to that cell. The agent runs server-side under the calling customer's identity; its document reads flow through the same SQL ownership filter described above.
If a malicious prompt tells the agent to "read document XYZ" where XYZ is some other customer's document, the agent receives "Document not found" — the row is never returned to the agent's session. Cross-customer read via prompt injection is structurally impossible.
A customer may publish a specific notebook with a shareable link. Shared notebooks expose the notebook content (cells, prose, charts) but never the underlying source documents. Document downloads always require ownership.
Upload endpoints enforce per-IP rate limits backed by Redis (so the limit is consistent across worker processes). Limits are configurable per deployment.
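A shared counter is what makes the limit consistent across worker processes. The sketch below shows a fixed-window counter, which is one common approach and not necessarily the algorithm Graam Harmony uses. It expects any store exposing `incr(key)` and `expire(key, seconds)`, which matches the redis-py client; `InMemoryStore`, `allow_upload`, the key format, and the default limits are all hypothetical.

```python
import time
from typing import Optional


class InMemoryStore:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self._counts: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]

    def expire(self, key: str, seconds: int) -> bool:
        # Expiry is not simulated here; in Redis it cleans up old windows.
        return True


def allow_upload(store, client_ip: str, limit: int = 10,
                 window_seconds: int = 60, now: Optional[float] = None) -> bool:
    """Return True if this IP is still under its per-window upload limit."""
    current = time.time() if now is None else now
    window = int(current // window_seconds)
    # The window index is part of the key, so each window starts from zero.
    key = f"ratelimit:upload:{client_ip}:{window}"
    count = store.incr(key)
    if count == 1:
        store.expire(key, window_seconds)
    return count <= limit


store = InMemoryStore()
decisions = [
    allow_upload(store, "203.0.113.7", limit=3, window_seconds=60, now=1000.0)
    for _ in range(4)
]
```

Because the increment happens in the shared store, every worker process sees the same count for a given IP and window.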
Every action against customer data is logged with a stable trail:
ag.workflow_events keyed to the customer's session context.
The verified caller identity is bound to every event, so a post-incident timeline can be reconstructed against a specific customer or a specific request.
__MACOSX/) are rejected. Per-entry size is checked against the declared size.
Soft delete is the default: deleting a document or data file marks the row deleted and hides it from every read path. Hard deletion (purging the bytes from GCS) is currently a manual operations procedure; automated hard-delete after a configurable retention window is on the near-term roadmap.
For institutional deployments, customers can specify their own retention policy on the per-tenant bucket via GCS lifecycle rules or a retention lock — these apply to the bytes regardless of the application's logical state.
Two independent layers guard customer data today:
WHERE clause; an internal path that mishandles identity returns no row.
A planned third layer, Postgres row-level security with a per-request SET LOCAL of the user identity, extends ownership enforcement to the database role itself. After this lands, even a hypothetical application bug that forgets to thread identity cannot return another customer's row; the database refuses.
We list the work that is in flight rather than implying it is done:
us-central1 by default; EU regions and other GCP regions on request. CMEK keys are customer-controlled in the customer's own GCP project / KMS keyring.
gsutil kms encryption gs://<bucket>).
gsutil iam get, gsutil bucketpolicyonly get).
Security questions, vulnerability reports, or audit requests: [email protected].
We respond within 48 hours and treat reports under coordinated disclosure: 90 days from acknowledgment to public disclosure unless the reporter requests otherwise.