LICEN Docs

How It Works

The end-to-end flow from dataset publish to royalty payout, explained in plain language.

LICEN has two roles: Dataset Owners (publishers) and AI Researchers (buyers). This page follows the full lifecycle, from first upload to final payout, across 0G Storage, 0G Chain, and a simulated 0G Compute run. The production plan is to replace the simulation with a 0G-compatible confidential compute node that supports attestation-gated key release.

The Flow at a Glance

Dataset Owner                             AI Researcher
     │                                          │
     ▼                                          │
Encrypts dataset locally                        │
     │                                          │
     ▼                                          │
Sets policy (pricing, run caps,                 │
session rules, allowed uses)                    │
     │                                          │
     ▼                                          │
Publishes to 0G Storage ──────────────────► Browses marketplace
     │                                          │
     ▼                                          ▼
DataPolicy contract                    Pays escrow on-chain
anchors dataset on-chain ──────────────► (exact epochs × price)
     │                                          │
     │          ┌── AccessGranted event ────────┘
     │          ▼
     │    Orchestrator picks up the job
     │          │
     │          ▼
     │    Verifies access grant
     │    Simulates 0G Compute dispatch today
     │    Future: releases key only to attested TEE
     │          │
     │          ▼
     │    Training lifecycle runs/simulates securely
     │          │
     │          ▼
     │    Attestation confirmed on-chain
     │          │
     ▼          ▼
Royalty paid ◄── Settlement tx     Researcher receives model
automatically                       (LoRA adapter on 0G Storage)

Step 1: Publish a Dataset

The dataset owner visits the Publish page and uploads their JSONL file.

Before anything leaves the browser:

  • A random AES-256-GCM key is generated locally
  • The dataset is encrypted with that key
  • The key itself is wrapped using the Orchestrator's public key (ECIES envelope)

The encrypted blob and the sealed key envelope are uploaded to 0G Storage. A Merkle root (datasetRoot) is returned — this becomes the dataset's permanent, verifiable identity on-chain.
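
To illustrate why a Merkle root can serve as a dataset's identity, here is a minimal Python sketch. The chunk size, the use of SHA-256, and the rule for odd-sized layers are assumptions made for illustration; 0G Storage's actual Merkle construction may differ, and `merkle_root` is a hypothetical helper rather than an SDK function.

```python
import hashlib

CHUNK_SIZE = 256  # assumed chunk size, for illustration only


def merkle_root(blob: bytes, chunk_size: int = CHUNK_SIZE) -> str:
    """Return a hex Merkle root over fixed-size chunks of an encrypted blob."""
    # Hash each chunk to form the leaf layer (an empty blob gets one empty leaf).
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)] or [b""]
    layer = [hashlib.sha256(chunk).digest() for chunk in chunks]
    # Hash pairs of nodes until a single root remains.
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # assumed rule: duplicate the last node
        layer = [
            hashlib.sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0].hex()


encrypted_blob = bytes(range(256)) * 4               # stand-in for ciphertext
root_a = merkle_root(encrypted_blob)
root_b = merkle_root(encrypted_blob[:-1] + b"\x00")  # same blob, last byte changed
```

Changing a single byte of the blob produces a different root, which is what lets the on-chain datasetRoot act as a tamper-evident reference to the uploaded ciphertext.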

Your plaintext data never touches LICEN's servers. The encryption happens in your browser before the upload begins.

The owner then sets their policy. This policy is the dataset owner's control surface, not just a pricing form and not just a purpose tag. It defines the conditions under which the dataset may be used:

  • Price per epoch — how much a researcher pays per training pass
  • Max epochs per run — caps how much a single researcher can extract in one session
  • Max runs per requester — lifetime cap per wallet
  • Session TTL — how long an access grant is valid once approved
  • Allowed purposes — the permitted use case, such as medical research only or non-commercial fine-tuning
  • Policy expiry — when the dataset stops accepting new access requests

A single on-chain transaction anchors the datasetRoot, the policy hash, and the pricing rules to the DataPolicy smart contract on 0G Chain.
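
As a rough sketch of what "policy hash" can mean in practice, the Python below models the six policy fields and derives a deterministic digest from them. The field names, the JSON encoding, and the choice of SHA-256 are assumptions for illustration; the DataPolicy contract may encode and hash the policy differently (for example with keccak256).

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Policy:
    royalty_per_epoch: int       # price per epoch, in the chain's smallest unit
    max_epochs_per_run: int      # cap for a single run
    max_runs_per_requester: int  # lifetime cap per wallet
    session_ttl_seconds: int     # how long an access grant stays valid
    allowed_purposes: tuple      # permitted use cases
    policy_expiry: int           # Unix timestamp after which requests are refused


def policy_hash(policy: Policy) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal policies hash equally.
    # SHA-256 is an assumption here, not the contract's confirmed hash function.
    payload = json.dumps(asdict(policy), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


policy = Policy(
    royalty_per_epoch=1_000,
    max_epochs_per_run=5,
    max_runs_per_requester=3,
    session_ttl_seconds=3600,
    allowed_purposes=("medical-research",),
    policy_expiry=1_900_000_000,
)
digest = policy_hash(policy)
```

Anchoring only the digest on-chain keeps the transaction small while still letting anyone check that an off-chain copy of the policy matches what the owner committed to.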


Step 2: Researcher Requests Access

The researcher finds the dataset on the Marketplace — populated in real time from on-chain events via the Envio indexer.

They choose how many epochs they want to train for. The UI calculates the total escrow:

escrow = royaltyPerEpoch × requestedEpochs

One transaction on 0G Chain locks that amount in the DataPolicy contract. The contract validates the request against the policy (epoch limits, session TTL, allowed purposes) and emits an AccessGranted event.
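
The request check and escrow formula can be sketched in Python as follows. This is an off-chain illustration only: `PolicyLimits`, `quote_escrow`, and the error messages are hypothetical names, the session TTL check is left out because the source does not say how it is evaluated at request time, and the real validation lives in the DataPolicy contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyLimits:
    royalty_per_epoch: int
    max_epochs_per_run: int
    max_runs_per_requester: int
    allowed_purposes: frozenset
    policy_expiry: int  # Unix timestamp


def quote_escrow(policy, requested_epochs, purpose, prior_runs, now):
    """Return the escrow owed, or raise ValueError if the request violates the policy."""
    if now >= policy.policy_expiry:
        raise ValueError("policy expired")
    if not 1 <= requested_epochs <= policy.max_epochs_per_run:
        raise ValueError("epoch count outside policy limits")
    if prior_runs >= policy.max_runs_per_requester:
        raise ValueError("run cap reached for this wallet")
    if purpose not in policy.allowed_purposes:
        raise ValueError("purpose not allowed")
    # escrow = royaltyPerEpoch × requestedEpochs, kept in integer units
    return policy.royalty_per_epoch * requested_epochs


limits = PolicyLimits(
    royalty_per_epoch=2_000,
    max_epochs_per_run=5,
    max_runs_per_requester=3,
    allowed_purposes=frozenset({"non-commercial"}),
    policy_expiry=2_000_000_000,
)
escrow = quote_escrow(limits, requested_epochs=3, purpose="non-commercial",
                      prior_runs=0, now=1_700_000_000)  # 2_000 × 3 = 6_000
```

Using integers for every amount mirrors how on-chain values are handled and avoids rounding differences between the UI quote and the amount the contract locks.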


Step 3: Orchestrator Executes the Job

The Orchestrator, LICEN's backend worker, continuously polls the Envio indexer for AccessGranted events.

In the current hackathon build, when it picks up a job, it:

  1. Verifies on-chain that the job state is Granted (not just relying on the indexer)
  2. Fetches the sealed key envelope for that datasetRoot
  3. Coordinates the demo-mode compute lifecycle
  4. Tracks simulated fine-tuning progress
  5. Produces a result reference for the UI and settlement flow
  6. Mirrors the 0G Compute task states researchers would expect
  7. Calls startJob() on-chain — state moves to Running

This demonstrates the product flow while avoiding a false claim that today's public 0G fine-tuning interface supports encrypted dataset key injection.
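
The verify-then-start part of that list (steps 1, 2, and 7) might look like the Python sketch below. `FakeChain` is an in-memory stand-in for the DataPolicy contract, and the event field names `jobId` and `datasetRoot` are assumptions; none of this is LICEN's actual orchestrator code.

```python
from enum import Enum


class JobState(Enum):
    GRANTED = "Granted"
    RUNNING = "Running"


class FakeChain:
    """In-memory stand-in for the DataPolicy contract (hypothetical interface)."""

    def __init__(self):
        self.jobs = {}

    def job_state(self, job_id):
        return self.jobs.get(job_id)

    def start_job(self, job_id):
        # Mirrors startJob(): only a Granted job may move to Running.
        if self.jobs.get(job_id) is not JobState.GRANTED:
            raise RuntimeError("job is not in Granted state")
        self.jobs[job_id] = JobState.RUNNING


def handle_access_granted(event, chain, envelopes):
    """Process one indexed AccessGranted event; return True if the job started."""
    job_id, dataset_root = event["jobId"], event["datasetRoot"]
    # Step 1: re-check the state on-chain rather than trusting the indexer.
    if chain.job_state(job_id) is not JobState.GRANTED:
        return False
    # Step 2: look up the sealed key envelope for this dataset.
    if dataset_root not in envelopes:
        return False
    # Steps 3 to 6 (the simulated compute lifecycle) are omitted in this sketch.
    # Step 7: record the transition on-chain.
    chain.start_job(job_id)
    return True


chain = FakeChain()
chain.jobs["job-1"] = JobState.GRANTED
envelopes = {"0xabc": b"sealed-key-envelope"}
event = {"jobId": "job-1", "datasetRoot": "0xabc"}

started = handle_access_granted(event, chain, envelopes)   # first delivery
replayed = handle_access_granted(event, chain, envelopes)  # duplicate delivery
```

Because the on-chain state is checked before anything else, a duplicate or stale event from the indexer is ignored instead of starting the same job twice.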

Production Compute Plan

The production architecture is a 0G-compatible confidential compute node:

  1. The node boots a training container inside a TEE/CVM.
  2. The node generates a remote attestation quote and an ephemeral public key inside the TEE.
  3. LICEN verifies the quote against approved hardware, image hash, and training code hash.
  4. Only then is the dataset AES key released, encrypted to that TEE public key.
  5. The node downloads the encrypted dataset from 0G Storage and decrypts it only inside the TEE.
  6. Training runs inside the attested runtime.
  7. The node encrypts the model output, uploads it to 0G Storage, and signs a result manifest.
  8. The contract settles royalties using the result hash and attestation reference.

This is the path that makes the stronger privacy guarantee defensible: the raw dataset is not handed to the researcher, the web app, or a normal orchestrator process.
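
Steps 3 and 4 of the plan, verifying the quote and only then releasing the key, can be sketched in Python as below. Everything here is illustrative: the `Quote` fields, the allowlist values, and `placeholder_wrap` are hypothetical, real quote verification depends on the TEE vendor's tooling, and the placeholder does not perform any encryption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    hardware_id: str
    image_hash: str
    code_hash: str
    tee_public_key: bytes
    signature_valid: bool  # stand-in for real quote signature verification


# Hypothetical allowlists; real values would come from LICEN's configuration.
APPROVED_HARDWARE = {"example-tee-v1"}
APPROVED_IMAGES = {"aa" * 32}
APPROVED_CODE = {"bb" * 32}


def release_key(quote, dataset_key, wrap):
    """Return the dataset key wrapped for the TEE, or None if the quote is rejected."""
    checks = (
        quote.signature_valid,
        quote.hardware_id in APPROVED_HARDWARE,
        quote.image_hash in APPROVED_IMAGES,
        quote.code_hash in APPROVED_CODE,
    )
    if not all(checks):
        return None
    # The key is only ever handed to `wrap`, which must encrypt it to the TEE key.
    return wrap(quote.tee_public_key, dataset_key)


def placeholder_wrap(public_key, key):
    # NOT encryption: marks where public-key wrapping would happen.
    return b"<ciphertext for " + public_key + b">"


good = Quote("example-tee-v1", "aa" * 32, "bb" * 32, b"tee-pub", True)
bad = Quote("example-tee-v1", "aa" * 32, "cc" * 32, b"tee-pub", True)

released = release_key(good, b"\x00" * 32, placeholder_wrap)
refused = release_key(bad, b"\x00" * 32, placeholder_wrap)
```

The point of the structure is that a single failed check, such as an unapproved training code hash, stops the key from being wrapped at all.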


Step 4: Training & Settlement

In the current demo, LICEN simulates the training lifecycle and completion signals. In production, the 0G-compatible confidential provider trains inside a measured TEE. When training completes:

  • The LoRA adapter is uploaded to 0G Storage
  • A resultHash and attestationRef are stored on-chain

The Orchestrator calls confirmTrainingComplete(). The DataPolicy contract releases:

  • Royalty to the dataset owner — exactly actualEpochs × royaltyPerEpoch
  • Refund to the researcher — any unspent escrow if actual < requested epochs

The researcher can now download their fine-tuned model from 0G Storage.
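
The settlement split described above is simple integer arithmetic. The Python sketch below uses a hypothetical `settle` helper and made-up amounts to show it; the real calculation happens inside the DataPolicy contract.

```python
def settle(escrow, royalty_per_epoch, actual_epochs):
    """Split escrow into (royalty, refund) using integer arithmetic."""
    royalty = actual_epochs * royalty_per_epoch
    if royalty > escrow:
        raise ValueError("royalty exceeds escrow")
    return royalty, escrow - royalty


# Researcher escrowed 5 epochs at 2_000 each, but training stopped after 3.
royalty, refund = settle(escrow=10_000, royalty_per_epoch=2_000, actual_epochs=3)
# royalty is 6_000 and refund is 4_000
```

When actual and requested epochs match, the refund is zero and the whole escrow goes to the dataset owner.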


Security Properties

| What is enforced | How |
| --- | --- |
| Dataset stays encrypted until payment confirms | AES key is only eligible for release after on-chain Granted state is verified |
| Researcher can't exceed policy limits | DataPolicy contract rejects transactions that violate epoch/run/TTL caps |
| Royalties are automatic | Settlement is a contract state transition: no invoicing, no trust |
| Training is verifiable | Demo task UUID stored as attestationRef today; production upgrade stores a TEE attestation reference |
| No data leakage to LICEN web backend | ECIES means the web server cannot read the AES key |
| Production confidential input handling | Planned 0G-compatible node releases dataset keys only after TEE attestation |
