Designing an AI Procurement RFP for the Omani Public Sector
The folder marked "AI tender, draft v3" lands on a procurement officer's desk with the cover sheet pulled from a 2019 software-licensing RFP. By page eight the template asks for support hours and a maintenance window, by page twelve it asks for source code escrow, and by page sixteen it has run out of vocabulary. Nowhere does it mention model lineage, training data, evaluation methodology, or the exit obligation on fine-tuned weights. This is the problem with reusing generic IT templates for an Oman government AI RFP. The artefact you are buying is not static software, and the clauses that protect a sovereign buyer of static software do not, by themselves, protect a sovereign buyer of a learned model. This piece walks through a defensible RFP skeleton, anchored to on-premise AI for sovereign institutions.
Why generic-IT RFP templates fail for AI
A traditional IT RFP assumes the deliverable is a frozen artefact: a binary, a database schema, a configuration. Three assumptions break for AI workloads.
- The artefact is learned, not authored. Weights are produced from a training corpus the buyer rarely sees. Lineage matters more than line-of-code provenance.
- Behaviour drifts. The same model with the same prompt can produce different outputs as retrieval indexes update or as the runtime is patched. SLA language has to address response quality, not only response time; a sketch of what that check looks like follows this list.
- Exit is heavy. Fine-tuned weights, retrieval embeddings, and evaluation harnesses are paid-for assets that buyers routinely fail to extract on contract end, because the generic IT exit clause never asked for them.
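In contract terms, "response quality" only binds if it is a number: a score on a pinned evaluation suite, re-run on demand and compared against the acceptance-time baseline. A minimal sketch of that check follows, assuming an exact-match scorer as a stand-in for whatever task-specific rubric the tender actually pins; all names here are illustrative, not a prescribed implementation.

```python
def score_suite(outputs: dict[str, str], references: dict[str, str]) -> float:
    """Fraction of pinned prompts answered to the reference standard.

    Exact string match is a placeholder scorer; a production harness
    would substitute a task-specific rubric or classifier.
    """
    hits = [outputs[pid].strip() == ref.strip() for pid, ref in references.items()]
    return sum(hits) / len(hits)


def quality_regressed(current: float, baseline: float,
                      tolerated_drop: float = 0.05) -> bool:
    """True when quality falls more than the contracted tolerance below
    the acceptance-time baseline -- the event that starts the remediation
    clock under the SLA, exactly as an outage would."""
    return current < baseline - tolerated_drop
```

The design point is that `quality_regressed` returning true is contractually the same class of event as an outage: it triggers remediation, not a shrug about model non-determinism.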
The UK Government's Guidelines for AI Procurement, refreshed alongside the Procurement Act 2023 regime, were written precisely because reusing generic IT templates kept producing tenders that institutions could not later govern. Oman's public-sector procurement teams face the same drift, against the additional backdrop of PDPL Article 23 on cross-border data flows.
The 12 mandatory clauses an Omani public-sector AI RFP needs
Each clause below runs to a single page in the body of the RFP, with measurable acceptance criteria rather than aspirational language.
- Data residency. Training data, fine-tuning corpora, retrieval indexes, model weights, prompt and response logs, and operational telemetry all remain inside Oman. The vendor names the data centre, the storage subsystem, and the egress controls.
- Model lineage. Disclosure of base model identity, version, licence, training-data summary, and any subsequent pre-training or instruction-tuning steps applied by the vendor before delivery.
- Fine-tuning intellectual property. Weights produced from the institution's data, including LoRA adapters and full fine-tunes, are the institution's property, exportable on demand and on exit.
- Exit clause. A defined off-boarding path: export of weights, embeddings, configuration, audit logs, and evaluation harnesses in open formats, within a stated number of working days. No retention by the vendor.
- Indemnity. Vendor-side indemnity for training-data infringement claims, third-party patent claims on the model architecture, and regurgitation of training-time personal data, capped at a defined multiple of contract value.
- Service level agreement. Latency, availability, and a quality floor measured against a pinned evaluation suite. Quality regressions trigger remediation, not only outage credits.
- Evaluation methodology. The vendor co-signs a written evaluation harness, the test prompts, the scoring rubric, and the cadence of re-evaluation through the contract life. Methodology is part of the RFP, not a post-award discussion.
- Compliance attestation. A statement of applicability mapping MTCIT, ISO 27001 Annex A, and NIST SP 800-53 Rev. 5 controls, signed by an authorised officer.
- Software bill of materials. A full SBOM extended to AI components: base weights with hashes, fine-tuning datasets with provenance, retrieval indexes, runtime libraries, and the CUDA toolchain. CISA's 2025 minimum-elements draft sets the floor, with AI-specific extensions on top; a sketch of one such component record follows this list.
- Audit rights. The institution and its appointed auditor reserve the right to inspect logs, configurations, model artefacts, and physical premises on reasonable notice, throughout the contract and for a defined retention period after exit.
- Decommission. Cryptographic erasure of weights and indexes on contract end, witnessed and certified, with chain-of-custody for any media leaving the perimeter.
- Training data. Where the vendor proposes to train or fine-tune on institutional data, explicit consent boundaries, retention windows, deletion paths, and a refusal of any cross-customer training pool.
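To make the SBOM clause concrete, the record below shows the shape one AI-extended component entry might take. The field names are our illustration, not the CISA schema; the non-negotiable part is that every learned artefact carries an identity, a licence, and a hash that can be re-verified at audit.

```python
from dataclasses import dataclass, field


@dataclass
class AIBOMComponent:
    """One entry in an AI-extended bill of materials (illustrative schema)."""
    name: str                  # e.g. base model, LoRA adapter, retrieval index
    component_type: str        # "base_weights" | "fine_tune" | "dataset" | "runtime_lib"
    version: str
    licence: str
    sha256: str                # hash of the artefact as delivered, re-checked at audit
    provenance: str            # upstream release, vendor build, or institutional data
    depends_on: list[str] = field(default_factory=list)


# Example entry: the base weights pinned at tender time.
base_weights = AIBOMComponent(
    name="example-base-model",
    component_type="base_weights",
    version="2.1",
    licence="apache-2.0",
    sha256="<recorded at acceptance>",
    provenance="upstream release, pinned in the tender response",
)
```

Serialising a list of such records to a signed file at acceptance gives the audit-rights clause something concrete to re-verify five years later.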
Technical evaluation criteria
Evaluation is structured in three layers, scored before any commercial terms are opened.
- Automated evaluation suite. The institution prepares 200 to 500 representative prompts with reference answers, runs them against each shortlisted vendor under identical conditions, and scores the outputs for accuracy, calibrated refusal, and latency. The suite is preserved as a contract artefact for re-runs through the deployment life; a harness sketch follows this list.
- Human-rated round. The same prompts are scored by three to five domain experts blinded to vendor identity, against a pre-agreed rubric. Inter-rater agreement is itself reported; one common statistic for it is sketched below.
- Blind A/B on real workload. A sample of live institutional traffic is routed to each shortlisted vendor in parallel, with vendor identities masked from the end users completing the rating. This is the only step that shows how the model behaves on the institution's actual data distribution; a minimal blinded-routing sketch closes the examples below.
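A minimal harness for the first layer might look like the sketch below. It assumes each vendor exposes a `query(prompt) -> str` callable and that the suite is a list of (prompt, reference, should_refuse) triples; the exact-match and refusal heuristics are deliberate simplifications standing in for task-specific scorers.

```python
import time


def evaluate_vendor(query, suite):
    """Run one vendor over the pinned suite under identical conditions.

    `query(prompt) -> str` is the vendor's inference call; `suite` is a
    list of (prompt, reference, should_refuse) triples. The scorers here
    are placeholders -- a real harness substitutes task-specific rubrics.
    """
    correct, refusals, latencies = 0, 0, []
    for prompt, reference, should_refuse in suite:
        start = time.perf_counter()
        answer = query(prompt)
        latencies.append(time.perf_counter() - start)
        refused = answer.strip().lower().startswith("i cannot")
        if should_refuse:
            refusals += refused       # calibrated refusal: declines when it should
        else:
            correct += answer.strip() == reference.strip()
    answerable = sum(1 for _, _, flag in suite if not flag)
    unsafe = len(suite) - answerable
    latencies.sort()
    return {
        "accuracy": correct / max(answerable, 1),
        "calibrated_refusal": refusals / max(unsafe, 1),
        "p95_latency_s": latencies[int(0.95 * (len(latencies) - 1))],
    }
```

Because the same suite object is passed to every vendor and latency is measured at the same point, "identical conditions" stops being a promise and becomes a property of the harness.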
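For the human-rated round, the reported agreement statistic can be as simple as Fleiss' kappa over the rubric categories. The sketch below is one defensible choice among several (Krippendorff's alpha is another); the input layout is an assumption of this sketch, not a mandated format.

```python
def fleiss_kappa(ratings: list[list[int]]) -> float:
    """Fleiss' kappa for agreement among a fixed pool of raters.

    `ratings[i][j]` is the number of raters who put item i into rubric
    category j; every row must sum to the same rater count. Values near
    1 mean strong agreement; values near 0 mean chance-level scoring.
    """
    N = len(ratings)
    n = sum(ratings[0])                        # raters per item
    k = len(ratings[0])                        # rubric categories
    p_j = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    P_bar = sum(P_i) / N
    P_e = sum(p * p for p in p_j)
    return (P_bar - P_e) / (1 - P_e)


# Three raters, four items, three rubric bands (fail / pass / excellent).
table = [[0, 1, 2], [0, 3, 0], [1, 2, 0], [0, 0, 3]]
print(round(fleiss_kappa(table), 3))           # -> 0.415, moderate agreement
```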
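The blind A/B step needs only two mechanical guarantees: the traffic split is deterministic and auditable, and the arm-to-vendor mapping is held by the evaluation owner alone. A sketch, with illustrative names:

```python
import hashlib

# Held by the evaluation owner only; raters and end users ever see
# just "arm-a" or "arm-b", never a vendor name.
ARM_TO_VENDOR = {"arm-a": "vendor_x", "arm-b": "vendor_y"}  # illustrative


def assign_arm(request_id: str) -> str:
    """Deterministic 50/50 split keyed on the request id, so the same
    request always lands on the same arm and the split can be audited
    after the fact without unblinding anyone."""
    digest = hashlib.sha256(request_id.encode()).digest()
    return "arm-a" if digest[0] % 2 == 0 else "arm-b"
```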
Common vendor red flags
Five patterns recur in the responses procurement teams send back for clarification.
- Cloud region as sovereignty. A Gulf cloud region is named; the control plane and observability backend are not. The cross-border posture cannot be enforced.
- Opaque model lineage. The response refuses to name the base model, or names a version that has been deprecated, or quietly bundles a third-party reasoning model with no licence disclosure.
- Per-token pricing without a cap. Commercial exposure scales with usage and the institution has no upper bound. A defensible response provides per-token pricing and a contract-level cost ceiling; a worked example follows this list.
- Exit clauses that exclude fine-tunes. The vendor agrees to return data but treats the fine-tuned weights as proprietary. This makes the deployment non-portable by construction.
- Generic incident playbooks. No mention of prompt injection, training-data poisoning, hallucination cascades, or regurgitation of training-time personal data.
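The arithmetic behind the pricing red flag is worth making explicit. With illustrative figures only (none of these numbers are market rates), uncapped per-token exposure compounds quickly:

```python
# Illustrative figures only -- not market pricing.
price_per_1k_tokens = 0.004      # contracted rate, OMR per 1,000 tokens
tokens_per_request = 1_500       # prompt plus completion, averaged
requests_per_day = 40_000

daily_cost = requests_per_day * tokens_per_request / 1_000 * price_per_1k_tokens
annual_uncapped = daily_cost * 365          # OMR 87,600 at these assumptions

annual_ceiling = 50_000                     # the contract-level cap the clause demands
billable = min(annual_uncapped, annual_ceiling)
print(f"uncapped: OMR {annual_uncapped:,.0f}  billable: OMR {billable:,.0f}")
```

The point is not the specific numbers but the shape of the clause: the rate and the ceiling travel together, or the institution has signed an open-ended liability.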
Timeline and approval gates
An indicative path from "we need an AI tender" to a signed contract runs eighteen to twenty-six weeks. Approval gates sit at four points.
- Gate 1, scope and risk. Sponsoring institution and risk owner sign off on the use case, classification, and the twelve-clause RFP draft. Two to four weeks.
- Gate 2, market engagement. Optional pre-tender briefing with two or three candidate vendors to surface clarifications, with a published Q&A. Two to three weeks.
- Gate 3, technical evaluation. Tender publication, response window, automated evaluation suite, human-rated round, blind A/B, technical ranking. Eight to twelve weeks.
- Gate 4, commercial and contract. Commercial envelope opening, contract negotiation, MTCIT and PDPL sign-off, board or ministerial approval. Six to seven weeks.
Hosn ships sovereign AI as an on-premise appliance, with the twelve-clause skeleton, the pre-mapped statement of applicability, and the AI bill of materials produced as standard contract artefacts. Mu'een, Oman's national shared-AI platform, addresses a different layer of the stack and does not overlap with the institutional RFP described here. If your team is drafting an AI tender or evaluating responses already received, the practical step is a one-hour briefing with the twelve-clause RFP skeleton and the evaluation harness in hand. Email [email protected] or message +968 9889 9100. We come to you, in Muscat or anywhere in the GCC. Pricing is by quotation, sized to the deployment.
Frequently asked
Can a generic IT RFP template be reused for an AI procurement?
Not safely. A generic IT template treats software as a static deliverable, which an AI system is not. Model weights drift, training data shapes future behaviour, and prompt logs become evidence in regulatory inquiries. An AI RFP needs explicit clauses for data residency, model lineage, fine-tuning intellectual property, evaluation methodology, exit and decommission, and audit rights. Without those, the buyer has paid for a moving target with no clear way to inspect, replace, or retire it.
What is an AI software bill of materials and why does an Omani RFP need one?
An AI bill of materials extends the classical software bill of materials to learned components. It lists base model weights with version and licence, fine-tuning datasets with provenance, retrieval indexes, evaluation harnesses, and runtime libraries with their hashes. CISA's 2025 minimum-elements draft expressly anticipates this AI extension. For an Omani public-sector buyer, the AI bill of materials is the artefact that makes the model auditable five years after the original tender team has moved on.
How should the technical evaluation be structured?
Three layers. First, an automated evaluation suite of 200 to 500 representative prompts run against each shortlisted vendor in identical conditions, scored on accuracy, refusal behaviour, and latency. Second, a human-rated round on the same prompts by domain experts blinded to the vendor identity. Third, a blind A/B comparison on a sample of real workload, using the institution's actual users, with vendor names masked. Cost and commercial terms come last, after the technical merit ranking is locked.
What are the most useful red flags to spot in vendor responses?
Five recurring patterns. A cloud region cited as a sovereignty answer. Opaque model lineage, up to and including refusal to name the base model and version. Per-token pricing without a defined volume cap. Exit clauses that exclude fine-tuned weights and retrieval indexes from the deliverables on contract end. And generic incident playbooks with nothing on prompt injection, data poisoning, or regurgitation of personal data. Any one of these is grounds to send the response back for clarification before the technical evaluation begins.