Skip to main content

April 23, 2024

Introducing more enterprise-grade features for API customers

Increasing enterprise support with more security features and controls, updates to our Assistants API, and tools to better manage costs.

More Enterprise Grade Features Hero Image

We work with many enterprises like Klarna, Morgan Stanley, Oscar, Salesforce, and Wix to help them build AI solutions from scratch and safely deploy AI across their organizations and products. We’re deepening our support for enterprises with new features that are useful for both large businesses and any developers who are scaling quickly on our platform.

Enhanced enterprise-grade security

We’ve introduced Private Link, a new way that customers can ensure direct communication between Azure and OpenAI while minimizing exposure to the open internet. We’ve also released native Multi-Factor Authentication(opens in a new window) (MFA) to help ensure compliance with increasing access control requirements. These are new additions to our existing stack of enterprise security features(opens in a new window) including SOC 2 Type II certification, single sign-on (SSO), data encryption at rest using AES-256 and in transit using TLS 1.2, and role-based access controls. We also offer Business Associate Agreements(opens in a new window) for healthcare companies that require HIPAA compliance and a zero data retention policy for API customers with a qualifying use case.

Better administrative control

With our new Projects(opens in a new window) feature, organizations will have more granular control and oversight over individual projects in OpenAI. This includes the ability to scope roles and API keys to specific projects, restrict/allow which models to make available, and set usage- and rate-based limits to give access and avoid unexpected overages. Project owners will also have the ability to create service account API keys, which give access to projects without being tied to an individual user.

More Enterprise Grade Features Product Demo-1

Assistants API improvements

We’ve introduced several updates to the Assistants API for more accurate retrieval, flexibility around model behavior and tools used to complete tasks, and better control over costs. These features include:

  • Improved retrieval with ‘file_search’ which can ingest up to 10,000 files per assistant—a 500x increase from the previous file limit of 20. The tool is faster, supports parallel queries through multi-threaded searches, and has enhanced reranking and query rewriting.

  • Streaming support for real-time, conversational responses—one of the top requests from developers and enterprises.
    New ‘vector_store’ objects in the API so files can be added to a vector store and automatically parsed, chunked, and embedded in preparation for file search. Vector stores can be used across assistants and threads, simplifying file management and billing.

  • Control over the maximum number of tokens used per run, plus limits on previous and recent messages used in each run, so you can manage token usage costs.
    New ‘tool_choice’ parameter to select a specific tool (like ‘file_search’, ‘code_interpreter’, or ‘function’) in a particular run.

  • Support for fine-tuned GPT-3.5 Turbo models in the API (to start, we’ll support fine-tunes of ‘gpt-3.5-turbo-0125’).

More Enterprise Grade Features Product Demo-2

More options for cost management

To help organizations scale their AI usage without over-extending their budgets, we’ve added two new ways to reduce costs on consistent and asynchronous workloads:

  • Discounted usage on committed throughput: Customers with a sustained level of tokens per minute (TPM) usage on GPT-4 or GPT-4 Turbo can request access to provisioned throughput to get discounts ranging from 10–50% based on the size of the commitment.

  • Reduced costs on asynchronous workloads: Customers can use our new Batch API(opens in a new window)

    to run non-urgent workloads asynchronously. Batch API requests are priced at 50% off shared prices, offer much higher rate limits, and return results within 24 hours. This is ideal for use cases like model evaluation, offline classification, summarization, and synthetic data generation.

We plan to keep adding new features focused on enterprise-grade security, administrative controls, and cost management. For more information on these launches, visit our
API documentation(opens in a new window) or get in touch with our team to discuss custom solutions for your enterprise.

Related research