# Embed

> Embeds the provided input text with ZeroEntropy embedding models.

Results are returned in the same order as the input texts. The embeddings are such that a query has high cosine similarity with the documents relevant to it.
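As an illustration of how these embeddings might be used, the following minimal sketch ranks documents against a query by cosine similarity. The helper names `cosine_similarity` and `rank_documents` are hypothetical (they are not part of the ZeroEntropy API), and the toy 3-dimensional vectors merely stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, document_embeddings):
    """Return document indices sorted from most to least similar."""
    scores = [cosine_similarity(query_embedding, d) for d in document_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (illustrative values only).
query = [1.0, 0.0, 0.0]
documents = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, documents)
```

Because results come back in input order, the indices in `ranking` can be mapped directly back to the original document list.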

By default, each organization has a rate limit of `500000` bytes per minute. Rate limits are refreshed every 15 seconds. If this limit is exceeded, requests are throttled into `latency: "slow"` mode, which allows up to `5000000` bytes per minute. If that limit is also exceeded, the API returns a `429` error.

The "bytes" used by a request are calculated as `sum(150 + len(s.encode('utf-8')) for s in input)`. Each input string incurs a baseline overhead of `150` bytes on top of its UTF-8 length. The maximum per-request payload size is `5000000` bytes.
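The byte accounting can be sketched in Python as follows. The function names `request_bytes` and `split_into_batches` are hypothetical helpers, not part of any ZeroEntropy SDK. The batching helper also assumes that the per-request payload limit is measured with the same formula, which the text above does not state explicitly.

```python
PER_INPUT_OVERHEAD = 150       # baseline bytes charged per input string
MAX_REQUEST_BYTES = 5_000_000  # maximum per-request payload size


def request_bytes(inputs):
    """Bytes counted for a list of input strings: overhead plus UTF-8 length."""
    return sum(PER_INPUT_OVERHEAD + len(s.encode("utf-8")) for s in inputs)


def split_into_batches(inputs, max_bytes=MAX_REQUEST_BYTES):
    """Greedily group inputs so each batch stays within max_bytes.

    Assumption: the payload limit uses the same accounting as request_bytes.
    """
    batches, current, current_bytes = [], [], 0
    for s in inputs:
        cost = PER_INPUT_OVERHEAD + len(s.encode("utf-8"))
        if cost > max_bytes:
            raise ValueError("a single input exceeds the per-request limit")
        if current and current_bytes + cost > max_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(s)
        current_bytes += cost
    if current:
        batches.append(current)
    return batches
```

Note that non-ASCII characters count for more than one byte: `"héllo"` is 6 bytes in UTF-8, so it is charged 156 bytes in total.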

To increase your ratelimits, subscribe to a higher tier on the [ZeroEntropy dashboard](https://dashboard.zeroentropy.dev/billing). Any payments made for subscriptions in a calendar month will be deducted from your usage charges for that month.

To request even higher ratelimits, please contact [founders@zeroentropy.dev](mailto:founders@zeroentropy.dev) or message us on [Discord](https://go.zeroentropy.dev/discord) or [Slack](https://go.zeroentropy.dev/slack)!



## OpenAPI

````yaml /api-reference/openapi.json post /models/embed
openapi: 3.1.0
info:
  title: ZeroEntropy API
  description: This API provides access to ZeroEntropy's SoTA retrieval pipeline. Enjoy!
  version: 0.1.0
servers:
  - url: https://api.zeroentropy.dev/v1
    description: ZeroEntropy API
  - url: https://eu-api.zeroentropy.dev/v1
    description: ZeroEntropy API (EU datacenters)
security: []
paths:
  /models/embed:
    post:
      tags:
        - Models
      summary: Embed
      description: >-
        Embeds the provided input text with ZeroEntropy embedding models.


        Results are returned in the same order as the input texts. The
        embeddings are such that a query has high cosine similarity with the
        documents relevant to it.


        By default, each organization has a rate limit of `500000` bytes per
        minute. Rate limits are refreshed every 15 seconds. If this limit is
        exceeded, requests are throttled into `latency: "slow"` mode, which
        allows up to `5000000` bytes per minute. If that limit is also
        exceeded, the API returns a `429` error.


        The "bytes" used by a request are calculated as `sum(150 +
        len(s.encode('utf-8')) for s in input)`. Each input string incurs a
        baseline overhead of `150` bytes on top of its UTF-8 length. The
        maximum per-request payload size is `5000000` bytes.


        To increase your ratelimits, subscribe to a higher tier on the
        [ZeroEntropy dashboard](https://dashboard.zeroentropy.dev/billing). Any
        payments made for subscriptions in a calendar month will be deducted
        from your usage charges for that month.


        To request even higher ratelimits, please contact
        [founders@zeroentropy.dev](mailto:founders@zeroentropy.dev) or message
        us on [Discord](https://go.zeroentropy.dev/discord) or
        [Slack](https://go.zeroentropy.dev/slack)!
      operationId: embed_models_embed_post
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/EmbedRequest'
        required: true
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/EmbedResponse'
        '404':
          description: Not Found
          content:
            application/json:
              example:
                detail: Description of Error
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - HTTPBearer: []
components:
  schemas:
    EmbedRequest:
      properties:
        model:
          type: string
          title: Model
          description: 'The model ID to use for embedding. Options are: ["zembed-1"]'
        input_type:
          type: string
          enum:
            - query
            - document
          title: Input Type
          description: The input type. For retrieval tasks, either `query` or `document`.
        input:
          anyOf:
            - type: string
            - items:
                type: string
              type: array
          title: Input
          description: The string, or list of strings, to embed.
        dimensions:
          anyOf:
            - type: integer
            - type: 'null'
          title: Dimensions
          description: >-
            The output dimensionality of the embedding model. For `zembed-1`,
            the available options are: [2560, 1280, 640, 320, 160, 80, 40].
        encoding_format:
          type: string
          enum:
            - float
            - base64
          title: Encoding Format
          description: >-
            The output format of the embedding. If `float`, an array of floats
            will be returned for each embedding. If `base64`, an f32
            little-endian byte array will be returned, encoded as a base64
            string. `base64` is significantly more efficient than `float`. The
            default is `float`.
          default: float
        latency:
          anyOf:
            - type: string
              enum:
                - fast
                - slow
            - type: 'null'
          title: Latency
          description: >-
            Whether the call will be run in "fast" or "slow" mode. Rate limits
            for slow API calls are orders of magnitude higher, but you can
            expect 2-20 second latency. Fast calls are guaranteed subsecond
            latency, but their rate limits are lower. If not specified, a
            "fast" call is attempted first; if you have exceeded your fast rate
            limit, a slow call is executed instead. If explicitly set to
            "fast", a 429 is returned when the call cannot be executed fast.
      type: object
      required:
        - model
        - input_type
        - input
      title: EmbedRequest
    EmbedResponse:
      properties:
        results:
          items:
            $ref: '#/components/schemas/EmbedResult'
          type: array
          title: Results
          description: The list of embedding results.
        usage:
          allOf:
            - $ref: '#/components/schemas/EmbedUsage'
          description: Statistics regarding the bytes and tokens used by the request.
      type: object
      required:
        - results
        - usage
      title: EmbedResponse
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    EmbedResult:
      properties:
        embedding:
          anyOf:
            - items:
                type: number
              type: array
            - type: string
          title: Embedding
          description: >-
            The embedding of the input text, as an array of floats. If `base64`
            format is requested, the embedding will instead be an f32
            little-endian byte array, encoded as a base64 string.
      type: object
      required:
        - embedding
      title: EmbedResult
    EmbedUsage:
      properties:
        total_bytes:
          type: integer
          title: Total Bytes
          description: >-
            The total number of bytes in the request. This is used for
            ratelimiting.
        total_tokens:
          type: integer
          title: Total Tokens
          description: The total number of tokens in the request. This is used for billing.
      type: object
      required:
        - total_bytes
        - total_tokens
      title: EmbedUsage
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
  securitySchemes:
    HTTPBearer:
      type: http
      description: >-
        The `Authorization` header must be provided in the format `Bearer
        <your-api-key>`.


        You can get your API Key at the
        [Dashboard](https://dashboard.zeroentropy.dev/)!
      scheme: bearer

````
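
To tie the schema together, here is a minimal sketch of a client using only the Python standard library. It builds a request matching `EmbedRequest` with `encoding_format: "base64"` and decodes each returned embedding from its f32 little-endian byte array. The helper names (`build_embed_request`, `decode_base64_embedding`, `embed`) are hypothetical and not part of an official SDK; error handling for `422` and `429` responses is omitted.

```python
import base64
import json
import struct
import urllib.request

# Server URL and path taken from the spec above.
API_URL = "https://api.zeroentropy.dev/v1/models/embed"


def build_embed_request(api_key, texts, input_type="document", dimensions=None):
    """Build (but do not send) a POST request for the embed endpoint."""
    payload = {
        "model": "zembed-1",
        "input_type": input_type,      # "query" or "document"
        "input": texts,                # a string or a list of strings
        "encoding_format": "base64",
    }
    if dimensions is not None:
        payload["dimensions"] = dimensions
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def decode_base64_embedding(encoded):
    """Decode a base64 string holding little-endian f32 values."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


def embed(api_key, texts, input_type="document"):
    """Send the request and return one list of floats per input text."""
    request = build_embed_request(api_key, texts, input_type)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [decode_base64_embedding(r["embedding"]) for r in body["results"]]
```

A call such as `embed(api_key, ["first document", "second document"])` would return the embeddings in the same order as the inputs, assuming a valid API key.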