Releases: google-ai-edge/LiteRT

v2.1.4

13 Apr 19:01

Release 2.1.4

Major Features and Improvements

  • Improved the C/C++ API surfaces to avoid using deprecated methods
  • Improved the stability of the CMake build system
  • Introduced Environment to the Python CompiledModel API to allow passing options

v2.1.3

17 Mar 20:44

Release 2.1.3

Major Features and Improvements

  • Updated CMake build rules to support both CompiledModel and Interpreter APIs.
      cmake_example/CMakeLists.txt shows how you can use both libraries.

  • Expanded MediaTek NPU support to all applicable Android versions
      (previously Android 15 only) by supporting the bundling of MediaTek
      libraries in the application binary.

  • Added experimental support for multi-threaded CompiledModel creation

  • Fixed GPU support for Python on Windows

  • Individual Options APIs (litert/cc/options) no longer use LiteRT C APIs (LiteRtXXX). They now use string-based serialization and LrtXXX() functions, which can be linked individually.

  • Added basic support for Broadcom VideoCore GPUs.
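
The string-based serialization mentioned for the individual Options APIs above can be pictured with a small round trip. This is only a sketch: the "key=value;key=value" wire format, the function names, and the option names are all assumptions made for illustration, and the release notes do not specify the format LiteRT actually uses.

```python
from typing import Dict


def serialize_options(options: Dict[str, str]) -> str:
    """Encode options as 'key=value' pairs joined by ';' in sorted key order."""
    return ";".join(f"{key}={value}" for key, value in sorted(options.items()))


def parse_options(payload: str) -> Dict[str, str]:
    """Decode a string produced by serialize_options back into a dict."""
    if not payload:
        return {}
    return dict(item.split("=", 1) for item in payload.split(";"))


# Hypothetical option names, used only to show the round trip.
gpu_options = {"precision": "fp16", "priority": "high"}
payload = serialize_options(gpu_options)
restored = parse_options(payload)
```

Passing a plain string across the boundary means each options type only needs its own encode and decode functions, which is consistent with the note that they can be linked individually. The sketch does not escape ';' inside values, which a real format would have to handle.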

Bug Fixes and Other Changes

  • Moved the experimental API GetProfiler() out of litert::CompiledModel.

  • Fixed a bug where caller-supplied CPU buffers were not always synced with
      the GPU Accelerator from the second inference onward.

  • Removed methods from litert::Event that use the C type LiteRtEnvironment.
      All C++ APIs should use the C++ litert::Environment instead.
      Also removed the CreateFromSyncFenceFd() method that doesn't accept
      litert::Environment.

  • Updated litert::Event::Type() to return C++ types instead of C types.

  • The LiteRT headers no longer define the following OpenCL type names in the
      global namespace when OpenCL is not supported: cl_mem, cl_event.
      These have been replaced with the type aliases LiteRtClMem and
      LiteRtClEvent, defined in a new header litert/c/litert_opencl_types.h.
      All of these symbols that didn't include the LiteRt prefix in their name
      were never intended to be part of the LiteRT API, and their presence
      in the global namespace risked conflicts with header files from other
      packages.

  • Likewise, and for the same reason, the LiteRT headers no longer define the
      following WebGPU type names in the global namespace when WebGPU is
      supported: struct WGPUBufferImpl, WGPUBuffer. These have been
      replaced with the type alias LiteRtWGPUBuffer which is defined in a
      new header file litert/c/litert_webgpu_types.h. Alternatively, apps
      using these symbols can get them from WebGPU's webgpu.h header file.

v2.1.2

28 Jan 18:49

Release 2.1.2

Major Features and Improvements

  • Added Desktop GPU support in the ai-edge-litert Python package for Linux (through WebGPU) and Mac (through Metal).
  • Released Windows ai-edge-litert Python package that supports CPU inference (WebGPU on Windows in Python is coming soon)

v2.1.1

27 Jan 19:07

Release 2.1.1

Bug Fixes and Other Changes

  • Fixed the macOS wheel build issue
  • Fixed the Qualcomm options passing issue in Kotlin API

v2.1.0

19 Dec 23:20

Release 2.1.0

Release 2.1.0 is the LiteRT beta release.

This milestone introduces full feature parity with TensorFlow Lite, stable LiteRT APIs, and critical performance enhancements for GPU and NPU acceleration. With this release, we officially recommend that developers begin their transition to LiteRT.

Major Features and Improvements

LiteRT Runtime

  • Custom ops are supported through the custom op dispatcher.
  • CMake builds are supported in addition to Bazel
  • Released LiteRT C++ SDK using prebuilt libLiteRt.so file
  • Added Profiler API in CompiledModel
  • Added ErrorReporter API in CompiledModel
  • Added ResizeInputTensor API in CompiledModel

LiteRT NPU

  • Introduced LiteRT Accelerator Test Suite for coverage and regression testing
  • Introduced LiteRT graph transformation APIs for compiler plugins
  • Qualcomm
    • Added support for Qualcomm Snapdragon Gen5
    • Added support for NPU JIT mode
    • LiteRT Op coverage improvements
  • MediaTek
    • Added support for NPU JIT mode
    • LiteRT Op coverage improvements

LiteRT GPU

  • Expanded GPU coverage with WebGPU/Dawn and OpenCL to Android, Linux, macOS, Windows, iOS, and IoT devices
  • Added asynchronous execution to the Metal and WebGPU backends
  • Improved performance and memory footprint
  • Added an option to control GPU inference priority
  • Better error handling (without crashing) on Delegation errors

LLM Support

  • Provided prebuilt Desktop GPU backends for Linux (x64, arm64), macOS (arm64), and Windows (x64)
  • Improved memory utilization when executing on GPUs
  • Published new LLMs on https://huggingface.co/litert-community
    • litert-community/FastVLM-0.5B
    • litert-community/Qwen3-0.6B
    • litert-community/embeddinggemma-300m with new NPU precompiled models
    • litert-community/gemma-3-270m-it with new NPU precompiled model
  • Published Function Gemma on https://huggingface.co/google
    • google/functiongemma-270m-it

LiteRT on Android

  • Added Interpreter API (CPU only) in the Maven v2.1.0+ packages
  • Added instructions for using the prebuilt CompiledModel C++ API from the Maven package.

Bug Fixes and Other Changes

  • Fixed the Android min SDK version; it is now 23.
  • LiteRT NPU: Fixed the partition algorithm for the case where the full model cannot be offloaded to the NPU.
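
For readers unfamiliar with partial offload, the idea behind partitioning can be sketched as follows. This is not LiteRT's algorithm: it assumes a purely linear op sequence (real graphs are DAGs), and the function name, op names, and support set are hypothetical. It only shows how a model that cannot be fully offloaded ends up as alternating NPU and CPU segments.

```python
from typing import Callable, List, Tuple


def partition_ops(
    ops: List[str], npu_supported: Callable[[str], bool]
) -> List[Tuple[str, List[str]]]:
    """Split a linear op sequence into maximal runs that share one target."""
    partitions: List[Tuple[str, List[str]]] = []
    for op in ops:
        target = "npu" if npu_supported(op) else "cpu"
        if partitions and partitions[-1][0] == target:
            # Same target as the previous op: extend the current segment.
            partitions[-1][1].append(op)
        else:
            # Target changed: start a new segment.
            partitions.append((target, [op]))
    return partitions


# Hypothetical op names and support set, for illustration only.
SUPPORTED = {"conv2d", "relu", "add"}
example = partition_ops(
    ["conv2d", "relu", "topk", "add"], lambda op: op in SUPPORTED
)
```

Here the unsupported "topk" op forces a CPU segment between two NPU segments, which is the situation the fix above concerns.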

Breaking Changes

  • Removed direct C headers usage. Users no longer need to include C headers.
  • TensorBuffer::CreateManaged() requires Environment always.
  • All TensorBuffer creation requires Environment except HostMemory types.
  • LiteRT C++ constructors are hidden. All LiteRT C++ objects should be created by Create() methods.
  • Moved internal-only C++ APIs (such as litert_logging.h) to litert/cc/internal
  • Removed Tensor, Subgraph, Signature access from litert::Model. Instead, users can access SimpleTensor and SimpleSignature.
  • The CompiledModel::Create() API no longer needs litert::Model. A CompiledModel can be created directly from a filename or a model buffer.
  • Users can access SimpleTensor and SimpleSignature from CompiledModel.
  • Annotation, Metrics APIs are removed from CompiledModel.
  • Removed individual OpaqueOptions creation. These OpaqueOptions objects are obtained by Options directly.
    • Options::GetCpuOptions()
    • Options::GetGpuOptions()
    • Options::GetRuntimeOptions()

v2.1.0rc1

21 Nov 23:35

Pre-release

Release 2.1.0rc1

Major Features and Improvements

  • NPU: Added support for Qualcomm Snapdragon Gen5
  • NPU: Added support for MediaTek Dimensity 9500
  • NPU: Added support for NPU JIT mode on Qualcomm and MediaTek

Bug Fixes and Other Changes

  • Fixed the Android min SDK version to 23.
  • NPU: Fixed the partition algorithm for the case where the full model cannot be offloaded to the NPU.

Breaking Changes

  • Removed direct C headers usage. Users no longer need to include C headers.
  • TensorBuffer::CreateManaged() requires Environment always.
  • All TensorBuffer creation requires Environment except HostMemory types.
  • LiteRT C++ constructors are hidden. All LiteRT C++ objects should be created by Create() methods.
  • Moved internal-only C++ APIs (such as litert_logging.h) to litert/cc/internal
  • Removed Tensor, Subgraph, Signature access from litert::Model. Instead, users can access SimpleTensor and SimpleSignature from CompiledModel.
  • The CompiledModel::Create() API no longer needs litert::Model. A CompiledModel can be created directly from a filename or a model buffer.
  • Annotation, Metrics APIs are removed from CompiledModel.
  • Removed individual OpaqueOptions creation. These OpaqueOptions objects are obtained by Options directly.
    • Options::GetCpuOptions()
    • Options::GetGpuOptions()
    • Options::GetRuntimeOptions()

v1.4.1

19 Nov 18:59

Release 1.4.1

Bug Fixes and Other Changes

  • Fixed Android minSDK version to 21

v2.0.3

13 Nov 18:31
b85cdce

Release 2.0.3

Major Features and Improvements

  • Added a Python backend for Google Tensor. The backend doesn't yet register itself, so it's available by default.
  • Changed the manufacturer to Google and the SoC models to include the Tensor_ prefix for Google Tensor.
  • Minor naming changes to some flags for the Google Tensor compiler plugin.

Bug Fixes and Other Changes

  • N/A

v2.0.2

17 Sep 18:34

Release 2.0.2

Major Features and Improvements

LiteRT GPU Accelerator

  • Added an option to control GPU inference priority.

LiteRT API Refactoring

  • Introduced the target litert/cc:litert_api_with_dynamic_runtime. This is a convenience Bazel target containing the LiteRT C++ and C APIs. Users of this library are responsible for bundling the LiteRT C API Runtime, libLiteRtRuntimeCApi.so.
  • C++ APIs that need LiteRT C API Runtime are moved to litert/cc/dynamic_runtime/
    Note: This is for internal usage. If you want to use dynamic API, use litert/cc:litert_api_with_dynamic_runtime.
  • All static public C++ APIs (including litert/cc/internal) are moved to litert/cc/
    Note: You shouldn't mix static API targets with dynamic API targets.

Bug Fixes and Other Changes

  • Fixed a segmentation fault error on //litert/tools:apply_plugin_test
  • Refactored example backend compiler plugin and dispatch implementation.
  • Improved LiteRT op coverage for Qualcomm and MediaTek backends.

v2.0.2a1

02 Sep 15:59

Pre-release

Release 2.0.2a1

LiteRT

Breaking Changes

  • com.google.ai.edge.litert.TensorBufferRequirements
    • It is now a data class, so all fields can be accessed directly without getter methods.
    • The type of the strides field changed from IntArray to List<Int> so that it is immutable.
  • com.google.ai.edge.litert.Layout
    • The type of the dimensions and strides fields changed from IntArray to List<Int> so that they are immutable.
  • Renamed the GPU option NoImmutableExternalTensorsMode to NoExternalTensorsMode
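
The effect of the data class and immutability changes above can be illustrated with a Python analogue. The real types are Kotlin classes; this sketch only borrows the Layout name and its two field names from the notes, and uses a frozen dataclass with tuples as stand-ins for a Kotlin data class with List<Int> fields.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layout:
    """Python stand-in for an immutable value type with directly readable fields."""

    dimensions: Tuple[int, ...]
    strides: Tuple[int, ...] = ()


layout = Layout(dimensions=(1, 224, 224, 3))

# Fields are read directly, with no getter method.
height = layout.dimensions[1]

# Reassigning a field on the frozen instance is rejected.
try:
    layout.dimensions = (1, 1, 1, 1)
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Callers that previously mutated the returned array in place would need to build a new value instead, which is the practical impact of this kind of change.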

Known Caveats

Major Features and Improvements

  • [tflite] Added error detection in TfLiteRegistration::init(). When a Delegate
    kernel returns TfLiteKernelInitFailed(), it is treated as a critical
    failure on the Delegate. The error is detected in
    SubGraph::ReplaceNodeSubsetsWithDelegateKernels(), which causes
    Delegate::Prepare() to fail, ultimately leading
    InterpreterBuilder::operator() or Interpreter::ModifyGraphWithDelegate() to
    return an error.
  • Added Profiler API in CompiledModel.
  • Added ErrorReporter API in CompiledModel.
  • Added ResizeInputTensor API in CompiledModel.
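
The failure chain described in the first item of the list above can be modeled in a few lines. This is a simplified simulation and not the TFLite implementation: the Python function names are stand-ins for the C++ functions named in the notes, and the sentinel object, the Status enum, and the succeeds flag are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    OK = 0
    ERROR = 1


# Stand-in for the value returned by TfLiteKernelInitFailed().
KERNEL_INIT_FAILED = object()


def delegate_kernel_init(succeeds: bool) -> object:
    """Models a delegate kernel's init(): returns state or the failure sentinel."""
    return {"state": "ready"} if succeeds else KERNEL_INIT_FAILED


def replace_node_subsets_with_delegate_kernels(succeeds: bool) -> Status:
    """Detects the sentinel and turns it into an error status."""
    if delegate_kernel_init(succeeds) is KERNEL_INIT_FAILED:
        return Status.ERROR
    return Status.OK


def delegate_prepare(succeeds: bool) -> Status:
    """Fails whenever node replacement fails."""
    return replace_node_subsets_with_delegate_kernels(succeeds)


def modify_graph_with_delegate(succeeds: bool) -> Status:
    """The caller-facing entry point surfaces the error instead of ignoring it."""
    if delegate_prepare(succeeds) is not Status.OK:
        return Status.ERROR
    return Status.OK
```

In this model, modify_graph_with_delegate(False) returns Status.ERROR, so a failed kernel init reaches the caller as an ordinary error status.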

Bug Fixes and Other Changes

  • The Android minSdkVersion has increased to 23.
  • Updated tests to provide kLiteRtHwAcceleratorNpu for fully AOT compiled
    models.