
AniBridge Mappings


AniBridge Mappings is an autogenerated dataset that links anime entries across major media providers. It is primarily designed for the AniBridge project to drive cross-database matching and episode sync, but it can be consumed by any tooling that needs consistent ID and episode mappings.

The mapping payload is generated by a Python pipeline that merges multiple trusted sources, applies curated overrides, validates episode ranges, and emits a serialization of the mappings schema.

Releases are published on a rolling basis on the GitHub Releases page.

A huge thank you to the primary mappings maintainer, @LuceoEtzio, for contributing over 4,000 mapping edits! ❤️

Key Features

  • Multi-source aggregation: Combines ID and episode data from multiple upstream projects.
  • Episode range mapping: Supports explicit ranges, open-ended ranges, and ratio-based mappings between providers.
  • Manual overrides: Curated edits are applied on top of automated sources.
  • Metadata-informed validation: Episode counts from metadata providers are used to validate and prune invalid ranges and infer missing mappings.
  • Compressed outputs: Supports minified and zstd-compressed payloads for efficient storage and transfer.

How It Works

  1. Fetch sources: Download upstream datasets and metadata feeds.
  2. Build ID graph: Collect cross-database ID links from sources.
  3. Collect metadata: Fetch relevant metadata (episode counts, durations, season info, etc.) from sources.
  4. Build episode graph: Normalize and merge episode mappings from all sources.
  5. Infer mappings: Use techniques like transitive closure and metadata alignment to infer missing episode mappings.
  6. Apply edits: Overlay mapping overrides from mappings.edits.yaml onto the aggregated data.
  7. Validate & prune: Validate episode ranges against metadata and remove invalid, overlapping, or inconsistent mappings.
  8. Emit schema: Serialize to the mappings.schema.json format.
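Step 5's transitive inference can be sketched in Python as a connected-components pass over the ID graph: if AniDB 1 links to AniList 2, and AniList 2 links to MAL 3, then AniDB 1 and MAL 3 belong to the same entry. This is a minimal illustration with a hypothetical function name, not the pipeline's actual implementation:

```python
from collections import defaultdict

def transitive_id_closure(links):
    """Group descriptors connected by ID links into components.

    Any two descriptors in the same component are (transitively)
    the same entry, even if no source linked them directly.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        # Depth-first walk to collect one connected component.
        stack, component = [node], set()
        while stack:
            cur = stack.pop()
            if cur in component:
                continue
            component.add(cur)
            stack.extend(graph[cur] - component)
        seen |= component
        components.append(component)
    return components

groups = transitive_id_closure([
    ("anidb:1", "anilist:2"),
    ("anilist:2", "mal:3"),          # implies anidb:1 <-> mal:3
    ("anidb:9", "tvdb_show:9:s1"),   # a separate entry
])
```

The real pipeline additionally aligns metadata (episode counts, durations) before trusting an inferred link, since a bad edge would otherwise merge unrelated entries.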

Data Sources

| Source | Metadata | ID Mappings | Episode Mappings | Providers |
| --- | --- | --- | --- | --- |
| Anime-Lists/anime-lists | No | Yes | Yes | AniDB, IMDB, TMDB, TVDB |
| manami-project/anime-offline-database | Yes | Yes | No | AniDB, AniList, MAL |
| notseteve/AnimeAggregations | Yes | Yes | No | AniDB, IMDB, MAL, TMDB |
| varoOP/shinkro-mapping | No | Yes | Yes | MAL, TMDB, TVDB |
| QLever | Yes | Yes | No | AniDB, AniList, IMDB, MAL, TMDB, TVDB |
| AniList GraphQL | Yes | Not Yet | No | AniList |
| TMDB API | Yes | No | No | TMDB |
| TVDB API | Yes | No | No | TVDB |

Note: "Not Yet" indicates potential future work.

Mappings Schema

For the purposes of this README, we only cover the subset of the schema that is relevant to this pipeline. To read about the full extent of the schema as used with AniBridge, see the custom mapping docs.

mappings.schema.json

The output is a JSON object where each key is a source descriptor and each value is a map of target descriptors. Descriptors use the format:

provider:id[:scope]
  • provider: one of anidb, anilist, imdb_movie, imdb_show, mal, tmdb_show, tmdb_movie, tvdb_show.
  • id: the provider-specific identifier (e.g. AniDB ID 1234 or IMDb ID tt1234567).
  • scope: optional; denotes some type of subsetting. For this dataset, most providers use season scopes like s0, s1, etc. AniDB uses episode-type scopes: R (regular), S (specials), O (other), plus any additional AniDB episode types when needed. AniList and MAL omit the scope entirely.
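The descriptor grammar above can be parsed with a few lines of Python. This is a hypothetical helper for consumers of the dataset (the names `Descriptor` and `parse_descriptor` are not part of the pipeline):

```python
from typing import NamedTuple, Optional

# Providers listed in the schema section of this README.
KNOWN_PROVIDERS = {
    "anidb", "anilist", "imdb_movie", "imdb_show",
    "mal", "tmdb_show", "tmdb_movie", "tvdb_show",
}

class Descriptor(NamedTuple):
    provider: str
    id: str
    scope: Optional[str]  # None when the provider omits scopes

def parse_descriptor(text: str) -> Descriptor:
    """Split 'provider:id[:scope]' into its three parts."""
    parts = text.split(":")
    if len(parts) == 2:
        provider, id_, scope = parts[0], parts[1], None
    elif len(parts) == 3:
        provider, id_, scope = parts
    else:
        raise ValueError(f"malformed descriptor: {text!r}")
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return Descriptor(provider, id_, scope)
```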

Each target descriptor maps source episode ranges to target ranges:

{
  "anidb:1:S": {
    "tvdb_show:2:s0": {}, // from anidb id 1, specials to tvdb id 2, season 0
    "tmdb_show:3:s1": {}  // from anidb id 1, specials to tmdb id 3, season 1
  },
  "mal:4": {
    "tmdb_show:5:s0": {}  // from mal id 4 (no scope) to tmdb id 5, season 0
  }
}

The value of each target descriptor is a map whose keys denote source ranges and whose values denote the corresponding target ranges. For the purposes of this dataset, keys and values define episode ranges. Ranges use the format:

x[-y][|ratio][,x2[-y2][|ratio2]...]
  • x: starting episode number (1-based).
  • y: optional ending episode number (inclusive). If omitted, the range is open-ended.
  • ratio: optional ratio indicating the 'weight' of each episode in a range. A positive ratio n indicates each episode spans n episodes in the opposing range. A negative ratio -n indicates each episode spans 1/n episodes in the opposing range.
  • Multiple ranges can be comma-separated to denote non-contiguous mappings. Note: non-contiguous ranges are only supported on the target side.
{
  "anidb:5:R": {
    "tvdb_show:6:s0": {
      "1-12": "1-12", // source episodes 1-12 map to target episodes 1-12
      "14-": "13-"    // source episodes 14 and onward map to target episodes 13 and onward
    },
    "anilist:7": {
      "1-12": "1-6,8-13", // source episodes 1-12 map to target episodes 1-6 and 8-13 (skipping 7)
      "13-": "14-|2"      // source episodes 13 and onward map to target episodes 14 and onward at double granularity
    }
  }
}
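The range grammar `x[-y][|ratio]` is regular, so it can be parsed with a single regular expression. The sketch below is a hypothetical consumer-side parser (not the pipeline's own code); it assumes a bare `x` with no dash means the single episode `x`, while `x-` means open-ended, matching the examples above:

```python
import re
from typing import NamedTuple, Optional

# x, optional -y (empty y = open-ended), optional |ratio (may be negative).
RANGE_RE = re.compile(r"(\d+)(?:-(\d*))?(?:\|(-?\d+))?")

class EpisodeRange(NamedTuple):
    start: int
    end: Optional[int]  # None means open-ended
    ratio: int          # 1 means one-to-one

def parse_ranges(text: str) -> list[EpisodeRange]:
    """Parse 'x[-y][|ratio][,x2[-y2][|ratio2]...]' into range tuples."""
    ranges = []
    for chunk in text.split(","):
        m = RANGE_RE.fullmatch(chunk)
        if not m:
            raise ValueError(f"malformed range: {chunk!r}")
        start = int(m.group(1))
        y = m.group(2)
        if y is None:        # "7"  -> single episode
            end = start
        elif y == "":        # "14-" -> open-ended
            end = None
        else:                # "1-12"
            end = int(y)
        ratio = int(m.group(3)) if m.group(3) else 1
        ranges.append(EpisodeRange(start, end, ratio))
    return ranges
```

For example, `parse_ranges("1-6,8-13")` yields two contiguous sub-ranges, and `parse_ranges("14-|2")` yields an open-ended range where each source episode spans two target episodes.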

Manual Edits

Mapping overrides live in mappings.edits.yaml. The format mirrors the schema structure: a source descriptor maps to target descriptors, which in turn map source ranges to target ranges.

Example:

anilist:12345: # Some comment about this mapping
  tvdb_show:98765:s1:
    "1-12": "1-12"
  tmdb_show:54321:s1:
    "1-12": "1-12"

When the pipeline runs, it removes any existing mappings between the specified source and target descriptors and replaces them with your entries.
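In dictionary terms, that replace-not-merge behavior can be illustrated like this (a hypothetical sketch; `apply_edit` is not the pipeline's actual function):

```python
def apply_edit(mappings, source, edits):
    """Overlay one edit entry onto the aggregated mappings.

    Each edited target descriptor is replaced wholesale, so stale
    automated ranges for that source/target pair do not survive.
    """
    targets = mappings.setdefault(source, {})
    for target, ranges in edits.items():
        targets[target] = dict(ranges)  # replace, don't merge
    return mappings

aggregated = {
    "anilist:12345": {
        "tvdb_show:98765:s1": {"1-13": "1-13"},  # stale automated mapping
    },
}

# Mirrors the YAML example above.
apply_edit(aggregated, "anilist:12345", {
    "tvdb_show:98765:s1": {"1-12": "1-12"},
    "tmdb_show:54321:s1": {"1-12": "1-12"},
})
```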

Running the Pipeline

The CLI entrypoint is main.py. Typical usage:

uv run ./main.py

Options:

  • --out: output file path (default: data/out/mappings.json)
  • --edits: path to the edits file (default: mappings.edits.yaml)
  • --compress: emit minified and zstd-compressed outputs to data/out/
  • --stats: emit stats.json to data/out/
  • --provenance: emit provenance.json with per-mapping timelines
  • --log-level: set logging verbosity (default: INFO)

Note: Fetching TMDB and TVDB metadata requires API tokens set in the TMDB_API_KEY and TVDB_API_KEY environment variables. Without them, metadata fetching for those providers is skipped.

Contributing

The best way to contribute is by fixing or adding mappings in mappings.edits.yaml. If you need to record why a mapping was changed, include a comment inside the mapping entry (not at the root level) so the formatter can preserve it.