Nowhere in the spec is it clear what the rules are for ID fields. The only "documentation" is in the validators.
For example, it seems reasonable that trip_id would only need to be unique within the combination of service and route IDs in the trips.txt file (in SQL terms, PRIMARY KEY (service_id, route_id, trip_id)). However, validators tell me that trip_id is required to be unique within trips.txt regardless of the combination of other ID fields (PRIMARY KEY (trip_id)), but the other ID fields, route_id and service_id, are not expected to be unique.
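To make the difference between the two readings concrete, here is a minimal sketch in Python. The rows are made-up sample data, and `duplicate_keys` is a hypothetical helper written for this illustration, not part of any validator.

```python
from collections import Counter

# Hypothetical rows from trips.txt: (route_id, service_id, trip_id)
trips = [
    ("R1", "WEEKDAY", "T1"),
    ("R1", "WEEKEND", "T1"),  # same trip_id, different service_id
    ("R2", "WEEKDAY", "T2"),
]


def duplicate_keys(rows, key):
    """Return the sorted key values that occur in more than one row."""
    counts = Counter(key(row) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Composite reading: PRIMARY KEY (service_id, route_id, trip_id)
composite_dupes = duplicate_keys(trips, lambda r: (r[1], r[0], r[2]))

# Validator reading: PRIMARY KEY (trip_id)
trip_id_dupes = duplicate_keys(trips, lambda r: r[2])

print(composite_dupes)  # no duplicates under the composite key
print(trip_id_dupes)    # "T1" is duplicated under the single-column key
```

The same file passes under the composite reading and fails under the validator reading, which is why the spec needs to state which one applies.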
Relying on implied rules causes further confusion. For example, service_id is the only ID field in calendar_dates.txt, yet it is not expected to be unique there (FOREIGN KEY (service_id)). That is surprising if you assume that an ID must be unique when it is the only ID in a file, or when its name closely matches the file name or the file's purpose.
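The foreign-key reading of service_id can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the interpretation described above: the table layout is simplified, the sample values are made up, and treating a `calendar` table as the referenced parent is an assumption, not something the spec states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE calendar (
    service_id TEXT PRIMARY KEY
);
-- Assumption: service_id here is a foreign ID, so it carries no UNIQUE constraint.
CREATE TABLE calendar_dates (
    service_id     TEXT    NOT NULL REFERENCES calendar(service_id),
    date           TEXT    NOT NULL,
    exception_type INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO calendar VALUES ('WEEKDAY')")

# The same service_id may appear on many rows of calendar_dates.
conn.executemany(
    "INSERT INTO calendar_dates VALUES (?, ?, ?)",
    [("WEEKDAY", "20240101", 2), ("WEEKDAY", "20240527", 2)],
)
repeated_rows = conn.execute(
    "SELECT COUNT(*) FROM calendar_dates WHERE service_id = 'WEEKDAY'"
).fetchone()[0]

# A service_id with no matching parent row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO calendar_dates VALUES ('UNKNOWN', '20240101', 1)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

Here the repeated service_id is accepted while the dangling one is refused, which is the behavior a "foreign ID" label would communicate to feed producers.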
Adapting language from relational databases, such as primary ID and foreign ID, might clarify the role of each ID in each file. Where a file has a multi-column primary key (multiple ID fields that together uniquely identify a row) and validators expect that constraint, it should be noted in the file information.
For example, the spec for stop_times.txt could look like this:
### stop_times.txt
File: **Required**
PRIMARY KEY(trip_id, stop_sequence)
| Field Name | Type | Required | Description |
| ------ | ------ | ------ | ------ |
| `trip_id` | foreign ID referencing `trips.trip_id` | **Required** | Identifies a trip. |