30 Advanced GraphQL Backend Interview Questions for Senior Roles

June 25, 2025 · 18 min read

Core Concepts & Schema Design

1. What are Union and Interface types in GraphQL, and how do they differ?

Both are abstract types that enable a field to return one of several object types, but they have a key difference in structure.

An Interface defines a contract that implementing types *must* adhere to. All types that implement the interface must contain the fields defined by that interface. It’s useful when different types share a common set of fields (e.g., a `Character` interface with a `name` field implemented by `Human` and `Droid` types).
A Union is a looser collection of object types that do *not* need to share any common fields. It simply states that a field can return one of the specified types. It’s useful for returning completely unrelated types from a single field (e.g., a `SearchResult` union that can be a `User`, `Post`, or `Comment`).

2. Explain the purpose of custom scalars. Provide a practical use case.

GraphQL has built-in scalar types (`Int`, `Float`, `String`, `Boolean`, `ID`). Custom scalars allow you to define your own atomic leaf types with specific validation and serialization logic.

A practical use case is a `DateTime` scalar. On the server, you would define how to:

Serialize: Convert a server-side date object (e.g., a `Date` object in JavaScript) into a standardized string format (like ISO 8601) for the JSON response.
Parse Value: Parse an incoming string value (from a query variable) into a server-side date object.
Parse Literal: Parse a hardcoded string value from the query’s Abstract Syntax Tree (AST).

This ensures that date/time values are always handled consistently and correctly throughout your API.
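
The three hooks above can be sketched without any GraphQL library. This is a minimal illustration, not a production scalar: the `ValueNode` type is a simplified stand-in for the AST node a real server library would pass in, and `DateTimeScalar` and `toDate` are hypothetical names.

```typescript
// Simplified stand-in for a GraphQL AST value node (assumption for this sketch).
type ValueNode =
  | { kind: "StringValue"; value: string }
  | { kind: "IntValue"; value: string };

// Shared helper: turn a string into a Date, rejecting unparseable input.
function toDate(input: string): Date {
  const date = new Date(input);
  if (Number.isNaN(date.getTime())) {
    throw new TypeError(`DateTime cannot represent an invalid date string: ${input}`);
  }
  return date;
}

const DateTimeScalar = {
  name: "DateTime",
  // Server -> client: turn a Date into an ISO 8601 string for the JSON response.
  serialize(value: unknown): string {
    if (!(value instanceof Date) || Number.isNaN(value.getTime())) {
      throw new TypeError("DateTime can only serialize valid Date objects");
    }
    return value.toISOString();
  },
  // Client -> server: a value supplied through query variables.
  parseValue(value: unknown): Date {
    if (typeof value !== "string") {
      throw new TypeError("DateTime must be provided as a string");
    }
    return toDate(value);
  },
  // Client -> server: a value written inline in the query document.
  parseLiteral(ast: ValueNode): Date {
    if (ast.kind !== "StringValue") {
      throw new TypeError("DateTime literals must be strings");
    }
    return toDate(ast.value);
  },
};
```

In a real server these three functions would typically be handed to the GraphQL library's scalar type constructor; the validation logic stays the same.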

3. What are custom directives, and how can they be used on the backend?

Directives (`@directiveName`) are annotations that can be attached to parts of a GraphQL schema or query to alter how the server validates or executes them. The built-in `@skip`, `@include`, and `@deprecated` directives work this way.

On the backend, you can create custom directives to handle cross-cutting concerns. Examples include:

`@auth(requires: "ADMIN")`: A schema directive to enforce authorization rules on a field. The resolver logic for this directive would check the user’s role before executing the field’s actual resolver.
`@uppercase`: A field directive to transform the result of a string field to uppercase.
`@log`: A directive to log access to a specific field.

4. Compare schema-first vs. code-first approaches to building a GraphQL API.

Schema-First: You define your API contract by writing the schema in the GraphQL Schema Definition Language (SDL). The schema acts as the source of truth, and your server code (resolvers) is written to implement that schema. This is great for collaboration between frontend and backend teams.
Code-First: You define your schema programmatically using the constructs of a specific programming language (e.g., classes, decorators). The GraphQL schema is then generated from your code. This approach can offer better type safety between your schema and your code and may be faster for developers working in a single language.

5. How would you design a schema for pagination? Compare cursor-based vs. offset-based pagination.

Offset-based pagination (`limit`, `offset`) is simple but has drawbacks: it can be inefficient for large offsets (the database still has to scan and discard rows) and can miss or show duplicate data if items are added or removed while a user is paginating.

Cursor-based pagination is the recommended approach for GraphQL. It’s more robust and performant. You return an opaque `cursor` for each item. To get the next page, the client passes the cursor of the last item it received. The server then queries for items “after” that cursor. This approach is stateless and performs well regardless of page depth. A common implementation follows the Relay Connection Specification.
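
As a rough sketch of the cursor flow, the function below pages through an in-memory array and returns a Relay-style connection. The names (`paginatePosts`, `encodeCursor`, `decodeCursor`) are hypothetical, the array stands in for a database table, and the cursors are left as readable strings for brevity; real APIs usually encode them so clients treat them as opaque.

```typescript
interface Post { id: number; title: string; }
interface Edge<T> { cursor: string; node: T; }
interface Connection<T> {
  edges: Edge<T>[];
  pageInfo: { endCursor: string | null; hasNextPage: boolean };
}

// Cursor helpers (assumption: the cursor is simply derived from the row ID).
const encodeCursor = (id: number): string => `post:${id}`;
const decodeCursor = (cursor: string): number => {
  const id = Number(cursor.replace(/^post:/, ""));
  if (!Number.isInteger(id)) throw new Error(`Invalid cursor: ${cursor}`);
  return id;
};

// Similar in spirit to: SELECT ... WHERE id > :after ORDER BY id LIMIT :first + 1
function paginatePosts(allPosts: Post[], first: number, after?: string): Connection<Post> {
  const afterId = after === undefined ? -Infinity : decodeCursor(after);
  const candidates = allPosts
    .filter((post) => post.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, first + 1); // fetch one extra row to detect whether a next page exists
  const page = candidates.slice(0, first);
  const edges = page.map((node) => ({ cursor: encodeCursor(node.id), node }));
  return {
    edges,
    pageInfo: {
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
      hasNextPage: candidates.length > first,
    },
  };
}
```

Because each page is defined by "rows after this ID" rather than "skip N rows", inserts or deletes earlier in the list do not shift the page boundaries.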

6. How should you handle schema evolution and deprecation of fields?

GraphQL schemas are designed to be evolvable without breaking existing clients. The best practice is to never make breaking changes (like removing or renaming a field).

Adding Fields: Always safe. Old clients simply won’t request the new field.
Deprecating Fields: If a field needs to be replaced, mark the old one with the `@deprecated(reason: "…")` directive. This will show up in introspection tools, signaling to clients that they should migrate away from it, but it will continue to function for older clients.

Performance & Caching

7. What is the N+1 query problem in GraphQL and how is it solved?

The N+1 problem occurs when a query fetches a list of items (1 query) and then, for each of the N items, it makes an additional query to fetch a related piece of data. For example, getting 10 posts and then making 10 separate queries to get the author for each post.

This is solved using the **DataLoader pattern**. DataLoader is a utility that collects all the individual IDs from a single tick of the event loop, batches them into a single query (e.g., `SELECT * FROM users WHERE id IN (…)`), and then redistributes the results back to the individual resolvers that requested them. This turns N+1 queries into just 2 queries.
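
To make the batching idea concrete, here is a deliberately small loader written from scratch. It is not the DataLoader library's implementation or API; `TinyLoader` is a hypothetical class that only demonstrates the "collect keys in one tick, run one batch, fan results back out" mechanism, and it skips the per-request caching the real library also provides.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;
interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

class TinyLoader<K, V> {
  private readonly batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first load() in a tick schedules one flush; later loads join the same batch.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const pending = this.queue;
    this.queue = [];
    const uniqueKeys = Array.from(new Set(pending.map((p) => p.key)));
    try {
      // Contract: batchFn returns values in the same order as the keys it received.
      const values = await this.batchFn(uniqueKeys);
      const byKey = new Map<K, V>();
      uniqueKeys.forEach((key, index) => byKey.set(key, values[index]));
      pending.forEach((p) => p.resolve(byKey.get(p.key) as V));
    } catch (error) {
      pending.forEach((p) => p.reject(error));
    }
  }
}
```

In the posts-and-authors example, each `author` resolver would call `loader.load(post.authorId)`, and the batch function would issue the single `IN (…)` query.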

8. How can you protect a GraphQL API from expensive or malicious queries?

Because clients can request deeply nested data, a GraphQL API is vulnerable to Denial-of-Service (DoS) attacks. Protection strategies include:

Query Depth Limiting: Reject queries that exceed a certain nesting depth.
Query Complexity/Cost Analysis: Assign a “cost” to each field in your schema. Before executing a query, calculate its total cost and reject it if it exceeds a predefined threshold. This is more flexible than depth limiting.
Throttling/Rate Limiting: Limit the number of requests a client can make in a given time window.
Query Timeouts: Terminate queries that take too long to execute.
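
Depth limiting is the simplest of these to sketch. The example below assumes a simplified selection tree (`FieldSelection`) instead of a real parsed query AST, and it ignores fragments; the function names are hypothetical.

```typescript
// Simplified selection tree; a real server would walk the parsed query AST instead.
interface FieldSelection {
  name: string;
  selections?: FieldSelection[];
}

// Depth of the deepest field path in the selection tree.
function queryDepth(selections: FieldSelection[]): number {
  let max = 0;
  for (const field of selections) {
    const depth = 1 + queryDepth(field.selections ?? []);
    if (depth > max) max = depth;
  }
  return max;
}

// Intended to run before execution, so an over-deep query never reaches a resolver.
function assertDepthWithinLimit(selections: FieldSelection[], maxDepth: number): void {
  const depth = queryDepth(selections);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds the limit of ${maxDepth}`);
  }
}
```

Cost analysis follows the same shape: replace the `1 +` with a per-field weight and sum over children instead of taking the maximum.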

9. What are persisted queries and what benefits do they offer?

Persisted queries are a technique where instead of sending the full GraphQL query string with each request, the client sends a pre-registered hash or ID. The server maintains a mapping of these IDs to the full query strings.

Benefits:

Reduced Bandwidth: Sending a small ID is much cheaper than a large query string.
Improved Security: You can configure your server to only accept pre-registered queries, effectively creating an allowlist and preventing malicious or exploratory queries from being run in production.
Caching: Simplifies caching at the CDN/edge layer.
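
The allowlist behavior can be illustrated with a plain lookup table. The registry contents, the `getPost_v1` ID, and the function name are all hypothetical; in practice the ID is often a hash of the query text generated at build time.

```typescript
// Hypothetical registry built at deploy time: ID -> full query string.
const persistedQueries = new Map<string, string>([
  ["getPost_v1", "query GetPost($id: ID!) { post(id: $id) { id title } }"],
]);

interface PersistedRequest {
  id: string;
  variables?: Record<string, unknown>;
}

// Translate the client's short ID back into the query text to execute.
function resolvePersistedQuery(request: PersistedRequest): string {
  const query = persistedQueries.get(request.id);
  if (query === undefined) {
    // Allowlist mode: anything that was not registered ahead of time is refused.
    throw new Error(`Unknown persisted query: ${request.id}`);
  }
  return query;
}
```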

10. Describe different caching strategies for a GraphQL API.

Caching GraphQL is complex because of the flexible nature of queries. Strategies include:

Client-Side Caching: Libraries like Apollo Client and Relay maintain a sophisticated normalized cache on the client, which significantly reduces the need to re-fetch data.
HTTP Caching: For `GET` requests, standard HTTP caching (`Cache-Control`, `ETag`) can be used, especially with persisted queries.
Edge/CDN Caching: A CDN can cache full responses for public, non-authenticated queries.
Server-Side Caching: You can cache the results of individual resolver functions (e.g., using Redis). This requires careful cache invalidation logic.

11. What is the difference between a query and a mutation, from a conceptual and execution standpoint?

Conceptually, **queries** are for fetching data (read operations) and **mutations** are for changing data (write operations).

From an execution standpoint, the GraphQL spec allows the fields of a **query** to be executed in parallel, because they are assumed to be free of side effects. The top-level fields of a **mutation**, however, must be executed serially, in the order they are specified. This ensures that write operations happen in a predictable sequence and prevents race conditions between mutations in the same request.

Architecture & Microservices

12. Compare GraphQL Federation and Schema Stitching. When would you choose one over the other?

Both are strategies for combining multiple GraphQL services into a single, unified graph.

Schema Stitching: An older approach where a gateway server fetches the schemas from downstream services and “stitches” them together by creating new links and resolvers on the gateway itself. The gateway is responsible for a significant amount of coordination logic.
Apollo Federation: A more modern, declarative approach. Each underlying service (a “subgraph”) annotates its schema to indicate how it connects to other subgraphs. A gateway then composes these subgraphs automatically. The responsibility for defining relationships is distributed among the services themselves, making it more scalable and decentralized.

For new projects, **Federation** is generally the recommended approach due to its scalability and better separation of concerns.

13. What is the “context” object in a GraphQL resolver and what is it used for?

The `context` is an object that is created for each incoming GraphQL request and passed to every resolver that executes for that request. It’s a “bag” of shared data and utilities.

It’s primarily used to hold per-request state, such as:

Authentication information (e.g., the currently logged-in user).
Instances of data sources or services needed by the resolvers (e.g., a handle to a shared database connection pool).
Request-scoped caches or instances of DataLoader.

This avoids using global variables and is the primary way to provide dependencies to your resolver functions.

14. What are the four arguments passed to a typical resolver function?

A standard resolver function receives four arguments:

`parent` (or `root`): The result of the parent field’s resolver. For top-level fields, this is the root value provided to the execution.
`args`: An object containing the arguments passed to the field in the query.
`context`: The shared context object for the entire request.
`info`: An object containing information about the query’s execution state, including the schema, query AST, and path. It’s used for advanced use cases like building dynamic queries or implementing field-level caching.
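
A small typed sketch shows how the four arguments and a per-request context fit together. Everything here is illustrative: the `Context` shape, `createContext`, and the in-memory `userDb` are assumptions, and `info` is typed as `unknown` because the sketch never inspects it.

```typescript
interface User { id: string; name: string; }

interface Context {
  currentUser: User | null;
  // Hypothetical data-access helper, created per request.
  users: { findById(id: string): User | undefined };
}

type Resolver<Parent, Args, Result> =
  (parent: Parent, args: Args, context: Context, info: unknown) => Result;

const userDb: User[] = [{ id: "1", name: "Ada" }, { id: "2", name: "Linus" }];

// Built once per incoming request, e.g. from an already-verified auth token.
function createContext(authenticatedUserId: string | null): Context {
  const users = { findById: (id: string) => userDb.find((u) => u.id === id) };
  const currentUser =
    authenticatedUserId === null ? null : (users.findById(authenticatedUserId) ?? null);
  return { currentUser, users };
}

// Top-level field: `parent` is the root value, `args` carries the field arguments.
const userResolver: Resolver<undefined, { id: string }, User | null> =
  (_parent, args, context) => context.users.findById(args.id) ?? null;

// Reads per-request state from the context instead of a global variable.
const meResolver: Resolver<undefined, Record<string, never>, User | null> =
  (_parent, _args, context) => context.currentUser;

// Nested field: `parent` is the User object returned by the parent resolver.
const greetingResolver: Resolver<User, Record<string, never>, string> =
  (parent) => `Hello, ${parent.name}`;
```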

15. How would you design a GraphQL API to serve both a web client and a mobile client with different data needs?

This is a core strength of GraphQL. Instead of creating separate endpoints (like `/api/v1/web/posts` and `/api/v1/mobile/posts`), you would have a single, unified GraphQL schema. The web client would then query for the fields it needs, and the mobile client would query for the slightly different set of fields it needs. This avoids data over-fetching and under-fetching. If there are fields that should *only* be available to one client type, you can use custom directives (e.g., `@client(name: "web")`) to enforce authorization at the schema level.

16. What is the role of the `__typename` field?

The `__typename` field is a meta-field that GraphQL makes available on every object, interface, and union type, so you can include it in any selection set. Its value is a string with the name of the concrete object type that was returned. This is particularly useful on the client when querying fields that return an interface or a union type. By requesting `__typename`, the client can determine which specific concrete type was returned and handle it accordingly, for example, to render the correct UI component.
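
On a TypeScript client, `__typename` maps naturally onto a discriminated union. The sketch below assumes a search field returning the `User`, `Post`, and `Comment` types mentioned earlier; the field names on each type (`name`, `title`, `body`) are made up for illustration.

```typescript
// Shapes a client might receive when it selects __typename on each search result.
type SearchResult =
  | { __typename: "User"; name: string }
  | { __typename: "Post"; title: string }
  | { __typename: "Comment"; body: string };

// The switch narrows the union, so each branch only sees its own fields.
function describeResult(result: SearchResult): string {
  switch (result.__typename) {
    case "User":
      return `User: ${result.name}`;
    case "Post":
      return `Post: ${result.title}`;
    case "Comment":
      return `Comment: ${result.body}`;
  }
}
```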

Real-time with Subscriptions

17. How do GraphQL Subscriptions work at a high level?

Subscriptions maintain a long-lived connection between the client and server to push real-time updates. The typical flow is:

The client sends a subscription query to the server, usually over a WebSocket connection.
The server validates the query and “subscribes” the client to a specific event stream, often using a Pub/Sub mechanism (like Redis Pub/Sub).
When a relevant event occurs on the backend (e.g., a mutation creates a new message), the server publishes that event.
The subscription manager receives the event and pushes the corresponding data payload to all subscribed clients over their WebSocket connections.

18. What are some challenges of scaling a stateful subscription server?

Subscription servers are stateful because they must maintain an active WebSocket connection for each client. This poses challenges for horizontal scaling:

Load Balancing: Standard load balancers are designed for stateless HTTP requests. You need “sticky sessions” or a more sophisticated routing layer to ensure messages for a specific connection go to the correct server instance.
Pub/Sub Backend: You need a robust, external Pub/Sub system (like Redis or Kafka) to broadcast events across all server instances. A simple in-memory event emitter will not work in a multi-instance environment.
State Recovery: If a server instance crashes, all its WebSocket connections are lost. Clients need to have logic to automatically reconnect, and the system may need a way to resynchronize their state.

19. How does the resolver for a subscription field differ from a query or mutation resolver?

A query or mutation resolver returns data directly (or a promise of it). A subscription field is different: in JavaScript implementations such as graphql-js, its `subscribe` function must return an `AsyncIterator` (more precisely, an async iterable). The GraphQL engine consumes this iterator to listen for events. When the Pub/Sub system publishes a new event, the iterator yields a corresponding value, which is then sent to the client. So, instead of a “request-response” model, it’s a “subscribe-stream” model.
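
An async generator is one straightforward way to produce such an iterator. In this sketch, `fakeMessageStream` is a stand-in for a real Pub/Sub subscription (which would stay open rather than end after three events), and `messageAdded` is a hypothetical subscription function that filters events by room.

```typescript
interface Message { roomId: string; text: string; }

// Stand-in for a Pub/Sub event stream (assumption: a real one yields as events are published).
async function* fakeMessageStream(): AsyncGenerator<Message> {
  yield { roomId: "general", text: "hello" };
  yield { roomId: "random", text: "off-topic" };
  yield { roomId: "general", text: "world" };
}

// Subscription function: returns an async iterator instead of a single value.
async function* messageAdded(
  source: AsyncIterable<Message>,
  args: { roomId: string },
): AsyncGenerator<{ messageAdded: Message }> {
  for await (const message of source) {
    if (message.roomId === args.roomId) {
      // Each yielded payload corresponds to one result pushed to the client.
      yield { messageAdded: message };
    }
  }
}
```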

Security & Error Handling

20. What is the difference between Authentication and Authorization in a GraphQL context?

Authentication (who is the user?) is the process of verifying a user’s identity. This is typically handled *before* the GraphQL layer, often in middleware. The authenticated user’s details are then passed into the GraphQL `context` object for use by the resolvers.
Authorization (what can the user do?) happens *inside* the GraphQL layer. It’s the process of checking if the authenticated user has permission to access a specific field or perform a certain mutation. This logic can be placed in resolvers, business logic services, or declaratively using custom schema directives.

21. How should errors be handled in a GraphQL API?

GraphQL is designed to handle partial successes. A response can contain both a `data` field and an `errors` field. The `errors` field is an array of error objects, each containing a `message` and, where applicable, `locations` (where in the query document the error originated), `path` (which response field failed), and an optional `extensions` object.

Best practices include:

Throwing errors for exceptional situations (e.g., user not found).
Using the `extensions` object to provide structured error information, like a machine-readable `code` (e.g., `NOT_FOUND`) and validation details, so clients can handle errors programmatically.
Returning a union type (e.g., `union PostResult = Post | PostNotFoundError`) for predictable, non-exceptional “errors,” making them part of the schema.
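
The partial-success shape is easier to see in code. This sketch hand-rolls what a GraphQL engine does when one list item fails: the failed entry becomes `null` and a matching error is recorded. The helper names (`appError`, `toGraphQLError`, `executePosts`) and the in-memory `postsById` map are hypothetical.

```typescript
interface GraphQLErrorShape {
  message: string;
  path?: (string | number)[];
  extensions?: { code: string };
}
interface GraphQLResponse<T> { data: T; errors?: GraphQLErrorShape[]; }
interface Post { id: string; title: string; }

// An Error carrying a machine-readable code.
function appError(message: string, code: string): Error & { code: string } {
  return Object.assign(new Error(message), { code });
}

function hasCode(error: unknown): error is Error & { code: string } {
  return error instanceof Error && typeof (error as Error & { code?: unknown }).code === "string";
}

// Known errors keep their message and code; anything else is masked.
function toGraphQLError(error: unknown, path: (string | number)[]): GraphQLErrorShape {
  return hasCode(error)
    ? { message: error.message, path, extensions: { code: error.code } }
    : { message: "Internal server error", path, extensions: { code: "INTERNAL_SERVER_ERROR" } };
}

const postsById = new Map<string, Post>([["1", { id: "1", title: "Hello" }]]);

function loadPost(id: string): Post {
  const post = postsById.get(id);
  if (!post) throw appError(`Post ${id} not found`, "NOT_FOUND");
  return post;
}

// Resolves each post independently so one failure does not discard the others.
function executePosts(ids: string[]): GraphQLResponse<{ posts: (Post | null)[] }> {
  const errors: GraphQLErrorShape[] = [];
  const posts = ids.map((id, index) => {
    try {
      return loadPost(id);
    } catch (error) {
      errors.push(toGraphQLError(error, ["posts", index]));
      return null;
    }
  });
  return errors.length > 0 ? { data: { posts }, errors } : { data: { posts } };
}
```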

22. Should you disable introspection in production? Why or why not?

Disabling introspection in production is a common but debated security practice.

Arguments for disabling: It prevents attackers from easily discovering your entire API schema, making it harder for them to find potential vulnerabilities (“security through obscurity”).
Arguments against disabling: It breaks many legitimate developer tools like GraphiQL and Apollo Studio that rely on introspection.

A better approach is often to secure your API endpoint properly and rely on persisted queries or an allowlist, which prevents any arbitrary query from running, regardless of whether the schema is discoverable.

23. How do you prevent leaking internal implementation details or IDs in your schema?

It’s a best practice to not expose your database’s auto-incrementing integer IDs directly in your API. Instead, you can:

Use opaque, globally unique IDs (like UUIDs) for your public-facing IDs.
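Base64-encode a combination of the type name and the internal ID and expose the result through the built-in `ID` type (e.g., `User:123` becomes `VXNlcjoxMjM=`). This is a common pattern used by Relay-style servers. Note that Base64 only obscures the value; it is not encryption, so authorization checks are still required.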

Additionally, use API-specific DTOs or dedicated GraphQL types rather than exposing your database models or internal objects directly. This creates a clean separation between your API contract and your implementation details.

24. How can you implement field-level authorization?

Field-level authorization can be implemented in several ways:

In the Resolver: Add logic at the beginning of a field’s resolver to check the user’s permissions from the `context` object. This is simple but can lead to boilerplate.
In the Business Logic/Service Layer: The service that the resolver calls is responsible for checking permissions. This centralizes the logic.
Using Custom Directives: The cleanest, most declarative approach. You can create a schema directive like `@hasRole(role: "ADMIN")` and apply it directly to fields in your schema. The directive’s resolver logic then runs before the field’s resolver, enforcing the rule.

25. What is query batching on the client and how does it affect the backend?

Query batching is a client-side optimization (common in libraries like Apollo Client) where multiple individual GraphQL queries made within a short time window are bundled together and sent to the server in a single HTTP request. On the backend, you receive an array of query operations instead of a single one. Your server must be configured to handle this array, execute each query, and return an array of corresponding results. This reduces the number of HTTP requests but does not change the way resolvers are executed for each query (i.e., it does not solve the N+1 problem on its own).

26. How do you handle file uploads in GraphQL?

The most common way to handle file uploads is using the **GraphQL multipart request specification**. This involves sending a `multipart/form-data` request instead of a standard JSON request. The request contains the GraphQL query and a “map” that specifies which file(s) in the request correspond to which variable(s) in the GraphQL operation. On the server, a library or middleware (like `graphql-upload`) is needed to parse this multipart request and make the uploaded file available as a stream to the resolver.
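
The "map" step is the least obvious part, so here is a sketch of it in isolation. It assumes the multipart body has already been parsed into an `operations` object, a `map` object such as `{"0": ["variables.file"]}`, and a record of uploaded files keyed by form field name. The function names are hypothetical and this is not how any particular upload library is implemented.

```typescript
interface Operation {
  query: string;
  variables: Record<string, unknown>;
}

// Sets `value` at a dotted path such as "variables.files.0" inside the operations object.
function setAtPath(target: object, path: string, value: unknown): void {
  const keys = path.split(".");
  const last = keys.pop() as string;
  let cursor: any = target;
  for (const key of keys) {
    cursor = cursor[key];
    if (cursor === null || typeof cursor !== "object") {
      throw new Error(`Invalid map path: ${path}`);
    }
  }
  cursor[last] = value;
}

// Replaces the null placeholders in `operations` with the uploaded files named in `map`.
function applyUploadMap<F>(
  operations: Operation,
  map: Record<string, string[]>,
  files: Record<string, F>,
): Operation {
  const result: Operation = JSON.parse(JSON.stringify(operations)); // deep copy of plain JSON
  for (const fieldName of Object.keys(map)) {
    if (!(fieldName in files)) {
      throw new Error(`Missing file part "${fieldName}"`);
    }
    for (const path of map[fieldName]) {
      setAtPath(result, path, files[fieldName]);
    }
  }
  return result;
}
```

After this step the resolver sees the file object in its `args` exactly where the client left a `null` placeholder.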

27. What is the Global Object Identification specification?

It’s a convention, popularized by Relay, for providing a consistent mechanism for fetching objects by a globally unique ID. It involves:

A `Node` interface with a single field: `id: ID!`.
All major objects in your schema implement the `Node` interface.
A top-level `node(id: ID!): Node` query field that can refetch any object in the system given its global ID.

This provides a uniform way for clients to manage and refetch data from their local cache.
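
A sketch of the `node` field's resolver might look like this. The interface is called `GraphNode` here (rather than `Node`) purely to avoid clashing with the browser's built-in `Node` type in TypeScript; the in-memory maps and fetchers are hypothetical, and the global ID is left as a readable `Type:id` string, whereas Relay-style servers typically Base64-encode it.

```typescript
interface GraphNode { id: string; }
interface User extends GraphNode { name: string; }
interface Post extends GraphNode { title: string; }

const toGlobalId = (typeName: string, localId: string): string => `${typeName}:${localId}`;

function fromGlobalId(globalId: string): { typeName: string; localId: string } {
  const separator = globalId.indexOf(":");
  if (separator <= 0) throw new Error(`Malformed global ID: ${globalId}`);
  return { typeName: globalId.slice(0, separator), localId: globalId.slice(separator + 1) };
}

// In-memory stand-ins for per-type data sources, keyed by local ID.
const usersByLocalId = new Map<string, User>([["1", { id: toGlobalId("User", "1"), name: "Ada" }]]);
const postsByLocalId = new Map<string, Post>([["7", { id: toGlobalId("Post", "7"), title: "Hello" }]]);

// One fetcher per type that implements the Node interface.
const fetchers = new Map<string, (localId: string) => GraphNode | undefined>([
  ["User", (localId) => usersByLocalId.get(localId)],
  ["Post", (localId) => postsByLocalId.get(localId)],
]);

// Resolver for the top-level `node(id: ID!): Node` field.
function resolveNode(globalId: string): (GraphNode & { __typename: string }) | null {
  const { typeName, localId } = fromGlobalId(globalId);
  const fetcher = fetchers.get(typeName);
  const found = fetcher ? fetcher(localId) : undefined;
  // __typename lets the server (and client) tell which concrete type came back.
  return found ? { ...found, __typename: typeName } : null;
}
```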

28. Can you use `GET` requests for GraphQL queries? What are the pros and cons?

Yes, you can send queries via `GET` requests by URL-encoding the query string as a query parameter.

Pros:

Allows for simple HTTP caching by browsers, proxies, and CDNs.
Easy to debug and share, as the entire request is contained in the URL.

Cons:

Can be limited by URL length restrictions for very large queries.
Less secure, as query parameters are often logged.
According to the GraphQL over HTTP specification, mutations must not be executed via `GET`.

It’s a good choice for public, idempotent queries that can be cached.

29. What is resolver composition?

Resolver composition is the practice of combining multiple smaller functions or middleware to create a final resolver. This is often used to apply cross-cutting concerns like authentication, logging, or validation without cluttering the core business logic of the resolver. You might have a base resolver that gets the data, and wrap it with another function that checks permissions first. Libraries like `graphql-middleware` formalize this pattern.
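
Here is a minimal hand-written version of the pattern, independent of any library. The `Context` shape, `requireRole`, `withLogging`, and `compose` are all hypothetical names for this sketch; it does not reproduce the `graphql-middleware` API.

```typescript
interface Context { role: "ADMIN" | "USER" | null; log: string[]; }
type Resolver = (parent: unknown, args: Record<string, unknown>, context: Context) => unknown;
type Middleware = (next: Resolver) => Resolver;

// Rejects the call before the wrapped resolver runs if the role does not match.
const requireRole = (role: "ADMIN" | "USER"): Middleware => (next) => (parent, args, context) => {
  if (context.role !== role) throw new Error("FORBIDDEN");
  return next(parent, args, context);
};

// Records that the field was accessed, then delegates.
const withLogging = (label: string): Middleware => (next) => (parent, args, context) => {
  context.log.push(`resolving ${label}`);
  return next(parent, args, context);
};

// compose(a, b)(resolver) behaves like a(b(resolver)): the first middleware runs outermost.
const compose = (...middlewares: Middleware[]): Middleware => (next) =>
  middlewares.reduceRight<Resolver>((wrapped, middleware) => middleware(wrapped), next);
```

A base resolver such as `deleteUser` can then stay focused on business logic and be wrapped once, e.g. `compose(withLogging("deleteUser"), requireRole("ADMIN"))(deleteUser)`.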

30. What is the purpose of the `info` object in a resolver? Give a concrete example.

The `info` object contains the query’s Abstract Syntax Tree (AST) and other execution-level details. It allows for advanced, dynamic resolver logic.

A concrete example is optimizing a database query. By inspecting `info`, a resolver can see exactly which fields the client requested for a particular type. It can then dynamically build a SQL query that selects *only* those columns from the database, preventing over-fetching at the database layer. This is particularly useful when you have a large table with many columns, some of which are expensive to retrieve.
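
The column-selection idea can be sketched against a pared-down mirror of `info`. The `FieldNode` and `ResolveInfoLike` types below only model the parts used here, the `userColumns` mapping and the `users` table are invented for illustration, and fragments are ignored for brevity.

```typescript
// Pared-down mirror of the parts of `info` this sketch reads.
interface FieldNode {
  kind: "Field";
  name: { value: string };
  selectionSet?: { selections: FieldNode[] };
}
interface ResolveInfoLike { fieldNodes: FieldNode[]; }

// Hypothetical mapping from GraphQL field names to column names; it doubles as an allowlist.
const userColumns = new Map<string, string>([
  ["id", "id"],
  ["name", "full_name"],
  ["email", "email"],
  ["bio", "bio"],
]);

// Columns for the sub-fields the client selected under the current field.
function requestedColumns(info: ResolveInfoLike): string[] {
  const selections = info.fieldNodes[0]?.selectionSet?.selections ?? [];
  const columns = selections
    .map((field) => userColumns.get(field.name.value))
    .filter((column): column is string => column !== undefined); // skips e.g. __typename
  return Array.from(new Set(columns));
}

function buildUserQuery(info: ResolveInfoLike): string {
  const columns = requestedColumns(info);
  if (columns.indexOf("id") === -1) columns.unshift("id"); // always load the primary key
  return `SELECT ${columns.join(", ")} FROM users WHERE id = $1`;
}
```

Only names from the mapping ever reach the SQL string, so client-supplied field names are never interpolated directly.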
