Add OAuth 2.0 Device Flow backend for CLI authentication#11984
Add OAuth 2.0 Device Flow backend for CLI authentication#11984
Conversation
- Add OAuth device flow endpoints for CLI authentication - Implement device code storage and management - Create database migration for device codes table - Integrate with existing API key system - Return API keys directly instead of OAuth tokens for simplified flow Co-authored-by: openhands <openhands@all-hands.dev>
|
|
- Remove client_id and scope requirements from request models since endpoints are already authenticated - Add keycloak_user_id field to track which user initiated the device authorization - Update database schema to store keycloak_user_id instead of client_id/scope - Add user authentication dependency injection to endpoints using get_user_id - Add security check to ensure only the requesting user can authorize their device - Remove refresh_token handling as it's not needed for this flow - Simplify DeviceCode model and store methods by removing unnecessary OAuth fields Co-authored-by: openhands <openhands@all-hands.dev>
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The device authorization endpoint should not require authentication since the whole point of device flow is to authenticate unauthenticated devices. Changes: - Remove authentication requirement from /oauth/device/authorize endpoint - Make keycloak_user_id nullable in device_codes table (set during authorization) - Update DeviceCodeStore.create_device_code() to not require keycloak_user_id - Set keycloak_user_id during device authorization, not creation This follows the OAuth 2.0 Device Flow specification where: 1. Device authorization is unauthenticated 2. User verification requires authentication 3. Token polling is unauthenticated Co-authored-by: openhands <openhands@all-hands.dev>
The device authorization and token polling endpoints need to be public since they are used by unauthenticated devices during the OAuth flow. - /oauth/device/authorize - Device authorization (unauthenticated) - /oauth/device/token - Token polling (unauthenticated) The verification endpoint (/oauth/device/verify) remains authenticated as it requires user login to authorize the device. Co-authored-by: openhands <openhands@all-hands.dev>
Remove unnecessary grant_type validation to keep the authorization flow minimal and simple. The device_code is sufficient for token requests. Changes: - Remove grant_type field from DeviceTokenRequest model - Remove grant_type validation in token endpoint - Simplify client request to only send device_code Co-authored-by: openhands <openhands@all-hands.dev>
Fix two issues in OAuth Device Flow: 1. Timezone-aware datetime comparison error: - Handle timezone-naive datetime from database in is_expired() method - Ensure both sides of comparison are timezone-aware 2. Pydantic deprecation warning: - Replace deprecated .dict() calls with .model_dump() - Maintains compatibility with Pydantic V2 Fixes error: 'can't compare offset-naive and offset-aware datetimes' Co-authored-by: openhands <openhands@all-hands.dev>
1. Change verification route from /oauth/device/verify to /oauth/verify - Update router prefix and route paths - Update verification URI in device authorization response - Update HTML form action 2. Fix authorization logic to properly link user ID during verification - Remove incorrect user ID matching check that always failed - Device codes are created without user ID (keycloak_user_id=None) - User ID is linked to device code during authorization via authorize() method - Add proper user authentication check This fixes the 'You are not authorized to approve this device request' error that occurred because we were checking for user ID match before linking it. Co-authored-by: openhands <openhands@all-hands.dev>
1. Remove /oauth/verify from middleware whitelist - Route should require user authentication, not be public - Only /oauth/device/authorize and /oauth/device/token are public 2. Add authentication requirement to GET /oauth/verify - Both GET and POST routes now require user_id via Depends(get_user_id) - Users must be logged in to access device verification page 3. Remove redundant user authentication check in POST handler - FastAPI dependency injection guarantees user_id is valid - No need for manual None check This ensures users must be authenticated to authorize devices, which is the correct security model for OAuth 2.0 Device Flow verification. Co-authored-by: openhands <openhands@all-hands.dev>
1. Handle authentication manually in OAuth device routes - Change user_id parameter to Optional[str] in both GET and POST /oauth/verify - Add manual authentication check and redirect to login if not authenticated - Build proper login URL with redirect_uri parameter 2. Add /oauth/verify back to middleware whitelist - Route now handles authentication manually instead of relying on middleware - Allows unauthenticated access so we can control the redirect behavior This fixes the issue where unauthenticated users saw raw redirect HTML instead of being properly redirected to the login page. Co-authored-by: openhands <openhands@all-hands.dev>
1. Replace manual authentication with Keycloak OAuth flow - GET /oauth/verify now redirects to Keycloak for authentication - Encode user_code in JWT state parameter for callback processing - Remove POST /oauth/verify endpoint (no longer needed) 2. Add Keycloak callback endpoint - GET /oauth/keycloak-callback handles authentication completion - Decode JWT state to retrieve user_code - Get user info from Keycloak tokens - Automatically authorize device and create CLI API key 3. Update middleware whitelist - Add /oauth/keycloak-callback to public endpoints - Both verification endpoints now handle auth flow properly This follows the same pattern as Slack integration and provides a cleaner, more secure authentication flow for device authorization. Co-authored-by: openhands <openhands@all-hands.dev>
- Import HOST_URL from integrations.utils - Update authentication failure messages to match Slack integration pattern - Use styled link to OpenHands Cloud with consistent messaging - Apply to both Keycloak token and user info failure cases This provides users with clear guidance on how to resolve authentication issues during the device authorization flow. Co-authored-by: openhands <openhands@all-hands.dev>
- Change redirect_uri from request.url.netloc to HOST_URL pattern - Match Slack integration approach exactly - Use HOST_URL/oauth/keycloak-callback for both redirect and token exchange - This fixes 'Invalid parameter: redirect_uri' error from Keycloak The redirect URI must match what Keycloak expects, which is the configured HOST_URL rather than the dynamic request URL. Co-authored-by: openhands <openhands@all-hands.dev>
- Remove user_id column from migration (redundant with keycloak_user_id) - Remove access_token column from migration (should not be stored in DB) - Update DeviceCode model to remove user_id and access_token fields - Update DeviceCode.authorize() method to not store access_token - Update device_code_store.py to not pass access_token to authorize method - Update oauth_device.py to retrieve API key dynamically instead of storing it - Add helper methods to ApiKeyStore for retrieving/deleting by name and user_id - Fix bug where delete_api_key was called with name instead of key value Co-authored-by: openhands <openhands@all-hands.dev>
- Add DeviceCode model tests with parameterized expiration and status checks - Add DeviceCodeStore tests with parameterized lookup and authorization scenarios - Add OAuth2 device flow API endpoint tests with comprehensive error handling - Remove unused created_at field from DeviceCode model and migration - Remove cleanup_expired_codes method from DeviceCodeStore - Update conftest.py to include DeviceCode model for test database setup Tests cover: - Device code generation and validation - Authorization workflow (success/failure cases) - API endpoints (/authorize, /token, /verify, /keycloak-callback) - Error handling and edge cases - 32 tests total, all passing Co-authored-by: openhands <openhands@all-hands.dev>
- Remove unused cli_api_key variable in keycloak_callback - Fix import issues and formatting per ruff/mypy requirements - Add noqa comment for DeviceCode import in conftest.py (needed for DB table creation) - Fix trailing whitespace and end-of-file formatting - All 32 tests still passing, linting now clean Co-authored-by: openhands <openhands@all-hands.dev>
- Remove user_code validation from /verify endpoint before authentication - Move all database validation to happen after Keycloak authentication in callback - Update test to reflect new behavior where invalid codes redirect to Keycloak - This prevents potential probing of valid user codes without authentication Co-authored-by: openhands <openhands@all-hands.dev>
- Add session.expunge() to get_by_device_code and get_by_user_code methods - Properly detach objects from session before returning them - Update type annotations to use modern | None syntax instead of Optional - Prevents DetachedInstanceError and ensures clean session management Co-authored-by: openhands <openhands@all-hands.dev>
- Replace check-then-act pattern with database constraint enforcement - Use IntegrityError handling to retry on constraint violations - Remove vulnerable _generate_unique_codes method that had race condition - Add comprehensive tests for retry logic and constraint violations - Database already has unique indexes on device_code and user_code columns - Prevents race conditions where multiple processes could insert duplicate codes This fixes the classic Time-of-Check-Time-of-Use (TOCTOU) bug where: 1. Process A checks if codes exist (they don't) 2. Process B checks if codes exist (they don't) 3. Process A inserts codes 4. Process B inserts same codes -> constraint violation Now the database enforces uniqueness and we handle violations gracefully. Co-authored-by: openhands <openhands@all-hands.dev>
- Remove delete_api_key_by_name call to allow multiple concurrent devices - Create unique API keys per device using user_code in name pattern - Update API key name to 'Device Link Access Key' for clarity - Modify token retrieval to use device-specific API key lookup - Add comprehensive test for multiple device authentication - Ensure backward compatibility with existing single-device flows This allows users to authenticate multiple devices simultaneously without interference, fixing the previous limitation where only one device could be authenticated at a time. Co-authored-by: openhands <openhands@all-hands.dev>
- Add rate limiting fields to device_codes table (last_poll_time, current_interval) - Implement check_rate_limit() and update_poll_time() methods in DeviceCode model - Add slow_down error response with interval field when clients poll too frequently - Increase polling interval by 5 seconds on each slow_down, capped at 60 seconds - Update DeviceCodeStore with update_poll_time() method for database persistence - Add comprehensive test suite covering all rate limiting scenarios: * First poll always allowed * Normal polling (respecting interval) allowed * Fast polling returns slow_down error with increased interval * Interval increases with repeated violations * Interval caps at maximum value (60 seconds) * Rate limiting applies even to authorized devices - Fix existing tests to mock new rate limiting methods - All tests passing with clean linting This ensures RFC 8628 compliance by preventing clients from overwhelming the server with excessive polling requests while providing clear feedback about required polling intervals. Co-authored-by: openhands <openhands@all-hands.dev>
- Changed from hardcoded DEVICE_TOKEN_POLL_INTERVAL to device_code_entry.current_interval - Ensures clients get correct polling interval if device has increased interval from rate limiting - Added test for device authorization with increased interval (15s vs default 5s) - Maintains backward compatibility for new device codes with default interval - Improves RFC 8628 compliance by providing accurate polling guidance This ensures that if a device code already has an increased interval from previous rate limiting violations, the authorization response correctly communicates the current required polling interval to the client. Co-authored-by: openhands <openhands@all-hands.dev>
|
Fixes made
Things not addressed in this PR
|
- Updated migration to use DateTime(timezone=True) for all datetime columns - Modified DeviceCode model to use DateTime(timezone=True) columns - Simplified is_expired() method by removing runtime timezone conversions - Simplified check_rate_limit() method by removing timezone conversion logic - Database now stores timezone-aware timestamps eliminating conversion overhead Performance improvements: - Eliminates timezone check on every is_expired() call (called every 5+ seconds per device) - Removes datetime.replace() object creation overhead - Ensures consistent timezone handling across all database backends - PostgreSQL will use TIMESTAMP WITH TIME ZONE for proper timezone storage This change ensures that datetime fields are always timezone-aware when retrieved from the database, eliminating the need for runtime timezone conversions and improving performance for frequently called methods. Co-authored-by: openhands <openhands@all-hands.dev>
- Cleaned up overly specific comments about database timezone storage - Comments were too implementation-specific and verbose for the context - Code is now cleaner and more maintainable Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands I'm not an OAuth expert, please do a /codereview-roasted and tell me if the implementation is correct |
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
SummaryI've completed a comprehensive Linus-style code review of the OAuth 2.0 Device Flow implementation in PR #11984. As requested, I focused on security correctness and implementation quality rather than making changes. 🔥 Review ResultsOverall Rating: 🟡 Acceptable with caveats - Core logic is sound but has critical security issues that must be fixed before merging. Critical Issues Found:
Additional Issues:
Positive Aspects:✅ Previous reviewer's feedback was addressed (SQLAlchemy sessions, race conditions, timezone handling) Recommendation:Do not merge until HIGH priority security issues are fixed. The CSRF vulnerability is particularly serious and could make headlines. The transaction atomicity issue will cause subtle production bugs. These should be addressed before deployment. Note: I did not make any code changes as requested - this is a review-only assessment per the |
- Reorder operations to authorize device before creating API key - Add cleanup logic to revert authorization if API key creation fails - Implement transaction integrity to prevent data inconsistency - Add comprehensive error handling with detailed logging - Include test suite for all failure scenarios This fixes a critical security issue where API keys could be orphaned in the database if device authorization failed after key creation. Co-authored-by: openhands <openhands@all-hands.dev>
- Change /oauth/device/token endpoint to accept form data instead of JSON - Replace DeviceTokenRequest model with Form parameter - Update device_code parameter handling throughout the endpoint - Ensures RFC 8628 compliance for OAuth 2.0 Device Authorization Grant Co-authored-by: openhands <openhands@all-hands.dev>
- Remove automatic device authorization on page load - Add explicit user consent UI with security warning - Require user to click 'Authorize Device' button before API call - Display device code clearly with security notice - Maintain existing manual code entry functionality This prevents attackers from tricking users into authorizing devices by sending malicious links with user_code parameters.
- Remove unused variable api_key_created in oauth_device.py - Fix code formatting and style issues - Ensure all enterprise linting checks pass
- Remove non-existent DeviceTokenRequest import - Update test calls to use device_code parameter instead of request object - All 22 tests now pass successfully - Tests align with current Form-based parameter implementation Co-authored-by: openhands <openhands@all-hands.dev>
| def check_rate_limit(self) -> tuple[bool, int]: | ||
| """Check if the client is polling too fast. | ||
|
|
||
| Returns: | ||
| tuple: (is_too_fast, current_interval) | ||
| - is_too_fast: True if client should receive slow_down error | ||
| - current_interval: Current polling interval to use | ||
| """ | ||
| now = datetime.now(timezone.utc) | ||
|
|
||
| # If this is the first poll, allow it | ||
| if self.last_poll_time is None: | ||
| return False, self.current_interval | ||
|
|
||
| # Calculate time since last poll | ||
| time_since_last_poll = (now - self.last_poll_time).total_seconds() | ||
|
|
||
| # Check if polling too fast | ||
| if time_since_last_poll < self.current_interval: | ||
| # Increase interval for slow_down (RFC 8628 section 3.5) | ||
| new_interval = min(self.current_interval + 5, 60) # Cap at 60 seconds | ||
| return True, new_interval | ||
|
|
||
| return False, self.current_interval | ||
|
|
||
| def update_poll_time(self, increase_interval: bool = False) -> None: | ||
| """Update the last poll time and optionally increase the interval. | ||
|
|
||
| Args: | ||
| increase_interval: If True, increase the current interval for slow_down | ||
| """ | ||
| self.last_poll_time = datetime.now(timezone.utc) | ||
|
|
||
| if increase_interval: | ||
| # Increase interval by 5 seconds, cap at 60 seconds (RFC 8628) | ||
| self.current_interval = min(self.current_interval + 5, 60) |
There was a problem hiding this comment.
Just curious, @OpenHands is there an library we can use to do all these, instead of needing to implement these ourselves?
There was a problem hiding this comment.
I'm on it! xingyaoww can track my progress at all-hands.dev
There was a problem hiding this comment.
Summary
I answered the question about whether there's a library available for OAuth 2.0 Device Flow implementation instead of building it from scratch.
Answer
Yes, there is a library: Authlib - Python's most comprehensive OAuth library with full RFC 8628 (Device Flow) support.
Key Points:
What Authlib Provides:
- ✅ Built-in
DeviceAuthorizationEndpointandDeviceCodeGrantclasses for RFC 8628 - ✅ Battle-tested, standards-compliant implementation
- ✅ Handles device code generation, rate limiting, and token polling
The Important Caveat:
- ✅ Authlib has full OAuth server support for Flask and Django
- ❌ For FastAPI/Starlette, Authlib only provides OAuth client functionality (not server)
⚠️ Using Authlib with FastAPI as an OAuth server would still require:- Implementing abstract methods yourself
- Building custom FastAPI integration layer
- Connecting to your SQLAlchemy models
My Recommendation:
Your custom implementation is appropriate for this use case because:
- You're building an OAuth server with FastAPI (Authlib has no pre-built integration for this)
- Your implementation is already complete, tested (32 passing tests), and working
- Your use case is customized (returning API keys instead of standard OAuth tokens)
- The code is clean, well-structured, and maintainable
Bottom Line: While Authlib exists and is excellent for standard OAuth scenarios, your focused custom implementation (~300 lines) is a pragmatic choice given the lack of FastAPI OAuth server support in Authlib and your already-working solution. You could consider refactoring to Authlib in the future if OAuth requirements expand significantly beyond Device Flow.
- Fix TypeError when comparing timezone-naive expires_at from database with timezone-aware datetime.now(UTC) - Add timezone handling in validate_api_key method to convert naive datetimes to UTC before comparison - Add comprehensive tests for both timezone-aware and timezone-naive scenarios - Ensures API keys created through OAuth device flow work immediately after creation Co-authored-by: openhands <openhands@all-hands.dev>
…1984) Co-authored-by: openhands <openhands@all-hands.dev> Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Summary of PR
Implements OAuth 2.0 Device Flow backend for CLI authentication with OpenHands Cloud. Includes comprehensive unit tests (32 tests) and code cleanup.
OAuth 2.0 Device Flow
How it works:
POST /oauth/device/authorize→ returnsdevice_code+user_codeuser_code, logs in via KeycloakPOST /oauth/device/tokenwithdevice_code→ gets API keyWhy both codes are needed:
user_code(ABC12345): Short, human-readable for browser entrydevice_code(128 chars): Cryptographically secure for CLI pollingKey Features
🔐 Endpoints
POST /oauth/device/authorize- Start flowPOST /oauth/device/token- Poll for API keyGET /oauth/device/verify- User authorization pageGET /oauth/device/keycloak-callback- OAuth callbackChange Type
Checklist
Release Notes
Added OAuth 2.0 Device Flow for CLI authentication
Related: OpenHands/OpenHands-CLI#192
Testing:
To run this PR locally, use the following command:
GUI with Docker: