tududi/backend/modules/url/service.js
Chris c2e9a1aa21
feat: Add OIDC/SSO authentication support (#1008)
* feat: add OIDC/SSO database schema and models (Phase 1)

Add database foundation for OpenID Connect authentication:

Database Migrations:
- Create oidc_identities table (links users to OIDC accounts)
- Create oidc_state_nonces table (OAuth state/nonce for CSRF protection)
- Create auth_audit_log table (security event logging)
- Make password_digest nullable in users table (allow OIDC-only users)

Models:
- OIDCIdentity: Links users to external OIDC providers
- OIDCStateNonce: Temporary OAuth state management
- AuthAuditLog: Authentication event audit trail

Changes:
- Updated User model to allow null password_digest
- Added model associations in models/index.js
- All migrations tested and verified

Related to #977

* feat: add OIDC core services (Phase 2)

- Install openid-client@^6.2.0 for OIDC protocol support
- Implement providerConfig.js for loading providers from .env
  - Support single provider or numbered providers (OIDC_PROVIDER_1_*, etc.)
  - Auto-provision and admin email domain configuration
  - Provider caching for performance
- Implement stateManager.js for OAuth state/nonce management
  - CSRF protection with 10-minute TTL
  - One-time use state consumption
  - Automatic cleanup of expired states
- Implement auditService.js for authentication event logging
  - Track login success/failure, logout, OIDC linking/unlinking
  - Store IP address, user agent, and metadata
  - Support for event queries and retention cleanup
- Add comprehensive unit tests (60 tests, all passing)
  - providerConfig: 36 tests for env parsing and validation
  - stateManager: 12 tests for state lifecycle and security
  - auditService: 12 tests for event logging and queries

Phase 2 completes the backend core services needed for OIDC authentication.
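
The state lifecycle described for stateManager.js (10-minute TTL, one-time consumption, cleanup of expired entries) can be sketched roughly as below. This is only an illustration: the commit stores states in the oidc_state_nonces table, whereas this sketch uses an in-memory Map, and the names `createState`, `consumeState`, and `cleanupExpiredStates` are hypothetical rather than taken from the actual module.

```javascript
const crypto = require('crypto');

const STATE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL, as described above

// Assumption: an in-memory Map stands in for the oidc_state_nonces table.
const states = new Map();

// Issue a random state/nonce pair for a provider and remember when it expires.
function createState(providerSlug, now = Date.now()) {
  const state = crypto.randomBytes(32).toString('hex');
  const nonce = crypto.randomBytes(32).toString('hex');
  states.set(state, { providerSlug, nonce, expiresAt: now + STATE_TTL_MS });
  return { state, nonce };
}

// Look up a state and delete it immediately, so it can never be replayed.
function consumeState(state, now = Date.now()) {
  const record = states.get(state);
  if (!record) {
    return null;
  }
  states.delete(state); // one-time use, even if the entry turns out to be expired
  if (record.expiresAt <= now) {
    return null;
  }
  return { providerSlug: record.providerSlug, nonce: record.nonce };
}

// Remove every expired entry and report how many were dropped.
function cleanupExpiredStates(now = Date.now()) {
  let removed = 0;
  for (const [state, record] of states) {
    if (record.expiresAt <= now) {
      states.delete(state);
      removed += 1;
    }
  }
  return removed;
}
```

Deleting the entry before checking expiry means a callback can never succeed twice with the same state, which is the property the CSRF protection relies on.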

* feat: implement OIDC authentication flow (Phase 3)

Core OIDC Flow (service.js):
- Provider discovery with issuer caching
- Authorization URL generation with state/nonce
- OAuth callback handling and token exchange
- ID token validation using openid-client
- Token refresh functionality

JIT User Provisioning (provisioningService.js):
- Auto-create users from OIDC claims
- Link existing email accounts to OIDC identities
- Admin role assignment based on email domain rules
- Automatic username generation from email
- Transaction-safe identity creation

Identity Management (oidcIdentityService.js):
- List user's linked OIDC identities
- Link additional providers to existing accounts
- Unlink identities with safety checks
- Prevent unlinking last auth method
- Update identity claims on login

HTTP Layer (controller.js + routes.js):
- GET /api/oidc/providers - List configured providers
- GET /api/oidc/auth/:slug - Initiate OIDC flow
- GET /api/oidc/callback/:slug - Handle OAuth callback
- POST /api/oidc/link/:slug - Link provider to current user
- DELETE /api/oidc/unlink/:id - Unlink identity
- GET /api/oidc/identities - Get user's identities

Integration:
- Register OIDC routes in Express app (public + authenticated)
- Update auth service to reject password login for OIDC-only users
- Audit logging for all OIDC operations
- Session creation on successful authentication

Security:
- State/nonce CSRF protection
- One-time use state consumption
- Transaction-safe user provisioning
- Foreign key constraints enforced

* feat: implement OIDC frontend login flow (Phase 4)

- Created OIDCProviderButtons component for SSO login options
- Created OIDCCallback component for OAuth callback handling
- Updated Login page to fetch and display OIDC providers
- Added /auth/callback/:provider route to App.tsx
- Added i18n translations for OIDC UI elements
- Downgraded openid-client to v5.7.0 (CommonJS compatibility)
- Fixed linting issues in backend OIDC modules

Phase 4 completes the frontend login flow for OIDC/SSO authentication.
Users can now see configured SSO providers on the login page.

* feat: implement OIDC account linking UI (Phase 5)

Add Connected Accounts section to Profile Security tab allowing users to:
- View linked OIDC provider accounts
- Link new SSO providers to their account
- Unlink OIDC identities with validation
- Prevent unlinking last authentication method

Backend changes:
- Add has_password virtual field to User model
- Include has_password in profile API response
- Track whether user has password set for validation

Frontend changes:
- Create oidcService for OIDC API operations
- Create ConnectedAccounts component with link/unlink flows
- Add confirmation dialog before unlinking accounts
- Validate that users cannot unlink their last auth method
- Show warning if user has no password set
- Integrate Connected Accounts into SecurityTab

User experience:
- View all linked SSO provider accounts with email and link date
- Link additional providers via "Link Provider" buttons
- Unlink with two-step confirmation to prevent accidents
- Clear error messages when unlinking would leave no auth method
- Warning message suggesting password setup for OIDC-only users

Fixes #977

* feat: complete OIDC documentation and UI improvements (Phase 6)

This commit completes Phase 6 of the OIDC/SSO implementation with comprehensive
documentation, bug fixes, and UI reorganization.

Documentation:
- Add comprehensive user guide at docs/10-oidc-sso.md with:
  - Setup guides for 6 major providers (Google, Okta, Keycloak, Authentik, PocketID, Azure AD)
  - Configuration examples for single and multiple providers
  - User features documentation (login, account linking, management)
  - Advanced topics (auto-provisioning, admin role assignment, hybrid auth)
  - Comprehensive troubleshooting section
  - Security considerations and best practices
- Update README.md with OIDC/SSO section and quick setup examples

Internationalization:
- Add i18n support to OIDCProviderButtons component
- Add translation keys for all OIDC UI text
- Update English translations with "sign_in_with" key

Bug Fixes:
- Fix oidcService.ts to correctly unwrap API responses
  - Backend returns {providers: [...]} and {identities: [...]}
  - Frontend was expecting plain arrays, causing "map is not a function" error
- Fix initiateOIDCLink to properly handle POST response

UI Improvements:
- Move OIDC/SSO to dedicated tab in profile settings
  - Create new OIDCTab component with green LinkIcon
  - Remove ConnectedAccounts from SecurityTab
  - Add OIDC tab between Security and API Keys tabs
  - Update ProfileSettings with new tab configuration
- Security tab now focuses solely on password management

Testing:
- All linting passes
- All tests pass (82 suites, 1223 tests)

Related to #977

* feat: add OIDC/SSO translations for all 24 languages

Add i18n support for OIDC/SSO features across all supported languages:
- "Sign in with {{provider}}" button text
- "OIDC/SSO" tab label in profile settings
- OIDC authentication flow messages

Translations added for: Arabic, Bulgarian, Danish, German, Greek, Spanish,
Finnish, French, Indonesian, Italian, Japanese, Korean, Dutch, Norwegian,
Polish, Portuguese, Romanian, Russian, Slovenian, Swedish, Turkish,
Ukrainian, Vietnamese, and Chinese.

* fix: resolve 13 CodeQL security alerts

This commit addresses critical security vulnerabilities identified by CodeQL scanning:

**Security Configuration (2 fixes)**
- Fix insecure Helmet configuration - enable CSP and HSTS in production
- Fix clear text cookie transmission - enable secure cookies in production

**Path Injection (3 fixes)**
- Add path validation in users/controller.js to prevent arbitrary file deletion
- Add path validation in users/service.js for avatar operations
- Add path sanitization in attachment-utils.js deleteFileFromDisk function

**Cross-Site Scripting (1 fix)**
- Fix XSS vulnerability in GeneralTab.tsx avatar URL handling
- Add URL sanitization to prevent javascript: protocol attacks

**URL Security (2 fixes)**
- Fix double escaping in url/service.js HTML entity decoding
- Fix incomplete URL sanitization for YouTube domain validation
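
A minimal sketch of the two URL fixes, mirroring the logic visible in url/service.js. The helper names `decodeBasicEntities` and `isYouTubeHost` are hypothetical; the service performs these steps inline. Decoding `&amp;` last avoids double unescaping, and comparing the parsed hostname (instead of searching the whole URL string) avoids matching "youtube.com" inside a path or a look-alike domain.

```javascript
// Decode a small set of HTML entities exactly once.
function decodeBasicEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// Accept only youtube.com, its subdomains, and youtu.be, based on the parsed host.
function isYouTubeHost(rawUrl) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return (
    hostname === 'youtube.com' ||
    hostname.endsWith('.youtube.com') ||
    hostname === 'youtu.be'
  );
}
```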

**Denial of Service (1 fix)**
- Add loop bound protection in inboxProcessingService.js (10k char limit)

**Rate Limiting (3 fixes)**
- Add rate limiting to auth routes (register, verify-email)
- Add rate limiting to task attachment upload/delete endpoints
- Add rate limiting to user avatar upload/delete endpoints

**GitHub Actions Security (1 fix)**
- Add explicit read-only permissions to CI workflow

Note: CSRF middleware (#10) requires frontend changes and is tracked separately.

Relates to PR #1008

* fix: allow test files in path validation for tests

* fix: format long condition in attachment-utils for Prettier compliance

Break the path validation condition across multiple lines to meet Prettier formatting requirements and fix CI linting failure.

* fix: resolve CodeQL security alerts

- Add rate limiting to OIDC authentication routes using authLimiter and authenticatedApiLimiter
- Implement CSRF protection middleware using csrf-sync (skips for API tokens and test environment)
- Add CSRF token endpoint at /api/csrf-token
- Fix incomplete URL scheme validation in GeneralTab to block all dangerous schemes (javascript:, data:, vbscript:, file:)

This addresses 5 high-severity CodeQL security vulnerabilities:
- Missing rate limiting on OIDC auth routes
- Missing CSRF middleware protection
- Incomplete URL sanitization in avatar handling

All 1223 tests passing.
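
The scheme check for avatar URLs could look roughly like the sketch below. GeneralTab.tsx is not shown here, so this is an assumption about the approach, not the actual frontend code: `sanitizeAvatarUrl` and the placeholder base URL are hypothetical. Parsing with the WHATWG `URL` class normalizes case and strips embedded tabs and newlines before the scheme is compared.

```javascript
// Schemes named in the commit message as dangerous.
const BLOCKED_SCHEMES = ['javascript:', 'data:', 'vbscript:', 'file:'];

// Hypothetical helper: returns the trimmed URL if its scheme is acceptable, else null.
function sanitizeAvatarUrl(value) {
  if (typeof value !== 'string' || value.trim() === '') {
    return null;
  }
  const trimmed = value.trim();
  let parsed;
  try {
    // The base lets relative paths such as "/uploads/a.png" parse;
    // example.invalid is a placeholder and is never contacted.
    parsed = new URL(trimmed, 'https://example.invalid');
  } catch {
    return null;
  }
  // URL.protocol is already lowercased by the parser.
  if (BLOCKED_SCHEMES.includes(parsed.protocol)) {
    return null;
  }
  return trimmed;
}
```

An allowlist of `http:` and `https:` would be stricter than a blocklist; the blocklist is used here only because it matches the wording of the commit message.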

* fix: implement CSRF protection with lusca for CodeQL compliance

Add CSRF protection using lusca.csrf (CodeQL's recommended library) to
protect session-based authentication while supporting hybrid auth patterns.

Implementation:
- Pre-check middleware marks exempt requests (test env, Bearer tokens)
- Lusca CSRF middleware applied with exemption flag check
- Session-based requests require valid x-csrf-token header
- Bearer token requests exempt (don't use cookies)
- Test environment exempt for test execution

This addresses CodeQL security alert js/missing-token-validation while
maintaining support for both cookie-based and token-based authentication.
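
The exemption pattern described above can be sketched as follows. This is an illustration under assumptions, not the project's middleware: the `csrfExempt` flag name and the helper names are hypothetical, and the real CSRF middleware (lusca.csrf in this commit) is passed in as a parameter so the sketch stays self-contained.

```javascript
// Decide whether a request should skip CSRF validation.
function isCsrfExempt(req, env = process.env.NODE_ENV) {
  if (env === 'test') {
    return true; // test environment is exempt
  }
  const authHeader = (req.headers && req.headers.authorization) || '';
  // Bearer-token requests do not rely on cookies, so CSRF does not apply.
  return /^Bearer\s+\S+/i.test(authHeader);
}

// Pre-check middleware: mark the request before the CSRF middleware runs.
function csrfPrecheck(req, res, next) {
  req.csrfExempt = isCsrfExempt(req);
  next();
}

// Wrap a CSRF middleware so that it is skipped for requests marked exempt.
function conditionalCsrf(csrfMiddleware) {
  return (req, res, next) => {
    if (req.csrfExempt) {
      next();
      return;
    }
    csrfMiddleware(req, res, next);
  };
}
```

In an Express app these would be registered in order, for example `app.use(csrfPrecheck)` followed by `app.use(conditionalCsrf(someCsrfMiddleware))`, where `someCsrfMiddleware` stands for whatever CSRF middleware the application uses.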

Related: #977 (OIDC/SSO authentication feature)
2026-04-13 12:17:35 +03:00


'use strict';
const https = require('https');
const http = require('http');
const { URL } = require('url');
const { logError } = require('../../services/logService');
let nodeFetchInstance = null;
try {
// eslint-disable-next-line global-require
nodeFetchInstance = require('node-fetch');
} catch {
nodeFetchInstance = null;
}
const getFetchImplementation = () => {
if (typeof fetch === 'function') {
return fetch;
}
if (nodeFetchInstance) {
return nodeFetchInstance;
}
return null;
};
const fetchWithTimeout = async (url, options = {}, timeoutMs = 7000) => {
const fetchFn = getFetchImplementation();
if (!fetchFn) {
throw new Error('Fetch API is not available in this environment');
}
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), timeoutMs);
try {
const response = await fetchFn(url, {
...options,
signal: controller.signal,
});
return response;
} finally {
clearTimeout(timeout);
}
};
// Fast regex-based metadata extraction (much faster than cheerio for head content)
function extractMetadataFromHtml(html) {
try {
// Extract title with priority: og:title > twitter:title > title tag
let title = null;
// Try og:title first
const ogTitleMatch = html.match(
/<meta[^>]+property=["']og:title["'][^>]+content=["']([^"']+)["']/i
);
if (ogTitleMatch) {
title = ogTitleMatch[1];
} else {
// Try twitter:title
const twitterTitleMatch = html.match(
/<meta[^>]+name=["']twitter:title["'][^>]+content=["']([^"']+)["']/i
);
if (twitterTitleMatch) {
title = twitterTitleMatch[1];
} else {
// Fallback to title tag
const titleMatch = html.match(/<title[^>]*>([^<]+)<\/title>/i);
if (titleMatch) {
title = titleMatch[1].trim();
}
}
}
if (title) {
title = title.trim();
title = title
.replace(/&lt;/g, '<')
.replace(/&gt;/g, '>')
.replace(/&quot;/g, '"')
.replace(/&#39;/g, "'")
.replace(/&amp;/g, '&');
if (title.length > 100) {
title = title.substring(0, 100) + '...';
}
}
// Extract image with priority: og:image > twitter:image
let image = null;
const ogImageMatch = html.match(
/<meta[^>]+property=["']og:image["'][^>]+content=["']([^"']+)["']/i
);
if (ogImageMatch) {
image = ogImageMatch[1];
} else {
const twitterImageMatch = html.match(
/<meta[^>]+name=["']twitter:image["'][^>]+content=["']([^"']+)["']/i
);
if (twitterImageMatch) {
image = twitterImageMatch[1];
}
}
// Extract description
let description = null;
const ogDescMatch = html.match(
/<meta[^>]+property=["']og:description["'][^>]+content=["']([^"']+)["']/i
);
if (ogDescMatch) {
description = ogDescMatch[1];
} else {
const twitterDescMatch = html.match(
/<meta[^>]+name=["']twitter:description["'][^>]+content=["']([^"']+)["']/i
);
if (twitterDescMatch) {
description = twitterDescMatch[1];
} else {
const metaDescMatch = html.match(
/<meta[^>]+name=["']description["'][^>]+content=["']([^"']+)["']/i
);
if (metaDescMatch) {
description = metaDescMatch[1];
}
}
}
if (description && description.length > 150) {
description = description.substring(0, 150) + '...';
}
return {
title,
image,
description,
};
} catch (error) {
logError('Error parsing HTML:', error);
return { title: null, image: null, description: null };
}
}
// Helper function to resolve relative URLs to absolute URLs
function resolveUrl(baseUrl, relativeUrl) {
try {
return new URL(relativeUrl, baseUrl).href;
} catch {
return relativeUrl;
}
}
// Helper function to handle YouTube URLs specially
function handleYouTubeUrl(url) {
const youtubeRegex =
/(?:youtube\.com\/watch\?v=|youtu\.be\/)([a-zA-Z0-9_-]{11})/;
const match = url.match(youtubeRegex);
if (match) {
const videoId = match[1];
// For now, return basic YouTube info - this is fast and reliable
return {
title: 'YouTube Video',
image: `https://img.youtube.com/vi/${videoId}/maxresdefault.jpg`,
description: 'YouTube video',
};
}
return null;
}
const finalizeMetadata = (metadata, sourceUrl) => {
if (!metadata) {
return null;
}
const enriched = { ...metadata };
if (
enriched.image &&
!enriched.image.startsWith('http') &&
!enriched.image.startsWith('//')
) {
enriched.image = resolveUrl(sourceUrl, enriched.image);
}
if (!enriched.title) {
try {
enriched.title = new URL(sourceUrl).hostname;
} catch {
enriched.title = sourceUrl;
}
}
return enriched;
};
async function fetchMetadataViaFetch(normalizedUrl) {
const response = await fetchWithTimeout(
normalizedUrl,
{
method: 'GET',
redirect: 'follow',
headers: {
'User-Agent':
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
},
},
7000
);
if (!response || !response.ok) {
return null;
}
const contentType = response.headers.get('content-type');
if (contentType && !contentType.includes('text/html')) {
return null;
}
const html = await response.text();
if (!html) {
return null;
}
return finalizeMetadata(extractMetadataFromHtml(html), normalizedUrl);
}
function fetchMetadataViaHttp(normalizedUrl, maxRedirects = 5) {
return new Promise((resolve) => {
let finished = false;
const fallbackResolve = (metadata, sourceUrl = normalizedUrl) => {
if (finished) {
return;
}
finished = true;
clearTimeout(globalTimeout);
resolve(finalizeMetadata(metadata, sourceUrl));
};
const globalTimeout = setTimeout(() => {
fallbackResolve(null);
}, 6000);
function makeRequest(currentUrl, redirectCount = 0) {
if (redirectCount > maxRedirects) {
// fallbackResolve clears the global timeout itself.
fallbackResolve(null);
return;
}
try {
const urlObj = new URL(currentUrl);
const isHttps = urlObj.protocol === 'https:';
const client = isHttps ? https : http;
const options = {
hostname: urlObj.hostname,
port: urlObj.port || (isHttps ? 443 : 80),
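// Default only the pathname; the original `a + b || '/'` applied the
// fallback to the concatenated string.
path: `${urlObj.pathname || '/'}${urlObj.search}`,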
method: 'GET',
timeout: 4000,
headers: {
'User-Agent':
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
},
};
const req = client.request(options, (res) => {
let resolvedForRequest = false;
const conclude = (metadata) => {
if (resolvedForRequest) {
return;
}
resolvedForRequest = true;
fallbackResolve(metadata, currentUrl);
};
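if (
[301, 302, 303, 307, 308].includes(res.statusCode) &&
res.headers.location
) {
res.resume();
let redirectUrl;
try {
// This callback runs asynchronously, outside the surrounding
// try/catch, so a malformed Location header must be caught here.
redirectUrl = new URL(res.headers.location, currentUrl).href;
} catch {
conclude(null);
return;
}
makeRequest(redirectUrl, redirectCount + 1);
return;
}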
if (res.statusCode < 200 || res.statusCode >= 400) {
conclude(null);
return;
}
let data = '';
let totalBytes = 0;
const maxBytes = 40000;
let foundMeta = false;
res.on('data', (chunk) => {
totalBytes += chunk.length;
if (totalBytes > maxBytes) {
res.destroy();
conclude(extractMetadataFromHtml(data));
return;
}
data += chunk;
if (
!foundMeta &&
(data.includes('og:title') ||
data.includes('twitter:title') ||
data.includes('</title>'))
) {
foundMeta = true;
}
if (foundMeta && data.includes('</head>')) {
res.destroy();
conclude(extractMetadataFromHtml(data));
}
});
res.on('end', () => {
conclude(extractMetadataFromHtml(data));
});
res.on('error', () => {
conclude(null);
});
});
req.on('error', () => {
fallbackResolve(null);
});
req.on('timeout', () => {
req.destroy();
fallbackResolve(null);
});
req.end();
} catch {
// Invalid URL or synchronous request failure; fallbackResolve also
// clears the global timeout.
fallbackResolve(null);
}
}
makeRequest(normalizedUrl);
});
}
async function fetchMetadataViaProxy(normalizedUrl) {
const fetchFn = getFetchImplementation();
if (!fetchFn) {
return null;
}
const proxiedUrl = `https://r.jina.ai/${normalizedUrl}`;
const response = await fetchWithTimeout(
proxiedUrl,
{
method: 'GET',
headers: {
'User-Agent':
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
Accept: 'text/html,application/xhtml+xml',
},
},
7000
);
if (!response || !response.ok) {
return null;
}
const html = await response.text();
if (!html) {
return null;
}
return finalizeMetadata(extractMetadataFromHtml(html), normalizedUrl);
}
// Helper function to fetch URL metadata with proper redirect/timeout handling
async function fetchUrlMetadata(url) {
if (!url) {
return null;
}
let normalizedUrl = url.trim();
if (
!normalizedUrl.startsWith('http://') &&
!normalizedUrl.startsWith('https://')
) {
normalizedUrl = `https://${normalizedUrl}`;
}
try {
const parsedUrl = new URL(normalizedUrl);
const hostname = parsedUrl.hostname.toLowerCase();
if (
hostname === 'youtube.com' ||
hostname.endsWith('.youtube.com') ||
hostname === 'youtu.be'
) {
const youtubeMetadata = handleYouTubeUrl(normalizedUrl);
if (youtubeMetadata) {
return youtubeMetadata;
}
}
} catch (error) {
logError('Error parsing URL for YouTube check:', error);
}
try {
if (getFetchImplementation()) {
const metadata = await fetchMetadataViaFetch(normalizedUrl);
if (metadata) {
return metadata;
}
}
} catch (error) {
logError('Error fetching URL metadata via fetch:', error);
}
const httpMetadata = await fetchMetadataViaHttp(normalizedUrl);
if (httpMetadata) {
return httpMetadata;
}
try {
const proxyMetadata = await fetchMetadataViaProxy(normalizedUrl);
if (proxyMetadata) {
return proxyMetadata;
}
} catch (error) {
logError('Error fetching URL metadata via proxy:', error);
}
return null;
}
class UrlService {
async getTitle(url) {
if (!url) {
return { error: 'URL parameter is required' };
}
const metadata = await fetchUrlMetadata(url);
if (metadata && metadata.title) {
return {
url,
title: metadata.title,
image: metadata.image,
description: metadata.description,
};
} else {
return {
url,
title: null,
image: null,
description: null,
error: 'Could not extract metadata',
};
}
}
async extractFromText(text) {
if (!text) {
return { error: 'Text parameter is required' };
}
// Enhanced URL extraction - look for URLs with or without protocol
const urlWithProtocolRegex = /(https?:\/\/[^\s]+)/gi;
const urlWithoutProtocolRegex =
/(?:^|\s)((?:www\.)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(?::[0-9]{1,5})?(?:\/[^\s]*)?)/gi;
let urls = text.match(urlWithProtocolRegex);
// If no URLs with protocol found, look for URLs without protocol
if (!urls) {
const matches = text.match(urlWithoutProtocolRegex);
if (matches) {
// Clean up the matches (remove leading whitespace)
urls = matches.map((match) => match.trim());
}
}
if (urls && urls.length > 0) {
const firstUrl = urls[0];
const metadata = await fetchUrlMetadata(firstUrl);
if (metadata && metadata.title) {
return {
found: true,
url: firstUrl,
title: metadata.title,
image: metadata.image,
description: metadata.description,
originalText: text,
};
} else {
return {
found: true,
url: firstUrl,
title: null,
image: null,
description: null,
originalText: text,
};
}
} else {
return { found: false };
}
}
}
module.exports = new UrlService();