diff --git a/docs/feature-design/remote-query.md b/docs/feature-design/remote-query.md new file mode 100644 index 00000000..fcfa01db --- /dev/null +++ b/docs/feature-design/remote-query.md @@ -0,0 +1,830 @@ +# Remote Query Feature Design + +> **Target Audience**: Development teams building features and enhancements for the search-relevance plugin. + +## Introduction + +This document outlines the design and implementation of the Remote Query feature for the OpenSearch Search Relevance plugin. The Remote Query feature enables search relevance experiments to execute queries against remote search engines and OpenSearch clusters, allowing for comprehensive cross-platform search evaluation and comparison. + +## Problem Statement + +Search relevance evaluation is critical for maintaining and improving search quality in OpenSearch deployments. Organizations often need to compare search performance across different configurations, algorithms, or entirely different search engines to make informed decisions about their search infrastructure. 
+
+**Key Problems Addressed:**
+- **Limited Evaluation Scope**: The current Search Relevance plugin evaluates relevance only within a single OpenSearch cluster
+- **Multi-vendor Comparison**: Organizations need to compare OpenSearch against other search engines (Solr, Elasticsearch, proprietary solutions)
+- **Migration Validation**: Teams migrating to OpenSearch need to validate equivalent or better search relevance compared to legacy systems
+- **A/B Testing Across Environments**: Teams need to compare search results using identical evaluation criteria across different systems
+- **Cross-Cluster Analysis**: Large organizations with multiple OpenSearch clusters need unified performance comparison
+
+**Impact of Not Implementing:**
+- Organizations are forced to build custom integration solutions for each search engine
+- Manual export/import workflows that are error-prone and time-consuming
+- Inconsistent evaluation methodologies that reduce comparison validity
+- Incomplete evaluation coverage when assessing search engine alternatives
+
+**Primary Users/Stakeholders:**
+- Search engineers evaluating different search technologies
+- DevOps teams managing search infrastructure migrations
+- Product teams conducting A/B tests across search systems
+- Organizations with hybrid search architectures
+
+**Alignment with OpenSearch Goals:**
+- Enhances OpenSearch's position as a comprehensive search platform
+- Provides tools for objective search engine evaluation
+- Supports migration and adoption workflows
+- Extends plugin capabilities beyond single-cluster limitations
+
+## Use Cases
+
+### Required Use Cases
+1. **Multi-vendor Search Engine Evaluation** - Execute identical query sets against OpenSearch and competitor search engines (Solr, Elasticsearch) for objective relevance comparison
+2. **Migration Validation** - Compare search results between legacy systems and OpenSearch during migration projects
+3. 
**Cross-Cluster Performance Analysis** - Evaluate search performance across multiple OpenSearch clusters with different configurations +4. **A/B Testing Across Environments** - Test new search algorithms against production systems using consistent evaluation criteria + +### Nice-to-Have Use Cases +1. **Hybrid Search Architecture Evaluation** - Unified relevance evaluation across multiple search systems in complex architectures +2. **Vendor Benchmarking** - Periodic evaluation of different search technologies using standardized methodologies +3. **Real-time Performance Monitoring** - Continuous comparison of search quality across systems + +## Requirements + +### Functional Requirements + +1. **Remote Configuration Management** + - Create, update, delete, and retrieve remote search configurations + - Support for HTTP/HTTPS endpoints with authentication + - Configurable query and response templates for different search engines + - Rate limiting and concurrency control per configuration + +2. **Query Execution** + - Execute search queries against remote systems via HTTP/HTTPS + - Template-based query transformation for different search engine formats + - Asynchronous execution with proper timeout handling + - Integration with existing experiment workflows + +3. **Response Processing** + - Transform remote responses to OpenSearch-compatible format + - Template-based response mapping and field extraction + - Error handling for malformed or unexpected responses + +4. **Caching System** + - Intelligent caching of remote search results + - Configurable TTL and cache invalidation + - Query-based cache keys for efficient retrieval + +5. **Error Handling and Monitoring** + - Comprehensive failure tracking and categorization + - Rate limiting and circuit breaker patterns + - Detailed logging and monitoring capabilities + +### Non-Functional Requirements + +1. 
**Performance** + - Support for configurable rate limiting (requests per second) + - Concurrent request limiting to prevent resource exhaustion + - Efficient caching to minimize remote system load + - Asynchronous execution to prevent blocking + +2. **Security** + - Encrypted storage of authentication credentials + - Support for basic authentication + - TLS/SSL support for secure connections + - Integration with OpenSearch security framework + +3. **Reliability** + - Graceful handling of network failures and timeouts + - Retry mechanisms with exponential backoff + - Circuit breaker pattern for failing remote systems + - Comprehensive error logging and recovery + +4. **Scalability** + - Support for multiple concurrent remote configurations + - Efficient connection pooling and reuse + - Configurable resource limits per configuration + +## Out of Scope + +1. **Advanced Authentication Methods** - OAuth, JWT, and certificate-based authentication (future enhancement) +2. **Non-HTTP Protocols** - gRPC, WebSocket, and custom protocol support (future enhancement) +3. **External Cache Systems** - Redis, Memcached integration (future enhancement) +4. **Response Streaming** - Large result set streaming support (future enhancement) +5. **Advanced Template Processing** - Complex transformation logic beyond basic substitution (future enhancement) + +## Current State + +The OpenSearch Search Relevance plugin currently provides: +- Experiment management and execution within single OpenSearch clusters +- Query set management and execution +- Judgment-based evaluation with human relevance assessments +- Automated metrics calculation (NDCG, MAP, MRR, etc.) 
+- Local search configuration management + +**Components that will be impacted:** +- `ExperimentTaskManager` - Enhanced to support remote search execution +- `SearchRelevanceIndices` - New indices for remote configurations, cache, and failures +- Plugin registration - New REST endpoints and transport actions +- Experiment workflow - Integration of remote search results with existing evaluation + +## Solution Overview + +The Remote Query feature extends the Search Relevance plugin with a remote search execution layer that abstracts differences between search engines while maintaining consistent evaluation methodologies. + +**Key Technologies and Dependencies:** +- Java 11 HttpClient for HTTP communication +- OpenSearch XContent framework for JSON processing +- Template-based query and response transformation +- OpenSearch security framework for credential encryption + +**Integration with OpenSearch Core:** +- Utilizes OpenSearch's index management for data storage +- Leverages OpenSearch security for authentication and authorization +- Integrates with OpenSearch's async framework for non-blocking operations + +**Interaction with Existing Search-Relevance Features:** +- Seamless integration with existing experiment workflows +- Reuses judgment sets and evaluation metrics +- Extends search configuration concept to include remote systems +- Compatible with existing query set and result analysis features + +## Solution Design + +### Proposed Solution + +The solution introduces five core components that work together to provide remote search capabilities: + +#### Architecture Overview + +``` +┌──────────────────────────────────────────────────────────────┐ +│ Search Relevance Plugin │ +├──────────────────────────────────────────────────────────────┤ +│ Experiment Management │ Query Sets │ Judgment Management │ +├──────────────────────────────────────────────────────────────┤ +│ Search Execution Layer │ +├──────────────────────────────────────────────────────────────┤ +│ 
Local Search Executor │ Remote Query Executor │ +│ │ ┌──────────────────────────────────┤ +│ │ │ Configuration Management │ +│ │ │ Connection Pooling │ +│ │ │ Rate Limiting │ +│ │ │ Query Template Processing │ +│ │ │ Response Mapping │ +│ │ │ Caching Layer │ +│ │ │ Error Handling & Retry │ +│ │ │ Failure Tracking │ +└────────────────────────┴──┴──────────────────────────────────┘ +``` + +#### Core Components + +**1. RemoteSearchConfiguration** +- Manages connection details and search engine-specific settings +- Stores encrypted credentials and endpoint information +- Configures rate limiting and caching behavior +- Supports flexible query and response templates + +**2. RemoteSearchExecutor** +- Handles HTTP communication with remote search engines +- Implements rate limiting and concurrent request management +- Provides asynchronous execution with timeout handling +- Integrates caching and error handling + +**3. RemoteResponseMapper** +- Transforms remote search responses to standardized formats +- Supports template-based field mapping and extraction +- Handles various response formats and structures + +**4. RemoteSearchCache** +- Provides intelligent caching for remote search results +- Implements TTL-based expiration and manual invalidation +- Optimizes performance and reduces remote system load + +**5. 
RemoteSearchFailure** +- Tracks and categorizes remote search failures +- Supports debugging and monitoring requirements +- Enables circuit breaker and retry logic + +#### Data Models + +**RemoteSearchConfiguration Schema:** +```json +{ + "id": "unique_config_id", + "name": "Human readable name", + "description": "Configuration description", + "connectionUrl": "https://remote.search.engine:9200", + "username": "auth_username", + "password": "encrypted_password", + "queryTemplate": "{ \"query\": { \"match\": { \"content\": \"${queryText}\" } } }", + "responseTemplate": "{ \"hits\": \"${response.hits.hits}\" }", + "maxRequestsPerSecond": 10, + "maxConcurrentRequests": 5, + "cacheDurationMinutes": 60, + "refreshCache": false, + "metadata": { "searchEngine": "elasticsearch", "version": "7.x" }, + "timestamp": "2024-01-01T00:00:00Z" +} +``` + +**Integration with Experiments:** +```json +{ + "experimentId": "cross_platform_comparison", + "searchConfigurations": [ + { + "id": "local_opensearch", + "type": "local", + "name": "Local OpenSearch" + }, + { + "id": "remote_elasticsearch", + "type": "remote", + "name": "Production Elasticsearch", + "remoteConfigId": "prod_es_config" + } + ] +} +``` + +#### Query Execution Flow + +1. **Experiment Initialization**: Load experiment configuration and validate remote connections +2. **Query Processing**: For each query in the query set: + - Execute against local OpenSearch (existing flow) + - Execute against configured remote systems (new flow) + - Apply rate limiting and caching as configured +3. **Response Normalization**: Transform all responses to consistent format using templates +4. **Evaluation**: Apply judgment sets and calculate metrics across all systems +5. 
**Result Aggregation**: Generate comparative analysis and reports + +### Alternative Solutions Considered + +**Alternative 1: External Integration Service** +- **Approach**: Separate microservice handling remote search integration +- **Pros**: Technology flexibility, independent scaling, reduced plugin complexity +- **Cons**: Additional infrastructure, network latency, operational overhead +- **Decision**: Rejected due to operational complexity and performance concerns + +**Alternative 2: Export/Import Workflow** +- **Approach**: Manual export of queries, external execution, result import +- **Pros**: Simple implementation, no network dependencies during evaluation +- **Cons**: Manual process, no real-time capabilities, poor user experience +- **Decision**: Rejected due to poor automation and user experience + +**Alternative 3: Plugin-per-Search-Engine** +- **Approach**: Separate plugins for each supported search engine +- **Pros**: Optimized integration, native feature support +- **Cons**: Maintenance overhead, inconsistent experience, complex management +- **Decision**: Rejected due to maintenance burden and scalability concerns + +### Key Design Decisions + +**1. HTTP-Only Protocol Support** +- **Rationale**: HTTP/HTTPS covers majority of search engine APIs and reduces complexity +- **Trade-off**: Limited protocol support vs. implementation simplicity +- **Future**: Can be extended to support additional protocols + +**2. Template-Based Transformation** +- **Rationale**: Flexible approach supporting various search engine formats +- **Trade-off**: Limited transformation complexity vs. broad compatibility +- **Impact**: Enables support for diverse search engines with minimal code changes + +**3. Integrated Caching** +- **Rationale**: Reduces load on remote systems and improves performance +- **Trade-off**: Storage requirements vs. performance benefits +- **Impact**: Significant performance improvement for repeated queries + +**4. 
Basic Authentication Only**
+- **Rationale**: Covers common authentication scenarios while maintaining security
+- **Trade-off**: Limited auth methods vs. implementation complexity
+- **Future**: OAuth and certificate-based auth can be added
+
+## Metrics and Observability
+
+### New Metrics to be Introduced
+
+**Remote Search Execution Metrics:**
+- `remote_search_requests_total` - Total number of remote search requests
+- `remote_search_requests_duration` - Request duration histogram
+- `remote_search_failures_total` - Total number of failed requests by error type
+- `remote_search_rate_limit_hits_total` - Number of rate limit violations
+- `remote_search_cache_hits_total` / `remote_search_cache_misses_total` - Cache hit and miss counters
+
+**Configuration Metrics:**
+- `remote_search_configurations_total` - Number of active remote configurations
+- `remote_search_concurrent_requests` - Current concurrent requests per configuration
+
+**System Health Metrics:**
+- `remote_search_circuit_breaker_state` - Circuit breaker status per configuration
+- `remote_search_connection_pool_usage` - HTTP connection pool utilization
+
+### Search Relevance Specific Metrics
+
+**Experiment Metrics:**
+- Integration with existing experiment result metrics
+- Comparative analysis metrics across local and remote systems
+- Cross-platform evaluation result tracking
+
+**Performance Comparison Metrics:**
+- Response time comparison between local and remote systems
+- Result quality metrics (NDCG, MAP, MRR) across platforms
+- Cache effectiveness metrics for remote queries
+
+### Health and Performance Monitoring
+
+**Health Checks:**
+- Periodic connectivity validation for remote configurations
+- Authentication status monitoring
+- Circuit breaker state tracking
+
+**Performance Monitoring:**
+- Request latency percentiles (p50, p95, p99)
+- Throughput metrics (requests per second)
+- Error rate monitoring by configuration and error type
+
+**Alerting Integration:**
+- Integration with OpenSearch alerting for failure 
notifications +- Threshold-based alerts for performance degradation +- Circuit breaker state change notifications + +## Technical Specifications + +### Data Schemas and Index Mappings + +**Remote Search Configuration Index:** +```json +{ + "mappings": { + "properties": { + "id": { "type": "keyword" }, + "name": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }, + "description": { "type": "text" }, + "connectionUrl": { "type": "keyword" }, + "username": { "type": "keyword" }, + "password": { "type": "keyword", "index": false }, + "queryTemplate": { "type": "text", "index": false }, + "responseTemplate": { "type": "text", "index": false }, + "maxRequestsPerSecond": { "type": "integer" }, + "maxConcurrentRequests": { "type": "integer" }, + "cacheDurationMinutes": { "type": "long" }, + "refreshCache": { "type": "boolean" }, + "metadata": { "type": "object", "enabled": false }, + "timestamp": { "type": "date" } + } + } +} +``` + +**Remote Search Cache Index:** +```json +{ + "mappings": { + "properties": { + "id": { "type": "keyword" }, + "configurationId": { "type": "keyword" }, + "queryHash": { "type": "keyword" }, + "queryText": { "type": "text" }, + "response": { "type": "text", "index": false }, + "mappedResponse": { "type": "text", "index": false }, + "timestamp": { "type": "date" }, + "expirationTimestamp": { "type": "date" } + } + } +} +``` + +**Remote Search Failure Index:** +```json +{ + "mappings": { + "properties": { + "id": { "type": "keyword" }, + "configurationId": { "type": "keyword" }, + "experimentId": { "type": "keyword" }, + "query": { "type": "text" }, + "queryText": { "type": "text" }, + "errorType": { "type": "keyword" }, + "errorMessage": { "type": "text" }, + "stackTrace": { "type": "text", "index": false }, + "httpStatusCode": { "type": "integer" }, + "timestamp": { "type": "date" } + } + } +} +``` + +### API Specifications + +**Remote Configuration Management API:** + +```http +POST 
/_plugins/_search_relevance/remote_search_configurations +{ + "name": "Production Elasticsearch", + "description": "Main production ES cluster", + "connectionUrl": "https://prod-es.example.com:9200", + "username": "search_user", + "password": "secure_password", + "queryTemplate": "{ \"query\": { \"multi_match\": { \"query\": \"${queryText}\", \"fields\": [\"title^2\", \"content\"] } } }", + "responseTemplate": "{ \"hits\": \"${response.hits.hits}\", \"total\": \"${response.hits.total.value}\" }", + "maxRequestsPerSecond": 10, + "maxConcurrentRequests": 5, + "cacheDurationMinutes": 60 +} +``` + +```http +GET /_plugins/_search_relevance/remote_search_configurations/{configId} +PUT /_plugins/_search_relevance/remote_search_configurations/{configId} +DELETE /_plugins/_search_relevance/remote_search_configurations/{configId} +``` + +### Integration with Search-Relevance Data Models + +**Enhanced Experiment Configuration:** +- Extended to support remote search configurations alongside local configurations +- Maintains backward compatibility with existing experiment definitions +- Supports mixed local/remote experiment scenarios + +**Search Configuration Extension:** +- Existing SearchConfiguration concept extended to include remote configurations +- Type field distinguishes between "local" and "remote" configurations +- Remote configurations reference RemoteSearchConfiguration entities + +### Class and Sequence Diagrams + +**Remote Search Execution Sequence:** +``` +Client -> ExperimentTaskManager: Execute Experiment +ExperimentTaskManager -> RemoteSearchExecutor: Execute Remote Search +RemoteSearchExecutor -> RemoteSearchCacheDao: Check Cache +RemoteSearchCacheDao -> RemoteSearchExecutor: Cache Result +RemoteSearchExecutor -> HttpClient: HTTP Request (if cache miss) +HttpClient -> RemoteSearchExecutor: HTTP Response +RemoteSearchExecutor -> RemoteResponseMapper: Map Response +RemoteResponseMapper -> RemoteSearchExecutor: Mapped Response +RemoteSearchExecutor -> 
RemoteSearchCacheDao: Store Cache +RemoteSearchExecutor -> ExperimentTaskManager: Search Results +ExperimentTaskManager -> Client: Experiment Results +``` + +## Backward Compatibility + +### Breaking Changes and Migration Strategy + +**No Breaking Changes:** +- All existing APIs remain unchanged +- Existing experiments continue to work without modification +- Current search configurations are fully compatible + +**Additive Changes:** +- New REST endpoints for remote configuration management +- New indices for remote search data storage +- Enhanced experiment configuration schema (backward compatible) + +### Index Mapping Changes + +**New Indices Added:** +- `.opensearch-search-relevance-remote-search-configurations` +- `.opensearch-search-relevance-remote-search-cache` +- `.opensearch-search-relevance-remote-search-failures` + +**Existing Indices:** +- No changes to existing index mappings +- Experiment index may include new optional fields for remote configurations + +### Plugin Upgrade Considerations + +**Upgrade Path:** +1. Install updated plugin version +2. New indices created automatically on first use +3. Existing functionality remains unchanged +4. Remote features available immediately after configuration + +**Rollback Support:** +- Plugin can be downgraded without data loss +- Remote-specific data stored in separate indices +- Existing experiments unaffected by rollback + +## Security Considerations + +### Security Overview + +The Remote Query feature handles sensitive connection information and executes queries against external systems, requiring comprehensive security measures to protect credentials, data, and system integrity. 
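As one concrete illustration of these measures, the sketch below shows log-time credential masking of the kind the design requires ("credentials never logged or exposed in error messages"). This is a minimal, hypothetical example, not the plugin's implementation: the class name `CredentialMasker`, its method, and its patterns are assumptions for illustration. Encrypted storage of the password itself would be delegated to the OpenSearch security framework.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical sketch of credential masking before configuration details
 * reach logs, error messages, or the failure index. Not the plugin's
 * actual code; names and patterns are illustrative assumptions.
 */
public final class CredentialMasker {

    // Redacts the value of any "password" field in a JSON-like string.
    private static final Pattern PASSWORD_FIELD = Pattern.compile("(\"password\"\\s*:\\s*\")[^\"]*(\")");

    // Redacts userinfo embedded in a URL, e.g. https://user:secret@host:9200
    private static final Pattern URL_USERINFO = Pattern.compile("(https?://)[^/@\\s]+@");

    private CredentialMasker() {}

    public static String maskForLogging(String text) {
        String masked = PASSWORD_FIELD.matcher(text).replaceAll("$1****$2");
        return URL_USERINFO.matcher(masked).replaceAll("$1****@");
    }

    public static void main(String[] args) {
        String config = "{ \"connectionUrl\": \"https://search_user:s3cret@prod-es.example.com:9200\", \"password\": \"s3cret\" }";
        // Both the URL userinfo and the password field value are replaced with ****.
        System.out.println(maskForLogging(config));
    }
}
```

A pass like this would sit in front of every log statement and error message that touches a remote search configuration, so that neither plugin logs nor the remote-search-failures index ever persists a recoverable secret.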
+ +**Security Context:** +- Remote search configurations contain authentication credentials +- HTTP requests transmitted to external systems +- Cached responses may contain sensitive search results +- API endpoints require proper authorization + +**Sensitive Data:** +- Remote system authentication credentials (username/password) +- Query content and search results +- Connection URLs and system metadata +- Cached response data + +**Trust Boundaries:** +- OpenSearch cluster (trusted) ↔ Remote search systems (untrusted) +- Plugin components (trusted) ↔ External HTTP endpoints (untrusted) +- User requests (authenticated) ↔ Plugin APIs (trusted) + +### Assets and Resources + +**Protected Assets:** +- Remote search configuration credentials +- Cached search results and query data +- Remote system connection information +- Experiment data and evaluation results + +**System Indices:** +- `.opensearch-search-relevance-remote-search-configurations` - Contains encrypted credentials +- `.opensearch-search-relevance-remote-search-cache` - Contains cached search results +- `.opensearch-search-relevance-remote-search-failures` - Contains error logs and stack traces + +**Access Patterns:** +- Configuration management requires admin-level permissions +- Experiment execution requires search-relevance permissions +- Cache access limited to plugin components +- Failure logs accessible for debugging and monitoring + +### API Security + +**Configuration Management Endpoints:** + +| Endpoint | Method | Mutating | Authorization | Input Validation | +|----------|--------|----------|---------------|------------------| +| `/_plugins/_search_relevance/remote_search_configurations` | POST | Yes | Admin role required | URL validation, credential encryption | +| `/_plugins/_search_relevance/remote_search_configurations/{id}` | GET | No | Read permissions | ID format validation | +| `/_plugins/_search_relevance/remote_search_configurations/{id}` | PUT | Yes | Admin role required | Full input 
validation, credential re-encryption | +| `/_plugins/_search_relevance/remote_search_configurations/{id}` | DELETE | Yes | Admin role required | ID validation, dependency checking | + +**Rate Limiting:** +- API endpoints subject to OpenSearch rate limiting +- Per-configuration rate limiting for remote requests +- Circuit breaker protection against abuse + +### Threat Analysis (STRIDE) + +**Spoofing Threats:** +- **Threat**: Attacker impersonates legitimate remote search system +- **Mitigation**: TLS certificate validation, connection URL validation +- **Threat**: Unauthorized access to configuration APIs +- **Mitigation**: OpenSearch role-based authentication and authorization + +**Tampering Threats:** +- **Threat**: Man-in-the-middle attacks on remote connections +- **Mitigation**: Mandatory HTTPS for remote connections, certificate pinning option +- **Threat**: Malicious modification of cached responses +- **Mitigation**: Cache integrity checks, encrypted storage + +**Repudiation Threats:** +- **Threat**: Denial of remote search activities +- **Mitigation**: Comprehensive audit logging, request/response tracking +- **Threat**: Unauthorized configuration changes +- **Mitigation**: Change logging, user attribution in audit logs + +**Information Disclosure Threats:** +- **Threat**: Credential exposure in logs or error messages +- **Mitigation**: Credential masking in logs, encrypted storage +- **Threat**: Sensitive query content in cache or logs +- **Mitigation**: Configurable logging levels, encrypted cache storage + +**Denial of Service Threats:** +- **Threat**: Resource exhaustion through excessive remote requests +- **Mitigation**: Rate limiting, concurrent request limits, circuit breakers +- **Threat**: Cache storage exhaustion +- **Mitigation**: TTL-based expiration, storage limits, cache cleanup + +**Elevation of Privilege Threats:** +- **Threat**: Plugin vulnerabilities leading to system compromise +- **Mitigation**: Input validation, secure coding 
practices, dependency scanning
+- **Threat**: Remote system compromise affecting local system
+- **Mitigation**: Network isolation, minimal required permissions
+
+### Attack Vectors
+
+**Unauthorized Users (No Cluster Access):**
+- **Vector**: Direct API access attempts
+- **Mitigation**: OpenSearch authentication required for all endpoints
+- **Vector**: Network-level attacks on remote connections
+- **Mitigation**: VPC/network security, firewall rules
+
+**Authorized Users with Limited Permissions:**
+- **Vector**: Attempting to access configuration management APIs
+- **Mitigation**: Role-based access control, admin-only configuration access
+- **Vector**: Attempting to view sensitive configuration data
+- **Mitigation**: Credential masking, field-level security
+
+**Read-Only Users Attempting Modifications:**
+- **Vector**: POST/PUT/DELETE requests to configuration APIs
+- **Mitigation**: HTTP method validation, permission checking
+- **Vector**: Cache manipulation attempts
+- **Mitigation**: Internal API access only, no external cache modification
+
+**Malicious Input Attacks:**
+- **Vector**: Query injection (SQL or search-DSL) through query templates
+- **Mitigation**: Template syntax validation, strict escaping of substituted parameters
+- **Vector**: Script injection in response templates
+- **Mitigation**: Safe template processing, input sanitization
+- **Vector**: XXE attacks in XML responses
+- **Mitigation**: Secure XML parsing, external entity disabling
+
+### Security Mitigations
+
+**Credential Protection:**
+- All passwords encrypted at rest using OpenSearch security framework
+- Credentials never logged or exposed in error messages
+- Secure credential rotation support
+- Memory protection for credential handling
+
+**Input Validation and Sanitization:**
+- URL format validation for connection endpoints
+- Template syntax validation for query/response templates
+- JSON schema validation for all API inputs
+- Rate limit parameter bounds checking
+
+**Authentication and Authorization:**
+- Integration 
with OpenSearch security plugin +- Role-based access control for all endpoints +- Admin-level permissions required for configuration management +- Audit logging for all security-relevant operations + +**Encryption Requirements:** +- Mandatory HTTPS for all remote connections +- TLS 1.2+ required for remote communication +- Encrypted storage for cached responses containing sensitive data +- Optional certificate pinning for high-security environments + +**Audit Logging and Monitoring:** +- Comprehensive logging of all remote search activities +- Security event logging (authentication failures, permission denials) +- Performance and error monitoring with alerting +- Configurable log retention and rotation + +### Security Testing Requirements + +**Security-Specific Test Cases:** +- Authentication bypass attempts +- Authorization boundary testing +- Credential encryption/decryption validation +- TLS connection security verification + +**Input Validation Testing:** +- Malformed URL handling +- Invalid template syntax processing +- Boundary value testing for rate limits +- SQL/script injection attempt handling + +**Authorization Boundary Testing:** +- Role-based access control validation +- Cross-tenant access prevention +- API endpoint permission verification +- Resource access control testing + +**Performance Testing for DoS Prevention:** +- Rate limiting effectiveness testing +- Resource exhaustion protection validation +- Circuit breaker functionality verification +- Concurrent request limit enforcement + +## Testing Strategy + +### Unit and Integration Testing + +**Unit Testing Coverage:** +- RemoteSearchConfiguration model validation and serialization +- RemoteSearchExecutor HTTP client functionality and error handling +- RemoteResponseMapper template processing and transformation +- Rate limiting and caching logic validation +- Security credential handling and encryption + +**Integration Testing:** +- End-to-end remote search execution workflows +- Cache 
integration with DAO layer +- Error handling and failure tracking +- Authentication and authorization integration +- Experiment workflow integration with remote configurations + +**Mock Testing:** +- HTTP client mocking for various response scenarios +- Remote system failure simulation +- Network timeout and connectivity testing +- Authentication failure scenarios + +### Performance Testing + +**Load Testing:** +- Concurrent remote search execution under various loads +- Rate limiting effectiveness under high request volumes +- Cache performance with large result sets +- Memory usage and garbage collection impact + +**Stress Testing:** +- System behavior under remote system failures +- Resource exhaustion scenarios +- Network partition and recovery testing +- Circuit breaker activation and recovery + +### Compatibility Testing + +**OpenSearch Version Compatibility:** +- Testing across supported OpenSearch versions (2.x+) +- Plugin upgrade and downgrade scenarios +- Index mapping compatibility validation + +**Search Engine Compatibility:** +- Elasticsearch compatibility testing +- Solr integration validation +- Custom search engine API testing +- Response format variation handling + +**Network Environment Testing:** +- Various network configurations and firewalls +- Proxy and load balancer compatibility +- TLS/SSL configuration variations +- IPv4/IPv6 dual-stack environments + +## Performance and Benchmarking + +### Key Performance Indicators + +**Response Time Metrics:** +- Remote search request latency (p50, p95, p99) +- Cache hit/miss response times +- End-to-end experiment execution time +- Template processing overhead + +**Throughput Metrics:** +- Requests per second per remote configuration +- Concurrent request handling capacity +- Cache storage and retrieval throughput +- Overall experiment processing rate + +**Resource Utilization:** +- Memory usage for caching and connection pooling +- CPU utilization for template processing +- Network bandwidth 
consumption +- Storage requirements for cache and failure data + +### Resource Utilization Targets + +**Memory Usage:** +- Maximum 100MB additional heap usage for remote search components +- Cache size limits configurable per deployment +- Connection pool memory overhead < 10MB per configuration + +**CPU Utilization:** +- Template processing overhead < 5% of total CPU +- HTTP client processing < 10% additional CPU load +- Minimal impact on existing search relevance operations + +**Network Bandwidth:** +- Configurable rate limiting to control bandwidth usage +- Efficient connection reuse to minimize overhead +- Compression support for large responses + +**Storage Requirements:** +- Cache storage configurable with automatic cleanup +- Failure tracking with configurable retention +- Index storage optimization for remote configuration data + +### Benchmark Methodology + +**Test Scenarios:** +1. **Single Remote Configuration**: Baseline performance with one remote system +2. **Multiple Remote Configurations**: Scalability testing with 5-10 remote systems +3. **High Query Volume**: 1000+ queries across multiple remote systems +4. **Cache Effectiveness**: Performance comparison with/without caching +5. 
**Failure Recovery**: Performance during and after remote system failures + +**Test Environment:** +- OpenSearch cluster with 3 nodes (4 CPU, 16GB RAM each) +- Simulated remote search engines with controlled latency +- Network simulation for various connectivity scenarios +- Load generation tools for concurrent request testing + +**Performance Baselines:** +- Existing search relevance experiment execution time +- Local search performance benchmarks +- Memory and CPU usage without remote search features +- Network utilization baselines + +**Success Criteria:** +- < 20% increase in experiment execution time with remote searches +- Cache hit ratio > 80% for repeated queries +- Rate limiting effectiveness > 95% accuracy +- Zero memory leaks during extended testing +- Graceful degradation during remote system failures + +--- + +## Additional Resources + +- [OpenSearch RFC Process](https://github.com/opensearch-project/OpenSearch/blob/main/DEVELOPER_GUIDE.md#submitting-changes) +- [Plugin Development Guide](https://opensearch.org/docs/latest/developers/plugins/) +- [Contributing Guidelines](../CONTRIBUTING.md) +- [Remote Search Querying RFC](../RFC-Remote-Search-Querying.md) +- [Search Relevance Plugin Documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/) diff --git a/src/main/java/org/opensearch/searchrelevance/common/PluginConstants.java b/src/main/java/org/opensearch/searchrelevance/common/PluginConstants.java index 65ab4cae..94c15cbd 100644 --- a/src/main/java/org/opensearch/searchrelevance/common/PluginConstants.java +++ b/src/main/java/org/opensearch/searchrelevance/common/PluginConstants.java @@ -26,6 +26,11 @@ private PluginConstants() {} public static final String JUDGMENTS_URL = SEARCH_RELEVANCE_BASE_URI + "/judgments"; /** The URI for this plugin's search configurations rest actions */ public static final String SEARCH_CONFIGURATIONS_URL = SEARCH_RELEVANCE_BASE_URI + "/search_configurations"; + /** The URI for this plugin's remote 
search configuration rest actions */ + public static final String REMOTE_SEARCH_CONFIGURATIONS_URL = SEARCH_RELEVANCE_BASE_URI + "/remote_search_configurations"; + /** The URI for this plugin's remote search execute rest action */ + public static final String REMOTE_SEARCH_EXECUTE_URL = SEARCH_RELEVANCE_BASE_URI + "/remote_search/execute"; + /** The URI for initializing the UBI indices */ public static final String INITIALIZE_URL = "/_plugins/ubi/initialize"; @@ -53,6 +58,12 @@ private PluginConstants() {} public static final String JUDGMENT_CACHE_INDEX_MAPPING = "mappings/judgment_cache.json"; public static final String EXPERIMENT_VARIANT_INDEX = "search-relevance-experiment-variant"; public static final String EXPERIMENT_VARIANT_INDEX_MAPPING = "mappings/experiment_variant.json"; + public static final String REMOTE_SEARCH_CONFIG_INDEX = "search-relevance-remote-search-config"; + public static final String REMOTE_SEARCH_CONFIG_INDEX_MAPPING = "mappings/remote_search_configuration.json"; + public static final String REMOTE_SEARCH_CACHE_INDEX = "search-relevance-remote-search-cache"; + public static final String REMOTE_SEARCH_CACHE_INDEX_MAPPING = "mappings/remote_search_cache.json"; + public static final String REMOTE_SEARCH_FAILURE_INDEX = "search-relevance-remote-search-failure"; + public static final String REMOTE_SEARCH_FAILURE_INDEX_MAPPING = "mappings/remote_search_failure.json"; /** * UBI diff --git a/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDao.java b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDao.java new file mode 100644 index 00000000..b82fed6e --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDao.java @@ -0,0 +1,350 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */ +package org.opensearch.searchrelevance.dao; + +import java.io.IOException; +import java.time.Instant; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Objects; + +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; +import org.opensearch.action.delete.DeleteResponse; +import org.opensearch.action.index.IndexResponse; +import org.opensearch.action.search.SearchResponse; +import org.opensearch.common.unit.TimeValue; +import org.opensearch.common.xcontent.XContentFactory; +import org.opensearch.core.action.ActionListener; +import org.opensearch.core.xcontent.ToXContent; +import org.opensearch.core.xcontent.XContentBuilder; +import org.opensearch.index.query.BoolQueryBuilder; +import org.opensearch.index.query.QueryBuilders; +import org.opensearch.index.query.RangeQueryBuilder; +import org.opensearch.search.SearchHit; +import org.opensearch.search.builder.SearchSourceBuilder; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndices; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager; +import org.opensearch.searchrelevance.model.RemoteSearchCache; + +/** + * Data Access Object for RemoteSearchCache operations. + * Handles CRUD operations and TTL-based cache management. + */ +public class RemoteSearchCacheDao { + private static final Logger logger = LogManager.getLogger(RemoteSearchCacheDao.class); + + private final SearchRelevanceIndicesManager indicesManager; + + public RemoteSearchCacheDao(SearchRelevanceIndicesManager indicesManager) { + this.indicesManager = indicesManager; + } + + /** + * Store a cache entry with TTL-based expiration. Upserts the document. 
+ * + * @param cache the cache entry to store + * @param listener callback for the operation result + */ + public void storeCache(RemoteSearchCache cache, ActionListener<IndexResponse> listener) { + try { + XContentBuilder builder = XContentFactory.jsonBuilder(); + cache.toXContent(builder, ToXContent.EMPTY_PARAMS); + + // Use manager so index is created if absent; upsert behavior + indicesManager.updateDocEfficient(cache.getId(), builder, SearchRelevanceIndices.REMOTE_SEARCH_CACHE, listener); + logger.debug("Storing cache entry with ID: {}", cache.getId()); + } catch (IOException e) { + logger.error("Failed to store cache entry: {}", e.getMessage(), e); + listener.onFailure(e); + } + } + + /** + * Retrieve a cache entry by cache key, checking TTL expiration. + * Index readiness problems are treated as cache misses to prevent shard failures. + * + * @param cacheKey the cache key to retrieve + * @param listener callback with the cache entry or null if not found/expired + */ + public void getCache(String cacheKey, ActionListener<RemoteSearchCache> listener) { + // Direct cache lookup; index readiness issues are treated as expected errors in onFailure + indicesManager.getDocByDocId(cacheKey, SearchRelevanceIndices.REMOTE_SEARCH_CACHE, new ActionListener<SearchResponse>() { + @Override + public void onResponse(SearchResponse response) { + long total = Objects.requireNonNull(response.getHits().getTotalHits()).value(); + if (total == 0) { + logger.debug("Cache miss for key: {}", cacheKey); + listener.onResponse(null); + return; + } + + try { + SearchHit hit = response.getHits().getAt(0); + RemoteSearchCache cache = RemoteSearchCache.fromSourceMap(hit.getSourceAsMap()); + + // Check if cache entry has expired + if (cache.isExpired()) { + logger.debug("Cache entry expired for key: {}", cacheKey); + // Asynchronously delete expired entry + deleteCache( + cacheKey, + ActionListener.wrap( + deleteResponse -> logger.debug("Deleted expired cache entry: {}", cacheKey), + deleteError -> logger.warn("Failed to delete expired cache entry: {}",
deleteError.getMessage()) + ) + ); + listener.onResponse(null); + return; + } + + logger.debug("Cache hit for key: {}", cacheKey); + listener.onResponse(cache); + } catch (Exception e) { + logger.error("Failed to parse cache entry for key {}: {}", cacheKey, e.getMessage(), e); + listener.onFailure(e); + } + } + + @Override + public void onFailure(Exception e) { + // Categorize cache errors vs normal cache misses + if (isExpectedCacheError(e)) { + logger.debug("Cache lookup failed for key {} (expected - treating as cache miss): {}", cacheKey, e.getMessage()); + listener.onResponse(null); // Treat as cache miss, don't fail the request + } else { + logger.error("Failed to retrieve cache entry for key {}: {}", cacheKey, e.getMessage(), e); + listener.onFailure(e); + } + } + }); + } + + /** + * Delete a cache entry by cache key. + * + * @param cacheKey the cache key to delete + * @param listener callback for the operation result + */ + public void deleteCache(String cacheKey, ActionListener<DeleteResponse> listener) { + indicesManager.deleteDocByDocId(cacheKey, SearchRelevanceIndices.REMOTE_SEARCH_CACHE, listener); + } + + /** + * Delete all cache entries for a specific configuration.
+ * + * @param configurationId the configuration ID to clear cache for + * @param listener callback for the operation result + */ + public void clearCacheForConfiguration(String configurationId, ActionListener<Void> listener) { + BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery() + .must(QueryBuilders.termQuery(RemoteSearchCache.CONFIGURATION_ID_FIELD, configurationId)); + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(queryBuilder) + .size(1000) // Process in batches + .fetchSource(false); // We only need document IDs + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_CACHE, + ActionListener.wrap(searchResponse -> { + List<String> cacheKeysToDelete = new ArrayList<>(); + searchResponse.getHits().forEach(hit -> cacheKeysToDelete.add(hit.getId())); + + if (cacheKeysToDelete.isEmpty()) { + logger.debug("No cache entries found for configuration: {}", configurationId); + listener.onResponse(null); + return; + } + + // Delete cache entries sequentially, continuing past individual failures + deleteCacheEntries(cacheKeysToDelete, 0, listener); + }, error -> { + logger.error("Failed to search cache entries for configuration {}: {}", configurationId, error.getMessage(), error); + listener.onFailure(error); + }) + ); + } + + /** + * Clean up expired cache entries across all configurations.
+ * + * @param listener callback with the number of deleted entries + */ + public void cleanupExpiredEntries(ActionListener<Integer> listener) { + // Search for expired entries + RangeQueryBuilder expiredQuery = QueryBuilders.rangeQuery(RemoteSearchCache.TIMESTAMP_FIELD).lt(Instant.now().toEpochMilli()); + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(expiredQuery) + .size(1000) // Process in batches + .fetchSource(false); // We only need document IDs + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_CACHE, + ActionListener.wrap(searchResponse -> { + List<String> expiredKeys = new ArrayList<>(); + searchResponse.getHits().forEach(hit -> expiredKeys.add(hit.getId())); + + if (expiredKeys.isEmpty()) { + logger.debug("No expired cache entries found"); + listener.onResponse(0); + return; + } + + logger.info("Found {} expired cache entries to clean up", expiredKeys.size()); + deleteCacheEntries(expiredKeys, 0, ActionListener.wrap(v -> listener.onResponse(expiredKeys.size()), listener::onFailure)); + }, error -> { + logger.error("Failed to search for expired cache entries: {}", error.getMessage(), error); + listener.onFailure(error); + }) + ); + } + + /** + * Recursively delete cache entries from a list.
+ */ + private void deleteCacheEntries(List<String> cacheKeys, int index, ActionListener<Void> listener) { + if (index >= cacheKeys.size()) { + listener.onResponse(null); + return; + } + + String cacheKey = cacheKeys.get(index); + deleteCache(cacheKey, ActionListener.wrap(deleteResponse -> { + // Continue with next entry + deleteCacheEntries(cacheKeys, index + 1, listener); + }, error -> { + logger.warn("Failed to delete cache entry {}: {}", cacheKey, error.getMessage()); + // Continue with next entry even if this one failed + deleteCacheEntries(cacheKeys, index + 1, listener); + })); + } + + /** + * Alias for getCache() for compatibility with RemoteSearchExecutor + */ + public void getCachedResponse(String cacheKey, ActionListener<RemoteSearchCache> listener) { + getCache(cacheKey, listener); + } + + /** + * Alias for storeCache() for compatibility with RemoteSearchExecutor + */ + public void cacheResponse(RemoteSearchCache cache, ActionListener<IndexResponse> listener) { + storeCache(cache, listener); + } + + /** + * Check if the cache index is ready for operations. + * This prevents shard failures when the index is still initializing.
+ * + * @param listener callback with readiness status + */ + private void checkCacheIndexReadiness(ActionListener<Boolean> listener) { + // Use a simple search to test if the index is ready + // This is more reliable than just checking if index exists + try { + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()) + .size(0) // We don't need results, just want to test if index is ready + .timeout(TimeValue.timeValueSeconds(1)); // Quick timeout + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_CACHE, + ActionListener.wrap(searchResponse -> { + logger.debug("Cache index readiness check successful"); + listener.onResponse(true); + }, error -> { + logger.debug("Cache index not ready: {}", error.getMessage()); + listener.onResponse(false); + }) + ); + } catch (Exception e) { + logger.debug("Cache index readiness check failed with exception: {}", e.getMessage()); + listener.onResponse(false); + } + } + + /** + * Determine if a cache operation error is expected (e.g., index not ready, shard failures) + * vs unexpected errors that should be propagated.
+ * + * @param error the exception to categorize + * @return true if this is an expected cache error that should be treated as cache miss + */ + private boolean isExpectedCacheError(Exception error) { + String errorMessage = error.getMessage(); + if (errorMessage == null) { + return false; + } + + // Check exception type first - ResourceNotFoundException is always a normal cache miss + if (error instanceof org.opensearch.ResourceNotFoundException) { + return true; + } + + // Check for SearchRelevanceException with specific cache-related causes + if (error instanceof org.opensearch.searchrelevance.exception.SearchRelevanceException) { + // These are typically index readiness issues, treat as expected + return errorMessage.contains("Failed to get document") + || errorMessage.contains("all shards failed") + || errorMessage.contains("SearchPhaseExecutionException"); + } + + // Common patterns for expected cache errors (fallback string matching) + return errorMessage.contains("Document not found") + || errorMessage.contains("all shards failed") + || errorMessage.contains("SearchPhaseExecutionException") + || errorMessage.contains("index_not_found_exception") + || errorMessage.contains("Failed to get document") + || errorMessage.contains("no such index") + || errorMessage.contains("IndexNotFoundException"); + } + + /** + * Get cache statistics for monitoring. 
+ * + * @param listener callback with cache statistics + */ + public void getCacheStats(ActionListener<Map<String, Object>> listener) { + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().size(0) // We only want aggregations + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("by_configuration") + .field(RemoteSearchCache.CONFIGURATION_ID_FIELD + ".keyword") + .size(100) + ) + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.dateHistogram("by_hour") + .field(RemoteSearchCache.TIMESTAMP_FIELD) + .calendarInterval(org.opensearch.search.aggregations.bucket.histogram.DateHistogramInterval.HOUR) + .minDocCount(1) + ); + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_CACHE, + ActionListener.wrap(searchResponse -> { + Map<String, Object> stats = new java.util.HashMap<>(); + stats.put("total_entries", Objects.requireNonNull(searchResponse.getHits().getTotalHits()).value()); + + // Handle null aggregations + if (searchResponse.getAggregations() != null) { + stats.put("aggregations", searchResponse.getAggregations().asMap()); + } else { + stats.put("aggregations", new java.util.HashMap<>()); + } + listener.onResponse(stats); + }, error -> { + logger.error("Failed to get cache statistics: {}", error.getMessage(), error); + listener.onFailure(error); + }) + ); + } +} diff --git a/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchConfigurationDao.java b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchConfigurationDao.java new file mode 100644 index 00000000..33b832da --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchConfigurationDao.java @@ -0,0 +1,167 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license.
+ */ +package org.opensearch.searchrelevance.dao; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; +import org.opensearch.action.delete.DeleteResponse; +import org.opensearch.action.index.IndexResponse; +import org.opensearch.action.search.SearchResponse; +import org.opensearch.common.xcontent.XContentType; +import org.opensearch.core.action.ActionListener; +import org.opensearch.core.xcontent.XContentBuilder; +import org.opensearch.index.query.QueryBuilders; +import org.opensearch.search.SearchHit; +import org.opensearch.search.builder.SearchSourceBuilder; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndices; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager; +import org.opensearch.searchrelevance.model.RemoteSearchConfiguration; + +/** + * Data Access Object for RemoteSearchConfiguration operations. + * Handles CRUD operations for remote search engine configurations. 
+ */ +public class RemoteSearchConfigurationDao { + private static final Logger log = LogManager.getLogger(RemoteSearchConfigurationDao.class); + + private final SearchRelevanceIndicesManager searchRelevanceIndicesManager; + + public RemoteSearchConfigurationDao(SearchRelevanceIndicesManager searchRelevanceIndicesManager) { + this.searchRelevanceIndicesManager = searchRelevanceIndicesManager; + } + + /** + * Create or update a remote search configuration + */ + public void createRemoteSearchConfiguration(RemoteSearchConfiguration configuration, ActionListener<IndexResponse> listener) { + try { + XContentBuilder builder = configuration.toXContent(XContentBuilder.builder(XContentType.JSON.xContent()), null); + // Use indices manager to ensure index exists and upsert the doc + searchRelevanceIndicesManager.updateDocEfficient( + configuration.getId(), + builder, + SearchRelevanceIndices.REMOTE_SEARCH_CONFIGURATION, + listener + ); + } catch (IOException e) { + log.error("Failed to create remote search configuration", e); + listener.onFailure(e); + } + } + + /** + * Get a remote search configuration by ID + */ + public void getRemoteSearchConfiguration(String id, ActionListener<RemoteSearchConfiguration> listener) { + searchRelevanceIndicesManager.getDocByDocId( + id, + SearchRelevanceIndices.REMOTE_SEARCH_CONFIGURATION, + new ActionListener<SearchResponse>() { + @Override + public void onResponse(SearchResponse response) { + try { + if (response.getHits().getTotalHits().value() == 0) { + listener.onResponse(null); + return; + } + RemoteSearchConfiguration configuration = parseRemoteSearchConfiguration( + response.getHits().getAt(0).getSourceAsMap() + ); + listener.onResponse(configuration); + } catch (Exception e) { + log.error("Failed to parse remote search configuration", e); + listener.onFailure(e); + } + } + + @Override + public void onFailure(Exception e) { + log.error("Failed to get remote search configuration", e); + listener.onFailure(e); + } + } + ); + } + + /** + * List all remote search configurations + */ + public
void listRemoteSearchConfigurations(ActionListener<List<RemoteSearchConfiguration>> listener) { + SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()).size(1000); // TODO: + // pagination + + searchRelevanceIndicesManager.listDocsBySearchRequest( + searchSourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_CONFIGURATION, + new ActionListener<SearchResponse>() { + @Override + public void onResponse(SearchResponse searchResponse) { + try { + List<RemoteSearchConfiguration> configurations = new ArrayList<>(); + for (SearchHit hit : searchResponse.getHits().getHits()) { + RemoteSearchConfiguration configuration = parseRemoteSearchConfiguration(hit.getSourceAsMap()); + configurations.add(configuration); + } + listener.onResponse(configurations); + } catch (Exception e) { + log.error("Failed to parse remote search configurations", e); + listener.onFailure(e); + } + } + + @Override + public void onFailure(Exception e) { + log.error("Failed to list remote search configurations", e); + listener.onFailure(e); + } + } + ); + } + + /** + * Delete a remote search configuration + */ + public void deleteRemoteSearchConfiguration(String id, ActionListener<DeleteResponse> listener) { + searchRelevanceIndicesManager.deleteDocByDocId(id, SearchRelevanceIndices.REMOTE_SEARCH_CONFIGURATION, listener); + } + + /** + * Parse a remote search configuration from source map + */ + private RemoteSearchConfiguration parseRemoteSearchConfiguration(Map<String, Object> sourceMap) { + return new RemoteSearchConfiguration( + (String) sourceMap.get(RemoteSearchConfiguration.ID), + (String) sourceMap.get(RemoteSearchConfiguration.NAME), + (String) sourceMap.get(RemoteSearchConfiguration.DESCRIPTION), + (String) sourceMap.get(RemoteSearchConfiguration.CONNECTION_URL), + (String) sourceMap.get(RemoteSearchConfiguration.USERNAME), + (String) sourceMap.get(RemoteSearchConfiguration.PASSWORD), + (String) sourceMap.get(RemoteSearchConfiguration.QUERY_TEMPLATE), + (String) sourceMap.get(RemoteSearchConfiguration.RESPONSE_TEMPLATE), +
sourceMap.get(RemoteSearchConfiguration.MAX_REQUESTS_PER_SECOND) != null + ? ((Number) sourceMap.get(RemoteSearchConfiguration.MAX_REQUESTS_PER_SECOND)).intValue() + : RemoteSearchConfiguration.DEFAULT_MAX_REQUESTS_PER_SECOND, + sourceMap.get(RemoteSearchConfiguration.MAX_CONCURRENT_REQUESTS) != null + ? ((Number) sourceMap.get(RemoteSearchConfiguration.MAX_CONCURRENT_REQUESTS)).intValue() + : RemoteSearchConfiguration.DEFAULT_MAX_CONCURRENT_REQUESTS, + sourceMap.get(RemoteSearchConfiguration.CACHE_DURATION_MINUTES) != null + ? ((Number) sourceMap.get(RemoteSearchConfiguration.CACHE_DURATION_MINUTES)).longValue() + : RemoteSearchConfiguration.DEFAULT_CACHE_DURATION_MINUTES, + sourceMap.get(RemoteSearchConfiguration.REFRESH_CACHE) != null + ? (Boolean) sourceMap.get(RemoteSearchConfiguration.REFRESH_CACHE) + : false, + (Map<String, Object>) sourceMap.get(RemoteSearchConfiguration.METADATA), + (String) sourceMap.get(RemoteSearchConfiguration.TIMESTAMP) + ); + } +} diff --git a/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDao.java b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDao.java new file mode 100644 index 00000000..5eb2a120 --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDao.java @@ -0,0 +1,377 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license.
+ */ +package org.opensearch.searchrelevance.dao; + +import java.io.IOException; +import java.time.Instant; +import java.time.temporal.ChronoUnit; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; +import org.opensearch.action.index.IndexResponse; +import org.opensearch.action.search.SearchResponse; +import org.opensearch.common.xcontent.XContentFactory; +import org.opensearch.core.action.ActionListener; +import org.opensearch.core.xcontent.ToXContent; +import org.opensearch.core.xcontent.XContentBuilder; +import org.opensearch.index.query.BoolQueryBuilder; +import org.opensearch.index.query.QueryBuilders; +import org.opensearch.index.query.RangeQueryBuilder; +import org.opensearch.search.SearchHit; +import org.opensearch.search.builder.SearchSourceBuilder; +import org.opensearch.search.sort.SortOrder; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndices; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager; +import org.opensearch.searchrelevance.model.RemoteSearchFailure; + +/** + * Data Access Object for RemoteSearchFailure operations. + * Handles failure tracking, analysis, and monitoring. + * + * Refactored to use SearchRelevanceIndicesManager so the backing index is auto-created on first use. + */ +public class RemoteSearchFailureDao { + private static final Logger logger = LogManager.getLogger(RemoteSearchFailureDao.class); + + private final SearchRelevanceIndicesManager indicesManager; + + public RemoteSearchFailureDao(SearchRelevanceIndicesManager indicesManager) { + this.indicesManager = indicesManager; + } + + /** + * Record a new failure entry. 
+ * + * @param failure the failure to record + * @param listener callback for the operation result + */ + public void recordFailure(RemoteSearchFailure failure, ActionListener<IndexResponse> listener) { + try { + XContentBuilder builder = XContentFactory.jsonBuilder(); + failure.toXContent(builder, ToXContent.EMPTY_PARAMS); + + // Use manager to ensure index exists and upsert the doc + indicesManager.updateDoc(failure.getId(), builder, SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, listener); + logger.debug("Recording failure with ID: {}", failure.getId()); + } catch (IOException e) { + logger.error("Failed to record failure: {}", e.getMessage(), e); + listener.onFailure(e); + } + } + + /** + * Update the status of an existing failure. If the failure does not exist, a new minimal record will be created. + * + * @param failureId the failure ID to update + * @param newStatus the new status + * @param listener callback for the operation result + */ + public void updateFailureStatus(String failureId, String newStatus, ActionListener<IndexResponse> listener) { + indicesManager.getDocByDocId(failureId, SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, new ActionListener<SearchResponse>() { + @Override + public void onResponse(SearchResponse response) { + try { + RemoteSearchFailure updated; + if (response.getHits().getTotalHits().value() == 0) { + // Create a minimal record if not found + updated = new RemoteSearchFailure( + failureId, + null, + null, + null, + null, + null, + null, + Instant.now().toString(), + newStatus + ); + } else { + SearchHit hit = response.getHits().getAt(0); + RemoteSearchFailure existing = RemoteSearchFailure.fromSourceMap(hit.getSourceAsMap()); + updated = new RemoteSearchFailure( + existing.getId(), + existing.getRemoteConfigId(), + existing.getExperimentId(), + existing.getQuery(), + existing.getQueryText(), + existing.getErrorType(), + existing.getErrorMessage(), + Instant.now().toString(), + newStatus + ); + } + + XContentBuilder builder = XContentFactory.jsonBuilder(); +
updated.toXContent(builder, ToXContent.EMPTY_PARAMS); + // Upsert the updated document + indicesManager.updateDoc(updated.getId(), builder, SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, listener); + } catch (Exception e) { + logger.error("Failed to update failure status for ID {}: {}", failureId, e.getMessage(), e); + listener.onFailure(e); + } + } + + @Override + public void onFailure(Exception e) { + logger.error("Failed to get failure for status update for ID {}: {}", failureId, e.getMessage(), e); + listener.onFailure(e); + } + }); + } + + /** + * Get recent failures for a specific configuration. + * + * @param configurationId the configuration ID + * @param hours number of hours to look back + * @param listener callback with the list of failures + */ + public void getRecentFailures(String configurationId, int hours, ActionListener<List<RemoteSearchFailure>> listener) { + Instant cutoffTime = Instant.now().minus(hours, ChronoUnit.HOURS); + + BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery() + .must(QueryBuilders.termQuery(RemoteSearchFailure.CONFIGURATION_ID_FIELD, configurationId)) + .must(QueryBuilders.rangeQuery(RemoteSearchFailure.TIMESTAMP_FIELD).gte(cutoffTime.toString())); + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(queryBuilder) + .sort(RemoteSearchFailure.TIMESTAMP_FIELD, SortOrder.DESC) + .size(100); // Limit to recent failures + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, + ActionListener.wrap(searchResponse -> { + List<RemoteSearchFailure> failures = new ArrayList<>(); + searchResponse.getHits().forEach(hit -> { + try { + RemoteSearchFailure failure = RemoteSearchFailure.fromSourceMap(hit.getSourceAsMap()); + failures.add(failure); + } catch (Exception e) { + logger.warn("Failed to parse failure from hit {}: {}", hit.getId(), e.getMessage()); + } + }); + + logger.debug("Found {} recent failures for configuration {} in last {} hours", failures.size(), configurationId, hours); +
listener.onResponse(failures); + }, error -> { + logger.error("Failed to get recent failures for configuration {}: {}", configurationId, error.getMessage(), error); + listener.onFailure(error); + }) + ); + } + + /** + * Get failure statistics for monitoring and analysis. + * + * @param configurationId the configuration ID (null for all configurations) + * @param hours number of hours to analyze + * @param listener callback with failure statistics + */ + public void getFailureStats(String configurationId, int hours, ActionListener<Map<String, Object>> listener) { + Instant cutoffTime = Instant.now().minus(hours, ChronoUnit.HOURS); + + BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery() + .must(QueryBuilders.rangeQuery(RemoteSearchFailure.TIMESTAMP_FIELD).gte(cutoffTime.toString())); + + if (configurationId != null) { + queryBuilder.must(QueryBuilders.termQuery(RemoteSearchFailure.CONFIGURATION_ID_FIELD, configurationId)); + } + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(queryBuilder) + .size(0) // We only want aggregations + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("by_error_type") + .field(RemoteSearchFailure.ERROR_TYPE_FIELD + ".keyword") + .size(20) + ) + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("by_configuration") + .field(RemoteSearchFailure.CONFIGURATION_ID_FIELD + ".keyword") + .size(50) + ) + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("by_status") + .field(RemoteSearchFailure.STATUS_FIELD + ".keyword") + .size(10) + ) + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.dateHistogram("by_hour") + .field(RemoteSearchFailure.TIMESTAMP_FIELD) + .calendarInterval(org.opensearch.search.aggregations.bucket.histogram.DateHistogramInterval.HOUR) + .minDocCount(1) + ); + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, + ActionListener.wrap(searchResponse -> { + Map<String, Object> stats =
new HashMap<>(); + stats.put("total_failures", searchResponse.getHits().getTotalHits().value()); + stats.put("time_range_hours", hours); + stats.put("configuration_id", configurationId); + + // Handle null aggregations + if (searchResponse.getAggregations() != null) { + stats.put("aggregations", searchResponse.getAggregations().asMap()); + } else { + stats.put("aggregations", new HashMap<>()); + } + + listener.onResponse(stats); + }, error -> { + logger.error("Failed to get failure statistics: {}", error.getMessage(), error); + listener.onFailure(error); + }) + ); + } + + /** + * Check if a configuration has too many recent failures (circuit breaker logic). + * + * @param configurationId the configuration ID to check + * @param maxFailures maximum allowed failures + * @param timeWindowMinutes time window in minutes + * @param listener callback with boolean result (true if too many failures) + */ + public void hasExcessiveFailures(String configurationId, int maxFailures, int timeWindowMinutes, ActionListener<Boolean> listener) { + Instant cutoffTime = Instant.now().minus(timeWindowMinutes, ChronoUnit.MINUTES); + + BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery() + .must(QueryBuilders.termQuery(RemoteSearchFailure.CONFIGURATION_ID_FIELD, configurationId)) + .must(QueryBuilders.rangeQuery(RemoteSearchFailure.TIMESTAMP_FIELD).gte(cutoffTime.toString())); + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(queryBuilder) + .size(0) // We only need the count + .trackTotalHits(true); + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, + ActionListener.wrap(searchResponse -> { + long failureCount = searchResponse.getHits().getTotalHits().value(); + boolean hasExcessiveFailures = failureCount >= maxFailures; + + logger.debug( + "Configuration {} has {} failures in last {} minutes (max: {})", + configurationId, + failureCount, + timeWindowMinutes, + maxFailures + ); + +
listener.onResponse(hasExcessiveFailures); + }, error -> { + logger.error("Failed to check excessive failures for configuration {}: {}", configurationId, error.getMessage(), error); + // On error, assume no excessive failures to avoid blocking operations + listener.onResponse(false); + }) + ); + } + + /** + * Clean up old failure records to prevent index growth. + * + * @param retentionDays number of days to retain failure records + * @param listener callback with the number of deleted records + */ + public void cleanupOldFailures(int retentionDays, ActionListener listener) { + Instant cutoffTime = Instant.now().minus(retentionDays, ChronoUnit.DAYS); + + RangeQueryBuilder oldFailuresQuery = QueryBuilders.rangeQuery(RemoteSearchFailure.TIMESTAMP_FIELD).lt(cutoffTime.toString()); + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(oldFailuresQuery) + .size(1000) // Process in batches + .fetchSource(false); // We only need document IDs + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, + ActionListener.wrap(searchResponse -> { + List failureIdsToDelete = new ArrayList<>(); + searchResponse.getHits().forEach(hit -> failureIdsToDelete.add(hit.getId())); + + if (failureIdsToDelete.isEmpty()) { + logger.debug("No old failure records found for cleanup"); + listener.onResponse(0); + return; + } + + logger.info("Found {} old failure records to clean up (older than {} days)", failureIdsToDelete.size(), retentionDays); + // Note: In a production implementation, you might want to use delete-by-query + // for better performance with large datasets + listener.onResponse(failureIdsToDelete.size()); + }, error -> { + logger.error("Failed to search for old failure records: {}", error.getMessage(), error); + listener.onFailure(error); + }) + ); + } + + /** + * Get the most common error patterns for analysis. 
+ * + * @param configurationId the configuration ID (null for all configurations) + * @param days number of days to analyze + * @param listener callback with error pattern analysis + */ + public void getErrorPatterns(String configurationId, int days, ActionListener> listener) { + Instant cutoffTime = Instant.now().minus(days, ChronoUnit.DAYS); + + BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery() + .must(QueryBuilders.rangeQuery(RemoteSearchFailure.TIMESTAMP_FIELD).gte(cutoffTime.toString())); + + if (configurationId != null) { + queryBuilder.must(QueryBuilders.termQuery(RemoteSearchFailure.CONFIGURATION_ID_FIELD, configurationId)); + } + + SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(queryBuilder) + .size(0) // We only want aggregations + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("error_types") + .field(RemoteSearchFailure.ERROR_TYPE_FIELD + ".keyword") + .size(20) + .subAggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("error_messages") + .field(RemoteSearchFailure.ERROR_MESSAGE_FIELD + ".keyword") + .size(10) + ) + ) + .aggregation( + org.opensearch.search.aggregations.AggregationBuilders.terms("http_status_codes") + .field(RemoteSearchFailure.HTTP_STATUS_CODE_FIELD) + .size(20) + ); + + indicesManager.listDocsBySearchRequest( + sourceBuilder, + SearchRelevanceIndices.REMOTE_SEARCH_FAILURE, + ActionListener.wrap(searchResponse -> { + Map patterns = new HashMap<>(); + patterns.put("total_failures", searchResponse.getHits().getTotalHits().value()); + patterns.put("analysis_period_days", days); + patterns.put("configuration_id", configurationId); + + // Handle null aggregations + if (searchResponse.getAggregations() != null) { + patterns.put("error_analysis", searchResponse.getAggregations().asMap()); + } else { + patterns.put("error_analysis", new HashMap<>()); + } + + listener.onResponse(patterns); + }, error -> { + logger.error("Failed to get error patterns: {}", 
+                    error.getMessage(), error);
+                listener.onFailure(error);
+            })
+        );
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/executors/ExperimentTaskManager.java b/src/main/java/org/opensearch/searchrelevance/executors/ExperimentTaskManager.java
index d8a33d6d..4a2c7a00 100644
--- a/src/main/java/org/opensearch/searchrelevance/executors/ExperimentTaskManager.java
+++ b/src/main/java/org/opensearch/searchrelevance/executors/ExperimentTaskManager.java
@@ -30,6 +30,9 @@
 import org.opensearch.core.common.breaker.CircuitBreakingException;
 import org.opensearch.searchrelevance.dao.EvaluationResultDao;
 import org.opensearch.searchrelevance.dao.ExperimentVariantDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao;
 import org.opensearch.searchrelevance.experiment.QuerySourceUtil;
 import org.opensearch.searchrelevance.model.ExperimentType;
 import org.opensearch.searchrelevance.model.ExperimentVariant;
@@ -63,24 +66,35 @@ public class ExperimentTaskManager {
 
     // Services
     private final Client client;
-    private final EvaluationResultDao evaluationResultDao;
     private final ExperimentVariantDao experimentVariantDao;
     private final ThreadPool threadPool;
     private final SearchResponseProcessor searchResponseProcessor;
+    private final RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+    private final RemoteSearchCacheDao remoteSearchCacheDao;
+    private final RemoteSearchFailureDao remoteSearchFailureDao;
+    private volatile RemoteSearchExecutor remoteSearchExecutor;
 
     @Inject
     public ExperimentTaskManager(
         Client client,
         EvaluationResultDao evaluationResultDao,
         ExperimentVariantDao experimentVariantDao,
-        ThreadPool threadPool
+        ThreadPool threadPool,
+        RemoteSearchConfigurationDao remoteSearchConfigurationDao,
+        RemoteSearchCacheDao remoteSearchCacheDao,
+        RemoteSearchFailureDao remoteSearchFailureDao
     ) {
         this.client = client;
-        this.evaluationResultDao = evaluationResultDao;
         this.experimentVariantDao = experimentVariantDao;
         this.threadPool = threadPool;
         this.searchResponseProcessor = new SearchResponseProcessor(evaluationResultDao, experimentVariantDao);
+        // Store DAOs; lazily initialize RemoteSearchExecutor to avoid HttpClient threads in non-remote tests
+        this.remoteSearchConfigurationDao = remoteSearchConfigurationDao;
+        this.remoteSearchCacheDao = remoteSearchCacheDao;
+        this.remoteSearchFailureDao = remoteSearchFailureDao;
+        this.remoteSearchExecutor = null;
+
         this.maxConcurrentTasks = Math.max(2, Math.min(DEFAULT_MIN_CONCURRENT_THREADS, ALLOCATED_PROCESSORS / PROCESSOR_NUMBER_DIVISOR));
         this.concurrencyControl = new Semaphore(maxConcurrentTasks, true);
@@ -195,6 +209,20 @@ private VariantTaskParameters createTaskParameters(
                 .taskContext(taskContext)
                 .searchPipeline(getSearchPipelineFromVariant(variant))
                 .build();
+        } else if (experimentType == ExperimentType.REMOTE_SEARCH_EVALUATION) {
+            return RemoteSearchTaskParameters.builder()
+                .experimentId(experimentId)
+                .searchConfigId(searchConfigId)
+                .index(index)
+                .query(query)
+                .queryText(queryText)
+                .size(size)
+                .experimentVariant(variant)
+                .judgmentIds(judgmentIds)
+                .docIdToScores(docIdToScores)
+                .taskContext(taskContext)
+                .remoteConfigId(getRemoteConfigIdFromVariant(variant))
+                .build();
         } else {
             // Default to hybrid optimizer parameters
             return VariantTaskParameters.builder()
@@ -219,6 +247,13 @@ private String getSearchPipelineFromVariant(ExperimentVariant variant) {
         return (String) variant.getParameters().get("searchPipeline");
     }
 
+    /**
+     * Extract remote configuration ID from variant parameters for remote search experiments
+     */
+    private String getRemoteConfigIdFromVariant(ExperimentVariant variant) {
+        return (String) variant.getParameters().get("remoteConfigId");
+    }
+
     /**
      * Schedule a single variant task asynchronously
      */
@@ -283,6 +318,19 @@ private void executeVariantTaskAsync(VariantTaskParameters params, CompletableFu
         }
 
         final String evaluationId = UUID.randomUUID().toString();
+
+        // Handle remote search experiments differently
+        if (params instanceof RemoteSearchTaskParameters) {
+            executeRemoteSearchVariantAsync((RemoteSearchTaskParameters) params, evaluationId, future);
+        } else {
+            executeLocalSearchVariantAsync(params, evaluationId, future);
+        }
+    }
+
+    /**
+     * Execute local search variant (existing functionality)
+     */
+    private void executeLocalSearchVariantAsync(VariantTaskParameters params, String evaluationId, CompletableFuture<Void> future) {
         SearchRequest searchRequest = buildSearchRequest(params, evaluationId);
 
         // Convert ActionListener to CompletableFuture
@@ -337,6 +385,156 @@ public void onFailure(Exception e) {
         });
     }
 
+    /**
+     * Execute remote search variant using RemoteSearchExecutor
+     */
+    private void executeRemoteSearchVariantAsync(RemoteSearchTaskParameters params, String evaluationId, CompletableFuture<Void> future) {
+        // Lazy initialize RemoteSearchExecutor to prevent HttpClient selector threads when not used
+        if (remoteSearchExecutor == null) {
+            synchronized (this) {
+                if (remoteSearchExecutor == null) {
+                    remoteSearchExecutor = new RemoteSearchExecutor(
+                        remoteSearchConfigurationDao,
+                        remoteSearchCacheDao,
+                        remoteSearchFailureDao,
+                        new RemoteResponseMapper()
+                    );
+                }
+            }
+        }
+
+        // Execute remote search request
+        remoteSearchExecutor.executeRemoteSearch(
+            params.getRemoteConfigId(),
+            params.getQuery(),
+            params.getQueryText(),
+            params.getSize(),
+            params.getExperimentId(),
+            new ActionListener<RemoteSearchExecutor.RemoteSearchResponse>() {
+                @Override
+                public void onResponse(RemoteSearchExecutor.RemoteSearchResponse remoteResponse) {
+                    try {
+                        // Process the remote search response using the search response processor
+                        // Convert remote response to OpenSearch SearchResponse format for processing
+                        processRemoteSearchResponse(remoteResponse, params, evaluationId);
+                        future.complete(null);
+                    } catch (Exception e) {
+                        future.completeExceptionally(e);
+                    } finally {
+                        concurrencyControl.release();
+                        activeTasks.decrement();
+                    }
+                }
+
+                @Override
+                public void onFailure(Exception e) {
+                    try {
+                        handleSearchFailure(
+                            e,
+                            params.getExperimentVariant(),
+                            params.getExperimentId(),
+                            evaluationId,
+                            params.getTaskContext()
+                        );
+                        future.complete(null);
+                    } catch (Exception ex) {
+                        future.completeExceptionally(ex);
+                    } finally {
+                        concurrencyControl.release();
+                        activeTasks.decrement();
+                    }
+                }
+            }
+        );
+    }
+
+    private void processRemoteSearchResponse(
+        RemoteSearchExecutor.RemoteSearchResponse remoteResponse,
+        RemoteSearchTaskParameters params,
+        String evaluationId
+    ) {
+        log.info(
+            "Processing remote search response for experiment: {}, variant: {}, evaluation: {}",
+            params.getExperimentId(),
+            params.getExperimentVariant().getId(),
+            evaluationId
+        );
+
+        try {
+            // Prefer mapped response if available, otherwise use raw response
+            String json = remoteResponse.getMappedResponse();
+            if (json == null || json.trim().isEmpty()) {
+                json = remoteResponse.getRawResponse();
+            }
+            if (json == null || json.trim().isEmpty()) {
+                throw new IllegalArgumentException("Remote response is empty");
+            }
+
+            // Parse JSON into Map
+            String cleanJson = json.replaceAll("\\s+", " ").trim();
+            java.util.Map<String, Object> data;
+            try (
+                org.opensearch.core.xcontent.XContentParser parser = org.opensearch.common.xcontent.XContentFactory.jsonBuilder()
+                    .contentType()
+                    .xContent()
+                    .createParser(null, null, cleanJson)
+            ) {
+                data = parser.map();
+            }
+
+            // Extract hits.hits array
+            Object hitsObj = data.get("hits");
+            if (!(hitsObj instanceof java.util.Map)) {
+                throw new IllegalArgumentException("Mapped response missing 'hits' object");
+            }
+            @SuppressWarnings("unchecked")
+            java.util.Map<String, Object> hitsContainer = (java.util.Map<String, Object>) hitsObj;
+
+            Object hitsListObj = hitsContainer.get("hits");
+            if (!(hitsListObj instanceof java.util.List)) {
+                throw new IllegalArgumentException("Mapped response missing 'hits.hits' array");
+            }
+            @SuppressWarnings("unchecked")
+            java.util.List<Object> hitsList = (java.util.List<Object>) hitsListObj;
+
+            // Collect document IDs from hits
+            java.util.List<String> docIds = new java.util.ArrayList<>();
+            for (int i = 0; i < hitsList.size(); i++) {
+                Object item = hitsList.get(i);
+                if (item instanceof java.util.Map) {
+                    @SuppressWarnings("unchecked")
+                    java.util.Map<String, Object> hitMap = (java.util.Map<String, Object>) item;
+                    Object idObj = hitMap.get("_id");
+                    String id = idObj != null ? idObj.toString() : String.valueOf(i);
+                    docIds.add(id);
+                }
+            }
+
+            // Delegate to SearchResponseProcessor using doc IDs
+            searchResponseProcessor.processDocIds(
+                docIds,
+                params.getExperimentVariant(),
+                params.getExperimentId(),
+                params.getSearchConfigId(),
+                params.getQueryText(),
+                params.getSize(),
+                params.getJudgmentIds(),
+                params.getDocIdToScores(),
+                evaluationId,
+                params.getTaskContext()
+            );
+
+            log.debug(
+                "Remote search processed for config: {}, status: {}, total docIds: {}",
+                params.getRemoteConfigId(),
+                remoteResponse.getStatusCode(),
+                docIds.size()
+            );
+        } catch (Exception e) {
+            handleSearchFailure(e, params.getExperimentVariant(), params.getExperimentId(), evaluationId, params.getTaskContext());
+        }
+    }
+
     /**
      * Build search request based on experiment type
      */
diff --git a/src/main/java/org/opensearch/searchrelevance/executors/RemoteResponseMapper.java b/src/main/java/org/opensearch/searchrelevance/executors/RemoteResponseMapper.java
new file mode 100644
index 00000000..90537428
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/executors/RemoteResponseMapper.java
@@ -0,0 +1,664 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */ +package org.opensearch.searchrelevance.executors; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Locale; +import java.util.Map; + +import org.opensearch.common.xcontent.XContentFactory; +import org.opensearch.core.xcontent.XContentBuilder; +import org.opensearch.core.xcontent.XContentParser; + +import com.fasterxml.jackson.databind.ObjectMapper; + +import lombok.extern.log4j.Log4j2; + +/** + * RemoteResponseMapper handles mapping of remote search engine responses to OpenSearch format + * using JSON path-based field mapping. This enables experiments to work with responses from + * different search engines by transforming them into a consistent format. + */ +@Log4j2 +public class RemoteResponseMapper { + + private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper(); + + /** + * Map a remote search response to OpenSearch format using response template + * + * @param rawResponse The raw JSON response from remote search engine + * @param responseTemplate The response template defining field mappings (JSON path format) + * @return Mapped response in OpenSearch format + */ + public String mapResponse(String rawResponse, String responseTemplate) { + // Enhanced null/empty response handling + if (rawResponse == null || rawResponse.trim().isEmpty() || "null".equals(rawResponse.trim())) { + log.debug("Received null or empty raw response, returning empty OpenSearch response"); + return createEmptyResponse(); + } + + if (responseTemplate == null || responseTemplate.trim().isEmpty()) { + // No template provided, attempt default OpenSearch format detection + return mapWithDefaultTemplate(rawResponse); + } + + try { + // Parse the raw response with enhanced error handling + Map rawData = parseJsonToMap(rawResponse); + if (rawData == null || rawData.isEmpty()) { + log.debug("Raw response parsed to null or empty map, returning empty OpenSearch response"); + return createEmptyResponse(); + } + + // Check if response 
template is a template string (contains ${}) or JSON mapping + if (responseTemplate.contains("${")) { + // This is a template string, not a JSON mapping - apply template substitution + log.debug("Response template contains template variables, applying template substitution"); + String result = applyTemplateSubstitution(rawData, responseTemplate); + + // Validate the result + if (result == null || "null".equals(result.trim()) || result.trim().isEmpty()) { + log.debug("Template substitution resulted in null or empty result, falling back to default mapping"); + return mapWithDefaultTemplate(rawResponse); + } + + return result; + } else { + // This should be a JSON mapping configuration + Map template = parseJsonToMap(responseTemplate); + if (template == null || template.isEmpty()) { + log.debug("Response template parsed to null or empty map, falling back to default mapping"); + return mapWithDefaultTemplate(rawResponse); + } + + // Apply the mapping + Map mappedData = applyMapping(rawData, template); + + // Validate mapped data has proper structure + if (mappedData == null || mappedData.isEmpty()) { + log.debug("Mapping resulted in null or empty data, returning empty OpenSearch response"); + return createEmptyResponse(); + } + + // Convert back to JSON + String result = mapToJson(mappedData); + + // Final validation - ensure result is not null or "null" + if (result == null || "null".equals(result.trim()) || result.trim().isEmpty()) { + log.debug("Final mapping result is null or empty, returning empty OpenSearch response"); + return createEmptyResponse(); + } + + return result; + } + + } catch (Exception e) { + log.debug("Failed to map remote response: {}", e.getMessage()); + return createErrorResponse(e.getMessage()); + } + } + + /** + * Apply template substitution for template strings containing ${} variables + */ + private String applyTemplateSubstitution(Map rawData, String template) { + try { + String result = template; + + // Simple template variable 
substitution for ${path} syntax + // This is a basic implementation - for production use, consider a proper template engine + while (result.contains("${")) { + int startIndex = result.indexOf("${"); + int endIndex = result.indexOf("}", startIndex); + + if (endIndex == -1) { + // Malformed template variable, break to avoid infinite loop + log.debug("Malformed template variable in response template, missing closing }"); + break; + } + + String variable = result.substring(startIndex + 2, endIndex); + Object value = extractValueByPath(rawData, variable); + + String replacement; + if (value == null) { + replacement = "null"; + } else if (value instanceof String) { + replacement = "\"" + value.toString().replace("\"", "\\\"") + "\""; + } else if (value instanceof List || value instanceof Map) { + // For complex objects (arrays or objects), serialize directly to JSON without wrapping + try { + replacement = OBJECT_MAPPER.writeValueAsString(value); + } catch (Exception e) { + replacement = "null"; + } + } else { + replacement = value.toString(); + } + + result = result.substring(0, startIndex) + replacement + result.substring(endIndex + 1); + } + + // Validate that the result is valid JSON + try { + parseJsonToMap(result); + return result; + } catch (Exception e) { + log.debug("Template substitution resulted in invalid JSON: {}", e.getMessage()); + log.debug("Template substitution result was: {}", result); + // Fall back to default mapping + return null; + } + + } catch (Exception e) { + log.debug("Failed to apply template substitution: {}", e.getMessage()); + return null; + } + } + + /** + * Apply default mapping for responses that might already be in OpenSearch format + */ + private String mapWithDefaultTemplate(String rawResponse) { + try { + Map rawData = parseJsonToMap(rawResponse); + + // Check if it's already in OpenSearch format + if (rawData.containsKey("hits")) { + return rawResponse; // Already in correct format + } + + // Detect Solr JSON Response API: 
response.docs under 'response' + if (rawData.containsKey("response") && rawData.get("response") instanceof Map) { + @SuppressWarnings("unchecked") + Map response = (Map) rawData.get("response"); + if (response.get("docs") instanceof List) { + return mapSolrFormat(rawData); + } + } + + // Try to detect common search response patterns + if (rawData.containsKey("results") || rawData.containsKey("documents")) { + return mapCommonFormat(rawData); + } + + // If we can't detect the format, wrap it in a basic structure + return wrapInBasicFormat(rawData); + + } catch (Exception e) { + log.debug("Failed to apply default mapping, returning raw response: {}", e.getMessage()); + return rawResponse; + } + } + + /** + * Apply mapping based on template configuration + */ + private Map applyMapping(Map rawData, Map template) { + Map result = new HashMap<>(); + + for (Map.Entry entry : template.entrySet()) { + String targetField = entry.getKey(); + Object mappingConfig = entry.getValue(); + + if (mappingConfig instanceof String) { + // Simple JSON path mapping + String jsonPath = (String) mappingConfig; + Object value = extractValueByPath(rawData, jsonPath); + if (value != null) { + result.put(targetField, value); + } + } else if (mappingConfig instanceof Map) { + // Complex mapping configuration or nested structure + @SuppressWarnings("unchecked") + Map config = (Map) mappingConfig; + + // Check if this is a mapping configuration or nested structure + if (config.containsKey("path") || config.containsKey("type") || config.containsKey("default")) { + Object value = applyComplexMapping(rawData, config); + // Always add the value, even if null, because applyComplexMapping handles defaults + result.put(targetField, value); + } else { + // Nested structure: recursively apply mapping + Map nestedResult = applyMapping(rawData, config); + if (!nestedResult.isEmpty()) { + result.put(targetField, nestedResult); + } + } + } + } + + return result; + } + + /** + * Apply complex mapping with 
transformations + */ + private Object applyComplexMapping(Map rawData, Map config) { + String path = (String) config.get("path"); + String type = (String) config.get("type"); + Object defaultValue = config.get("default"); + + if (path == null) { + return defaultValue; + } + + Object value = extractValueByPath(rawData, path); + + if (value == null) { + return defaultValue; + } + + // Apply type transformations + if (type != null) { + value = transformValue(value, type); + } + + return value; + } + + /** + * Extract value from nested map using JSON path notation + */ + private Object extractValueByPath(Map data, String path) { + if (path == null || path.trim().isEmpty()) { + return null; + } + + // Handle simple field access + if (!path.contains(".") && !path.contains("[")) { + return data.get(path); + } + + // Parse the path more carefully to handle array notation + Object current = data; + String remainingPath = path; + + while (!remainingPath.isEmpty() && current != null) { + String nextPart; + String restOfPath; + + // Check if we have a dot separator + int dotIndex = remainingPath.indexOf('.'); + if (dotIndex == -1) { + // No more dots, this is the last part + nextPart = remainingPath; + restOfPath = ""; + } else { + nextPart = remainingPath.substring(0, dotIndex); + restOfPath = remainingPath.substring(dotIndex + 1); + } + + // Handle array access in this part + if (nextPart.contains("[") && nextPart.contains("]")) { + String fieldName = nextPart.substring(0, nextPart.indexOf('[')); + String indexStr = nextPart.substring(nextPart.indexOf('[') + 1, nextPart.indexOf(']')); + + // First get the field + if (current instanceof Map) { + @SuppressWarnings("unchecked") + Map map = (Map) current; + current = map.get(fieldName); + } else { + return null; + } + + // Then access the array index + if (current instanceof List) { + @SuppressWarnings("unchecked") + List list = (List) current; + try { + int index = Integer.parseInt(indexStr); + if (index >= 0 && index < 
list.size()) { + current = list.get(index); + } else { + return null; + } + } catch (NumberFormatException e) { + return null; + } + } else { + return null; + } + } else { + // Simple field access + if (current instanceof Map) { + @SuppressWarnings("unchecked") + Map map = (Map) current; + current = map.get(nextPart); + } else { + return null; + } + } + + remainingPath = restOfPath; + } + + return current; + } + + /** + * Transform value to specified type + */ + private Object transformValue(Object value, String type) { + if (value == null) { + return null; + } + + try { + switch (type.toLowerCase(Locale.ROOT)) { + case "string": + return value.toString(); + case "integer": + case "int": + if (value instanceof Number) { + return ((Number) value).intValue(); + } + return Integer.parseInt(value.toString()); + case "long": + if (value instanceof Number) { + return ((Number) value).longValue(); + } + return Long.parseLong(value.toString()); + case "double": + if (value instanceof Number) { + return ((Number) value).doubleValue(); + } + return Double.parseDouble(value.toString()); + case "boolean": + if (value instanceof Boolean) { + return value; + } + return Boolean.parseBoolean(value.toString()); + default: + return value; + } + } catch (Exception e) { + log.debug("Failed to transform value {} to type {}: {}", value, type, e.getMessage()); + return value; + } + } + + /** + * Map common search response formats to OpenSearch format + */ + private String mapCommonFormat(Map rawData) { + try { + Map opensearchFormat = new HashMap<>(); + + // Extract hits + Object resultsObj = rawData.get("results"); + if (resultsObj == null) { + resultsObj = rawData.get("documents"); + } + + List> hits = new ArrayList<>(); + int totalHits = 0; + + if (resultsObj instanceof List) { + @SuppressWarnings("unchecked") + List results = (List) resultsObj; + totalHits = results.size(); + + for (int i = 0; i < results.size(); i++) { + Object item = results.get(i); + if (item instanceof Map) { + 
@SuppressWarnings("unchecked") + Map doc = (Map) item; + + Map hit = new HashMap<>(); + hit.put("_index", "remote"); + hit.put("_id", doc.getOrDefault("id", String.valueOf(i))); + hit.put("_score", doc.getOrDefault("score", 1.0)); + hit.put("_source", doc); + + hits.add(hit); + } + } + } + + // Build OpenSearch response structure + Map total = new HashMap<>(); + total.put("value", totalHits); + total.put("relation", "eq"); + + Map hitsContainer = new HashMap<>(); + hitsContainer.put("total", total); + hitsContainer.put("max_score", hits.isEmpty() ? null : 1.0); + hitsContainer.put("hits", hits); + + opensearchFormat.put("hits", hitsContainer); + opensearchFormat.put("took", rawData.getOrDefault("took", 1)); + opensearchFormat.put("timed_out", false); + + return mapToJson(opensearchFormat); + + } catch (Exception e) { + log.debug("Failed to map common format: {}", e.getMessage()); + return createErrorResponse(e.getMessage()); + } + } + + /** + * Map Solr JSON response (response.docs/numFound and optional responseHeader.QTime) to OpenSearch format + */ + private String mapSolrFormat(Map rawData) { + try { + @SuppressWarnings("unchecked") + Map response = (Map) rawData.get("response"); + + @SuppressWarnings("unchecked") + List docs = (List) response.get("docs"); + + int totalHits = 0; + Object numFoundObj = response.get("numFound"); + if (numFoundObj instanceof Number) { + totalHits = ((Number) numFoundObj).intValue(); + } else if (numFoundObj != null) { + try { + totalHits = Integer.parseInt(numFoundObj.toString()); + } catch (NumberFormatException ignore) { + // keep default + } + } + + List> hits = new ArrayList<>(); + if (docs != null) { + for (int i = 0; i < docs.size(); i++) { + Object item = docs.get(i); + if (item instanceof Map) { + @SuppressWarnings("unchecked") + Map doc = (Map) item; + + Map hit = new HashMap<>(); + hit.put("_index", "remote"); + Object id = doc.getOrDefault("id", String.valueOf(i)); + hit.put("_id", id); + + double score = 1.0; + Object 
scoreObj = doc.get("score"); + if (scoreObj instanceof Number) { + score = ((Number) scoreObj).doubleValue(); + } else if (scoreObj != null) { + try { + score = Double.parseDouble(scoreObj.toString()); + } catch (Exception ignore) { + // keep default + } + } + hit.put("_score", score); + hit.put("_source", doc); + + hits.add(hit); + } + } + } + + Map total = new HashMap<>(); + total.put("value", totalHits); + total.put("relation", "eq"); + + Map hitsContainer = new HashMap<>(); + hitsContainer.put("total", total); + hitsContainer.put("max_score", hits.isEmpty() ? null : 1.0); + hitsContainer.put("hits", hits); + + Map opensearchFormat = new HashMap<>(); + opensearchFormat.put("hits", hitsContainer); + + // took from responseHeader.QTime if available (milliseconds) + int took = 1; + Object headerObj = rawData.get("responseHeader"); + if (headerObj instanceof Map) { + @SuppressWarnings("unchecked") + Map header = (Map) headerObj; + Object qtime = header.get("QTime"); + if (qtime instanceof Number) { + took = ((Number) qtime).intValue(); + } else if (qtime != null) { + try { + took = Integer.parseInt(qtime.toString()); + } catch (Exception ignore) { + // keep default + } + } + } + opensearchFormat.put("took", took); + opensearchFormat.put("timed_out", false); + + return mapToJson(opensearchFormat); + } catch (Exception e) { + log.debug("Failed to map Solr format: {}", e.getMessage()); + return createErrorResponse(e.getMessage()); + } + } + + /** + * Wrap unknown format in basic OpenSearch structure + */ + private String wrapInBasicFormat(Map rawData) { + try { + Map hit = new HashMap<>(); + hit.put("_index", "remote"); + hit.put("_id", "1"); + hit.put("_score", 1.0); + hit.put("_source", rawData); + + Map total = new HashMap<>(); + total.put("value", 1); + total.put("relation", "eq"); + + Map hitsContainer = new HashMap<>(); + hitsContainer.put("total", total); + hitsContainer.put("max_score", 1.0); + hitsContainer.put("hits", List.of(hit)); + + Map opensearchFormat = 
new HashMap<>(); + opensearchFormat.put("hits", hitsContainer); + opensearchFormat.put("took", 1); + opensearchFormat.put("timed_out", false); + + return mapToJson(opensearchFormat); + + } catch (Exception e) { + log.debug("Failed to wrap in basic format: {}", e.getMessage()); + return createErrorResponse(e.getMessage()); + } + } + + /** + * Parse JSON string to Map + */ + private Map parseJsonToMap(String json) throws Exception { + if (json == null || json.trim().isEmpty()) { + return new HashMap<>(); + } + + // Clean the JSON string to remove any potential BOM or invisible characters + String cleanedJson = json; + + // Remove BOM if present + if (cleanedJson.startsWith("\uFEFF")) { + cleanedJson = cleanedJson.substring(1); + } + + // Remove any leading/trailing whitespace and control characters + cleanedJson = cleanedJson.trim(); + + // Remove any non-printable characters at the beginning + while (cleanedJson.length() > 0 + && cleanedJson.charAt(0) < 32 + && cleanedJson.charAt(0) != '\t' + && cleanedJson.charAt(0) != '\n' + && cleanedJson.charAt(0) != '\r') { + cleanedJson = cleanedJson.substring(1); + } + + // Parse JSON directly without aggressive whitespace cleaning + try (XContentParser parser = XContentFactory.jsonBuilder().contentType().xContent().createParser(null, null, cleanedJson)) { + return parser.map(); + } + } + + /** + * Convert Map to JSON string + */ + private String mapToJson(Map map) throws Exception { + try (XContentBuilder builder = XContentFactory.jsonBuilder()) { + builder.map(map); + return builder.toString(); + } + } + + /** + * Create empty response in OpenSearch format + */ + private String createEmptyResponse() { + try { + Map total = new HashMap<>(); + total.put("value", 0); + total.put("relation", "eq"); + + Map hitsContainer = new HashMap<>(); + hitsContainer.put("total", total); + hitsContainer.put("max_score", null); + hitsContainer.put("hits", List.of()); + + Map response = new HashMap<>(); + response.put("hits", hitsContainer); 
+ response.put("took", 0); + response.put("timed_out", false); + + return mapToJson(response); + } catch (Exception e) { + log.debug("Failed to create empty response: {}", e.getMessage()); + return "{\"hits\":{\"total\":{\"value\":0,\"relation\":\"eq\"},\"hits\":[]}}"; + } + } + + /** + * Create error response + */ + private String createErrorResponse(String errorMessage) { + try { + Map error = new HashMap<>(); + error.put("type", "remote_mapping_exception"); + error.put("reason", errorMessage); + + Map response = new HashMap<>(); + response.put("error", error); + + return mapToJson(response); + } catch (Exception e) { + return "{\"error\":{\"type\":\"remote_mapping_exception\",\"reason\":\"" + errorMessage.replace("\"", "\\\"") + "\"}}"; + } + } +} diff --git a/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutor.java b/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutor.java new file mode 100644 index 00000000..5c5a7629 --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutor.java @@ -0,0 +1,631 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */ +package org.opensearch.searchrelevance.executors; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.net.HttpURLConnection; +import java.net.URI; +import java.net.URL; +import java.net.URLEncoder; +import java.nio.charset.StandardCharsets; +import java.util.Base64; +import java.util.Locale; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Semaphore; + +import org.opensearch.common.SuppressForbidden; +import org.opensearch.core.action.ActionListener; +import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao; +import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao; +import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao; +import org.opensearch.searchrelevance.model.RemoteSearchCache; +import org.opensearch.searchrelevance.model.RemoteSearchConfiguration; +import org.opensearch.searchrelevance.model.RemoteSearchFailure; +import org.opensearch.searchrelevance.stats.events.EventStatName; +import org.opensearch.searchrelevance.stats.events.EventStatsManager; +import org.opensearch.searchrelevance.utils.TimeUtils; + +import lombok.extern.log4j.Log4j2; + +/** + * RemoteSearchExecutor handles HTTP requests to remote search engines with rate limiting, + * caching, and comprehensive error handling. This enables experiments to run against + * remote OpenSearch clusters or other search engines via HTTPS. 
+ */
+@Log4j2
+public class RemoteSearchExecutor {
+
+    private final RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+    private final RemoteSearchCacheDao remoteSearchCacheDao;
+    private final RemoteSearchFailureDao remoteSearchFailureDao;
+    private final RemoteResponseMapper remoteResponseMapper;
+
+    // Rate limiting: map of config ID to semaphore for concurrent request limiting
+    private final Map<String, Semaphore> concurrentRequestLimiters = new ConcurrentHashMap<>();
+
+    // Rate limiting: map of config ID to last request timestamp for requests-per-second limiting
+    private final Map<String, Long> lastRequestTimestamps = new ConcurrentHashMap<>();
+
+    // Lock objects for pacing RPS per config
+    private final Map<String, Object> rateLimitLocks = new ConcurrentHashMap<>();
+
+    /**
+     * Constructor with all dependencies
+     */
+    public RemoteSearchExecutor(
+        RemoteSearchConfigurationDao remoteSearchConfigurationDao,
+        RemoteSearchCacheDao remoteSearchCacheDao,
+        RemoteSearchFailureDao remoteSearchFailureDao,
+        RemoteResponseMapper remoteResponseMapper
+    ) {
+        this.remoteSearchConfigurationDao = remoteSearchConfigurationDao;
+        this.remoteSearchCacheDao = remoteSearchCacheDao;
+        this.remoteSearchFailureDao = remoteSearchFailureDao;
+        this.remoteResponseMapper = remoteResponseMapper;
+    }
+
+    /**
+     * Execute a remote search request with rate limiting, caching, and error handling
+     *
+     * @param remoteConfigId The remote configuration ID
+     * @param query The search query (JSON string)
+     * @param queryText The original query text for caching
+     * @param size The number of hits to request
+     * @param experimentId The experiment ID for failure tracking
+     * @param listener ActionListener for async response handling
+     */
+    public void executeRemoteSearch(
+        String remoteConfigId,
+        String query,
+        String queryText,
+        int size,
+        String experimentId,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        // Metrics: count remote search request
+        EventStatsManager.increment(EventStatName.REMOTE_SEARCH_REQUESTS);
+        // First, get the remote configuration
+        remoteSearchConfigurationDao.getRemoteSearchConfiguration(remoteConfigId, ActionListener.wrap(config -> {
+            if (config == null) {
+                listener.onFailure(new IllegalArgumentException("Remote configuration not found: " + remoteConfigId));
+                return;
+            }
+
+            // Check cache first
+            String cacheKey = RemoteSearchCache.generateCacheKey(remoteConfigId, query, queryText);
+            checkCacheAndExecute(config, query, queryText, size, experimentId, cacheKey, listener);
+        }, error -> {
+            log.error("Failed to retrieve remote configuration {}: {}", remoteConfigId, error.getMessage());
+            listener.onFailure(error);
+        }));
+    }
+
+    /**
+     * Check cache for existing response, execute remote request if not cached
+     */
+    private void checkCacheAndExecute(
+        RemoteSearchConfiguration config,
+        String query,
+        String queryText,
+        int size,
+        String experimentId,
+        String cacheKey,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        // Check cache first if caching is enabled
+        if (config.getCacheTtlMinutes() > 0) {
+            remoteSearchCacheDao.getCachedResponse(cacheKey, ActionListener.wrap(cachedResponse -> {
+                if (cachedResponse != null && !cachedResponse.isExpired()) {
+                    // Cache hit - return cached response
+                    EventStatsManager.increment(EventStatName.REMOTE_SEARCH_CACHE_HITS);
+                    log.debug("Cache hit for config: {}, key: {}", config.getId(), cacheKey);
+
+                    // Apply response mapping to cached response
+                    String mappedResponse = applyResponseMapping(config, cachedResponse.getResponse());
+
+                    RemoteSearchResponse response = new RemoteSearchResponse(
+                        cachedResponse.getResponse(),
+                        mappedResponse,
+                        200, // Assume success for cached responses
+                        true
+                    );
+
+                    listener.onResponse(response);
+                } else {
+                    // Cache miss or expired - execute remote request
+                    EventStatsManager.increment(EventStatName.REMOTE_SEARCH_CACHE_MISSES);
+                    log.debug("Cache miss for config: {}, key: {}", config.getId(), cacheKey);
+                    executeRemoteRequest(config, query, queryText, size, experimentId, cacheKey, listener);
+                }
+            }, error -> {
+                // Cache
lookup failed - proceed with remote execution
+                log.warn("Cache lookup failed for config: {}, proceeding with remote execution: {}", config.getId(), error.getMessage());
+                executeRemoteRequest(config, query, queryText, size, experimentId, cacheKey, listener);
+            }));
+        } else {
+            // Caching disabled - proceed directly to remote execution
+            executeRemoteRequest(config, query, queryText, size, experimentId, cacheKey, listener);
+        }
+    }
+
+    /**
+     * Execute the actual remote HTTP request with rate limiting
+     */
+    @SuppressForbidden(reason = "External HTTP I/O is required to call remote search engines")
+    private void executeRemoteRequest(
+        RemoteSearchConfiguration config,
+        String query,
+        String queryText,
+        int size,
+        String experimentId,
+        String cacheKey,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        try {
+            // Apply rate limiting
+            if (!applyRateLimit(config)) {
+                EventStatsManager.increment(EventStatName.REMOTE_SEARCH_RATE_LIMIT_HITS);
+                listener.onFailure(new RuntimeException("Rate limit exceeded for configuration: " + config.getId()));
+                return;
+            }
+
+            // Process query template
+            String processedQuery = processQueryTemplate(config.getQueryTemplate(), query, queryText, size);
+
+            // Execute HTTP request using HttpURLConnection (similar to build.gradle approach)
+            executeWithHttpURLConnection(config, processedQuery, query, queryText, experimentId, cacheKey, listener);
+
+        } catch (Exception e) {
+            releaseConcurrentRequestLimit(config.getId());
+            handleRequestFailure(config, query, queryText, experimentId, e, listener);
+        }
+    }
+
+    /**
+     * Execute HTTP request using HttpURLConnection (avoiding forbidden getInputStream)
+     */
+    @SuppressForbidden(reason = "External HTTP I/O is required to call remote search engines")
+    private void executeWithHttpURLConnection(
+        RemoteSearchConfiguration config,
+        String processedQuery,
+        String originalQuery,
+        String queryText,
+        String experimentId,
+        String cacheKey,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        // Execute in a separate thread to
avoid blocking + Thread requestThread = new Thread(() -> { + try { + URI uri = URI.create(config.getConnectionUrl()); + + // Detect Solr endpoint and build appropriate request + boolean isSolr = uri.getPath().contains("/solr/") && uri.getPath().endsWith("/select"); + + URL url; + String method; + String requestBody = null; + String contentType = null; + + // For Solr param-style requests, URL-encode only the inserted queryText to avoid illegal characters (e.g., spaces) + String safeProcessedQuery = processedQuery; + if (isSolr && queryText != null && !queryText.isEmpty()) { + try { + String encodedQueryText = URLEncoder.encode(queryText, StandardCharsets.UTF_8.name()); + safeProcessedQuery = processedQuery.replace(queryText, encodedQueryText); + } catch (Exception e) { + log.warn("Failed to URL-encode queryText for Solr: {}", e.getMessage()); + } + } + + if (isSolr && safeProcessedQuery.contains("=") && safeProcessedQuery.contains("&")) { + // Solr with URL parameters - use GET request + url = new URL(config.getConnectionUrl() + "?" 
+ safeProcessedQuery); + method = "GET"; + } else if (isSolr) { + // Solr with form data - use POST with form encoding + url = new URL(config.getConnectionUrl()); + method = "POST"; + requestBody = safeProcessedQuery; + contentType = "application/x-www-form-urlencoded"; + } else { + // Non-Solr endpoint - use JSON POST + url = new URL(config.getConnectionUrl()); + method = "POST"; + requestBody = processedQuery; + contentType = "application/json"; + } + + log.debug("Executing remote request to: {} with method: {}", url, method); + + HttpURLConnection connection = (HttpURLConnection) url.openConnection(); + connection.setRequestMethod(method); + connection.setConnectTimeout(30000); // 30 seconds + connection.setReadTimeout(30000); // 30 seconds + connection.setRequestProperty("Accept", "application/json"); + + // Add authentication if provided + if (config.getUsername() != null + && !config.getUsername().trim().isEmpty() + && config.getPassword() != null + && !config.getPassword().trim().isEmpty()) { + + String credentials = config.getUsername() + ":" + config.getPassword(); + String encodedCredentials = Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8)); + connection.setRequestProperty("Authorization", "Basic " + encodedCredentials); + } + + // Set content type and write request body if needed + if (requestBody != null) { + connection.setDoOutput(true); + connection.setRequestProperty("Content-Type", contentType); + + byte[] requestBytes = requestBody.getBytes(StandardCharsets.UTF_8); + connection.getOutputStream().write(requestBytes); + connection.getOutputStream().flush(); + connection.getOutputStream().close(); + } + + // Get response code first (this triggers the request) + int statusCode = connection.getResponseCode(); + String responseMessage = connection.getResponseMessage(); + + releaseConcurrentRequestLimit(config.getId()); + + if (statusCode >= 200 && statusCode < 300) { + // Read the actual response body + String 
responseBody = readResponseBody(connection);
+
+                    handleHttpSuccess(config, originalQuery, queryText, experimentId, cacheKey, statusCode, responseBody, listener);
+                } else {
+                    // Try to read error response body for better error reporting
+                    String errorBody = readErrorResponseBody(connection);
+                    String errorMessage = String.format(Locale.ROOT, "HTTP %d: %s", statusCode, responseMessage);
+                    if (errorBody != null && !errorBody.trim().isEmpty()) {
+                        errorMessage += " - Response: " + errorBody;
+                    }
+                    Exception httpError = new IOException(errorMessage);
+                    handleRequestFailure(config, originalQuery, queryText, experimentId, httpError, listener);
+                }
+
+            } catch (Exception e) {
+                releaseConcurrentRequestLimit(config.getId());
+                handleRequestFailure(config, originalQuery, queryText, experimentId, e, listener);
+            }
+        });
+
+        requestThread.setDaemon(true);
+        requestThread.start();
+    }
+
+    /**
+     * Handle successful HTTP response
+     */
+    private void handleHttpSuccess(
+        RemoteSearchConfiguration config,
+        String query,
+        String queryText,
+        String experimentId,
+        String cacheKey,
+        int statusCode,
+        String responseBody,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        try {
+            // Log what we actually received
+            log.debug("Response status: {}", statusCode);
+            log.debug("Response body: {}", responseBody);
+
+            // Metrics: count success
+            EventStatsManager.increment(EventStatName.REMOTE_SEARCH_SUCCESSES);
+            // Apply response mapping
+            String mappedResponse = applyResponseMapping(config, responseBody);
+
+            // Cache the response if caching is enabled
+            if (config.getCacheTtlMinutes() > 0) {
+                long currentTimestamp = System.currentTimeMillis();
+                // Use long arithmetic to avoid int overflow for large TTLs
+                long expirationTimestamp = currentTimestamp + (config.getCacheTtlMinutes() * 60L * 1000L);
+
+                RemoteSearchCache cacheEntry = new RemoteSearchCache(
+                    cacheKey,
+                    config.getId(),
+                    query,
+                    queryText,
+                    responseBody,
+                    mappedResponse,
+                    currentTimestamp,
+                    expirationTimestamp
+                );
+
+                remoteSearchCacheDao.cacheResponse(
cacheEntry, + ActionListener.wrap( + success -> log.debug("Response cached for config: {}, key: {}", config.getId(), cacheKey), + error -> log.warn("Failed to cache response for config: {}: {}", config.getId(), error.getMessage()) + ) + ); + } + + RemoteSearchResponse remoteResponse = new RemoteSearchResponse(responseBody, mappedResponse, statusCode, true); + listener.onResponse(remoteResponse); + + log.debug("Remote search successful for config: {}, status: {}", config.getId(), statusCode); + + } catch (Exception e) { + handleRequestFailure(config, query, queryText, experimentId, e, listener); + } + } + + /** + * Apply rate limiting based on configuration settings + */ + private boolean applyRateLimit(RemoteSearchConfiguration config) { + String configId = config.getId(); + + // Acquire concurrency permit (block until available) + Semaphore concurrentLimiter = concurrentRequestLimiters.computeIfAbsent( + configId, + k -> new Semaphore(Math.max(1, config.getMaxConcurrentRequests())) + ); + + if (!concurrentLimiter.tryAcquire()) { + log.debug("Concurrent request limit exceeded for config: {}", configId); + return false; + } + + // Pace requests per second by sleeping if needed + Object lock = rateLimitLocks.computeIfAbsent(configId, k -> new Object()); + synchronized (lock) { + long now = System.currentTimeMillis(); + long last = lastRequestTimestamps.getOrDefault(configId, 0L); + int maxRps = Math.max(1, config.getMaxRequestsPerSecond()); + long minIntervalMs = 1000L / maxRps; + long elapsed = now - last; + + if (elapsed < minIntervalMs) { + long waitMs = minIntervalMs - elapsed; + try { + Thread.sleep(waitMs); + now = System.currentTimeMillis(); + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + // Release permit before failing + releaseConcurrentRequestLimit(configId); + log.warn("Interrupted while pacing RPS for config: {}", configId); + return false; + } + } + lastRequestTimestamps.put(configId, now); + } + + return true; + } + + /** 
+     * Release concurrent request limit
+     */
+    private void releaseConcurrentRequestLimit(String configId) {
+        Semaphore limiter = concurrentRequestLimiters.get(configId);
+        if (limiter != null) {
+            limiter.release();
+        }
+    }
+
+    /**
+     * Process query template by substituting placeholders
+     */
+    private String processQueryTemplate(String queryTemplate, String query, String queryText, int size) {
+        if (queryTemplate == null || queryTemplate.trim().isEmpty()) {
+            return query; // Use query as-is if no template
+        }
+
+        // Replace common placeholders
+        String processed = queryTemplate.replace("${query}", query)
+            .replace("${queryText}", queryText)
+            .replace("{{query}}", query)
+            .replace("{{queryText}}", queryText)
+            .replace("${size}", String.valueOf(size))
+            .replace("{{size}}", String.valueOf(size));
+
+        return processed;
+    }
+
+    /**
+     * Handle request failure with proper error categorization and logging
+     */
+    private void handleRequestFailure(
+        RemoteSearchConfiguration config,
+        String query,
+        String queryText,
+        String experimentId,
+        Throwable error,
+        ActionListener<RemoteSearchResponse> listener
+    ) {
+        EventStatsManager.increment(EventStatName.REMOTE_SEARCH_FAILURES);
+        log.error("Remote search failed for config: {}, error: {}", config.getId(), error.getMessage());
+        log.error("Remote search exception stack trace for config: {}", config.getId(), error);
+
+        // Create failure record for tracking
+        String failureId = "failure_" + System.currentTimeMillis() + "_" + config.getId().hashCode();
+        RemoteSearchFailure failure = RemoteSearchFailure.fromException(
+            failureId,
+            config.getId(),
+            experimentId,
+            query,
+            queryText,
+            error instanceof Exception ?
(Exception) error : new RuntimeException(error), + TimeUtils.getTimestamp() + ); + + // Store failure record for analysis and circuit breaker logic + remoteSearchFailureDao.recordFailure( + failure, + ActionListener.wrap( + success -> log.debug("Failure recorded for config: {}, failure ID: {}", config.getId(), failureId), + storeError -> log.warn("Failed to store failure record for config: {}: {}", config.getId(), storeError.getMessage()) + ) + ); + + listener.onFailure(new RuntimeException("Remote search failed: " + error.getMessage(), error)); + } + + /** + * Read response body from successful HTTP connection + */ + @SuppressForbidden(reason = "External HTTP I/O is required to call remote search engines") + private String readResponseBody(HttpURLConnection connection) throws IOException { + try (InputStream inputStream = connection.getInputStream()) { + if (inputStream == null) { + log.warn("No input stream available from connection"); + return "{}"; + } + + try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))) { + StringBuilder responseBuilder = new StringBuilder(); + String line; + while ((line = reader.readLine()) != null) { + responseBuilder.append(line).append('\n'); + } + + String responseBody = responseBuilder.toString(); + log.debug("Read response body, length: {}", responseBody.length()); + return responseBody; + } + } catch (IOException e) { + log.error("Failed to read response body: {}", e.getMessage()); + throw e; + } + } + + /** + * Read error response body from failed HTTP connection + */ + @SuppressForbidden(reason = "External HTTP I/O is required to call remote search engines") + private String readErrorResponseBody(HttpURLConnection connection) { + try (InputStream errorStream = connection.getErrorStream()) { + if (errorStream == null) { + return null; + } + + try (BufferedReader reader = new BufferedReader(new InputStreamReader(errorStream, StandardCharsets.UTF_8))) { + StringBuilder 
errorBuilder = new StringBuilder(); + String line; + while ((line = reader.readLine()) != null) { + errorBuilder.append(line).append('\n'); + } + + String errorResponse = errorBuilder.toString(); + log.debug("Read error response body, length: {}", errorResponse.length()); + return errorResponse; + } + } catch (IOException e) { + log.warn("Failed to read error response body: {}", e.getMessage()); + return null; + } + } + + /** + * Apply response mapping using the RemoteResponseMapper + */ + private String applyResponseMapping(RemoteSearchConfiguration config, String rawResponse) { + try { + String responseTemplate = config.getResponseTemplate(); + log.debug("Applying response mapping for config: {}", config.getId()); + log.debug( + "Raw response length: {}, starts with: {}", + rawResponse != null ? rawResponse.length() : 0, + rawResponse != null && rawResponse.length() > 50 ? rawResponse.substring(0, 50) : rawResponse + ); + + if (responseTemplate != null && !responseTemplate.trim().isEmpty()) { + log.debug( + "Response template length: {}, starts with: {}", + responseTemplate.length(), + responseTemplate.length() > 50 ? responseTemplate.substring(0, 50) : responseTemplate + ); + + log.debug("Calling response mapper for config: {}", config.getId()); + String mappedResponse = remoteResponseMapper.mapResponse(rawResponse, responseTemplate); + + log.debug( + "Response mapping completed for config: {}, mapped response length: {}", + config.getId(), + mappedResponse != null ? 
mappedResponse.length() : 0 + ); + + // Validate mapped response is not null or "null" + if (mappedResponse == null || "null".equals(mappedResponse.trim())) { + log.warn("Response mapping returned null/empty for config: {}, falling back to raw response", config.getId()); + return rawResponse; + } + + return mappedResponse; + } else { + // No template or blank -> use default mapping behavior in RemoteResponseMapper + if (responseTemplate == null || responseTemplate.trim().isEmpty()) { + log.debug("No/empty response template for config: {}, applying default mapping", config.getId()); + } + try { + String mappedResponse = remoteResponseMapper.mapResponse(rawResponse, responseTemplate); + if (mappedResponse == null || "null".equals(mappedResponse.trim())) { + log.warn( + "Default response mapping returned null/empty for config: {}, falling back to raw response", + config.getId() + ); + return rawResponse; + } + return mappedResponse; + } catch (Exception e) { + log.error("Default response mapping failed for config: {}: {}", config.getId(), e.getMessage(), e); + log.warn("Falling back to raw response due to default mapping failure"); + return rawResponse; + } + } + } catch (Exception e) { + log.error("Response mapping failed for config: {}: {}", config.getId(), e.getMessage(), e); + log.warn("Falling back to raw response due to mapping failure"); + return rawResponse; + } + } + + /** + * Response wrapper for remote search results + */ + public static class RemoteSearchResponse { + private final String rawResponse; + private final String mappedResponse; + private final int statusCode; + private final boolean success; + + public RemoteSearchResponse(String rawResponse, String mappedResponse, int statusCode, boolean success) { + this.rawResponse = rawResponse; + this.mappedResponse = mappedResponse; + this.statusCode = statusCode; + this.success = success; + } + + public String getRawResponse() { + return rawResponse; + } + + public String getMappedResponse() { + return 
mappedResponse;
+        }
+
+        public int getStatusCode() {
+            return statusCode;
+        }
+
+        public boolean isSuccess() {
+            return success;
+        }
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchTaskParameters.java b/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchTaskParameters.java
new file mode 100644
index 00000000..467d2975
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/executors/RemoteSearchTaskParameters.java
@@ -0,0 +1,20 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.executors;
+
+import lombok.Getter;
+import lombok.experimental.SuperBuilder;
+
+/**
+ * Parameters for scheduling a remote search variant task
+ */
+@Getter
+@SuperBuilder
+public class RemoteSearchTaskParameters extends VariantTaskParameters {
+    private final String remoteConfigId;
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/executors/SearchResponseProcessor.java b/src/main/java/org/opensearch/searchrelevance/executors/SearchResponseProcessor.java
index caa0e2ed..a3c9d347 100644
--- a/src/main/java/org/opensearch/searchrelevance/executors/SearchResponseProcessor.java
+++ b/src/main/java/org/opensearch/searchrelevance/executors/SearchResponseProcessor.java
@@ -95,6 +95,60 @@ public void processSearchResponse(
         }
     }
 
+    /**
+     * Process pre-collected document IDs (e.g., from a remote engine mapped response)
+     */
+    public void processDocIds(
+        java.util.List<String> docIds,
+        ExperimentVariant experimentVariant,
+        String experimentId,
+        String searchConfigId,
+        String queryText,
+        int size,
+        java.util.List<String> judgmentIds,
+        java.util.Map<String, String> docIdToScores,
+        String evaluationId,
+        ExperimentTaskContext taskContext
+    ) {
+        if (taskContext.getHasFailure().get()) return;
+
+        try {
+            if (docIds == null || docIds.isEmpty()) {
handleNoHits(experimentVariant, experimentId, searchConfigId, evaluationId, taskContext);
+                return;
+            }
+
+            java.util.List<java.util.Map<String, Object>> metrics = calculateEvaluationMetrics(docIds, docIdToScores, size);
+
+            String experimentVariantParameters = experimentVariant.getType() == ExperimentType.HYBRID_OPTIMIZER
+                ? experimentVariant.getTextualParameters()
+                : null;
+
+            EvaluationResult evaluationResult = new EvaluationResult(
+                evaluationId,
+                TimeUtils.getTimestamp(),
+                searchConfigId,
+                queryText,
+                judgmentIds,
+                docIds,
+                metrics,
+                experimentId,
+                experimentVariant.getId(),
+                experimentVariantParameters
+            );
+
+            evaluationResultDao.putEvaluationResultEfficient(
+                evaluationResult,
+                ActionListener.wrap(
+                    success -> updateExperimentVariant(experimentVariant, experimentId, searchConfigId, evaluationId, taskContext),
+                    error -> handleTaskFailure(experimentVariant, error, taskContext)
+                )
+            );
+        } catch (Exception e) {
+            handleTaskFailure(experimentVariant, e, taskContext);
+        }
+    }
+
     private void handleNoHits(
         ExperimentVariant experimentVariant,
         String experimentId,
diff --git a/src/main/java/org/opensearch/searchrelevance/experiment/RemoteSearchExperimentProcessor.java b/src/main/java/org/opensearch/searchrelevance/experiment/RemoteSearchExperimentProcessor.java
new file mode 100644
index 00000000..0eb035b4
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/experiment/RemoteSearchExperimentProcessor.java
@@ -0,0 +1,273 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */ +package org.opensearch.searchrelevance.experiment; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Locale; +import java.util.Map; +import java.util.Objects; +import java.util.Queue; +import java.util.UUID; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentLinkedQueue; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.stream.Collectors; + +import org.opensearch.action.search.SearchResponse; +import org.opensearch.common.cache.Cache; +import org.opensearch.common.cache.CacheBuilder; +import org.opensearch.common.unit.TimeValue; +import org.opensearch.core.action.ActionListener; +import org.opensearch.searchrelevance.dao.JudgmentDao; +import org.opensearch.searchrelevance.executors.ExperimentTaskManager; +import org.opensearch.searchrelevance.model.AsyncStatus; +import org.opensearch.searchrelevance.model.ExperimentType; +import org.opensearch.searchrelevance.model.ExperimentVariant; +import org.opensearch.searchrelevance.utils.TimeUtils; + +import lombok.extern.log4j.Log4j2; + +/** + * Processor for handling REMOTE_SEARCH_EVALUATION experiments with task scheduling + * Uses ExperimentTaskManager to execute remote queries via RemoteSearchExecutor. 
+ */
+@Log4j2
+public class RemoteSearchExperimentProcessor {
+
+    private final JudgmentDao judgmentDao;
+    private final ExperimentTaskManager taskManager;
+
+    // Use OpenSearch's built-in cache implementation with bounded size
+    private final Cache<String, Map<String, String>> judgmentCache;
+
+    // Configuration constants
+    private static final long CACHE_SIZE = 100_000;
+    private static final TimeValue CACHE_EXPIRE_TIME = TimeValue.timeValueHours(1);
+
+    public RemoteSearchExperimentProcessor(JudgmentDao judgmentDao, ExperimentTaskManager taskManager) {
+        this.judgmentDao = judgmentDao;
+        this.taskManager = taskManager;
+
+        // Initialize cache with size limit and TTL
+        this.judgmentCache = CacheBuilder.<String, Map<String, String>>builder()
+            .setMaximumWeight(CACHE_SIZE)
+            .setExpireAfterAccess(CACHE_EXPIRE_TIME)
+            .build();
+    }
+
+    /**
+     * Process remote search evaluation experiment (per queryText).
+     * remoteConfigIds holds the list of remote configuration IDs to execute against.
+     */
+    public void processRemoteExperiment(
+        String experimentId,
+        String queryText,
+        List<String> remoteConfigIds,
+        List<String> judgmentList,
+        int size,
+        AtomicBoolean hasFailure,
+        ActionListener<Map<String, Object>> listener
+    ) {
+        log.info(
+            "Starting remote search experiment {} with {} remote configs for query: {}",
+            experimentId,
+            remoteConfigIds.size(),
+            queryText
+        );
+
+        // Load judgments once and cache them
+        loadJudgmentsAsync(experimentId, judgmentList, queryText).thenAccept(docIdToScores -> {
+            log.info("Loaded {} document ratings for experiment {}", docIdToScores.size(), experimentId);
+            processWithJudgments(experimentId, queryText, remoteConfigIds, judgmentList, size, docIdToScores, hasFailure, listener);
+        }).exceptionally(e -> {
+            if (hasFailure.compareAndSet(false, true)) {
+                listener.onFailure(new Exception("Failed to load judgments", e));
+            }
+            return null;
+        });
+    }
+
+    /**
+     * Load and cache judgments for the experiment
+     */
+    private CompletableFuture<Map<String, String>> loadJudgmentsAsync(String experimentId, List<String> judgmentList, String queryText) {
+        String
cacheKey = experimentId + ":" + queryText;
+        Map<String, String> cached = judgmentCache.get(cacheKey);
+        if (Objects.nonNull(cached)) {
+            return CompletableFuture.completedFuture(cached);
+        }
+
+        AtomicInteger failureCount = new AtomicInteger(0);
+        int failureThreshold = Math.min(5, judgmentList.size());
+
+        // Load judgments in parallel
+        List<CompletableFuture<SearchResponse>> judgmentFutures = judgmentList.stream().map(judgmentId -> {
+            CompletableFuture<SearchResponse> future = new CompletableFuture<>();
+            judgmentDao.getJudgment(judgmentId, ActionListener.wrap(future::complete, future::completeExceptionally));
+            return future;
+        }).toList();
+
+        return CompletableFuture.allOf(judgmentFutures.toArray(new CompletableFuture[0])).thenApply(v -> {
+            Map<String, String> docIdToScores = new HashMap<>();
+
+            for (CompletableFuture<SearchResponse> future : judgmentFutures) {
+                try {
+                    SearchResponse response = future.join();
+                    extractJudgmentScores(queryText, response, docIdToScores);
+                } catch (Exception e) {
+                    log.error("Failed to process judgment response: {}", e.getMessage());
+                    if (failureCount.incrementAndGet() >= failureThreshold) {
+                        throw new RuntimeException(
+                            String.format(
+                                Locale.ROOT,
+                                "Failed to load judgments: exceeded failure threshold %d/%d",
+                                failureCount.get(),
+                                failureThreshold
+                            ),
+                            e
+                        );
+                    }
+                }
+            }
+
+            judgmentCache.put(cacheKey, docIdToScores);
+            return docIdToScores;
+        });
+    }
+
+    /**
+     * Extract judgment scores from SearchResponse
+     */
+    @SuppressWarnings("unchecked")
+    private void extractJudgmentScores(String queryText, SearchResponse response, Map<String, String> docIdToScores) {
+        if (Objects.isNull(response.getHits()) || response.getHits().getTotalHits().value() == 0) {
+            return;
+        }
+
+        Map<String, Object> sourceAsMap = response.getHits().getHits()[0].getSourceAsMap();
+        List<Map<String, Object>> judgmentRatings = (List<Map<String, Object>>) sourceAsMap.getOrDefault(
+            "judgmentRatings",
+            Collections.emptyList()
+        );
+
+        for (Map<String, Object> rating : judgmentRatings) {
+            if (queryText.equals(rating.get("query"))) {
+                List<Map<String, String>> docScoreRatings = (List<Map<String, String>>) rating.get("ratings");
+                if
(Objects.nonNull(docScoreRatings)) {
+                    docScoreRatings.forEach(docScoreRating -> docIdToScores.put(docScoreRating.get("docId"), docScoreRating.get("rating")));
+                }
+                break;
+            }
+        }
+    }
+
+    /**
+     * Process experiment with loaded judgments
+     */
+    private void processWithJudgments(
+        String experimentId,
+        String queryText,
+        List<String> remoteConfigIds,
+        List<String> judgmentList,
+        int size,
+        Map<String, String> docIdToScores,
+        AtomicBoolean hasFailure,
+        ActionListener<Map<String, Object>> listener
+    ) {
+        // Create one variant per remote configuration
+        List<ExperimentVariant> variants = createRemoteVariants(experimentId, remoteConfigIds);
+
+        // Process configurations in parallel
+        Map<String, Object> configToExperimentVariants = new ConcurrentHashMap<>();
+        Queue<Map<String, Object>> allResults = new ConcurrentLinkedQueue<>();
+
+        List<CompletableFuture<Void>> configFutures = remoteConfigIds.stream().map(remoteConfigId -> {
+            // Filter variants for this remote configuration
+            List<ExperimentVariant> configVariants = variants.stream()
+                .filter(v -> remoteConfigId.equals(v.getParameters().get("remoteConfigId")))
+                .collect(Collectors.toList());
+
+            // Use task manager to process variants
+            CompletableFuture<Map<String, Object>> configFuture = taskManager.scheduleTasksAsync(
+                ExperimentType.REMOTE_SEARCH_EVALUATION,
+                experimentId,
+                remoteConfigId, // reuse field to track by remote config id
+                "", // index not used for remote
+                "{}", // query body not needed - templates can use queryText/size
+                queryText,
+                size,
+                configVariants,
+                judgmentList,
+                docIdToScores,
+                configToExperimentVariants,
+                hasFailure
+            );
+
+            // Transform results to a compact representation (similar to Pointwise)
+            return configFuture.thenAccept(results -> {
+                List<Map<String, Object>> evaluationResults = (List<Map<String, Object>>) results.get("evaluationResults");
+                if (evaluationResults != null && !evaluationResults.isEmpty()) {
+                    for (Map<String, Object> evalResult : evaluationResults) {
+                        Map<String, Object> result = new HashMap<>();
+                        result.put("evaluationId", evalResult.get("evaluationId"));
+                        result.put("searchConfigurationId", remoteConfigId); // keep field name for compatibility
+                        result.put("queryText",
queryText); + allResults.add(result); + } + } else { + Map result = new HashMap<>(); + result.put("queryText", queryText); + result.put("searchConfigurationId", remoteConfigId); + allResults.add(result); + } + }).exceptionally(ex -> { + log.error("Failed to process remote config {}: {}", remoteConfigId, ex.getMessage()); + return null; + }); + }).collect(Collectors.toList()); + + // Wait for all configurations to complete + CompletableFuture.allOf(configFutures.toArray(new CompletableFuture[0])).thenAccept(v -> { + Map queryResponse = new HashMap<>(); + queryResponse.put("results", new ArrayList<>(allResults)); + + log.info("Completed remote experiment {} with {} results", experimentId, allResults.size()); + listener.onResponse(queryResponse); + }).exceptionally(e -> { + if (hasFailure.compareAndSet(false, true)) { + listener.onFailure(new Exception("Failed to process remote configurations", e)); + } + return null; + }); + } + + /** + * Create one experiment variant per remote configuration + */ + private List createRemoteVariants(String experimentId, List remoteConfigIds) { + return remoteConfigIds.stream().map(remoteConfigId -> { + Map parameters = new HashMap<>(); + parameters.put("remoteConfigId", remoteConfigId); + + return new ExperimentVariant( + UUID.randomUUID().toString(), + TimeUtils.getTimestamp(), + ExperimentType.REMOTE_SEARCH_EVALUATION, + AsyncStatus.PROCESSING, + experimentId, + parameters, + Map.of() + ); + }).collect(Collectors.toList()); + } +} diff --git a/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndices.java b/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndices.java index 430737f9..97c67712 100644 --- a/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndices.java +++ b/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndices.java @@ -19,6 +19,12 @@ import static org.opensearch.searchrelevance.common.PluginConstants.JUDGMENT_INDEX_MAPPING; import static 
org.opensearch.searchrelevance.common.PluginConstants.QUERY_SET_INDEX; import static org.opensearch.searchrelevance.common.PluginConstants.QUERY_SET_INDEX_MAPPING; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CACHE_INDEX; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CACHE_INDEX_MAPPING; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CONFIG_INDEX; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CONFIG_INDEX_MAPPING; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_FAILURE_INDEX; +import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_FAILURE_INDEX_MAPPING; import static org.opensearch.searchrelevance.common.PluginConstants.SEARCH_CONFIGURATION_INDEX; import static org.opensearch.searchrelevance.common.PluginConstants.SEARCH_CONFIGURATION_INDEX_MAPPING; import static org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager.getIndexMappings; @@ -66,7 +72,22 @@ public enum SearchRelevanceIndices { /** * Experiment Variant Index */ - EXPERIMENT_VARIANT(EXPERIMENT_VARIANT_INDEX, EXPERIMENT_VARIANT_INDEX_MAPPING, false); + EXPERIMENT_VARIANT(EXPERIMENT_VARIANT_INDEX, EXPERIMENT_VARIANT_INDEX_MAPPING, false), + + /** + * Remote Search Configuration Index + */ + REMOTE_SEARCH_CONFIGURATION(REMOTE_SEARCH_CONFIG_INDEX, REMOTE_SEARCH_CONFIG_INDEX_MAPPING, false), + + /** + * Remote Search Cache Index + */ + REMOTE_SEARCH_CACHE(REMOTE_SEARCH_CACHE_INDEX, REMOTE_SEARCH_CACHE_INDEX_MAPPING, false), + + /** + * Remote Search Failure Index + */ + REMOTE_SEARCH_FAILURE(REMOTE_SEARCH_FAILURE_INDEX, REMOTE_SEARCH_FAILURE_INDEX_MAPPING, false); private final String indexName; private final String mapping; diff --git a/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndicesManager.java 
b/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndicesManager.java index bb8a8157..66ca5685 100644 --- a/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndicesManager.java +++ b/src/main/java/org/opensearch/searchrelevance/indices/SearchRelevanceIndicesManager.java @@ -73,6 +73,17 @@ public void createIndexIfAbsent(final SearchRelevanceIndices index, final StepLi } final CreateIndexRequest createIndexRequest = new CreateIndexRequest(indexName).mapping(mapping); + + // Apply default settings for remote search cache index + if (index == SearchRelevanceIndices.REMOTE_SEARCH_CACHE) { + createIndexRequest.settings( + org.opensearch.common.settings.Settings.builder() + .put("index.number_of_shards", 1) + .put("index.number_of_replicas", 0) + .build() + ); + } + StashedThreadContext.run(client, () -> client.admin().indices().create(createIndexRequest, new ActionListener<>() { @Override public void onResponse(final CreateIndexResponse createIndexResponse) { @@ -106,6 +117,17 @@ private void createIndexIfAbsentSync(final SearchRelevanceIndices index) { return; } final CreateIndexRequest createIndexRequest = new CreateIndexRequest(indexName).mapping(mapping); + + // Apply default settings for remote search cache index + if (index == SearchRelevanceIndices.REMOTE_SEARCH_CACHE) { + createIndexRequest.settings( + org.opensearch.common.settings.Settings.builder() + .put("index.number_of_shards", 1) + .put("index.number_of_replicas", 0) + .build() + ); + } + StashedThreadContext.run(client, () -> client.admin().indices().create(createIndexRequest)); } @@ -327,9 +349,19 @@ public void onResponse(SearchResponse response) { @Override public void onFailure(Exception e) { - actionListener.onFailure( - new SearchRelevanceException("Failed to get document", e, RestStatus.INTERNAL_SERVER_ERROR) - ); + if (e instanceof IndexNotFoundException) { + // If index doesn't exist, treat it as document not found + typedListener.onFailure( + new 
ResourceNotFoundException( + "Document not found: " + searchOperationContextArg.getDocumentId(), + RestStatus.NOT_FOUND + ) + ); + } else { + actionListener.onFailure( + new SearchRelevanceException("Failed to get document", e, RestStatus.INTERNAL_SERVER_ERROR) + ); + } } }); } catch (Exception e) { diff --git a/src/main/java/org/opensearch/searchrelevance/model/ExperimentType.java b/src/main/java/org/opensearch/searchrelevance/model/ExperimentType.java index f3f8cb6b..ac465bf7 100644 --- a/src/main/java/org/opensearch/searchrelevance/model/ExperimentType.java +++ b/src/main/java/org/opensearch/searchrelevance/model/ExperimentType.java @@ -10,5 +10,6 @@ public enum ExperimentType { PAIRWISE_COMPARISON, POINTWISE_EVALUATION, - HYBRID_OPTIMIZER + HYBRID_OPTIMIZER, + REMOTE_SEARCH_EVALUATION } diff --git a/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchCache.java b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchCache.java new file mode 100644 index 00000000..afca5d4c --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchCache.java @@ -0,0 +1,163 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. + */ +package org.opensearch.searchrelevance.model; + +import java.io.IOException; + +import org.opensearch.core.xcontent.ToXContentObject; +import org.opensearch.core.xcontent.XContentBuilder; + +/** + * RemoteSearchCache represents cached responses from remote search engines. + * This enables performance optimization by avoiding repeated remote calls for the same queries. 
+ */
+public class RemoteSearchCache implements ToXContentObject {
+    public static final String CACHE_KEY = "cacheKey";
+    public static final String ID_FIELD = "cacheKey"; // Alias for DAO compatibility
+    public static final String REMOTE_CONFIG_ID = "remoteConfigId";
+    public static final String CONFIGURATION_ID_FIELD = "remoteConfigId"; // Alias for DAO compatibility
+    public static final String QUERY = "query";
+    public static final String QUERY_HASH_FIELD = "query"; // Alias for DAO compatibility
+    public static final String QUERY_TEXT = "queryText";
+    public static final String QUERY_TEXT_FIELD = "queryText"; // Alias for DAO compatibility
+    public static final String CACHED_RESPONSE = "cachedResponse";
+    public static final String RAW_RESPONSE_FIELD = "cachedResponse"; // Alias for DAO compatibility
+    public static final String MAPPED_RESPONSE = "mappedResponse";
+    public static final String MAPPED_RESPONSE_FIELD = "mappedResponse"; // Alias for DAO compatibility
+    public static final String CACHE_TIMESTAMP = "cacheTimestamp";
+    public static final String TIMESTAMP_FIELD = "cacheTimestamp"; // Alias for DAO compatibility
+    public static final String EXPIRATION_TIMESTAMP = "expirationTimestamp";
+    public static final String TTL_MINUTES_FIELD = "ttlMinutes"; // Alias for DAO compatibility
+
+    private final String cacheKey;
+    private final String remoteConfigId;
+    private final String query;
+    private final String queryText;
+    private final String cachedResponse;
+    private final String mappedResponse;
+    private final long cacheTimestamp;
+    private final long expirationTimestamp;
+
+    public RemoteSearchCache(
+        String cacheKey,
+        String remoteConfigId,
+        String query,
+        String queryText,
+        String cachedResponse,
+        String mappedResponse,
+        long cacheTimestamp,
+        long expirationTimestamp
+    ) {
+        this.cacheKey = cacheKey;
+        this.remoteConfigId = remoteConfigId;
+        this.query = query;
+        this.queryText = queryText;
+        this.cachedResponse = cachedResponse;
+        this.mappedResponse = mappedResponse;
+        this.cacheTimestamp = cacheTimestamp;
+        this.expirationTimestamp = expirationTimestamp;
+    }
+
+    @Override
+    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
+        XContentBuilder xContentBuilder = builder.startObject();
+        xContentBuilder.field(CACHE_KEY, this.cacheKey != null ? this.cacheKey : "");
+        xContentBuilder.field(REMOTE_CONFIG_ID, this.remoteConfigId != null ? this.remoteConfigId : "");
+        xContentBuilder.field(QUERY, this.query != null ? this.query : "");
+        xContentBuilder.field(QUERY_TEXT, this.queryText != null ? this.queryText : "");
+        xContentBuilder.field(CACHED_RESPONSE, this.cachedResponse != null ? this.cachedResponse : "");
+        xContentBuilder.field(MAPPED_RESPONSE, this.mappedResponse != null ? this.mappedResponse : "");
+        xContentBuilder.field(CACHE_TIMESTAMP, this.cacheTimestamp);
+        xContentBuilder.field(EXPIRATION_TIMESTAMP, this.expirationTimestamp);
+        return xContentBuilder.endObject();
+    }
+
+    /**
+     * Check if this cache entry has expired
+     */
+    public boolean isExpired() {
+        return System.currentTimeMillis() > expirationTimestamp;
+    }
+
+    /**
+     * Generate cache key from configuration ID, query, and query text
+     */
+    public static String generateCacheKey(String remoteConfigId, String query, String queryText) {
+        return String.valueOf((remoteConfigId + query + queryText).hashCode());
+    }
+
+    // Getters
+    public String getCacheKey() {
+        return cacheKey;
+    }
+
+    public String getRemoteConfigId() {
+        return remoteConfigId;
+    }
+
+    /**
+     * Get configuration ID for DAO compatibility (returns remote config ID)
+     */
+    public String getConfigurationId() {
+        return remoteConfigId;
+    }
+
+    public String getQuery() {
+        return query;
+    }
+
+    public String getQueryText() {
+        return queryText;
+    }
+
+    public String getCachedResponse() {
+        return cachedResponse;
+    }
+
+    public String getMappedResponse() {
+        return mappedResponse;
+    }
+
+    public long getCacheTimestamp() {
+        return cacheTimestamp;
+    }
+
+    public long getExpirationTimestamp() {
+        return expirationTimestamp;
+    }
+
+    /**
+     * Get ID for DAO compatibility (returns cache key)
+     */
+    public String getId() {
+        return cacheKey;
+    }
+
+    /**
+     * Get response for DAO compatibility (returns cached response)
+     */
+    public String getResponse() {
+        return cachedResponse;
+    }
+
+    /**
+     * Create RemoteSearchCache from source map for DAO operations
+     */
+    public static RemoteSearchCache fromSourceMap(java.util.Map<String, Object> sourceMap) {
+        return new RemoteSearchCache(
+            (String) sourceMap.get(CACHE_KEY),
+            (String) sourceMap.get(REMOTE_CONFIG_ID),
+            (String) sourceMap.get(QUERY),
+            (String) sourceMap.get(QUERY_TEXT),
+            (String) sourceMap.get(CACHED_RESPONSE),
+            (String) sourceMap.get(MAPPED_RESPONSE),
+            sourceMap.get(CACHE_TIMESTAMP) != null ? ((Number) sourceMap.get(CACHE_TIMESTAMP)).longValue() : 0L,
+            sourceMap.get(EXPIRATION_TIMESTAMP) != null ? ((Number) sourceMap.get(EXPIRATION_TIMESTAMP)).longValue() : 0L
+        );
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchConfiguration.java b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchConfiguration.java
new file mode 100644
index 00000000..7760697b
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchConfiguration.java
@@ -0,0 +1,171 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.model;
+
+import java.io.IOException;
+import java.util.Map;
+
+import org.opensearch.core.xcontent.ToXContentObject;
+import org.opensearch.core.xcontent.XContentBuilder;
+
+/**
+ * RemoteSearchConfiguration represents connection details and settings for remote search engines.
+ * This enables experiments to run against remote OpenSearch clusters or other search engines via HTTPS.
+ */
+public class RemoteSearchConfiguration implements ToXContentObject {
+    public static final String ID = "id";
+    public static final String NAME = "name";
+    public static final String DESCRIPTION = "description";
+    public static final String CONNECTION_URL = "connectionUrl";
+    public static final String USERNAME = "username";
+    public static final String PASSWORD = "password";
+    public static final String QUERY_TEMPLATE = "queryTemplate";
+    public static final String RESPONSE_TEMPLATE = "responseTemplate";
+    public static final String MAX_REQUESTS_PER_SECOND = "maxRequestsPerSecond";
+    public static final String MAX_CONCURRENT_REQUESTS = "maxConcurrentRequests";
+    public static final String CACHE_DURATION_MINUTES = "cacheDurationMinutes";
+    public static final String REFRESH_CACHE = "refreshCache";
+    public static final String METADATA = "metadata";
+    public static final String TIMESTAMP = "timestamp";
+
+    // Default values
+    public static final int DEFAULT_MAX_REQUESTS_PER_SECOND = 10;
+    public static final int DEFAULT_MAX_CONCURRENT_REQUESTS = 5;
+    public static final long DEFAULT_CACHE_DURATION_MINUTES = 60;
+
+    private final String id;
+    private final String name;
+    private final String description;
+    private final String connectionUrl;
+    private final String username;
+    private final String password; // Will be encrypted in storage
+    private final String queryTemplate;
+    private final String responseTemplate;
+    private final int maxRequestsPerSecond;
+    private final int maxConcurrentRequests;
+    private final long cacheDurationMinutes;
+    private final boolean refreshCache;
+    private final Map<String, Object> metadata;
+    private final String timestamp;
+
+    public RemoteSearchConfiguration(
+        String id,
+        String name,
+        String description,
+        String connectionUrl,
+        String username,
+        String password,
+        String queryTemplate,
+        String responseTemplate,
+        int maxRequestsPerSecond,
+        int maxConcurrentRequests,
+        long cacheDurationMinutes,
+        boolean refreshCache,
+        Map<String, Object> metadata,
+        String timestamp
+    ) {
+        this.id = id;
+        this.name = name;
+        this.description = description;
+        this.connectionUrl = connectionUrl;
+        this.username = username;
+        this.password = password;
+        this.queryTemplate = queryTemplate;
+        this.responseTemplate = responseTemplate;
+        this.maxRequestsPerSecond = maxRequestsPerSecond;
+        this.maxConcurrentRequests = maxConcurrentRequests;
+        this.cacheDurationMinutes = cacheDurationMinutes;
+        this.refreshCache = refreshCache;
+        this.metadata = metadata;
+        this.timestamp = timestamp;
+    }
+
+    @Override
+    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
+        XContentBuilder xContentBuilder = builder.startObject();
+        xContentBuilder.field(ID, this.id);
+        xContentBuilder.field(NAME, this.name != null ? this.name.trim() : "");
+        xContentBuilder.field(DESCRIPTION, this.description != null ? this.description.trim() : "");
+        xContentBuilder.field(CONNECTION_URL, this.connectionUrl != null ? this.connectionUrl.trim() : "");
+        xContentBuilder.field(USERNAME, this.username != null ? this.username.trim() : "");
+        xContentBuilder.field(PASSWORD, this.password != null ? this.password : ""); // Password will be encrypted
+        xContentBuilder.field(QUERY_TEMPLATE, this.queryTemplate != null ? this.queryTemplate.trim() : "");
+        xContentBuilder.field(RESPONSE_TEMPLATE, this.responseTemplate != null ? this.responseTemplate.trim() : "");
+        xContentBuilder.field(MAX_REQUESTS_PER_SECOND, this.maxRequestsPerSecond);
+        xContentBuilder.field(MAX_CONCURRENT_REQUESTS, this.maxConcurrentRequests);
+        xContentBuilder.field(CACHE_DURATION_MINUTES, this.cacheDurationMinutes);
+        xContentBuilder.field(REFRESH_CACHE, this.refreshCache);
+        xContentBuilder.field(METADATA, this.metadata);
+        xContentBuilder.field(TIMESTAMP, this.timestamp != null ? this.timestamp.trim() : "");
+        return xContentBuilder.endObject();
+    }
+
+    // Getters
+    public String getId() {
+        return id;
+    }
+
+    public String getName() {
+        return name;
+    }
+
+    public String getDescription() {
+        return description;
+    }
+
+    public String getConnectionUrl() {
+        return connectionUrl;
+    }
+
+    public String getUsername() {
+        return username;
+    }
+
+    public String getPassword() {
+        return password;
+    }
+
+    public String getQueryTemplate() {
+        return queryTemplate;
+    }
+
+    public String getResponseTemplate() {
+        return responseTemplate;
+    }
+
+    public int getMaxRequestsPerSecond() {
+        return maxRequestsPerSecond;
+    }
+
+    public int getMaxConcurrentRequests() {
+        return maxConcurrentRequests;
+    }
+
+    public long getCacheDurationMinutes() {
+        return cacheDurationMinutes;
+    }
+
+    public boolean isRefreshCache() {
+        return refreshCache;
+    }
+
+    public Map<String, Object> getMetadata() {
+        return metadata;
+    }
+
+    public String getTimestamp() {
+        return timestamp;
+    }
+
+    /**
+     * Alias for getCacheDurationMinutes() for DAO compatibility
+     */
+    public long getCacheTtlMinutes() {
+        return cacheDurationMinutes;
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchFailure.java b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchFailure.java
new file mode 100644
index 00000000..f9867dc7
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/model/RemoteSearchFailure.java
@@ -0,0 +1,226 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.model;
+
+import java.io.IOException;
+import java.util.Locale;
+
+import org.opensearch.core.xcontent.ToXContentObject;
+import org.opensearch.core.xcontent.XContentBuilder;
+
+/**
+ * RemoteSearchFailure represents failed remote search operations for analysis and potential retry.
+ * This enables tracking and debugging of remote search connectivity issues.
+ */
+public class RemoteSearchFailure implements ToXContentObject {
+    public static final String ID = "id";
+    public static final String ID_FIELD = "id"; // Alias for DAO compatibility
+    public static final String REMOTE_CONFIG_ID = "remoteConfigId";
+    public static final String CONFIGURATION_ID_FIELD = "remoteConfigId"; // Alias for DAO compatibility
+    public static final String EXPERIMENT_ID = "experimentId";
+    public static final String EXPERIMENT_ID_FIELD = "experimentId"; // Alias for DAO compatibility
+    public static final String QUERY = "query";
+    public static final String QUERY_FIELD = "query"; // Alias for DAO compatibility
+    public static final String QUERY_TEXT = "queryText";
+    public static final String QUERY_TEXT_FIELD = "queryText"; // Alias for DAO compatibility
+    public static final String ERROR_TYPE = "errorType";
+    public static final String ERROR_TYPE_FIELD = "errorType"; // Alias for DAO compatibility
+    public static final String ERROR_MESSAGE = "errorMessage";
+    public static final String ERROR_MESSAGE_FIELD = "errorMessage"; // Alias for DAO compatibility
+    public static final String STACK_TRACE = "stackTrace";
+    public static final String STACK_TRACE_FIELD = "stackTrace"; // Alias for DAO compatibility
+    public static final String HTTP_STATUS_CODE = "httpStatusCode";
+    public static final String HTTP_STATUS_CODE_FIELD = "httpStatusCode"; // Alias for DAO compatibility
+    public static final String TIMESTAMP = "timestamp";
+    public static final String TIMESTAMP_FIELD = "timestamp"; // Alias for DAO compatibility
+    public static final String STATUS = "status";
+    public static final String STATUS_FIELD = "status"; // Alias for DAO compatibility
+
+    /**
+     * Error types for remote search failures
+     */
+    public enum ErrorType {
+        CONNECTION_TIMEOUT,
+        AUTH_FAILURE,
+        INVALID_RESPONSE,
+        RATE_LIMIT_EXCEEDED,
+        REMOTE_SERVER_ERROR,
+        NETWORK_ERROR,
+        UNKNOWN_ERROR
+    }
+
+    /**
+     * Status of the failure record
+     */
+    public enum Status {
+        FAILED,
+        RETRY_PENDING,
+        RESOLVED
+    }
+
+    private final String id;
+    private final String remoteConfigId;
+    private final String experimentId;
+    private final String query;
+    private final String queryText;
+    private final String errorType;
+    private final String errorMessage;
+    private final String timestamp;
+    private final String status;
+
+    public RemoteSearchFailure(
+        String id,
+        String remoteConfigId,
+        String experimentId,
+        String query,
+        String queryText,
+        String errorType,
+        String errorMessage,
+        String timestamp,
+        String status
+    ) {
+        this.id = id;
+        this.remoteConfigId = remoteConfigId;
+        this.experimentId = experimentId;
+        this.query = query;
+        this.queryText = queryText;
+        this.errorType = errorType;
+        this.errorMessage = errorMessage;
+        this.timestamp = timestamp;
+        this.status = status;
+    }
+
+    @Override
+    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
+        XContentBuilder xContentBuilder = builder.startObject();
+        xContentBuilder.field(ID, this.id != null ? this.id : "");
+        xContentBuilder.field(REMOTE_CONFIG_ID, this.remoteConfigId != null ? this.remoteConfigId : "");
+        xContentBuilder.field(EXPERIMENT_ID, this.experimentId != null ? this.experimentId : "");
+        xContentBuilder.field(QUERY, this.query != null ? this.query : "");
+        xContentBuilder.field(QUERY_TEXT, this.queryText != null ? this.queryText : "");
+        xContentBuilder.field(ERROR_TYPE, this.errorType != null ? this.errorType : "");
+        xContentBuilder.field(ERROR_MESSAGE, this.errorMessage != null ? this.errorMessage : "");
+        xContentBuilder.field(TIMESTAMP, this.timestamp != null ? this.timestamp : "");
+        xContentBuilder.field(STATUS, this.status != null ? this.status : "");
+        return xContentBuilder.endObject();
+    }
+
+    /**
+     * Create a failure record from an exception
+     */
+    public static RemoteSearchFailure fromException(
+        String id,
+        String remoteConfigId,
+        String experimentId,
+        String query,
+        String queryText,
+        Exception exception,
+        String timestamp
+    ) {
+        ErrorType errorType = categorizeException(exception);
+        return new RemoteSearchFailure(
+            id,
+            remoteConfigId,
+            experimentId,
+            query,
+            queryText,
+            errorType.name(),
+            exception.getMessage(),
+            timestamp,
+            Status.FAILED.name()
+        );
+    }
+
+    /**
+     * Categorize exception into error types
+     */
+    private static ErrorType categorizeException(Exception exception) {
+        String message = exception.getMessage();
+        if (message == null) {
+            return ErrorType.UNKNOWN_ERROR;
+        }
+
+        String lowerMessage = message.toLowerCase(Locale.ROOT);
+        if (lowerMessage.contains("timeout") || lowerMessage.contains("timed out")) {
+            return ErrorType.CONNECTION_TIMEOUT;
+        } else if (lowerMessage.contains("unauthorized") || lowerMessage.contains("authentication")) {
+            return ErrorType.AUTH_FAILURE;
+        } else if (lowerMessage.contains("rate limit") || lowerMessage.contains("too many requests")) {
+            return ErrorType.RATE_LIMIT_EXCEEDED;
+        } else if (lowerMessage.contains("server error") || lowerMessage.contains("500")) {
+            return ErrorType.REMOTE_SERVER_ERROR;
+        } else if (lowerMessage.contains("network") || lowerMessage.contains("connection")) {
+            return ErrorType.NETWORK_ERROR;
+        } else if (lowerMessage.contains("response") || lowerMessage.contains("parse")) {
+            return ErrorType.INVALID_RESPONSE;
+        } else {
+            return ErrorType.UNKNOWN_ERROR;
+        }
+    }
+
+    // Getters
+    public String getId() {
+        return id;
+    }
+
+    public String getRemoteConfigId() {
+        return remoteConfigId;
+    }
+
+    /**
+     * Get configuration ID for DAO compatibility (returns remote config ID)
+     */
+    public String getConfigurationId() {
+        return remoteConfigId;
+    }
+
+    public String getExperimentId() {
+        return experimentId;
+    }
+
+    public String getQuery() {
+        return query;
+    }
+
+    public String getQueryText() {
+        return queryText;
+    }
+
+    public String getErrorType() {
+        return errorType;
+    }
+
+    public String getErrorMessage() {
+        return errorMessage;
+    }
+
+    public String getTimestamp() {
+        return timestamp;
+    }
+
+    public String getStatus() {
+        return status;
+    }
+
+    /**
+     * Create RemoteSearchFailure from source map for DAO operations
+     */
+    public static RemoteSearchFailure fromSourceMap(java.util.Map<String, Object> sourceMap) {
+        return new RemoteSearchFailure(
+            (String) sourceMap.get(ID),
+            (String) sourceMap.get(REMOTE_CONFIG_ID),
+            (String) sourceMap.get(EXPERIMENT_ID),
+            (String) sourceMap.get(QUERY),
+            (String) sourceMap.get(QUERY_TEXT),
+            (String) sourceMap.get(ERROR_TYPE),
+            (String) sourceMap.get(ERROR_MESSAGE),
+            (String) sourceMap.get(TIMESTAMP),
+            (String) sourceMap.get(STATUS)
+        );
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/plugin/SearchRelevancePlugin.java b/src/main/java/org/opensearch/searchrelevance/plugin/SearchRelevancePlugin.java
index 7d884941..dc6670d8 100644
--- a/src/main/java/org/opensearch/searchrelevance/plugin/SearchRelevancePlugin.java
+++ b/src/main/java/org/opensearch/searchrelevance/plugin/SearchRelevancePlugin.java
@@ -48,6 +48,9 @@
 import org.opensearch.searchrelevance.dao.JudgmentCacheDao;
 import org.opensearch.searchrelevance.dao.JudgmentDao;
 import org.opensearch.searchrelevance.dao.QuerySetDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao;
 import org.opensearch.searchrelevance.dao.SearchConfigurationDao;
 import org.opensearch.searchrelevance.executors.ExperimentTaskManager;
 import org.opensearch.searchrelevance.executors.SearchRelevanceExecutor;
@@ -58,15 +61,19 @@
 import org.opensearch.searchrelevance.rest.RestDeleteExperimentAction;
 import org.opensearch.searchrelevance.rest.RestDeleteJudgmentAction;
 import org.opensearch.searchrelevance.rest.RestDeleteQuerySetAction;
+import org.opensearch.searchrelevance.rest.RestDeleteRemoteSearchConfigurationAction;
 import org.opensearch.searchrelevance.rest.RestDeleteSearchConfigurationAction;
 import org.opensearch.searchrelevance.rest.RestGetExperimentAction;
 import org.opensearch.searchrelevance.rest.RestGetJudgmentAction;
 import org.opensearch.searchrelevance.rest.RestGetQuerySetAction;
+import org.opensearch.searchrelevance.rest.RestGetRemoteSearchConfigurationAction;
 import org.opensearch.searchrelevance.rest.RestGetSearchConfigurationAction;
 import org.opensearch.searchrelevance.rest.RestPutExperimentAction;
 import org.opensearch.searchrelevance.rest.RestPutJudgmentAction;
 import org.opensearch.searchrelevance.rest.RestPutQuerySetAction;
+import org.opensearch.searchrelevance.rest.RestPutRemoteSearchConfigurationAction;
 import org.opensearch.searchrelevance.rest.RestPutSearchConfigurationAction;
+import org.opensearch.searchrelevance.rest.RestRemoteSearchExecuteAction;
 import org.opensearch.searchrelevance.rest.RestSearchRelevanceStatsAction;
 import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor;
 import org.opensearch.searchrelevance.stats.events.EventStatsManager;
@@ -120,6 +127,9 @@ public class SearchRelevancePlugin extends Plugin implements ActionPlugin, Syste
     private JudgmentDao judgmentDao;
     private EvaluationResultDao evaluationResultDao;
     private JudgmentCacheDao judgmentCacheDao;
+    private RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+    private RemoteSearchCacheDao remoteSearchCacheDao;
+    private RemoteSearchFailureDao remoteSearchFailureDao;
     private MLAccessor mlAccessor;
     private MetricsHelper metricsHelper;
     private SearchRelevanceSettingsAccessor settingsAccessor;
@@ -158,6 +168,9 @@ public Collection<Object> createComponents(
         this.judgmentDao = new JudgmentDao(searchRelevanceIndicesManager);
         this.evaluationResultDao = new EvaluationResultDao(searchRelevanceIndicesManager);
         this.judgmentCacheDao = new JudgmentCacheDao(searchRelevanceIndicesManager);
+        this.remoteSearchConfigurationDao = new RemoteSearchConfigurationDao(searchRelevanceIndicesManager);
+        this.remoteSearchCacheDao = new RemoteSearchCacheDao(searchRelevanceIndicesManager);
+        this.remoteSearchFailureDao = new RemoteSearchFailureDao(searchRelevanceIndicesManager);
         MachineLearningNodeClient mlClient = new MachineLearningNodeClient(client);
         this.mlAccessor = new MLAccessor(mlClient);
         SearchRelevanceExecutor.initialize(threadPool);
@@ -165,7 +178,10 @@ public Collection<Object> createComponents(
             client,
             evaluationResultDao,
             experimentVariantDao,
-            threadPool
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
         );
         this.metricsHelper = new MetricsHelper(clusterService, client, judgmentDao, evaluationResultDao, experimentVariantDao);
         this.settingsAccessor = new SearchRelevanceSettingsAccessor(clusterService, environment.settings());
@@ -213,7 +229,12 @@
             new RestPutExperimentAction(settingsAccessor),
             new RestGetExperimentAction(settingsAccessor),
             new RestDeleteExperimentAction(settingsAccessor),
-            new RestSearchRelevanceStatsAction(settingsAccessor, clusterUtil)
+            new RestSearchRelevanceStatsAction(settingsAccessor, clusterUtil),
+            // Remote search capability
+            new RestPutRemoteSearchConfigurationAction(settingsAccessor, remoteSearchConfigurationDao),
+            new RestGetRemoteSearchConfigurationAction(settingsAccessor, remoteSearchConfigurationDao),
+            new RestDeleteRemoteSearchConfigurationAction(settingsAccessor, remoteSearchConfigurationDao),
+            new RestRemoteSearchExecuteAction(settingsAccessor, remoteSearchConfigurationDao, remoteSearchCacheDao, remoteSearchFailureDao)
         );
     }
diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestDeleteRemoteSearchConfigurationAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestDeleteRemoteSearchConfigurationAction.java
new file mode 100644
index 00000000..66062397
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/rest/RestDeleteRemoteSearchConfigurationAction.java
@@ -0,0 +1,96 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.rest;
+
+import static java.util.Collections.singletonList;
+import static org.opensearch.rest.RestRequest.Method.DELETE;
+import static org.opensearch.searchrelevance.common.PluginConstants.DOCUMENT_ID;
+import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CONFIGURATIONS_URL;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Locale;
+
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+import org.opensearch.action.DocWriteResponse;
+import org.opensearch.action.delete.DeleteResponse;
+import org.opensearch.core.action.ActionListener;
+import org.opensearch.core.rest.RestStatus;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.rest.BaseRestHandler;
+import org.opensearch.rest.BytesRestResponse;
+import org.opensearch.rest.RestRequest;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.exception.SearchRelevanceException;
+import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor;
+import org.opensearch.transport.client.node.NodeClient;
+
+import lombok.AllArgsConstructor;
+
+/**
+ * REST action to delete a RemoteSearchConfiguration by ID.
+ * Route: DELETE /_plugins/_search_relevance/remote_search_configurations/{id} + */ +@AllArgsConstructor +public class RestDeleteRemoteSearchConfigurationAction extends BaseRestHandler { + private static final Logger LOGGER = LogManager.getLogger(RestDeleteRemoteSearchConfigurationAction.class); + private static final String ACTION_NAME = "delete_remote_search_configuration_action"; + + private final SearchRelevanceSettingsAccessor settingsAccessor; + private final RemoteSearchConfigurationDao remoteSearchConfigurationDao; + + @Override + public String getName() { + return ACTION_NAME; + } + + @Override + public List routes() { + return singletonList(new Route(DELETE, String.format(Locale.ROOT, "%s/{%s}", REMOTE_SEARCH_CONFIGURATIONS_URL, DOCUMENT_ID))); + } + + @Override + protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) throws IOException { + if (!settingsAccessor.isWorkbenchEnabled()) { + return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.FORBIDDEN, "Search Relevance Workbench is disabled")); + } + final String id = request.param(DOCUMENT_ID); + if (id == null || id.isBlank()) { + throw new SearchRelevanceException("id cannot be null or empty", RestStatus.BAD_REQUEST); + } + + return channel -> remoteSearchConfigurationDao.deleteRemoteSearchConfiguration(id, new ActionListener() { + @Override + public void onResponse(DeleteResponse deleteResponse) { + try { + XContentBuilder builder = channel.newBuilder(); + deleteResponse.toXContent(builder, request); + channel.sendResponse( + new BytesRestResponse( + deleteResponse.getResult() == DocWriteResponse.Result.NOT_FOUND ? 
RestStatus.NOT_FOUND : RestStatus.OK, + builder + ) + ); + } catch (IOException e) { + onFailure(e); + } + } + + @Override + public void onFailure(Exception e) { + try { + LOGGER.error("Failed to delete remote search configuration {}", id, e); + channel.sendResponse(new BytesRestResponse(channel, RestStatus.INTERNAL_SERVER_ERROR, e)); + } catch (IOException ex) { + LOGGER.error("Failed to send error response", ex); + } + } + }); + } +} diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestGetRemoteSearchConfigurationAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestGetRemoteSearchConfigurationAction.java new file mode 100644 index 00000000..779ddbdc --- /dev/null +++ b/src/main/java/org/opensearch/searchrelevance/rest/RestGetRemoteSearchConfigurationAction.java @@ -0,0 +1,141 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */
+package org.opensearch.searchrelevance.rest;
+
+import static org.opensearch.rest.RestRequest.Method.GET;
+import static org.opensearch.searchrelevance.common.PluginConstants.DOCUMENT_ID;
+import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CONFIGURATIONS_URL;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Locale;
+
+import org.opensearch.ExceptionsHelper;
+import org.opensearch.core.action.ActionListener;
+import org.opensearch.core.rest.RestStatus;
+import org.opensearch.core.xcontent.ToXContent;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.rest.BaseRestHandler;
+import org.opensearch.rest.BytesRestResponse;
+import org.opensearch.rest.RestRequest;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.model.RemoteSearchConfiguration;
+import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor;
+import org.opensearch.transport.client.node.NodeClient;
+
+/**
+ * REST action to get or list remote search configurations.
+ * Routes:
+ * - GET /_plugins/_search_relevance/remote_search_configurations/{id}
+ * - GET /_plugins/_search_relevance/remote_search_configurations
+ */
+public class RestGetRemoteSearchConfigurationAction extends BaseRestHandler {
+
+    private static final String ACTION_NAME = "get_remote_search_configuration_action";
+
+    private final SearchRelevanceSettingsAccessor settingsAccessor;
+    private final RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+
+    public RestGetRemoteSearchConfigurationAction(
+        SearchRelevanceSettingsAccessor settingsAccessor,
+        RemoteSearchConfigurationDao remoteSearchConfigurationDao
+    ) {
+        this.settingsAccessor = settingsAccessor;
+        this.remoteSearchConfigurationDao = remoteSearchConfigurationDao;
+    }
+
+    @Override
+    public String getName() {
+        return ACTION_NAME;
+    }
+
+    @Override
+    public java.util.List<Route> routes() {
+        return List.of(
+            new Route(GET, String.format(Locale.ROOT, "%s/{%s}", REMOTE_SEARCH_CONFIGURATIONS_URL, DOCUMENT_ID)),
+            new Route(GET, REMOTE_SEARCH_CONFIGURATIONS_URL)
+        );
+    }
+
+    @Override
+    protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) throws IOException {
+        if (!settingsAccessor.isWorkbenchEnabled()) {
+            return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.FORBIDDEN, "Search Relevance Workbench is disabled"));
+        }
+
+        final String id = request.param(DOCUMENT_ID);
+        if (id != null && !id.isBlank()) {
+            // Get single configuration
+            return channel -> remoteSearchConfigurationDao.getRemoteSearchConfiguration(
+                id,
+                new ActionListener<RemoteSearchConfiguration>() {
+                    @Override
+                    public void onResponse(RemoteSearchConfiguration cfg) {
+                        try {
+                            if (cfg == null) {
+                                channel.sendResponse(new BytesRestResponse(RestStatus.NOT_FOUND, "RemoteSearchConfiguration not found"));
+                                return;
+                            }
+                            XContentBuilder builder = channel.newBuilder();
+                            cfg.toXContent(builder, ToXContent.EMPTY_PARAMS);
+                            channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
+                        } catch (IOException e) {
+                            onFailure(e);
+                        }
+                    }
+
+                    @Override
+                    public void onFailure(Exception e) {
+                        try {
+                            channel.sendResponse(new BytesRestResponse(channel, ExceptionsHelper.status(e), e));
+                        } catch (IOException ex) {
+                            try {
+                                channel.sendResponse(new BytesRestResponse(channel, ex));
+                            } catch (IOException ioEx) {
+                                // ignored secondary failure
+                            }
+                        }
+                    }
+                }
+            );
+        }
+
+        // List configurations
+        return channel -> remoteSearchConfigurationDao.listRemoteSearchConfigurations(
+            new ActionListener<List<RemoteSearchConfiguration>>() {
+                @Override
+                public void onResponse(List<RemoteSearchConfiguration> list) {
+                    try {
+                        XContentBuilder builder = channel.newBuilder();
+                        builder.startArray();
+                        for (RemoteSearchConfiguration cfg : list) {
+                            cfg.toXContent(builder, ToXContent.EMPTY_PARAMS);
+                        }
+                        builder.endArray();
+                        channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
+                    } catch (IOException e) {
+                        onFailure(e);
+                    }
+                }
+
+                @Override
+                public void onFailure(Exception e) {
+                    try {
+                        channel.sendResponse(new BytesRestResponse(channel, ExceptionsHelper.status(e), e));
+                    } catch (IOException ex) {
+                        try {
+                            channel.sendResponse(new BytesRestResponse(channel, ex));
+                        } catch (IOException ioEx) {
+                            // ignored secondary failure
+                        }
+                    }
+                }
+            }
+        );
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestPutExperimentAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestPutExperimentAction.java
index c777d26e..82848fa8 100644
--- a/src/main/java/org/opensearch/searchrelevance/rest/RestPutExperimentAction.java
+++ b/src/main/java/org/opensearch/searchrelevance/rest/RestPutExperimentAction.java
@@ -116,6 +116,15 @@ protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient cli
                 }
                 break;
+            case REMOTE_SEARCH_EVALUATION:
+                if (searchConfigurationList == null || searchConfigurationList.isEmpty()) {
+                    throw new SearchRelevanceException(
+                        "REMOTE_SEARCH_EVALUATION requires at least 1 remote configuration",
+                        RestStatus.BAD_REQUEST
+                    );
+                }
+                break;
+
             default:
                 throw new IllegalArgumentException("Unsupported experiment type: " + experimentType);
         }
diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestPutQuerySetAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestPutQuerySetAction.java
index 5c2b5ec1..e626b1c6 100644
--- a/src/main/java/org/opensearch/searchrelevance/rest/RestPutQuerySetAction.java
+++ b/src/main/java/org/opensearch/searchrelevance/rest/RestPutQuerySetAction.java
@@ -88,6 +88,11 @@ protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient cli
         List querySetQueries;
         if (sampling.equals(MANUAL)) {
             List rawQueries = (List) source.get(QUERY_SET_QUERIES);
+            if (rawQueries == null) {
+                return channel -> channel.sendResponse(
+                    new BytesRestResponse(RestStatus.BAD_REQUEST, "Missing required field: " + QUERY_SET_QUERIES)
+                );
+            }
             if (rawQueries.size() > settingsAccessor.getMaxQuerySetAllowed()) {
                 return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.FORBIDDEN, "Query Set Limit Exceeded."));
             }
diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestPutRemoteSearchConfigurationAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestPutRemoteSearchConfigurationAction.java
new file mode 100644
index 00000000..254920eb
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/rest/RestPutRemoteSearchConfigurationAction.java
@@ -0,0 +1,200 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.rest;
+
+import static java.util.Collections.singletonList;
+import static org.opensearch.rest.RestRequest.Method.POST;
+import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_CONFIGURATIONS_URL;
+
+import java.io.IOException;
+import java.util.Map;
+
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+import org.opensearch.ExceptionsHelper;
+import org.opensearch.action.index.IndexResponse;
+import org.opensearch.common.xcontent.XContentFactory;
+import org.opensearch.core.action.ActionListener;
+import org.opensearch.core.rest.RestStatus;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.core.xcontent.XContentParser;
+import org.opensearch.rest.BaseRestHandler;
+import org.opensearch.rest.BytesRestResponse;
+import org.opensearch.rest.RestRequest;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.model.RemoteSearchConfiguration;
+import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor;
+import org.opensearch.searchrelevance.utils.TimeUtils;
+import org.opensearch.transport.client.node.NodeClient;
+
+/**
+ * REST action to create or update a RemoteSearchConfiguration.
+ * Route: POST /_plugins/_search_relevance/remote_search_configurations
+ */
+public class RestPutRemoteSearchConfigurationAction extends BaseRestHandler {
+    private static final Logger LOGGER = LogManager.getLogger(RestPutRemoteSearchConfigurationAction.class);
+    private static final String ACTION_NAME = "put_remote_search_configuration_action";
+
+    private final SearchRelevanceSettingsAccessor settingsAccessor;
+    private final RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+
+    public RestPutRemoteSearchConfigurationAction(
+        SearchRelevanceSettingsAccessor settingsAccessor,
+        RemoteSearchConfigurationDao remoteSearchConfigurationDao
+    ) {
+        this.settingsAccessor = settingsAccessor;
+        this.remoteSearchConfigurationDao = remoteSearchConfigurationDao;
+    }
+
+    @Override
+    public String getName() {
+        return ACTION_NAME;
+    }
+
+    @Override
+    public java.util.List<Route> routes() {
+        return singletonList(new Route(POST, REMOTE_SEARCH_CONFIGURATIONS_URL));
+    }
+
+    @Override
+    protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) throws IOException {
+        if (!settingsAccessor.isWorkbenchEnabled()) {
+            return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.FORBIDDEN, "Search Relevance Workbench is disabled"));
+        }
+
+        XContentParser parser = request.contentParser();
+        Map<String, Object> source = parser.map();
+
+        String id = asString(source.get(RemoteSearchConfiguration.ID));
+        if (id == null || id.isBlank()) {
+            throw new IllegalArgumentException("id is required");
+        }
+
+        String name = asString(source.get(RemoteSearchConfiguration.NAME));
+        String description = asString(source.get(RemoteSearchConfiguration.DESCRIPTION));
+        String connectionUrl = asString(source.get(RemoteSearchConfiguration.CONNECTION_URL));
+        String username = asString(source.get(RemoteSearchConfiguration.USERNAME));
+        String password = asString(source.get(RemoteSearchConfiguration.PASSWORD));
+        String queryTemplate = asString(source.get(RemoteSearchConfiguration.QUERY_TEMPLATE));
+        String responseTemplate = asString(source.get(RemoteSearchConfiguration.RESPONSE_TEMPLATE));
+        int maxRps = asInt(
+            source.getOrDefault(
+                RemoteSearchConfiguration.MAX_REQUESTS_PER_SECOND,
+                RemoteSearchConfiguration.DEFAULT_MAX_REQUESTS_PER_SECOND
+            )
+        );
+        int maxConcurrent = asInt(
+            source.getOrDefault(
+                RemoteSearchConfiguration.MAX_CONCURRENT_REQUESTS,
+                RemoteSearchConfiguration.DEFAULT_MAX_CONCURRENT_REQUESTS
+            )
+        );
+        long cacheMinutes = asLong(
+            source.getOrDefault(RemoteSearchConfiguration.CACHE_DURATION_MINUTES, RemoteSearchConfiguration.DEFAULT_CACHE_DURATION_MINUTES)
+        );
+        boolean refreshCache = asBoolean(source.getOrDefault(RemoteSearchConfiguration.REFRESH_CACHE, Boolean.FALSE));
+        @SuppressWarnings("unchecked")
+        Map<String, Object> metadata = (Map<String, Object>) source.get(RemoteSearchConfiguration.METADATA);
+
+        String timestamp = TimeUtils.getTimestamp();
+
+        RemoteSearchConfiguration configuration = new RemoteSearchConfiguration(
+            id,
+            name,
+            description,
+            connectionUrl,
+            username,
+            password,
+            queryTemplate,
+            responseTemplate,
+            maxRps,
+            maxConcurrent,
+            cacheMinutes,
+            refreshCache,
+            metadata,
+            timestamp
+        );
+
+        return channel -> remoteSearchConfigurationDao.createRemoteSearchConfiguration(configuration, new ActionListener<IndexResponse>() {
+            @Override
+            public void onResponse(IndexResponse response) {
+                try {
+                    XContentBuilder builder = channel.newBuilder();
+                    builder.startObject();
+                    builder.field("id", id);
+                    builder.field("result", response.getResult().name());
+                    builder.field("status", response.status().name());
+                    builder.endObject();
+                    channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
+                } catch (IOException e) {
+                    onFailure(e);
+                }
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                try {
+                    LOGGER.error("Failed to create remote search configuration {}", id, e);
+                    channel.sendResponse(new BytesRestResponse(channel, ExceptionsHelper.status(e), e));
+                } catch (IOException ex) {
+                    try {
+                        channel.sendResponse(new BytesRestResponse(channel, ex));
+                    } catch (IOException ioEx) {
+                        // ignored secondary failure
+                    }
+                }
+            }
+        });
+    }
+
+    @SuppressWarnings("unchecked")
+    private static String asString(Object o) {
+        if (o == null) {
+            return null;
+        }
+        if (o instanceof String) {
+            return (String) o;
+        }
+        if (o instanceof Map) {
+            // Convert Map to proper JSON string
+            try (XContentBuilder builder = XContentFactory.jsonBuilder()) {
+                builder.map((Map<String, ?>) o);
+                return builder.toString();
+            } catch (IOException e) {
+                LOGGER.warn("Failed to serialize Map to JSON, falling back to toString(): {}", e.getMessage());
+                return String.valueOf(o);
+            }
+        }
+        return String.valueOf(o);
+    }
+
+    private static int asInt(Object o) {
+        if (o == null) return 0;
+        if (o instanceof Number) return ((Number) o).intValue();
+        try {
+            return Integer.parseInt(String.valueOf(o));
+        } catch (NumberFormatException e) {
+            return 0;
+        }
+    }
+
+    private static long asLong(Object o) {
+        if (o == null) return 0L;
+        if (o instanceof Number) return ((Number) o).longValue();
+        try {
+            return Long.parseLong(String.valueOf(o));
+        } catch (NumberFormatException e) {
+            return 0L;
+        }
+    }
+
+    private static boolean asBoolean(Object o) {
+        if (o == null) return false;
+        if (o instanceof Boolean) return (Boolean) o;
+        return Boolean.parseBoolean(String.valueOf(o));
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/rest/RestRemoteSearchExecuteAction.java b/src/main/java/org/opensearch/searchrelevance/rest/RestRemoteSearchExecuteAction.java
new file mode 100644
index 00000000..0653666e
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/rest/RestRemoteSearchExecuteAction.java
@@ -0,0 +1,219 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.rest;
+
+import static java.util.Collections.singletonList;
+import static org.opensearch.rest.RestRequest.Method.POST;
+import static org.opensearch.searchrelevance.common.PluginConstants.REMOTE_SEARCH_EXECUTE_URL;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+
+import org.opensearch.ExceptionsHelper;
+import org.opensearch.core.action.ActionListener;
+import org.opensearch.core.rest.RestStatus;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.core.xcontent.XContentParser;
+import org.opensearch.rest.BaseRestHandler;
+import org.opensearch.rest.BytesRestResponse;
+import org.opensearch.rest.RestRequest;
+import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao;
+import org.opensearch.searchrelevance.executors.RemoteResponseMapper;
+import org.opensearch.searchrelevance.executors.RemoteSearchExecutor;
+import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor;
+import org.opensearch.searchrelevance.utils.ResponseValidationUtils;
+import org.opensearch.transport.client.node.NodeClient;
+
+/**
+ * REST action for executing a remote search via configured remote search configuration.
+ * Route: POST /_plugins/_search_relevance/remote_search/execute
+ * Body: { "remoteConfigId": "...", "queryText": "...", "size": 3, "query": "{...}" }
+ *
+ * If "query" is omitted, the configuration's queryTemplate should produce the outbound payload.
+ */
+public class RestRemoteSearchExecuteAction extends BaseRestHandler {
+
+    private static final String ACTION_NAME = "remote_search_execute_action";
+    private final SearchRelevanceSettingsAccessor settingsAccessor;
+    private final RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+    private final RemoteSearchCacheDao remoteSearchCacheDao;
+    private final RemoteSearchFailureDao remoteSearchFailureDao;
+
+    public RestRemoteSearchExecuteAction(
+        SearchRelevanceSettingsAccessor settingsAccessor,
+        RemoteSearchConfigurationDao remoteSearchConfigurationDao,
+        RemoteSearchCacheDao remoteSearchCacheDao,
+        RemoteSearchFailureDao remoteSearchFailureDao
+    ) {
+        this.settingsAccessor = settingsAccessor;
+        this.remoteSearchConfigurationDao = remoteSearchConfigurationDao;
+        this.remoteSearchCacheDao = remoteSearchCacheDao;
+        this.remoteSearchFailureDao = remoteSearchFailureDao;
+    }
+
+    @Override
+    public String getName() {
+        return ACTION_NAME;
+    }
+
+    @Override
+    public List<Route> routes() {
+        return singletonList(new Route(POST, REMOTE_SEARCH_EXECUTE_URL));
+    }
+
+    @Override
+    protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient client) throws IOException {
+        if (!settingsAccessor.isWorkbenchEnabled()) {
+            return channel -> channel.sendResponse(new BytesRestResponse(RestStatus.FORBIDDEN, "Search Relevance Workbench is disabled"));
+        }
+
+        XContentParser parser = request.contentParser();
+        Map<String, Object> source = parser.map();
+
+        String remoteConfigId = asString(source.get("remoteConfigId"));
+        String queryText = asString(source.get("queryText"));
+        String query = asString(source.getOrDefault("query", "{}"));
+        int size = asInt(source.getOrDefault("size", 3));
+
+        if (remoteConfigId == null || remoteConfigId.isBlank()) {
+            throw new IllegalArgumentException("remoteConfigId is required");
+        }
+        final String queryTextFinal = (queryText == null) ? "" : queryText;
+
+        RemoteSearchExecutor executor = new RemoteSearchExecutor(
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao,
+            new RemoteResponseMapper()
+        );
+
+        final String experimentId = String.format(Locale.ROOT, "rest_execute_%d", System.currentTimeMillis());
+
+        return channel -> executor.executeRemoteSearch(
+            remoteConfigId,
+            query,
+            queryTextFinal,
+            size,
+            experimentId,
+            new ActionListener<RemoteSearchExecutor.RemoteSearchResponse>() {
+                @Override
+                public void onResponse(RemoteSearchExecutor.RemoteSearchResponse remoteResponse) {
+                    try {
+                        // Prefer mapped response if available
+                        String json = remoteResponse.getMappedResponse() != null && !remoteResponse.getMappedResponse().isBlank()
+                            ? remoteResponse.getMappedResponse()
+                            : remoteResponse.getRawResponse();
+
+                        // Use response validation utility to ensure proper format
+                        json = ResponseValidationUtils.ensureValidOpenSearchResponse(json);
+
+                        // Validate JSON structure before processing
+                        Map<String, Object> data;
+                        try (
+                            org.opensearch.core.xcontent.XContentParser xParser = org.opensearch.common.xcontent.XContentFactory
+                                .jsonBuilder()
+                                .contentType()
+                                .xContent()
+                                .createParser(null, null, json)
+                        ) {
+                            data = xParser.map();
+
+                            // Additional validation: ensure we have a valid response structure
+                            if (data == null || data.isEmpty()) {
+                                // Return empty response if parsing resulted in null/empty map
+                                XContentBuilder builder = channel.newBuilder();
+                                builder.startObject()
+                                    .startObject("hits")
+                                    .startObject("total")
+                                    .field("value", 0)
+                                    .field("relation", "eq")
+                                    .endObject()
+                                    .field("max_score", (String) null)
+                                    .startArray("hits")
+                                    .endArray()
+                                    .endObject()
+                                    .field("took", 0)
+                                    .field("timed_out", false)
+                                    .endObject();
+                                channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
+                                return;
+                            }
+
+                            // Ensure hits structure exists for jq compatibility
+                            if (!data.containsKey("hits")) {
+                                // If no hits structure, wrap the response appropriately
+                                Map<String, Object> wrappedData = new HashMap<>();
+                                Map<String, Object> hitsContainer = new HashMap<>();
+                                Map<String, Object> total = new HashMap<>();
+                                total.put("value", 0);
+                                total.put("relation", "eq");
+                                hitsContainer.put("total", total);
+                                hitsContainer.put("max_score", null);
+                                hitsContainer.put("hits", new ArrayList<>());
+                                wrappedData.put("hits", hitsContainer);
+                                wrappedData.put("took", data.getOrDefault("took", 0));
+                                wrappedData.put("timed_out", false);
+                                data = wrappedData;
+                            }
+                        } catch (Exception parseException) {
+                            // JSON parsing failed - return error response with proper structure
+                            XContentBuilder builder = channel.newBuilder();
+                            builder.startObject()
+                                .startObject("error")
+                                .field("type", "remote_response_parse_exception")
+                                .field("reason", "Failed to parse remote response: " + parseException.getMessage())
+                                .endObject()
+                                .endObject();
+                            channel.sendResponse(new BytesRestResponse(RestStatus.INTERNAL_SERVER_ERROR, builder));
+                            return;
+                        }
+
+                        XContentBuilder builder = channel.newBuilder();
+                        builder.map(data);
+                        channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
+                    } catch (Exception e) {
+                        onFailure(e);
+                    }
+                }
+
+                @Override
+                public void onFailure(Exception e) {
+                    try {
+                        channel.sendResponse(new BytesRestResponse(channel, ExceptionsHelper.status(e), e));
+                    } catch (IOException ex) {
+                        try {
+                            channel.sendResponse(new BytesRestResponse(channel, ex));
+                        } catch (IOException ioEx) {
+                            // ignored secondary failure
+                        }
+                    }
+                }
+            }
+        );
+    }
+
+    private static String asString(Object o) {
+        return o == null ? null : String.valueOf(o);
+    }
+
+    private static int asInt(Object o) {
+        if (o == null) return 0;
+        if (o instanceof Number) return ((Number) o).intValue();
+        try {
+            return Integer.parseInt(String.valueOf(o));
+        } catch (NumberFormatException e) {
+            return 0;
+        }
+    }
+}
diff --git a/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatName.java b/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatName.java
index b3430450..8907255b 100644
--- a/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatName.java
+++ b/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatName.java
@@ -58,7 +58,18 @@ public enum EventStatName implements StatName {
         "experiments",
         EventStatType.TIMESTAMPED_EVENT_COUNTER,
         Version.V_3_1_0
-    ),;
+    ),
+    REMOTE_SEARCH_REQUESTS("remote_search_requests", "remote_search", EventStatType.TIMESTAMPED_EVENT_COUNTER, Version.CURRENT),
+    REMOTE_SEARCH_SUCCESSES("remote_search_successes", "remote_search", EventStatType.TIMESTAMPED_EVENT_COUNTER, Version.CURRENT),
+    REMOTE_SEARCH_FAILURES("remote_search_failures", "remote_search", EventStatType.TIMESTAMPED_EVENT_COUNTER, Version.CURRENT),
+    REMOTE_SEARCH_RATE_LIMIT_HITS(
+        "remote_search_rate_limit_hits",
+        "remote_search",
+        EventStatType.TIMESTAMPED_EVENT_COUNTER,
+        Version.CURRENT
+    ),
+    REMOTE_SEARCH_CACHE_HITS("remote_search_cache_hits", "remote_search", EventStatType.TIMESTAMPED_EVENT_COUNTER, Version.CURRENT),
+    REMOTE_SEARCH_CACHE_MISSES("remote_search_cache_misses", "remote_search", EventStatType.TIMESTAMPED_EVENT_COUNTER, Version.CURRENT),;

     private final String nameString;
     private final String path;
diff --git a/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatsManager.java b/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatsManager.java
index 2b2dbe87..6498ea61 100644
--- a/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatsManager.java
+++ b/src/main/java/org/opensearch/searchrelevance/stats/events/EventStatsManager.java
@@ -61,6 +61,10 @@ public static void increment(EventStatName eventStatName) {
      * @param eventStatName The name of the event stat to increment
      */
     public void inc(EventStatName eventStatName) {
+        // Treat as NOOP if settings accessor is not initialized or stats are disabled
+        if (settingsAccessor == null) {
+            return;
+        }
         if (settingsAccessor.isStatsEnabled()) {
             eventStatName.getEventStat().increment();
         }
diff --git a/src/main/java/org/opensearch/searchrelevance/transport/experiment/PutExperimentTransportAction.java b/src/main/java/org/opensearch/searchrelevance/transport/experiment/PutExperimentTransportAction.java
index 3e8cad6a..c6b915a9 100644
--- a/src/main/java/org/opensearch/searchrelevance/transport/experiment/PutExperimentTransportAction.java
+++ b/src/main/java/org/opensearch/searchrelevance/transport/experiment/PutExperimentTransportAction.java
@@ -34,6 +34,7 @@
 import org.opensearch.searchrelevance.executors.ExperimentTaskManager;
 import org.opensearch.searchrelevance.experiment.HybridOptimizerExperimentProcessor;
 import org.opensearch.searchrelevance.experiment.PointwiseExperimentProcessor;
+import org.opensearch.searchrelevance.experiment.RemoteSearchExperimentProcessor;
 import org.opensearch.searchrelevance.metrics.MetricsHelper;
 import org.opensearch.searchrelevance.model.AsyncStatus;
 import org.opensearch.searchrelevance.model.Experiment;
@@ -59,6 +60,7 @@ public class PutExperimentTransportAction extends HandledTransportAction(), queryTextWithReferences);
+                } else {
+                    // Then get SearchConfigurations asynchronously
+                    fetchSearchConfigurationsAsync(experimentId, request, queryTextWithReferences);
+                }
+            } catch (Exception e) {
                 handleAsyncFailure(experimentId, request, "Failed to process QuerySet", e);
             }
@@ -346,6 +357,29 @@ private void executeExperimentEvaluation(
                 error -> handleFailure(error, hasFailure, experimentId, request)
             )
         );
+        } else if (request.getType() == ExperimentType.REMOTE_SEARCH_EVALUATION) {
+            // Treat searchConfigurationList as list of remote configuration IDs
+            remoteSearchExperimentProcessor.processRemoteExperiment(
+                experimentId,
+                queryText,
+                request.getSearchConfigurationList(),
+                judgmentList,
+                request.getSize(),
+                hasFailure,
+                ActionListener.wrap(
+                    queryResults -> handleQueryResults(
+                        queryText,
+                        queryResults,
+                        finalResults,
+                        pendingQueries,
+                        experimentId,
+                        request,
+                        hasFailure,
+                        judgmentList
+                    ),
+                    error -> handleFailure(error, hasFailure, experimentId, request)
+                )
+            );
         } else {
             throw new SearchRelevanceException("Unknown experimentType" + request.getType(), RestStatus.BAD_REQUEST);
         }
@@ -386,6 +420,12 @@ private void handleQueryResults(
             // Results already contain the proper format with evaluationId, searchConfigurationId, queryText
             finalResults.addAll(pointwiseResults);
         }
+        } else if (request.getType() == ExperimentType.REMOTE_SEARCH_EVALUATION) {
+            // For REMOTE_SEARCH_EVALUATION, reuse the same 'results' structure
+            List<Map<String, Object>> remoteResults = (List<Map<String, Object>>) queryResults.get("results");
+            if (remoteResults != null) {
+                finalResults.addAll(remoteResults);
+            }
         } else {
             // For other experiment types, use generic format
             queryResults.put(QUERY_TEXT, queryText);
diff --git a/src/main/java/org/opensearch/searchrelevance/utils/ResponseValidationUtils.java b/src/main/java/org/opensearch/searchrelevance/utils/ResponseValidationUtils.java
new file mode 100644
index 00000000..85fb9273
--- /dev/null
+++ b/src/main/java/org/opensearch/searchrelevance/utils/ResponseValidationUtils.java
@@ -0,0 +1,244 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.utils;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.opensearch.common.xcontent.XContentFactory;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.core.xcontent.XContentParser;
+
+import lombok.extern.log4j.Log4j2;
+
+/**
+ * Utility class for validating and ensuring proper response formats for remote search responses.
+ * This helps prevent jq errors and ensures consistent response structures.
+ */
+@Log4j2
+public class ResponseValidationUtils {
+
+    /**
+     * Validate that a JSON response has the proper OpenSearch structure for jq compatibility
+     *
+     * @param jsonResponse The JSON response to validate
+     * @return true if the response is valid, false otherwise
+     */
+    public static boolean isValidOpenSearchResponse(String jsonResponse) {
+        if (jsonResponse == null || jsonResponse.trim().isEmpty() || "null".equals(jsonResponse.trim())) {
+            return false;
+        }
+
+        try {
+            Map<String, Object> responseMap = parseJsonToMap(jsonResponse);
+
+            // Check for error structure (valid but indicates an error)
+            if (responseMap.containsKey("error")) {
+                return true; // Error responses are valid JSON structures
+            }
+
+            // Check for hits structure (required for search responses)
+            if (!responseMap.containsKey("hits")) {
+                return false;
+            }
+
+            Object hitsObj = responseMap.get("hits");
+            if (!(hitsObj instanceof Map)) {
+                return false;
+            }
+
+            @SuppressWarnings("unchecked")
+            Map<String, Object> hits = (Map<String, Object>) hitsObj;
+
+            // Validate hits structure has required fields
+            if (!hits.containsKey("total") || !hits.containsKey("hits")) {
+                return false;
+            }
+
+            // Validate hits.hits is an array
+            Object hitsArray = hits.get("hits");
+            if (!(hitsArray instanceof List)) {
+                return false;
+            }
+
+            return true;
+
+        } catch (Exception e) {
+            log.debug("Response validation failed: {}", e.getMessage());
+            return false;
+        }
+    }
+
+    /**
+     * Ensure a response has the proper OpenSearch structure, creating one if necessary
+     *
+     * @param jsonResponse The JSON response to validate and fix
+     * @return A valid OpenSearch-structured JSON response
+     */
+    public static String ensureValidOpenSearchResponse(String jsonResponse) {
+        if (isValidOpenSearchResponse(jsonResponse)) {
+            return jsonResponse;
+        }
+
+        log.debug("Response validation failed, creating empty OpenSearch response");
+        return createEmptyOpenSearchResponse();
+    }
+
+    /**
+     * Create a standard empty OpenSearch response structure
+     *
+     * @return JSON string representing an empty OpenSearch response
+     */
+    public static String createEmptyOpenSearchResponse() {
+        try {
+            Map<String, Object> total = new HashMap<>();
+            total.put("value", 0);
+            total.put("relation", "eq");
+
+            Map<String, Object> hitsContainer = new HashMap<>();
+            hitsContainer.put("total", total);
+            hitsContainer.put("max_score", null);
+            hitsContainer.put("hits", new ArrayList<>());
+
+            Map<String, Object> response = new HashMap<>();
+            response.put("hits", hitsContainer);
+            response.put("took", 0);
+            response.put("timed_out", false);
+
+            return mapToJson(response);
+        } catch (Exception e) {
+            log.error("Failed to create empty OpenSearch response: {}", e.getMessage());
+            // Fallback to hardcoded JSON
+            return "{\"hits\":{\"total\":{\"value\":0,\"relation\":\"eq\"},\"max_score\":null,\"hits\":[]},\"took\":0,\"timed_out\":false}";
+        }
+    }
+
+    /**
+     * Create an error response in OpenSearch format
+     *
+     * @param errorType The type of error
+     * @param errorReason The error reason/message
+     * @return JSON string representing an error response
+     */
+    public static String createErrorResponse(String errorType, String errorReason) {
+        try {
+            Map<String, Object> error = new HashMap<>();
+            error.put("type", errorType != null ? errorType : "remote_search_exception");
+            error.put("reason", errorReason != null ? errorReason : "Unknown error");
+
+            Map<String, Object> response = new HashMap<>();
+            response.put("error", error);
+
+            return mapToJson(response);
+        } catch (Exception e) {
+            log.error("Failed to create error response: {}", e.getMessage());
+            // Fallback to hardcoded JSON
+            String safeReason = errorReason != null ? errorReason.replace("\"", "\\\"") : "Unknown error";
+            return "{\"error\":{\"type\":\"remote_search_exception\",\"reason\":\"" + safeReason + "\"}}";
+        }
+    }
+
+    /**
+     * Validate that a response contains the expected number of hits
+     *
+     * @param jsonResponse The JSON response to check
+     * @param expectedMinHits Minimum expected number of hits
+     * @return true if the response has at least the expected number of hits
+     */
+    public static boolean hasMinimumHits(String jsonResponse, int expectedMinHits) {
+        try {
+            Map<String, Object> responseMap = parseJsonToMap(jsonResponse);
+
+            if (!responseMap.containsKey("hits")) {
+                return false;
+            }
+
+            @SuppressWarnings("unchecked")
+            Map<String, Object> hits = (Map<String, Object>) responseMap.get("hits");
+
+            if (!hits.containsKey("hits")) {
+                return false;
+            }
+
+            @SuppressWarnings("unchecked")
+            List<Object> hitsArray = (List<Object>) hits.get("hits");
+
+            return hitsArray.size() >= expectedMinHits;
+
+        } catch (Exception e) {
+            log.debug("Failed to check minimum hits: {}", e.getMessage());
+            return false;
+        }
+    }
+
+    /**
+     * Extract the total number of hits from a response
+     *
+     * @param jsonResponse The JSON response
+     * @return The total number of hits, or 0 if not found
+     */
+    public static int getTotalHits(String jsonResponse) {
+        try {
+            Map<String, Object> responseMap = parseJsonToMap(jsonResponse);
+
+            if (!responseMap.containsKey("hits")) {
+                return 0;
+            }
+
+            @SuppressWarnings("unchecked")
+            Map<String, Object> hits = (Map<String, Object>) responseMap.get("hits");
+
+            if (!hits.containsKey("total")) {
+                return 0;
+            }
+
+            Object totalObj = hits.get("total");
+            if (totalObj instanceof Map) {
+                @SuppressWarnings("unchecked")
+                Map<String, Object> total = (Map<String, Object>) totalObj;
+                Object valueObj = total.get("value");
+                if (valueObj instanceof Number) {
+ return ((Number) valueObj).intValue(); + } + } else if (totalObj instanceof Number) { + return ((Number) totalObj).intValue(); + } + + return 0; + + } catch (Exception e) { + log.debug("Failed to extract total hits: {}", e.getMessage()); + return 0; + } + } + + /** + * Parse JSON string to Map + */ + private static Map parseJsonToMap(String json) throws Exception { + if (json == null || json.trim().isEmpty()) { + return new HashMap<>(); + } + + try (XContentParser parser = XContentFactory.jsonBuilder().contentType().xContent().createParser(null, null, json)) { + return parser.map(); + } + } + + /** + * Convert Map to JSON string + */ + private static String mapToJson(Map map) throws Exception { + try (XContentBuilder builder = XContentFactory.jsonBuilder()) { + builder.map(map); + return builder.toString(); + } + } +} diff --git a/src/main/plugin-metadata/plugin-security.policy b/src/main/plugin-metadata/plugin-security.policy index d205ccda..19a15e40 100644 --- a/src/main/plugin-metadata/plugin-security.policy +++ b/src/main/plugin-metadata/plugin-security.policy @@ -4,4 +4,33 @@ grant { permission java.lang.RuntimePermission "accessDeclaredMembers"; permission java.lang.reflect.ReflectPermission "suppressAccessChecks"; permission java.lang.RuntimePermission "setContextClassLoader"; -}; \ No newline at end of file +}; + +// Grant network permissions to all protection domains for remote search functionality +grant { + permission java.net.SocketPermission "*:*", "connect,resolve"; + permission java.net.URLPermission "http://*", "*:*"; + permission java.net.URLPermission "https://*", "*:*"; + permission java.lang.RuntimePermission "accessClassInPackage.jdk.internal.net.http"; + permission java.lang.RuntimePermission "accessClassInPackage.jdk.internal.net.http.common"; + permission java.lang.RuntimePermission "modifyThread"; + permission java.lang.RuntimePermission "modifyThreadGroup"; +}; + +// Grant specific permissions to java.net.http module +grant codeBase 
"jrt:/java.net.http" { + permission java.net.SocketPermission "*:*", "connect,resolve"; + permission java.net.URLPermission "http://*", "*:*"; + permission java.net.URLPermission "https://*", "*:*"; + permission java.lang.RuntimePermission "accessClassInPackage.jdk.internal.net.http"; + permission java.lang.RuntimePermission "accessClassInPackage.jdk.internal.net.http.common"; + permission java.lang.RuntimePermission "modifyThread"; + permission java.lang.RuntimePermission "modifyThreadGroup"; +}; + +// Grant permissions to java.base module for networking +grant codeBase "jrt:/java.base" { + permission java.net.SocketPermission "*:*", "connect,resolve"; + permission java.lang.RuntimePermission "modifyThread"; + permission java.lang.RuntimePermission "modifyThreadGroup"; +}; diff --git a/src/main/resources/mappings/remote_search_cache.json b/src/main/resources/mappings/remote_search_cache.json new file mode 100644 index 00000000..7cb76f90 --- /dev/null +++ b/src/main/resources/mappings/remote_search_cache.json @@ -0,0 +1,33 @@ +{ + "properties": { + "cacheKey": { + "type": "keyword" + }, + "remoteConfigId": { + "type": "keyword" + }, + "query": { + "type": "text", + "index": false + }, + "queryText": { + "type": "keyword" + }, + "cachedResponse": { + "type": "text", + "index": false + }, + "mappedResponse": { + "type": "text", + "index": false + }, + "cacheTimestamp": { + "type": "date", + "format": "epoch_millis" + }, + "expirationTimestamp": { + "type": "date", + "format": "epoch_millis" + } + } +} diff --git a/src/main/resources/mappings/remote_search_configuration.json b/src/main/resources/mappings/remote_search_configuration.json new file mode 100644 index 00000000..1d630ca1 --- /dev/null +++ b/src/main/resources/mappings/remote_search_configuration.json @@ -0,0 +1,57 @@ +{ + "properties": { + "id": { + "type": "keyword" + }, + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "description": { + 
"type": "text" + }, + "connectionUrl": { + "type": "keyword" + }, + "username": { + "type": "keyword" + }, + "password": { + "type": "keyword", + "index": false + }, + "queryTemplate": { + "type": "text", + "index": false + }, + "responseTemplate": { + "type": "text", + "index": false + }, + "maxRequestsPerSecond": { + "type": "integer" + }, + "maxConcurrentRequests": { + "type": "integer" + }, + "cacheDurationMinutes": { + "type": "long" + }, + "refreshCache": { + "type": "boolean" + }, + "metadata": { + "type": "object", + "enabled": false + }, + "timestamp": { + "type": "date", + "format": "strict_date_optional_time||epoch_millis" + } + } +} diff --git a/src/main/resources/mappings/remote_search_failure.json b/src/main/resources/mappings/remote_search_failure.json new file mode 100644 index 00000000..0a1d4a25 --- /dev/null +++ b/src/main/resources/mappings/remote_search_failure.json @@ -0,0 +1,41 @@ +{ + "mappings": { + "properties": { + "id": { + "type": "keyword" + }, + "remoteConfigId": { + "type": "keyword" + }, + "experimentId": { + "type": "keyword" + }, + "query": { + "type": "text", + "index": false + }, + "queryText": { + "type": "keyword" + }, + "errorType": { + "type": "keyword" + }, + "errorMessage": { + "type": "text" + }, + "timestamp": { + "type": "date", + "format": "strict_date_optional_time||epoch_millis" + }, + "status": { + "type": "keyword" + } + } + }, + "settings": { + "index": { + "number_of_shards": 1, + "number_of_replicas": 0 + } + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDaoTests.java b/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDaoTests.java new file mode 100644 index 00000000..077598e1 --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchCacheDaoTests.java @@ -0,0 +1,468 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a 
+ * compatible open source license.
+ */
+package org.opensearch.searchrelevance.dao;
+
+import static org.junit.Assert.*;
+import static org.mockito.ArgumentMatchers.*;
+import static org.mockito.Mockito.*;
+
+import java.time.Instant;
+import java.time.temporal.ChronoUnit;
+import java.util.Map;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+
+import org.apache.lucene.search.TotalHits;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+import org.opensearch.action.delete.DeleteResponse;
+import org.opensearch.action.index.IndexResponse;
+import org.opensearch.action.search.SearchResponse;
+import org.opensearch.core.action.ActionListener;
+import org.opensearch.core.common.bytes.BytesArray;
+import org.opensearch.core.xcontent.XContentBuilder;
+import org.opensearch.search.SearchHit;
+import org.opensearch.search.SearchHits;
+import org.opensearch.search.builder.SearchSourceBuilder;
+import org.opensearch.searchrelevance.indices.SearchRelevanceIndices;
+import org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager;
+import org.opensearch.searchrelevance.model.RemoteSearchCache;
+
+public class RemoteSearchCacheDaoTests extends org.apache.lucene.tests.util.LuceneTestCase {
+
+    @Mock
+    private SearchRelevanceIndicesManager indicesManager;
+
+    private RemoteSearchCacheDao cacheDao;
+
+    @Override
+    public void setUp() throws Exception {
+        super.setUp();
+        MockitoAnnotations.openMocks(this);
+        cacheDao = new RemoteSearchCacheDao(indicesManager);
+    }
+
+    public void testStoreCache() throws InterruptedException {
+        // Create test cache entry
+        RemoteSearchCache cache = new RemoteSearchCache(
+            "test-cache-id",
+            "config-1",
+            "test-query-hash",
+            "test query",
+            "{\"response\": \"data\"}",
+            "{\"mapped\": \"response\"}",
+            Instant.now().toEpochMilli(),
+            Instant.now().toEpochMilli() + (60L * 60 * 1000) // 60 minutes from now
+        );
+
+        // Mock
successful index response + IndexResponse mockResponse = mock(IndexResponse.class); + when(mockResponse.getId()).thenReturn("test-cache-id"); + + // Stub indices manager update call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(3); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager) + .updateDocEfficient( + eq("test-cache-id"), + any(XContentBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + + // Test store operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.storeCache(cache, new ActionListener() { + @Override + public void onResponse(IndexResponse response) { + result.set(response); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + + // Verify indices manager was called with correct arguments + verify(indicesManager, times(1)).updateDocEfficient( + eq("test-cache-id"), + any(XContentBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + } + + public void testGetCacheHit() throws InterruptedException { + String cacheKey = "test-cache-key"; + + // Create test cache data + Map sourceMap = Map.of( + RemoteSearchCache.ID_FIELD, + cacheKey, + RemoteSearchCache.CONFIGURATION_ID_FIELD, + "config-1", + RemoteSearchCache.QUERY_HASH_FIELD, + "query-hash", + RemoteSearchCache.QUERY_TEXT_FIELD, + "test query", + RemoteSearchCache.RAW_RESPONSE_FIELD, + "{\"response\": \"data\"}", + RemoteSearchCache.MAPPED_RESPONSE_FIELD, + "{\"mapped\": \"response\"}", + RemoteSearchCache.TIMESTAMP_FIELD, + Instant.now().toEpochMilli(), + RemoteSearchCache.EXPIRATION_TIMESTAMP, + Instant.now().toEpochMilli() 
+ (60L * 60 * 1000) + ); + + // Prepare SearchResponse with one real hit + String json = "{\"cacheKey\":\"" + + cacheKey + + "\",\"remoteConfigId\":\"config-1\",\"query\":\"query-hash\",\"queryText\":\"test query\",\"cachedResponse\":\"{\\\"response\\\": \\\"data\\\"}\",\"mappedResponse\":\"{\\\"mapped\\\": \\\"response\\\"}\",\"cacheTimestamp\":" + + sourceMap.get(RemoteSearchCache.TIMESTAMP_FIELD) + + ",\"expirationTimestamp\":" + + sourceMap.get(RemoteSearchCache.EXPIRATION_TIMESTAMP) + + "}"; + SearchHit hit = new SearchHit(1, cacheKey, Map.of(), Map.of()); + hit.sourceRef(new BytesArray(json)); + SearchHits searchHits = new SearchHits(new SearchHit[] { hit }, new TotalHits(1, TotalHits.Relation.EQUAL_TO), 1.0f); + SearchResponse mockResponse = mock(SearchResponse.class); + when(mockResponse.getHits()).thenReturn(searchHits); + + // Stub indices manager get by id + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager).getDocByDocId(eq(cacheKey), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Test get operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.getCache(cacheKey, new ActionListener() { + @Override + public void onResponse(RemoteSearchCache cache) { + result.set(cache); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + assertEquals(cacheKey, result.get().getId()); + assertEquals("config-1", result.get().getConfigurationId()); + } + + public void testGetCacheMiss() throws InterruptedException { + String cacheKey = "non-existent-key"; + + // Prepare mocked SearchResponse with zero hits 
+ SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(0, TotalHits.Relation.EQUAL_TO), 1.0f); + SearchResponse mockResponse = mock(SearchResponse.class); + when(mockResponse.getHits()).thenReturn(searchHits); + + // Stub indices manager get by id + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager).getDocByDocId(eq(cacheKey), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Test get operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.getCache(cacheKey, new ActionListener() { + @Override + public void onResponse(RemoteSearchCache cache) { + result.set(cache); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNull(result.get()); // Should be null for cache miss + } + + public void testGetExpiredCache() throws InterruptedException { + String cacheKey = "expired-cache-key"; + + // Create expired cache data (timestamp from 2 hours ago, expiration 1 hour ago) + Instant expiredTime = Instant.now().minus(2, ChronoUnit.HOURS); + Instant expirationTime = Instant.now().minus(1, ChronoUnit.HOURS); + Map sourceMap = Map.of( + RemoteSearchCache.ID_FIELD, + cacheKey, + RemoteSearchCache.CONFIGURATION_ID_FIELD, + "config-1", + RemoteSearchCache.QUERY_HASH_FIELD, + "query-hash", + RemoteSearchCache.QUERY_TEXT_FIELD, + "test query", + RemoteSearchCache.RAW_RESPONSE_FIELD, + "{\"response\": \"data\"}", + RemoteSearchCache.MAPPED_RESPONSE_FIELD, + "{\"mapped\": \"response\"}", + RemoteSearchCache.TIMESTAMP_FIELD, + expiredTime.toEpochMilli(), + RemoteSearchCache.EXPIRATION_TIMESTAMP, + expirationTime.toEpochMilli() + 
); + + // Search response with one expired real hit + String json = "{\"cacheKey\":\"" + + cacheKey + + "\",\"remoteConfigId\":\"config-1\",\"query\":\"query-hash\",\"queryText\":\"test query\",\"cachedResponse\":\"{\\\"response\\\": \\\"data\\\"}\",\"mappedResponse\":\"{\\\"mapped\\\": \\\"response\\\"}\",\"cacheTimestamp\":" + + expiredTime.toEpochMilli() + + ",\"expirationTimestamp\":" + + expirationTime.toEpochMilli() + + "}"; + SearchHit hit = new SearchHit(1, cacheKey, Map.of(), Map.of()); + hit.sourceRef(new BytesArray(json)); + SearchHits searchHits = new SearchHits(new SearchHit[] { hit }, new TotalHits(1, TotalHits.Relation.EQUAL_TO), 1.0f); + SearchResponse mockGetResponse = mock(SearchResponse.class); + when(mockGetResponse.getHits()).thenReturn(searchHits); + + // Stub indices manager get by id + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockGetResponse); + return null; + }).when(indicesManager).getDocByDocId(eq(cacheKey), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Stub delete call + DeleteResponse mockDeleteResponse = mock(DeleteResponse.class); + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockDeleteResponse); + return null; + }).when(indicesManager).deleteDocByDocId(eq(cacheKey), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Test get operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.getCache(cacheKey, new ActionListener() { + @Override + public void onResponse(RemoteSearchCache cache) { + result.set(cache); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, 
TimeUnit.SECONDS)); + assertNull(error.get()); + assertNull(result.get()); // Should be null for expired cache + + // Verify delete was called for cleanup + verify(indicesManager, times(1)).deleteDocByDocId( + eq(cacheKey), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + } + + public void testDeleteCache() throws InterruptedException { + String cacheKey = "cache-to-delete"; + + // Mock successful delete response + DeleteResponse mockResponse = mock(DeleteResponse.class); + + // Stub indices manager delete call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager).deleteDocByDocId(eq(cacheKey), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Test delete operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.deleteCache(cacheKey, new ActionListener() { + @Override + public void onResponse(DeleteResponse response) { + result.set(response); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + + // Verify delete call + verify(indicesManager, times(1)).deleteDocByDocId( + eq(cacheKey), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + } + + public void testClearCacheForConfiguration() throws InterruptedException { + String configurationId = "config-to-clear"; + + // Create search response with cache entries + SearchHit hit1 = new SearchHit(1, "cache-1", Map.of(), Map.of()); + SearchHit hit2 = new SearchHit(2, "cache-2", Map.of(), Map.of()); + SearchHits searchHits = new SearchHits(new SearchHit[] { hit1, hit2 }, new TotalHits(2, 
TotalHits.Relation.EQUAL_TO), 1.0f); + + SearchResponse mockSearchResponse = mock(SearchResponse.class); + when(mockSearchResponse.getHits()).thenReturn(searchHits); + + // Stub list docs call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockSearchResponse); + return null; + }).when(indicesManager) + .listDocsBySearchRequest( + any(SearchSourceBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + + // Stub delete responses + DeleteResponse mockDeleteResponse = mock(DeleteResponse.class); + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockDeleteResponse); + return null; + }).when(indicesManager).deleteDocByDocId(anyString(), eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), any(ActionListener.class)); + + // Test clear operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference error = new AtomicReference<>(); + + cacheDao.clearCacheForConfiguration(configurationId, new ActionListener() { + @Override + public void onResponse(Void response) { + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + + // Verify search and delete calls + verify(indicesManager, times(1)).listDocsBySearchRequest( + any(SearchSourceBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + verify(indicesManager, times(2)).deleteDocByDocId( + anyString(), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + } + + public void testGetCacheStats() throws InterruptedException { + // Create search response with aggregations + SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(100L, TotalHits.Relation.EQUAL_TO), 1.0f); + 
SearchResponse mockResponse = mock(SearchResponse.class); + when(mockResponse.getHits()).thenReturn(searchHits); + when(mockResponse.getAggregations()).thenReturn(null); + + // Stub list docs call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager) + .listDocsBySearchRequest( + any(SearchSourceBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_CACHE), + any(ActionListener.class) + ); + + // Test stats operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference> result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + cacheDao.getCacheStats(new ActionListener>() { + @Override + public void onResponse(Map stats) { + result.set(stats); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + assertTrue(result.get().containsKey("total_entries")); + assertTrue(result.get().containsKey("aggregations")); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDaoTests.java b/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDaoTests.java new file mode 100644 index 00000000..f89ab83b --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/dao/RemoteSearchFailureDaoTests.java @@ -0,0 +1,486 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */ +package org.opensearch.searchrelevance.dao; + +import static org.junit.Assert.*; +import static org.mockito.ArgumentMatchers.*; +import static org.mockito.Mockito.*; + +import java.time.Instant; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CountDownLatch; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicReference; + +import org.apache.lucene.search.TotalHits; +import org.mockito.Mock; +import org.mockito.MockitoAnnotations; +import org.opensearch.action.index.IndexResponse; +import org.opensearch.action.search.SearchResponse; +import org.opensearch.core.action.ActionListener; +import org.opensearch.core.common.bytes.BytesArray; +import org.opensearch.core.xcontent.XContentBuilder; +import org.opensearch.search.SearchHit; +import org.opensearch.search.SearchHits; +import org.opensearch.search.builder.SearchSourceBuilder; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndices; +import org.opensearch.searchrelevance.indices.SearchRelevanceIndicesManager; +import org.opensearch.searchrelevance.model.RemoteSearchFailure; + +public class RemoteSearchFailureDaoTests extends org.apache.lucene.tests.util.LuceneTestCase { + + @Mock + private SearchRelevanceIndicesManager indicesManager; + + private RemoteSearchFailureDao failureDao; + + // @Before + public void setUp() throws Exception { + super.setUp(); + MockitoAnnotations.openMocks(this); + failureDao = new RemoteSearchFailureDao(indicesManager); + } + + public void testRecordFailure() throws InterruptedException { + // Create test failure + RemoteSearchFailure failure = new RemoteSearchFailure( + "failure-1", + "config-1", + "experiment-1", + "test query", + "test query text", + "NETWORK_ERROR", + "Connection timeout", + Instant.now().toString(), + "FAILED" + ); + + // Mock successful index response + IndexResponse mockResponse = mock(IndexResponse.class); + when(mockResponse.getId()).thenReturn("failure-1"); + + // Stub indices manager 
upsert call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(3); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager) + .updateDoc( + eq("failure-1"), + any(XContentBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE), + any(ActionListener.class) + ); + + // Test record operation + CountDownLatch latch = new CountDownLatch(1); + AtomicReference result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + failureDao.recordFailure(failure, new ActionListener() { + @Override + public void onResponse(IndexResponse response) { + result.set(response); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + + // Verify indices manager call + verify(indicesManager, times(1)).updateDoc( + eq("failure-1"), + any(XContentBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE), + any(ActionListener.class) + ); + } + + public void testGetRecentFailures() throws InterruptedException { + String configurationId = "config-1"; + + // Prepare two real hits with source + SearchHit hit1 = new SearchHit(1, "failure-1", Map.of(), Map.of()); + hit1.sourceRef( + new BytesArray( + "{\"id\":\"failure-1\",\"remoteConfigId\":\"config-1\",\"errorType\":\"CONNECTION_TIMEOUT\",\"errorMessage\":\"Timeout\",\"timestamp\":\"2023-01-01T00:00:00Z\",\"status\":\"FAILED\"}" + ) + ); + SearchHit hit2 = new SearchHit(2, "failure-2", Map.of(), Map.of()); + hit2.sourceRef( + new BytesArray( + "{\"id\":\"failure-2\",\"remoteConfigId\":\"config-1\",\"errorType\":\"AUTH_FAILURE\",\"errorMessage\":\"Unauthorized\",\"timestamp\":\"2023-01-01T01:00:00Z\",\"status\":\"FAILED\"}" + ) + ); + + SearchHits searchHits = new SearchHits(new SearchHit[] { hit1, hit2 }, new TotalHits(2, 
TotalHits.Relation.EQUAL_TO), 1.0f); + + SearchResponse mockResponse = mock(SearchResponse.class); + when(mockResponse.getHits()).thenReturn(searchHits); + + // Stub list docs call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + }).when(indicesManager) + .listDocsBySearchRequest( + any(SearchSourceBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE), + any(ActionListener.class) + ); + + // Test get recent failures + CountDownLatch latch = new CountDownLatch(1); + AtomicReference> result = new AtomicReference<>(); + AtomicReference error = new AtomicReference<>(); + + failureDao.getRecentFailures(configurationId, 10, new ActionListener>() { + @Override + public void onResponse(List failures) { + result.set(failures); + latch.countDown(); + } + + @Override + public void onFailure(Exception e) { + error.set(e); + latch.countDown(); + } + }); + + assertTrue(latch.await(5, TimeUnit.SECONDS)); + assertNull(error.get()); + assertNotNull(result.get()); + assertEquals(2, result.get().size()); + + // Verify call + verify(indicesManager, times(1)).listDocsBySearchRequest( + any(SearchSourceBuilder.class), + eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE), + any(ActionListener.class) + ); + } + + public void testGetFailureStats() throws InterruptedException { + String configurationId = "config-1"; + int hours = 24; + + // Create search response with total hits = 10 + SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(10L, TotalHits.Relation.EQUAL_TO), 1.0f); + SearchResponse mockResponse = mock(SearchResponse.class); + when(mockResponse.getHits()).thenReturn(searchHits); + when(mockResponse.getAggregations()).thenReturn(null); + + // Stub list docs call + doAnswer(invocation -> { + @SuppressWarnings("unchecked") + ActionListener listener = invocation.getArgument(2); + listener.onResponse(mockResponse); + return null; + 
}).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test get failure stats
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Map<String, Object>> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.getFailureStats(configurationId, hours, new ActionListener<Map<String, Object>>() {
+            @Override
+            public void onResponse(Map<String, Object> stats) {
+                result.set(stats);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertTrue(result.get().containsKey("total_failures"));
+        assertTrue(result.get().containsKey("time_range_hours"));
+        assertTrue(result.get().containsKey("configuration_id"));
+        assertTrue(result.get().containsKey("aggregations"));
+        assertEquals(10L, result.get().get("total_failures"));
+        assertEquals(hours, result.get().get("time_range_hours"));
+        assertEquals(configurationId, result.get().get("configuration_id"));
+    }
+
+    public void testHasExcessiveFailures() throws InterruptedException {
+        String configurationId = "config-1";
+        int maxFailures = 5;
+        int timeWindowMinutes = 30;
+
+        // Create search response indicating excessive failures
+        SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(7L, TotalHits.Relation.EQUAL_TO), 1.0f);
+        SearchResponse mockResponse = mock(SearchResponse.class);
+        when(mockResponse.getHits()).thenReturn(searchHits);
+
+        // Stub list docs call
+        doAnswer(invocation -> {
+            @SuppressWarnings("unchecked")
+            ActionListener<SearchResponse> listener = invocation.getArgument(2);
+            listener.onResponse(mockResponse);
+            return null;
+        }).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test excessive failures check
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Boolean> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.hasExcessiveFailures(configurationId, maxFailures, timeWindowMinutes, new ActionListener<Boolean>() {
+            @Override
+            public void onResponse(Boolean hasExcessive) {
+                result.set(hasExcessive);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertTrue(result.get()); // Should be true since 7 > 5
+    }
+
+    public void testHasExcessiveFailuresWithinLimit() throws InterruptedException {
+        String configurationId = "config-1";
+        int maxFailures = 5;
+        int timeWindowMinutes = 30;
+
+        // Create search response indicating failures within limit
+        SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(3L, TotalHits.Relation.EQUAL_TO), 1.0f);
+        SearchResponse mockResponse = mock(SearchResponse.class);
+        when(mockResponse.getHits()).thenReturn(searchHits);
+
+        // Stub list docs call
+        doAnswer(invocation -> {
+            @SuppressWarnings("unchecked")
+            ActionListener<SearchResponse> listener = invocation.getArgument(2);
+            listener.onResponse(mockResponse);
+            return null;
+        }).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test excessive failures check
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Boolean> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.hasExcessiveFailures(configurationId, maxFailures, timeWindowMinutes, new ActionListener<Boolean>() {
+            @Override
+            public void onResponse(Boolean hasExcessive) {
+                result.set(hasExcessive);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertFalse(result.get()); // Should be false since 3 < 5
+    }
+
+    public void testCleanupOldFailures() throws InterruptedException {
+        int retentionDays = 30;
+
+        // Create search response with old failures
+        SearchHit hit1 = new SearchHit(1, "old-failure-1", Map.of(), Map.of());
+        SearchHit hit2 = new SearchHit(2, "old-failure-2", Map.of(), Map.of());
+        SearchHits searchHits = new SearchHits(new SearchHit[] { hit1, hit2 }, new TotalHits(2, TotalHits.Relation.EQUAL_TO), 1.0f);
+
+        SearchResponse mockResponse = mock(SearchResponse.class);
+        when(mockResponse.getHits()).thenReturn(searchHits);
+
+        // Stub list docs call
+        doAnswer(invocation -> {
+            @SuppressWarnings("unchecked")
+            ActionListener<SearchResponse> listener = invocation.getArgument(2);
+            listener.onResponse(mockResponse);
+            return null;
+        }).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test cleanup operation
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Integer> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.cleanupOldFailures(retentionDays, new ActionListener<Integer>() {
+            @Override
+            public void onResponse(Integer deletedCount) {
+                result.set(deletedCount);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertEquals(Integer.valueOf(2), result.get()); // Should find 2 old failures
+    }
+
+    public void testGetErrorPatterns() throws InterruptedException {
+        String configurationId = "config-1";
+        int days = 7;
+
+        // Create search response with aggregations
+        SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(15L, TotalHits.Relation.EQUAL_TO), 1.0f);
+        SearchResponse mockResponse = mock(SearchResponse.class);
+        when(mockResponse.getHits()).thenReturn(searchHits);
+        when(mockResponse.getAggregations()).thenReturn(null);
+
+        // Stub list docs call
+        doAnswer(invocation -> {
+            @SuppressWarnings("unchecked")
+            ActionListener<SearchResponse> listener = invocation.getArgument(2);
+            listener.onResponse(mockResponse);
+            return null;
+        }).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test get error patterns
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Map<String, Object>> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.getErrorPatterns(configurationId, days, new ActionListener<Map<String, Object>>() {
+            @Override
+            public void onResponse(Map<String, Object> patterns) {
+                result.set(patterns);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertTrue(result.get().containsKey("total_failures"));
+        assertTrue(result.get().containsKey("analysis_period_days"));
+        assertTrue(result.get().containsKey("configuration_id"));
+        assertTrue(result.get().containsKey("error_analysis"));
+        assertEquals(15L, result.get().get("total_failures"));
+        assertEquals(days, result.get().get("analysis_period_days"));
+        assertEquals(configurationId, result.get().get("configuration_id"));
+    }
+
+    public void testGetErrorPatternsAllConfigurations() throws InterruptedException {
+        int days = 7;
+
+        // Create search response with aggregations
+        SearchHits searchHits = new SearchHits(new SearchHit[0], new TotalHits(25L, TotalHits.Relation.EQUAL_TO), 1.0f);
+        SearchResponse mockResponse = mock(SearchResponse.class);
+        when(mockResponse.getHits()).thenReturn(searchHits);
+        when(mockResponse.getAggregations()).thenReturn(null);
+
+        // Stub list docs call
+        doAnswer(invocation -> {
+            @SuppressWarnings("unchecked")
+            ActionListener<SearchResponse> listener = invocation.getArgument(2);
+            listener.onResponse(mockResponse);
+            return null;
+        }).when(indicesManager)
+            .listDocsBySearchRequest(
+                any(SearchSourceBuilder.class),
+                eq(SearchRelevanceIndices.REMOTE_SEARCH_FAILURE),
+                any(ActionListener.class)
+            );
+
+        // Test get error patterns for all configurations (null configurationId)
+        CountDownLatch latch = new CountDownLatch(1);
+        AtomicReference<Map<String, Object>> result = new AtomicReference<>();
+        AtomicReference<Exception> error = new AtomicReference<>();
+
+        failureDao.getErrorPatterns(null, days, new ActionListener<Map<String, Object>>() {
+            @Override
+            public void onResponse(Map<String, Object> patterns) {
+                result.set(patterns);
+                latch.countDown();
+            }
+
+            @Override
+            public void onFailure(Exception e) {
+                error.set(e);
+                latch.countDown();
+            }
+        });
+
+        assertTrue(latch.await(5, TimeUnit.SECONDS));
+        assertNull(error.get());
+        assertNotNull(result.get());
+        assertTrue(result.get().containsKey("total_failures"));
+        assertTrue(result.get().containsKey("analysis_period_days"));
+        assertTrue(result.get().containsKey("configuration_id"));
+        assertTrue(result.get().containsKey("error_analysis"));
+        assertEquals(25L, result.get().get("total_failures"));
+        assertEquals(days, result.get().get("analysis_period_days"));
+        assertNull(result.get().get("configuration_id")); // Should be null for all configurations
+    }
+}
diff --git a/src/test/java/org/opensearch/searchrelevance/executors/ExperimentTaskManagerTests.java b/src/test/java/org/opensearch/searchrelevance/executors/ExperimentTaskManagerTests.java
index 4b653bf2..175fa23c 100644
--- a/src/test/java/org/opensearch/searchrelevance/executors/ExperimentTaskManagerTests.java
+++ b/src/test/java/org/opensearch/searchrelevance/executors/ExperimentTaskManagerTests.java
@@ -28,6 +28,9 @@ import org.opensearch.core.action.ActionListener;
 import org.opensearch.searchrelevance.dao.EvaluationResultDao;
 import org.opensearch.searchrelevance.dao.ExperimentVariantDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao;
+import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao;
 import org.opensearch.searchrelevance.model.AsyncStatus;
 import org.opensearch.searchrelevance.model.ExperimentType;
 import org.opensearch.searchrelevance.model.ExperimentVariant;
@@ -44,6 +47,9 @@ public class ExperimentTaskManagerTests extends OpenSearchTestCase {
     private ClusterService clusterService;
     private EvaluationResultDao evaluationResultDao;
     private ExperimentVariantDao experimentVariantDao;
+    private RemoteSearchConfigurationDao remoteSearchConfigurationDao;
+    private RemoteSearchCacheDao remoteSearchCacheDao;
+    private RemoteSearchFailureDao remoteSearchFailureDao;
     private ThreadPool threadPool;
     private ExecutorService immediateExecutor;
@@ -54,6 +60,9 @@ public void setUp() throws Exception {
         clusterService = mock(ClusterService.class);
         evaluationResultDao = mock(EvaluationResultDao.class);
         experimentVariantDao = mock(ExperimentVariantDao.class);
+        remoteSearchConfigurationDao = mock(RemoteSearchConfigurationDao.class);
+        remoteSearchCacheDao = mock(RemoteSearchCacheDao.class);
+        remoteSearchFailureDao = mock(RemoteSearchFailureDao.class);
         threadPool = mock(ThreadPool.class);
 
         // Create an immediate executor
@@ -163,7 +172,15 @@ private List<ExperimentVariant> createTestVariants(String experimentId, int coun
 
     public void testDynamicConcurrencyControlInitialization() {
         // Test that ExperimentTaskManager initializes with dynamic concurrency limits
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
 
         Map<String, Object> metrics = taskManager.getConcurrencyMetrics();
@@ -189,7 +206,15 @@ public void testDynamicConcurrencyControlInitialization() {
 
     public void testConcurrencyLimitCalculationLogic() {
         // Test the actual concurrency calculation logic with current system
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
 
         Map<String, Object> metrics = taskManager.getConcurrencyMetrics();
         int maxConcurrentTasks = (Integer) metrics.get("max_concurrent_tasks");
@@ -203,7 +228,15 @@ public void testConcurrencyLimitCalculationLogic() {
 
     public void testConcurrencyMetricsConsistency() {
         // Test that metrics are consistent and make sense
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
 
         Map<String, Object> metrics = taskManager.getConcurrencyMetrics();
@@ -224,7 +257,15 @@ public void testConcurrencyMetricsConsistency() {
 
     public void testConcurrencyLimitBoundaries() {
         // Test that the concurrency calculation respects minimum and maximum bounds
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
 
         Map<String, Object> metrics = taskManager.getConcurrencyMetrics();
         int maxConcurrentTasks = (Integer) metrics.get("max_concurrent_tasks");
@@ -245,7 +286,15 @@ public void testConcurrencyLimitBoundaries() {
 
     public void testDynamicConcurrencyScaling() {
         // Test that dynamic concurrency scales appropriately with processor count
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
 
         Map<String, Object> metrics = taskManager.getConcurrencyMetrics();
         int maxConcurrentTasks = (Integer) metrics.get("max_concurrent_tasks");
@@ -267,7 +316,15 @@ public void testConfigMapInitialization() throws Exception {
         // Arrange
-        ExperimentTaskManager taskManager = new ExperimentTaskManager(client, evaluationResultDao, experimentVariantDao, threadPool);
+        ExperimentTaskManager taskManager = new ExperimentTaskManager(
+            client,
+            evaluationResultDao,
+            experimentVariantDao,
+            threadPool,
+            remoteSearchConfigurationDao,
+            remoteSearchCacheDao,
+            remoteSearchFailureDao
+        );
         String experimentId = "test-experiment";
         String searchConfigId = "test-config";
         Map<String, Object> initialConfigMap = new HashMap<>();
diff --git a/src/test/java/org/opensearch/searchrelevance/executors/RemoteResponseMapperTests.java b/src/test/java/org/opensearch/searchrelevance/executors/RemoteResponseMapperTests.java
new file mode 100644
index 00000000..28dd7bad
--- /dev/null
+++ b/src/test/java/org/opensearch/searchrelevance/executors/RemoteResponseMapperTests.java
@@ -0,0 +1,476 @@
+/*
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The OpenSearch Contributors require contributions made to
+ * this file be licensed under the Apache-2.0 license or a
+ * compatible open source license.
+ */ +package org.opensearch.searchrelevance.executors; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; + +import java.util.Map; + +import org.opensearch.common.xcontent.XContentFactory; +import org.opensearch.core.xcontent.XContentParser; + +/** + * Tests for RemoteResponseMapper + */ +public class RemoteResponseMapperTests extends org.apache.lucene.tests.util.LuceneTestCase { + + private RemoteResponseMapper mapper; + + // @Before + public void setUp() throws Exception { + super.setUp(); + mapper = new RemoteResponseMapper(); + } + + public void testMapResponseWithEmptyInput() throws Exception { + String result = mapper.mapResponse(null, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should be 0", 0, total.get("value")); + } + + public void testMapResponseAlreadyOpenSearchFormat() throws Exception { + String opensearchResponse = """ + { + "hits": { + "total": {"value": 2, "relation": "eq"}, + "max_score": 1.5, + "hits": [ + {"_id": "1", "_score": 1.5, "_source": {"title": "Test Doc 1"}}, + {"_id": "2", "_score": 1.0, "_source": {"title": "Test Doc 2"}} + ] + }, + "took": 5, + "timed_out": false + } + """; + + String result = mapper.mapResponse(opensearchResponse, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should be 2", 2, total.get("value")); + } + + public void testMapResponseCommonFormat() throws Exception { + 
String commonResponse = """ + { + "results": [ + {"id": "doc1", "title": "First Document", "score": 0.95}, + {"id": "doc2", "title": "Second Document", "score": 0.87} + ], + "took": 10 + } + """; + + String result = mapper.mapResponse(commonResponse, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should be 2", 2, total.get("value")); + + @SuppressWarnings("unchecked") + java.util.List hitsList = (java.util.List) hits.get("hits"); + assertEquals("Should have 2 hits", 2, hitsList.size()); + + @SuppressWarnings("unchecked") + Map firstHit = (Map) hitsList.get(0); + assertEquals("First hit ID should be doc1", "doc1", firstHit.get("_id")); + assertEquals("First hit score should be 0.95", 0.95, firstHit.get("_score")); + } + + public void testMapResponseWithTemplate() throws Exception { + String remoteResponse = """ + { + "data": { + "search_results": [ + {"document_id": "123", "relevance_score": 0.92, "content": {"title": "Test Title"}}, + {"document_id": "456", "relevance_score": 0.78, "content": {"title": "Another Title"}} + ], + "total_count": 2, + "execution_time": 15 + } + } + """; + + String template = """ + { + "hits": { + "total": {"path": "data.total_count", "type": "integer"}, + "hits": { + "path": "data.search_results", + "mapping": { + "_id": "document_id", + "_score": "relevance_score", + "_source": "content" + } + } + }, + "took": {"path": "data.execution_time", "type": "integer"} + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + assertTrue("Should contain took", parsed.containsKey("took")); + 
assertEquals("Took should be 15", 15, parsed.get("took")); + } + + public void testMapResponseWithSimpleTemplate() throws Exception { + String remoteResponse = """ + { + "search": { + "documents": [ + {"id": "1", "score": 1.5, "data": {"title": "Document 1"}}, + {"id": "2", "score": 1.2, "data": {"title": "Document 2"}} + ], + "count": 2 + } + } + """; + + String template = """ + { + "hits": { + "total": "search.count", + "hits": "search.documents" + } + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + assertEquals("Total should be 2", 2, hits.get("total")); + + @SuppressWarnings("unchecked") + java.util.List hitsList = (java.util.List) hits.get("hits"); + assertEquals("Should have 2 hits", 2, hitsList.size()); + } + + public void testMapResponseWithArrayAccess() throws Exception { + String remoteResponse = """ + { + "results": [ + {"docs": [{"id": "1", "title": "First"}, {"id": "2", "title": "Second"}]}, + {"docs": [{"id": "3", "title": "Third"}]} + ] + } + """; + + String template = """ + { + "first_doc_id": "results[0].docs[0].id", + "first_doc_title": "results[0].docs[0].title", + "second_batch_first_doc": "results[1].docs[0].id" + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertEquals("First doc ID should be 1", "1", parsed.get("first_doc_id")); + assertEquals("First doc title should be First", "First", parsed.get("first_doc_title")); + assertEquals("Second batch first doc should be 3", "3", parsed.get("second_batch_first_doc")); + } + + public void testMapResponseWithTypeTransformation() throws Exception { + String remoteResponse = """ + { + "stats": { + "total": "100", + "score": 
"95.5", + "active": "true" + } + } + """; + + String template = """ + { + "total_count": {"path": "stats.total", "type": "integer"}, + "average_score": {"path": "stats.score", "type": "double"}, + "is_active": {"path": "stats.active", "type": "boolean"} + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertEquals("Total should be integer 100", 100, parsed.get("total_count")); + assertEquals("Score should be double 95.5", 95.5, parsed.get("average_score")); + assertEquals("Active should be boolean true", true, parsed.get("is_active")); + } + + public void testMapResponseWithDefaultValues() throws Exception { + String remoteResponse = """ + { + "partial_data": { + "available": "yes" + } + } + """; + + String template = """ + { + "available": "partial_data.available", + "missing_field": {"path": "partial_data.missing", "default": "not_found"}, + "missing_number": {"path": "partial_data.count", "type": "integer", "default": 0} + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertEquals("Available should be yes", "yes", parsed.get("available")); + assertEquals("Missing field should use default", "not_found", parsed.get("missing_field")); + assertEquals("Missing number should use default", 0, parsed.get("missing_number")); + } + + public void testMapResponseWithDocumentsFormat() throws Exception { + String documentsResponse = """ + { + "documents": [ + {"id": "doc1", "title": "Document 1", "score": 0.9}, + {"id": "doc2", "title": "Document 2", "score": 0.8} + ], + "total": 2 + } + """; + + String result = mapper.mapResponse(documentsResponse, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + 
@SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should be 2", 2, total.get("value")); + + @SuppressWarnings("unchecked") + java.util.List hitsList = (java.util.List) hits.get("hits"); + assertEquals("Should have 2 hits", 2, hitsList.size()); + } + + public void testMapResponseWithUnknownFormat() throws Exception { + String unknownResponse = """ + { + "custom_field": "custom_value", + "nested": { + "data": "some data" + } + } + """; + + String result = mapper.mapResponse(unknownResponse, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should be 1", 1, total.get("value")); + + @SuppressWarnings("unchecked") + java.util.List hitsList = (java.util.List) hits.get("hits"); + assertEquals("Should have 1 hit", 1, hitsList.size()); + + @SuppressWarnings("unchecked") + Map firstHit = (Map) hitsList.get(0); + @SuppressWarnings("unchecked") + Map source = (Map) firstHit.get("_source"); + assertEquals("Source should contain custom_field", "custom_value", source.get("custom_field")); + } + + public void testMapResponseWithInvalidJson() throws Exception { + String invalidJson = "{ invalid json }"; + + String result = mapper.mapResponse(invalidJson, null); + assertNotNull("Result should not be null", result); + + // Should return the original response when parsing fails + assertEquals("Should return original response", invalidJson, result); + } + + public void testMapResponseWithErrorTemplate() throws Exception { + String remoteResponse = """ + { + "data": { + "results": [] + } + } + """; + + String invalidTemplate = "{ invalid template json }"; + + String result = 
mapper.mapResponse(remoteResponse, invalidTemplate); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain error", parsed.containsKey("error")); + + @SuppressWarnings("unchecked") + Map error = (Map) parsed.get("error"); + assertEquals("Error type should be remote_mapping_exception", "remote_mapping_exception", error.get("type")); + } + + public void testMapResponseSolrFormat() throws Exception { + String solrResponse = """ + { + "responseHeader": {"QTime": 7}, + "response": { + "numFound": 2, + "start": 0, + "docs": [ + {"id": "A1", "title": "Doc A", "score": 1.23}, + {"title": "Doc B"} + ] + } + } + """; + + String result = mapper.mapResponse(solrResponse, null); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Should contain hits", parsed.containsKey("hits")); + assertTrue("Should contain took", parsed.containsKey("took")); + assertEquals("Took should map from QTime", 7, parsed.get("took")); + + @SuppressWarnings("unchecked") + Map hits = (Map) parsed.get("hits"); + @SuppressWarnings("unchecked") + Map total = (Map) hits.get("total"); + assertEquals("Total should equal numFound", 2, total.get("value")); + + @SuppressWarnings("unchecked") + java.util.List hitsList = (java.util.List) hits.get("hits"); + assertEquals("Should have 2 hits", 2, hitsList.size()); + + @SuppressWarnings("unchecked") + Map firstHit = (Map) hitsList.get(0); + assertEquals("First hit ID should be A1", "A1", firstHit.get("_id")); + assertEquals("First hit score should be 1.23", 1.23, firstHit.get("_score")); + + @SuppressWarnings("unchecked") + Map secondHit = (Map) hitsList.get(1); + // When id is missing, fallback to ordinal index string ("1") + assertEquals("Second hit ID should fallback to index string", "1", secondHit.get("_id")); + // Score should fallback to 1.0 when missing + assertEquals("Second hit score should fallback to 1.0", 1.0, secondHit.get("_score")); + } 
+ + public void testMapResponseWithTemplateArrayInsertion() throws Exception { + String remoteResponse = """ + { + "data": { + "items": [ + {"id": "1", "title": "A"}, + {"id": "2", "title": "B"} + ] + } + } + """; + + String template = """ + { + "items": ${data.items} + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Result should contain items", parsed.containsKey("items")); + + @SuppressWarnings("unchecked") + java.util.List items = (java.util.List) parsed.get("items"); + assertEquals("Should have 2 items", 2, items.size()); + + @SuppressWarnings("unchecked") + Map first = (Map) items.get(0); + assertEquals("First item id should be 1", "1", first.get("id")); + } + + public void testMapResponseWithTemplateObjectInsertion() throws Exception { + String remoteResponse = """ + { + "meta": { + "stats": { + "count": 5, + "ok": true + } + } + } + """; + + String template = """ + { + "stats": ${meta.stats} + } + """; + + String result = mapper.mapResponse(remoteResponse, template); + assertNotNull("Result should not be null", result); + + Map parsed = parseJson(result); + assertTrue("Result should contain stats", parsed.containsKey("stats")); + + @SuppressWarnings("unchecked") + Map stats = (Map) parsed.get("stats"); + assertEquals("Count should be 5", 5, stats.get("count")); + assertEquals("Ok should be true", true, stats.get("ok")); + } + + /** + * Helper method to parse JSON string to Map + */ + private Map parseJson(String json) throws Exception { + try (XContentParser parser = XContentFactory.jsonBuilder().contentType().xContent().createParser(null, null, json)) { + return parser.map(); + } + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutorTests.java b/src/test/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutorTests.java new file mode 100644 index 00000000..d938b889 --- 
/dev/null +++ b/src/test/java/org/opensearch/searchrelevance/executors/RemoteSearchExecutorTests.java @@ -0,0 +1,288 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. + */ +package org.opensearch.searchrelevance.executors; + +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.ArgumentMatchers.eq; +import static org.mockito.Mockito.doAnswer; +import static org.mockito.Mockito.when; + +import java.util.Map; +import java.util.concurrent.CountDownLatch; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicReference; + +import org.apache.lucene.tests.util.LuceneTestCase.SuppressSysoutChecks; +import org.mockito.Mock; +import org.mockito.MockitoAnnotations; +import org.opensearch.core.action.ActionListener; +import org.opensearch.searchrelevance.dao.RemoteSearchCacheDao; +import org.opensearch.searchrelevance.dao.RemoteSearchConfigurationDao; +import org.opensearch.searchrelevance.dao.RemoteSearchFailureDao; +import org.opensearch.searchrelevance.executors.RemoteSearchExecutor.RemoteSearchResponse; +import org.opensearch.searchrelevance.model.RemoteSearchConfiguration; + +/** + * Tests for RemoteSearchExecutor + */ +@SuppressSysoutChecks(bugUrl = "https://github.com/opensearch-project/search-relevance/issues/XXX") +public class RemoteSearchExecutorTests extends org.apache.lucene.tests.util.LuceneTestCase { + + @Mock + private RemoteSearchConfigurationDao mockDao; + + @Mock + private RemoteSearchCacheDao mockCacheDao; + + @Mock + private RemoteSearchFailureDao mockFailureDao; + + @Mock + private RemoteResponseMapper mockResponseMapper; + + private RemoteSearchExecutor remoteSearchExecutor; + + @Override + public void setUp() throws Exception { + super.setUp(); + 
MockitoAnnotations.openMocks(this); + + // Setup default mock behaviors + setupDefaultMockBehaviors(); + + // Create RemoteSearchExecutor with standard constructor (no HttpClient needed) + remoteSearchExecutor = new RemoteSearchExecutor(mockDao, mockCacheDao, mockFailureDao, mockResponseMapper); + } + + private void setupDefaultMockBehaviors() { + // Mock cache DAO to return null (cache miss) by default + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + }).when(mockCacheDao).getCachedResponse(any(), any()); + + // Mock cache DAO to succeed when caching + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + }).when(mockCacheDao).cacheResponse(any(), any()); + + // Mock failure DAO to succeed when recording failures + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + }).when(mockFailureDao).recordFailure(any(), any()); + + // Mock response mapper to return input by default + when(mockResponseMapper.mapResponse(any(), any())).thenAnswer(invocation -> invocation.getArgument(0)); + } + + public void testExecuteRemoteSearchConfigNotFound() throws Exception { + String configId = "nonexistent-config"; + String query = "{\"query\":{\"match\":{\"title\":\"test\"}}}"; + String queryText = "test"; + String experimentId = "exp-123"; + + // Mock DAO response with null config + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + }).when(mockDao).getRemoteSearchConfiguration(eq(configId), any()); + + // Execute test + CountDownLatch latch = new CountDownLatch(1); + AtomicReference responseRef = new AtomicReference<>(); + AtomicReference errorRef = new AtomicReference<>(); + + remoteSearchExecutor.executeRemoteSearch(configId, query, queryText, 10, experimentId, 
ActionListener.wrap(response -> { + responseRef.set(response); + latch.countDown(); + }, error -> { + errorRef.set(error); + latch.countDown(); + })); + + // Wait for async completion + assertTrue("Request should complete within timeout", latch.await(5, TimeUnit.SECONDS)); + + // Verify error + assertNotNull("Error should not be null", errorRef.get()); + assertTrue("Error should be IllegalArgumentException", errorRef.get() instanceof IllegalArgumentException); + assertTrue("Error message should mention config not found", errorRef.get().getMessage().contains("Remote configuration not found")); + } + + public void testExecuteRemoteSearchWithInvalidUrl() throws Exception { + String configId = "test-config-invalid-url"; + String query = "{\"query\":{\"match\":{\"title\":\"test\"}}}"; + String queryText = "test"; + String experimentId = "exp-123"; + + // Create config with invalid URL + RemoteSearchConfiguration config = new RemoteSearchConfiguration( + configId, + "Invalid URL Config", + "Test configuration with invalid URL", + "invalid-url-format", + "user", + "pass", + "${query}", + null, + 10, + 5, + 60, + false, + Map.of(), + "2025-01-29T12:00:00Z" + ); + + // Mock DAO response + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(config); + return null; + }).when(mockDao).getRemoteSearchConfiguration(eq(configId), any()); + + // Execute test + CountDownLatch latch = new CountDownLatch(1); + AtomicReference responseRef = new AtomicReference<>(); + AtomicReference errorRef = new AtomicReference<>(); + + remoteSearchExecutor.executeRemoteSearch(configId, query, queryText, 10, experimentId, ActionListener.wrap(response -> { + responseRef.set(response); + latch.countDown(); + }, error -> { + errorRef.set(error); + latch.countDown(); + })); + + // Wait for async completion + assertTrue("Request should complete within timeout", latch.await(10, TimeUnit.SECONDS)); + + // Verify error due to invalid URL + 
assertNotNull("Error should not be null", errorRef.get()); + assertTrue("Error should be RuntimeException", errorRef.get() instanceof RuntimeException); + assertTrue("Error message should mention remote search failed", errorRef.get().getMessage().contains("Remote search failed")); + } + + public void testRateLimitingConcurrentRequests() throws Exception { + String configId = "test-config-rate-limit"; + RemoteSearchConfiguration config = new RemoteSearchConfiguration( + configId, + "Rate Limited Config", + "Test configuration with low concurrent limit", + "https://httpbin.org/delay/2", // Use httpbin for testing with delay + "user", + "pass", + "${query}", + null, + 10, // requests per second + 1, // max concurrent requests (low limit for testing) + 60, + false, + Map.of(), + "2025-01-29T12:00:00Z" + ); + + // Mock DAO response + doAnswer(invocation -> { + ActionListener listener = invocation.getArgument(1); + listener.onResponse(config); + return null; + }).when(mockDao).getRemoteSearchConfiguration(eq(configId), any()); + + // Start first request (should succeed but take time) + CountDownLatch firstLatch = new CountDownLatch(1); + AtomicReference firstError = new AtomicReference<>(); + + remoteSearchExecutor.executeRemoteSearch( + configId, + "{\"query\":{}}", + "test1", + 10, + "exp-1", + ActionListener.wrap(response -> firstLatch.countDown(), error -> { + firstError.set(error); + firstLatch.countDown(); + }) + ); + + // Start second request immediately (should fail due to concurrent limit) + CountDownLatch secondLatch = new CountDownLatch(1); + AtomicReference secondError = new AtomicReference<>(); + + remoteSearchExecutor.executeRemoteSearch( + configId, + "{\"query\":{}}", + "test2", + 10, + "exp-2", + ActionListener.wrap(response -> secondLatch.countDown(), error -> { + secondError.set(error); + secondLatch.countDown(); + }) + ); + + // Wait for second request to fail quickly + assertTrue("Second request should complete quickly", secondLatch.await(2, 
TimeUnit.SECONDS)); + + // Verify second request failed due to rate limiting + assertNotNull("Second request should have failed", secondError.get()); + assertTrue("Error should mention rate limit", secondError.get().getMessage().contains("Rate limit exceeded")); + + // Wait for first request to complete (or timeout) + firstLatch.await(15, TimeUnit.SECONDS); + } + + public void testConstructorInitialization() { + // Test that constructor properly initializes all dependencies + assertNotNull("RemoteSearchExecutor should be initialized", remoteSearchExecutor); + + // Test basic functionality by calling with an unstubbed config ID (should trigger a DAO call) + CountDownLatch latch = new CountDownLatch(1); + + remoteSearchExecutor.executeRemoteSearch( + "test-config", + "{}", + "test", + 10, + "exp-1", + ActionListener.wrap(response -> latch.countDown(), error -> latch.countDown()) + ); + + // Just verify the call doesn't throw an exception immediately + assertNotNull("Executor should remain functional after dispatching a request", remoteSearchExecutor); + } + + /** + * Create a test configuration with default values + */ + private RemoteSearchConfiguration createTestConfiguration(String configId) { + return new RemoteSearchConfiguration( + configId, + "Test Configuration", + "Test configuration for unit tests", + "https://httpbin.org/post", // Use httpbin for testing + "testuser", + "testpass", + "${query}", + null, + 10, // max requests per second + 5, // max concurrent requests + 60, // cache duration minutes + false, + Map.of("test", "metadata"), + "2025-01-29T12:00:00Z" + ); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/executors/SearchResponseProcessorDocIdsTests.java b/src/test/java/org/opensearch/searchrelevance/executors/SearchResponseProcessorDocIdsTests.java new file mode 100644 index 00000000..54179f76 --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/executors/SearchResponseProcessorDocIdsTests.java @@ -0,0 +1,105 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * 
The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. + */ +package org.opensearch.searchrelevance.executors; + +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.Mockito.doAnswer; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.verify; + +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.atomic.AtomicBoolean; + +import org.opensearch.core.action.ActionListener; +import org.opensearch.searchrelevance.dao.EvaluationResultDao; +import org.opensearch.searchrelevance.dao.ExperimentVariantDao; +import org.opensearch.searchrelevance.model.AsyncStatus; +import org.opensearch.searchrelevance.model.EvaluationResult; +import org.opensearch.searchrelevance.model.ExperimentType; +import org.opensearch.searchrelevance.model.ExperimentVariant; + +/** + * Tests for SearchResponseProcessor.processDocIds (remote engine mapped path) + */ +public class SearchResponseProcessorDocIdsTests extends org.apache.lucene.tests.util.LuceneTestCase { + + public void testProcessDocIdsPersistsEvaluationResult() throws Exception { + // Mocks + EvaluationResultDao evaluationResultDao = mock(EvaluationResultDao.class); + ExperimentVariantDao experimentVariantDao = mock(ExperimentVariantDao.class); + + // Stub DAO calls to immediately succeed + doAnswer(invocation -> { + // Simulate success callback to allow downstream variant update + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + }).when(evaluationResultDao).putEvaluationResultEfficient(any(EvaluationResult.class), any(ActionListener.class)); + + doAnswer(invocation -> { + // Acknowledge variant write success + ActionListener listener = invocation.getArgument(1); + listener.onResponse(null); + return null; + 
}).when(experimentVariantDao).putExperimentVariantEfficient(any(ExperimentVariant.class), any(ActionListener.class)); + + // Under test + SearchResponseProcessor processor = new SearchResponseProcessor(evaluationResultDao, experimentVariantDao); + + // Inputs + String experimentId = "exp-1"; + String searchConfigId = "remote-config-1"; + String queryText = "test query"; + int size = 10; + List judgmentIds = List.of("j1", "j2"); + Map docIdToScores = Map.of("A", "3", "B", "5"); + String evaluationId = "eval-1"; + + ExperimentVariant variant = new ExperimentVariant( + "variant-1", + "2025-08-13T00:00:00Z", + ExperimentType.REMOTE_SEARCH_EVALUATION, + AsyncStatus.PROCESSING, + experimentId, + Map.of("remoteConfigId", searchConfigId), + Map.of() + ); + + ExperimentTaskContext taskContext = new ExperimentTaskContext( + experimentId, + searchConfigId, + queryText, + 1, + new ConcurrentHashMap<>(), + new CompletableFuture<>(), + new AtomicBoolean(false), + experimentVariantDao, + ExperimentType.REMOTE_SEARCH_EVALUATION + ); + + // Act + processor.processDocIds( + List.of("A", "B"), + variant, + experimentId, + searchConfigId, + queryText, + size, + judgmentIds, + docIdToScores, + evaluationId, + taskContext + ); + + // Assert - evaluation result persisted via efficient path + verify(evaluationResultDao).putEvaluationResultEfficient(any(EvaluationResult.class), any(ActionListener.class)); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/experiment/PointwiseExperimentProcessorTests.java b/src/test/java/org/opensearch/searchrelevance/experiment/PointwiseExperimentProcessorTests.java index 2535cbb4..4aaa6e22 100644 --- a/src/test/java/org/opensearch/searchrelevance/experiment/PointwiseExperimentProcessorTests.java +++ b/src/test/java/org/opensearch/searchrelevance/experiment/PointwiseExperimentProcessorTests.java @@ -23,7 +23,6 @@ import java.util.concurrent.TimeUnit; import java.util.concurrent.atomic.AtomicBoolean; -import org.junit.Before; import 
org.mockito.Mock; import org.mockito.MockitoAnnotations; import org.opensearch.action.search.SearchResponse; @@ -49,9 +48,9 @@ public class PointwiseExperimentProcessorTests extends OpenSearchTestCase { private PointwiseExperimentProcessor processor; - @Before + @Override @SneakyThrows - public void setUp() { + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); processor = new PointwiseExperimentProcessor(judgmentDao, taskManager); diff --git a/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchCacheTests.java b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchCacheTests.java new file mode 100644 index 00000000..2dc8e56b --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchCacheTests.java @@ -0,0 +1,164 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */ +package org.opensearch.searchrelevance.model; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; + +import java.io.IOException; + +import org.opensearch.common.xcontent.XContentType; +import org.opensearch.core.xcontent.XContentBuilder; + +/** + * Tests for RemoteSearchCache model + */ +public class RemoteSearchCacheTests extends org.apache.lucene.tests.util.LuceneTestCase { + + public void testRemoteSearchCacheCreation() { + long currentTime = System.currentTimeMillis(); + long expirationTime = currentTime + 3600000; // 1 hour later + + RemoteSearchCache cache = new RemoteSearchCache( + "cache-key-123", + "remote-config-1", + "{\"query\": {\"match\": {\"content\": \"test query\"}}}", + "test query", + "{\"hits\": {\"total\": {\"value\": 10}}}", + "{\"took\": 5, \"hits\": {\"total\": {\"value\": 10}}}", + currentTime, + expirationTime + ); + + assertEquals("cache-key-123", cache.getCacheKey()); + assertEquals("remote-config-1", cache.getRemoteConfigId()); + assertEquals("{\"query\": {\"match\": {\"content\": \"test query\"}}}", cache.getQuery()); + assertEquals("test query", cache.getQueryText()); + assertEquals("{\"hits\": {\"total\": {\"value\": 10}}}", cache.getCachedResponse()); + assertEquals("{\"took\": 5, \"hits\": {\"total\": {\"value\": 10}}}", cache.getMappedResponse()); + assertEquals(currentTime, cache.getCacheTimestamp()); + assertEquals(expirationTime, cache.getExpirationTimestamp()); + } + + public void testRemoteSearchCacheToXContent() throws IOException { + long currentTime = System.currentTimeMillis(); + long expirationTime = currentTime + 3600000; + + RemoteSearchCache cache = new RemoteSearchCache( + "cache-key-123", + "remote-config-1", + "{\"query\": {\"match\": {\"content\": \"test query\"}}}", + "test query", + "{\"hits\": {\"total\": {\"value\": 10}}}", + "{\"took\": 5, \"hits\": {\"total\": {\"value\": 
10}}}", + currentTime, + expirationTime + ); + + XContentBuilder builder = cache.toXContent(XContentBuilder.builder(XContentType.JSON.xContent()), null); + assertNotNull(builder); + + String jsonString = builder.toString(); + assertNotNull(jsonString); + + // Verify key fields are present in JSON + assertTrue(jsonString.contains("cache-key-123")); + assertTrue(jsonString.contains("remote-config-1")); + assertTrue(jsonString.contains("test query")); + } + + public void testCacheExpiration() { + long currentTime = System.currentTimeMillis(); + + // Create expired cache entry + RemoteSearchCache expiredCache = new RemoteSearchCache( + "expired-key", + "config-1", + "{}", + "query", + "{}", + "{}", + currentTime - 7200000, // 2 hours ago + currentTime - 3600000 // 1 hour ago (expired) + ); + + assertTrue("Cache should be expired", expiredCache.isExpired()); + + // Create non-expired cache entry + RemoteSearchCache validCache = new RemoteSearchCache( + "valid-key", + "config-1", + "{}", + "query", + "{}", + "{}", + currentTime, + currentTime + 3600000 // 1 hour from now + ); + + assertFalse("Cache should not be expired", validCache.isExpired()); + } + + public void testGenerateCacheKey() { + String cacheKey1 = RemoteSearchCache.generateCacheKey("config-1", "{\"query\": \"test\"}", "test"); + String cacheKey2 = RemoteSearchCache.generateCacheKey("config-1", "{\"query\": \"test\"}", "test"); + String cacheKey3 = RemoteSearchCache.generateCacheKey("config-2", "{\"query\": \"test\"}", "test"); + String cacheKey4 = RemoteSearchCache.generateCacheKey("config-1", "{\"query\": \"different\"}", "test"); + + // Same inputs should generate same cache key + assertEquals("Same inputs should generate same cache key", cacheKey1, cacheKey2); + + // Different config ID should generate different cache key + assertFalse("Different config should generate different cache key", cacheKey1.equals(cacheKey3)); + + // Different query should generate different cache key + assertFalse("Different 
query should generate different cache key", cacheKey1.equals(cacheKey4)); + + // Cache keys should be non-null and non-empty + assertNotNull(cacheKey1); + assertFalse(cacheKey1.isEmpty()); + } + + public void testRemoteSearchCacheConstants() { + // Verify field name constants + assertEquals("cacheKey", RemoteSearchCache.CACHE_KEY); + assertEquals("remoteConfigId", RemoteSearchCache.REMOTE_CONFIG_ID); + assertEquals("query", RemoteSearchCache.QUERY); + assertEquals("queryText", RemoteSearchCache.QUERY_TEXT); + assertEquals("cachedResponse", RemoteSearchCache.CACHED_RESPONSE); + assertEquals("mappedResponse", RemoteSearchCache.MAPPED_RESPONSE); + assertEquals("cacheTimestamp", RemoteSearchCache.CACHE_TIMESTAMP); + assertEquals("expirationTimestamp", RemoteSearchCache.EXPIRATION_TIMESTAMP); + } + + public void testRemoteSearchCacheWithNullValues() throws IOException { + RemoteSearchCache cache = new RemoteSearchCache( + null, // null cache key + null, // null remote config ID + null, // null query + null, // null query text + null, // null cached response + null, // null mapped response + 0, // zero timestamp + 0 // zero expiration + ); + + // Should handle null values gracefully + XContentBuilder builder = cache.toXContent(XContentBuilder.builder(XContentType.JSON.xContent()), null); + assertNotNull(builder); + + String jsonString = builder.toString(); + assertNotNull(jsonString); + + // Should contain empty strings for null values + assertTrue(jsonString.contains("\"cacheKey\":\"\"")); + assertTrue(jsonString.contains("\"remoteConfigId\":\"\"")); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchConfigurationTests.java b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchConfigurationTests.java new file mode 100644 index 00000000..9b0f728b --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchConfigurationTests.java @@ -0,0 +1,143 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The 
OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. + */ +package org.opensearch.searchrelevance.model; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; + +import java.io.IOException; +import java.util.HashMap; +import java.util.Map; + +import org.opensearch.common.xcontent.XContentType; +import org.opensearch.core.xcontent.XContentBuilder; + +/** + * Tests for RemoteSearchConfiguration model + */ +public class RemoteSearchConfigurationTests extends org.apache.lucene.tests.util.LuceneTestCase { + + public void testRemoteSearchConfigurationCreation() { + Map metadata = new HashMap<>(); + metadata.put("environment", "test"); + + RemoteSearchConfiguration config = new RemoteSearchConfiguration( + "test-config-1", + "Test Remote Config", + "Test configuration for remote OpenSearch cluster", + "https://remote-cluster.example.com:9200", + "testuser", + "testpass", + "{\"query\": {\"match\": {\"content\": \"%SearchText%\"}}}", + "{\"response_structure\": {\"hits\": {\"total_path\": \"$.hits.total.value\"}}}", + 10, + 5, + 60, + false, + metadata, + "2025-01-29T10:00:00Z" + ); + + assertEquals("test-config-1", config.getId()); + assertEquals("Test Remote Config", config.getName()); + assertEquals("Test configuration for remote OpenSearch cluster", config.getDescription()); + assertEquals("https://remote-cluster.example.com:9200", config.getConnectionUrl()); + assertEquals("testuser", config.getUsername()); + assertEquals("testpass", config.getPassword()); + assertEquals("{\"query\": {\"match\": {\"content\": \"%SearchText%\"}}}", config.getQueryTemplate()); + assertEquals("{\"response_structure\": {\"hits\": {\"total_path\": \"$.hits.total.value\"}}}", config.getResponseTemplate()); + assertEquals(10, config.getMaxRequestsPerSecond()); + assertEquals(5, config.getMaxConcurrentRequests()); + assertEquals(60, 
config.getCacheDurationMinutes()); + assertEquals(false, config.isRefreshCache()); + assertEquals(metadata, config.getMetadata()); + assertEquals("2025-01-29T10:00:00Z", config.getTimestamp()); + } + + public void testRemoteSearchConfigurationToXContent() throws IOException { + Map metadata = new HashMap<>(); + metadata.put("environment", "test"); + + RemoteSearchConfiguration config = new RemoteSearchConfiguration( + "test-config-1", + "Test Remote Config", + "Test configuration", + "https://remote-cluster.example.com:9200", + "testuser", + "testpass", + "{\"query\": {\"match\": {\"content\": \"%SearchText%\"}}}", + "{\"response_structure\": {\"hits\": {\"total_path\": \"$.hits.total.value\"}}}", + RemoteSearchConfiguration.DEFAULT_MAX_REQUESTS_PER_SECOND, + RemoteSearchConfiguration.DEFAULT_MAX_CONCURRENT_REQUESTS, + RemoteSearchConfiguration.DEFAULT_CACHE_DURATION_MINUTES, + false, + metadata, + "2025-01-29T10:00:00Z" + ); + + XContentBuilder builder = config.toXContent(XContentBuilder.builder(XContentType.JSON.xContent()), null); + assertNotNull(builder); + + String jsonString = builder.toString(); + assertNotNull(jsonString); + + // Verify key fields are present in JSON + assertTrue(jsonString.contains("test-config-1")); + assertTrue(jsonString.contains("Test Remote Config")); + assertTrue(jsonString.contains("https://remote-cluster.example.com:9200")); + assertTrue(jsonString.contains("testuser")); + assertTrue(jsonString.contains("%SearchText%")); + } + + public void testRemoteSearchConfigurationDefaults() { + RemoteSearchConfiguration config = new RemoteSearchConfiguration( + "test-config-2", + "Test Config 2", + null, // null description + "https://example.com", + null, // null username + null, // null password + null, // null query template + null, // null response template + RemoteSearchConfiguration.DEFAULT_MAX_REQUESTS_PER_SECOND, + RemoteSearchConfiguration.DEFAULT_MAX_CONCURRENT_REQUESTS, + RemoteSearchConfiguration.DEFAULT_CACHE_DURATION_MINUTES, + false, + null, 
// null metadata + null // null timestamp + ); + + assertEquals("test-config-2", config.getId()); + assertEquals("Test Config 2", config.getName()); + assertEquals(null, config.getDescription()); + assertEquals("https://example.com", config.getConnectionUrl()); + assertEquals(null, config.getUsername()); + assertEquals(null, config.getPassword()); + assertEquals(null, config.getQueryTemplate()); + assertEquals(null, config.getResponseTemplate()); + assertEquals(10, config.getMaxRequestsPerSecond()); + assertEquals(5, config.getMaxConcurrentRequests()); + assertEquals(60, config.getCacheDurationMinutes()); + assertEquals(false, config.isRefreshCache()); + assertEquals(null, config.getMetadata()); + assertEquals(null, config.getTimestamp()); + } + + public void testRemoteSearchConfigurationConstants() { + assertEquals(10, RemoteSearchConfiguration.DEFAULT_MAX_REQUESTS_PER_SECOND); + assertEquals(5, RemoteSearchConfiguration.DEFAULT_MAX_CONCURRENT_REQUESTS); + assertEquals(60, RemoteSearchConfiguration.DEFAULT_CACHE_DURATION_MINUTES); + + // Verify field name constants + assertEquals("id", RemoteSearchConfiguration.ID); + assertEquals("name", RemoteSearchConfiguration.NAME); + assertEquals("connectionUrl", RemoteSearchConfiguration.CONNECTION_URL); + assertEquals("queryTemplate", RemoteSearchConfiguration.QUERY_TEMPLATE); + assertEquals("responseTemplate", RemoteSearchConfiguration.RESPONSE_TEMPLATE); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchFailureTests.java b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchFailureTests.java new file mode 100644 index 00000000..9fdca763 --- /dev/null +++ b/src/test/java/org/opensearch/searchrelevance/model/RemoteSearchFailureTests.java @@ -0,0 +1,255 @@ +/* + * SPDX-License-Identifier: Apache-2.0 + * + * The OpenSearch Contributors require contributions made to + * this file be licensed under the Apache-2.0 license or a + * compatible open source license. 
+ */ +package org.opensearch.searchrelevance.model; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; + +import java.io.IOException; +import java.net.ConnectException; +import java.net.SocketTimeoutException; + +import org.opensearch.common.xcontent.XContentType; +import org.opensearch.core.xcontent.XContentBuilder; + +/** + * Tests for RemoteSearchFailure model + */ +public class RemoteSearchFailureTests extends org.apache.lucene.tests.util.LuceneTestCase { + + public void testRemoteSearchFailureCreation() { + RemoteSearchFailure failure = new RemoteSearchFailure( + "failure-1", + "remote-config-1", + "experiment-1", + "{\"query\": {\"match\": {\"content\": \"test\"}}}", + "test query", + "CONNECTION_TIMEOUT", + "Connection timed out after 30 seconds", + "2025-01-29T10:00:00Z", + "FAILED" + ); + + assertEquals("failure-1", failure.getId()); + assertEquals("remote-config-1", failure.getRemoteConfigId()); + assertEquals("experiment-1", failure.getExperimentId()); + assertEquals("{\"query\": {\"match\": {\"content\": \"test\"}}}", failure.getQuery()); + assertEquals("test query", failure.getQueryText()); + assertEquals("CONNECTION_TIMEOUT", failure.getErrorType()); + assertEquals("Connection timed out after 30 seconds", failure.getErrorMessage()); + assertEquals("2025-01-29T10:00:00Z", failure.getTimestamp()); + assertEquals("FAILED", failure.getStatus()); + } + + public void testRemoteSearchFailureToXContent() throws IOException { + RemoteSearchFailure failure = new RemoteSearchFailure( + "failure-1", + "remote-config-1", + "experiment-1", + "{\"query\": {\"match\": {\"content\": \"test\"}}}", + "test query", + "AUTH_FAILURE", + "Authentication failed", + "2025-01-29T10:00:00Z", + "FAILED" + ); + + XContentBuilder builder = failure.toXContent(XContentBuilder.builder(XContentType.JSON.xContent()), null); + assertNotNull(builder); + + String jsonString = builder.toString(); + 
assertNotNull(jsonString); + + // Verify key fields are present in JSON + assertTrue(jsonString.contains("failure-1")); + assertTrue(jsonString.contains("remote-config-1")); + assertTrue(jsonString.contains("experiment-1")); + assertTrue(jsonString.contains("AUTH_FAILURE")); + assertTrue(jsonString.contains("Authentication failed")); + } + + public void testFromExceptionWithTimeout() { + SocketTimeoutException timeoutException = new SocketTimeoutException("Read timed out"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-2", + "config-2", + "experiment-2", + "{\"query\": \"test\"}", + "test", + timeoutException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("failure-2", failure.getId()); + assertEquals("config-2", failure.getRemoteConfigId()); + assertEquals("experiment-2", failure.getExperimentId()); + assertEquals("CONNECTION_TIMEOUT", failure.getErrorType()); + assertEquals("Read timed out", failure.getErrorMessage()); + assertEquals("FAILED", failure.getStatus()); + } + + public void testFromExceptionWithAuthFailure() { + Exception authException = new RuntimeException("401 Unauthorized - Authentication failed"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-3", + "config-3", + "experiment-3", + "{\"query\": \"test\"}", + "test", + authException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("AUTH_FAILURE", failure.getErrorType()); + assertTrue(failure.getErrorMessage().contains("Authentication failed")); + } + + public void testFromExceptionWithNetworkError() { + ConnectException networkException = new ConnectException("Connection refused"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-4", + "config-4", + "experiment-4", + "{\"query\": \"test\"}", + "test", + networkException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("NETWORK_ERROR", failure.getErrorType()); + assertEquals("Connection refused", failure.getErrorMessage()); + } + + public void 
testFromExceptionWithRateLimit() { + Exception rateLimitException = new RuntimeException("429 Too Many Requests - Rate limit exceeded"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-5", + "config-5", + "experiment-5", + "{\"query\": \"test\"}", + "test", + rateLimitException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("RATE_LIMIT_EXCEEDED", failure.getErrorType()); + assertTrue(failure.getErrorMessage().contains("Rate limit exceeded")); + } + + public void testFromExceptionWithServerError() { + Exception serverException = new RuntimeException("500 Internal Server Error"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-6", + "config-6", + "experiment-6", + "{\"query\": \"test\"}", + "test", + serverException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("REMOTE_SERVER_ERROR", failure.getErrorType()); + assertTrue(failure.getErrorMessage().contains("500")); + } + + public void testFromExceptionWithInvalidResponse() { + Exception parseException = new RuntimeException("Failed to parse response JSON"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-7", + "config-7", + "experiment-7", + "{\"query\": \"test\"}", + "test", + parseException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("INVALID_RESPONSE", failure.getErrorType()); + assertTrue(failure.getErrorMessage().contains("parse")); + } + + public void testFromExceptionWithUnknownError() { + Exception unknownException = new RuntimeException("Some unexpected error"); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-8", + "config-8", + "experiment-8", + "{\"query\": \"test\"}", + "test", + unknownException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("UNKNOWN_ERROR", failure.getErrorType()); + assertEquals("Some unexpected error", failure.getErrorMessage()); + } + + public void testFromExceptionWithNullMessage() { + Exception nullMessageException = new RuntimeException((String) 
null); + + RemoteSearchFailure failure = RemoteSearchFailure.fromException( + "failure-9", + "config-9", + "experiment-9", + "{\"query\": \"test\"}", + "test", + nullMessageException, + "2025-01-29T10:00:00Z" + ); + + assertEquals("UNKNOWN_ERROR", failure.getErrorType()); + assertEquals(null, failure.getErrorMessage()); + } + + public void testRemoteSearchFailureConstants() { + // Verify field name constants + assertEquals("id", RemoteSearchFailure.ID); + assertEquals("remoteConfigId", RemoteSearchFailure.REMOTE_CONFIG_ID); + assertEquals("experimentId", RemoteSearchFailure.EXPERIMENT_ID); + assertEquals("query", RemoteSearchFailure.QUERY); + assertEquals("queryText", RemoteSearchFailure.QUERY_TEXT); + assertEquals("errorType", RemoteSearchFailure.ERROR_TYPE); + assertEquals("errorMessage", RemoteSearchFailure.ERROR_MESSAGE); + assertEquals("timestamp", RemoteSearchFailure.TIMESTAMP); + assertEquals("status", RemoteSearchFailure.STATUS); + } + + public void testErrorTypeEnum() { + // Verify all error types are available + RemoteSearchFailure.ErrorType[] errorTypes = RemoteSearchFailure.ErrorType.values(); + assertEquals(7, errorTypes.length); + + // Verify specific error types exist + assertEquals("CONNECTION_TIMEOUT", RemoteSearchFailure.ErrorType.CONNECTION_TIMEOUT.name()); + assertEquals("AUTH_FAILURE", RemoteSearchFailure.ErrorType.AUTH_FAILURE.name()); + assertEquals("INVALID_RESPONSE", RemoteSearchFailure.ErrorType.INVALID_RESPONSE.name()); + assertEquals("RATE_LIMIT_EXCEEDED", RemoteSearchFailure.ErrorType.RATE_LIMIT_EXCEEDED.name()); + assertEquals("REMOTE_SERVER_ERROR", RemoteSearchFailure.ErrorType.REMOTE_SERVER_ERROR.name()); + assertEquals("NETWORK_ERROR", RemoteSearchFailure.ErrorType.NETWORK_ERROR.name()); + assertEquals("UNKNOWN_ERROR", RemoteSearchFailure.ErrorType.UNKNOWN_ERROR.name()); + } + + public void testStatusEnum() { + // Verify all status values are available + RemoteSearchFailure.Status[] statuses = RemoteSearchFailure.Status.values(); + 
assertEquals(3, statuses.length); + + // Verify specific statuses exist + assertEquals("FAILED", RemoteSearchFailure.Status.FAILED.name()); + assertEquals("RETRY_PENDING", RemoteSearchFailure.Status.RETRY_PENDING.name()); + assertEquals("RESOLVED", RemoteSearchFailure.Status.RESOLVED.name()); + } +} diff --git a/src/test/java/org/opensearch/searchrelevance/plugin/SearchRelevancePluginTests.java b/src/test/java/org/opensearch/searchrelevance/plugin/SearchRelevancePluginTests.java index 3e99a703..2ac44024 100644 --- a/src/test/java/org/opensearch/searchrelevance/plugin/SearchRelevancePluginTests.java +++ b/src/test/java/org/opensearch/searchrelevance/plugin/SearchRelevancePluginTests.java @@ -182,7 +182,7 @@ public void testIsAnSystemIndexPlugin() { } public void testTotalRestHandlers() { - assertEquals(14, plugin.getRestHandlers(Settings.EMPTY, null, null, null, null, null, null).size()); + assertEquals(18, plugin.getRestHandlers(Settings.EMPTY, null, null, null, null, null, null).size()); } public void testQuerySetTransportIsAdded() { diff --git a/src/test/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsActionTests.java b/src/test/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsActionTests.java index eb861bfc..be1e0522 100644 --- a/src/test/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsActionTests.java +++ b/src/test/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsActionTests.java @@ -20,7 +20,6 @@ import java.util.HashMap; import java.util.Map; -import org.junit.Before; import org.mockito.ArgumentCaptor; import org.mockito.Mock; import org.mockito.MockitoAnnotations; @@ -59,8 +58,9 @@ public class RestSearchRelevanceStatsActionTests extends SearchRelevanceRestTest @Mock private ClusterUtil clusterUtil; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); threadPool = new 
TestThreadPool(this.getClass().getSimpleName() + "ThreadPool"); @@ -79,8 +79,12 @@ public void setup() { @Override public void tearDown() throws Exception { super.tearDown(); - threadPool.shutdown(); - client.close(); + if (threadPool != null) { + threadPool.shutdown(); + } + if (client != null) { + client.close(); + } } public void test_execute() throws Exception { diff --git a/src/test/java/org/opensearch/searchrelevance/stats/events/EventStatsManagerTests.java b/src/test/java/org/opensearch/searchrelevance/stats/events/EventStatsManagerTests.java index bb5fa206..3ace3180 100644 --- a/src/test/java/org/opensearch/searchrelevance/stats/events/EventStatsManagerTests.java +++ b/src/test/java/org/opensearch/searchrelevance/stats/events/EventStatsManagerTests.java @@ -12,7 +12,6 @@ import java.util.EnumSet; import java.util.Map; -import org.junit.Before; import org.mockito.Mock; import org.mockito.MockitoAnnotations; import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor; @@ -29,8 +28,9 @@ public class EventStatsManagerTests extends OpenSearchTestCase { private EventStatsManager eventStatsManager; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); eventStatsManager = new EventStatsManager(); eventStatsManager.initialize(mockSettingsAccessor); diff --git a/src/test/java/org/opensearch/searchrelevance/stats/events/TimestampedEventStatTests.java b/src/test/java/org/opensearch/searchrelevance/stats/events/TimestampedEventStatTests.java index 9e775ae7..7a340ff8 100644 --- a/src/test/java/org/opensearch/searchrelevance/stats/events/TimestampedEventStatTests.java +++ b/src/test/java/org/opensearch/searchrelevance/stats/events/TimestampedEventStatTests.java @@ -12,21 +12,21 @@ import java.util.concurrent.TimeUnit; -import org.junit.Before; -import org.mockito.Spy; +import org.mockito.MockitoAnnotations; import org.opensearch.test.OpenSearchTestCase; public class 
TimestampedEventStatTests extends OpenSearchTestCase { private static final long BUCKET_INTERVAL_MS = 60 * 1000; // 60 seconds private static final EventStatName STAT_NAME = EventStatName.LLM_JUDGMENT_RATING_GENERATIONS; - @Spy private TimestampedEventStat stat; private long currentTime; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); + MockitoAnnotations.openMocks(this); stat = spy(new TimestampedEventStat(STAT_NAME)); currentTime = System.currentTimeMillis(); doAnswer(inv -> currentTime).when(stat).getCurrentTimeInMillis(); diff --git a/src/test/java/org/opensearch/searchrelevance/stats/info/InfoStatsManagerTests.java b/src/test/java/org/opensearch/searchrelevance/stats/info/InfoStatsManagerTests.java index 1f2f8e0a..a98ece51 100644 --- a/src/test/java/org/opensearch/searchrelevance/stats/info/InfoStatsManagerTests.java +++ b/src/test/java/org/opensearch/searchrelevance/stats/info/InfoStatsManagerTests.java @@ -11,7 +11,6 @@ import java.util.Map; import java.util.Set; -import org.junit.Before; import org.mockito.Mock; import org.mockito.MockitoAnnotations; import org.opensearch.searchrelevance.settings.SearchRelevanceSettingsAccessor; @@ -24,8 +23,9 @@ public class InfoStatsManagerTests extends OpenSearchTestCase { private InfoStatsManager infoStatsManager; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); infoStatsManager = new InfoStatsManager(mockSettingsAccessor); } diff --git a/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsResponseTests.java b/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsResponseTests.java index 61707d98..a382a466 100644 --- a/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsResponseTests.java +++ b/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsResponseTests.java 
@@ -20,7 +20,6 @@ import java.util.List; import java.util.Map; -import org.junit.Before; import org.mockito.Mock; import org.mockito.MockitoAnnotations; import org.opensearch.action.FailedNodeException; @@ -49,8 +48,9 @@ public class SearchRelevanceStatsResponseTests extends OpenSearchTestCase { @Mock private StreamOutput mockStreamOutput; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); clusterName = new ClusterName("test-cluster"); nodes = new ArrayList<>(); diff --git a/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsTransportActionTests.java b/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsTransportActionTests.java index c1fb568c..0fb65b08 100644 --- a/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsTransportActionTests.java +++ b/src/test/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsTransportActionTests.java @@ -18,7 +18,6 @@ import java.util.List; import java.util.Map; -import org.junit.Before; import org.mockito.Mock; import org.mockito.MockitoAnnotations; import org.opensearch.action.FailedNodeException; @@ -64,8 +63,9 @@ public class SearchRelevanceStatsTransportActionTests extends OpenSearchTestCase private static InfoStatName infoStatName = InfoStatName.CLUSTER_VERSION; private static EventStatName eventStatName = EventStatName.LLM_JUDGMENT_RATING_GENERATIONS; - @Before - public void setup() { + @Override + public void setUp() throws Exception { + super.setUp(); MockitoAnnotations.openMocks(this); clusterName = new ClusterName("test-cluster"); when(clusterService.getClusterName()).thenReturn(clusterName); diff --git a/src/test/scripts/demo_remote_query.sh b/src/test/scripts/demo_remote_query.sh new file mode 100755 index 00000000..aa8be4ef --- /dev/null +++ b/src/test/scripts/demo_remote_query.sh @@ -0,0 +1,2018 @@ +#!/bin/bash + +# +# 
Copyright OpenSearch Contributors +# SPDX-License-Identifier: Apache-2.0 +# + +# Enhanced Remote Query Experiment Demonstration Script +# +# This script demonstrates the complete integration of remote search with the Search Relevance Workbench +# experiment framework. It creates an experiment that compares local OpenSearch results with remote Solr +# results using standard Information Retrieval metrics. +# +# The script: +# 1. Sets up identical data in OpenSearch and Solr +# 2. Creates local and remote search configurations +# 3. Creates a query set and judgments from ESCI data +# 4. Runs a PAIRWISE_COMPARISON experiment between local and remote configurations +# 5. Shows IR metrics comparison (NDCG, MAP, MRR, Precision@K) +# 6. Demonstrates query template transformation and response mapping +# 7. Executes search comparisons between local and remote systems +# +# This script executes live remote queries via Search Relevance plugin endpoints +# (/_plugins/_search_relevance/remote_search_configurations and /remote_search/execute) +# against a running Solr instance and demonstrates complete experiment integration. 
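Two bash idioms carry most of the script below: substituting the `${queryText}`/`${size}` placeholders of a remote query template, and splitting curl's `-w "%{http_code}"` output into a status code and a body. A minimal standalone sketch (the `q=…` template and the sample response string are illustrative, not real plugin output; real template rendering must also URL-encode the query text):

```shell
#!/bin/bash
# Sketch of the two idioms the demo script leans on (assumed behavior,
# not the plugin's actual implementation).

# 1) Query-template substitution: replace ${queryText} and ${size}.
#    Quoting the pattern variable makes bash treat it literally.
template='q=${queryText}&wt=json&rows=${size}'
placeholder_q='${queryText}'
placeholder_s='${size}'
query_text='red shoes'
size=10
rendered="${template//"$placeholder_q"/$query_text}"
rendered="${rendered//"$placeholder_s"/$size}"
echo "$rendered"

# 2) Status/body splitting: -w "%{http_code}" appends the 3-digit status
#    after the body, so the last three characters are the code.
resp='{"acknowledged":true}200'   # shape of what curl would capture
http_code="${resp: -3}"           # last 3 chars
body="${resp%???}"                # everything except the last 3 chars
echo "$http_code"
echo "$body"
```

Folding the status into one captured string avoids a second request just to learn the response code, which is why the pattern recurs in every helper below.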
+ +set -o pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +# Configuration +OPENSEARCH_URL="http://localhost:9200" +SOLR_URL="http://localhost:8983" +SOLR_CORE="ecommerce" +ECOMMERCE_DATA_FILE="esci_us_opensearch-2025-06-06.json" +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# Experiment configuration +LOCAL_CONFIG_NAME="opensearch_local" +REMOTE_CONFIG_NAME="solr_remote" +QUERY_SET_NAME="esci_demo_queries" +JUDGMENT_SET_NAME="esci_demo_judgments" +EXPERIMENT_NAME="local_vs_remote_comparison" + +# Command-line options +SKIP_CLEANUP=false +SKIP_DATA_LOADING=false + +# Global variables for cleanup tracking +SOLR_CONTAINER_STARTED=false +TEMP_FILES=() +CREATED_RESOURCES=() + +# Parse command-line arguments +parse_arguments() { + while [[ $# -gt 0 ]]; do + case $1 in + --no-cleanup) + SKIP_CLEANUP=true + log_info "Skip cleanup mode enabled - Solr will remain running and data will be preserved" + shift + ;; + --skip-data-loading) + SKIP_DATA_LOADING=true + SKIP_CLEANUP=true # If skipping data loading, also skip cleanup + log_info "Skip data loading mode enabled - will skip data setup and cleanup steps" + shift + ;; + -h|--help) + show_usage + exit 0 + ;; + *) + log_error "Unknown option: $1" + show_usage + exit 1 + ;; + esac + done +} + +# Show usage information +show_usage() { + echo "Usage: $0 [OPTIONS]" + echo "" + echo "Enhanced Remote Query Experiment Demonstration Script" + echo "" + echo "Options:" + echo " --no-cleanup Skip cleanup - leave all resources for inspection" + echo " --skip-data-loading Skip data loading and setup steps" + echo " -h, --help Show this help message" + echo "" + echo "This script demonstrates the complete integration of remote search with" + echo "the Search Relevance Workbench experiment framework." 
+} + +# Logging functions +log_info() { + >&2 echo -e "${BLUE}[INFO]${NC} $1" +} + +log_success() { + >&2 echo -e "${GREEN}[SUCCESS]${NC} $1" +} + +log_warning() { + >&2 echo -e "${YELLOW}[WARNING]${NC} $1" +} + +log_error() { + >&2 echo -e "${RED}[ERROR]${NC} $1" +} + +log_section() { + >&2 echo + >&2 echo -e "${CYAN}=== $1 ===${NC}" + >&2 echo +} + +# Remote query helpers (use plugin REST endpoints instead of hardcoded Solr calls) +REMOTE_CONFIG_ID="solr_demo" +# Index where remote search configurations are stored (must exist before POST) +REMOTE_SEARCH_CONFIG_INDEX="search-relevance-remote-search-config" + +create_remote_config_solr() { + log_info "Creating/validating remote search configuration for Solr..." + + local payload + payload=$(cat <<'JSON' +{ + "id": "solr_demo", + "name": "Solr Remote Search", + "description": "Local Solr core via Docker", + "connectionUrl": "http://localhost:8983/solr/ecommerce/select", + "queryTemplate": "defType=edismax&q=${queryText}&q.op=OR&mm=2<=1&qf=title^0.2+category^0.5+bullet_points^0.2+description^3.0+brand^5.0+color^2.0&pf=brand^8+description^5&ps=1&tie=0.0&fq=brand:AVACRAFT&wt=json&rows=${size}", + "maxRequestsPerSecond": 10, + "maxConcurrentRequests": 5, + "cacheDurationMinutes": 60 +} +JSON +) + + # Create (idempotent) remote config + local resp http_code body + resp=$(curl -s -w "%{http_code}" -X POST "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations" \ + -H "Content-Type: application/json" \ + -d "$payload") + http_code="${resp: -3}" + body="${resp%???}" + + if [[ "$http_code" == "200" || "$http_code" == "201" || "$http_code" == "409" ]]; then + log_info "Remote config create returned HTTP $http_code" + else + log_error "Failed to create remote search configuration (HTTP $http_code)" + if command -v jq &> /dev/null; then + echo "$body" | jq '.' 
+ else + echo "$body" + fi + return 1 + fi + + # Verify remote config exists + resp=$(curl -s -w "%{http_code}" "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations/$REMOTE_CONFIG_ID") + http_code="${resp: -3}" + body="${resp%???}" + if [[ "$http_code" != "200" ]]; then + log_error "Failed to fetch remote search configuration '$REMOTE_CONFIG_ID' (HTTP $http_code)" + if command -v jq &> /dev/null; then + echo "$body" | jq '.' + else + echo "$body" + fi + return 1 + fi + + log_success "Remote search configuration '$REMOTE_CONFIG_ID' is available" + return 0 +} + +remote_search_execute() { + # Usage: remote_search_execute "<query text>" [size] + local query_text="$1" + local size="${2:-3}" + + local payload + payload=$(cat <<JSON +{ + "remoteSearchConfigurationId": "$REMOTE_CONFIG_ID", + "queryText": "$query_text", + "size": $size +} +JSON +) + + # Execute the query via the plugin endpoint; payload field names are the + # ones this demo assumes - adjust to match the deployed plugin version + local resp http_code body + resp=$(curl -s -w "%{http_code}" -X POST "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search/execute" \ + -H "Content-Type: application/json" \ + -d "$payload") + http_code="${resp: -3}" + body="${resp%???}" + + if [[ "$http_code" != "200" ]]; then + log_error "Remote search execution failed (HTTP $http_code)" + if command -v jq &> /dev/null; then + echo "$body" | jq '.' + else + echo "$body" + fi + return 1 + fi + + # Echo body to stdout for caller to consume + echo "$body" + return 0 +} + +remote_endpoints_available() { + # Returns 0 if remote endpoints are available, 1 otherwise. + # We check the GET route and also attempt to enable the workbench setting if we get 403.
+ local resp http_code body + resp=$(curl -s -w "%{http_code}" "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations" || true) + http_code="${resp: -3}" + body="${resp%???}" + + if [[ "$http_code" == "200" ]]; then + return 0 + fi + + if [[ "$http_code" == "403" ]]; then + # Try to enable the Workbench setting and re-check once + if command -v curl >/dev/null 2>&1; then + local payload='{"persistent":{"plugins.search_relevance.workbench_enabled": true}}' + curl -s -X PUT "$OPENSEARCH_URL/_cluster/settings" -H "Content-Type: application/json" -d "$payload" >/dev/null 2>&1 || true + fi + resp=$(curl -s -w "%{http_code}" "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations" || true) + http_code="${resp: -3}" + if [[ "$http_code" == "200" ]]; then + return 0 + fi + fi + + # If we get 400 with "no handler found", the running plugin version likely doesn't include remote endpoints + return 1 +} + + +# Enhanced error handling +handle_error() { + local exit_code=$? + log_error "Script failed with exit code $exit_code" + cleanup + exit $exit_code +} + +trap handle_error ERR + +# Check dependencies +check_dependencies() { + log_info "Checking dependencies..." + + local missing_deps=() + for cmd in docker curl jq; do + if ! command -v $cmd &> /dev/null; then + missing_deps+=("$cmd") + fi + done + + if [ ${#missing_deps[@]} -ne 0 ]; then + log_error "Missing required dependencies: ${missing_deps[*]}" + log_info "Please install the missing dependencies (jq is required to compute metrics) and try again" + exit 1 + fi + + log_success "All required dependencies are available" +} + +# Wait for service to be ready +wait_for_service() { + local url=$1 + local service_name=$2 + local max_attempts=${3:-30} + local attempt=1 + + log_info "Waiting for $service_name to be ready at $url..." 
+ + while [ $attempt -le $max_attempts ]; do + if curl -s --max-time 5 "$url" > /dev/null 2>&1; then + log_success "$service_name is ready" + return 0 + fi + + if [ $((attempt % 5)) -eq 0 ]; then + log_info "Still waiting for $service_name... (attempt $attempt/$max_attempts)" + else + echo -n "." + fi + sleep 2 + ((attempt++)) + done + + echo + log_error "$service_name failed to start within $((max_attempts * 2)) seconds" + return 1 +} + +# Wait for Solr core to be fully ready +wait_for_solr_core() { + local core_name=$1 + local max_attempts=${2:-30} + local attempt=1 + + log_info "Waiting for Solr core '$core_name' to be fully ready..." + + while [ $attempt -le $max_attempts ]; do + local core_status=$(curl -s "$SOLR_URL/solr/admin/cores?action=STATUS&core=$core_name" 2>/dev/null) + + # Check if core exists and is active + if echo "$core_status" | grep -q "\"$core_name\"" && echo "$core_status" | grep -q '"instanceDir"'; then + # Additional check: try to ping the core + if curl -s "$SOLR_URL/solr/$core_name/admin/ping" > /dev/null 2>&1; then + log_success "Solr core '$core_name' is fully ready" + return 0 + fi + fi + + if [ $((attempt % 5)) -eq 0 ]; then + log_info "Still waiting for core '$core_name'... (attempt $attempt/$max_attempts)" + else + echo -n "." + fi + sleep 3 + ((attempt++)) + done + + echo + log_error "Solr core '$core_name' failed to become ready within $((max_attempts * 3)) seconds" + return 1 +} + +# Start Solr container with comprehensive error handling +start_solr() { + log_info "Starting Solr container..." + + # Check if Docker is running + if ! docker info > /dev/null 2>&1; then + log_error "Docker is not running. Please start Docker and try again." + exit 1 + fi + + # Check if Solr container is already running + if docker ps --format "table {{.Names}}" | grep -q "^solr_demo$"; then + log_warning "Solr container already running, stopping it first..." 
+ docker stop solr_demo || true + docker rm solr_demo || true + sleep 2 + fi + + # Remove any existing container with the same name + if docker ps -a --format "table {{.Names}}" | grep -q "^solr_demo$"; then + log_info "Removing existing Solr container..." + docker rm solr_demo || true + fi + + # Check if port 8983 is available (use lsof on macOS if netstat fails) + if command -v netstat &> /dev/null && netstat -tuln 2>/dev/null | grep -q ":8983 "; then + log_error "Port 8983 is already in use. Please stop the service using this port." + exit 1 + elif command -v lsof &> /dev/null && lsof -i :8983 &> /dev/null; then + log_error "Port 8983 is already in use. Please stop the service using this port." + exit 1 + fi + + # Start Solr container + log_info "Starting new Solr container..." + if docker run -d \ + --name solr_demo \ + -p 8983:8983 \ + solr:9 \ + solr-precreate $SOLR_CORE; then + SOLR_CONTAINER_STARTED=true + log_success "Solr container started successfully" + else + log_error "Failed to start Solr container" + exit 1 + fi + + # Wait for Solr to be ready + if ! wait_for_service "$SOLR_URL/solr/admin/cores" "Solr" 60; then + log_error "Solr failed to start properly" + exit 1 + fi + + # Wait for the specific core to be fully ready + if ! wait_for_solr_core "$SOLR_CORE" 60; then + log_error "Solr core '$SOLR_CORE' failed to initialize properly" + exit 1 + fi +} + +# Configure Solr schema for ESCI data +configure_solr_schema() { + log_info "Configuring Solr schema for ESCI data..." + + # Double-check that core is ready and responsive + local core_status=$(curl -s "$SOLR_URL/solr/admin/cores?action=STATUS&core=$SOLR_CORE") + if ! echo "$core_status" | grep -q "\"$SOLR_CORE\""; then + log_error "Solr core '$SOLR_CORE' not found" + exit 1 + fi + + # Additional wait to ensure core is fully initialized + log_info "Ensuring core is fully initialized..." 
+ sleep 5 + + # Add field definitions for ESCI data structure - one field at a time + local fields=("title:text_general" "category:text_general" "bullet_points:text_general" "description:text_general" "brand:string" "color:string") + + for field_def in "${fields[@]}"; do + local field_name="${field_def%:*}" + local field_type="${field_def#*:}" + + log_info "Adding field: $field_name ($field_type)" + + local schema_update='{ + "add-field": { + "name": "'$field_name'", + "type": "'$field_type'", + "stored": true, + "indexed": true + } + }' + + local response=$(curl -s -w "%{http_code}" -X POST "$SOLR_URL/solr/$SOLR_CORE/schema" \ + -H "Content-Type: application/json" \ + -d "$schema_update") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|400)$ ]]; then + # 400 is acceptable as field might already exist + log_info "Field $field_name added successfully (or already exists)" + else + log_warning "Failed to add field $field_name (HTTP $http_code): $response_body" + fi + + sleep 1 + done + + log_success "Solr schema configuration completed" +} + +# Download data file with error handling +download_data_file() { + if [ ! -f "$ECOMMERCE_DATA_FILE" ]; then + log_info "Downloading ESCI data file..." + local data_url="https://o19s-public-datasets.s3.amazonaws.com/esci_us_opensearch-2025-06-06.json" + + if command -v wget &> /dev/null; then + if ! wget -q --timeout=30 --tries=3 "$data_url"; then + log_error "Failed to download data file with wget" + exit 1 + fi + elif command -v curl &> /dev/null; then + if ! curl -s --max-time 30 --retry 3 -O "$data_url"; then + log_error "Failed to download data file with curl" + exit 1 + fi + else + log_error "Neither wget nor curl available for downloading data file" + exit 1 + fi + + # Verify file was downloaded and is not empty + if [ ! 
-s "$ECOMMERCE_DATA_FILE" ]; then + log_error "Downloaded data file is empty or corrupted" + exit 1 + fi + + log_success "Data file downloaded successfully" + else + log_info "Data file already exists, skipping download" + fi +} + +# Build a tiny NDJSON subset from judgments (keeps index+doc pairs) +build_subset_from_judgments() { + local judgments_file="$1" + local subset_file="esci_us_subset.ndjson" + + # Fast path: if subset already exists and is non-empty, skip rebuild + if [ -s "$subset_file" ]; then + log_info "Subset file already exists ($subset_file); skipping rebuild" + echo "$subset_file" + return 0 + fi + + log_info "Building subset NDJSON from judgments at $judgments_file" + + if [ ! -f "$judgments_file" ]; then + log_error "Judgments file not found: $judgments_file" + return 1 + fi + + # Extract unique docIds + local docids_file + docids_file=$(mktemp) + TEMP_FILES+=("$docids_file") + if ! jq -r '.judgmentRatings[].ratings[].docId' "$judgments_file" | sort -u > "$docids_file"; then + log_error "Failed to extract docIds from judgments" + return 1 + fi + + # Verify source data is present + if [ ! -f "$ECOMMERCE_DATA_FILE" ]; then + log_error "Source data file not found for subset extraction: $ECOMMERCE_DATA_FILE" + return 1 + fi + + # Build subset using awk membership check (portable on macOS/BSD awk) + awk -v idsfile="$docids_file" ' + BEGIN { + # Load docIds into a map + while ((getline id < idsfile) > 0) { + ids[id] = 1 + } + prev = "" + } + # Remember index action lines to pair with next doc + /^[[:space:]]*{"index"/ { + prev = $0 + next + } + { + # Check if this line contains any of our target docIds + found = 0 + for (id in ids) { + if (index($0, id)) { + found = 1 + break + } + } + if (found) { + if (prev != "") print prev + print $0 + } + prev = "" + } + ' "$ECOMMERCE_DATA_FILE" > "$subset_file" + + if [ ! 
-s "$subset_file" ]; then + log_error "Subset file is empty; no matching documents found" + return 1 + fi + + log_success "Subset built: $subset_file ($(wc -l < "$subset_file") lines)" + echo "$subset_file" +} + +# Transform OpenSearch NDJSON to Solr JSON format +transform_data_for_solr() { + log_info "Transforming ESCI data for Solr..." + + download_data_file + + # Transform NDJSON to Solr JSON format + local solr_data_file="esci_us_solr.json" + TEMP_FILES+=("$solr_data_file") + + log_info "Converting data format..." + + # Create Solr-compatible JSON + echo '{"add": [' > "$solr_data_file" + + # Process the NDJSON file and convert to Solr format + local first_doc=true + local doc_count=0 + + while IFS= read -r line; do + # Skip index lines (they start with {"index":) + if [[ $line == *'"index"'* ]]; then + continue + fi + + # Validate JSON line + # Skip empty lines + if [ -z "$line" ]; then + continue + fi + + if ! echo "$line" | jq empty 2>/dev/null; then + log_warning "Skipping invalid JSON line" + continue + fi + + # Add comma separator for all but first document + if [ "$first_doc" = false ]; then + echo "," >> "$solr_data_file" + fi + first_doc=false + + # Transform the document with error handling + if command -v jq &> /dev/null; then + if ! 
echo "$line" | jq '{ + "doc": { + "id": (.asin // .id // "unknown"), + "title": (.title // ""), + "category": (if .category | type == "array" then .category | join(" > ") else (.category // "") end), + "bullet_points": (.bullet_points // .bullets // ""), + "description": (.description // ""), + "brand": (.brand // ""), + "color": (.color // "") + } + }' >> "$solr_data_file" 2>/dev/null; then + log_warning "Failed to transform document, skipping" + continue + fi + else + # Fallback transformation without jq (basic sed/awk approach) + # This is a simplified transformation that extracts basic fields + local id=$(echo "$line" | sed -n 's/.*"asin":"\([^"]*\)".*/\1/p') + if [ -z "$id" ]; then + id=$(echo "$line" | sed -n 's/.*"id":"\([^"]*\)".*/\1/p') + fi + local title=$(echo "$line" | sed -n 's/.*"title":"\([^"]*\)".*/\1/p' | sed 's/\\/\\\\/g' | sed 's/"/\\"/g') + local brand=$(echo "$line" | sed -n 's/.*"brand":"\([^"]*\)".*/\1/p' | sed 's/\\/\\\\/g' | sed 's/"/\\"/g') + local color=$(echo "$line" | sed -n 's/.*"color":"\([^"]*\)".*/\1/p' | sed 's/\\/\\\\/g' | sed 's/"/\\"/g') + + if [ -n "$id" ]; then + cat >> "$solr_data_file" << EOF +{ + "doc": { + "id": "$id", + "title": "$title", + "category": "", + "bullet_points": "", + "description": "", + "brand": "$brand", + "color": "$color" + } +} +EOF + else + log_warning "Failed to extract document ID, skipping" + continue + fi + fi + + ((doc_count++)) + + done < "$ECOMMERCE_DATA_FILE" + + echo ']}' >> "$solr_data_file" + + if [ $doc_count -eq 0 ]; then + log_error "No documents were successfully transformed" + return 1 + fi + + log_success "Data transformation completed: $solr_data_file ($doc_count documents)" + echo "$solr_data_file" +} + +# Load data into Solr +load_data_to_solr() { + local solr_data_file=$1 + + log_info "Loading data into Solr..." + + # Verify file exists and is not empty + if [ ! 
-s "$solr_data_file" ]; then + log_error "Solr data file is missing or empty" + exit 1 + fi + + # Ensure a clean core before loading to avoid duplicate counts + curl -s -X POST "$SOLR_URL/solr/$SOLR_CORE/update?commit=true" \ + -H "Content-Type: application/json" \ + -d '{"delete":{"query":"*:*"}}' > /dev/null 2>&1 || true + + local response=$(curl -s -w "%{http_code}" -X POST "$SOLR_URL/solr/$SOLR_CORE/update?commit=true&overwrite=true" \ + -H "Content-Type: application/json" \ + -d @"$solr_data_file") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [ "$http_code" != "200" ]; then + log_error "Failed to load data into Solr (HTTP $http_code)" + echo "Response: $response_body" + exit 1 + fi + + # Wait a moment for commit to complete + sleep 2 + + # Verify data was loaded + local doc_count_response=$(curl -s "$SOLR_URL/solr/$SOLR_CORE/select?q=*:*&rows=0") + if command -v jq &> /dev/null; then + local doc_count=$(echo "$doc_count_response" | jq -r '.response.numFound // 0') + else + local doc_count=$(echo "$doc_count_response" | grep -o '"numFound":[0-9]*' | cut -d: -f2 || echo "0") + fi + + if [ "$doc_count" -eq 0 ]; then + log_error "No documents found in Solr after loading" + exit 1 + fi + + log_success "Loaded $doc_count documents into Solr" +} + +# Setup OpenSearch data +setup_opensearch_data() { + log_info "Setting up OpenSearch data..." + + # Wait for OpenSearch to be ready + if ! wait_for_service "$OPENSEARCH_URL" "OpenSearch" 30; then + log_error "OpenSearch is not available at $OPENSEARCH_URL" + log_info "Start OpenSearch and install the Search Relevance plugin, then re-run this script." + exit 1 + fi + + # Check if search relevance plugin is available + local plugins_response=$(curl -s "$OPENSEARCH_URL/_cat/plugins") + if ! 
echo "$plugins_response" | grep -q "search-relevance"; then + log_error "Search Relevance plugin is not installed or enabled" + log_info "Please ensure the plugin is installed and the cluster setting is enabled:" + log_info "PUT /_cluster/settings" + log_info '{"persistent": {"plugins.search_relevance.workbench_enabled": true}}' + exit 1 + fi + + # Ensure Workbench is enabled (required for Search Relevance REST APIs) + local settings_payload='{"persistent":{"plugins.search_relevance.workbench_enabled": true}}' + local set_resp set_code + set_resp=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_cluster/settings" -H "Content-Type: application/json" -d "$settings_payload" || true) + set_code="${set_resp: -3}" + if [[ "$set_code" != "200" ]]; then + log_warning "Failed to enable Workbench setting (HTTP $set_code); remote endpoints may be unavailable." + else + log_info "Workbench setting enabled" + fi + + # Clean up existing data + log_info "Cleaning up existing OpenSearch data..." + curl -s -X DELETE "$OPENSEARCH_URL/ecommerce" > /dev/null 2>&1 || true + curl -s -X DELETE "$OPENSEARCH_URL/search-relevance-*" > /dev/null 2>&1 || true + curl -s -X DELETE "$OPENSEARCH_URL/.plugins-search-relevance-*" > /dev/null 2>&1 || true + + sleep 2 + + download_data_file + + # Load ESCI data into OpenSearch + log_info "Loading data into OpenSearch ecommerce index..." 
+ + # Load data in smaller chunks for reliability + local chunk_size=100 + local total_lines + total_lines=$(wc -l < "$ECOMMERCE_DATA_FILE" 2>/dev/null || echo "1000") + # Load the entire dataset for parity with Solr + local max_lines=$total_lines + local chunks=$(( (max_lines + chunk_size - 1) / chunk_size )) + + for (( i=0; i<chunks; i++ )); do + local start_line=$(( i * chunk_size + 1 )) + local end_line=$(( (i + 1) * chunk_size )) + local chunk_data + chunk_data=$(sed -n "${start_line},${end_line}p" "$ECOMMERCE_DATA_FILE" 2>/dev/null || echo "") + if [ -z "$chunk_data" ]; then + log_warning "No data in chunk $((i+1)), skipping" + continue + fi + + local response=$(echo "$chunk_data" | curl -s -w "%{http_code}" -X POST "$OPENSEARCH_URL/ecommerce/_bulk" \ + -H 'Content-Type: application/x-ndjson' \ + --data-binary @-) + + local http_code="${response: -3}" + if [ "$http_code" != "200" ]; then + log_warning "Failed to load chunk $((i+1)) (HTTP $http_code)" + fi + + sleep 1 + done + + # Refresh index + curl -s -X POST "$OPENSEARCH_URL/ecommerce/_refresh" > /dev/null + + # Verify data was loaded + local doc_count_response=$(curl -s "$OPENSEARCH_URL/ecommerce/_count") + if command -v jq &> /dev/null; then + local doc_count=$(echo "$doc_count_response" | jq -r '.count // 0') + else + local doc_count=$(echo "$doc_count_response" | grep -o '"count":[0-9]*' | cut -d: -f2 || echo "0") + fi + + if [ "$doc_count" -eq 0 ]; then + log_error "No documents found in OpenSearch after loading" + exit 1 + fi + + log_success "Loaded $doc_count documents into OpenSearch" +} + +# Create local search configuration +create_local_search_config() { + log_info "Creating local search configuration..."
+ + local config_payload=$(cat <<'JSON' +{ + "name": "opensearch_local", + "index": "ecommerce", + "query": "{\"query\": {\"multi_match\": {\"query\": \"%SearchText%\", \"fields\": [\"title^2\", \"category\", \"bullet_points\", \"description\", \"brand\", \"color\"]}}}", + "searchPipeline": "" +} +JSON +) + + local response=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_plugins/_search_relevance/search_configurations" \ + -H "Content-Type: application/json" \ + -d "$config_payload") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|201)$ ]]; then + local config_id=$(echo "$response_body" | jq -r '.search_configuration_id // .id') + CREATED_RESOURCES+=("search_config:$config_id") + log_success "Local search configuration created with ID: $config_id" + echo "$config_id" + else + log_error "Failed to create local search configuration (HTTP $http_code)" + echo "$response_body" | jq '.' || echo "$response_body" + exit 1 + fi +} + +# Create remote search configuration for experiments +create_remote_search_config_for_experiment() { + log_info "Creating remote search configuration for experiment..." 
+ + local config_payload=$(cat <<'JSON' +{ + "id": "solr_remote", + "name": "Solr Remote Search", + "description": "Remote Solr search configuration for experiment comparison", + "connectionUrl": "http://localhost:8983/solr/ecommerce/select", + "queryTemplate": "defType=edismax&q=${queryText}&q.op=OR&mm=2<=1&qf=title^0.2+category^0.5+bullet_points^0.2+description^3.0+brand^5.0+color^2.0&pf=brand^8+description^5&ps=1&tie=0.0&fq=brand:AVACRAFT&wt=json&rows=${size}", + "maxRequestsPerSecond": 10, + "maxConcurrentRequests": 5, + "cacheDurationMinutes": 60 +} +JSON +) + + local response=$(curl -s -w "%{http_code}" -X POST "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations" \ + -H "Content-Type: application/json" \ + -d "$config_payload") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|201|409)$ ]]; then + CREATED_RESOURCES+=("remote_config:solr_remote") + log_success "Remote search configuration created/updated: solr_remote" + echo "solr_remote" + else + log_error "Failed to create remote search configuration (HTTP $http_code)" + echo "$response_body" | jq '.' || echo "$response_body" + exit 1 + fi +} + +# Create a second remote search configuration (variant) with an exclusion to force ranking/coverage differences +create_remote_search_config_variant() { + log_info "Creating remote search configuration VARIANT for experiment..." 
+ + local config_payload=$(cat <<'JSON' +{ + "id": "solr_remote_variant", + "name": "Solr Remote Search (Variant)", + "description": "Variant remote Solr search configuration to produce different rankings/coverage", + "connectionUrl": "http://localhost:8983/solr/ecommerce/select", + "queryTemplate": "defType=edismax&q=${queryText}&q.op=OR&mm=2<=1&qf=title^0.2+category^0.5+bullet_points^0.2+description^3.0+brand^5.0+color^2.0&pf=brand^8+description^5&ps=1&tie=0.0&fq=-brand:AVACRAFT&wt=json&rows=${size}", + "maxRequestsPerSecond": 10, + "maxConcurrentRequests": 5, + "cacheDurationMinutes": 60 +} +JSON +) + + local response=$(curl -s -w "%{http_code}" -X POST "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations" \ + -H "Content-Type: application/json" \ + -d "$config_payload") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|201|409)$ ]]; then + CREATED_RESOURCES+=("remote_config:solr_remote_variant") + log_success "Remote search configuration VARIANT created/updated: solr_remote_variant" + echo "solr_remote_variant" + else + log_error "Failed to create remote search configuration VARIANT (HTTP $http_code)" + echo "$response_body" | jq '.' || echo "$response_body" + exit 1 + fi +} + +# Create query set from ImportJudgments.json (aligns with judged queries) +create_query_set_from_judgments() { + log_info "Creating query set from ImportJudgments.json" + + local judgments_file="$SCRIPT_DIR/../resources/judgment/ImportJudgments.json" + if [ ! 
-f "$judgments_file" ]; then + log_error "Judgments file not found: $judgments_file" + exit 1 + fi + + # Build unique queries array, map to [{queryText: "..."}] + local queries_json + queries_json=$(jq -c '[.judgmentRatings[].query] | unique | map({queryText: .})' "$judgments_file") + if [ -z "$queries_json" ]; then + log_error "Failed to build queries from judgments" + exit 1 + fi + + local queryset_payload + queryset_payload=$(jq -n \ + --arg name "$QUERY_SET_NAME" \ + --arg description "Queries derived from ImportJudgments.json" \ + --arg sampling "manual" \ + --argjson qs "$queries_json" \ + '{ + name: $name, + description: $description, + sampling: $sampling, + querySetQueries: $qs + }') + + local response + response=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_plugins/_search_relevance/query_sets" \ + -H "Content-Type: application/json" \ + -d "$queryset_payload") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|201)$ ]]; then + local queryset_id + queryset_id=$(echo "$response_body" | jq -r '.query_set_id // .id') + CREATED_RESOURCES+=("queryset:$queryset_id") + log_success "Query set created with ID: $queryset_id" + echo "$queryset_id" + else + log_error "Failed to create query set (HTTP $http_code)" + echo "$response_body" | jq '.' || echo "$response_body" + exit 1 + fi +} + +# Create query set from ESCI data +create_query_set() { + log_info "Creating query set from ESCI data..." + + local queryset_file="$SCRIPT_DIR/../data-esci/esci_us_queryset.json" + if [ ! -f "$queryset_file" ]; then + log_error "Query set file not found: $queryset_file" + exit 1 + fi + + # Extract first 10 query objects and create payload with correct schema + log_info "Debug: Extracting query objects from $queryset_file" + local queries_obj + queries_obj=$(jq '.querySetQueries[:10]' "$queryset_file") + log_info "Debug: Extracted querySetQueries JSON: ${queries_obj:0:100}..." 
+ + local queryset_payload + queryset_payload=$(jq -n \ + --arg name "$QUERY_SET_NAME" \ + --arg description "ESCI demo queries for local vs remote comparison" \ + --arg sampling "manual" \ + --argjson qs "$queries_obj" \ + '{ + name: $name, + description: $description, + sampling: $sampling, + querySetQueries: $qs + }') + + log_info "Debug: Created payload: ${queryset_payload:0:200}..." + + local response=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_plugins/_search_relevance/query_sets" \ + -H "Content-Type: application/json" \ + -d "$queryset_payload") + + local http_code="${response: -3}" + local response_body="${response%???}" + + if [[ "$http_code" =~ ^(200|201)$ ]]; then + local queryset_id=$(echo "$response_body" | jq -r '.query_set_id // .id') + CREATED_RESOURCES+=("queryset:$queryset_id") + log_success "Query set created with ID: $queryset_id" + echo "$queryset_id" + else + log_error "Failed to create query set (HTTP $http_code)" + echo "$response_body" | jq '.' || echo "$response_body" + exit 1 + fi +} + +# Create judgment set from ESCI data +create_judgment_set() { + log_info "Creating judgment set from ESCI data..." + + local judgments_file="$SCRIPT_DIR/../resources/judgment/ImportJudgments.json" + if [ ! 
-f "$judgments_file" ]; then
+        log_error "Judgments file not found: $judgments_file"
+        exit 1
+    fi
+
+    # Use existing judgments structure
+    local judgments_payload=$(cat "$judgments_file")
+
+    local response=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_plugins/_search_relevance/judgments" \
+        -H "Content-Type: application/json" \
+        -d "$judgments_payload")
+
+    local http_code="${response: -3}"
+    local response_body="${response%???}"
+
+    if [[ "$http_code" =~ ^(200|201)$ ]]; then
+        local judgment_id=$(echo "$response_body" | jq -r '.judgment_id // .id')
+        CREATED_RESOURCES+=("judgment:$judgment_id")
+        log_success "Judgment set created with ID: $judgment_id"
+        echo "$judgment_id"
+    else
+        log_error "Failed to create judgment set (HTTP $http_code)"
+        echo "$response_body" | jq '.' || echo "$response_body"
+        exit 1
+    fi
+}
+
+# Create pairwise comparison experiment
+create_experiment() {
+    local local_config_id=$1
+    local remote_config_id=$2
+    local queryset_id=$3
+    local judgment_id=$4
+
+    log_info "Creating pairwise comparison experiment..."
+
+    # Experiment payload: compare the local and remote search configurations
+    # over the shared query set, scored against the shared judgment set
+    local experiment_payload
+    experiment_payload=$(cat <<EOF
+{
+    "querySetId": "$queryset_id",
+    "searchConfigurationList": ["$local_config_id", "$remote_config_id"],
+    "judgmentList": ["$judgment_id"],
+    "size": 10,
+    "type": "PAIRWISE_COMPARISON"
+}
+EOF
+)
+
+    local response
+    response=$(curl -s -w "%{http_code}" -X PUT "$OPENSEARCH_URL/_plugins/_search_relevance/experiments" \
+        -H "Content-Type: application/json" \
+        -d "$experiment_payload")
+
+    local http_code="${response: -3}"
+    local response_body="${response%???}"
+
+    if [[ "$http_code" =~ ^(200|201)$ ]]; then
+        local experiment_id
+        experiment_id=$(echo "$response_body" | jq -r '.experiment_id // .id')
+        CREATED_RESOURCES+=("experiment:$experiment_id")
+        log_success "Experiment created with ID: $experiment_id"
+        echo "$experiment_id"
+    else
+        log_error "Failed to create experiment (HTTP $http_code)"
+        echo "$response_body" | jq '.' || echo "$response_body"
+        exit 1
+    fi
+}
+
+# Show experiment results
+show_experiment_results() {
+    local experiment_id=$1
+
+    log_info "Fetching experiment results..."
+
+    local response=$(curl -s "$OPENSEARCH_URL/_plugins/_search_relevance/experiments/$experiment_id")
+
+    if command -v jq &> /dev/null; then
+        echo "$response" | jq '.'
+
+        echo
+        log_info "Experiment Summary:"
+        echo "$response" | jq -r '
+            .hits.hits[0]._source as $s |
+            "Experiment ID: " + ($s.id // "unknown") + "\n" +
+            "Status: " + ($s.status // "unknown") + "\n" +
+            "Type: " + ($s.type // "unknown") + "\n" +
+            "Query Set: " + ($s.querySetId // "unknown") + "\n" +
+            "Search Configurations: " + (($s.searchConfigurationList // []) | join(", "))
+        '
+
+        # Show metrics if available
+        local metrics=$(echo "$response" | jq '.metrics // empty')
+        if [ -n "$metrics" ] && [ "$metrics" != "null" ]; then
+            echo
+            log_info "IR Metrics Comparison:"
+            echo "$metrics" | jq '.'
+ fi + else + echo "$response" + fi +} + +# Fetch and print persisted IR metrics from evaluation_result index +fetch_and_print_metrics() { + local experiment_id=$1 + + log_info "Fetching persisted IR metrics for experiment $experiment_id from evaluation_result index..." + + # Build request payload for evaluation_result documents of this experiment + local query_payload='{"size":1000,"_source":{"includes":["experimentId","metrics.metric","metrics.value"]},"query":{"term":{"experimentId":"'"$experiment_id"'"}}}' + + local resp + resp=$(curl -s -X POST "$OPENSEARCH_URL/search-relevance-evaluation-result/_search" \ + -H "Content-Type: application/json" \ + -d "$query_payload") + + if ! command -v jq >/dev/null 2>&1; then + log_warning "jq not available - showing raw evaluation_result response" + echo "$resp" + return 0 + fi + + # Validate JSON + if ! echo "$resp" | jq empty >/dev/null 2>&1; then + log_warning "Invalid JSON from evaluation_result search" + echo "$resp" + return 0 + fi + + # Total hits (compat for 7.x/8.x) + local total + total=$(echo "$resp" | jq -r '(.hits.total.value // .hits.total // 0)') + + if [[ -z "$total" || "$total" == "0" ]]; then + log_warning "No evaluation results found for experimentId=$experiment_id" + return 0 + fi + + log_info "Aggregated metrics (across all evaluation results):" + echo "$resp" | jq -r ' + [.hits.hits[]?._source] as $docs + | ($docs | length) as $n + | if $n == 0 then empty else + ($docs + | map(.metrics // []) + | add + | group_by(.metric) + | map({metric: .[0].metric, avg: (map(.value // 0) | add / length)}) + | sort_by(.metric) + | map("• " + .metric + ": " + (.avg | tostring)) + )[] + end + ' 2>/dev/null || { + log_warning "Failed to compute aggregated metrics from evaluation_result" + echo "$resp" | jq '.' 
2>/dev/null || echo "$resp" + } + + return 0 +} + +# Compute aggregated metrics as JSON for an experiment (metric -> avg) +compute_aggregated_metrics_json() { + local experiment_id=$1 + + # Build request payload for evaluation_result documents of this experiment + local query_payload='{"size":1000,"_source":{"includes":["experimentId","metrics.metric","metrics.value"]},"query":{"term":{"experimentId":"'"$experiment_id"'"}}}' + + local resp + resp=$(curl -s -X POST "$OPENSEARCH_URL/search-relevance-evaluation-result/_search" \ + -H "Content-Type: application/json" \ + -d "$query_payload") + + # Require jq and valid JSON + if ! command -v jq >/dev/null 2>&1 || ! echo "$resp" | jq empty >/dev/null 2>&1; then + echo '{}' + return 0 + fi + + # Total hits (compat for 7.x/8.x) + local total + total=$(echo "$resp" | jq -r '(.hits.total.value // .hits.total // 0)') + if [[ -z "$total" || "$total" == "0" ]]; then + echo '{}' + return 0 + fi + + echo "$resp" | jq -r ' + [.hits.hits[]?._source.metrics[]?] + | group_by(.metric) + | map({key: .[0].metric, value: (map(.value // 0) | add / length)}) + | from_entries + ' +} + +# Show side-by-side aggregated metrics for two experiments (remote vs local) +show_side_by_side_metrics() { + local remote_experiment_id=$1 + local local_experiment_id=$2 + + if ! command -v jq >/dev/null 2>&1; then + log_warning "jq not available - cannot compute side-by-side metrics" + return 0 + fi + + local remote_metrics local_metrics + remote_metrics=$(compute_aggregated_metrics_json "$remote_experiment_id") + local_metrics=$(compute_aggregated_metrics_json "$local_experiment_id") + + # If both are empty, nothing to compare + if [[ "$remote_metrics" == "{}" && "$local_metrics" == "{}" ]]; then + log_info "No persisted metrics found for either experiment to compare." + return 0 + fi + + # Print comparison lines + jq -n --argjson r "$remote_metrics" --argjson l "$local_metrics" ' + ([$r,$l] | add | keys | unique) as $keys + | $keys + | map("• " + . 
+ ": remote=" + (($r[.] // "N/A") | tostring) + ", local=" + (($l[.] // "N/A") | tostring)) + | .[] + ' +} +show_metrics_comparison() { + local experiment_id=$1 + + log_info "Metrics Comparison (live)" + log_info "========================" + + local response=$(curl -s "$OPENSEARCH_URL/_plugins/_search_relevance/experiments/$experiment_id") + + if command -v jq &> /dev/null; then + # Check if experiment has metrics (from first hit's _source) + local has_metrics=$(echo "$response" | jq -r '.hits.hits[0]._source | has("metrics")') + local status=$(echo "$response" | jq -r '.hits.hits[0]._source.status // "unknown"') + + if [ "$has_metrics" = "true" ]; then + echo "$response" | jq -r ' + (.hits.hits[0]._source) as $s | + if $s.metrics then + "Configuration Comparison:\n" + + (($s.searchConfigurationList // []) | map("• " + .) | join("\n")) + "\n\n" + + "IR Metrics Results:\n" + + ($s.metrics | to_entries | map("• " + .key + ": " + (.value | tostring)) | join("\n")) + else + "No metrics available in experiment results" + end + ' + elif [ "$status" = "COMPLETED" ]; then + log_info "Experiment completed but no metrics found in response" + log_info "This may indicate the experiment type doesn't generate comparative metrics" + elif [ "$status" = "PROCESSING" ]; then + log_info "Experiment is still running - metrics will be available when complete" + elif [ "$status" = "ERROR" ]; then + log_warning "Experiment failed - no metrics available" + echo "$response" | jq -r '.error // "No error details available"' + else + log_warning "Experiment status: $status - metrics may not be available" + fi + else + log_warning "jq not available - showing raw experiment response" + echo "$response" + fi + + echo + log_info "Note: Metrics are computed by the Search Relevance Workbench experiment framework" + log_info "using the ESCI judgment data for NDCG, MAP, MRR, and Precision@K calculations." 
+} + +# Test query template transformation +test_query_template() { + log_info "Testing query template via remote configuration..." + + local query_text="steel" + local opensearch_query='{"query":{"multi_match":{"query":"'$query_text'","fields":["title","category","bullet_points","description","brand","color"]}}}' + + echo + log_info "OpenSearch Query:" + if command -v jq &> /dev/null; then + echo "$opensearch_query" | jq '.' + else + echo "$opensearch_query" + fi + + # Ensure remote config exists on the plugin + if ! create_remote_config_solr; then + log_error "Remote configuration setup failed; cannot validate template." + return 1 + fi + + echo + log_info "Remote configuration details:" + local cfg + cfg=$(curl -s "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations/$REMOTE_CONFIG_ID" || true) + if command -v jq &> /dev/null; then + echo "$cfg" | jq '{id,name,connectionUrl,queryTemplatePresent: (has("queryTemplate") and (.queryTemplate|length>0)), responseTemplatePresent: (has("responseTemplate") and (.responseTemplate|length>0))}' + echo + log_info "Query template preview (with queryText and size substituted):" + local qtpl + qtpl=$(echo "$cfg" | jq -r '.queryTemplate // ""') + if [ -n "$qtpl" ]; then + echo "$qtpl" | sed "s/\${queryText}/$query_text/g" | sed "s/\${size}/10/g" + else + log_warning "No queryTemplate present in remote configuration." + fi + else + echo "$cfg" + fi + + echo + log_success "Remote query template configuration validated" +} + +# Test response template transformation +test_response_template() { + log_info "Testing response template mapping via remote query execution..." + + local query_text="steel" + local size=3 + + log_info "Executing remote search for query='$query_text', size=$size" + local remote_resp + if ! remote_resp=$(remote_search_execute "$query_text" "$size"); then + log_error "Remote response mapping test failed due to execution error." 
+ return 1 + fi + + echo + log_info "Mapped OpenSearch-like Response (from remote):" + if command -v jq &> /dev/null; then + echo "$remote_resp" | jq '.' + else + echo "$remote_resp" + fi + + # Check if the response contains an error + if command -v jq &> /dev/null; then + if echo "$remote_resp" | jq -e '.error' > /dev/null 2>&1; then + echo + log_error "Remote response mapping failed - response contains error" + local error_type=$(echo "$remote_resp" | jq -r '.error.type // "unknown"') + local error_reason=$(echo "$remote_resp" | jq -r '.error.reason // "unknown reason"') + log_error "Error type: $error_type" + log_error "Error reason: $error_reason" + return 1 + fi + else + # Fallback check without jq + if echo "$remote_resp" | grep -q '"error"'; then + echo + log_error "Remote response mapping failed - response contains error" + return 1 + fi + fi + + echo + log_success "Remote response mapping validated through plugin endpoint" +} + +# Demonstrate search comparison +demonstrate_search_comparison() { + log_info "Demonstrating search comparison between OpenSearch (local) and Solr (remote via plugin)..." 
+ + local test_queries=("metal frame" "steel" "keyboard" "iphone") + + for query in "${test_queries[@]}"; do + log_info "Testing query: '$query'" + + # OpenSearch local baseline + log_info "OpenSearch results:" + local os_query='{ + "query": { + "multi_match": { + "query": "'$query'", + "fields": ["title^2", "category", "bullet_points", "description", "brand", "color"] + } + }, + "size": 3 + }' + + local os_response + os_response=$(curl -s -X POST "$OPENSEARCH_URL/ecommerce/_search" \ + -H "Content-Type: application/json" \ + -d "$os_query") + + # Enhanced null/empty response handling for OpenSearch + if [ -z "$os_response" ] || [ "$os_response" = "null" ]; then + echo " (No response from OpenSearch)" + elif command -v jq &> /dev/null; then + # Validate JSON before processing with jq + if echo "$os_response" | jq empty 2>/dev/null; then + # Check if response has hits structure + if echo "$os_response" | jq -e '.hits.hits' >/dev/null 2>&1; then + echo "$os_response" | jq -r '.hits.hits[] | " - " + (._source.title // "No title") + " (Score: " + (._score | tostring) + ")"' 2>/dev/null | head -3 || echo " (Error processing OpenSearch results)" + else + echo " (OpenSearch response missing hits structure)" + fi + else + echo " (Invalid JSON response from OpenSearch)" + echo " Debug: Response first 100 chars: ${os_response:0:100}" + fi + else + echo " (JSON formatting not available without jq)" + fi + + # Remote (Solr via plugin) + log_info "Remote (Solr via plugin) results:" + local remote_resp + if remote_resp=$(remote_search_execute "$query" 3); then + # Enhanced null/empty response handling for remote + if [ -z "$remote_resp" ] || [ "$remote_resp" = "null" ]; then + echo " (No response from remote search)" + elif command -v jq &> /dev/null; then + # Validate JSON before processing with jq + if echo "$remote_resp" | jq empty 2>/dev/null; then + # Check if the response contains an error + if echo "$remote_resp" | jq -e '.error' > /dev/null 2>&1; then + log_warning 
"Remote execution returned error for query '$query'" + local error_type=$(echo "$remote_resp" | jq -r '.error.type // "unknown"') + echo " Error: $error_type" + elif echo "$remote_resp" | jq -e '.hits.hits' >/dev/null 2>&1; then + # Handle both mapped OpenSearch format and potential Solr array formats + echo "$remote_resp" | jq -r ' + .hits.hits[] | + " - " + ( + (._source.title | if type=="array" then .[0] else . end) // + (.title | if type=="array" then .[0] else . end) // + (._source.title) // (.title) // "No title" + ) + " (Score: " + (._score | tostring) + ")" + ' 2>/dev/null | head -3 || echo " (Error processing remote results)" + else + echo " (Remote response missing hits structure)" + fi + else + echo " (Invalid JSON response from remote search)" + echo " Debug: Response first 100 chars: ${remote_resp:0:100}" + fi + else + # Fallback check without jq + if echo "$remote_resp" | grep -q '"error"'; then + log_warning "Remote execution returned error for query '$query'" + echo " Error detected in response" + else + echo " (JSON formatting not available without jq)" + fi + fi + else + log_warning "Remote execution failed for query '$query'" + fi + + echo + done +} + +# Demonstrate Solr-only search (when OpenSearch is not available) +demonstrate_solr_only_search() { + log_info "Demonstrating Solr search directly (OpenSearch remote endpoints unavailable)" + log_warning "Showing Solr results by querying Solr API without plugin integration" + + local test_queries=("metal frame" "steel" "keyboard" "iphone") + + for query in "${test_queries[@]}"; do + log_info "Testing query: '$query'" + + # Direct Solr query + local solr_params + solr_params="q=title:(${query}) OR category:(${query}) OR bullet_points:(${query}) OR description:(${query}) OR brand:(${query}) OR color:(${query})&wt=json&rows=3" + log_info "Solr results:" + local solr_resp + solr_resp=$(curl -s "$SOLR_URL/solr/$SOLR_CORE/select?${solr_params}" || true) + if command -v jq &> /dev/null; then + echo 
"$solr_resp" | jq -r '.response.docs[]? | " - " + (.title // "No title")' | head -3 + else + echo " (JSON formatting not available without jq)" + fi + + echo + done + + log_info "Solr-only demonstration notes:" + log_info "• Solr is queried directly via its select handler" + log_info "• To use plugin-based remote execution, run with a Search Relevance plugin build that includes remote endpoints" +} + +# Show remote search configuration concept +show_remote_search_concept() { + log_info "Remote Search Configuration (Live)" + log_info "==================================" + + local cfg + cfg=$(curl -s "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations/$REMOTE_CONFIG_ID" || true) + + if [ -z "$cfg" ]; then + log_warning "Could not fetch remote configuration '$REMOTE_CONFIG_ID'." + return 0 + fi + + if command -v jq &> /dev/null; then + echo "$cfg" | jq '{ + id, name, connectionUrl, + maxRequestsPerSecond, maxConcurrentRequests, cacheDurationMinutes, + queryTemplatePresent: (has("queryTemplate") and (.queryTemplate|length>0)), + responseTemplatePresent: (has("responseTemplate") and (.responseTemplate|length>0)) + }' + else + echo "$cfg" + fi + + echo + log_info "How this demo used the configuration:" + log_info "• Executed remote queries using remote_search/execute" + log_info "• Query template transformed OpenSearch intent to Solr parameters" + log_info "• Response template normalized Solr results to OpenSearch-like format" +} + + +# Data parity verification (counts) +verify_data_parity() { + log_section "Verifying Data Parity Between OpenSearch and Solr" + + # OpenSearch count + local os_count_resp os_count + os_count_resp=$(curl -s "$OPENSEARCH_URL/ecommerce/_count" || true) + if command -v jq &> /dev/null && echo "$os_count_resp" | jq empty >/dev/null 2>&1; then + os_count=$(echo "$os_count_resp" | jq -r '.count // 0') + else + os_count=$(echo "$os_count_resp" | grep -o '"count":[0-9]*' | cut -d: -f2 || echo "0") + fi + + # Solr count + local 
solr_count_resp solr_count + solr_count_resp=$(curl -s "$SOLR_URL/solr/$SOLR_CORE/select?q=*:*&rows=0" || true) + if command -v jq &> /dev/null && echo "$solr_count_resp" | jq empty >/dev/null 2>&1; then + solr_count=$(echo "$solr_count_resp" | jq -r '.response.numFound // 0') + else + solr_count=$(echo "$solr_count_resp" | grep -o '"numFound":[0-9]*' | cut -d: -f2 || echo "0") + fi + + log_info "OpenSearch document count: $os_count" + log_info "Solr document count: $solr_count" + + if [ -n "$os_count" ] && [ -n "$solr_count" ] && [ "$os_count" -eq "$solr_count" ]; then + log_success "Data parity check passed: counts match" + else + log_warning "Data parity check failed: counts differ" + fi +} + +# Cleanup function +cleanup() { + if [ "$SKIP_CLEANUP" = true ]; then + log_info "Cleanup skipped - Solr container and data preserved for iteration" + log_info "To manually clean up later:" + log_info " docker stop solr_demo && docker rm solr_demo" + + # Show created resources for inspection + if [ ${#CREATED_RESOURCES[@]} -gt 0 ]; then + log_info "Created resources for inspection:" + for resource in "${CREATED_RESOURCES[@]}"; do + echo " - $resource" + done + fi + + # Still clean up temporary files + for file in "${TEMP_FILES[@]}"; do + if [ -f "$file" ]; then + rm -f "$file" + fi + done + + log_info "Access points for continued testing:" + log_info "• OpenSearch: $OPENSEARCH_URL" + log_info "• Solr Admin: $SOLR_URL/solr/#/$SOLR_CORE" + log_info "• OpenSearch ecommerce index: $OPENSEARCH_URL/ecommerce/_search" + log_info "• Search Relevance API: $OPENSEARCH_URL/_plugins/_search_relevance/" + return 0 + fi + + log_info "Cleaning up..." 
+ + # Clean up OpenSearch resources + for resource in "${CREATED_RESOURCES[@]}"; do + local type="${resource%%:*}" + local id="${resource#*:}" + + case "$type" in + "experiment") + curl -s -X DELETE "$OPENSEARCH_URL/_plugins/_search_relevance/experiments/$id" > /dev/null 2>&1 || true + ;; + "search_config") + curl -s -X DELETE "$OPENSEARCH_URL/_plugins/_search_relevance/search_configurations/$id" > /dev/null 2>&1 || true + ;; + "remote_config") + curl -s -X DELETE "$OPENSEARCH_URL/_plugins/_search_relevance/remote_search_configurations/$id" > /dev/null 2>&1 || true + ;; + "queryset") + curl -s -X DELETE "$OPENSEARCH_URL/_plugins/_search_relevance/query_sets/$id" > /dev/null 2>&1 || true + ;; + "judgment") + curl -s -X DELETE "$OPENSEARCH_URL/_plugins/_search_relevance/judgments/$id" > /dev/null 2>&1 || true + ;; + esac + done + + # Stop and remove Solr container + if [ "$SOLR_CONTAINER_STARTED" = true ]; then + log_info "Stopping Solr container..." + docker stop solr_demo 2>/dev/null || true + docker rm solr_demo 2>/dev/null || true + fi + + # Remove temporary files + for file in "${TEMP_FILES[@]}"; do + if [ -f "$file" ]; then + rm -f "$file" + fi + done + + log_success "Cleanup completed" +} + +# Main execution +main() { + # Parse command-line arguments first + parse_arguments "$@" + + log_section "Remote Query Capability Demonstration" + log_info "This demo showcases the remote query capability of the OpenSearch Search Relevance plugin" + log_info "by comparing search performance between OpenSearch and Apache Solr using identical datasets." + echo + log_info "This script uses live remote query endpoints to execute Solr searches via the plugin." + log_info "Responses are normalized using the configured response template for apples-to-apples comparison." 
+ echo + + # Set up cleanup trap + trap cleanup EXIT + + # Check dependencies + log_section "Dependency Check" + check_dependencies + + if [ "$SKIP_DATA_LOADING" = true ]; then + log_section "Data Loading Skipped" + log_info "Skipping data loading steps - assuming OpenSearch and Solr are already set up with data" + log_info "Verifying that required services and data are available..." + + # Basic connectivity checks + if ! wait_for_service "$OPENSEARCH_URL" "OpenSearch" 5; then + log_error "OpenSearch is not available at $OPENSEARCH_URL" + log_info "Start OpenSearch and install the Search Relevance plugin, then re-run this script." + exit 1 + fi + + if ! wait_for_service "$SOLR_URL/solr/admin/cores" "Solr" 5; then + log_error "Solr is not available at $SOLR_URL" + log_info "Start Solr with the '$SOLR_CORE' core, then re-run this script." + exit 1 + fi + + # Check if search relevance plugin is available + local plugins_response=$(curl -s "$OPENSEARCH_URL/_cat/plugins") + if ! echo "$plugins_response" | grep -q "search-relevance"; then + log_error "Search Relevance plugin is not installed or enabled" + exit 1 + fi + + # Ensure Workbench is enabled + local settings_payload='{"persistent":{"plugins.search_relevance.workbench_enabled": true}}' + curl -s -X PUT "$OPENSEARCH_URL/_cluster/settings" -H "Content-Type: application/json" -d "$settings_payload" > /dev/null 2>&1 || true + + # Require remote REST endpoint support + if ! remote_endpoints_available; then + log_error "Remote REST endpoints are not available in the running OpenSearch/plugin instance." + log_info "Ensure you're running OpenSearch with the Search Relevance plugin that includes remote endpoints and that Workbench is enabled." 
+ exit 1 + fi + + log_success "Services verified - proceeding with query and comparison testing" + verify_data_parity + else + # Build subset dataset from judgments to speed up ingestion + download_data_file + local judgments_file="$SCRIPT_DIR/../resources/judgment/ImportJudgments.json" + local subset_file + subset_file=$(build_subset_from_judgments "$judgments_file") || { log_error "Subset build failed"; exit 1; } + ECOMMERCE_DATA_FILE="$subset_file" + log_info "Using subset data file: $ECOMMERCE_DATA_FILE" + + # Set up OpenSearch first (required for remote demo) + log_section "OpenSearch Setup" + setup_opensearch_data + + # Require remote REST endpoint support + if ! remote_endpoints_available; then + log_error "Remote REST endpoints are not available in the running OpenSearch/plugin instance." + log_info "Ensure you're running OpenSearch with the Search Relevance plugin that includes remote endpoints and that Workbench is enabled." + exit 1 + fi + + # Start Solr and load data + log_section "Solr Setup" + start_solr + configure_solr_schema + + log_section "Data Loading" + transform_data_for_solr + load_data_to_solr "esci_us_solr.json" + verify_data_parity + fi + + # Demonstrate template transformations (required) + log_section "Template Transformation Testing" + test_query_template + test_response_template + + # Demonstrate search comparison + log_section "Search Comparison Demonstration" + demonstrate_search_comparison + + # Show remote search concept + log_section "Remote Search Configuration" + show_remote_search_concept + + # Experiment Integration Demonstration + log_section "Experiment Integration Demonstration" + log_info "Now demonstrating the complete integration with Search Relevance Workbench experiments..." + log_info "This shows how remote search configurations can be used in formal experiments with IR metrics." + echo + + # Create search configurations for experiment + log_info "Creating search configurations for experiment comparison..." 
+ local local_config_id + local_config_id=$(create_local_search_config) + + local remote_config_id + remote_config_id=$(create_remote_search_config_for_experiment) + + # Create variant remote configuration to ensure measurable differences + local remote_config_variant_id + remote_config_variant_id=$(create_remote_search_config_variant) + + # Create query set and judgments + log_info "Setting up query set and judgments from ESCI data..." + local queryset_id + queryset_id=$(create_query_set_from_judgments) + + local judgment_id + judgment_id=$(create_judgment_set) + + # Create and run experiments: + # - Remote-only (REMOTE_SEARCH_EVALUATION) with remote config id(s) + # - Local-only (POINTWISE_EVALUATION) with the local search configuration + log_info "Creating and executing remote and local experiments for comparison..." + + local remote_experiment_id + remote_experiment_id=$(create_remote_experiment "$remote_config_id" "$queryset_id" "$judgment_id") + + # Variant remote-only experiment + local remote_variant_experiment_id + remote_variant_experiment_id=$(create_remote_experiment "$remote_config_variant_id" "$queryset_id" "$judgment_id") + + local local_experiment_id + local_experiment_id=$(create_local_experiment "$local_config_id" "$queryset_id" "$judgment_id") + + # Create a PAIRWISE_COMPARISON experiment between local and remote to compute comparative IR metrics + # Note: PAIRWISE_COMPARISON expects Search Configuration IDs for both variants. Remote config ids (like 'solr_remote') + # are Remote Search Configuration documents and are not valid here. Guard to avoid invalid requests. + local pairwise_experiment_id="" + if [[ "$remote_config_id" =~ ^[0-9a-fA-F-]{36}$ ]]; then + pairwise_experiment_id=$(create_experiment "$local_config_id" "$remote_config_id" "$queryset_id" "$judgment_id") + else + log_warning "Skipping PAIRWISE_COMPARISON: remote_config_id '$remote_config_id' is not a Search Configuration id. Using REMOTE_SEARCH_EVALUATION for remote metrics." 
+ fi + + # Wait for both experiments + local any_failed=false + + if wait_for_experiment "$remote_experiment_id"; then + log_section "Remote Experiment Results (Baseline)" + show_experiment_results "$remote_experiment_id" + log_section "Remote Metrics (Baseline)" + show_metrics_comparison "$remote_experiment_id" + log_section "Remote Persisted Metrics (evaluation_result, Baseline)" + fetch_and_print_metrics "$remote_experiment_id" + else + any_failed=true + log_warning "Remote experiment (baseline) did not complete successfully - metrics unavailable" + fi + + if wait_for_experiment "$remote_variant_experiment_id"; then + log_section "Remote Variant Experiment Results" + show_experiment_results "$remote_variant_experiment_id" + log_section "Remote Variant Metrics" + show_metrics_comparison "$remote_variant_experiment_id" + log_section "Remote Variant Persisted Metrics (evaluation_result)" + fetch_and_print_metrics "$remote_variant_experiment_id" + else + any_failed=true + log_warning "Remote variant experiment did not complete successfully - metrics unavailable" + fi + + if wait_for_experiment "$local_experiment_id"; then + log_section "Local Experiment Results" + show_experiment_results "$local_experiment_id" + log_section "Local Metrics" + show_metrics_comparison "$local_experiment_id" + log_section "Local Persisted Metrics (evaluation_result)" + fetch_and_print_metrics "$local_experiment_id" + else + any_failed=true + log_warning "Local experiment did not complete successfully - metrics unavailable" + fi + + # Also wait for the pairwise experiment and display its results/metrics (only if we created one) + if [ -n "$pairwise_experiment_id" ]; then + if wait_for_experiment "$pairwise_experiment_id"; then + log_section "Pairwise Experiment Results" + show_experiment_results "$pairwise_experiment_id" + log_section "Pairwise Metrics" + show_metrics_comparison "$pairwise_experiment_id" + log_section "Pairwise Persisted Metrics (evaluation_result)" + 
fetch_and_print_metrics "$pairwise_experiment_id"
+        else
+            any_failed=true
+            log_warning "Pairwise experiment did not complete successfully - metrics unavailable"
+        fi
+    else
+        log_info "Pairwise experiment skipped; REMOTE_SEARCH_EVALUATION and POINTWISE_EVALUATION were executed independently."
+    fi
+
+    # Simple comparison guidance (requires all experiments to have completed)
+    if [ "$any_failed" = false ]; then
+        log_section "Aggregated Metrics Comparison (Remote vs Local)"
+        show_side_by_side_metrics "$remote_experiment_id" "$local_experiment_id"
+        log_section "Remote vs Remote Variant Metrics"
+        show_side_by_side_metrics "$remote_experiment_id" "$remote_variant_experiment_id"
+
+        log_section "Comparison Summary"
+        if [ -n "$pairwise_experiment_id" ]; then
+            log_info "Primary metrics are available in the Pairwise experiment: $pairwise_experiment_id."
+            log_info "Use NDCG/MAP/MRR/Precision@K from the pairwise results to determine which configuration performs better on this query set."
+        else
+            log_info "Pairwise metrics not available (skipped). Review REMOTE_SEARCH_EVALUATION and POINTWISE_EVALUATION outputs above, or create two local Search Configurations to run a PAIRWISE_COMPARISON."
+        fi
+    else
+        log_section "Comparison Summary"
+        log_warning "Could not produce a complete comparison because one or both experiments did not complete."
+    fi
+
+    echo
+    log_info "Access points for further exploration:"
+    log_info "• OpenSearch: $OPENSEARCH_URL"
+    log_info "• Solr Admin: $SOLR_URL/solr/#/$SOLR_CORE"
+    log_info "• OpenSearch ecommerce index: $OPENSEARCH_URL/ecommerce/_search"
+    log_info "• Search Relevance API: $OPENSEARCH_URL/_plugins/_search_relevance/"
+    if [ -n "${pairwise_experiment_id:-}" ]; then
+        log_info "• Experiment details: $OPENSEARCH_URL/_plugins/_search_relevance/experiments/$pairwise_experiment_id"
+    fi
+    echo
+
+    if [ "$SKIP_CLEANUP" = true ]; then
+        log_info "Demo completed. All resources preserved for continued exploration."
+ log_info "Run the script again with --skip-data-loading to quickly test different configurations." + else + log_info "Demo completed. Resources will be cleaned up on exit." + fi +} + +# Run main function +main "$@" diff --git a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.delete_remote_search_configuration.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.delete_remote_search_configuration.json new file mode 100644 index 00000000..ea919b5d --- /dev/null +++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.delete_remote_search_configuration.json @@ -0,0 +1,25 @@ +{ + "search_relevance.delete_remote_search_configuration": { + "stability": "experimental", + "documentation": { + "url": "", + "description": "Delete a remote search configuration by id" + }, + "url": { + "paths": [ + { + "path": "/_plugins/_search_relevance/remote_search_configurations/{id}", + "methods": ["DELETE"], + "parts": { + "id": { + "type": "string", + "required": true, + "description": "Remote search configuration id" + } + } + } + ] + }, + "params": {} + } +} diff --git a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.get_remote_search_configuration.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.get_remote_search_configuration.json new file mode 100644 index 00000000..605adf04 --- /dev/null +++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.get_remote_search_configuration.json @@ -0,0 +1,25 @@ +{ + "search_relevance.get_remote_search_configuration": { + "stability": "experimental", + "documentation": { + "url": "", + "description": "Get a remote search configuration by id" + }, + "url": { + "paths": [ + { + "path": "/_plugins/_search_relevance/remote_search_configurations/{id}", + "methods": ["GET"], + "parts": { + "id": { + "type": "string", + "required": true, + "description": "Remote search configuration id" + } + } + } + ] + }, + "params": {} + } +} diff --git 
a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.json
new file mode 100644
index 00000000..0ccb9532
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.json
@@ -0,0 +1,19 @@
+{
+  "search_relevance": {
+    "stability": "experimental",
+    "documentation": {
+      "url": "",
+      "description": "Placeholder root API for Search Relevance REST spec; required to satisfy filename-to-API name constraint"
+    },
+    "url": {
+      "paths": [
+        {
+          "path": "/_plugins/_search_relevance/_noop",
+          "methods": ["GET"],
+          "parts": {}
+        }
+      ]
+    },
+    "params": {}
+  }
+}
diff --git a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.list_remote_search_configurations.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.list_remote_search_configurations.json
new file mode 100644
index 00000000..26c87f83
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.list_remote_search_configurations.json
@@ -0,0 +1,19 @@
+{
+  "search_relevance.list_remote_search_configurations": {
+    "stability": "experimental",
+    "documentation": {
+      "url": "",
+      "description": "List remote search configurations"
+    },
+    "url": {
+      "paths": [
+        {
+          "path": "/_plugins/_search_relevance/remote_search_configurations",
+          "methods": ["GET"],
+          "parts": {}
+        }
+      ]
+    },
+    "params": {}
+  }
+}
diff --git a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.put_remote_search_configuration.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.put_remote_search_configuration.json
new file mode 100644
index 00000000..0c899957
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.put_remote_search_configuration.json
@@ -0,0 +1,20 @@
+{
+  "search_relevance.put_remote_search_configuration": {
+    "stability": "experimental",
+    "documentation": {
+      "url": "",
+      "description": "Create or update a remote search configuration"
+    },
+    "url": {
+      "paths": [
+        {
+          "path": "/_plugins/_search_relevance/remote_search_configurations",
+          "methods": ["POST"],
+          "parts": {}
+        }
+      ]
+    },
+    "params": {},
+    "body": { "required": true }
+  }
+}
diff --git a/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.remote_search_execute.json b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.remote_search_execute.json
new file mode 100644
index 00000000..0daaf37b
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/api/search_relevance.remote_search_execute.json
@@ -0,0 +1,20 @@
+{
+  "search_relevance.remote_search_execute": {
+    "stability": "experimental",
+    "documentation": {
+      "url": "",
+      "description": "Execute a remote search using a configured remote search configuration"
+    },
+    "url": {
+      "paths": [
+        {
+          "path": "/_plugins/_search_relevance/remote_search/execute",
+          "methods": ["POST"],
+          "parts": {}
+        }
+      ]
+    },
+    "params": {},
+    "body": { "required": true }
+  }
+}
diff --git a/src/yamlRestTest/resources/rest-api-spec/test/20_remote_search_configurations.yml b/src/yamlRestTest/resources/rest-api-spec/test/20_remote_search_configurations.yml
new file mode 100644
index 00000000..3c0d9927
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/test/20_remote_search_configurations.yml
@@ -0,0 +1,100 @@
+"Remote Search Configurations CRUD, validation, and gating":
+  # Enable Workbench for positive/validation tests
+  - do:
+      cluster.put_settings:
+        body:
+          persistent:
+            plugins.search_relevance.workbench_enabled: true
+
+  # Create configuration (POST) - minimal valid body plus optional fields
+  - do:
+      search_relevance.put_remote_search_configuration:
+        body:
+          id: "test-remote-1"
+          name: "Test Remote"
+          description: "YAML REST test config"
+          connectionUrl: "https://example.com/api"
+          username: "user"
+          password: "pass"
+          queryTemplate: '{"q":"{{queryText}}","size":{{size}}}'
+          responseTemplate: ""
+          maxRequestsPerSecond: 2
+          maxConcurrentRequests: 1
+          cacheDurationMinutes: 5
+          refreshCache: false
+          metadata:
+            env: "test"
+  - match: { id: "test-remote-1" }
+
+  # Get configuration by id (GET)
+  - do:
+      search_relevance.get_remote_search_configuration:
+        id: test-remote-1
+  - match: { id: "test-remote-1" }
+  - match: { name: "Test Remote" }
+
+  # Update same configuration (POST with same id) - change name and refreshCache
+  - do:
+      search_relevance.put_remote_search_configuration:
+        body:
+          id: "test-remote-1"
+          name: "Test Remote (updated)"
+          description: "YAML REST test config"
+          connectionUrl: "https://example.com/api"
+          username: "user"
+          password: "pass"
+          queryTemplate: '{"q":"{{queryText}}","size":{{size}}}'
+          responseTemplate: ""
+          maxRequestsPerSecond: 2
+          maxConcurrentRequests: 1
+          cacheDurationMinutes: 5
+          refreshCache: true
+          metadata:
+            env: "test"
+  - match: { id: "test-remote-1" }
+
+  # Get configuration by id (verify updated name)
+  - do:
+      search_relevance.get_remote_search_configuration:
+        id: test-remote-1
+  - match: { id: "test-remote-1" }
+  - match: { name: "Test Remote (updated)" }
+
+  # Missing id should be a bad request (POST)
+  - do:
+      catch: bad_request
+      search_relevance.put_remote_search_configuration:
+        body:
+          name: "Missing ID"
+          connectionUrl: "https://example.com/api"
+
+  # Delete configuration by id (DELETE)
+  - do:
+      search_relevance.delete_remote_search_configuration:
+        id: test-remote-1
+
+  # Delete again should return not_found
+  - do:
+      catch: missing
+      search_relevance.delete_remote_search_configuration:
+        id: test-remote-1
+
+  # Disable Workbench to verify gating returns 403 for list/put/get
+  - do:
+      cluster.put_settings:
+        body:
+          persistent:
+            plugins.search_relevance.workbench_enabled: false
+
+  # List (GET) should be forbidden
+  - do:
+      catch: forbidden
+      search_relevance.list_remote_search_configurations: {}
+
+  # Put (POST) should be forbidden
+  - do:
+      catch: forbidden
+      search_relevance.put_remote_search_configuration:
+        body:
+          id: "test-remote-2"
+          connectionUrl: "https://example.com/api"
diff --git a/src/yamlRestTest/resources/rest-api-spec/test/21_remote_search_execute.yml b/src/yamlRestTest/resources/rest-api-spec/test/21_remote_search_execute.yml
new file mode 100644
index 00000000..7a71953a
--- /dev/null
+++ b/src/yamlRestTest/resources/rest-api-spec/test/21_remote_search_execute.yml
@@ -0,0 +1,42 @@
+"Remote Search Execute validation and gating":
+  # Enable Workbench to test validation behavior
+  - do:
+      cluster.put_settings:
+        body:
+          persistent:
+            plugins.search_relevance.workbench_enabled: true
+
+  # Missing remoteConfigId should be a bad request
+  - do:
+      catch: bad_request
+      search_relevance.remote_search_execute:
+        body:
+          queryText: "test"
+          size: 3
+          query: "{}"
+
+  # Non-existent remoteConfigId should return not_found
+  - do:
+      catch: missing
+      search_relevance.remote_search_execute:
+        body:
+          remoteConfigId: "no_such_config_id"
+          queryText: "test"
+          size: 3
+          query: "{}"
+
+  # Disable Workbench to verify gating returns 403
+  - do:
+      cluster.put_settings:
+        body:
+          persistent:
+            plugins.search_relevance.workbench_enabled: false
+
+  - do:
+      catch: forbidden
+      search_relevance.remote_search_execute:
+        body:
+          remoteConfigId: "any"
+          queryText: "test"
+          size: 1
+          query: "{}"