fix: add graceful error handling for VirusTotal invalid domains by jaspermayone · Pull Request #111 · phishdirectory/legacy-api

jaspermayone · 2025-06-29T14:15:35Z

Summary

Add null checks in parseData.ts to prevent crashes when VirusTotal returns null
Add domain validation before VirusTotal API calls using existing validateDomain utility
Enable proper error logging with informative warning messages
Apply consistent domain validation to both check and report methods

Test plan

Test with valid domains (should work as before)
Test with invalid domains (should skip VT call and log warning)
Test with unresolvable domains (should handle gracefully)
Verify no crashes occur in parseData.ts
Check logs for transparency warnings

Fixes #37

Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved reliability by adding extra checks to prevent errors when analyzing virus scan data with missing information.
- Enhanced error handling during domain checks and reports, ensuring invalid domains are safely skipped and users experience fewer disruptions.
Chores
- Updated domain validation to include DNS resolution for more accurate domain verification.
- Streamlined domain sanitization by making it synchronous across multiple services, improving performance and consistency.

- Add null checks in parseData.ts to prevent crashes when VirusTotal returns null - Add domain validation before VirusTotal API calls using existing validateDomain utility - Enable proper error logging with informative warning messages - Apply consistent domain validation to both check and report methods Fixes #37 - prevents app crashes when domains are invalid or unresolvable Co-authored-by: Jasper Mayone <jaspermayone@users.noreply.github.com>

claude · 2025-06-29T14:15:50Z

Claude finished @jaspermayone's task —— View job

Code Review for VirusTotal Error Handling PR

Todo List:

Read CLAUDE.md for repository guidelines (none found)
Analyze changes in src/func/domain/parseData.ts
Analyze changes in src/services/VirusTotal.ts
Review code quality and best practices
Check for potential bugs and issues
Evaluate performance considerations
Assess security concerns
Review test coverage and suggestions
Provide comprehensive feedback

📋 Review Summary

Overall, this is a solid defensive programming improvement that addresses crash-prone scenarios with VirusTotal API integration. The changes are well-structured and follow security best practices.

✅ Code Quality & Best Practices

Excellent improvements:

Proper null safety: The null checks in parseData.ts:50-56 prevent runtime crashes when VirusTotal returns unexpected data structures
Input validation: Domain validation in VirusTotal.ts:25-33 and VirusTotal.ts:70-78 prevents malformed inputs from reaching the API
Consistent error handling: Both check() and report() methods now validate domains using the same pattern
Graceful degradation: The service returns null instead of crashing, allowing other security checks to continue

Code style observations:

The error handling pattern is consistent and follows TypeScript best practices
Good use of the existing DomainValidationError class for typed error handling
Warning logs provide good visibility for debugging without being overly verbose

🐛 Potential Issues

Minor concerns:

Database insertion still occurs on error (VirusTotal.ts:45-49): If the API call succeeds but the domain is later found to be invalid during processing, the raw data is still stored. Consider validating earlier.
Error message consistency: The warning messages use different formats:
- Line 29: "VirusTotal: Skipping invalid domain "${domain}": ${error.message}"
- Line 74: "VirusTotal: Skipping report for invalid domain "${domain}": ${error.message}"
Consider using a consistent format for better log parsing.
Async/await inconsistency (VirusTotal.ts:89-120): The report() method uses .catch() chaining while check() uses try/catch. Consider using try/catch consistently.

⚡ Performance Considerations

Positive impacts:

Reduced API calls: Invalid domains are filtered out before expensive HTTP requests
Early validation: Domain validation happens before sanitization and API calls, saving resources
Non-blocking errors: The service doesn't throw exceptions that could interrupt the processing pipeline

Suggestions:

The validation regex DOMAIN_REGEX in sanitizeDomain.ts:2 is relatively efficient for the validation scope

🔒 Security Assessment

Security improvements:

Input sanitization: Domain validation prevents potential injection attacks via malformed domain inputs
URL prevention: The validation explicitly blocks URL-like inputs (sanitizeDomain.ts:24-30)
Error information leakage: Warning logs are appropriately vague and don't expose sensitive API details

Security best practices followed:

No sensitive data logged in error messages
Graceful failure modes that don't expose internal state
Input validation before external API calls

🧪 Test Coverage Recommendations

The PR mentions a test plan but I don't see automated tests. Consider adding:

Unit tests for parseData.ts:

describe('parseData with null virusTotalData', () => {
  it('should handle null virusTotalData gracefully', async () => {
    const result = await parseData(mockWalshyData, ..., null, ...);
    expect(result).toBe(false); // or expected behavior
  });
});

Unit tests for VirusTotal.ts:

describe('VirusTotal domain validation', () => {
  it('should skip invalid domains and log warning', async () => {
    const consoleSpy = jest.spyOn(console, 'warn');
    const result = await virusTotalService.domain.check('invalid-domain!@#');
    expect(result).toBeNull();
    expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining('Skipping invalid domain'));
  });
});

Integration tests for the complete flow with various domain types (valid, invalid, malformed)

🎯 Final Recommendations

Consider extracting validation: The domain validation pattern is repeated. Consider creating a higher-order function:

private async withDomainValidation<T>(domain: string, operation: (sanitized: string) => Promise<T>): Promise<T | null>

Add JSDoc for new behavior: Document the null return behavior in the method documentation.
Consider rate limiting awareness: With better error handling, ensure the service doesn't retry invalid domains unnecessarily.

🏆 Overall Assessment

This PR successfully addresses the crash issues mentioned in #37. The implementation is defensive, secure, and maintains backward compatibility. The changes follow established patterns in the codebase and improve system reliability.

Approval Status: ✅ Recommended - This is a solid improvement that makes the system more robust without introducing new risks.

jaspermayone · 2025-06-29T14:15:51Z

@claude do a dns lookup as part of the domain validation

coderabbitai · 2025-06-29T14:15:51Z

Walkthrough

The changes add defensive checks and domain validation before making VirusTotal API calls. The VirusTotalService methods now validate domains and skip API calls for invalid ones, logging warnings instead of throwing errors. Additionally, safer property access is implemented in data parsing to prevent runtime exceptions from undefined nested properties.

Changes

File(s)	Change Summary
src/services/VirusTotal.ts	Added synchronous domain sanitization and validation before API calls; improved error handling and logging.
src/func/domain/parseData.ts	Added defensive null/undefined checks for nested property access in VirusTotal data parsing.
src/utils/sanitizeDomain.ts	Changed `validateDomain` to async with DNS resolution check; throws error if domain does not resolve.
src/services/* (AbuseCh, GoogleSafebrowsing, IpQualityScore, PhishReport, SecurityTrails, SinkingYahts, UrlScan, Walshy)	Removed unnecessary `await` from `sanitizeDomain` calls, treating it as synchronous in all these services.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant VirusTotalService
    participant Logger

    Client->>VirusTotalService: check(domain)
    VirusTotalService->>VirusTotalService: sanitizeDomain(domain)
    VirusTotalService->>VirusTotalService: validateDomain(domain)
    alt Domain invalid
        VirusTotalService->>Logger: log warning
        VirusTotalService-->>Client: return null
    else Domain valid
        VirusTotalService->>VirusTotalService: call VirusTotal API
        VirusTotalService-->>Client: return result
    end

Assessment against linked issues

Objective	Addressed	Explanation
Skip VirusTotal API call if domain is invalid/unresolvable (#37)	✅
App should continue running without crashing on invalid domains (#37)	✅
Log a warning or note when VirusTotal is skipped for invalid domains (#37)	✅
Ensure VT integration is wrapped in error handling (try-catch or equivalent) (#37)	✅
Update database schema to make VT data optional (#37)	❌	No schema/database changes observed in the provided code changes.

Poem

A hop and a skip, I checked the domain—
If it's not valid, I won't complain!
No more crashes, no more dread,
Just a gentle warning logged instead.
With careful checks, my code’s robust—
VirusTotal calls only when I trust!
🐇✨

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

gemini-code-assist

Summary of Changes

Hello @jaspermayone, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stability and error handling of the VirusTotal integration. It introduces proactive domain validation to prevent unnecessary API calls for invalid inputs and adds defensive programming with null checks to avoid crashes when processing potentially incomplete data from the VirusTotal API. The changes ensure more graceful degradation and clearer logging for problematic scenarios.

Highlights

Improved Robustness: Added null and undefined checks within parseData.ts to gracefully handle cases where VirusTotal data might be incomplete or missing, preventing potential crashes when accessing nested properties like last_analysis_stats.
Pre-API Call Domain Validation: Implemented domain validation using the existing validateDomain utility in the VirusTotalService's check and report methods. This ensures that API calls to VirusTotal are only made for valid domains, skipping calls and logging warnings for malformed inputs.
Graceful Error Handling & Logging: Enhanced error handling in VirusTotalService to log informative warning messages for invalid domains or API errors, rather than throwing exceptions or silently failing. This provides better transparency and prevents the application from crashing due to external service issues or bad input.
Consistent Application of Validation: Applied the new domain validation logic consistently across both the check and report functionalities within the VirusTotalService for uniform behavior.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

claude · 2025-06-29T14:16:08Z

Claude finished @jaspermayone's task —— View job

Enhanced Domain Validation with DNS Lookup ✅

I've successfully enhanced the domain validation to include DNS lookup functionality as requested.

Changes Made:

1. Enhanced Domain Validation (src/utils/sanitizeDomain.ts):

Added DNS lookup using Node.js built-in dns.promises.resolve()
Made validateDomain async to handle DNS resolution
Enhanced error handling for DNS resolution failures
Domain validation now includes:
- Length validation (1-253 characters)
- Format validation using regex
- URL component prevention
- NEW: DNS lookup verification - ensures domain actually exists

2. Updated VirusTotal Service (src/services/VirusTotal.ts):

Made validation calls async with await validateDomain()
Proper error handling for DNS resolution failures
Graceful logging when domains don't resolve

3. Code Quality Improvements:

Fixed incorrect await usage with synchronous sanitizeDomain() across 9 service files
Improved code consistency and clarity

How It Works:

When a domain is validated, it now:

Checks basic format and length requirements
Performs DNS lookup to verify the domain exists
Only proceeds with VirusTotal API calls for domains that actually resolve
Logs informative warnings for domains that don't resolve

This prevents unnecessary API calls and provides better error handling for invalid/non-existent domains. The enhancement has been committed to the PR branch at commit 8b73e77.

gemini-code-assist

Code Review

This pull request significantly enhances the robustness and error handling of the VirusTotal integration. The addition of explicit null checks in parseData.ts prevents potential crashes from incomplete API responses, and the proactive domain validation in VirusTotal.ts for both check and report methods is a crucial improvement, preventing unnecessary external API calls for invalid inputs. The refined error logging also contributes to better transparency and debugging.

jaspermayone

-        // FIXME: this is terrible practice, handle ratelimits better after issue addressed.
-        // console.error(`VirusTotal API error for domain ${domain}:`, error);
+        // Log the error for transparency, but don't throw to prevent crashes
+        console.warn(`VirusTotal API error for domain "${domain}":`, error instanceof Error ? error.message : error);


To fix the issue, the format string in the console.warn statement on line 54 of src/services/VirusTotal.ts should be updated to use the %s specifier for the domain parameter. This ensures that the domain is treated as a string and prevents any format specifiers in the user-controlled input from being interpreted.

Steps:

Replace the format string `VirusTotal API error for domain "${domain}":` with "VirusTotal API error for domain '%s':".

Pass the domain parameter as an argument to the format string.

No additional imports or dependencies are required for this fix.

Copilot

Pull Request Overview

This PR enhances error resilience by validating and sanitizing domains before VirusTotal API calls and preventing null-related crashes in data parsing.

Add and apply validateDomain with graceful skips and warnings for invalid domains in both check and report methods
Switch sanitizeDomain calls from async to sync (removed await)
Add null checks in parseData.ts to guard against missing VirusTotal data

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
src/services/VirusTotal.ts	Introduced `validateDomain` checks, adjusted `sanitizeDomain` usage, and improved error logging
src/func/domain/parseData.ts	Added nested null checks around `virusTotalData` to prevent crashes

Comments suppressed due to low confidence (2)

src/services/VirusTotal.ts:22

sanitizeDomain previously returned a promise (used with await). Removing await means sanitizedDomain is now a Promise instead of a string. Add back await before sanitizeDomain(domain).

        const sanitizedDomain = sanitizeDomain(domain);

src/services/VirusTotal.ts:67

Same issue here: sanitizeDomain returns a promise and needs await to produce a string. Please restore await sanitizeDomain(domain).

      const sanitizedDomain = sanitizeDomain(domain);

Copilot · 2025-06-29T14:16:39Z

+    virusTotalData &&
+    virusTotalData.data &&
+    virusTotalData.data.attributes &&
+    virusTotalData.data.attributes.last_analysis_stats &&
+    (virusTotalData.data.attributes.last_analysis_stats.malicious > 0 ||
+      virusTotalData.data.attributes.last_analysis_stats.suspicious > 0)


[nitpick] This long chain of null checks can be simplified using optional chaining (e.g., virusTotalData?.data?.attributes?.last_analysis_stats?.malicious > 0).

Suggested change

virusTotalData &&

virusTotalData.data &&

virusTotalData.data.attributes &&

virusTotalData.data.attributes.last_analysis_stats &&

(virusTotalData.data.attributes.last_analysis_stats.malicious > 0 ||

virusTotalData.data.attributes.last_analysis_stats.suspicious > 0)

virusTotalData?.data?.attributes?.last_analysis_stats?.malicious > 0 ||

virusTotalData?.data?.attributes?.last_analysis_stats?.suspicious > 0

…rolled format string Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Signed-off-by: Jasper Mayone <me@jaspermayone.com>

claude · 2025-06-29T14:18:00Z

Claude finished @jaspermayone's task —— View job

Code Review for VirusTotal Error Handling PR

Todo List:

Read CLAUDE.md for repository guidelines (none found)
Analyze changes in src/func/domain/parseData.ts
Analyze changes in src/services/VirusTotal.ts
Review code quality and best practices
Check for potential bugs and issues
Evaluate performance considerations
Assess security concerns
Review test coverage and suggestions
Provide comprehensive feedback

📋 Review Summary

Overall, this is a solid defensive programming improvement that addresses crash-prone scenarios with VirusTotal API integration. The changes are well-structured and follow security best practices.

✅ Code Quality & Best Practices

Excellent improvements:

Proper null safety: The null checks in parseData.ts:50-56 prevent runtime crashes when VirusTotal returns unexpected data structures
Input validation: Domain validation in VirusTotal.ts:25-33 and VirusTotal.ts:70-78 prevents malformed inputs from reaching the API
Consistent error handling: Both check() and report() methods now validate domains using the same pattern
Graceful degradation: The service returns null instead of crashing, allowing other security checks to continue

Code style observations:

The error handling pattern is consistent and follows TypeScript best practices
Good use of the existing DomainValidationError class for typed error handling
Warning logs provide good visibility for debugging without being overly verbose

🐛 Potential Issues

Minor concerns:

Database insertion still occurs on error (VirusTotal.ts:45-49): If the API call succeeds but the domain is later found to be invalid during processing, the raw data is still stored. Consider validating earlier.
Error message consistency: The warning messages use different formats:
- Line 29: "VirusTotal: Skipping invalid domain "${domain}": ${error.message}"
- Line 74: "VirusTotal: Skipping report for invalid domain "${domain}": ${error.message}"
Consider using a consistent format for better log parsing.
Async/await inconsistency (VirusTotal.ts:89-120): The report() method uses .catch() chaining while check() uses try/catch. Consider using try/catch consistently.

Optional chaining suggestion (parseData.ts:50-56): The nested null checks could be simplified using TypeScript's optional chaining:

virusTotalData?.data?.attributes?.last_analysis_stats?.malicious > 0 ||
virusTotalData?.data?.attributes?.last_analysis_stats?.suspicious > 0

⚡ Performance Considerations

Positive impacts:

Reduced API calls: Invalid domains are filtered out before expensive HTTP requests
Early validation: Domain validation happens before sanitization and API calls, saving resources
Non-blocking errors: The service doesn't throw exceptions that could interrupt the processing pipeline

Suggestions:

The validation regex DOMAIN_REGEX in sanitizeDomain.ts:2 is relatively efficient for the validation scope

🔒 Security Assessment

Security improvements:

Input sanitization: Domain validation prevents potential injection attacks via malformed domain inputs
URL prevention: The validation explicitly blocks URL-like inputs (sanitizeDomain.ts:24-30)
Error information leakage: Warning logs are appropriately vague and don't expose sensitive API details

Security best practices followed:

No sensitive data logged in error messages
Graceful failure modes that don't expose internal state
Input validation before external API calls

⚠️ Security Note: GitHub Advanced Security flagged the use of externally-controlled format strings in the console.warn statements (VirusTotal.ts:29,74). While the risk is minimal due to domain sanitization, consider using parameterized logging if available.

🧪 Test Coverage Recommendations

The PR mentions a test plan but I don't see automated tests. Consider adding:

Unit tests for parseData.ts:

describe('parseData with null virusTotalData', () => {
  it('should handle null virusTotalData gracefully', async () => {
    const result = await parseData(mockWalshyData, ..., null, ...);
    expect(result).toBe(false); // or expected behavior
  });
});

Unit tests for VirusTotal.ts:

describe('VirusTotal domain validation', () => {
  it('should skip invalid domains and log warning', async () => {
    const consoleSpy = jest.spyOn(console, 'warn');
    const result = await virusTotalService.domain.check('invalid-domain!@#');
    expect(result).toBeNull();
    expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining('Skipping invalid domain'));
  });
});

Integration tests for the complete flow with various domain types (valid, invalid, malformed)

🎯 Final Recommendations

Consider extracting validation: The domain validation pattern is repeated. Consider creating a higher-order function:

private async withDomainValidation<T>(domain: string, operation: (sanitized: string) => Promise<T>): Promise<T | null>

Add JSDoc for new behavior: Document the null return behavior in the method documentation.
Consider rate limiting awareness: With better error handling, ensure the service doesn't retry invalid domains unnecessarily.

🏆 Overall Assessment

This PR successfully addresses the crash issues mentioned in #37. The implementation is defensive, secure, and maintains backward compatibility. The changes follow established patterns in the codebase and improve system reliability.

Approval Status: ✅ Recommended - This is a solid improvement that makes the system more robust without introducing new risks.

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

src/services/VirusTotal.ts (1)
110-120: Use /votes endpoint instead of /comments for vote submissions

The VirusTotal API v3 requires a POST to /domains/{domain}/votes when casting a vote, not /comments. Update the second axios call accordingly:

• File: src/services/VirusTotal.ts
Lines: ~110–120
- axios.post(
-   `https://www.virustotal.com/api/v3/domains/${sanitizedDomain}/comments`,
-   voteData,
-   { headers: headersWithVirusTotal }
- )
+ axios.post(
+   `https://www.virustotal.com/api/v3/domains/${sanitizedDomain}/votes`,
+   voteData,
+   { headers: headersWithVirusTotal }
+ )
Reference: https://docs.virustotal.com/reference/domain-votes-post

🧹 Nitpick comments (1)

src/func/domain/parseData.ts (1)
50-55: Simplify with optional chaining for better readability.

The defensive null checks are excellent for preventing runtime errors when VirusTotal returns incomplete data. However, this can be simplified using optional chaining for better readability.

Apply this diff to use optional chaining:
-  } else if (
-    virusTotalData &&
-    virusTotalData.data &&
-    virusTotalData.data.attributes &&
-    virusTotalData.data.attributes.last_analysis_stats &&
-    (virusTotalData.data.attributes.last_analysis_stats.malicious > 0 ||
-      virusTotalData.data.attributes.last_analysis_stats.suspicious > 0)
-  ) {
+  } else if (
+    (virusTotalData?.data?.attributes?.last_analysis_stats?.malicious > 0 ||
+      virusTotalData?.data?.attributes?.last_analysis_stats?.suspicious > 0)
+  ) {

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf6ebb8 and d1b4333.

📒 Files selected for processing (2)

src/func/domain/parseData.ts (1 hunks)
src/services/VirusTotal.ts (4 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (1)

src/services/VirusTotal.ts (1)

src/utils/sanitizeDomain.ts (3)

sanitizeDomain (40-40)

validateDomain (40-40)

DomainValidationError (40-40)

🪛 Biome (1.9.4)

src/func/domain/parseData.ts

[error] 50-53: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

🪛 GitHub Check: CodeQL

src/services/VirusTotal.ts

[failure] 54-54: Use of externally-controlled format string
Format string depends on a user-provided value.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: claude-review
GitHub Check: Codacy Security Scan

🔇 Additional comments (3)

src/services/VirusTotal.ts (3)

6-6: LGTM: Clean import of domain validation utilities.

The import properly brings in the necessary utilities for domain sanitization and validation with appropriate error handling.

22-33: Excellent domain validation implementation with proper error handling.

The validation logic correctly sanitizes domains first, then validates them before making API calls. The graceful handling of DomainValidationError by logging warnings and returning null prevents unnecessary API calls for invalid domains.

67-78: Consistent domain validation implementation.

Good implementation that mirrors the validation logic from the check method, ensuring consistent behavior across both API methods.

- Add DNS lookup to validateDomain using Node.js dns.promises.resolve() - Make validateDomain async to handle DNS resolution properly - Update VirusTotal service to await async domain validation - Fix incorrect await usage with synchronous sanitizeDomain across services - Improve error handling for DNS resolution failures - Add informative warnings when domains don't resolve This enhancement prevents unnecessary API calls for non-existent domains and provides better validation by checking if domains actually exist. Co-authored-by: Jasper Mayone <jaspermayone@users.noreply.github.com>

coderabbitai

Actionable comments posted: 5

🔭 Outside diff range comments (1)

src/services/IpQualityScore.ts (1)
42-46: Fix duplicate database insert.

The same data is being inserted twice into the database, which appears to be unintentional and could cause data duplication issues.
-      await db.insert(rawAPIData).values({
-        sourceAPI: "IpQualityScore",
-        domain: dbDomain!.id!,
-        data: data,
-      });

♻️ Duplicate comments (1)

src/services/VirusTotal.ts (1)
54-54: Fix format string vulnerability in logging.

Despite past review comments indicating this was addressed, the static analysis tool correctly identifies a format string vulnerability. The domain variable is embedded directly in the template string, which could allow format string attacks.
-        console.warn(`VirusTotal API error for domain "${domain}":`, error instanceof Error ? error.message : error);
+        console.warn(`VirusTotal API error for domain:`, { domain, error: error instanceof Error ? error.message : error });

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3de9af and 8b73e77.

📒 Files selected for processing (10)

src/services/AbuseCh.ts (1 hunks)
src/services/GoogleSafebrowsing.ts (1 hunks)
src/services/IpQualityScore.ts (1 hunks)
src/services/PhishReport.ts (1 hunks)
src/services/SecurityTrails.ts (1 hunks)
src/services/SinkingYahts.ts (1 hunks)
src/services/UrlScan.ts (1 hunks)
src/services/VirusTotal.ts (4 hunks)
src/services/Walshy.ts (2 hunks)
src/utils/sanitizeDomain.ts (3 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (7)

src/services/AbuseCh.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/Walshy.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/UrlScan.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/SinkingYahts.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/SecurityTrails.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/PhishReport.ts (1)

src/utils/sanitizeDomain.ts (1)

sanitizeDomain (49-49)

src/services/VirusTotal.ts (1)

src/utils/sanitizeDomain.ts (3)

sanitizeDomain (49-49)

validateDomain (49-49)

DomainValidationError (49-49)

🪛 GitHub Check: CodeQL

src/services/VirusTotal.ts

[failure] 54-54: Use of externally-controlled format string
Format string depends on a user-provided value.

🔇 Additional comments (9)

src/utils/sanitizeDomain.ts (2)

1-1: DNS import looks good.

The import of the DNS promises API is appropriate for the new DNS resolution functionality.

12-12: Function signature change correctly reflects the new async behavior.

The change from synchronous to asynchronous is necessary for the DNS resolution functionality.

src/services/SinkingYahts.ts (1)

21-21: sanitizeDomain is synchronous—removal of await is correct

In src/utils/sanitizeDomain.ts (line 44), sanitizeDomain is defined as
const sanitizeDomain = (domain: string): string => { … },
confirming it returns a string synchronously.

All service calls now correctly invoke it without await.

No further changes required; approving.

src/services/SecurityTrails.ts (1)

22-22: LGTM - consistent with synchronous sanitizeDomain refactor.

This change aligns with the coordinated refactor to make sanitizeDomain synchronous across all service files.

src/services/IpQualityScore.ts (1)

23-23: LGTM - consistent with synchronous sanitizeDomain refactor.

This change aligns with the coordinated refactor to make sanitizeDomain synchronous across all service files.

src/services/GoogleSafebrowsing.ts (1)

22-22: LGTM - consistent with synchronous sanitizeDomain refactor.

This change aligns with the coordinated refactor to make sanitizeDomain synchronous across all service files.

src/services/VirusTotal.ts (3)

6-6: LGTM - proper imports for domain validation.

The imports correctly bring in the necessary validation utilities to implement graceful error handling for invalid domains.

22-33: Excellent domain validation implementation.

The synchronous sanitization followed by asynchronous validation with proper error handling is well-implemented. Invalid domains are logged and gracefully skipped rather than causing crashes.

67-78: Consistent domain validation in report method.

The validation implementation in the report method follows the same pattern as the check method, providing consistent error handling across both methods.

coderabbitai · 2025-06-29T14:25:25Z

+  // Perform DNS lookup to verify domain exists
+  try {
+    await dns.resolve(domain);
+  } catch (error) {
+    throw new DomainValidationError(`Domain does not resolve: ${domain}`);
+  }
+


💡 Verification agent

🧩 Analysis chain

Consider adding timeout and improving error handling for DNS resolution.

The DNS resolution logic could be more robust:

DNS lookups can fail due to temporary network issues or DNS server problems

No timeout is specified, which could lead to long delays

Consider whether DNS validation should be mandatory or optional

// Perform DNS lookup to verify domain exists try { - await dns.resolve(domain); + await Promise.race([ + dns.resolve(domain), + new Promise((_, reject) => + setTimeout(() => reject(new Error('DNS timeout')), 5000) + ) + ]); } catch (error) { - throw new DomainValidationError(`Domain does not resolve: ${domain}`); + throw new DomainValidationError(`Domain does not resolve or DNS timeout: ${domain}`); }

Let me verify how this validation is being used across the codebase:

🏁 Script executed:

#!/bin/bash # Check how validateDomain is being used across the codebase echo "=== validateDomain usage ===" rg -A 3 "validateDomain" --type ts echo "=== DomainValidationError handling ===" rg -A 5 "DomainValidationError" --type ts

Length of output: 4661

Add DNS lookup timeout and enhance error handling in sanitizeDomain.ts

To fail fast on hanging DNS queries and ensure downstream code handles timeouts correctly, update the DNS resolution block in src/utils/sanitizeDomain.ts as shown below. We’ve verified that validateDomain and its DomainValidationError are only used in src/services/VirusTotal.ts, where errors are already caught and logged.

src/utils/sanitizeDomain.ts // Perform DNS lookup to verify domain exists try { - await dns.resolve(domain); + await Promise.race([ + dns.resolve(domain), + new Promise<never>((_, reject) => + setTimeout(() => reject(new Error('DNS timeout')), 5000) + ) + ]); } catch (error) { - throw new DomainValidationError(`Domain does not resolve: ${domain}`); + throw new DomainValidationError( + `Domain does not resolve or DNS timeout: ${domain}` + ); } return true;

No additional changes are needed—every invocation of validateDomain already catches DomainValidationError and skips invalid domains.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Perform DNS lookup to verify domain exists

try {

await dns.resolve(domain);

} catch (error) {

throw new DomainValidationError(`Domain does not resolve: ${domain}`);

}

// Perform DNS lookup to verify domain exists

try {

await Promise.race([

dns.resolve(domain),

new Promise<never>((_, reject) =>

setTimeout(() => reject(new Error('DNS timeout')), 5000)

)

]);

} catch (error) {

throw new DomainValidationError(

`Domain does not resolve or DNS timeout: ${domain}`

);

}

return true;

🤖 Prompt for AI Agents

In src/utils/sanitizeDomain.ts around lines 34 to 40, the DNS lookup currently lacks a timeout and has basic error handling. Modify the DNS resolution to include a timeout mechanism that rejects the lookup if it takes too long, ensuring the function fails fast on hanging queries. Also, enhance the error handling to differentiate timeout errors from other DNS errors, throwing a DomainValidationError with an appropriate message in both cases.

coderabbitai · 2025-06-29T14:25:26Z

      );

-      let sanitizedDomain = await sanitizeDomain(domain);
+      let sanitizedDomain = sanitizeDomain(domain);


🛠️ Refactor suggestion

Correct removal of await, but consider adding domain validation.

The removal of await is correct since sanitizeDomain is synchronous. However, unlike the VirusTotal service, this doesn't validate domains before making API calls.

Consider adding domain validation to prevent unnecessary API calls with invalid domains:

+import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain"; check: async (domain: string) => { // metrics.increment("services.abusech.domain.check"); + let sanitizedDomain = sanitizeDomain(domain); + + try { + await validateDomain(sanitizedDomain); + } catch (error) { + if (error instanceof DomainValidationError) { + console.warn(`Skipping AbuseCh check for invalid domain: ${sanitizedDomain}`, error.message); + return { error: "Invalid domain" }; + } + throw error; + } let response = await axios.get( `https://urlhaus.abuse.ch/api/v1/host/${domain}`,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

let sanitizedDomain = sanitizeDomain(domain);

// at top of src/services/AbuseCh.ts, alongside your other imports

import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain";

export const AbuseChService = {

check: async (domain: string) => {

// metrics.increment("services.abusech.domain.check");

let sanitizedDomain = sanitizeDomain(domain);

try {

await validateDomain(sanitizedDomain);

} catch (error) {

if (error instanceof DomainValidationError) {

console.warn(

`Skipping AbuseCh check for invalid domain: ${sanitizedDomain}`,

error.message

);

return { error: "Invalid domain" };

}

throw error;

}

let response = await axios.get(

`https://urlhaus.abuse.ch/api/v1/host/${domain}`

);

// …rest of your logic…

},

};

🤖 Prompt for AI Agents

In src/services/AbuseCh.ts at line 29, after removing the incorrect await from sanitizeDomain, add a domain validation step before proceeding with API calls. Implement a function or use a utility to check if the domain string is valid, and only continue with the API call if the domain passes validation. This prevents unnecessary requests with invalid domains and aligns with the VirusTotal service's approach.

coderabbitai · 2025-06-29T14:25:26Z

      // metrics.increment("services.phishreport.domain.check");

-      const sanitizedDomain = await sanitizeDomain(domain);
+      const sanitizedDomain = sanitizeDomain(domain);


🛠️ Refactor suggestion

Correct removal of await, but missing domain validation.

The removal of await is correct since sanitizeDomain is synchronous. However, this service lacks domain validation before making API calls, which could result in unnecessary requests for invalid domains.

Consider adding domain validation consistent with the VirusTotal service approach:

+import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain"; check: async (domain: string) => { // metrics.increment("services.phishreport.domain.check"); const sanitizedDomain = sanitizeDomain(domain); + + try { + await validateDomain(sanitizedDomain); + } catch (error) { + if (error instanceof DomainValidationError) { + console.warn(`Skipping PhishReport check for invalid domain: ${sanitizedDomain}`, error.message); + return { error: "Invalid domain" }; + } + throw error; + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const sanitizedDomain = sanitizeDomain(domain);

// At the top of src/services/PhishReport.ts

import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain";

export const PhishReport = {

check: async (domain: string) => {

// metrics.increment("services.phishreport.domain.check");

const sanitizedDomain = sanitizeDomain(domain);

try {

await validateDomain(sanitizedDomain);

} catch (error) {

if (error instanceof DomainValidationError) {

console.warn(

`Skipping PhishReport check for invalid domain: ${sanitizedDomain}`,

error.message

);

return { error: "Invalid domain" };

}

throw error;

}

// …rest of existing logic…

},

};

🤖 Prompt for AI Agents

In src/services/PhishReport.ts at line 22, after sanitizing the domain, add domain validation to ensure the domain is valid before proceeding with API calls. Implement a validation check similar to the VirusTotal service, and if the domain is invalid, handle it appropriately by skipping the API call or returning an error. This prevents unnecessary requests for invalid domains.

coderabbitai · 2025-06-29T14:25:26Z

    check: async (domain: string) => {
      // metrics.increment("services.walshy.domain.check");
-      const sanitizedDomain = await sanitizeDomain(domain);
+      const sanitizedDomain = sanitizeDomain(domain);


🛠️ Refactor suggestion

Correct removal of await, but both methods lack domain validation.

The removal of await is correct since sanitizeDomain is synchronous. However, both check and report methods should validate domains before making API calls to prevent unnecessary requests with invalid domains.

Consider adding domain validation to both methods:

+import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain"; check: async (domain: string) => { // metrics.increment("services.walshy.domain.check"); const sanitizedDomain = sanitizeDomain(domain); + + try { + await validateDomain(sanitizedDomain); + } catch (error) { + if (error instanceof DomainValidationError) { + console.warn(`Skipping Walshy check for invalid domain: ${sanitizedDomain}`, error.message); + return { error: "Invalid domain" }; + } + throw error; + } report: async (domain: string) => { // metrics.increment("services.walshy.domain.report"); const sanitizedDomain = sanitizeDomain(domain); + + try { + await validateDomain(sanitizedDomain); + } catch (error) { + if (error instanceof DomainValidationError) { + console.warn(`Skipping Walshy report for invalid domain: ${sanitizedDomain}`, error.message); + return { error: "Invalid domain" }; + } + throw error; + }

Also applies to: 52-52

🤖 Prompt for AI Agents

In src/services/Walshy.ts at lines 21 and 52, both the check and report methods call sanitizeDomain without validating the domain input. Add domain validation logic before these calls to ensure the domain is valid and prevent unnecessary API requests. Implement a validation function or use existing validation utilities to verify the domain format and handle invalid domains appropriately before proceeding.

coderabbitai · 2025-06-29T14:25:26Z

    check: async (domain: string) => {
      // metrics.increment("services.urlscan.domain.check");
-      const sanitizedDomain = await sanitizeDomain(domain);
+      const sanitizedDomain = sanitizeDomain(domain);


🛠️ Refactor suggestion

Correct removal of await, but validation is especially important for UrlScan.

The removal of await is correct since sanitizeDomain is synchronous. However, domain validation is particularly important for UrlScan since it performs actual web scanning, which is resource-intensive and should not be wasted on invalid domains.

Add domain validation before the expensive scanning operations:

+import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain"; check: async (domain: string) => { // metrics.increment("services.urlscan.domain.check"); const sanitizedDomain = sanitizeDomain(domain); + + try { + await validateDomain(sanitizedDomain); + } catch (error) { + if (error instanceof DomainValidationError) { + console.warn(`Skipping UrlScan check for invalid domain: ${sanitizedDomain}`, error.message); + return { error: "Invalid domain" }; + } + throw error; + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const sanitizedDomain = sanitizeDomain(domain);

// at the top of the file, next to the existing import of sanitizeDomain

import { sanitizeDomain } from "src/utils/sanitizeDomain";

import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain";

export const UrlScan = {

check: async (domain: string) => {

// metrics.increment("services.urlscan.domain.check");

const sanitizedDomain = sanitizeDomain(domain);

try {

await validateDomain(sanitizedDomain);

} catch (error) {

if (error instanceof DomainValidationError) {

console.warn(

`Skipping UrlScan check for invalid domain: ${sanitizedDomain}`,

error.message

);

return { error: "Invalid domain" };

}

throw error;

}

// ... existing scanning logic ...

},

};

🤖 Prompt for AI Agents

In src/services/UrlScan.ts at line 21, after sanitizing the domain with sanitizeDomain, add a validation step to check if the sanitized domain is valid before proceeding with any scanning operations. Implement a domain validation function or use an existing one to verify the domain format and reject or handle invalid domains early to avoid unnecessary resource usage during scanning.

Copilot AI review requested due to automatic review settings June 29, 2025 14:15

jaspermayone requested a review from a team as a code owner June 29, 2025 14:15

github-actions Bot added the bug label Jun 29, 2025

gemini-code-assist Bot reviewed Jun 29, 2025

View reviewed changes

Comment thread src/func/domain/parseData.ts

github-advanced-security AI found potential problems Jun 29, 2025

View reviewed changes

Copilot AI reviewed Jun 29, 2025

View reviewed changes

Potential fix for code scanning alert no. 193: Use of externally-cont…

c3de9af

…rolled format string Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Signed-off-by: Jasper Mayone <me@jaspermayone.com>

coderabbitai Bot reviewed Jun 29, 2025

View reviewed changes

Comment thread src/services/VirusTotal.ts

coderabbitai Bot reviewed Jun 29, 2025

View reviewed changes

jaspermayone merged commit 75e4d85 into main Jun 29, 2025
19 of 22 checks passed

jaspermayone deleted the claude/issue-37-20250629_140833 branch June 29, 2025 14:26

@@ -53,3 +53,3 @@
                     // Log the error for transparency, but don't throw to prevent crashes
-                    console.warn(`VirusTotal API error for domain "${domain}":`, error instanceof Error ? error.message : error);
+                    console.warn("VirusTotal API error for domain '%s':", domain, error instanceof Error ? error.message : error);
                     return null;

-      let sanitizedDomain = sanitizeDomain(domain);
+// at top of src/services/AbuseCh.ts, alongside your other imports
+import { validateDomain, DomainValidationError } from "src/utils/sanitizeDomain";
+export const AbuseChService = {
+  check: async (domain: string) => {
+    // metrics.increment("services.abusech.domain.check");
+    let sanitizedDomain = sanitizeDomain(domain);
+    try {
+      await validateDomain(sanitizedDomain);
+    } catch (error) {
+      if (error instanceof DomainValidationError) {
+        console.warn(
+          `Skipping AbuseCh check for invalid domain: ${sanitizedDomain}`,
+          error.message
+        );
+        return { error: "Invalid domain" };
+      }
+      throw error;
+    }
+    let response = await axios.get(
+      `https://urlhaus.abuse.ch/api/v1/host/${domain}`
+    );
+    // …rest of your logic…
+  },
+};

Conversation

jaspermayone commented Jun 29, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

claude Bot commented Jun 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review for VirusTotal Error Handling PR

📋 Review Summary

✅ Code Quality & Best Practices

🐛 Potential Issues

⚡ Performance Considerations

🔒 Security Assessment

🧪 Test Coverage Recommendations

🎯 Final Recommendations

🏆 Overall Assessment

Uh oh!

jaspermayone commented Jun 29, 2025

Uh oh!

coderabbitai Bot commented Jun 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Assessment against linked issues

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Footnotes

Uh oh!

claude Bot commented Jun 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Enhanced Domain Validation with DNS Lookup ✅

Changes Made:

How It Works:

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Check failure

Uh oh!

Copilot Autofix

Steps:

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Jun 29, 2025

Choose a reason for hiding this comment

Uh oh!

claude Bot commented Jun 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review for VirusTotal Error Handling PR

📋 Review Summary

✅ Code Quality & Best Practices

🐛 Potential Issues

⚡ Performance Considerations

🔒 Security Assessment

🧪 Test Coverage Recommendations

🎯 Final Recommendations

🏆 Overall Assessment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jaspermayone commented Jun 29, 2025 •

edited by coderabbitai Bot

Loading

claude Bot commented Jun 29, 2025 •

edited

Loading

coderabbitai Bot commented Jun 29, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

claude Bot commented Jun 29, 2025 •

edited

Loading

claude Bot commented Jun 29, 2025 •

edited

Loading