Better exception logging

Currently it's difficult to process exceptions in logger plugins. There are mostly two main issues:


### Exception details not provided in log event

The `ERROR` event seems to be intended to have more details about the exception (the planned [`ErrorEvent`](https://github.com/snakemake/snakemake-interface-logger-plugins/pull/32/changes#diff-d85e49da68413262c5da3e274a2e1bcce75f5a8f4a26a58df06530e993cbb16aR161) dataclass has multiple different attributes). Currently this is emitted from [`snakemake.exceptions.print_exception()`](https://github.com/snakemake/snakemake/blob/0fc5fd14bf8707812abac14ccfeac0eb8093005d/src/snakemake/exceptions.py#L171). This formats the  exception using `format_exception_to_string()` for the log message, but the only additional information attached is the name of the exception type. This makes it hard for plugins to do anything with the error other than display the preformatted message provided.

A quick and easy solution is to attach the exception instance to the log record for the `ERROR` event in the standard way, using the `exc_info` argument to the logging method. This makes everything accessible to loggers, including the source file and line number. The `linemaps` argument to `print_exception()` could also be included in the `extra` dict, so that loggers can set the correct line numbers in tracebacks.


### Multiple error events

If an error occurs during a job, two separate events are emitted: `ERROR` and `JOB_ERROR`. Details about the exception itself can be attached to `ERROR` because the function that creates the event is passed the exception instance. However it is not possible to know which job the error corresponds to. `JOB_ERROR` is the opposite, it indicates which job had an error but does not have the exception details attached to it.

Solving this might be a lot more complicated. `JOB_ERROR` is emitted from `Job.log_error()`, but there are multiple different code paths leading to it. Here's what I found from my analysis:

```mermaid
graph TD

ERROR["`ERROR`"]
JOB_ERROR["`JOB_ERROR`"]

subgraph executor_funcs ["Executors"]
	executor_callback["`local.Executor._callback(job_info)`"]
	touch_executor_run_job["`touch.Executor.run_job()`"]
	%%executor_callback ~~~ touch_executor_run_job
end

abstractexecutor_report_job_error["`AbstractExecutor.report_job_error(job_info, msg)`"]
abstractexecutor_print_job_error["`AbstractExecutor.print_job_error(job_info, msg)`"]

print_exception["`exceptions.print_exception(ex)`"]
job_log_error["`Job.log_error(msg)`"]
groupjob_log_error["`GroupJob.log_error(msg)`"]
jobscheduler_schedule["`JobScheduler.schedule()`"]

executor_funcs -->|ex| print_exception
executor_funcs -->|job_info| abstractexecutor_report_job_error

print_exception --> ERROR

abstractexecutor_report_job_error -->|job_info, msg| abstractexecutor_print_job_error
abstractexecutor_print_job_error -->|msg| job_log_error
groupjob_log_error -->|"()"| job_log_error --> JOB_ERROR
jobscheduler_schedule -.->|"()"| job_log_error & groupjob_log_error
jobscheduler_schedule --->|"()"| print_exception
```

Based on this, it might be possible to add an optional `job_id` or `job_ids` argument to `print_exception` that could be attached to the event, which could be provided in `local.Excutor._callback()`. There would still be two events, but it would be possible to match them with each other.

This doesn't cover the paths from `JobScheduler.schedule`

### Caveats

I don't know how any of this changes with remote executors.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better exception logging #53

Exception details not provided in log event

Multiple error events

Caveats

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Better exception logging #53

Description

Exception details not provided in log event

Multiple error events

Caveats

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions