[Feature][Error Code] Error Code Module Refactoring #4356

Open
1 of 2 tasks
casionone opened this issue Mar 10, 2023 · 2 comments

@casionone
Contributor

Search before asking

  • I have searched the issues and found no similar feature request.

Problem Description

Current issues:

  • Historical error code values were assigned fairly arbitrarily: they are hard to read and some are duplicated. A specification needs to be defined, and the existing codes need to be cleaned up.
  • When an RPC call chain is long, the root exception is lost. Locating a problem then means checking layer by layer, which is inconvenient. Since tracing has not been introduced, we need to consider how to make it easy to locate problems from logs and to pinpoint exceptions across multi-level calls.
  • Calling relationships among WDS ecosystem components will become more complex. Because Linkis is the foundation for these components, we need to consider how to provide a more general-purpose error code module.
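
To make the duplication problem concrete, here is a minimal sketch of a check that could run in a unit test and flag error code values reused across modules. Everything in it is an assumption for illustration: the `ErrorCode` interface, the two enums, and the numeric values are hypothetical and are not existing Linkis classes or codes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ErrorCodeRegistrySketch {

    /** Hypothetical contract for an error code; not an existing Linkis interface. */
    interface ErrorCode {
        int code();
        String message();
    }

    /** Illustrative codes for one module; names and values are invented. */
    enum GatewayErrorCode implements ErrorCode {
        INVALID_USER_INPUT(10001, "User input failed validation"),
        PERMISSION_DENIED(10002, "User lacks permission");

        private final int code;
        private final String message;

        GatewayErrorCode(int code, String message) {
            this.code = code;
            this.message = message;
        }

        @Override public int code() { return code; }
        @Override public String message() { return message; }
    }

    /** A second module that (deliberately) reuses 10002 to show detection. */
    enum EngineErrorCode implements ErrorCode {
        ENGINE_START_FAILED(20001, "Engine failed to start"),
        CLASHING_CODE(10002, "Reuses a value already taken by another module");

        private final int code;
        private final String message;

        EngineErrorCode(int code, String message) {
            this.code = code;
            this.message = message;
        }

        @Override public int code() { return code; }
        @Override public String message() { return message; }
    }

    /** Returns one entry per numeric code that is defined more than once. */
    static List<String> findDuplicates(ErrorCode[]... groups) {
        Map<Integer, ErrorCode> seen = new HashMap<>();
        List<String> clashes = new ArrayList<>();
        for (ErrorCode[] group : groups) {
            for (ErrorCode ec : group) {
                ErrorCode previous = seen.putIfAbsent(ec.code(), ec);
                if (previous != null) {
                    clashes.add(ec.code() + ": " + previous + " vs " + ec);
                }
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        findDuplicates(GatewayErrorCode.values(), EngineErrorCode.values())
                .forEach(System.out::println);
    }
}
```

A check like this only catches clashes among the enums passed to it; how code ranges would be allocated per component is exactly what the specification still has to decide.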

Description

Goals

  • Define an error code specification that covers the usage scenarios of WDS ecosystem components, together with a general-purpose, reusable error code module that other components can consume as a jar package.
  • Make it possible to clearly identify the service where the root cause occurred from the exception information returned by an interface.
  • Supplement exception information with a service label that identifies the root component and service that threw the exception, such as DSS-ProjectServer(ip:host), Linkis-SparkEC(ip:host), Hadoop-HDFS, or Spark-Job(app_1111-job1), and with an error type label, such as user initialization error, user input validation error, user permission error, DSS service exception, Linkis component exception, underlying compute/storage exception, or exception caused by a change in progress.
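
As a rough illustration of the service label and error type label idea, the sketch below shows an exception that carries both labels and keeps the root labels when an upper layer wraps it. All names here (`LabeledException`, `ErrorType`, `wrap`, the numeric codes) are hypothetical and are not part of Linkis or DSS. The sketch also wraps exceptions in-process; across a real RPC boundary the labels would have to be carried in the response payload, which is not shown.

```java
final class LabeledErrorSketch {

    /** Error type labels taken from the list in this issue; the enum itself is hypothetical. */
    enum ErrorType {
        USER_INIT, USER_INPUT, USER_PERMISSION, DSS_SERVICE,
        LINKIS_COMPONENT, UNDERLYING_COMPUTE_STORAGE, CHANGE_INDUCED
    }

    /** Hypothetical exception carrying a code, a service label and an error type label. */
    static final class LabeledException extends Exception {
        private final int code;
        private final String serviceLabel;
        private final ErrorType type;

        LabeledException(int code, String serviceLabel, ErrorType type,
                         String message, Throwable cause) {
            super(message, cause);
            this.code = code;
            this.serviceLabel = serviceLabel;
            this.type = type;
        }

        int code() { return code; }
        String serviceLabel() { return serviceLabel; }
        ErrorType type() { return type; }

        /**
         * Wraps a downstream failure. If the cause is already labeled, its code and
         * labels are kept so the root component stays visible at every layer;
         * otherwise the caller's own (fallback) labels are used.
         */
        static LabeledException wrap(String message, Throwable cause, int fallbackCode,
                                     String fallbackService, ErrorType fallbackType) {
            if (cause instanceof LabeledException) {
                LabeledException root = (LabeledException) cause;
                return new LabeledException(root.code, root.serviceLabel, root.type, message, cause);
            }
            return new LabeledException(fallbackCode, fallbackService, fallbackType, message, cause);
        }

        /** One-line form suitable for an interface response or a log line. */
        String summary() {
            return "[" + code + "][" + type + "][" + serviceLabel + "] " + getMessage();
        }
    }

    public static void main(String[] args) {
        // Root failure raised in the engine layer (code value is invented).
        LabeledException root = new LabeledException(30001, "Linkis-SparkEC(ip:host)",
                ErrorType.UNDERLYING_COMPUTE_STORAGE, "Spark job failed", null);
        // An upper layer wraps it; the root labels survive.
        LabeledException top = LabeledException.wrap("Workflow node failed", root,
                40001, "DSS-ProjectServer(ip:host)", ErrorType.DSS_SERVICE);
        System.out.println(top.summary());
    }
}
```

With this shape, the caller of the outermost interface sees the label of the service that originally failed rather than the label of the last service that rethrew, which is the behaviour the second goal asks for.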

Use case

No response

Solutions

No response

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@github-actions

😊 Welcome to the Apache Linkis community!!

We are glad that you are contributing by opening this issue.

Please make sure to include all the relevant context.
We will be here shortly.

If you are interested in contributing to our website project, please let us know!
You can check out our contributing guide on
👉 How to Participate in Project Contribution.

Community

Mailing Lists

Name Description Subscribe Unsubscribe Archive
[email protected] community activity information subscribe unsubscribe archive
