New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48059][CORE] Implement the structured log framework on the java side #46301
Conversation
@gengliangwang |
We currently support three ways to write logs:
After this pr, we can prohibit
|
@panbingkun Thanks for bringing this up.
and found no log4j usages. Is this for external systems? If yes, the external system should be able to use the current Logging trait as well, right? |
@gengliangwang spark/core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java Lines 285 to 290 in 9caa6f7
spark/connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java Line 336 in 9caa6f7
|
@panbingkun I see. There are about 88 loggings with variables
Since all the logs are using |
I think this is a good suggestion. Let me try it. |
} | ||
} | ||
|
||
public void error(String msg, MDC... mdcs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although the following design is shown in the document,
But I think the parameter void error(String msg, MDC... mdcs)
is more in line with the style of Java? WDYT @gengliangwang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way is fine to me
@gengliangwang |
if (mdcs == null || mdcs.length == 0) { | ||
slf4jLogger.warn(msg); | ||
} else { | ||
withLogContext(msg, mdcs, null, mt -> slf4jLogger.warn(mt.message)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check the log level before calling withLogContext
. For example, if the log level is ERROR, we don't need to enter the method withLogContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
import org.apache.spark.internal.LogKeys; | ||
import org.apache.spark.internal.MDC; | ||
|
||
public abstract class LoggerSuiteBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ideally, we can have a followup to reduce the duplicated test code.
Thanks, merging to master |
Thank you very much. |
@panbingkun Awesome, thanks for the work! |
…a side ### What changes were proposed in this pull request? The pr aims to implement the structured log framework on the `java side`. ### Why are the changes needed? Currently, the structured log framework on the `scala side` is basically available, but the`Spark Core` code also includes some `Java code`, which also needs to be connected to the structured log framework. ### Does this PR introduce _any_ user-facing change? Yes, only for developers. ### How was this patch tested? - Add some new UT. - Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#46301 from panbingkun/structured_logger_java. Authored-by: panbingkun <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
### What changes were proposed in this pull request? Two new classes `org.apache.spark.internal.Logger` and `org.apache.spark.internal.LoggerFactory` were introduced from #46301. Given that Logger is a widely recognized **interface** in Log4j, it may lead to confusion to have a class with the same name. To avoid this and clarify its purpose within the Spark framework, I propose renaming `org.apache.spark.internal.Logger` to `org.apache.spark.internal.SparkLogger`. Similarly, to maintain consistency, `org.apache.spark.internal.LoggerFactory` should be renamed to `org.apache.spark.internal.SparkLoggerFactory`. ### Why are the changes needed? To avoid naming confusion and clarify the java Spark logger purpose within the logging framework ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? GA tests ### Was this patch authored or co-authored using generative AI tooling? No Closes #46600 from gengliangwang/refactorLogger. Authored-by: Gengliang Wang <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
What changes were proposed in this pull request?
The pr aims to implement the structured log framework on the
java side
.Why are the changes needed?
Currently, the structured log framework on the
scala side
is basically available, but theSpark Core
code also includes someJava code
, which also needs to be connected to the structured log framework.Does this PR introduce any user-facing change?
Yes, only for developers.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.