Skip to content

Logging with new format and named logger#735

Open
gxuu wants to merge 5 commits intofinos:mainfrom
gxuu:apr-better-log
Open

Logging with new format and named logger#735
gxuu wants to merge 5 commits intofinos:mainfrom
gxuu:apr-better-log

Conversation

@gxuu
Copy link
Copy Markdown
Contributor

@gxuu gxuu commented Apr 17, 2026

READY FOR REVIEW.

LOGGING MIGRATION GUIDE FOR PRIVATE WORKER MANAGERS (FOR HUMAN):

  • Log format has been changed, no action is required if your worker managers running through worker manager entry point.
  • Global logging is no longer being used, please use a named logger instead. Consult orb worker manager for the usage.
  • If your adapter lives outside this repo, name your logger under the scaler tree so it inherits scaler's handlers:
    logger = logging.getLogger(f"scaler.external.{__name__}") — otherwise your logs will silently go nowhere.

HUMAN SHOULD STOP READING HERE

LOGGING MIGRATION GUIDE FOR PRIVATE WORKER MANAGERS (FOR AI):

Context

Two recent changes in opengris-scaler (PRs #691 and #653) affect every
worker-manager adapter that logs output. Internal / out-of-repo adapters
must be updated to match, or their log output will either (a) not appear,
or (b) drift from the unified scaler format.

What changed upstream

  1. Unified log format (Better logging #691) — all scaler processes emit:
    <timestamp> <level> <process_name>[<pid>]: <message>
    The <process_name> is set by setup_logger(..., process_name=...),
    called once in scaler.entry_points.scaler._run_worker_manager before
    your adapter's run() executes. You do NOT call setup_logger yourself.

  2. Named loggers, off the root logger (Scaler log output can be silently suppressed by third-party libraries that reconfigure the root logger #653) — scaler no longer
    configures the root logger. It configures the named logger "scaler"
    with propagate=False. Any logger whose dotted name starts with
    scaler. inherits scaler's handlers and format; anything outside that
    tree does not.

  3. ORB workaround removed. Calls like:
    setup_logger(self._logging_paths, self._logging_config_file, self._logging_level)
    inside __aenter__ (or anywhere inside an adapter) are now dead. The
    reason the workaround existed — third-party SDKs mutating root after
    scaler configured it — is no longer relevant because scaler no longer
    uses root.

Required changes for an internal worker-manager adapter

Step 1 — Add a module-level named logger

At the top of every .py file in the adapter that currently uses
logging.<method>(...) module-level calls, add:

import logging

logger = logging.getLogger(f"scaler.external.{__name__}")

The scaler.external. prefix places the logger under the scaler tree
so it inherits scaler's handlers, format, and level automatically. If
your adapter is already under the scaler. namespace (unlikely for
out-of-repo code), use logging.getLogger(__name__) directly.

Step 2 — Replace module-level logging calls

Mechanical substitution across every .py in the adapter:

logging.info(...)       → logger.info(...)
logging.warning(...)    → logger.warning(...)
logging.error(...)      → logger.error(...)
logging.exception(...)  → logger.exception(...)
logging.debug(...)      → logger.debug(...)

Leave import logging — it's still needed for the logging.getLogger
call and any logging.INFO-level constants.

Step 3 — Remove dead logging-setup code

Delete the following patterns wherever they appear in the adapter:

# In __init__ — remove these lines:
self._logging_paths       = config.logging_config.paths
self._logging_level       = config.logging_config.level
self._logging_config_file = config.logging_config.config_file
# In __aenter__ / run() / similar — remove:
setup_logger(self._logging_paths, self._logging_config_file, self._logging_level)
# Remove the import if now unused:
from scaler.utility.logging.utility import setup_logger

Step 4 — Do not call setup_logger from the adapter

_run_worker_manager in scaler.entry_points.scaler calls
setup_logger(..., process_name=config._tag) as the first statement of
the subprocess. Adapters MUST NOT call it again — doing so resets the
format with the wrong process_name.

Step 5 — Verify

Run the adapter and confirm:

  1. Log lines appear in the unified format
    <time> <level> <process_name>[<pid>]: <message>
    with <process_name> equal to config._tag.
  2. No warnings about "no handler attached to logger …".
  3. A third-party library calling logging.basicConfig(...) from within
    the subprocess does NOT change scaler's log output.

Verification checklist

  • Every .py in the adapter has logger = logging.getLogger("scaler.external." + __name__) (or equivalent scaler-prefixed name).
  • grep -rn 'logging\.\(info\|warning\|error\|exception\|debug\)(' <adapter> returns zero matches.
  • No setup_logger call remains in adapter code.
  • No self._logging_paths / _logging_level / _logging_config_file field storage in adapter classes.
  • Running the adapter under scaler_all produces log lines whose process-name prefix matches the configured _tag.

What NOT to change

  • Do NOT change logging.INFO, logging.WARNING, etc. constants — only the method calls.
  • Do NOT rename the logger variable — stick to this convention for grep-ability.
  • Do NOT attach handlers to your adapter's logger manually. The scaler
    logger already has them; propagation handles the rest.
  • Do NOT set logger.propagate = False on the adapter's logger. You want
    it to propagate up to the scaler parent logger.

Comment thread src/cpp/scaler/logging/logging.h
@gxuu gxuu changed the title Apr better log logging with new format and named logger Apr 17, 2026
@gxuu gxuu changed the title logging with new format and named logger Logging with new format and named logger Apr 17, 2026
@gxuu gxuu marked this pull request as ready for review April 17, 2026 22:05
from scaler.utility.exceptions import ClientCancelledException, ClientQuitException, ClientShutdownException
from scaler.utility.identifiers import ClientID

logger = logging.getLogger(__name__)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__name__ will varies based on different name, are you sure they will use same named logger "scaler" every time?

I think better solution to getLogger each time is define the logger as module variable in utility, then import it

e.g.:

from scaler.utility.logging.logger import logger

logger.info()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think __name__ is the name of the current module, so logging calls from processor.py will use the name processor for example

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharpener6,

@magniloquency is right. We want to print out components' name:

2026-04-20 09:13:34+0800 INFO scheduler[2340581]: VanillaInformationController: exited
2026-04-20 09:13:34+0800 INFO object_storage_server[2340577]: ObjectStorageServer: stopped by user

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This to be honest, I would like that logger to have some process_name passing to setup_logger, to show it's scheduler or object_storage_server, like

setup_logger(process_name="scheduler", logging_level=...)

all files that have logger = logging.getLogger(__name__) is not very elegant

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The existing implementation follows what official document recommends.
  • setup_logger is called for each process already and it must be called only once. This, and getLogger are orthogonal.

gxuu added 3 commits April 23, 2026 03:58
Change log output from `[INFO]2026-04-08 09:58:52-0400: ...` to
`2026-04-08 09:58:52-0400 INFO scheduler[23451]: ...` so operators can
identify which subprocess (scheduler / worker_manager / object_storage_server /
gui) emitted each line via the `<process_name>[<pid>]` prefix.

Python side: `setup_logger` now takes `process_name` and bakes it into the
formatter; all 6 CLI entry points pass their component name.

C++ side: `scaler::ymq::Logger` gains a `name` constructor arg and supports
`%(name)s` / `%(process)d` tokens so in-house C++ daemons (ymq, object_storage)
emit the same format.

Signed-off-by: gxu <georgexu420@163.com>
Library code should not reconfigure the root logger — host applications
(or third-party SDKs such as ORB) may configure root independently and
would silently override or suppress scaler output.

Key changes:
- `scaler/__init__.py` attaches a `NullHandler` to the "scaler" logger so
  library imports (e.g. `from scaler import Client`) don't emit "no handler"
  warnings or touch root.
- `setup_logger` now configures the "scaler" logger with `propagate=False`
  (instead of root). Daemon entry points still get consistent formatting;
  submodule loggers inherit via the `scaler.*` namespace.
- Every module uses `logger = logging.getLogger(__name__)` and emits through
  that logger. `logging.<method>()` module-level calls were all migrated.
- `get_logger_info(logging.getLogger("scaler"))` reads scaler's effective
  format/level/paths for the C++ object storage server.
- Removed the ORB "setup_logger after __aenter__" workaround: ORB can no
  longer clobber scaler's logging now that we no longer share the root
  logger.

Signed-off-by: gxu <georgexu420@163.com>
Signed-off-by: gxu <georgexu420@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants