[pull] master from Netflix:master #106


Open: wants to merge 663 commits into master from Netflix:master
Conversation

pull[bot] commented May 4, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

okteto-cloud bot commented May 4, 2023

Your preview environment savingoyal-metaflow-pr-106 failed to deploy.

pull[bot] added the ⤵️ pull label May 4, 2023

Code scanning (CodeQL) flagged two Medium-severity "DOM text reinterpreted as HTML" warnings: DOM text is reinterpreted as HTML without escaping meta-characters.

```js
/*
 * Setup a new card and start the watcher
 */
function handleLoadCard(name, value) {
  // flagged: unescaped DOM text assigned to innerHTML
  document.getElementsByTagName("title").innerHTML = name;
  const iframe = document.getElementById("card-frame");
  iframe.src = `/card/${value}?embed=true`;
}
```
valayDave and others added 27 commits October 25, 2024 09:13
* Emit graph info to event logger, add runtime start metric

* Remove graph_info logging from resume

* Simplify run and resume metrics

* Remove logger update context

* Refactor monitor and logger to not use update_context

* Address comments
* use the correct path

* suggested changes
* better error message with dump

* only print on final failure
* Move DeployedFlow and TriggeredRun to a more inheritance-based model.

The _enrich_* methods were making it very hard to generate proper stubs.
This will hopefully improve that.

* Fix circular dependency and address comments

Some doc cleanup as well.

* Removed extraneous code

* Forgot file

* Update NBDeployer to forward to underlying deployer

* Fix circular deps and other crap

* Added stub-gen for deployer (try 2)

* Fix stub test

* Fix dataclients

* fix docstrings for deployer (#2119)

* Renamed metadata directory to metadata_provider (and related changes)

There was a clash between the `metadata` function and the `metadata` module,
which caused issues with stubs (at least with the new way of generating them).

* Forgot one

* Typo

* Make nbdocs happy

* add docstrings for injected methods

---------

Co-authored-by: Romain Cledat <[email protected]>
* trigger_on_finish sorta works

* trigger deco works for event

* trigger events changes

* run black

* ok this one ran black

* black ran for real

* deleting some things i missed

* add format_deploytime_value() to both decos

* fixes error with types

* ran black

* fixes failing cases

* remove json.loads and add tuple to param type

* ran black

* function within parameter

* Delete local config file

* fixing borked cases

* refactor code

* shouldn't be changes to flowspec.py

* fixing bugs

* add deploy time trigger inits to argo parameter handling (#2146)

* run black

* undo modifications to to_pod

* reset util file

* pr comment

* remove print

---------

Co-authored-by: kayla seeley <[email protected]>
Co-authored-by: Sakari Ikonen <[email protected]>
* Add an async sigint handler

* Improve async runner interface

* Fix error handling in *handle_timeout

* Fix async_read_from_file_when_ready

* Simplify async_kill_processes_and_descendants

* Group kill_process_and_descendants into single command

* Call sigint manager only for live processes

* Guard against empty pid list in sigint handler

* Cleanup comments

* Reimplement CommandManager.kill in terms of kill_processes_and_descendants

* Use fifo for attribute file

* Fix `temporary_fifo` type annotation

* Fix `temporary_fifo` type annotation on py37 and py38

* Fix rewriting the current run metadata

* Wrap execs in `async_kill_processes_and_descendants` in try/except
* remove bad append

* Fix issue with to_pod and deploy time fields

---------

Co-authored-by: kayla seeley <[email protected]>
Co-authored-by: KaylaSeeley <[email protected]>
Co-authored-by: KaylaSeeley <[email protected]>
Co-authored-by: Sakari Ikonen <[email protected]>
Co-authored-by: Romain Cledat <[email protected]>
madhur-ob and others added 30 commits June 10, 2025 10:55
S3-based tests on MinIO will require the label `ok-to-test` or
`approved`.
1. Added tests
2. Fixed invalid stub generation:

```
    def __init__(self, func: typing.Callable[[<class 'metaflow_extensions.nflx.plugins.datatools.dataframe.MetaflowDataFrame'>[T_MDF_co], typing.Optional[<class 'metaflow_extensions.nflx.plugins.functions.core.function_parameters.FunctionParameters'>[T_FP_co]]], <class 'metaflow_extensions.nflx.plugins.datatools.dataframe.MetaflowDataFrame'>[T_MDF_co]], **kwargs: typing.Dict[str, typing.Any]):
```

to:

```
    def __init__(self, func: typing.Callable[[metaflow.plugins.datatools.dataframe.MetaflowDataFrame[T_MDF_co], typing.Optional[metaflow.mf_extensions.nflx.plugins.functions.core.function_parameters.FunctionParameters[T_FP_co]]], metaflow.plugins.datatools.dataframe.MetaflowDataFrame[T_MDF_co]], **kwargs: typing.Dict[str, typing.Any]):

```

---------

Co-authored-by: Nissan Pow <[email protected]>
- In some cases (if you called two commands, like create and trigger,
with a Runner/Deployer on the same flow), the flow/step mutators would be
applied *twice*, which could lead to weird errors (duplicate decorators).
- Using a None default for configs failed (#2449).
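The double-application issue above can be illustrated with a minimal sketch, assuming an idempotency guard (the names `apply_mutator_once` and `add_retry` are illustrative, not Metaflow's actual API): remembering which (flow, mutator) pairs have already run makes a second command a no-op.

```python
# Hypothetical sketch: making mutator application idempotent so that
# invoking two commands (e.g. create then trigger) on the same flow
# does not apply the same mutator twice.
_applied = set()

def apply_mutator_once(flow_name, mutator):
    """Apply `mutator` to the flow at most once per (flow, mutator) pair."""
    key = (flow_name, mutator.__name__)
    if key in _applied:
        return False  # already applied; skip to avoid duplicate decorators
    _applied.add(key)
    mutator(flow_name)
    return True

def add_retry(flow_name):
    pass  # stand-in for a real flow mutator

# The first command applies the mutator; the second is a no-op.
assert apply_mutator_once("MyFlow", add_retry) is True
assert apply_mutator_once("MyFlow", add_retry) is False
```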
This commit adds support for specifying Kubernetes container image pull
secrets in the `kubernetes` step decorator.

As an example:

```python
@kubernetes(
    image='docker.io/some-private-repo/image',
    image_pull_secrets=['regcred']
)
@step
def some_step(self):
    ...
```

Example output from `kubectl describe pod <some-pod>` on the
`metaflow-dev` stack where `metaflow` was installed via `pip install -e
</path/to/repo/checkout>`:

```console
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  1m   default-scheduler  Successfully assigned default/t-f03abf3d-2rpgp-j72q8 to minikube
  Normal  Pulling    1m   kubelet            Pulling image "docker.io/some-private-repo/image"
  Normal  Pulled     1m   kubelet            Successfully pulled image "docker.io/some-private-repo/image" in 669ms (670ms including waiting)
  Normal  Created    1m   kubelet            Created container some-step
  Normal  Started    1m   kubelet            Started container some-step
```

Resolves #1261
Earlier, the `save_card` function ended up saving cards at two locations,
doubling the number of cards. (This was a temporary fix that we never
ended up rolling back.)

> Why did we do this?

When cards were introduced, an assumption was made that task-ids are
unique. This assumption was incorrect: only the pathspec is unique, and
there are no such guarantees about task-ids. When task-ids are
non-unique, a card read could find incorrect cards (because cards were
stored keyed on task-ids). If we had immediately switched from storing
cards by `task-id` to `step-name/task-id`, card reading would have
crashed for many users, especially users accessing cards created by a
newer Metaflow client from an older client version. It would also easily
break the metaflow-ui (which may be using a client from an older
version). But since it has now been over 3 years and multiple newer
client versions have been rolled out, we can safely remove this, since
most cards will have been created by clients from the last 3 years. This
change keeps the listing/reading path intact and only removes the
dual-write part.
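The uniqueness problem described above can be sketched as follows (the storage keys are hypothetical, not the actual card datastore layout): keying cards by task-id alone collides across steps, while keying by step-name/task-id stays unique.

```python
# Hypothetical sketch of why task-id alone is not a safe storage key:
# two tasks in different steps can share the same task-id, while the
# step-name/task-id combination (part of the pathspec) stays unique.
cards_by_task_id = {}
cards_by_step_and_task = {}

def save_card(step_name, task_id, card):
    cards_by_task_id[task_id] = card                     # old scheme: collides
    cards_by_step_and_task[(step_name, task_id)] = card  # new scheme: unique

save_card("start", "42", "card-for-start")
save_card("end", "42", "card-for-end")  # same task-id, different step

# The old scheme silently overwrote the first card:
assert cards_by_task_id["42"] == "card-for-end"
# The step-qualified scheme keeps both:
assert cards_by_step_and_task[("start", "42")] == "card-for-start"
assert cards_by_step_and_task[("end", "42")] == "card-for-end"
```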
Adds a top-level option to toggle force-rebuilding the environment:

```
python HelloConda.py --environment conda --force-rebuild-environment run
```

The introduction of the new option is not mandatory; we could simply
proceed with an internal envvar to enable the functionality for now.
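The envvar-gated alternative mentioned above could look like this minimal sketch (the envvar name `METAFLOW_FORCE_REBUILD_ENVIRONMENT` is an assumption for illustration, not necessarily what the code uses):

```python
import os

# Hypothetical toggle: either a CLI flag or an internal envvar can force
# a rebuild. The envvar name below is illustrative, not confirmed.
def force_rebuild_enabled(cli_flag=False):
    return cli_flag or os.environ.get("METAFLOW_FORCE_REBUILD_ENVIRONMENT", "0") == "1"

assert force_rebuild_enabled(cli_flag=True) is True
os.environ["METAFLOW_FORCE_REBUILD_ENVIRONMENT"] = "1"
assert force_rebuild_enabled() is True
os.environ["METAFLOW_FORCE_REBUILD_ENVIRONMENT"] = "0"
assert force_rebuild_enabled() is False
```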
When injecting commands into the runner, the generated stubs were not
accurate or complete. This commit addresses that.
- Cards can now carry metadata about important information created for
them, like ids, types, and other info from decorators.
- This also allows us to bookkeep the attempts as cards are created.
- We write the metadata in the CLI, so we put something stateful on top
to ensure we don't end up writing the metadata multiple times.
- Overall, since cards are written asynchronously, we can only ensure
best-effort registration on the Metaflow side, since we don't
necessarily block when we save cards.
- This commit saves metadata on a per-card basis instead of updating all
card metadata at once; otherwise we would end up with multiple copies of
the metadata in the database.
- The metadata writing is optional and can be disabled via a config
variable.
- This addition also allows us to update the metaflow-service code for
filtering cards on a per-attempt basis.

---------

Co-authored-by: Sakari Ikonen <[email protected]>
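The "something stateful on top" mentioned above can be sketched as a write-once guard, assuming a per-(pathspec, attempt, card) registration key (the helper name and key shape are hypothetical, not the actual implementation):

```python
# Hypothetical write-once guard for card metadata registration. Because
# the CLI may run more than once per card, we track which
# (pathspec, attempt, card_id) triples were already registered.
_registered = set()

def register_card_metadata(pathspec, attempt, card_id, write):
    key = (pathspec, attempt, card_id)
    if key in _registered:
        return False  # metadata already written for this card/attempt
    _registered.add(key)
    write(key)
    return True

written = []
assert register_card_metadata("Flow/run/step/1", 0, "card0", written.append) is True
assert register_card_metadata("Flow/run/step/1", 0, "card0", written.append) is False
assert written == [("Flow/run/step/1", 0, "card0")]
```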
There is an error when trying to retrieve the IMDSv2 token because the
header value is an `int`. This causes all the requests to fall back to
IMDSv1.
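The failure mode is that HTTP libraries reject non-string header values, so the IMDSv2 token request raises and the code falls back to IMDSv1. A minimal sketch of the fix (the header name is the real IMDS TTL header; the helper function is illustrative):

```python
# IMDSv2 token requests send a TTL header; passing the TTL as an int
# makes HTTP libraries reject the header value. Coercing to str fixes it.
def imds_token_headers(ttl_seconds=21600):
    # Header values must be strings (or bytes), never ints.
    return {"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)}

headers = imds_token_headers(21600)
assert headers["X-aws-ec2-metadata-token-ttl-seconds"] == "21600"
assert all(isinstance(v, str) for v in headers.values())
```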
Reverts #2479

---------

Co-authored-by: Valay Dave <[email protected]>
Implements support for running custom functions as exit hooks on the
Argo Workflows, local, Kubernetes, and Batch runtimes.

Also refactors the current lifecycle hook setup to use the same primitives.

Suggested UX for defining custom exit hooks in the flow file:

```python
from time import sleep
from typing import Optional
from metaflow import step, FlowSpec, Parameter, exit_hook, Run


def success_print():
    print("Flow completed successfully. Performing some success related notifications.")


def failure_print(run: Optional[Run]):
    print("Flow failed.")

    if run is None:
        print("No run was registered.")
        return

    print("failed run id:", run.id)
    print(
        "Failed tasks:\n",
        "\n".join(
            [task.pathspec for step in run for task in step if not task.successful]
        ),
    )


@exit_hook(on_success=[success_print])
@exit_hook(on_error=[failure_print])
class NotifyFlow(FlowSpec):
    should_fail = Parameter(name="should_fail", default=False)

    @step
    def start(self):
        print("Starting 👋")
        print("Should fail?", self.should_fail)
        if self.should_fail:
            sleep(60)
            raise Exception("failing as expected")
        self.next(self.end)

    @step
    def end(self):
        print("Done! 🏁")


if __name__ == "__main__":
    NotifyFlow()
```

---------

Co-authored-by: Madhur Tandon <[email protected]>
Co-authored-by: madhur-ob <[email protected]>
Introduces a `get_secrets` function that exposes the same capabilities
as the `@secrets` decorator, but returns the secrets as a dictionary
instead of setting environment variables.

The use case is to make accessing secrets possible in user code as well.

example:

```python
@step
def start(self):
    secrets_dict = get_secrets(sources=["secret_source"], role="IAM_role_for_accessing")
    
    for secret_spec, contents_dict in secrets_dict.items():
         pass # do something with the fetched secrets
```
Changes the secret fetching function to a more comfortable UX:

```python
from metaflow.plugins.secrets import get_secret

secret_dict = get_secret(source="some-secret-source")
```
Increases the resource request for Argo exit hook pods.
Reapplies the new packaging changes for releasing.

Co-authored-by: Romain <[email protected]>