This repository was archived by the owner on Jan 29, 2026. It is now read-only.
[don't merge yet] Fix package names and travis config
Update instructions with Cloud Object Storage
* arch diag * arch diag * Update README.md * adding specs * adding specs
* update none VM_TYPE * polish export commands
* Extend make test-submit waiting time
…BM#22) * assign default edit role to lcm * add helm value options for 1.7 and below
* adding prereqs and bumping user guide in front * adding prereqs and bumping user guide in front
…ors) (IBM#24) * add caffe2 and pytorch cpu support * update LCM, learner config file, and example jobs * fix pytorch example bug * Update gpu-guide.md * Update gpu-guide.md * merge CPU and GPU examples into a single example * add more tf framework versions * fix typo * add S3 prereq
* update UI instructions * fix command
* adding contributors * Update README.md
* Updating maintainers file * Update MAINTAINERS.md
* add converting script * update converter readme and update tensorflow version * update troubleshooting * Update README.md * Update gpu-guide.md * Update README.md * Update README.md * Update README.md
* Adding references to Watson Studio * Update README.md * Rename README.md to ffdl-wml.md * Update README.md * Create train-deploy-wml.md * Update train-deploy-wml.md * Update README.md * Update ffdl-wml.md * Update ffdl-wml.md * Update ffdl-wml.md * Update README.md * Update README.md * update WML instructions * revert tf example * update caffe manifest
* Update feature-gates for k8s 1.9.4 and above * Update troubleshooting * Update README.md
…ild (IBM#51) * * Add codebase configuration for device plugin and custom learner images * Add developer guide for those who want to do a custom FfDL build * update developer-guide * fix declare type
Plus minor fixes
* Creating CLA * Update CLA.md * Update CONTRIBUTING.md
* Update ART Notebook after PR IBM#79 - Load cluster configuration from environment variables - Require PUBLIC_IP and KUBECONFIG instead of CLUSTER_NAME and VM_TYPE - Use storage type "mount_cos" (s3fs) instead of "s3_datastore" * Update ART demo notebook after PR IBM#79 - Load cluster configuration from environment variables - Require PUBLIC_IP and KUBECONFIG instead of CLUSTER_NAME and VM_TYPE - Use storage type "mount_cos" (s3fs) instead of "s3_datastore"
* update dl framework versions * update examples with new framework tags
* update fashion mnist example with seldon 0.2 * fix readme
* Pointed travis testing to do hostmount minikube * Debugging permissions error. * Fix to mkdir problems. * Fixed Makefile syntax. * Printing debugging information about pods. * Printing debugging information about pods. * Printing debugging information about pods. * Printing debugging information incl kubectl get pod. * Enabled debug mode. * Again. * Set debug as default. * tracing from the trainer to lcm * more debugging * added lower level logging * dist: xenial * Update .travis.yml * fix typo * Trying to fix Travis issue. * Fixed Travis issue. * Followed Tommy's request and increased resource limits to values from before. Might break CI. * Parameterized memory values like Tommy requested. * Attempt to fix CI. * Removed excessive debug statements and cleaned comments. Probably breaks code. * DLaaS pull june 14, with security mods * fixed glide problem * Added Image.go etc. files, deleted learner_test.go * temporarily disable framework validation * FIXME: Disable validation check for bucket until conditionalize for s3fs vs. option. * fixed two bugs related to volume mounting * I think mostly just logging changes * basic success * Add FfDL.iml to .gitignore * removed docker ref to csf_env.properties * Test for mount_cos before attempting s3 validation * fixed hostmount by pre-setup of model code in Makefile * fixed missing import * log HELM_DEPLOY_DIR, add a bunch of logging for the ci test * Added create-volumes to jenkins file, more verbose docker build for ui * Wound back Angular to 6.0.8 * Quiet docker-build-ui docker build * merged bin/create_static_volumes_config2.sh into bin/create_static_volumes_config.sh
* update prebuild image version, update helm chart to 0.1.1 * fix make deploy bug
…ce (IBM#110) * make helm charts and scripts compatible to deploy FfDL on any namespace * allow users to export all the environment variables in a txt file * Update readme with new notice * Fix typo * Update static volumes config v2 namespace parameter * capitalize NAMESPACE, update Makefile, developer guide, and troubleshooting.
LGTM. Ran fine / fixed statsd issue on Ubuntu 18.04 Vagrant VM.
* Simplifying README * Simplifying README * Create detailed-installation-instructions.md * Update README.md * Update detailed-installation-instructions.md * Update and rename detailed-installation-instructions.md to detailed-installation-guide.md * Update detailed-installation-guide.md * Update detailed-installation-guide.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md
* added pytorch distributed example draft * use pytorch community image * added experimental pytorch example * update launch example * add onnx example * update c10d dist example * update cuda method for gpu and group definition for c10d * update c10d to based on SEP 11 build on PyTorch master. * update custom pytorch version name * update data parallelism examples * update distributed training examples * update pytorch c10d examples * delete dummy file * fix minor bugs * update example to sep 25 build * update readme * update changes for distributed CPU job * update multi-gpu code * update multi-gpu code * update world_size for multi-gpu scenario * update world_size for multi-gpu scenario * add seldon ngraph example, update data parallelism with multi gpu * remove unnecessary device code * added pytorch mpi core changes and seldon readme * added pytorch mpi core changes and seldon readme * update readme * update example readme and remove old distributed example * update c10d-parallelism example naming and readme. * update readme with better naming and consistency
* Create PyTorch.md * Update PyTorch.md * Update PyTorch.md * Update PyTorch.md * Update PyTorch.md * arch-image * Update PyTorch.md * Adding temporary linkage to PyTorch 1.0 * Fix broken link for Seldon example * Correcting Horovod naming
* Travis CI: lint Python for syntax errors and undefined names

In Travis CI, add a Python linting step that runs [flake8](http://flake8.pycqa.org) to find syntax errors and undefined names.

[flake8](http://flake8.pycqa.org) testing of https://github.com/IBM/FfDL on Python 3.7.0

$ __flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics__
```
./etc/examples/c10d-onnx-mpi/model-files/train_dist_onnx_mpi.py:93:55: F821 undefined name 'bsz'
    num_batches = ceil(len(train_set.dataset) / float(bsz))
                                                      ^
./etc/examples/c10d-dist-onnx/model-files/train_dist_onnx.py:121:14: E999 SyntaxError: positional argument follows keyword argument
    ', epoch ', epoch, '. avg_loss: ',
             ^
1     E999 SyntaxError: positional argument follows keyword argument
1     F821 undefined name 'bsz'
2
```

__E901,E999,F821,F822,F823__ are the "_showstopper_" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. Most other flake8 issues are merely "style violations" -- useful for readability but they do not affect runtime safety.
* F821: undefined name `name`
* F822: undefined name `name` in `__all__`
* F823: local variable name referenced before assignment
* E901: SyntaxError or IndentationError
* E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

* Undefined name: bsz --> batch_size
* Fix syntax error: print() does not accept a 'rank=' parameter
* update converter bx commands to ibmcloud * update converter bx commands to ibmcloud
* Updated setup script to K8s 1.13. * Modified Makefile
Total -- 4,710.69kb -> 3,241.39kb (31.19%)

/dashboard/src/assets/img/ffdl-blue.png -- 6.10kb -> 3.07kb (49.66%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/ffdl-blue.png -- 6.10kb -> 3.07kb (49.66%)
/dashboard/src/assets/img/ffdl-text.png -- 10.97kb -> 5.73kb (47.73%)
/dashboard/src/assets/img/ffdl.png -- 6.04kb -> 3.31kb (45.14%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/p1.png -- 488.27kb -> 275.07kb (43.67%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/ui-example.png -- 80.93kb -> 47.47kb (41.35%)
/docs/images/ui-example.png -- 80.93kb -> 47.47kb (41.35%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/ffdl-fashion.png -- 301.63kb -> 177.72kb (41.08%)
/docs/images/ffdl-architecture.png -- 529.86kb -> 314.10kb (40.72%)
/docs/images/ffdl-pattern-arch.png -- 413.11kb -> 246.18kb (40.41%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/fashion-arch.png -- 444.84kb -> 266.69kb (40.05%)
/docs/images/horovod.png -- 73.13kb -> 44.93kb (38.56%)
/demos/fashion-mnist-adversarial/images/ffdl-art-jupyter.png -- 258.40kb -> 160.20kb (38%)
/community/FfDL-H2Oai/images/ffdl-h203.png -- 246.33kb -> 171.45kb (30.4%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/1.jpg -- 308.56kb -> 221.44kb (28.23%)
/etc/examples/images/pytorch-ffdl-onnx.png -- 277.25kb -> 199.65kb (27.99%)
/demos/fashion-mnist-training/sample-test-data/sneaker.jpg -- 29.37kb -> 24.93kb (15.13%)
/demos/fashion-mnist-training/sample-test-data/trouser.jpg -- 26.84kb -> 22.79kb (15.08%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/5.jpg -- 42.41kb -> 36.41kb (14.15%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/4.jpg -- 29.22kb -> 25.33kb (13.3%)
/demos/fashion-mnist-training/sample-test-data/sandal3.jpg -- 40.43kb -> 35.15kb (13.04%)
/docs/images/ffdl-arch-web.png -- 191.11kb -> 166.52kb (12.87%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/7.jpg -- 34.31kb -> 30.11kb (12.25%)
/demos/fashion-mnist-adversarial/images/adv_sample_predictions.png -- 142.53kb -> 125.72kb (11.79%)
/demos/fashion-mnist-training/sample-test-data/coatwhite.jpg -- 34.56kb -> 30.51kb (11.73%)
/demos/fashion-mnist-training/sample-test-data/sneakerbrown.jpg -- 46.75kb -> 41.43kb (11.37%)
/demos/fashion-mnist-training/sample-test-data/dress2.jpg -- 46.95kb -> 41.65kb (11.3%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/0.jpg -- 37.63kb -> 33.53kb (10.9%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/8.jpg -- 81.46kb -> 72.65kb (10.82%)
/demos/fashion-mnist-training/sample-test-data/trouser2.jpg -- 29.28kb -> 26.11kb (10.82%)
/demos/fashion-mnist-training/sample-test-data/redtshirt.jpg -- 34.92kb -> 31.46kb (9.91%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/3.png -- 2.68kb -> 2.44kb (8.98%)
/demos/fashion-mnist-training/sample-test-data/boot2.jpg -- 45.36kb -> 41.31kb (8.93%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/9.jpg -- 76.05kb -> 69.67kb (8.39%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/images/6.jpg -- 46.66kb -> 42.80kb (8.28%)
/demos/fashion-mnist-adversarial/images/ffdl.png -- 15.40kb -> 14.30kb (7.18%)
/demos/fashion-mnist-training/fashion-mnist-webapp/static/img/codait-logo.jpg -- 144.32kb -> 139.03kb (3.67%)
Adding name and year per when the file was added. Signed-off-by: Sahdev Zala <[email protected]>
* refactor helm charts * first draft of the new Travis CI script * first draft of the new Travis CI script * add helm init for CI * enhance CI script * fix typo in script * package helm chart and move to docs repo for helm chart hosting preparation * condense the 4 helm charts into 3 * update detailed installation guide * update developer guide * update helm chart naming
Signed-off-by: Giovanni Rosa <[email protected]>
Developer's Certificate of Origin 1.1
Description
Hi!
The Dockerfile at "metrics/log_collectors/regex_extractor/Dockerfile" violates the best-practice rule DL3008, as detected by the hadolint tool.
The DL3008 smell occurs when packages installed with apt are not pinned to a specific version. Without pinning, a rebuild can pull in different package versions, which could lead to unexpected behavior when building the Dockerfile.
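To make the rule concrete, here is a minimal Python sketch of the kind of check DL3008 describes: it lists the packages in an `apt-get install` command that carry no `=version` suffix. This is a simplified illustration, not hadolint's actual implementation; the function name `unpinned_packages` and the version string in the example are made up for the demonstration.

```python
import shlex
from typing import List


def unpinned_packages(command: str) -> List[str]:
    """Return packages in an `apt-get install` command that lack `=version`.

    Simplified illustration only: it handles a single shell command and
    stops at the first `&&`, `||` or `;` token.
    """
    tokens = shlex.split(command)
    if "apt-get" not in tokens or "install" not in tokens:
        return []
    args = tokens[tokens.index("install") + 1:]
    packages = []
    for tok in args:
        if tok in ("&&", "||", ";"):
            break  # end of the apt-get invocation
        if tok.startswith("-"):
            continue  # option such as -y or --no-install-recommends
        packages.append(tok)
    # A pinned package has the form name=version.
    return [p for p in packages if "=" not in p]


# The version below is a hypothetical placeholder, not a real release.
print(unpinned_packages(
    "apt-get install -y --no-install-recommends curl python3=3.8.2-0ubuntu2"
))
```

In this example only `curl` is reported, because `python3` already carries a version suffix.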
In this pull request, we propose a fix for that smell, generated by our fixing tool. We verified that the patch is correct before opening the pull request.
To fix the smell, the tool uses a heuristic that selects the most probable version tag for each apt package, namely the latest version available at the current date. The package versions are retrieved from the Canonical Launchpad APIs.
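The selection step can be sketched roughly as follows. This is only one reading of the heuristic described above, not the tool's actual code: it assumes a package's publication history is already available as `(version, publication date)` pairs and picks the most recently published version on or before a given date. The sample history, package name, and function names are hypothetical, and no Launchpad API call is made.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical publication history for one package: (version, published on).
SAMPLE_HISTORY: List[Tuple[str, date]] = [
    ("1.0-1", date(2020, 4, 1)),
    ("1.0-1ubuntu0.1", date(2021, 6, 15)),
    ("1.0-1ubuntu0.2", date(2022, 9, 30)),
]


def latest_version(history: List[Tuple[str, date]], as_of: date) -> Optional[str]:
    """Return the version published most recently on or before `as_of`."""
    candidates = [(published, version)
                  for version, published in history
                  if published <= as_of]
    if not candidates:
        return None  # nothing had been published yet at that date
    # max() compares the publication date first, so it yields the newest entry.
    return max(candidates)[1]


def pin(package: str, version: str) -> str:
    """Format a package in the pinned `name=version` form apt expects."""
    return f"{package}={version}"


version = latest_version(SAMPLE_HISTORY, date(2022, 1, 1))
if version is not None:
    print(pin("examplepkg", version))
```

Passing the date explicitly, rather than reading the clock inside the function, keeps the selection reproducible for a given history.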
This change only targets that specific smell. If the fix is not valid or useful, please briefly explain why and suggest possible improvements.
Thanks in advance