RFC: Utility tasks built-in or pulled out #429

Open
effigies opened this issue Feb 24, 2021 · 8 comments
Labels
enhancement (New feature or request), to consider (suggesting changes that require more discussion)

Comments

@effigies
Contributor

A lot of the things in nipype.interfaces.io and nipype.interfaces.utility would be useful to have around. Should we be making a task package for that or bundling directly into pydra.tasks?

@effigies added the "enhancement" and "to consider" labels on Feb 24, 2021
@satra
Contributor

satra commented Feb 24, 2021

i think it would be good to discuss which ones and if there are alternative approaches in pydra. and then we can discuss where.

for example, i think rename is being built into the spec and we will want to put datasink also into the spec. identityinterface is no longer required.

@effigies
Contributor Author

Data grabbers and data sinks were the main things I was thinking about. But here's a list:

  • List operations (utility.base)
    • Merge
    • Select
    • Split
  • CSV Reader (utility.csv)
  • Data grabbers (io)
    • DataFinder
    • DataGrabber
    • S3DataGrabber
    • SSHDataGrabber
    • SelectFiles

XNAT, BIDS, etc. of course don't make sense to put directly in pydra.
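For illustration, the list operations named above could be expressed as plain Python functions along these lines. This is a minimal sketch, not the nipype implementation: the function names, signatures, and flattening behaviour are assumptions chosen for the example.

```python
from typing import Any, List, Sequence


def merge(*inputs: Any) -> List[Any]:
    """Concatenate inputs into one flat list (hypothetical stand-in for Merge).

    List or tuple inputs are extended into the result; anything else is
    appended as a single item.
    """
    out: List[Any] = []
    for item in inputs:
        if isinstance(item, (list, tuple)):
            out.extend(item)
        else:
            out.append(item)
    return out


def select(inlist: Sequence[Any], index: Sequence[int]) -> List[Any]:
    """Pick the elements at the given positions (hypothetical stand-in for Select)."""
    return [inlist[i] for i in index]


def split(inlist: Sequence[Any], splits: Sequence[int]) -> List[List[Any]]:
    """Cut a list into consecutive chunks of the given sizes (stand-in for Split)."""
    if sum(splits) != len(inlist):
        raise ValueError("chunk sizes must add up to the length of the input list")
    out: List[List[Any]] = []
    start = 0
    for size in splits:
        out.append(list(inlist[start:start + size]))
        start += size
    return out
```

Because these are ordinary functions with no side effects, they should be straightforward to wrap as function tasks.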

@satra
Contributor

satra commented Feb 25, 2021

thanks. these should be relatively easy to move over, since most are just python functions. we should decide where they should go, e.g. pydra.tasks.core.io/utility, so that core is a package that only pydra provides.

@djarecka
Collaborator

djarecka commented Apr 24, 2021

I'm debugging a pydra workflow from pydra-glm-example and I'm thinking about the nipype interface SelectFiles. I believe we should discourage using Nipype1Task and instead just create a FunctionTask until we have pydra.SelectFiles, as suggested here.

I should create some examples. Am I right that SelectFiles is mostly used as a connection from infosource with iterables?

@effigies
Contributor Author

I think it could be easily hooked up with iterables, but I don't know that it's "mostly used" that way. I haven't really used it, so I don't know for sure how others use it, and I've generally avoided iterables, so take that for what it's worth.

@satra
Contributor

satra commented Apr 26, 2021

conceptually selectfiles is just a simple interface for getting data; whether that is connected to infosource or not is up to each workflow creator. the reason why infosource/inputnode (both are identityinterfaces) is used is for dataflow purposes, which should not be required in the context of pydra's design (which makes a workflow a task, and splits can be applied to any inputs).

@djarecka
Collaborator

djarecka commented Apr 26, 2021

but do we want to create pydra.SelectFiles? It's very easy to create a python function and just add a splitter
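As a rough idea of what such a python function might look like, here is a minimal sketch using only the standard library. The name `select_files`, its parameters, and the template syntax (`str.format` fields combined with glob wildcards) are assumptions made for this example; it is not the nipype SelectFiles implementation and it does not use any pydra API.

```python
import glob
import os
from typing import Dict, List


def select_files(
    base_directory: str, templates: Dict[str, str], **fields: str
) -> Dict[str, List[str]]:
    """Resolve named path templates to sorted lists of matching files.

    Each template is first filled in with ``fields`` via ``str.format`` and
    then expanded with ``glob`` relative to ``base_directory``.
    """
    results: Dict[str, List[str]] = {}
    for name, template in templates.items():
        pattern = os.path.join(base_directory, template.format(**fields))
        # Sort so the output does not depend on filesystem listing order.
        results[name] = sorted(glob.glob(pattern))
    return results
```

A function like this could then be wrapped as a function task and split over a field such as `subject`; the exact decorator and splitter call depend on the pydra version, so they are left out here.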

@satra
Contributor

satra commented Apr 26, 2021

i think we could have a set of utility functions that are general purpose across many use cases. but only if they are clear and prevent recreating the same code in many different workflows. if you put it in pydra, i would label the tasks as experimental in the sense that they could be moved out.

selectfiles is generally a non-cacheable function since it involves taking a look at a folder that could have changed between runs, and we may not necessarily want to hash the input directory. i think we need to be able to at least indicate that, even if we end up not creating the function. so hashability should also be taken into consideration. perhaps think about how users would create/use such a function and see if it's better to provide one that reduces some of these complications.
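One possible way to think about the hashability concern is to fingerprint only the directory listing (relative paths, sizes, modification times) rather than hashing file contents. The sketch below is an assumption for discussion, not something pydra provides: `listing_fingerprint` is a hypothetical helper, and relying on modification times has known limits, such as coarse timestamps on some filesystems.

```python
import hashlib
import os


def listing_fingerprint(directory: str) -> str:
    """Return a SHA-256 hex digest of a directory's file listing.

    Only metadata is read (relative path, size, mtime), so the cost does not
    grow with file size. A changed digest signals that a cached file-selection
    result may be stale.
    """
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # in-place sort makes the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            stat = os.stat(path)
            rel = os.path.relpath(path, directory)
            record = f"{rel}\0{stat.st_size}\0{stat.st_mtime_ns}\n"
            digest.update(record.encode("utf-8"))
    return digest.hexdigest()
```

A file-selection task could take such a digest as an extra input so that its cache key changes whenever the folder does, without hashing the directory contents themselves.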


3 participants