issues Search Results · repo:ipums/hlink language:Python

83 results (71 ms)

There are lots of changes that we need to document for version 4.0.0.
- [ ] training.param_grid is deprecated.
- [ ] There is a new training.model_parameter_search option that replaces training.param_grid. ...
  • riley-harper
  • Opened on Dec 19, 2024
  • #183

This is a bug, since the checkpoint directory must be on shared storage, but the tmp directory should not be on shared storage. Changing this might require some poking around and changing in the SparkConnection ...
configuration
type: bug
  • riley-harper
  • Opened on Dec 13, 2024
  • #181

F-measure is another helpful model metric, which can be computed in terms of precision and recall: f-measure = 2 * ((precision * recall) / (precision + recall)). If you plug in the definitions of precision ...
component: model exploration
  • riley-harper
  • 1
  • Opened on Dec 11, 2024
  • #179
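
The formula quoted in #179 can be sketched in a few lines of Python. This is a minimal illustration, not hlink's implementation: the function names are hypothetical, and returning 0.0 when the denominator is zero is an assumed convention. The second function uses the standard identity obtained by substituting precision = TP / (TP + FP) and recall = TP / (TP + FN), which may or may not be the form the issue goes on to derive.

```python
def f_measure(precision: float, recall: float) -> float:
    """F-measure as quoted: 2 * ((precision * recall) / (precision + recall))."""
    denominator = precision + recall
    if denominator == 0:
        # Assumed convention: the formula is undefined here, so report 0.0.
        return 0.0
    return 2 * ((precision * recall) / denominator)


def f_measure_from_counts(tp: int, fp: int, fn: int) -> float:
    """Equivalent form in terms of true positives, false positives, false negatives."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # Same assumed convention as above.
        return 0.0
    return 2 * tp / denominator
```

Both forms agree whenever precision and recall are defined; the count-based form avoids computing the two intermediate ratios.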

This logic makes up a large chunk of the complexity of model exploration and takes a lot of time to compute. It is not used at all by researchers at IPUMS. Creating high-quality training data is also out ...
component: model exploration
  • riley-harper
  • Opened on Dec 9, 2024
  • #176

Instead of accepting the entire training configuration in predict_using_thresholds, we would be better off just accepting a single decision argument, since that's the only thing that the function extracts ...
component: core
  • riley-harper
  • Opened on Dec 5, 2024
  • #174

Some of the code in choose_classifier() explicitly excludes the threshold and threshold_ratio keys from the params dict. This is because these attributes are stored alongside the parameters in the config ...
component: core
  • riley-harper
  • Opened on Dec 4, 2024
  • #172

Currently there are two ways to generate the list of model (hyper)parameters to search in model exploration. You can either provide a list of all of the models that you would like to test, or you can set ...
component: model exploration
type: feature
  • riley-harper
  • 4
  • Opened on Nov 26, 2024
  • #167
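
The snippet for #167 is cut off before it describes the second way of generating the parameter list. As a hedged illustration of what a grid-style search typically does, the sketch below expands a dictionary whose values may be lists into every combination of those values. The helper name `expand_param_grid` and the example keys (`type`, `maxDepth`, `numTrees`) are assumptions made for illustration; this is not hlink's code or configuration format.

```python
from itertools import product


def expand_param_grid(grid: dict) -> list[dict]:
    """Expand a dict of candidate values into one dict per combination.

    Scalar values are treated as single-element lists, so they are
    repeated unchanged in every combination.
    """
    keys = list(grid)
    value_lists = [v if isinstance(v, list) else [v] for v in grid.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical model specification with two values for each of two parameters.
combinations = expand_param_grid(
    {"type": "random_forest", "maxDepth": [5, 10], "numTrees": [50, 100]}
)
```

The explicit-list approach mentioned in the issue corresponds to writing out `combinations` by hand; a grid expansion trades that control for brevity, and the number of combinations grows multiplicatively with each added list.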

https://hlink.docs.ipums.org/substitutions.html says that "If the input column data equals a value in the first column of the substitution file, it is replaced with the data in the second column of the ..."
documentation
type: bug
  • riley-harper
  • Opened on Nov 20, 2024
  • #163

In addition to XGBoost (#161), we would also like to add support for LightGBM. This should work similarly to XGBoost, since we'd also like to make LightGBM opt-in. From the documentation, it sounds like ...
type: feature
  • riley-harper
  • 3
  • Opened on Nov 19, 2024
  • #162

This is a new IPUMS-motivated feature. We would like to integrate the XGBoost library into hlink so that you can use it like any of the other ML algorithms already available. Since XGBoost-Spark integration ...
type: feature
  • riley-harper
  • 3
  • Opened on Nov 14, 2024
  • #161