Skip to content

Snt25 378#48

Merged
claude-marie merged 6 commits intomainfrom
SNT25-378
Mar 17, 2026
Merged

Snt25 378#48
claude-marie merged 6 commits intomainfrom
SNT25-378

Conversation

@claude-marie
Copy link
Contributor

To sum up what was asked in that ticket:
https://bluesquare.atlassian.net/browse/SNT25-378

  1. add a USE_CALENDAR_YEAR_DENOMINATOR parameter
  2. move compute_month_seasonality_NEW and compute_month_seasonality in snt_utils.r
  3. update rainfall & cases to use these
  4. clarification & comments in the notebook
  5. new description for the pipeline using Giulia prompt (I'm still updating it in openhexa bc it is ugly as hell atm)
  6. some fixes in the reporting notebooks

"if (!(\"use_calendar_year_denominator\" %in% names(formals(compute_month_seasonality)))) {\n",
" log_msg(\"Loaded compute_month_seasonality() does not support use_calendar_year_denominator. Applying notebook-level compatible definition.\", level = \"info\")\n",
"\n",
" compute_month_seasonality <- function(input_dt, indicator, values_colname, vector_of_durations,\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the point of adding a safety guard, but I don't think it's necessary to re-define the function here ... it adds too much clutter.
I'd issue a clear message instead ("Error: the function xxx does not exist! Enure that the snt_utils.r file is updated to latest version" or similar).

Also, @EstebanMontandon what is the logic for updating the snt_utils.r file?
Do we have a best practice to update this file in an already existing workspace?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we maintain the latest version of the snt_utils.r module in the repository, so every time we select the options "pull scripts" when executing a pipeline, this file gets updated (replacing the file currently in that workspace).
We haven't defined a specific best practice for managing updates to this file, but since all pipelines depend on the code defined here, standard good practices apply: changes must always be backward compatible to avoid breaking existing pipeline logic.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, I think @sPuntinG is correct in this point, we don't need to do extra validation when calling our own functions, these supposed to be designed so serve specific SNT computations, so we can safely ignore it

"outputs": [],
"source": [
"# Parameters\n",
"minimum_month_block_size <- as.integer(3)\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go as it looks like it overrides the pipeline parameters, no?

It should have something like this (this is what I added in the rainfall rerpoting nb in NER):

# Parameters - Load from dataset if available, otherwise use defaults
parameters_file <- paste0(COUNTRY_CODE, "_parameters.json")
seasonality_dataset <- config_json$SNT_DATASET_IDENTIFIERS$SNT_SEASONALITY_RAINFALL

# Try to load parameters from dataset
parameters <- tryCatch({
    # Get parameters JSON file from dataset (returns a list for JSON files)
    get_latest_dataset_file_in_memory(seasonality_dataset, parameters_file)
}, error = function(e) {
    log_msg(paste0("[WARNING] Could not load parameters from dataset, using defaults: ", conditionMessage(e)), level = "warning")
    NULL
})

# Set parameters from dataset or use defaults
if (!is.null(parameters) && is.list(parameters)) {
    minimum_month_block_size <- as.integer(parameters$minimum_month_block_size)
    maximum_month_block_size <- as.integer(parameters$maximum_month_block_size)
    threshold_for_seasonality <- as.numeric(parameters$threshold_for_seasonality)
    threshold_proportion_seasonal_years <- as.numeric(parameters$threshold_proportion_seasonal_years)
    log_msg(paste0("Parameters loaded from dataset: ", parameters_file))
} else {
    # Default values
    minimum_month_block_size <- as.integer(3)
    maximum_month_block_size <- as.integer(5)
    threshold_for_seasonality <- 0.6
    threshold_proportion_seasonal_years <- 0.5
    log_msg("Using default parameter values")
}

"openhexa <- import(\"openhexa.sdk\")"
"openhexa <- import(\"openhexa.sdk\")\n",
"\n",
"# Compatibility guard: if an older snt_utils.r is loaded (without the new argument),\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as in CASES seasonality: I don't think that we need to define the function here again (just issue a descriptive message prompting to update the snt_utils.r file)

" year_column,\n",
" month_column\n",
") {\n",
"# Function to find the most frequent FIRST starting month for a given admin unit and block duration\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question/curiosity: this is your new approach for identifying the starting month @claude-marie ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is, I can tell you about on a quick call if you want :)

Copy link
Contributor

@sPuntinG sPuntinG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice job!
But please check my comments and implement the suggested changes (if they makes sense to you too) before merging :)

@sPuntinG
Copy link
Contributor

sPuntinG commented Mar 11, 2026

Also @claude-marie on this:

  1. new description for the pipeline using Giulia prompt (I'm still updating it in openhexa bc it is ugly as hell atm)

do you also find the MD editor in the OH pipelines description a bit annoying to work with?
(that is also way I was suggesting to put the link to the GitHub pipeline readme.md for the better looking formatted description ... !)

@claude-marie
Copy link
Contributor Author

@sPuntinG YES, the description part is super annoying. It doesn't understand markdown and you have to do everything yourself which is super annoying. It would be great to be able to drop a MD

@claude-marie claude-marie requested a review from sPuntinG March 16, 2026 16:16
Copy link
Contributor

@sPuntinG sPuntinG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good :)

I just couldn't make sense of the merge conflict because of the ipynb structure that is really painful to diff ... so I trust you'll handle that 😅

@claude-marie claude-marie merged commit 0d543dd into main Mar 17, 2026
@claude-marie claude-marie deleted the SNT25-378 branch March 17, 2026 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants