
Add trainer metrics back to UI.#527

Open
kmontemayor2-sc wants to merge 8 commits into main from kmonte/add-trainer-metrics

Conversation

@kmontemayor2-sc
Collaborator

@kmontemayor2-sc kmontemayor2-sc commented Mar 3, 2026

Scope of work done

We should support showing trainer metrics in the UI. For now, Loss is the only metric. See the pipeline run https://console.cloud.google.com/vertex-ai/pipelines/locations/us-central1/runs/hom-cora-sup-test-on-20260303-200910?project=external-snap-ci-github-gigl, which produces the artifact https://console.cloud.google.com/vertex-ai/metadata/locations/us-central1/metadata-stores/default/artifacts/3974059031304843089?project=external-snap-ci-github-gigl, which records loss.

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

kmonte and others added 2 commits March 3, 2026 18:39
Adapt the /codex-verify slash command from the BAGL repo to target GiGL
directly: update the preamble to reference GiGL instead of BAGL, remove
the submodule warning, and rewrite the full-scope directory list and
find commands to match GiGL's actual project structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kmontemayor2-sc
Collaborator Author

/all_test

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

GiGL Automation

@ 19:54:04UTC : 🔄 Integration Test started.

@ 21:18:58UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

GiGL Automation

@ 19:54:04UTC : 🔄 Scala Unit Test started.

@ 20:03:10UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

GiGL Automation

@ 19:54:06UTC : 🔄 E2E Test started.

@ 21:16:24UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

GiGL Automation

@ 19:54:06UTC : 🔄 Lint Test started.

@ 20:01:21UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

GiGL Automation

@ 19:54:07UTC : 🔄 Python Unit Test started.

@ 21:00:05UTC : ✅ Workflow completed successfully.

@kmontemayor2-sc kmontemayor2-sc changed the title from "Kmonte/add trainer metrics" to "Add trainer metrics back to UI." Mar 3, 2026
Collaborator

@mkolodner-sc mkolodner-sc left a comment

Thanks for the work, Kyle! This generally LGTM. One note: it's not immediately clear where the metrics can be found. How do we locate https://console.cloud.google.com/vertex-ai/metadata/locations/us-central1/metadata-stores/default/artifacts/3974059031304843089?project=external-snap-ci-github-gigl, or should people instead be referencing gs://gigl-cicd-perm/hom_cora_sup_test_on_20260303_200910/trainer/models/trainer_eval_metrics.json from the pipeline you linked?

Can we add some notes to the two example files about where the metrics are exported to and how they can be accessed?
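
For illustration only, here is a minimal sketch of how the exported file might be read, assuming the `gs://` path above is the right location. The helper names `split_gcs_uri` and `load_metrics` are hypothetical and not part of GiGL, the download step assumes the `google-cloud-storage` package and configured credentials, and the JSON schema of `trainer_eval_metrics.json` is not specified in this thread, so the result is returned unparsed beyond `json.loads`.

```python
import json
from urllib.parse import urlsplit


def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket name, object path). Hypothetical helper."""
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError(f"Not a GCS URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


def load_metrics(uri: str) -> object:
    """Download the metrics file and parse it as JSON. Hypothetical helper.

    Assumes google-cloud-storage is installed and credentials are available.
    The structure of the returned object is not known from this thread.
    """
    # Imported lazily so split_gcs_uri works without the package installed.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_uri(uri)
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    return json.loads(blob.download_as_text())
```

Calling `load_metrics` with the `gs://` path from the comment above would return whatever the trainer wrote to that file. Whether that file or the Vertex AI metadata artifact is the intended source is the open question here.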

@kmontemayor2-sc
Collaborator Author

/e2e_test

@github-actions
Contributor

github-actions bot commented Mar 5, 2026

GiGL Automation

@ 21:02:07UTC : 🔄 E2E Test started.

@ 22:21:52UTC : ❌ Workflow failed.
Please check the logs for more details.

@kmontemayor2-sc kmontemayor2-sc marked this pull request as ready for review March 6, 2026 05:54

4 participants